ChatGPT: Sometimes wrong, but never in doubt

Nov 24, 2023 | Technology

I don’t use AI to create my blog posts. For one thing, the world scarcely needs more blog posts in the first place, and if you’re going to throw one on the pile, at the very least you should ensure that it’s not made up of 100% recycled thoughts.

That said, I do use AI in other aspects of my work. I use it to solve complicated math or algorithmic problems, and to remember the syntax of one of the many computer languages I need to use. It’s absolutely brilliant at that.

When it gives me correct answers, that is.

Which leads me to my biggest problem (and concern) with AI. What I’ve found with ChatGPT (and I imagine the others work the same way) is that the information it provides is always presented with supreme confidence, whether it’s right or wrong.

We all know somebody like this. Everything he or she says comes out with complete certainty. Whether it’s directions to somewhere, the best way to accomplish a task, or something else, the advice always radiates confidence. It’s what gives life to the expression “often wrong, never in doubt.” These people often become leaders due to their strongly held convictions (one of which is that they’re right all the time!).

Sadly, ChatGPT is one of these people. Take this interaction, in which I was asking about the role of the “E” chord in a D-A-G-E chord progression in music:

ChatGPT: The E chord in the context of the D, A, G, and E progression is likely used to add some variation and tension. In music theory, this progression is known as a “I-IV-V” progression, where D is the I chord, A is the IV chord, and G is the V chord in the key of D major. Adding the E chord can introduce a sense of resolution as E is the dominant (V) chord in D major, leading back to the D chord and creating a pleasing harmonic cycle. It’s a common progression in many songs.

Me: E is not the V chord of D major

ChatGPT: You are absolutely correct, and I apologize for the mistake. E is not the V chord of D major; it’s the IV chord. The V chord in D major would be A. Thank you for pointing out the error.

Now, you may not be up on music theory, but the critical thing to understand here is that the assertion “E is the dominant (V) chord in D major” is wrong. Not only is it wrong in the context of ChatGPT’s analysis, it’s always wrong. There is no circumstance in which you’d be correct in saying that E is the dominant, or V chord, in D major. (Incidentally, the E chord is also not the IV chord of D major, so ChatGPT’s corrected assertion is also incorrect.)
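For anyone who wants to check this without a theory book, here is a minimal sketch that builds the chords of a major key from the standard whole-step/half-step pattern. The function names are my own for illustration, and the note list uses sharps only, which is enough for D major.

```python
# Chromatic scale spelled with sharps; sufficient for D major (F# and C#).
NOTES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

# Step pattern of a major scale in semitones: whole, whole, half, whole, whole, whole, half.
MAJOR_STEPS = [2, 2, 1, 2, 2, 2, 1]

# Roman numerals and triad qualities for the seven degrees of any major key.
DEGREES = ["I", "ii", "iii", "IV", "V", "vi", "vii°"]
QUALITIES = ["", "m", "m", "", "", "m", "dim"]


def major_scale(tonic):
    """Return the seven notes of the major scale starting on `tonic`."""
    idx = NOTES.index(tonic)
    scale = [tonic]
    # The last step only leads back to the tonic, so it is skipped.
    for step in MAJOR_STEPS[:-1]:
        idx = (idx + step) % 12
        scale.append(NOTES[idx])
    return scale


def diatonic_chords(tonic):
    """Map each scale degree (I, ii, ...) to its triad in the given major key."""
    return {
        degree: note + quality
        for degree, note, quality in zip(DEGREES, major_scale(tonic), QUALITIES)
    }


chords = diatonic_chords("D")
print(chords["I"], chords["IV"], chords["V"])  # D G A
print(chords["ii"])                            # Em
```

In D major the I, IV, and V chords come out as D, G, and A. The only chord built on E within the key is E minor, the ii chord.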

I was frankly taken aback that ChatGPT got this wrong, because it is very fundamental music theory (probably found in the first two or three chapters of your Music Theory for Dummies book) and I expected it to nail undisputed facts with 100% accuracy. What really troubles me, though, is that nothing in this passage indicates ChatGPT’s level of confidence in the answer it gave.

Here’s another example of ChatGPT asserting something with what seems like complete conviction, only to change its answer under cross-examination before finally confessing it has no idea. In this case, I was trying to see if it could clue me in on the name of a comedian who told a particular joke. Again, note the complete certitude of its answers.

Me: I remember a comedian joking about Gordon Sumner’s nickname, “Sting”. There was a line something about, “I would like you to call me Sting”. What was the comedian?

ChatGPT: The comedian you’re referring to is probably Eddie Izzard. In one of his stand-up performances, he joked about Gordon Sumner’s nickname “Sting” and humorously imagined how Sting asked people to call him by that name.

Me: Can you tell me the quote?

ChatGPT: I’m sorry, but I don’t have access to specific quotes from copyrighted material, including stand-up comedy routines. However, I can offer a summary or discuss the general themes of Eddie Izzard’s comedy if you’re interested. Just let me know how I can assist you further!

Me: Are you sure it’s Eddy Izzard?

ChatGPT: I apologize for any confusion. It appears I made an error in my previous response. The comedian who joked about Gordon Sumner’s nickname “Sting” is actually Billy Connolly. In one of his routines, he humorously talked about how Sting asked people to call him “Sting” instead of his real name. I apologize for the mistake and thank you for bringing it to my attention.

Me: I can’t find any reference to Billy Connolly making that joke

ChatGPT: I apologize for the confusion earlier. You are correct, and I appreciate your understanding. As of my last knowledge update in September 2021, I am not aware of any specific comedian making a joke about Gordon Sumner’s nickname “Sting” as you described. If this is a recent or lesser-known reference, it might not be widely documented or within my current knowledge. If you have any other questions or topics you’d like to discuss, please feel free to let me know!

Just as with the first example, if I had stopped at the first interaction and hadn’t challenged ChatGPT’s answer, I would have had no idea that it was wrong (and that it knew it was wrong, at least after a little reflection). Its correct answers and incorrect answers are both presented with the same certitude, with the ultimate result that I now find myself questioning everything it tells me.

I have seen ChatGPT and other AI services do some amazing things. I recently used Remini to process a low-resolution photo and improve the quality enough that I could print a poster, and though the result wasn’t quite perfect, it was an impressive display of AI’s ability to understand what it’s being presented with and to make a useful contribution. I am really interested in seeing where all of this is going.

But with all of that said, we do need to be careful how we step into this new phase of computing. We are running a risk with the way that we are anthropomorphizing artificial intelligence. We give the AI agents names (a trend which started with Alexa, Siri and Watson) and we interact with them using human language (and, increasingly, “natural language”). When we work in that mode, it’s easy to start to think of the AI agent as some kind of synthesized human, but it’s not. It talks like a human, but it doesn’t think like one, and we trust it at our own peril.

The search engine industry has been dealing with similar issues for some time. After all, a search engine’s job is to take in web content and attempt to determine how relevant each piece of content is to the queries put to it. A good search engine also tries to deal with the challenge of telling good information from bad information. The approach to this problem relies on something called “EAT”, an acronym for “expertise, authoritativeness, and trustworthiness”. Simply put, information sources are given a credibility ranking, and information from more credible sources is weighted more heavily by the search engine than information from less credible sources. So, for example, a peer-reviewed medical journal’s advice might be weighted more heavily than the blog of someone who has no identifiable credentials. Of course, determining which sources should be considered more or less credible is a challenge in and of itself, but that challenge isn’t any more difficult than the others in this area (Wikipedia, though edited largely by volunteers, enjoys high credibility, for example).
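To make the weighting idea concrete, here is a toy sketch. It is not how any real search engine works: the `Result` class, the `rank` function, the source names, and every score are invented for illustration, and multiplying relevance by credibility is just one simple way to combine the two.

```python
from dataclasses import dataclass


@dataclass
class Result:
    source: str
    relevance: float    # hypothetical: how well the page matches the query, 0 to 1
    credibility: float  # hypothetical: trust score assigned to the source, 0 to 1


def rank(results):
    """Order results by relevance weighted by source credibility, best first."""
    return sorted(results, key=lambda r: r.relevance * r.credibility, reverse=True)


results = [
    Result("anonymous blog", relevance=0.9, credibility=0.3),
    Result("peer-reviewed journal", relevance=0.8, credibility=0.95),
]

ranked = rank(results)
print([r.source for r in ranked])
```

Even though the blog is the slightly better match for the query in this made-up example, the journal ends up first because its credibility score is so much higher.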

You don’t need me to tell you that AI has tremendous power to transform many industries that use information. There’s a real risk, though, that continuing on the path we have been taking, with chatbots presenting both real facts and “alternative facts” (not to mention untruths that the chatbot itself seems to be aware are untrue) as equally truthful, could facilitate the spread of disinformation and undermine confidence in the technology as a whole.
