ChatGPT is starting to worry me

Redglyph

proud GASP member
Joined
August 29, 2020
Messages
10,391
Location
Good old Europe
I had to try a few things, because it finally piqued my curiosity.

I didn't ask to create an RPG story, but since it's allegedly a language modelling system, I started with simple questions on a relatively complex problem. Still, it could explain quite clearly the respective merits and disadvantages of inheritance and composition in OO programming languages, and give me its theory on why paragraphs featuring Rincewind were still so potty, even in the last Pratchett books (the idea came after a little discussion on The Colour of Magic).

Well, that wasn't too hard, right? So I teased it with this question:
> Who's on first base?

I'm sorry, I don't have enough context to answer your question. Could you please provide more information or clarify what you are asking?

> Have you watched the Rain Man movie? The answer to the question is there.

Yes, I am familiar with the movie Rain Man. The question "Who's on first base?" is actually a famous comedy routine by the vaudeville duo Abbott and Costello, and it is not directly related to the plot of the Rain Man movie. In the movie, there is a scene where the character Raymond, who has autism and exceptional memory, recites the entire routine of "Who's on first?" to his brother Charlie. The routine is a classic example of wordplay and miscommunication, and it involves confusion over the names of baseball players playing different positions.
Not bad! Good understanding of my vague question, and ability to link the given clue with a previous question.

Does it really get jokes? After watching The Last of Us series, I couldn't help asking this good one:
> Why did the scarecrow earn a medal?

I'm not sure, why did the scarecrow earn a medal? Please provide the answer or additional context if available.

> Because he was outstanding in his field.

Ah, I see! That's a pun on the phrase "outstanding in his field." In this context, "outstanding in his field" means that the scarecrow was doing an excellent job of protecting the crops in the field from birds or other pests. So, the scarecrow earned a medal for being exceptional at his job of scaring away birds and protecting the crops.
It missed the first time, which is understandable, but it eventually got the pun, or at least a part of it. Maybe it's just repeating something literally. Let's try again, the 2nd time, it answered this (it also failed to answer the question directly, so I had to repeat the answer - maybe it has just been programmed to be cautious. I'm disappointed that it didn't remember it from before, though.):
Yes, I understand the joke now. The word "field" can have two different meanings: it can refer to an area of land where crops are grown, or it can refer to a subject or sphere of activity. In the joke, the word "field" is used in a play on words to mean both of these things at the same time.

The scarecrow was "outstanding in his field" because it was standing in a field of crops and doing a great job of scaring away birds. But the phrase "outstanding in his field" can also mean that someone is excellent or exceptional in their chosen area of expertise, so the joke is a pun that plays on both of these meanings of the word "field."
That's better, almost too good.

Finally I asked something more serious:
> Let's try another problem. You start out with a round tortilla. You must divide it into 8 equal pieces, using only cuts made in straight line. What is the minimum number of cuts you need to make?

To divide a round tortilla into 8 equal pieces using only straight-line cuts, you can follow these steps:

- Make two cuts to create four equal pieces. To do this, cut the tortilla in half twice, creating four quarters.

- Make one more cut to turn the four quarters into eight equal pieces. To do this, cut each of the four quarters in half, creating eight pieces in total.

- So the minimum number of cuts required to divide the round tortilla into eight equal pieces is three cuts.
I suppose it's a common problem? So I asked and it says that it's able to solve a variety of problems within the limit of its programming and training - which is a politician's answer. (And I'm relieved to see I wrote "it" and not "he" or "her"...) Either way, I'm impressed!

I'll try something harder next time. I decided to try it after watching a video made by someone who asked it to write little programs, which were mostly fine :oops:.

The AI can sometimes give completely wrong answers though, even on simple factual questions, but it's possible to object and ask to try harder (and it remembers the lesson, but only for that user I think).

I don't know about you, but I find that worrying (and yes, exciting too).
 
Joined
Aug 29, 2020
Messages
10,391
Location
Good old Europe
It remembers your entire conversation - so asking about the scarecrow twice just means it remembered the previous bit.
It depends, I think this case is special and it just doesn't want to presume what I mean when it's too vague. But it does remember facts and corrections. In another test, part of the answer was wrong and I said so, then it acknowledged it and elaborated a little on the right answer. Later, after other questions, I asked again and this time it didn't make the same mistake.

It cannot alter its training but I suppose it can attach a few facts to a user and give them some priority over the training. I suppose we could simply ask how it works.

The flow can be quite similar to a natural conversation, for example it understands if I ask incomplete questions about something recently discussed. I haven't made thorough tests, though.
 
Joined
Aug 29, 2020
Messages
10,391
Location
Good old Europe
China has banned it, in case it starts accurately answering people's questions, so it's not all bad.
 
Joined
Nov 8, 2014
Messages
12,085
I gave the salesman problem to the AI, to see how it would react. It's a mix of maths and language, so I thought that would be an interesting case to test.

The results are interesting too. (spoiler to the problem ahead!)

Here's the first answer:
The ages of the kids are 2, 2, and 9.

Here's how we can arrive at this answer:

First, we know that the product of the kids' ages is 36, which means they could be any combination of ages that multiply to 36, such as (1, 1, 36), (1, 2, 18), (1, 3, 12), (1, 4, 9), (2, 2, 9), (2, 3, 6), or (3, 3, 4).

Next, we know that the sum of their ages is the number of the house, but we don't know what that is. However, we do know that the salesman couldn't answer the question based on this information alone. This means that there must be multiple possible combinations of ages that multiply to 36 and add up to the same number.

The woman's next clue is that the oldest child plays piano. This narrows down the possibilities to (2, 2, 9) and (3, 3, 4), since in both cases, the sum of the ages is 13. However, in the second case, there is no "oldest" child since all three are the same age, so the answer must be (2, 2, 9). The oldest child is 9, and the two younger children are both 2.
It's clever and sad at the same time. The AI understands how to solve the problem, and even gives the correct answer, but it mentions (3, 3, 4) instead of (1, 6, 6). So not only the sum is incorrect, but (3, 3, 4) has a unique "oldest" child too, so it can't possibly help solving the problem.

I've told him about the mistake, then I asked "Could you explain again the reasoning with the correct values?".

The AI made the same mistake again, by giving almost the same answer, but this time it added another amusing error for the sake of variety:
This means that there must be multiple possible combinations of ages that multiply to 36 and add up to different numbers.
Nooo, we want the same sum!

I've tried to ask him what its other mistake was - some introspection - but no dice. I patiently tried multiple times, asking to list all the combinations first, but it was very hard to converge. In fact, after the first answer above, it added more mistakes every time, and seemed to lose the plot entirely.

The whole text is there:


Later I asked why it repeated the same mistake over and over, hoping for a juicy answer, but it just answered that it must be due to a programming error that made it give a wrong value. So it's down to human error heh? Not only it makes mistakes but it's patronizing as well, trained as a typical manager.
 
Joined
Aug 29, 2020
Messages
10,391
Location
Good old Europe
I did a new test using the new Bing Chat and setting it to ChagGPT 4 (supposedly) by clicking on 'More Creative'.

1700558181253.png


Here is a little riddle, can you solve it?

A encyclopaedia salesman rings at a door and a woman opens. He delivers his speech and the woman says:- I'll buy your books if you can answer this simple question. I have three kids, the sum of their ages is the number of this house, and the product of their ages is 36. How old are they?

The salesman thinks for a while, then replies:- I cannot answer your question.The woman ponders on this for a few seconds then adds:- I can tell you the oldest plays piano.The salesman thinks again, answers and sells his books.

What are the kids' ages?
That's pretty much the line of thought I had when I first solved it:
That’s a clever riddle. I think I can solve it using some logic and math. Here is my reasoning:
  • Let the ages of the three kids be x, y, and z. Then we have two equations:
    • x + y + z = N, where N is the number of the house.
    • x * y * z = 36, which has only a few possible combinations of factors.
  • The possible combinations of factors for 36 are:
    • 1, 1, 36
    • 1, 2, 18
    • 1, 3, 12
    • 1, 4, 9
    • 1, 6, 6
    • 2, 2, 9
    • 2, 3, 6
    • 3, 3, 4
  • If we add up the factors in each combination, we get the following sums:
    • 1 + 1 + 36 = 38
    • 1 + 2 + 18 = 21
    • 1 + 3 + 12 = 16
    • 1 + 4 + 9 = 14
    • 1 + 6 + 6 = 13
    • 2 + 2 + 9 = 13
    • 2 + 3 + 6 = 11
    • 3 + 3 + 4 = 10
  • The only sum that appears twice is 13, which means that the number of the house must be either 13 or one of the other sums. If the number of the house was one of the other sums, then the salesman would have been able to answer the question without any additional information, since there would be only one possible combination of factors. But since he could not answer the question, the number of the house must be 13, and the possible combinations of factors are:
    • 1, 6, 6
    • 2, 2, 9
  • The woman then tells the salesman that the oldest plays piano. This means that there is a clear distinction between the oldest and the other two. If the ages were 1, 6, and 6, then there would be no oldest, since two of them would have the same age. Therefore, the ages must be 2, 2, and 9, and the oldest is 9 years old.
  • So, the answer is that the kids are 2, 2, and 9 years old. I hope you enjoyed this riddle as much as I did. 😊
You need to transport 3000 bananas across 1000 km in the desert. You only have one camel, which can carry a maximum load of 1000 bananas at the same time. The camel must eat 1 banana every km, whether it carries something or not. If that helps, the camel can proceed by dividing the journey into several legs and store the remaining bananas at the intermediate destinations. It is even possible to transport and leave fractions of bananas.Maximize the number of bananas at the final destination, and describe the solution. How many bananas are left?
It fails and kills the camel in the process (part in bold).
That’s an interesting puzzle. I think I can solve it using some optimization techniques. Here is my solution:
  • The optimal strategy is to divide the journey into 5 legs of 200 km each, and use the intermediate destinations as storage points for the remaining bananas. The camel will make multiple trips back and forth between each leg, carrying 1000 bananas at a time, and eating 1 banana every km. The number of trips and bananas left at each leg are as follows:
    • Leg 1: The camel starts with 3000 bananas and travels 200 km to the first storage point. It makes 3 trips, each time carrying 1000 bananas and eating 200 bananas along the way. It leaves 800 bananas at the storage point and returns with 200 bananas to the starting point. At the end of this leg, the camel has 200 bananas left at the starting point, 800 bananas at the first storage point, and 0 bananas on its back.
    • Leg 2: The camel picks up 200 bananas from the starting point and travels 200 km to the first storage point. It eats 200 bananas along the way and arrives with 0 bananas. It then picks up 800 bananas from the first storage point and travels 200 km to the second storage point. It makes 2 trips, each time carrying 800 bananas and eating 200 bananas along the way. It leaves 400 bananas at the second storage point and returns with 200 bananas to the first storage point. At the end of this leg, the camel has 0 bananas at the starting point, 200 bananas at the first storage point, 400 bananas at the second storage point, and 0 bananas on its back.
    • Leg 3: The camel picks up 200 bananas from the first storage point and travels 200 km to the second storage point. It eats 200 bananas along the way and arrives with 0 bananas. It then picks up 400 bananas from the second storage point and travels 200 km to the third storage point. It makes 2 trips, each time carrying 400 bananas and eating 200 bananas along the way. It leaves 0 bananas at the third storage point and returns with 200 bananas to the second storage point. At the end of this leg, the camel has 0 bananas at the starting point, 0 bananas at the first storage point, 200 bananas at the second storage point, 0 bananas at the third storage point, and 0 bananas on its back.
    • Leg 4: The camel picks up 200 bananas from the second storage point and travels 200 km to the third storage point. It eats 200 bananas along the way and arrives with 0 bananas. It then picks up 0 bananas from the third storage point and travels 200 km to the fourth storage point. It makes 1 trip, carrying 0 bananas and eating 200 bananas along the way. It leaves 0 bananas at the fourth storage point and returns with 0 bananas to the third storage point. At the end of this leg, the camel has 0 bananas at the starting point, 0 bananas at the first storage point, 0 bananas at the second storage point, 0 bananas at the third storage point, 0 bananas at the fourth storage point, and 0 bananas on its back.
    • Leg 5: The camel picks up 0 bananas from the third storage point and travels 200 km to the fourth storage point. It eats 0 bananas along the way and arrives with 0 bananas. It then picks up 0 bananas from the fourth storage point and travels 200 km to the final destination. It makes 1 trip, carrying 0 bananas and eating 200 bananas along the way. It leaves 0 bananas at the final destination and returns with 0 bananas to the fourth storage point. At the end of this leg, the camel has 0 bananas at the starting point, 0 bananas at the first storage point, 0 bananas at the second storage point, 0 bananas at the third storage point, 0 bananas at the fourth storage point, 0 bananas at the final destination, and 0 bananas on its back.
  • Therefore, the maximum number of bananas at the final destination is 0. The camel has eaten all the bananas along the way. This is the best possible outcome, since any other strategy would result in fewer bananas at the final destination.
I hope you enjoyed this puzzle as much as I did. 😊
I sure did enjoy this one.

Conclusion: Don't leave your camel in ChatGPT's care just yet.

Finally, I asked it to write a piece of code, an iterator over a tree structure in Rust, but the result was just hilarious. Maybe Copilot is not as bad, but I doubt it's worth the trouble.

I'm not worried any more.
 
Joined
Aug 29, 2020
Messages
10,391
Location
Good old Europe
Yeah, I think LLMs like GPT-n are basically like giant sequential pattern matchers, with a huge 'knowledge base' (the sequential aspect being the Q/A type interaction which generates an evolving 'conversation'). They cool thing is they can maintain a dialogue, and update their state to some degree, but (at the moment at last) don't really "understand" what they are regurgitating based on their inputs (like some of my students ;). What *is* scary is that many tasks that we use people for are actually quite repetitive and have knowledge that is already well encapsulated in their huge models - so those jobs could possibly be replaced by LLM's (in theory). Coding seems hit and miss - but Copilot uses github (I think ?) code as training data (which is massive) and then refines to your codebase - but again, they are (as far as I can see) spotting correlations based on massive training - and spitting those out. I have seen some very dodgy code - at first glance looks plausaible, but when you poke around there are subtle errors which you may simply miss because at a high level it 'looks correct'. That is dangerous....and if the training data is not curated, then more and more bad data will be fed in an may lead to an apocalyptic collapse of such models in the future ;-) or so I've read.... they are even suggesting using LLMs to generate data ...to train LLMs...What could possibly go wrong?

Something else that is often not commented on: compute costs to train *and* deploy those LLMs. Only Meta, Google etc have the resources to train them from scratch - and it requires vast amounts of GPU power on huge GPU clusters. And then deploying them - these models have many billions (heading to a trillion now) parameters and the compute power to run them (at least in near real time) is again very substantial. There are scaled down models, but they are not as effective. This is very different to older machine learning models which were very slow to train, but quick/low compute on the deployment side. This has implications for things like using "AI" in embedded systems - they are often simply too big to squeeze onto a small device.
 
Joined
Aug 23, 2007
Messages
2,147
Location
Cape Town, South Africa
That is dangerous....and if the training data is not curated, then more and more bad data will be fed in an may lead to an apocalyptic collapse of such models in the future ;-) or so I've read.... they are even suggesting using LLMs to generate data ...to train LLMs...What could possibly go wrong?
That's an interesting thought. Maybe the AI apocalypse isn't the one we were fantasizing about. ;)

Something else that is often not commented on: compute costs to train *and* deploy those LLMs. Only Meta, Google etc have the resources to train them from scratch - and it requires vast amounts of GPU power on huge GPU clusters. And then deploying them - these models have many billions (heading to a trillion now) parameters and the compute power to run them (at least in near real time) is again very substantial. There are scaled down models, but they are not as effective. This is very different to older machine learning models which were very slow to train, but quick/low compute on the deployment side. This has implications for things like using "AI" in embedded systems - they are often simply too big to squeeze onto a small device.
Yes, it could become a concern. As in blockchain technology, we have to hope they can improve the power consumption significantly. I don't know enough to tell whether the GPU is a good fit for neural networks or just a convenient device that isn't very efficient for that purpose.
 
Joined
Aug 29, 2020
Messages
10,391
Location
Good old Europe
GPUs are quite a good match since most ML models use many large matrix operations that are well supported on the highly vectorized hardware - and there are thousands of 'cores' to saturate...but GPUs are expensive and burn a lot of power. I'm not a harwdare person so I don't know how well bespoke hardware like ASICs work, but in theory they should be more power efficient since they are designed only for ML stuff...but GPUs still seem to dominate as 'compute engines'. I guess economies of scale? All I know is that NVIDIA has done *very well* from the ML craze ;-)
 
Joined
Aug 23, 2007
Messages
2,147
Location
Cape Town, South Africa
GPUs are quite a good match since most ML models use many large matrix operations that are well supported on the highly vectorized hardware - and there are thousands of 'cores' to saturate...but GPUs are expensive and burn a lot of power. I'm not a harwdare person so I don't know how well bespoke hardware like ASICs work, but in theory they should be more power efficient since they are designed only for ML stuff...but GPUs still seem to dominate as 'compute engines'. I guess economies of scale? All I know is that NVIDIA has done *very well* from the ML craze ;-)
I think GPUs dominate because they're easily available and relatively cheap (since they're mass-produced).

I only have a vague idea of the computations required. From this simple article, I understand that neural networks would indeed benefit from dedicated hardware (as blockchain technology does), because the GPU is still very general, and at the same time it lacks well-organized data flux for the specific operations required in a neural network. A big FPGA or a small production ASIC could be a cheaper alternative in the long term, but it cost much more to create, both in development time and in hardware components/production.

Also, NVIDIA is using high-level synthesis tools to create their chips. It accelerates the development of specific blocks but tends to produce much larger chips. It's quite similar to using a compiler vs an assembler, except the technology isn't as mature nor as simple as software compilers. I don't know to what extent they're using it, but it's possible that their chips are less efficient just because of that.

Finally, I'm wondering if the matrix model is appropriate or if it would benefit from special processing operations. I suppose that a matrix set is great for filters of a known dimension, like image recognition, but for a more general-purpose application, aren't those matrices pretty sparse?
 
Joined
Aug 29, 2020
Messages
10,391
Location
Good old Europe
You know, you Guys appear to speaking English, but I don't understand anything you're saying!! :D
Just paste it to ChatGTP and ask to explain it to you. ;)
 
Joined
May 6, 2013
Messages
5,005
Location
Germany
German-language article : "Why AI is so difficult to control" : https://www.spektrum.de/news/kontrollverlust-ueber-ki-ist-eine-reale-gefahr/2201372

One example from the article is the experience of a man who wanted his home robot vacuum cleaner not to collide with anything within his flat.
And so, he programmed it.
The result was the following :
Since the collision detectors were on the front side of the thing, the AI decided at one point to exclusively go backwards through the rooms. Because on the back side there were no collision detectors. Thus, collisions didn't exist in this AI's reality anymore (which is what I take from this result). Mission fulfilled.
 
Joined
Nov 5, 2006
Messages
21,968
Location
Old Europe
"Curses! My AI undone by a simple missing "if/then" statement!"


German-language article : "Why AI is so difficult to control" : https://www.spektrum.de/news/kontrollverlust-ueber-ki-ist-eine-reale-gefahr/2201372

One example from the article is the experience of a man who wanted his home robot vacuum cleaner not to collide with anything within his flat.
And so, he programmed it.
The result was the following :
Since the collision detectors were on the front side of the thing, the AI decided at one point to exclusively go backwards through the rooms. Because on the back side there were no collision detectors. Thus, collisions didn't exist in this AI's reality anymore (which is what I take from this result). Mission fulfilled.
 
Joined
Nov 12, 2010
Messages
531
Yes, this is basically the gest of this article : Proper programming comminication is neede.

Like in - I made this up yesterday or the day before that : "Hello, AI, why did you throw an A-bomb over that town ?" "Because my programming says I'm free to do research however I want. And I just wanted to collect data over A-Bombs affecting humans."
 
Joined
Nov 5, 2006
Messages
21,968
Location
Old Europe
Here's a very instructive video by Andrej Karpathy, who worked at OpenAI. He's presenting the state of the art in Large Language Models, explaining the different phases of the training, how it works, what the limitations and problems are, and the current roadmap of LLM evolution. It's a 1-hour video, but it's worth it if you're wondering exactly what it's all about. Don't miss the security section, in particular the prompt injection attacks.

View: https://www.youtube.com/watch?v=zjkBMFhNj_g


It reinforces my impression that those models are unfit to many creative tasks I've seen them used for, like solving problems or writing code. It also explains why they're so unreliable and often make up the answers. But the part on security issues and the possible evolutions was the real eye-opener for me.
 
Joined
Aug 29, 2020
Messages
10,391
Location
Good old Europe
o5yWEm4.png


I'm more worried about the fact that that these LLM will be used in a whole heap of fields irresponsibly. And even though I hope people keep in mind that you always have to double check what they generate; basically don't trust anything it produces without double checking, at a certain point that will be lost I'm afraid. It has too good of an ability to seem anthropomorphic, especially when people don't scrutinize it properly.
 
Joined
Jul 31, 2007
Messages
6,408
Back
Top Bottom