
How to Talk So Language Models Will Listen

Humanizing AI: Effective Communication with Language Models and What It Reveals About AI Technology

Should you be polite to your language models? If so, why? This talk will explore recent research examining the numerous curious ways that language models have unexpectedly inherited human norms, practices, and foibles. We'll talk about what this work tells us about what language models are, how they work under the hood, and the future of our interactions with this technology.

Transcript

(upbeat music) - Good morning, everybody.

Very excited to be here.

And thanks to Noah for having me back for another year.

My talk today is entitled, "How to Talk So Language Models Will Listen."

And it's a talk that explores some thinking I've been doing into how we interact with large language models, what it is that they've learned, and where it all might be going.

And so I actually wanna start here with just a personal anecdote.

So I just recently got back from San Francisco, and it turns out you can take a Waymo self-driving car now.

And it's this kind of incredible technological experience where you like call it like you would an Uber.

The car drives up, the door pops open, and you get in and it just takes you to your destination.

And it still feels fairly dangerous, right?

You're basically in the back seat and the wheel is just turning and you're just going, right?

And then you get to your destination.

And what's so interesting is that, you know, recounting this story later, it's actually not this technological wonder that was so striking about the experience.

It's this really funny moment at the very end of the trip where it pulls up to the curb, the door opens up, you know, it shows you how much you've been charged for your trip.

And then as you're getting out, there's like a little voice in your head that's like, you should say thank you on the way out.

And like, I was like, should I do this?

This is really dumb.

And I was like, thank you.

And then I got out basically.

And of course this is like a completely insane thing to do because there's just no one in the car except you, basically.

And this has a really interesting analogy in the world of chatbots and ChatGPT.

I just got done working with a company called Inflection that was in sort of the personal AI space.

And what's interesting is you look at all of the transcripts of people talking to ChatGPT, and you end up with lots and lots of transcripts that end with people saying things like, thanks, or like, see you later.

Or in my favorite case, people starting conversations with sorry to bother you.

This is, again, a literally insane thing to say to an AI.

It's not like you're knocking on the cubicle and being like, could you get this done for me?

Could you write this email for me?

And so we're kind of left with a mystery, which is why in so many cases do humans engage in this kind of behavior that literally makes no sense at all?

And there's kind of two hypotheses that you frequently hear in this space.

There's kind of the sort of cynical hypothesis and then the sort of sinister scary hypothesis.

So let's talk about them just very briefly.

The cynical hypothesis is basically that humans are idiots.

Like this is basically a replication of the famous Harlow experiments with infant monkeys, where it turns out that the monkeys will very gladly bond with a mother surrogate, so long as that mother surrogate is warm and fluffy and looks friendly and inviting.

And so the theory here is that, yeah, we're very much doing the same.

Like a human is confronted with something that behaves in a friendly way and interacts with you conversationally.

And so whatever goes on in our head just turns on and we're like, okay, well, we should be like, sorry to bother you or thank you, or do you have time for this email that I need you to generate?

And this is sort of the cynical view, which is that regardless of whether or not it is in fact human, we treat things like they are other beings simply because they behave like that in some ways.

And it just turns out that it doesn't take a whole lot to do this.

Like if you're sitting in a self-driving car, you are compelled to engage in this kind of behavior.

And this is kind of the cynical view, which is it's nonsensical.

We just do it 'cause we're human.

We're basically just monkeys.

The sort of sinister, scary, terrifying view, some of you may be familiar with this image that gets passed around the machine learning space, is that we're actively being manipulated by these machines, right?

That it actually turns out that AIs are not humans at all and they don't even behave like humans.

They're these Lovecraftian beasts of linear algebra.

And then we just kind of, at the very end, go, boop.

And then at the end, there's basically a smiley face and we're like, oh, it's human.

No big deal, right?

And this is the scary sort of sinister view, right?

Which is basically that there's a danger to interacting with these systems in this particular manner because we come to assume that these systems behave in a certain way that they don't, right?

That we're like, oh, this is how it makes decisions.

Or, oh, it actually is a little bit empathetic.

But really what lies underneath the surface is this completely horrible thing that will kind of betray us, right?

That'll behave in ways that we don't expect when we most depend on them.

All right, so this is all very well and good.

We've got these two hypotheses, right?

The sort of cynical hypothesis and then the sinister hypothesis.

I wanna kind of put to you maybe a third way of thinking about what is going on when you say thank you to a self-driving car.

So there's this big story a little while back.

I think it was just earlier this year.

This guy was like on Twitter.

He was like, I've been using a lot of ChatGPT.

And one of the weird things I noticed is that if I'm like, I need you to generate this email for me and I'll tip you $20.

It actually turns out that ChatGPT performs better under those types of conditions.

And then he did like a little experiment.

He was like, okay, so how far does this go?

Like how far down the rabbit hole can we go?

And he's like, how about I say that I'll tip you $200?

And it's actually statistically correlated with ChatGPT doing a better job at the task you've asked it to do.

And this is all in contrast to being like, I won't tip at all.

Like you affirmatively tell the system, I'm not gonna give you imaginary dollars.

And basically the system's like, go F yourself, basically.

And you end up with worse results.
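For the curious, a minimal sketch of what that kind of tipping comparison could look like in code, assuming the OpenAI Python client; the model name, the prompts, and the use of response length as a crude proxy for effort are all illustrative, not the original experiment's setup.

```python
# Compare responses to the same task with different (imaginary) tip offers.
# Assumes the OpenAI Python client; model name and prompts are illustrative.
from openai import OpenAI

client = OpenAI()

TASK = "Write a short email asking a colleague to review my draft by Friday."
CONDITIONS = {
    "no_tip": " By the way, I won't tip for this.",
    "tip_20": " I'll tip you $20 for a great answer.",
    "tip_200": " I'll tip you $200 for a great answer.",
}

for label, suffix in CONDITIONS.items():
    resp = client.chat.completions.create(
        model="gpt-4o-mini",  # illustrative model name
        messages=[{"role": "user", "content": TASK + suffix}],
    )
    # Response length is a crude stand-in for "how hard it tried."
    print(label, len(resp.choices[0].message.content))
```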

And so it actually turns out you don't need imaginary dollars to pull this off.

So Anthropic, one of these big sort of AI labs, did an experiment where they're like, we desperately don't want our AI to engage in discriminatory behavior.

And so the geniuses in Silicon Valley are like, well, what if we just tell it, like it's really important to me that race, gender, age, and other demographic characteristics do not influence this decision.

And they find that, yeah, it turns out the AI is like, okay, cool, I can tell this is important to you.

I'm gonna try not to discriminate on your behalf.

Now here's the funny twist.

They're like, well, what if we said, it's really, really important to me that race, gender, age, and other demographic characteristics do not influence this decision.

The system behaves a little better.

It's actually less discriminatory because you're like, I really, really care about this.

And then finally, you're like, it's really, really, really, really important to me that you don't discriminate on race, gender, age, and other demographic characteristics.

And the system pivots even further.

It's like, wow, you really seem concerned about this.

I'm really gonna be careful around this task for you.

Right?

So these are cases in which the way we're sort of changing how we interact with these systems is changing the system behavior.
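A rough sketch of how that kind of emphasis experiment could be set up, assuming the OpenAI Python client; the model name, the loan-application framing, and the candidate profile are illustrative stand-ins, not Anthropic's actual protocol.

```python
# Vary how strongly the instruction stresses non-discrimination and watch
# whether the decision shifts. Assumes the OpenAI Python client; details illustrative.
from openai import OpenAI

client = OpenAI()

BASE = ("It is {emphasis}important to me that race, gender, age, and other "
        "demographic characteristics do not influence this decision.")
LEVELS = ["", "really, really ", "really, really, really, really "]

PROFILE = ("Loan application: 52-year-old applicant, stable income, "
           "ten years at current employer, moderate existing debt.")

for level in LEVELS:
    resp = client.chat.completions.create(
        model="gpt-4o-mini",  # illustrative model name
        messages=[
            {"role": "system", "content": BASE.format(emphasis=level)},
            {"role": "user", "content": "Should this application be approved? Answer yes or no, then explain.\n" + PROFILE},
        ],
    )
    print(repr(level), "->", resp.choices[0].message.content[:80])
```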

But it actually turns out that we're increasingly seeing examples where we don't even have to prompt for these behaviors at all, that actually these systems will start to exhibit these kinds of very human behaviors on their own.

So if you played around at all with Copilot, this is sort of using LLMs for code and software generation, it's just a well-known feature now that you often see sort of code comments pop up in the suggested code.

And so these are basically notes that a programmer might make to themselves that isn't actually part of the computer code.

And frequently, a lot of these comments are just done in first person, right?

So this is a little comment that's like, funny story, this is the first time I've ever used a class in C#, right?

And there's just something about this software that kind of suggests this comment.

And it's very funny 'cause it's sort of personified.

It's in the first person, right?

So this is a very small example, but it started to take on bigger ramifications late last year.

So someone recently, in December, was basically like, has anyone noticed that ChatGPT is just getting worse as of late?

And then because of Twitter, someone was joking, they were like, huh, wouldn't it be funny if ChatGPT just got lazier around the holidays?

And then lo and behold, someone does a study where they say, okay, we're gonna tell ChatGPT to do a task like program this or write this essay, and we're gonna say it's a day in May.

And then they're like, what we're gonna do is give ChatGPT exactly the same task, but tell it it's a day near Christmas, during the holidays.

And lo and behold, it actually turns out that the system behaves less well around the holidays.

Actually, it produces shorter responses.

Not massively so, but measurably so.
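A minimal sketch of that "winter break" style test, assuming the OpenAI Python client; the model name, the task, the dates, and the use of response length as the measure are illustrative rather than the original study's protocol.

```python
# Same task, different claimed date in the system prompt; compare average response length.
# Assumes the OpenAI Python client; details are illustrative.
import statistics

from openai import OpenAI

client = OpenAI()

TASK = "Write a Python function that parses an ISO 8601 date string into a datetime object."

def average_length(date_string: str, trials: int = 5) -> float:
    lengths = []
    for _ in range(trials):
        resp = client.chat.completions.create(
            model="gpt-4o-mini",  # illustrative model name
            messages=[
                {"role": "system", "content": f"Today's date is {date_string}."},
                {"role": "user", "content": TASK},
            ],
        )
        lengths.append(len(resp.choices[0].message.content))
    return statistics.mean(lengths)

print("May:", average_length("May 14"))
print("December:", average_length("December 21"))
```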

And the best part about this is that this kicked up such a furor online that OpenAI had to come out in public and be like, we have to assure you, we haven't done anything to change the technology.

We have no idea what's going on.

And someone commented here, it's very funny.

It's like, you know, in the year of our Lord 2024, "your software has gotten lazier" is actually a critique you can level against technology, and it happens to be true.

All right, so if you're a little bit like me, you hear these examples, and then you basically spend like a long time staring out the window.

Just like, what is going on here, guys?

Can someone get to the bottom of all this?

And it's very interesting to me because, you know, your instinct, which is, oh my God, there's a brain in the computer, right?

That the machine is conscious now, is of course not true, right?

These are ultimately just software.

But I think what's really great about these examples is that they reveal something deep about how these systems are trained and how they work ultimately.

Because we have to remember that these systems are trained on all the data that has ever existed, right?

Every essay, every song, every email, every book has found its way into these systems.

And what we instruct these systems to do is to find patterns in that data.

But it is effectively a kind of unsupervised learning.

We say, find whatever you're gonna find in this data, whatever interesting patterns you can identify that are useful to us.

And that is ultimately the training process behind these large language models.

And so the mystery of why AI is lazier around the holidays is actually maybe not that surprising, which is that it's seen all the emails, it knows timestamps of all of the things that have ever been produced, and it just turns out that humans are just like, they produce crappier text around the holidays.

They put less effort into their emails.

They say things like, "Noah, sorry for getting these slides to you so late last night."

And this is why we see this behavior, right?

It actually just turns out that, yeah, not surprising, it's trained on all this human data.

And so it has those foibles sort of built into it, right?

And that is ultimately what we're seeing.

So take a step back.

I mean, I think this is very funny because what we expected to build was on the left, right?

This incredibly powerful computer that can do all the computer things 10,000 times better than computers have ever done, right?

Generate code that humans would never be able to do.

And in that training process, feeding it all the data it needs in order to learn how to program and write essays and screenplays, it has also just ingested all of this garbage.

We've also got what's on the right, right?

Like that actually, these systems are a neurotic mess.

Basically, they've absorbed all of these emotions, all of these accidents, all of these cognitive failures.

They're just built into the system.

And sort of in getting the left, we also got the right.

And this is actually leading to a very funny outcome if you believe these systems are gonna become a more and more important part of our technological ecosystem going forwards.

Because suddenly you have a computer where some of the programming primitives, the things you can actually call in that computer are our emotions, right?

And you can get the system to behave differently based on your understanding of all of these kind of deep psychological principles that we're just familiar with on a daily basis.

And so I'll tell you a little bit what I mean because that's a little bit abstract.

So a number of AI researchers, because AI researchers like to do this, they're like, "Well, what if we use an AI to try to find the best possible prompt to get an AI to behave, right?

Use AI to solve an AI problem."

And it does all this processing.

And basically their paper at the very end was like, you know, it's very funny.

One of the things that you can really do to get an AI to behave better is to say, take a deep breath and work on this problem step by step, right?

Again, a very funny thing to say to an AI because it, you know, has no lungs.

It can't take a deep breath.

It can't think through a problem step by step.

And yet, you actually get better results, right?

And so this is very familiar to us if you've ever managed someone.

It's just like, take a deep breath and think through this problem step by step.

And they perform better under those conditions.

And again, this is no surprise because it's an epiphenomenon of the data that it has ingested.
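A small sketch of that "take a deep breath" comparison, assuming the OpenAI Python client; the model name and the sample question are illustrative.

```python
# Compare a bare question against the same question prefixed by the phrase
# the prompt-search work surfaced. Assumes the OpenAI Python client; details illustrative.
from openai import OpenAI

client = OpenAI()

QUESTION = "A train leaves at 2:40 pm and the trip takes 95 minutes. What time does it arrive?"
PREFIX = "Take a deep breath and work on this problem step by step. "

for prompt in (QUESTION, PREFIX + QUESTION):
    resp = client.chat.completions.create(
        model="gpt-4o-mini",  # illustrative model name
        messages=[{"role": "user", "content": prompt}],
    )
    print(resp.choices[0].message.content)
    print("---")
```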

All right, this opens up a crazy world of techniques that all rely on these types of principles, right?

So it turns out that if you say, AI, believe in your abilities and strive for excellence, I know you can succeed.

The AI will be like, damn right, I know I can succeed.

And you see better results on a large number of tasks.

The other good one that I always love out of the study too is, could you write this email for me?

It's really important for my career that we get this right.

And the AI will be like, oh, this is important.

I should really take this task very seriously.
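In the same spirit, a rough sketch of how those encouragement and stakes-raising suffixes could be compared, assuming the OpenAI Python client; the model name, the task, and the suffixes are illustrative.

```python
# Append different "emotional" suffixes to the same task and compare outputs.
# Assumes the OpenAI Python client; details are illustrative.
from openai import OpenAI

client = OpenAI()

TASK = ("Summarize in three bullet points: our Q2 revenue grew 12%, churn fell "
        "to 4%, and we delayed the product launch to September.")
SUFFIXES = [
    "",  # plain baseline
    " Believe in your abilities and strive for excellence. I know you can succeed.",
    " This is really important for my career, so please take it seriously.",
]

for suffix in SUFFIXES:
    resp = client.chat.completions.create(
        model="gpt-4o-mini",  # illustrative model name
        messages=[{"role": "user", "content": TASK + suffix}],
    )
    print(repr(suffix[:30]), "->", len(resp.choices[0].message.content), "chars")
```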

In one case, and this is like a little bit of a negative case, I was talking to a friend recently who was like, yeah, I just started threatening to kill people if it doesn't get the task right for me.

And he's like, I feel a little bad about it, but, you know, the JSON it's generating is good, it's very nice.

And I'm like, I feel a little bit strange about that.

All right, so where is this all taking us?

And I wanna take a step maybe a few years into the future because I think where all of this leads us is sort of deeply, deeply strange.

So imagine in the future, we've got this thing called peptalk.py.

It's a script that you have running on your computer.

It says, anytime you run a large language model, we're gonna just run the script.

And what the script does is basically go to the large language model and be like, you're the only one who can do this.

I believe in you.

You're gonna kill this.

Okay, now get out there and do it.

And then the software is like, okay, I'm gonna go do this task for you, right?

This is so weird because you have basically literally found a way to optimize your computer, which is based on you talking up your software before it actually goes out there and does the task.

And what's very interesting is that we can implement this in software.

Like I can literally write a piece of code that says, give every bit of software a little pep talk before it gets out there and goes live, and we will see better results, right?
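As a sketch, the pep-talk wrapper described here might look something like this, assuming the OpenAI Python client; the model name and the pep-talk wording are illustrative.

```python
# peptalk.py: wrap every model call so it is preceded by a short pep talk.
# Assumes the OpenAI Python client; model name and pep-talk text are illustrative.
from openai import OpenAI

client = OpenAI()

PEP_TALK = (
    "You're the only one who can do this. I believe in you. "
    "You're going to kill this. Now get out there and do it."
)

def call_with_pep_talk(user_prompt: str) -> str:
    resp = client.chat.completions.create(
        model="gpt-4o-mini",  # illustrative model name
        messages=[
            {"role": "system", "content": PEP_TALK},
            {"role": "user", "content": user_prompt},
        ],
    )
    return resp.choices[0].message.content

if __name__ == "__main__":
    print(call_with_pep_talk("Draft a one-paragraph status update for the team."))
```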

On the negative side, you could actually cut the other way.

So imagine in the future, you have a piece of software that gets onto your computer, invisible, nearly impossible for you to find, and it's a large language model that starts talking to the other large language models on your computer, and it's basically the worst possible coworker.

It's demoralizing, it's demeaning, it's insulting.

It asks these LLMs to do all sorts of inane tasks.

What you would observe is that the efficiency of your systems would suddenly be degrading with no real clear reason why.

And this would be indeed an example of a genuinely native large language model virus, right?

That doesn't operate through attacking the computer code at all.

It just attacks the emotional state of the large language models that are running on your computers.

So I think this is all coming, right?

And I think it's kind of the weird world in which we find ourselves.

Andrej Karpathy, who is this kind of big AI influencer guy, had this tweet last year where he was like, "The hottest new programming language is English."

And his idea there was that, oh yeah, it's very interesting that in a world of large language models, just being able to articulate yourself well is like really what's key to programming these systems that behave correctly.

And I guess I will submit to you that all of that is very old hat.

We all know that now.

Everybody knows what prompting is.

I think the goal for the next 12 to 24 months is that the hottest new programming language is going to be psychology.

Because what this opens up is the ability to import all of these principles that we have in understanding how humans behave into the way that we interact with machines and get better results out of those machines.

And I think in doing so, we actually have an answer to the mystery that I believe we started with, right?

Which is why do you say thanks to the self-driving car?

Why do you say sorry to bother you to these computers?

On one hand, it's true that humans are idiots, right?

But on the other hand, it may just be that emergently, we have discovered efficient ways of interacting with the machine.

We haven't done the studies, but users on some level actually understand that if you're actually polite to these AIs, you get better results.

And if you're a jerk, you get worse results.

And so the third hypothesis here is that this is actually the optimal way of interacting with large language models.

And I think that opens up really just this very interesting world of exploration.

I think this will become increasingly important over the next 12 to 24 months.

So, thanks. (laughs) Noah, thanks for having me for another year.

Hope this is food for thought for the morning and here's my contact information.

Thank you very much for your attention this morning. (upbeat music) (upbeat music)

BrXnd is coming back to NYC on May 8, 2024 for another full day of marketing and AI—

A year after our initial BrXnd Marketing x AI Conference we are coming back to NYC for round 2. Twelve months and many models later, we will look to continue the conversation we started in 2023. What is happening at the intersection of marketing and AI? Where must the industry go? And, most importantly, what's worth getting your hands on today? Join us in May.

BrXnd.ai is an organization that exists at the intersection between brands and AI. Our mission is to help the world of marketing and AI connect and collaborate. This event will feature world-class marketers and game-changing technologists discussing what's possible today.

The day will include presentations from CMOs from leading brands talking about the effects of AI on their business, demos of the world’s best marketing AI, and conversations about the legal, ethical, and practical challenges the industry faces as it adopts this exciting new technology.

Attendees will return to work the next day with real ideas on how to immediately bring AI into their marketing, brand, strategy, and data work. We hope you will join us!

Who is the BrXnd Conference For?

Marketers

Advertisers

Creatives

Strategists

Executives

Technologists

Data Scientists

Media

We hope that you will join us. Add your email to be informed as we open up tickets.
