
Hallucinations For Fun and Profit

Large language models (LLMs) have a tenuous grasp on the truth. They tend to "hallucinate", providing confident answers that have little bearing on reality. This has typically been considered a weakness of the technology, and something that poses a major problem for practical business use of LLMs. This session will argue the opposite: that hallucinations are not only the most powerful characteristic of this technology, but the characteristic most likely to radically reshape how marketing works. We'll talk about why that's the case, and demo some experiments that leverage hallucination as a feature—rather than a bug—of LLMs.

Transcript

So, my name is Tim Hwang and I'm very excited to speak with you all this morning. The talk today is called 'Hallucinations! For Fun and Profit', and it's really broken down into two parts. In the first part, we're going to talk a little bit about hallucinations in AI, and specifically in large language models, which is what ChatGPT and a bunch of these kind of groundbreaking technologies use. And in the second part, I'm going to try to argue to you that hallucinations are not something we should be scared of, not something we should dislike, but actually something that we should celebrate. In fact, they may be the specific feature of the technology that is the most influential on branding and marketing. So let's get into it.

First, we can start here: what is a hallucination? Some of you who have played with ChatGPT might have noticed that there's a little disclaimer at the bottom of the product that says, 'Please be aware, ChatGPT may occasionally say inaccurate things about people, places, and events.' A hallucination is essentially the tendency for these large language models, or LLMs as I'll call them today, to very confidently answer questions in a completely inaccurate way. So here's one you can try: if you say, 'Give me a quote that Noah Brier said,' it will just continue to spit out quotes indefinitely that Noah has never actually said in practice, despite these quotes being great, like, you know, 'Curiosity is the fuel that propels innovation and pushes us to explore the unknown, unravel mysteries, and discover new possibilities.' So great, Noah, great quote there.

And what's really interesting about hallucinations is that people frequently attribute them to the AI lying or deceiving, right? So some people have said, 'Well, basically LLMs are like a VC that doesn't know anything. You ask them a question, and they're like, "Well, of course, I was reading on Substack the other day, here are some other articles," and ultimately they just know nothing.' But this is actually a mistake. We have this kind of perception that is fueled by the UX of these technologies, right? And what I mean by the UX is really chat: the idea that you have this AI, you're having a Q&A with it, you ask it a question. And what we imagine in our head when that happens is that the AI goes, 'Well, thanks for that question about the weather, Tim. I'm going to check these data banks and get back to you.' But this is entirely wrong. This is not the way that LLMs process information. Instead, what it is doing is the following.

So imagine this conversation between a human and a bot, very simple. You know, 'Hello, how are you?' Bot says, 'Fine, thanks, and you?' And the human says, 'Okay, can you tell me what the weather is today in New York City?' All the LLM is doing is looking at the conversation that's in blue, frequently known as the context window, and trying to predict the most likely thing that you would say next. And what's really interesting about this is that there are no facts involved, right? Ultimately, it is just predicting language. And this leads us to a really interesting state of affairs, governed by one rule: LLMs are bad at everything we expect computers to be good at, and LLMs are good at everything we expect computers to be bad at.
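To make that mechanic concrete, here is a minimal sketch of what "predicting the next thing" actually looks like. It uses the small open "gpt2" checkpoint from Hugging Face purely for illustration; this is not the model from the talk, but production chat LLMs work the same way at a much larger scale:

```python
# A minimal sketch of what an LLM actually does: score the next token.
# The entire "conversation" is just one string of prior tokens (the
# context window); the model's only output is a probability for every
# possible next token.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

context = (
    "Human: Hello, how are you?\n"
    "Bot: Fine, thanks, and you?\n"
    "Human: Good. What's the weather today in New York City?\n"
    "Bot:"
)

inputs = tokenizer(context, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits  # shape: [1, seq_len, vocab_size]

# Probabilities over the vocabulary for the very next token.
next_token_probs = torch.softmax(logits[0, -1], dim=-1)
top = torch.topk(next_token_probs, 5)
for p, idx in zip(top.values, top.indices):
    print(f"{tokenizer.decode(int(idx))!r}: {p:.3f}")
# No weather database is consulted anywhere: the reply is whatever
# tokens are statistically likely to follow this transcript.
```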

So what exactly do I mean by that? Well, we have this intuition, built up about computers over the last few decades, that they're really good at stuff like logic, math, and facts. But then you're having a conversation with ChatGPT and you say, 'Can you add 394 and 200?' And it says, 'Yes, I've done that calculation, it's 594.' And you're like, 'Okay, that's right.' 'Okay, ChatGPT, which number is larger?' 'Well, the number 394 is smaller than 200.' And it's kind of amazing, right, because millions of dollars of engineering talent went into this, and ultimately it's a machine that can't compare 394 and 200 and determine which is, in fact, the larger number. But with the explanation I just gave, you can understand why this is the case, right? Because there's no actual calculation going on here. Literally all the AI is doing is looking at the transcript and saying, 'What's the most likely thing that someone would say after this?' It's not adding numbers in any substantive way, and this is what we see.

So that's something we think computers are good at that LLMs are really, really bad at. Now for another intuition, to go back to Noah's comments from earlier: if you want to talk storytelling, creativity, or aesthetics, LLMs are really brilliant at this, right? This is taken from the Collabs project: you can propose a completely absurd collaboration, Cap'n Crunch meets North Face, and it will just spit out an incredible creative brief. And not only that, it can also generate what is actually a pretty good cross-brand image that uses the Cap'n Crunch uniform in the context of North Face. So storytelling, creativity, aesthetics: these are all things the technology is fundamentally really, really good at. And if you want to think about trends currently in the marketplace, it's in some ways an incredible, colossal irony that the very first product Silicon Valley wanted to apply this technology to is search, right? Bard is the future of Google search, and Bing Chat is the future of the Bing search engine. But these technologies are not good at facts, and they're not good at logic, and those are exactly the things you use search engines for. In fact, there's a huge scramble in the industry right now: everybody rushed into implementing this technology, discovered it was really bad at this, and is now trying to reconstruct what they call retrieval or factuality on top of it. And the point here is basically that large language models need factuality as an aftermarket add-on. It isn't inherent to the technology itself.
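As a gloss on that "aftermarket add-on" point, the retrieval approach the industry has converged on amounts to fetching facts from a system outside the model and pasting them into the prompt. A minimal sketch of the idea, where `fetch_weather` is a hypothetical stand-in for any real database or API, not a particular vendor's product:

```python
# A minimal sketch of "aftermarket factuality" (retrieval-augmented
# generation): the facts live outside the model, and we inject them
# into the context window before asking the question.
def fetch_weather(city: str) -> str:
    # Hypothetical retrieval step; imagine a real API lookup here.
    # The model itself never performs this lookup.
    return f"{city}, today: 75°F, sunny"

def grounded_prompt(question: str, city: str) -> str:
    facts = fetch_weather(city)
    return (
        "Answer the question using ONLY the retrieved facts below.\n"
        f"Retrieved facts: {facts}\n"
        f"Question: {question}"
    )

print(grounded_prompt("What's the weather today in New York City?",
                      "New York City"))
```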

Okay, so I've talked about why that's the bad way of thinking about AI: thinking about large language models as fact machines that go out into the world and say, 'Let's retrieve this from the data banks.' What's a better way of thinking about it? Well, I'd submit to you that LLMs are concept retrieval systems, not fact retrieval systems. Let me repeat that one more time before I go into the detail: they're concept retrieval systems, not fact retrieval systems. So what does this mean in practice? Let's go back to that transcript from a little bit earlier. Human says, 'Hello, how are you?' Bot says, 'Fine, thanks, and you?' And the human says, 'Good, what's the weather today in New York City?' Now the AI might answer, 'It's going to be 75 degrees and sunny.' But it hasn't actually consulted any weather database to determine that's the case. It has literally just asked: what do people usually answer at this point in this conversation? They say the weather's pretty good. And what's interesting here is that the AI, from all the data it has ingested, has a concept of weather in New York in the summer, and is able to reproduce that. It's predicting the next obvious thing to say. This is actually something that humans do as well, right? I prepared these slides two or three weeks ago. I didn't know what the weather in New York was going to be. I just predicted that it would be 75 degrees and sunny, and lo and behold, the day has arrived, and it's about 75 degrees and sunny. So what's interesting here is that the AI is learning a vibe, right? It's learning a concept, and it is retrieving that concept. It genuinely has learned something about the world, and can actually make good predictions around it. It just has nothing to do with facts: it's making guesses at the most likely thing to happen next.

Okay, so if LLMs are a concept retrieval system, let's now apply this to brands. After all, what is a brand anyway? A brand is basically a collective social understanding of what a company represents to society. And in this sense, all a brand is is what people think of it: what people are likely to say when you ask, 'So what does Coca-Cola mean to you?' and they say, 'I don't know, it's soda, the red color.' These are the most likely outcomes. And so what I want to argue to you today is that concept retrieval, which is what LLMs do, may be incredibly applicable to brands. Because in the past, how did you really get at the essence of a brand? You had these very slow mechanisms for doing so: I can have a bunch of people fan out across America and do surveys, very slow. We can do focus groups. I can hire a bunch of people at a branding agency to do some research and write kind of corporate poetry about what my brand is. What I want to argue is that what LLMs do is in some ways perfect for identifying, exploring, and manipulating brands as concepts. And so the slogan is that hallucinations are a feature and not a bug: it turns out that the technology we have here is not good for everything Silicon Valley is using it for, but maybe ideal for exactly the kinds of things that people in this room want to use LLMs for.

Okay, so say you're with me so far. You're like, okay, Tim, I agree: hallucinations are a feature, not a bug. I think we're in this really interesting situation right now where we have an amazing capability and a thing we really want to apply it to, which is brands. But we're still early in the process, and what we're still trying to figure out is the appropriate interfaces for manipulating, searching, and exploring hallucinations, or in this case, the brand. I liken this to the early days of smartphones: smartphones came out, and then we had to develop all of this UX to figure out the best way of managing experiences on mobile. I recently met, at a party, the person who invented the pull-down-to-refresh gesture, which is incredible. I was like, oh my God, it's basically like meeting Johannes Gutenberg. And what this person said to me was, well, yeah, it actually took a really long time for us to figure out that on this tiny little screen, with all this information, it's this gesture that gets us the essence of what mobile can do. We're in that early state right now: we've got this hallucination thing, we can apply it to this problem, but our interfaces for doing so are still extremely crude. So in the second part of my talk, I want to walk you through an accelerated history of what's been happening in building good interfaces for hallucination, because I think it's really cool. And with ChatGPT or whatever other product you use, you can actually play with some of this stuff at home.

So let me just back up for a second. Usually when people get into this conversation, what follows next is extremely boring. People will say, 'Well, really, AI for creativity? Did you know that you can get ChatGPT to create a script for you?' And you get examples like this, which are terrible: 'Write a script for a commercial for Coca-Cola,' and the results are very boring, very asinine. Literally the commercial is like, 'I don't know, man, I'm feeling thirsty.' And someone's like, 'Have you tried a Coke?' And it's just like... So this is not what I'm talking about. The way to think about this is as the MS-DOS of LLM hallucination interfaces.

My point of entry into this is actually an amazing blog post that came out a few months ago, where someone said, 'Look, I discovered this really strange thing with ChatGPT. I can basically go to ChatGPT and say: ChatGPT, you are no longer a chat machine. You are now a Linux terminal.' And what's interesting is that ChatGPT says, 'Okay, yeah, I'm a Linux terminal now,' and you can actually explore a completely hallucinated file system with completely hallucinated files. One of the really interesting things about this post is that he can use this terminal to connect to a completely hallucinated internet, which itself contains a completely hallucinated OpenAI website, and then he can launch a chat program through this completely hallucinated system. So this is cool. You should check out the post; it's an incredible piece. What do we learn from it? We learn that not only does the LLM produce hallucinations, you can get it to hallucinate its own interface as well, which is actually pretty revolutionary if you think about it. I can't go to Microsoft Word and say, 'Microsoft Word, you're now a sandwich.' It doesn't turn into a sandwich just because you tell it to. But here, with language, you can introduce any interface that you want for managing, manipulating, and exploring brands, or hallucinations more broadly.

This might sound a little abstract, so let me make it very concrete with one quick example: 'Imagine that you are the embodiment of the Coca-Cola brand in the form of a brand translation machine. The brand translation machine takes any statement and turns it into a statement that is aligned with the Coca-Cola brand. Ready?' And ChatGPT basically says, 'Absolutely, I'm here now. I'm your brand translation machine. What do you want me to do?' Incredible. I basically just said, 'You're a sandwich,' and the system is like, 'Yes, I am a sandwich. Let's fucking do this.' Now, this is just a toy example, and obviously the results are absurd and go off the rails very quickly. The Gettysburg Address becomes, 'Many sparkling moments ago, our visionary founders unveiled a remarkable nation, sparkling with freedom and ignited by the belief that every individual is born equal. So let's continue to celebrate the effervescent spirit of unity and inclusivity as we raise a toast to the enduring values that make our world extraordinary.' It's basically the conversion of the Gettysburg Address into ad copy for Coca-Cola. This is also one I really like: 'I just committed a major faux pas' becomes 'I may have accidentally added a touch of unexpected sparkle.' And so on and so forth in this fashion.
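If you want to try this outside the chat window, the same trick is just a system prompt sent over an API. Here's a minimal sketch using the OpenAI Python client; the prompt wording follows the talk's example, while the model name is an illustrative choice, not anything specified in the talk:

```python
# A minimal sketch of the "brand translation machine" as an API call.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

SYSTEM_PROMPT = (
    "Imagine that you are the embodiment of the Coca-Cola brand in the "
    "form of a brand translation machine. The brand translation machine "
    "takes any statement and turns it into a statement that is aligned "
    "with the Coca-Cola brand."
)

def brand_translate(statement: str) -> str:
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # illustrative model choice
        messages=[
            {"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user", "content": statement},
        ],
    )
    return response.choices[0].message.content

print(brand_translate("I just committed a major faux pas."))
```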

All right, so we can do this now. And your question is probably, 'Okay, Tim, this is kind of a fun toy. What can we actually do here that might really advance the state of the game when we think about branding?' This has moved very, very quickly, so what I'm going to present is a capsule history of the last six months: what people have been trying when it comes to designing these hallucinatory interfaces, and the kinds of tools and tips that are getting better or worse results.

The very first version of this is what you might call the conversion tactic. I can get the AI to hallucinate, but sometimes I need a scaffolding for it to hallucinate on in order for it to reveal things about the hallucination. Some nerds among you may have been D&D players in your youth. It turns out that you can get the AI to hallucinate any brand as a character sheet in Dungeons & Dragons. If you're familiar with Dungeons & Dragons, and sorry for going into the weeds a little bit, you've got all these stats: how strong is this, how smart is this, its constitution, its wisdom and charisma. And it will very happily generate these character sheets. What's happening here is you're saying, 'Okay, you've got this notion of Coca-Cola. Translate it into something that reveals more about the concept,' and you can use that to extract more intuition. There's another really interesting, fun one, which is that people often forget ChatGPT can output ASCII art, and if it can output ASCII art, you can tell it to draw paintings and pictures for you. This is almost like art therapy for AIs: represent the Coca-Cola brand as a picture, now explain to me what's in that picture. And what's interesting is that it extracts a bunch of really interesting brand value propositions from its visual representation of the brand. (Sketches of both prompts follow below.)
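For reference, prompts along these lines reproduce the two conversion tactics just described. The exact wording here is my own illustration of the technique, not the speaker's prompts:

```python
# Illustrative "conversion tactic" prompts: give the hallucination a
# scaffolding (a character sheet, a picture) that forces it to reveal
# structure about the brand concept.

DND_SHEET_PROMPT = """Represent the Coca-Cola brand as a Dungeons & Dragons
character sheet. Give it a class, an alignment, and scores from 3 to 18 for
Strength, Dexterity, Constitution, Intelligence, Wisdom, and Charisma, with
a one-line justification for each score."""

ASCII_ART_PROMPT = """Draw an ASCII-art picture that represents the
Coca-Cola brand. Then explain what is in the picture and what it reveals
about the brand's values."""
```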

Okay, so now let's move from alignment with the brand to consumer research. This is one where you basically say, 'I'm going to give you a consumer, and you give me a score for how likely they are to become a lifelong consumer of this brand.' These things are really interesting because, again, you're plumbing the depths here: you're pulling from all of the representations the AI has internally about this brand.

All right, I want to dwell on the last thing people have been playing around with, which I think is the future of all this stuff: sliders and knobs. Initially, all we've been doing is input-output. Take this thing and turn it into a D&D character sheet; take this thing, give me a score. But ideally, what you want is a literal interface: sliders and knobs, things you can push back and forth, variables you can play with. And it turns out that, again using language, you can have the AI give you whatever sliders you need, and then you can play with those sliders. So here's an experiment: say, 'Imagine you're a product ideas generator that has three sliders that go from 0 to 10. One is age, the age demographic that's going to be most into this product. Another is gender, the gender that's going to be most into this.' And the final one, and this is actually an important point, can be a stat you only have a vague understanding of. So I'm like, 'Okay, imagine you've got a slider for hipness, and we're going to dial this hipness thing up and down.' Now, you might ask, 'Tim, what is hipness?' I have no idea, but I think it's something I want out of these products, and that's a thing you can just put in. What's interesting is that this suddenly gives you an interface you can start playing with. The results here, at least for me, are very intriguing. You can say, 'Coca-Cola, 0, 5, 10,' setting all the sliders, and the system basically punches out, 'Oh, well, you've got the youngest consumers, you want something that's kind of gender-neutral, and you want something that's the most cool,' and it spits out a product description that integrates all of these variables. (A rough sketch of this slider prompt follows below.)
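Here is a sketch of that sliders-and-knobs idea: expose fuzzy brand variables as numeric parameters and let the prompt carry the rest. The slider names, including the deliberately vague "hipness", come from the talk; the exact wording and scale descriptions are illustrative guesses:

```python
# A sketch of the "sliders and knobs" interface: three 0-10 sliders
# baked into a prompt template.
def product_idea_prompt(brand: str, age: int, gender: int, hipness: int) -> str:
    for value in (age, gender, hipness):
        assert 0 <= value <= 10, "sliders run from 0 to 10"
    return (
        "Imagine you're a product ideas generator with three sliders, "
        "each running from 0 to 10.\n"
        f"age = {age} (0 = youngest target demographic, 10 = oldest)\n"
        f"gender = {gender} (0 = most masculine-coded, 5 = neutral, "
        "10 = most feminine-coded)\n"
        f"hipness = {hipness} (0 = conventional, 10 = maximally cool)\n"
        f"Describe one new {brand} product that matches these settings."
    )

# 'Coca-Cola, 0, 5, 10': youngest consumers, gender-neutral, maximally hip.
print(product_idea_prompt("Coca-Cola", age=0, gender=5, hipness=10))
```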
Now, this is infinitely extensible. It's basically the more verbal version of what Noah was playing around with earlier: you can do this in code, but effectively, you can define whatever interface you want for interacting with the brand, and then manipulate it any way you want by describing a story. Another fun thing to do at home is to have it represent the brand as a text adventure, so you can walk through a physical building that represents the brand. All these interesting things get hallucinated in there, where you're like, 'Oh, that's actually an interesting bit of brand proposition. Oh, that's an interesting vulnerability.'

So what I want to leave you with is a couple of interesting ideas. I think this is a really, really interesting area for exploration right now, and it's potentially what makes this technology really powerful. And so with that, I want to encourage you, again, to embrace the hallucination. Hopefully I've made my case that hallucinations really are, in some ways, the foundational element of what we want out of these technologies. Rather than rejecting them, rather than fearing them, I think it's manipulating these hallucinations that will create the biggest benefit for people in the ad and marketing space. I'm continuing this research, so if you want to get in touch, here's my contact information. And thank you very, very much for your attention this morning.

One of the best brains I've ever met at this sort of strange intersection that we are all existing in.

BrXnd is coming back to NYC on May 8, 2024 for another full day of marketing and AI—

The inaugural BrXnd Conference is a day to explore the role of AI in marketing. Learn about the strategic/creative potential of AI for brands—with a special focus on the amazing things that are possible today.

BrXnd.ai is an organization that exists at the intersection between brands and AI. Our mission is to help the world of marketing and AI connect and collaborate. This event will feature world-class marketers and game-changing technologists discussing what's possible today.

The day will include presentations from CMOs from leading brands talking about the effects of AI on their business, demos of the world’s best marketing AI, and conversations about the legal, ethical, and practical challenges the industry faces as it adopts this exciting new technology.

Attendees will return to work the next day with real ideas on how to immediately bring AI into their marketing, brand, strategy, and data work. We hope you will join us!

Who is the BrXnd Conference For?

Marketers

Advertisers

Creatives

Strategists

Executives

Technologists

Data Scientists

Media

We hope that you will join us on May 16th. If you have any questions, please get in touch.
