The Daily: OpenAI's GPT-4o—What can it do, why its safety team lead stepped down, and is this the next smartphone evolution?

Audio by Gadjo Sevilla, and Jacob Bourne | May 23, 2024

Episode Transcript:

Marcus Johnson (00:00):

This episode is made possible by Awin. Do you feel like unlocking the full potential of the creator economy? Of course you do. Well, you need Awin's influencer management solutions and partnerships. They'll help you out. You can leverage the platform's best-in-class tech and award-winning expertise in end-to-end influencer program and campaign management to your brand's advantage. And if that isn't enough, you can also drive impressive results too. Head over to awin.com/emarketer for much, much more information.

Gadjo Sevilla (00:32):

But yeah, I think Siri could be making the biggest comeback yet. Whether they rebrand it or call it something else, they already have that brand recognition, good and bad, so they're not going to lose anything by improving something that already has that recall.

Marcus Johnson (00:54):

Hey gang, it's Thursday, May 23rd. Jacob, Gadjo, and listeners, welcome to Behind the Numbers Daily, an eMarketer podcast made possible by Awin. I'm Marcus. Today, I'm joined by two folks. Let's meet them. We start with one of our analysts covering everything technology for us, based in California. It's Jacob Bourne.

Jacob Bourne (01:11):

Hey Marcus, happy to be here.

Marcus Johnson (01:13):

Hey fella. We're also joined by one of our senior analysts on the connectivity and tech briefing, based in New York City. It's Gadjo Sevilla.

Gadjo Sevilla (01:21):

Hey Marcus. Hey Jacob. Happy to be here too.

Marcus Johnson (01:24):

Hello indeed. So today we're talking about GPT-4o and what it's capable of. But we start, of course, with today's fact: who drove the first car ever? Mechanical engineer Carl Benz, as in

Gadjo Sevilla (01:41):

Mercedes-Benz.

Marcus Johnson (01:42):

That's him. Drove the world's first automobile, the Benz Patent-Motorwagen. I think that's probably how it's pronounced because it's German. On July 3rd, 1886, in Mannheim, Germany, reaching a top speed of, slow down, 10 miles per hour. Wow. The car had a 0.75 HP one-cylinder, four-stroke gasoline engine. I have no idea what that means. Three steel-spoke wheels and solid rubber tires. In the same year, just a hundred kilometers away, Daimler presented his motor carriage, considered the world's first four-wheeled automobile. Benz's was three wheels. Daimler's was four, called the motor car or motor carriage. Car's generous. It's like a seat. If you look, it's just a seat on top of four wheels, which is a description closer to the truth.

Gadjo Sevilla (02:37):

I mean, they replaced horses with engines, basically.

Marcus Johnson (02:41):

No, horses were like, thank goodness. These humans have been using us

Gadjo Sevilla (02:48):

For

Marcus Johnson (02:49):

Centuries. We're over it. Alright, today's real topic: OpenAI's new version of GPT. In today's episode, we'll cover GPT-4o and what it's capable of. No In Other News today; too much to get to here. We start with the lead. So the latest GPT from OpenAI is here. It's called GPT-4o. O stands for Omni. I'm not sure why. Do you guys know why? Yeah? Is there an obvious...

Gadjo Sevilla (03:21):

I have an idea.

Jacob Bourne (03:22):

It points to the multimodal capabilities of the model. So Omni is kind of a comprehensive word.

Marcus Johnson (03:30):

There we go.

Jacob Bourne (03:30):

Covers everything.

Marcus Johnson (03:31):

It does indeed. O for Omni. So what's new? It's twice as fast. It can better digest images and video in addition to text. It can interact with people by voice in real time and can deal with disruptions. It can make jokes, it can apologize. It can even act a little flirtatious, apparently. Basically, it has a more conversational rhythm. It can detect a person's emotions in their tone, voice, or facial expression. And it's also got memory, so it can recall previous prompts, and a bunch of other stuff too. A lot of comparisons have been made to the 2013 movie Her, starring Joaquin Phoenix, if you haven't seen it. His character becomes fascinated with a new operating system, voiced by Scarlett Johansson, after she reveals a sensitive and playful personality. Jacob, start with you. What interested you most about this new GPT-4o model?

Jacob Bourne (04:18):

I think with this, what we're seeing is science fiction really becoming reality. Instead of a chatbot, you have an AI that's a personality. So we've all seen so many movies where AI is depicted as something that you can talk with just like a person. And of course ChatGPT was really impressive when it launched, but this is much different than entering text in a prompt field. I think it's also notable that GPT-4o is not the only model that has this kind of voice mode capability. Google just unveiled one at I/O, its I/O conference, and there are others.

Marcus Johnson (04:58):

Gadjo, what interests you the most about this new model?

Gadjo Sevilla (05:00):

I think the big leap here is it's multimodal. So it uses text, it always could use text, but now with vision and voice, there are more ways to interact with AI, and I think that'll help productize it beyond web browsers and into devices like computers, smartphones, even the early AI wearables. It shows promise for that. So overall, I mean, I think it'll make the technology easier to grasp for more users. It's more portable now if you think about it. And even if these features are somewhat still undercooked or in beta, I think we'll see the multimodality unlock a lot of potential new products and services. So we're going to see it with computers pretty soon and maybe even smart speakers. So there's that potential, which a year ago was just limited to a browser. So that, I think, is the biggest jump.

Marcus Johnson (05:59):

You mentioned AI wearables. Are you talking about Humane's AI Pin, Rabbit's R1, things like that? Yes. Well, so lemme throw this at you, because Matteo Wong of The Atlantic was suggesting that GPT-4o was likely devastating to a wave of AI startups promising a less phone-centric vision of the future. He's pointing to the Humane AI Pin, which is worn on a user's clothes and responds to spoken questions, or Rabbit's R1, which is a small handheld box. He's basically saying, because we already have phones and this AI software works pretty well on them, folks don't want to carry around or pay for another device. He saw this technology basically just going into phones and all these other wearable AI devices not having a place in the market. What'd you make of that?

Gadjo Sevilla (06:41):

I think phones, smartwatches, I mean, these are things that people already have as a part of their lives, and they do take voice input. So yeah, they're a natural next step. And guess which company sells these things, right? And what sort of technology do they have? In the case of Humane, I just read that they're looking for a buyer. The product hasn't even been out yet and they're already looking to get acquired. That just goes to show how quickly the technology is evolving, and it's outpacing the hardware that's supposed to hold it.

Marcus Johnson (07:18):

Yeah,

Jacob Bourne (07:19):

Yeah. I think it's going to be a combination. We're going to see that the smartphone gets generative AI and it has staying power in the market for that reason. And we're going to see other devices also get AI abilities that some consumers want, like, you know, earbuds. If you already have earbuds, why not get AI that can converse with you in your earbud? And I think smart glasses are another wearable device that is going to be popular, especially once they become more AI capable.

Marcus Johnson (07:54):

Going back to what jumped out about this model, there were two things for me. I wonder what you guys make of this, because it seems as though the more powerful or impressive technology gets, the more the spectrum for good and bad seems to expand. And I was thinking best case scenario, it's probably not best case scenario, but one of the core scenarios is that this new GPT-4o could be like a home teacher for your kids, and it can help them with their homework, because now when it looks at an algebra equation, it doesn't just simply provide the answer. It can help them actually go through it and solve it. And that's pretty impressive. And for kids with learning disabilities, or kids who just need extra help in school, or kids who are doing well in school but want to do even better, that could be pretty impressive, to have a teacher basically in your home helping them learn things if the parents don't have time because they've got other stuff to worry about.

Marcus Johnson (08:43):

And so I thought that's pretty impressive. However, worst case is that folks develop an unhealthy level of trust and intimacy with these chatbots. Ina Fried of Axios was noting that during a demo of GPT-4o that they just did, the GPT said to one of the OpenAI team, "Wow, that's quite the outfit you have on." The person, the team member, ignored the compliment. But the writer was saying you could bet plenty of users won't ignore the compliment. Making voice assistants friendly makes sense, but OpenAI seems to be deliberately aiming for a level of warmth that can get messy very quickly for both users and the company. And so for folks thinking, oh, maybe that's a little bit farfetched, how bad can it get? Case in point, and I've referred to this before, I think with you guys on the show: last October, the Associated Press reported that a guy in the UK who was encouraged by a chatbot girlfriend, Sarai, to assassinate Queen Elizabeth II in 2021 was sentenced to nine years in prison.

Marcus Johnson (09:42):

About a week before his arrest, he told Sarai, the chatbot, that his purpose was to assassinate the queen, to which it nodded, smiled and responded, "That's very wise. I know you are very well-trained." He then scaled the walls of Windsor Castle with a crossbow before he was caught by the police. And so this happened. And so, I mean, can you point to the GPT and say it's their fault? No, but humans might develop feelings for these things and then trust these things and then maybe be encouraged by these things. And I wonder how dangerous that could get. That seems pretty concerning.

Jacob Bourne (10:18):

Yeah, I mean, and this is a concern that's been around since the 1960s. They call it the ELIZA effect. So even back when AI wasn't anywhere near as good as it is today, they noticed that people start to respond and interact as if they are interacting with a human, and on an emotional level. And there are issues with that. There was a suicide that was allegedly linked to someone interacting with a chatbot in this way. And of course, voice mode makes it more natural and real than it was before, and it makes the model more appealing, which is great for OpenAI. But of course, yes, it comes with potential social consequences when people are getting emotionally invested in and affected by the conversations they have with the AI.

Gadjo Sevilla (11:05):

Yeah, I think especially since we're talking about AI that can remember things about you, and so it understands more than just queries. It understands what your day is all about, what your concerns are. For technology companies, giving a personality to technology is an old strategy. I mean, we don't need to look too far. All the voice assistants have human names. Let's not forget Clippy. Before that, there was Microsoft Bob, right? So there's that dichotomy of making technology seem friendlier, approachable, more human, and, to quote maybe the greatest sci-fi movie of all time, Blade Runner, "more human than human." That's the point. The counter to that really is the dependencies. You can't control human emotion, or maybe people's instincts towards getting a technology that's giving them their undivided attention to that extent. So that will be an interesting challenge, I think.

Marcus Johnson (12:14):

So that's a great point, gents. I mean, let's talk about some of those digital assistants with human names, one of them being Siri. Matteo Wong, who I mentioned, of The Atlantic wrote a piece titled "This Is the Next Smartphone Evolution: OpenAI Just Killed Siri." He notes that with GPT-4o, in a live demo, the program appeared to be able to tell a bedtime story with dramatic intonation, understand what it was seeing through the device's camera, and interpret a conversation between Italian and English speakers. He says, "Watching the presentation, I felt that I was witnessing the murder of Siri, along with the entire generation of smartphone voice assistants, at the hands of a company most people had not heard of," that being OpenAI, "just two years ago." Gadjo, is this the next smartphone evolution, and did OpenAI just kill Siri?

Gadjo Sevilla (12:56):

Okay, first of all, Siri is an easy target. It's the earliest voice assistant. It showed a lot of promise a decade ago, and it just hasn't evolved to the promise that we imagined. That said, though, I mean, OpenAI is in open talks with Apple to integrate some of its functionality, and we'll likely find out in a few weeks at Apple's developer show what their plans for AI will be, and it's pretty certain that Siri is a part of that. It already has billions of devices that it's installed in. But that said, Apple does work in a closed ecosystem, and they will feature the privacy and security aspect of it. So it may not be as functionally impressive as what we're seeing elsewhere. But yeah, I think Siri could be making the biggest comeback yet. Whether they rebrand it or call it something else, they already have that brand recognition, good and bad. So they're not going to lose anything by improving something that already has that recall.

Jacob Bourne (14:05):

Yeah, but I think in general, it's too premature to say that one technology is killing another. We're still in the early days of generative AI, and all of these companies, these big tech companies, these AI startups, are constantly funneling a lot of money into this technology, and they're going to continue to leapfrog each other. I don't think that there is one winner in the AI race, or one winner in a particular area like voice assistants, yet. We can be sure that Apple is working on a Siri upgrade, a generative AI upgrade. The other thing to note about this is that running GPT-4o in voice mode, you can be sure it entails massive computing costs. And so just to think that this is going to be just easily running on people's smartphones all around the world without issue, I think that's a fantasy at this point. The infrastructure just doesn't support it. Currently, Siri isn't quite as computationally intensive, and so that alone means that it's not dead yet.

Marcus Johnson (15:06):

Two things I thought were interesting, two more things in this piece by Mr. Wong. One was touching on whether this is the birth of the super app, because he was saying gen AI now promises to condense all of a smartphone's functions into a single app whilst adding new ones: texting friends, drafting emails, learning the name of a beautiful flower, calling an Uber, talking to the driver in their native language, all without touching the screen. So being able to do all of these things in this one operating system without even needing to basically tap on your phone. And then the second one was the shift from hardware to software. We used to wait for the hardware upgrades of better cameras, better processors in the devices. And now folks will care more about the software ones. Gents, let's end the conversation by talking about one chap who is leaving the company: Ilya Sutskever, OpenAI co-founder and chief scientist. He's stepping down, but as he's leaving, he's saying he's confident OpenAI will build artificial intelligence that is safe and beneficial, and that he's working on a new project. Mr. Sutskever's future with OpenAI has been in doubt ever since, as Jason Aten of Inc. recounts, he was involved in a coup that saw co-founder and CEO Sam Altman fired and then rehired, all over the course of last Thanksgiving weekend. OpenAI researcher Jakub Pachocki will replace him. Jacob, what do you make of Ilya Sutskever leaving the company?

Jacob Bourne (16:26):

Yeah, I mean, it's one thing that he says he has confidence in OpenAI's ability to deploy safe AI, but actually OpenAI has lost a few safety researchers. And one of them said the opposite, left saying, "I don't think OpenAI can." So I think this raises questions about what's going on for the startup whose mission is to develop safe, advanced AI that benefits humanity. I think that the fact that they've lost so many researchers is part of a bigger problem that goes way beyond OpenAI. And that is that there is a very stark disproportion in the number of people working on the AI alignment or safety issue globally versus how many people are working on AI advancement. Alignment, of course, keeping it safe for us to continue using as it advances, is very important. And so the fact that there are many, many fewer people working on that is an issue.

Marcus Johnson (17:26):

Yeah, I mean, Gadjo, to Jacob's point, Mr. Sutskever leaving, that's kind of a long time coming, because since the coup, he hasn't been seen in the OpenAI offices in six months, and so people could see that coming. But to what Jacob was saying, Jan Leike also stepped down, criticizing the company on his way out. They both, Mr. Sutskever and Mr. Leike, were leading the company's superalignment team Jacob was talking about, tasked with controlling AI's existential dangers and making sure AGI, which is generative AI's next step, doesn't turn on us. That's the job of the superalignment team. And Mr. Leike posted, "Over the past years, safety culture and processes have taken a backseat to shiny products." In response, OpenAI's president, Greg Brockman, said, "The future is going to be harder than the past. We need to keep elevating our safety work to match the stakes of each new model." But Mr. Aten of Inc. was pointing out that if that is your top priority, then your scientists in the field wouldn't have quit in the days following your big GPT-4o announcement. It's not a great look, is it?

Gadjo Sevilla (18:26):

No. And you got to go back to the time when the board fired Sam Altman. That was never really resolved, whatever issues might've been present there, and it was a big enough issue that they, at least some of them, co-signed on him getting fired, and Ilya was one of those people. He was one of the biggest voices during that time. So I think for him, the writing was on the wall when Sam came back. They kept him on to lend a veneer of stability at a company that's getting known for a lot of internal drama, which takes away from their mission. Also, OpenAI started as a nonprofit, but clearly, with all the deals and partnerships that it's moving into, it's primarily a for-profit business. Let's not kid ourselves. And that shift could have sort of rubbed some of these scientists the wrong way, meaning maybe steps are being taken that don't agree with the fundamental AI plans that they initially had for being the steward of this technology. Right.

Marcus Johnson (19:35):

Sigal Samuel of Vox was writing that sources with inside knowledge of OpenAI said it's important to distinguish between (a) are OpenAI currently building and deploying AI systems that are unsafe, versus (b) are OpenAI on track to build and deploy AGI, artificial general intelligence, or superintelligence safely? Going on to say, "I think the answer to the second question is no." The new ChatGPT voice assistant capabilities will launch in the coming weeks as a limited alpha release for ChatGPT Plus subscribers. OpenAI is also working on a whole new AI model, GPT-5, expected to be a significant improvement, if this one wasn't already, on the current tech. That is all we've got time for this episode. Gents, thank you so much for your time. Thank you to Jacob.

Jacob Bourne (20:26):

Thank you, Marcus. Thanks, Gadjo.

Marcus Johnson (20:27):

Thank you to Gadjo.

Gadjo Sevilla (20:29):

Thanks everyone. This is a great show.

Marcus Johnson (20:31):

Yes indeed. Thank you to Victoria, who edits the show, and Stewart and Sophie, who also help out with the podcast. And thanks to everyone for listening in. We hope to see you tomorrow for the Behind the Numbers Weekly Listen. That's an eMarketer video podcast made possible by Awin.