The Daily: AI models that can “reason”, the push to invent a new AI device, and the possibility of “Safe Superintelligence”

On today’s podcast episode, we discuss what to make of former Apple Chief Design Officer Jony Ive working on a new AI device, what an AI model with “reasoning abilities” can actually do, and whether Ilya Sutskever’s new AI startup can create safe superintelligence. Join host Marcus Johnson, along with analysts Jacob Bourne and Grace Harmon, for the conversation.

Subscribe to the “Behind the Numbers” podcast on Apple Podcasts, Spotify, Pandora, Stitcher, YouTube, Podbean, or wherever you listen to podcasts. Follow us on Instagram.

Episode Transcript:

Marcus Johnson (00:00):

Let eMarketer analyze your data. How come? Well, our custom reports marry original survey data with your company's proprietary data and perspective to create branded analytical reports for your marketing campaigns. Lead the narrative around key topics. Visit emarketer.com/advertise to learn more.

Jacob Bourne (00:18):

So I think that models that can really think critically and do more advanced mathematics, for example, could help spur agentic AI, AI that can operate autonomously on behalf of humans in the background.

Marcus Johnson (00:41):

Hey gang, it's Thursday, September 26th. Grace, Jacob, and listeners, welcome to the Behind the Numbers Daily, an eMarketer podcast. I'm Marcus. Today I'm joined by two people. We start with one of our technology analysts based in California. We call him Jacob Bourne.

Jacob Bourne (00:58):

Hi Marcus. Thanks for having me today.

Marcus Johnson (01:00):

Hey fella. Yes indeed. Thanks for being here. We're also joined by one of our analysts who writes for our technology briefing based in California as well. Her name is Grace Harmon.

Grace Harmon (01:10):

Hi Marcus. It's nice to be here.

Marcus Johnson (01:11):

Hey, thank you for hanging out with me and Jacob for today's episode. What do we have in store for you? We're talking some AI. But first, the fact of the day. What was the first city to reach a population of 1 million in the world, ever? Guesses?

Grace Harmon (01:30):

In the world?

Jacob Bourne (01:30):

1 million.

Marcus Johnson (01:33):

Yeah. Think of a very popular city in history.

Grace Harmon (01:37):

Rome.

Marcus Johnson (01:37):

Yes. Bang. Good guess.

Jacob Bourne (01:41):

I was going to say that. I was going to say that.

Marcus Johnson (01:44):

You probably weren't. Rome, in 133 B.C., according to Encyclopedia Britannica, was the first to hit 1 million. London reached the mark nearly 2,000 years later, around 1801, according to the UK census, and 1 million people called New York City home from 1875, according to Boston University. As of 2016, there were over 500 cities in the world with a population in excess of 1 million, according to the UN. The UK has one, bless our little heart: London, of course. Nearly two, but not quite. America has nine, and China has the most with over 100. So one-fifth of the cities with over a million people are in China. Anyway, here's today's real topic: the push for a new AI device and AI models that can now reason.

(02:50):

All right, folks, so let's talk about AI devices. So former Chief Design Officer at Apple Jony Ive has confirmed that he's working with OpenAI CEO Sam Altman on an AI hardware project. According to Alex Cranz at The Verge, Mr. Ive's team includes 10 folks, including Tang Tan and Evans Hankey, two key people who worked with him on the iPhone. Mr. Cranz notes that last year the device was rumored to be inspired by touchscreen technology and the original iPhone. But few specifics have been nailed down, and Mr. Altman recently confirmed that it is not a phone that they are working on. Grace, I'll come to you first because you wrote about this for us. What's your take on this announcement that former Apple design wiz Mr. Ive is working on a new AI device with OpenAI?

Grace Harmon (03:38):

Yeah, I mean there's not a lot of detail out right now. We haven't seen any design, and there aren't any details about when it's going to be coming out. It just seems like something that is going to be exciting for people that are really gen AI fanatics or are daily active users of these platforms. I guess I would say, early on, that there is so much on-device AI on most smartphones right now that it might not be something that is necessary for everyone.

Marcus Johnson (04:03):

Yeah. Yeah, that has always been the question I've had: "Do we need one?" Well, first of all, it feels like everyone who's tried has struggled. Other AI-centric devices, Jacob, they haven't exactly flown off the shelves, right?

Jacob Bourne (04:20):

Humane's AI Pin is a notable case that actually did quite poorly. And I think it's not surprising. Device sales in recent years have ebbed and flowed, and I think tech companies are really looking at AI as a means to boost device sales, and something new is what they're going for. And I think the fact that we haven't heard details is unsurprising. Because what is it going to look like? What's the form factor here? And of course they haven't figured it out, because it's a really difficult thing. If they're trying to replace the smartphone, you really have to overcome the habituation of billions of people in order to do that. Because it's everybody's go-to device we have in our pockets. It does everything we need to do. It's like a little computer you have on you, it's not conspicuous. And then of course there's a slew of wearables; some perform well, some don't.

(05:11):

And of course all of them are getting AI enablement. And they want to come up with something new. And I don't think that there's any silver bullet here. And I think probably the best path forward is for them to release something that complements the smartphone instead of replacing it, kind of part of an overall, ever-expanding ecosystem of AI-enabled devices. But still, as Grace pointed out, that's not easy either, because then are you just duplicating what your smartphone can already do with another device, and do people really want to spend money on that? This is going to be an uphill battle for them.

Marcus Johnson (05:42):

Really?

Jacob Bourne (05:43):

I think the marketing as well as some really exciting features are really what's needed for it to be successful.

Grace Harmon (05:49):

On the design side too, if they're wanting this to be similar to a phone, just in terms of design, it can't be as big as a phone. I guess I'd start with that. There are some things there where, if it is the size of your iPhone, no one's going to want to carry that around, and the price needs to be lower than an iPhone's, because it doesn't have everything that an iPhone can do.

Jacob Bourne (06:09):

Right. And you have other AI-powered smart glasses that are also kind of emerging. And maybe that's what they're going to develop: AI-powered smart glasses.

Marcus Johnson (06:21):

So let me just go through a few of the devices. So you mentioned Humane's wearable AI Pin, and there was a Verge article. I think, Grace, you might have noted this: they were saying that since the AI Pin's release in April, about one third of all units sold have been returned. So that's not going great. There's the Rabbit device, a little kind of box device, which was written off as a gimmick by Wired, The Verge, CNET, and others. Then you come to smart glasses. So you've got Snapchat, they're trying to crack the smart glasses code, and have done for a while. They have a new proprietary OS to power their fifth generation of AR Spectacles. Amazon's Echo Frames received little attention. But then you come to the Ray-Ban Meta smart glasses. Their first generation sold only 300,000 pairs in 18 months, and less than 10% of purchases were still being actively used two years later. But we'll see for the next version. Are they the company? Because it seems like they might be onto something. There's been a bit of buzz about them recently with the latest or next generation of those glasses.

Jacob Bourne (07:23):

Meta, I don't think, released official numbers about the second iteration of the smart glasses, but they said it performed better than the first lineup. Yeah, I think what's great about the Meta Ray-Bans is that they're priced at $299, $300. I think that's a relatively good target price point for something that's really new to consumers, where someone might not be sure how much they're going to use it or not. So it's a bit of a purchase risk. So I think that's one good thing. I think the other huge selling point is that they're stylish, which is something that many of the other smart glasses out there are not. That's another big thing with anything that's not in your pocket: it has to look good, whether it's on your wrist or in your ears or on your face. And so that's another thing.

Grace Harmon (08:10):

I also don't know that there's a winning formula for use cases. If we're looking at what these guys are planning, it would have kind of one single purpose. And just because it only has one single purpose doesn't mean it won't do well. Because again, coming back to the Humane AI Pin, one of the biggest issues was just the hardware. It overheated, it gave [inaudible 00:08:31] responses. There are some other products in the market right now that also have single use cases, like the NotePin, and I don't think there are any sales figures on that at this point, but what it can do is record and transcribe; that is it.

Jacob Bourne (08:44):

Right. And people are pointing out that, "Well, smartphones can do that too, so why would you need that?"

Grace Harmon (08:49):

Exactly. There's free apps.

Marcus Johnson (08:52):

Yeah. Yeah, before the form factor gets settled upon, folks might want to figure out what we are going to use this dedicated AI device for. And again, to use your word, Grace, is there a use case people need help with? In terms of price, there is one final device I'll mention: the $99 AI Friend. It's kind of an Apple AirTag-shaped device that folks wear around their neck, like a necklace, and it's powered by Anthropic's Claude 3.5 LLM.

Jacob Bourne (09:19):

One final thing is just looking at Apple's historical success with the iPhone. And it targeted a number of things: providing privacy features, a premium product that is part of a greater ecosystem, and it basically got a huge following of dedicated users. And I think that even if this Ive-Altman device isn't something that is completely novel, if it can provide great privacy and great interoperability, those factors could be enough to maybe get some traction.

Marcus Johnson (09:53):

Yeah.

Grace Harmon (09:54):

I would have, I guess, one last thing to say about that, which is I think that the price is going to be very tricky. Because if it's too high, people are not going to want it. New phones, I'm in need of a new iPhone right now, and they're really expensive. If it's too low, it's going to be really problematic for them, because powering ChatGPT is really expensive. I think it was SemiAnalysis that found it's potentially up to $690,000 per day just to operate ChatGPT. And if they're expanding it to even more platforms, that's even more operating costs. They have to find the right balance. I don't know exactly what that would be, but I think that might be a make or break.

Jacob Bourne (10:33):

Yeah, I agree with that, Grace. And I wouldn't be surprised if we see Altman want to take another loss-leader approach, just like with ChatGPT, where it was free initially. Of course these won't be free, but they might be priced at a loss initially to get people to buy it. And then that could change with future generations.

Marcus Johnson (10:53):

Yeah.

Grace Harmon (10:54):

I think a subscription model is pretty likely.

Marcus Johnson (10:56):

Let's move from the hardware to the technology that's going to be powering whatever device comes out of these efforts. "OpenAI just released o1, a preview of its first model with reasoning abilities," writes Kylie Robison of The Verge. "The Strawberry model, as it's known, is better at writing code and solving multi-step problems than previous ones because it's using a think-then-answer approach." Ms. Robison explains that, "OpenAI taught previous GPT models to mimic patterns from its training data. But with o1, it trained the model to solve problems on its own using a technique known as reinforcement learning, teaching the system through rewards and penalties. It then uses a chain of thought to process queries, similarly to how humans process problems, by going through them step by step," she notes. Jacob, how does an AI model with reasoning abilities change things?

Jacob Bourne (11:54):

Yeah. Well, first we need to see how well this one performs in the wild when it's fully released, so that remains to be seen. But actual reasoning skills, I think, could change everything, really. So far we've seen impressive AI models, but really they're creative and very linguistically skilled, while they've been shown to be not great at math, for example, not great at critical thinking and multi-step reasoning. And I think that lack of critical thinking ability really makes them more prone to hallucinations, as well as kind of limiting the kinds of use cases you can get out of them. So I think that models that can really think critically and do more advanced mathematics, for example, could help spur agentic AI, AI that can operate autonomously on behalf of humans in the background. We might see that become more of a realistic technology. And ultimately I think critically thinking AI would definitely lay the foundation for artificial general intelligence.

Marcus Johnson (12:59):

I'm wondering how the showing-its-workings thing changes things. Because OpenAI said they've designed the interface to show the reasoning steps as the model thinks. However, it doesn't really show its workings, does it? Because the full, detailed chain of thought that it uses to work something out is by and large hidden from folks; otherwise competitors would be able to steal their secret sauce.

Jacob Bourne (13:23):

And what's interesting is OpenAI was kind of threatening users who were trying to figure it out. In other words, they are being very protective about attempts to jailbreak it to find out how it works. Which is unsurprising for OpenAI; they've been pretty tight-lipped about their models. It also could be in part that, just like with all advanced AI models, they don't fully know how it works yet. You can't fully trace that pathway between input and output. There's this black box element to it.

Marcus Johnson (13:54):

Yeah. Yeah. Grace, it's interesting because I've never really... We've been so conditioned to type something into whatever and get an instant response. And it wasn't until I was reading an article by Matteo Wong of The Atlantic, basically asking, "How powerful can LLMs, large language models, be when you give them time to think?" And this model, o1, is given a limited amount of time to process queries. So it can say, "I'm running out of time, let me just get you an answer." However, Mr. Wong was pointing out that the more time o1 was given to respond to a question, the better it performs. And we don't always need answers immediately; we've just become accustomed to that. But I'm wondering what it's going to look like when folks start to realize that they can sacrifice speed for quality and accuracy.

Grace Harmon (14:39):

I mean, I think that's a game changer in terms of how users will interact with it, because no one's very patient anymore with technology. I'm not. If Google Search is taking too long, I will just, in my head, think that something's broken and reload the page. I don't wait that long. So I think that this is a game changer in that users are being posed with exactly that proposal, which is: are you willing to wait longer for a better response? And I think that also kind of highlights some of the issues that maybe exist with ChatGPT, which is prone to hallucinations. There have been some points where critics or users were really pointing out some of the very, very inaccurate, sometimes dangerous outputs it was giving.

(15:23):

So yeah, I think OpenAI is posing the question, just, "Would you wait longer?" And I think some people maybe not, maybe they're just not patient enough. I know that I think it's about a 20- to 30-second response time; that's not that long, but it might be too long for some people. I think that this puts pressure on other companies to put out reasoning models more quickly as well, because as we focus more and more on AI, and as AI adoption increases at businesses, chatbots as the beginning level might not be enough.

Marcus Johnson (15:53):

Yeah, it's also interesting because it can ask for some direction. At the beginning of the chain of thought it may say, "I could do this or that, what should I do?" And so the human in the situation could help guide the AI to the answer that it wants. The problem here, though, is taking more time to answer means more power, which means more cost. And Mr. Wong of The Atlantic was noting that o1-preview's output is roughly four times more expensive than GPT-4o's, raising the stakes of how soon AI can be profitable, if ever.

Jacob Bourne (16:28):

Yeah, and I think the speed is really important here. And part of cutting the power and the costs on the back end really is about chips: Nvidia and AMD and the other chipmakers helping to get that cost down. But as far as just the speed from the end user's perspective, it depends: if the output is helping the user save time, well then it's worth the wait. If 20 minutes saves you an hour, well then yeah, you're going to wait that 20 minutes, of course.

Marcus Johnson (16:57):

Let me throw this at you both real quick before we move to our last question. In this Atlantic article I was reading, it was basically saying (it didn't make this leap, but I kind of did) that describing these new models as having human characteristics might not be the best marketing play for consumers. Because in the piece it was saying, "OpenAI said its model uses a chain of thought. AI search startup Perplexity says its product understands you. Anthropic has described its leading model, Claude, as having character and a mind." And Google says its AI can reason. People are already becoming increasingly nervous about AI, and I don't think humanizing it is going to alleviate those concerns. Pew Research found that from 2021 to 2023, the share of folks who said that they were more concerned than excited about AI in daily life went from 37% to 52%. Those more excited than concerned fell from 18% to 10%. So what's your take on them trying to humanize these models to make them feel more user-friendly, gentler?

Grace Harmon (18:00):

I think that the five levels or five steps that OpenAI has laid out, chatbots, reasoners, agents, innovators, organizations, can read as the steps to AGI; they also can read as the roadmap to developing AI that can completely replace a human worker. That last step, organizations, is AI that can do the work and tasks of an entire organization. So I think it depends on maybe your pessimism, maybe your concerns, but I think that it can be viewed as two different roadmaps.

Marcus Johnson (18:36):

Let's end the episode by talking about another AI company, which has just been announced: Ilya Sutskever's AI startup, which raised over $1 billion. Dan Primack and [inaudible 00:18:49] were writing in an article about this that the monster investment in the new company, called Safe Superintelligence, or SSI, reflects the pedigree of the company's co-founders, who include former OpenAI chief scientist Ilya Sutskever, fellow OpenAI vet Daniel Levy, and former Apple AI chief Daniel Gross. The new company says, "Our singular focus means no distraction by management overhead or product cycles. And our business model means safety, security, and progress are all insulated from short-term commercial pressures." Mr. Sutskever had concerns over the pace of development at OpenAI when he was on their board. Grace, to go to you first for this one, how realistic is safe superintelligence, which is the name of the company, but also what they're trying to do?

Grace Harmon (19:32):

I think part of this endeavor is that you are trying to get AI that understands the consequences of its own actions, to create really, really, really safe AI. And I don't know if our current understanding and the current limitations of AI will get us there very quickly. But this is a really important thing to focus on. I think that reflects, like you said, both the pedigree and the positive notoriety of the founders, and also the growing concerns that everyone has about safe AI. So I would say in some ways it is an insurmountable endeavor to guarantee quickly. But if the company keeps working at it and if investors are patient, it seems like something really important.

Jacob Bourne (20:17):

Yeah, certainly important. I think the really interesting thing here is you have a combination of two things that are both nowhere near existing. The first thing is obviously superintelligence. Even artificial general intelligence, which is not quite the same thing, is a ways away. And the second thing is this notion that people could have control over superintelligence and trust it, when really the term superintelligence implies that it's smarter than the smartest humans, which seems like a contradiction: to say that a human could control something that's far smarter than it. So I think the needle that they're trying to thread here is to say, "Okay, we can build a superintelligence that's benign." In other words, "It can outsmart us, but it's not going to, because it doesn't want to, because we built it that way. It's not inclined to do anything unsafe."

(21:05):

And I think, yeah, how could you possibly guarantee that, is the question. And I think that's really the ultimate goal of this initiative. And I think it's important not because either of these things, a superintelligence or even a safe one, is imminent, but because tech companies are pouring billions of dollars into trying to create a superintelligence. And there's concern that there's a disproportionate, or a lack of proportionate, investment in the safety aspect of it. And so even if a superintelligent AI isn't possible, I think what we can be sure of is that AI is going to continue to get more advanced. And as it gets more advanced, the safety concerns increase. And so you really want to see a proportionate amount of safety research to match the amount of investment that's going on in AI advancement. I think this initiative is a step in that direction.

Marcus Johnson (22:01):

It doesn't even need to get to superintelligence for it to be dangerous, right?

Jacob Bourne (22:04):

Right.

Marcus Johnson (22:06):

This company appears well-intentioned, but Kelsey Piper of Vox was saying that, "It's the intelligence part that's the double-edged sword. Technology is rarely bad, it's just that it can be used for good or bad." And Sigal Samuel of Vox was noting that, "OpenAI's o1 is the first model to score 'medium' risk in its capabilities with chemical, biological, radiological, and nuclear weapons." Meaning, while it's not capable enough to walk a complete beginner through developing a deadly pathogen, the evaluators found that it can help experts with the operational planning of reproducing a known biological threat. Yeah.

Grace Harmon (22:44):

I'd also say, outside of maybe ethical concerns, we're also seeing the stakes for, I guess, bad-acting AI grow. We've spoken before about the proposed bill in California. There's the AI Act in the EU. The actual concrete stakes for having AI that is not safe are growing. So it's not just, again, an ethical benefit of focusing on things like, say, superintelligence. There's now a financial benefit. That could be something that isn't an immediate return on investment for investors, but it protects their investment.

Jacob Bourne (23:17):

Yeah. And going back to this idea of agentic AI, AI agents, again, they don't have to be superintelligent. If they're built in a way that they can take action, even if it's booking a flight online or doing something else in the background without a person actually clicking a button and telling it to do that, that creates a whole host of safety risks in and of itself.

Marcus Johnson (23:45):

All right folks. Well, that's where we have to leave today's episode, but thank you so much for hanging out with me for the conversation. Thank you to Grace.

Grace Harmon (23:52):

Thank you Marcus. Jacob, it was nice to be here.

Marcus Johnson (23:55):

Thank you to Jacob.

Jacob Bourne (23:55):

Pleasure, Grace. Pleasure, Marcus. Thanks for having me.

Marcus Johnson (23:57):

Yes indeed. Thank you to Victoria, she edits the show. Stuart runs the team. Sophie does our social media. Thanks to everyone for listening in. We hope to see you tomorrow for the Behind the Numbers Weekly Listen. That's an eMarketer video podcast. You can watch that on YouTube or Spotify now if you would like, or of course you can just listen to it the usual way.

 

"Behind the Numbers" Podcast