Interviewing Meta CTO Andrew Bosworth on the Metaverse, VR/AR, AI, Billion-Dollar Expenditures, and Investment Timelines

On June 20th, I interviewed Andrew Bosworth. “Boz” joined Facebook in 2006 as the company’s ~10th engineer, subsequently building the original News Feed, Messenger, Groups, as well as many early anti-abuse and infrastructure systems, and running the Ads and Business Platform product group. In 2017, he established and became Head of Reality Labs, which he continues to run, and in 2022, Boz became CTO of Meta overall.

In the interview, we discuss the three “epochs” in Meta’s Metaverse strategy, what it will take for total VR/MR headset sales to cross 100 million annually, the specs of his dream headset, Meta’s spending on Reality Labs, whether and when developers might get access to the Quest’s raw camera feed, the many inventions required to ship optical AR glasses, the role of AI, and more.

My new book, “The Metaverse: Building the Spatial Internet,” is now out (Amazon, Apple, Barnes & Noble, Bookshop). It is a 70% net-new update to the 2022 edition, which was personally blurbed by Mark Zuckerberg, Tim Sweeney, Reed Hastings, and more, became a national bestseller in the U.S., U.K., Canada, and China, and was named a Book of the Year by The Guardian, Amazon, The Economist’s Global Business Review, and other publications. More details at www.ballmetaversebook.com/.


Matthew Ball: I want to start with Meta’s vision for Reality Labs. Oculus VR was acquired in 2014 at roughly twice the price paid for Instagram two years earlier. In 2018, there's this now-famous internal memo to the Facebook board arguing that “the Metaverse is ours to lose.” By 2019, the division is spending $5 billion a year and the first Quest debuts. In 2020, Reality Labs is formalized and, by the fourth quarter, the division’s spend exceeds $10 billion on an annualized basis. Then in 2021, the company rebrands to Meta. Those are the events known to outsiders. From your perspective, as someone who joined Facebook in 2006, began to lead what would become Reality Labs in 2017, and by 2022 was the CTO of Meta overall, I'd love to know: what were the significant events internally that changed your or the company's perspective on the Metaverse and on the timing of head-mounted displays, and which led to this big public bet on the Metaverse?

Andrew Bosworth: There are a few events that precede my involvement at all. Famously, Mark does a demo where they've got an early version, I think, of Oculus Toybox, this VR experience where he's playing a little boxing game with little characters on a tabletop in front of him. At this point the demo is still duct-taped together; it’s early in the Oculus era. And at the end of the demo, Mark sets his controllers on the table, except the table is virtual, so of course he drops these one-of-one [Note: i.e. prototype units that had been individually manufactured and assembled] controllers on the ground, and is really sold on this being an important future platform. A platform where you can do things [that cannot be done another way].

I wasn't really involved in any of this. At this point, I was still working elsewhere in Facebook. And at the time of the acquisition, as you can imagine with every acquisition, you kind of create a little bit of a goal map with the acquired team and what [the company] hopes to accomplish and on what timelines. And looking back on that, let's just euphemistically say that they were a bit optimistic. They were optimistic pretty much across the board: on capability, on cost, on time to market, on adoption. I don't know if the document's ever been unearthed – I've never seen it, by the way – I just know that they were off by at least one if not multiple orders of magnitude on every single one of those dimensions.

“[At the time of its acquisition, Oculus VR] was optimistic pretty much across the board: on capability, on cost, on time to market, on adoption… by at least one if not multiple orders of magnitude on every single one of those dimensions.”

By the time I got involved, Mark [Zuckerberg] still very much believed in the technology, but I think he needed an independent assessment. Actually, I think part of the value that I brought to the situation was that I had not been involved at all. I'm somebody who has a history of learning quickly in new spaces, [and] I was going to give him an independent look.

And so we came in, and what I found was a team divided. At that point you had John Carmack and quite a few people who were really all in on mobile standalone headsets, driving the price down and really severing the need to have this expensive PC to attach to the thing. [Note: While most VR/MR headsets sold today are standalone devices, another approach, currently used by the Valve Index and PlayStation VR2, requires a wired connection to a high-end PC or PlayStation. This allows the headset to be much lighter and the “work” more complex, as it’s processed by a heavier, stationary device.] And the [other] group, which really was the original Oculus leadership team, still had this really clear vision of what I can only describe today as a peripheral, a really high-end peripheral [Note: This refers to a tethered connection to a PC or other computing device. Typically, this is called PCVR or tethered VR].

“Mark didn't get into this [Oculus or Reality Labs] to be in the peripheral business.”

And candidly, Mark didn't get into this [Oculus or Reality Labs] to be in the peripheral business. That's not a business of a size that's going to interest a company like Facebook was at the time, for long enough and for the investment required. And you realize that you're in an ecosystem battle on the PC that you're going to lose. You're going to lose it not just to Steam but also directly to indies. If you're just connected to this machine, the machine is the anchor point of all the experiences that you have there.

I actually didn't even have to engage at that strategic level, because just from a product perspective, I really believed we had to go [as a standalone device]. You had to get to this point where you were free of wires, free of an expensive ecosystem that was kind of fussy and hard to set up, that kind of thing. And so we did pursue both. We basically agreed to give Oculus Rift another generation, which was the Rift S, which we partnered with Lenovo on, and we took it over and built that out. We agreed to play out Gear VR, which I was pretty sure was underpowered, but we wanted to test it, because if Gear VR had been good enough, well boy, now you have a much more cost-effective path to bringing this mainstream. It wasn't. Though I think we did an admirable job with the device. I think people who got it had a good experience, but mostly in the world of media and that kind of thing. And that really paved the way, in my opinion, for Quest.

Quest was immediate: even before we launched it [Note: May 2019], when you used it, that was the thing. And before we had launched Quest 1, we already had Quest 2 in the works [Note: Quest 2 launched October 2020]. We kind of already knew what we wished we'd had in time for Quest 1. So I basically think of our epochs as: there's the initial epoch, which is a tremendous credit to the founding team, who created something from nothing and created a great enough experience, in particular some of the experiences that they pioneered, things like First Contact and Toybox, that told the story of this medium really beautifully.

Then [the next epoch] is grappling with the reality of production costs, go-to-market costs, consumer adoption, content strategy, and that takes you on a different path, which is now the standalone path. So then you kind of have this era where we're finalizing – let's give our last and best effort to PC – and then standardizing on standalone. And I suspect that at the end of my career – and I've had a career I feel very fortunate to have had – I will look back on the Quest 2 launch in particular with tremendous fondness. To launch a uniformly better product a year later at a lower price is just a special thing to be able to do. It's totally nuts, it's totally bonkers, especially in the world of consumer electronics.

So I think that for me was kind of the reveal of our strategy. This is the strategy, this is what we're doing, this is the excitement. But now you're in this: when is the platform fundamentally complete enough to just really bank everything on? Now we had a stable platform in the sense that we'd committed to developers from Quest 1 that we're going to be able to have this continuity across generations, and the answer to that is Quest 3. So I actually basically think of the first epoch as this kind of PC-attached epoch, which was really important; you actually don't get to where we are without it. Then there was the really tough period where it's like, we've got internal competition about what the future is going to be.

Then there's the build-out, the standalone platform era, which includes both Quest 1 and Quest 2, and which in my opinion ended with the launch of Quest 3. And Quest 3 is like: this is the platform. Sure, you can add eye tracking or not have eye tracking, you can add an AI assistant or not have an AI assistant. You can add and remove all these features. Those are optimizations. But the really good, color mixed-reality capability was the last missing piece that we had in mind, by the way, way, way back, at least in my mind, as a critical component to making this thing mainstream accessible and to unlocking all these additional use cases. And that's the era we are in now.

“I can't describe to you how tangible it has felt inside the company after the launch of Quest 3, a positive reception, positive in terms of momentum, sales, adoption.”

I can't describe to you how tangible it has felt inside the company after the launch of Quest 3, a positive reception, positive in terms of momentum, sales, adoption. And now, once you know that's your base platform, a ton of software work changes. Work where before, hey, let's not revisit the compositor, let's not mess with that because if the whole architecture changes next generation that will have been wasted work. Now you know what the architecture is going to be like from a functionality standpoint and you can really start to coalesce and combine and you build speed, like your software velocity goes way up. So I think we're now in this kind of interesting mixed reality era. We kind of renamed the team each time. It was Oculus and that was VR and now it's MR. And now we've got a real clear vision for at least the next several generations and the next set of work. It's still hard, still exciting, but really does feel like we're in a new era.

Ball: So let’s talk about this new era. PCs sell about 250 million units annually and peaked at 350 million. Smartphones peaked at around 1.5 billion and are now at about 1.2 billion. What do you think the biggest barrier is to getting to 100 million units a year for MR/VR headsets specifically (not glasses)?

Bosworth: Well, there are probably three answers, and I can't tell you that we know the priority order of the three.

One of them is definitely content. And by content, some people will hear me say it meaning games. That's not what I mean. I just mean stuff to do. These devices don't do anything on their own. I think there's been hardware that was really ahead of its time in this space, going back to the '90s. As recently as Magic Leap, the hardware wasn't the problem – there may have been [some] problems with the hardware – but that wasn't what caused the [core] challenge; it was just the lack of ecosystem and stuff to do. So that's one of them.

Price is certainly one [answer], in my opinion. And these [first two answers] are connected to a degree. Obviously the lower the price, the less value you need to add, but also the less people value it. And the higher the price, the more value you need to bring to it.

And then of course the last one is, I'll call it, accessibility. That's a bit of a loaded term, so let me explain what I mean. Everything from comfort to weight to motion sickness to accommodation for how wide or narrow your eyes are; the effect of it on hair or on makeup, styling; how much and how often it needs to be charged. The input methodology: does it require two hands? Can you do one hand? Can you do no hands? All of these things limit how many places and how often it can be used, and by how many people. But of course they trade off against cost, certainly, and some of them against value. You can make it lighter weight if you reduce the optical clarity, but now you've got a value problem. So there's a tight trade space for these things.

We're making progress on all of them, but it takes real invention. You mentioned earlier the amount of money the company has invested here, and of course you'll already have caught me cleverly replacing the word cost with the word invest. And that's how I see it. New technologies don't just come into the world ready to go. The supply chains have to be built, the manufacturing has to be built, the technology itself has to be developed, and of course that means you have to pay for the 10 wrong paths you took before you get to the 11th one that does work. And we obviously believe in that investment, and I think we will get to the numbers that we're talking about, where it's hundreds of millions of people who have access [to] and are using these technologies personally and professionally. But as much as I feel confident that I have a clear vision of the product and what the platform is going forward, there is still a lot of work to do.

“New technologies don't just come into the world ready to go. The supply chains have to be built, the manufacturing has to be built, the technology itself has to be developed, and of course that means you have to pay for the 10 wrong paths you took before you get to the 11th one that does work.”

Ball: I'm going to ask the unfair question. No hardware is ever complete – smartphones are still getting better each year – but what I really want to know is the following. You’re seven years into this current role leading Reality Labs. Another seven years in the future, maybe even 10, when you're thinking about an HMD, what comes to mind? I’m talking battery life, resolution, frame rate, weight, all of those things. What comes to your mind instinctively? Not to the hundred million threshold, but just where do you really think you can get and want to get?

Bosworth: One of the fun things we get to do in Reality Labs research is build these time machines, where you look at: what if we maxed out resolution? What if you just went to the absolute, 100%, ignored everything else? And these contraptions are unbelievable when you look at them, but then you get the experience and you have a good sense of it. So you have these kinds of little time machines for field of view, resolution, color depth, gamut, high dynamic range, brightness, all these different things that you can do. And what we're always looking for when we do those experiences is to parameterize them, so they let you experience it from full resolution all the way back down to present day. And you're just looking for curves and the shape of a curve. Is there a kink in that curve? Is there a place where the value function starts to plateau?

And not that it doesn't continue, not that people couldn't continue to observe improvements, but it becomes less critical for functionality. So when I'm looking seven years into the future, which is probably about as far as I can credibly glimpse anyway, we're thinking pixels per degree. You really want to get up to at least 45, which is where text gets really good. 60 is realistically probably half retina resolution, but you actually won't really be able to tell, for reasons I won't get into. So really you want to get up into the 50s-to-60s range on pixels per degree. It starts to get pretty good after 40. And we've seen that obviously with Varjo, Apple Vision Pro, a few that have done that, and you see what you have to pay to get that kind of resolution today, and what you sacrifice in field of view, in brightness, and a few other things. Those are real pay/trade decisions.
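
[Note: To make the pixels-per-degree (PPD) figures concrete, here is a rough back-of-envelope sketch. The per-eye resolutions and fields of view below are approximate public figures, not official specs, and dividing pixels by degrees gives an average PPD; center PPD on a flat panel runs somewhat higher.]

```python
def avg_ppd(horizontal_pixels: int, horizontal_fov_deg: float) -> float:
    """Naive average pixels per degree: per-eye horizontal pixels divided
    by horizontal field of view. Treat as a rough lower-bound estimate,
    since center PPD on a flat panel is higher than the average."""
    return horizontal_pixels / horizontal_fov_deg

# Approximate per-eye figures (illustrative, not official specs):
for name, px, fov in [("Quest 3", 2064, 110), ("Apple Vision Pro", 3660, 100)]:
    print(f"{name}: ~{avg_ppd(px, fov):.0f} PPD")
# Boz's thresholds: text gets really good around 45 PPD,
# and ~60 PPD is roughly half "retina" resolution.
```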

So you want to get there, and you want to get there with a decent field of view. It really does detract from immersion if you get much smaller, honestly, than where we are with Quest 3; you start to notice. Quest 2 is a little smaller, so if you get smaller than that, you start to notice it. You have these cells in your eyes that detect vertical changes in motion. So you kind of want to be at a wide enough field of view that you're not constantly observing the edge of it. And I actually think a taller field of view matters more than a wider field of view for immersion. Certainly a wider field of view is more important for us as a species in terms of information density, because our eyes do see more horizontally. But vertical is a good way of convincing you that you're immersed in a space, in a way that's kind of deceptive.

But of course you need to have the compute to run all this stuff. Thermally, in my opinion, the device has to be standalone, no wires [Note: Apple’s Vision Pro has a pacemaker-like external, corded battery]. I don't want to be seen adding a bunch of things. I'm not saying there's no purpose for it; you can certainly imagine industrial applications. The joke, by the way, of the PC/standalone divide is that we've actually now built the best, most popular PC headsets, and it's the same one. You just run a wire to it or use Air Link.

So for me, I do think comfort is a tremendously important part of it. You'd like to see the weight [of the device] pushed down, and in particular, it's not just the raw number of grams; that's not the most important thing. It's how it balances on the head, how close the optical stack can be to the eyes. That lever that runs from the edge of your nose out to the edge of the device controls the comfort on your nose, on your cheeks, the amount of pressure on your forehead depending on what kind of strap you're using; that's determined there.

So you can move that stack in a little bit, which is one of the big shifts from Quest 2 to Quest 3: that distance came in quite a bit, and so it feels more comfortable when you're doing those things. So I'd like to see the weight reduced by 100 or 200 grams by that point. I think the audio is on a good track. You'll get increasingly great stereoscopic audio. There are some limits to what you can do when you're going open ear, so you can do closed ear. We can give people the option over time; we kind of do today with the headphone jack.

Framerate. So here's another case where 120 Hz feels pretty good. Now obviously the gamers pushing 240 Hz or more on PC games today would disagree, and they love that buttery smoothness, and I respect that. In seven years, I'm pretty skeptical that is the trade we're going to make relative to [making other product improvements]... One of the challenges for field of view, for example, is that there are quadratically more pixels at the edge of your field of view than at the center. And those pixels are much less than quadratically valuable to you. They're the least important pixels. So you're spending quadratically more on pixels that are significantly less valuable, which is one of the reasons it's hard to justify pushing field of view in some of these limited compute envelopes.
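
[Note: A simplified pinhole-projection model (my assumption, not a description of Meta’s renderer) shows why widening the field of view at constant center sharpness is quadratically expensive: the rendered image’s half-width grows with tan(FOV/2), so pixel count grows with its square.]

```python
import math

def relative_pixel_cost(fov_deg: float, base_fov_deg: float = 100.0) -> float:
    """Relative pixel count for a square planar render covering fov_deg at
    the same center pixels-per-degree as base_fov_deg. Half-width scales
    with tan(fov/2); area, and thus pixel count, with its square."""
    return (math.tan(math.radians(fov_deg / 2)) /
            math.tan(math.radians(base_fov_deg / 2))) ** 2

for fov in (100, 120, 140):
    print(f"{fov} deg -> {relative_pixel_cost(fov):.1f}x the pixels of 100 deg")
# ~1.0x, ~2.1x, ~5.3x: the extra pixels land at the periphery,
# where Bosworth notes they are the least valuable.
```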

I think of frame rates similarly. I'm not saying 240 isn't better than 120, just like I think 60 PPD is better than 40 PPD. But within these limited compute budgets, you're not seeing huge generational improvements on that. What you're seeing is a lot more cleverness. For your gaze, I think foveated rendering holds real promise for unlocking the ability to drive resolution higher. Can we imagine a world where the display is capable of a higher frame rate, but you're making sacrifices elsewhere in the system – you're not able to multitask, you're booting into a single experience? Yeah, I can imagine those things, especially for industrial use cases. A lot of that depends on the panels, though. And again, I don't think we're going to optimize for it, is what I would say.
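
[Note: A toy sketch of the gaze-contingent idea behind foveated rendering. The band boundaries and shading rates are arbitrary illustrative values, not Meta’s pipeline.]

```python
def shading_rate(eccentricity_deg: float) -> float:
    """Toy foveated-rendering schedule: full shading resolution near the
    gaze point, progressively coarser shading with angular distance.
    Returns the fraction of full shading rate to spend at that angle."""
    if eccentricity_deg <= 5:
        return 1.0       # fovea: shade every pixel
    if eccentricity_deg <= 15:
        return 0.5       # near periphery
    if eccentricity_deg <= 30:
        return 0.25
    return 0.125         # far periphery

# Crude average saving over a 0-50 degree radius, weighting each band
# by its angular width (a real estimate would weight by solid angle):
bands = [(0, 5), (5, 15), (15, 30), (30, 50)]
avg = sum((hi - lo) * shading_rate((lo + hi) / 2) for lo, hi in bands) / 50
print(f"average shading rate: ~{avg:.2f} of full resolution")  # ~0.33
```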

By the way, funny story about 120 Hz, which is a popular feature, and even 90 before that: we weren't running at that rate, and then one of our engineers figured out the panel was capable of it, and John Carmack was outraged. If the device is capable of it, we should just unlock it for consumers who want it. And we did. It's been very popular. So some of these stories are legend, where you build for a spec and a certain power envelope, but we do want to give consumers the access, the ability to choose how they dedicate that power. More flexibility is one of the things that we continue to build into these systems. To come full circle, what I really hope is the case in seven years is that you have a bigger set of headsets to choose from, all capable of running the ecosystem, that are adapted to your use case.

If you are a gamer and you're used to an ASUS ROG monitor that's pushing 240 Hz, cool: is there an equivalent headset that's going to give you that experience? It's going to make sacrifices someplace else to do it, but that's a choice that you should hopefully be able to make. Because in seven years we are not going to be free of these fundamental trade-offs of weight, cost, performance, etc. Really it's pick one and a half of these three. It's not even choose two.

Now Available: “THE METAVERSE: BUILDING THE SPATIAL INTERNET,” the fully revised and updated edition of my nationally bestselling (US, UK, Canada, China) and award-winning book (Best of 2022 by Amazon, The Guardian, FT China, The Economist’s Global Business Review, Barnes & Noble). Buy at Amazon, Apple, B&N, more.

Ball: Apple recently announced that they were going to allow developers to access the Vision Pro’s raw camera feed, though only for devices managed by enterprise accounts and for apps distributed through that enterprise’s internal systems. What's Meta’s perspective on providing any, or selective, access to the raw camera feed on Quest devices? Is that planned? How do you think through the security and capability trade-offs?

Bosworth: Well, I certainly hope people respect and appreciate the fact that our privacy stance is further out than Apple's is here. I think it shouldn't be lost on us that we've staked out this position and Apple is the one kind of eroding it from the market standpoint. Of course, that's all kind of tongue in cheek. The serious answer is, we can all imagine phenomenally useful use cases if a developer has direct access to the camera. You're a mechanic and you're looking at an engine that you're not familiar with, and a developer could build the types of tools to help you see the overlay, see the schematic, even diagnose the problem. Meta is never going to build that. And without the ability for a developer [to do this]… and the developer can't [pre-]upload to us every possible configuration of image that we might see for us to build a classifier around, that's not credible either… [then] that's a use case that goes underserved if you don't build out the capability [yourself].

“I certainly hope people respect and appreciate the fact that our privacy stance is further out than Apple's [on raw camera access]… We’ve staked out this position and Apple is the one kind of eroding it from the market standpoint.”

At the same time, when this technology is so new in the world, we also do want to make sure that people, bystanders, feel comfortable. If someone chooses to wear the headset, for example, on an airplane, [bystanders should] understand what the implications for them are. Now we're finding more and more techniques to address this. There are different ways you can do bystander signaling. I also think consumer markets are just more and more familiar with the technology. So I made a little joke about Apple earlier; let me turn it around and say I think the Apple Vision Pro is great for our whole industry. There's really broad-based understanding of these devices now, and the more they're understood, the less fear there is, I think, that a bystander is going to feel, or that you're going to feel bad putting it on for fear that somebody else would have a negative reaction to it or be so surprised by it.

So I think we in our industry do always think of technology almost on its own terms. That's not fair. Technology exists inside the context of a society. And the more comfortable society is with a technology, the more you can execute freely. And the less comfortable society is, the more cautious you need to be as you bring that technology to market. Because if you're not careful, you could actually impede its adoption over time. So I think we took a pretty conservative stance there. I think everyone probably understands why we took that conservative stance. I stand by that as the right thing to do at the time. But of course, at the same time, we are thrilled with the potential applications that mixed reality presents if we can unlock that functionality for a developer, with full consent from the consumer and an understanding audience contextually around the person using it. So we'll keep looking at how we see consumer comfort, let's call it, with this technology evolving, and what kind of power that unlocks.

Ball: It seems like one of Apple's big bets, perhaps one that reflects their internal culture, or which is trying to address the very stigma you were speaking to earlier, was EyeSight [the Vision Pro’s externally facing display, which shows a reproduction of the user’s eyes in real time]. This has a huge impact on the device’s cost, the weight, the battery draw, everything on that device. Do you think it's worth those trade-offs? Do you like the feature?

Bosworth: I'll get myself in trouble if I don't mention it: we're pretty sure we invented that. My team in Reality Labs Research, up in Redmond, Washington, put a demo of that into the public domain a while back at a conference, and we talked about it at Connect and gave a demo of it. So it's something we've been playing with for a while, but you've really nailed it. For me, the cost/weight/value trade-off is really not there. It's not even that great of an experience of that person's eyes, in my opinion. And I've seen the ones this week at AWE, which are better; I think actually ours was also better, but it also used even more expensive, higher-resolution panels and these kinds of things.

I don't find it a good cost-benefit trade for the consumer, even in terms of the device causing people around it to feel comfortable. I don't think people who are around the device feel that significantly better talking to the person because of EyeSight. But I don't hate it. I think it's great to have a range of devices in the market that make different choices and different trade-offs, and see how people react to them. For us, we've invested so much of our money and time in trying to make this thing affordable and accessible. And accessible not just in terms of the cost, but also in terms of the weight and the comfort. It's a tough, tough trade to make.

Ball: I want to move to the even tougher tech. I've seen you and Meta’s Head of AR Glasses Hardware say that for true optical AR glasses to go mainstream, there are four or six, maybe even seven, different NTIs required.

Bosworth: Yeah.

Ball: Can you explain what NTIs are and which NTIs you think are required for that vision?

Bosworth: An NTI is a new technology initiative – that's an industry term of art. If you are the product team, the hardware product team, there [is] technology that has been integrated successfully before; you can either take it off the supply chain or you've used it before. And that comes with a lot of comfort and [known] parameters. You can test it live, you can build it out. And then there's new technology, which just adds risk to the program [if you introduce it]. You don't know if it's going to work. You don't know: is the power characterization correct? Is the performance correct? And so [these technologies] add risk.

As you go deeper, NTIs really have a lot of different flavors. There are NTIs where it's like, hey, it's a technology that's out there, but it's never been in this specific domain before. There's technology where this is the first generation of the technology. But NTIs keep going all the way down to what you would call advanced development, which is just out of research. We have a proof of concept, we know it can be done, but it's never been productized, miniaturized, done at cost, done efficiently. Sometimes you're in the lab and you do it, and you think, “great, we've got it.” It's like, well, yeah, you did it with a hundred watts and it cost a thousand dollars. And now I need you to do it with a hundred milliwatts and at $10.

So there's this advanced development phase that comes before something is even an NTI. And of course before that, it's research. From the standpoint of someone like CK, they're all NTIs, because she's on the product integration side. And so for us, when we think about augmented reality displays: you have to generate the photons, and you have to generate the photons in a very, very small amount of space. If you're thinking about consumer glasses, that's how much space you have.

You have to generate them very efficiently, because you don't have a lot of battery there. You have to be able to generate them very brightly, because the outside world as you're walking around could have a million-to-one contrast ratio depending on where you are, in sunlight, this kind of thing. Certainly at least 10,000 to one. So you need to be able to overwhelm daylight with brightness in the right place. You have to be thermally efficient, because even if you did all the rest of it but it built up a lot of heat, that's also not okay, because you're right next to someone's face. And so these are tremendous challenges. And there are lots of different systems that you can imagine doing this.
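
[Note: A sketch of the brightness budget he is describing. Every number here is an illustrative assumption: the ambient luminances are rough textbook figures, and the contrast target and combiner efficiency are placeholders.]

```python
def required_source_nits(ambient_nits: float, target_contrast: float = 3.0,
                         optical_efficiency: float = 0.10) -> float:
    """Illustrative see-through AR brightness budget: the virtual image
    should exceed the ambient scene luminance by some contrast factor,
    and the light engine must be brighter still by whatever fraction
    the waveguide/combiner loses along the way."""
    luminance_at_eye = ambient_nits * target_contrast
    return luminance_at_eye / optical_efficiency

for scene, ambient in [("indoor", 100), ("overcast", 1_000), ("sunlit", 10_000)]:
    print(f"{scene:>8} (~{ambient} nits): source ~{required_source_nits(ambient):,.0f} nits")
# The sunlit case lands in the hundreds of thousands of nits,
# which is why daylight-capable light engines are so hard.
```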

We've been investing for a long time in microLEDs. With microLEDs, one of the things that's tricky is the wavelength of red. Red light, true red light especially, but even red-ish light, is really hard to generate. Why? Because red is a very long wavelength, and you have a very, very small place in which to generate that wavelength. So you're literally trying to create sub-micron mirror structures that allow the light to form and be long enough before it's emitted. And then it needs to be emitted in a very, very focused way, as collimated light. It can't just be going all over the place, because that's not efficient. So great, once you've got this light source, you need to make it in every color, and of course the efficiency characteristics are different for each, so you need to be able to manufacture that. Oh, by the way, great, congratulations, you've done it. Now you have this tiny thing, and we're using [atomic force] microscopy, which drags [a tip over] the surface of a material, an atom [at a time], and then measures the displacement to build a 3D map of it.

So great, you've got one. Cool. How do you get a bunch of them in a row on a thing that you get power to? So it’s research on research. It's research to do the thing at all; then to manufacture the thing is an entirely incremental research program. And then, okay, let's say you've got this light source. You need to couple it into something that sends light to your eyes, but your eye is a lens, so you can't just have it hit your eye at one point. You have to do what's called pupil replication. That light needs to simultaneously hit your eye in a bunch of places, which will then be refocused into a single image on the retina. Pupil replication, by the way, really sucks from an efficiency standpoint, because if you had a thing that was 1,000 nits but you've got to do 10X pupil replication, due to the law of etendue you just cut your efficiency to 1/10, because each of those pixels, each of those photons, needs to take one of 10 paths.
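
[Note: The etendue point in one line of arithmetic, using his illustrative numbers.]

```python
def per_path_luminance(source_nits: float, replication_factor: int) -> float:
    """Pupil replication splits the same photons across N exit paths so the
    eye can catch the image from many positions; to first order, each path
    carries 1/N of the light."""
    return source_nits / replication_factor

print(per_path_luminance(1_000, 10))  # 1,000-nit source, 10x replication -> 100.0
```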

So [you have] these really complicated waveguide designs, and there's a bunch of different ways to do it. They're all using the principle of the total internal reflection of light: the idea that when light hits a boundary at a low enough angle relative to the material, it reflects 100% of its energy and stays on its path. It's how fiber optics work. It's why, if you get your face really close to the surface of water, you can see a reflection in it. These all work through the total internal reflection principle. So you want to have a material with a very high index of refraction, because the difference between the indices of refraction of two materials determines how much of the light is reflected. So you need these materials that have a very high index of refraction. And the higher the index of refraction of the material, not only the more efficiently it uses the light that you generate, which is hugely important for pupil replication and for thermals, but you can also potentially get to a wider field of view.

You can use glass, you can use lithium niobate, you can use silicon carbide, you can use novel materials. So now we have an entire materials research team trying to come up with high-index-of-refraction materials; materials science is a huge part of it. But even if you have the material, you have to design the waveguide, and there are lots of different styles of waveguides. You can do surface relief gratings, you can do volume Bragg gratings, you can do holographic gratings, you can do PV... We've got a huge range of those, and they have very different trade-offs. Some of them are easy to manufacture, some of them are harder. And so there's just this tremendous amount of depth. So, just talking about these small components, it's a tremendous amount of research that goes into getting to a point where you have high-quality optical pass-through.
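
[Note: The total-internal-reflection constraint he describes can be computed directly from Snell’s law: sin(θc) = n_outside / n_waveguide. A higher-index material traps a wider cone of ray angles, which is part of why exotic materials buy field of view. The refractive indices below are approximate visible-light values.]

```python
import math

def critical_angle_deg(n_core: float, n_clad: float = 1.0) -> float:
    """Critical angle for total internal reflection at a core/cladding
    boundary: rays hitting the surface beyond this incidence angle
    reflect 100% of their energy and stay guided."""
    return math.degrees(math.asin(n_clad / n_core))

# Approximate visible-light refractive indices (illustrative values):
for name, n in [("glass", 1.5), ("high-index glass", 1.9),
                ("lithium niobate", 2.3), ("silicon carbide", 2.6)]:
    print(f"{name:>16} (n={n}): critical angle ~{critical_angle_deg(n):.0f} deg")
# Lower critical angle -> a wider range of ray angles stays trapped,
# i.e. the guide can carry a wider field of view.
```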

Well, what controls the resolution? One of the things is the pixel pitch. Great. So you've got these microLEDs, and they're already two microns total. Well, cool: you need to make them half that. It's like, oh man, okay. And it's not twice the challenge; it's a hundred times the challenge. And so we don't have any sense of the ceiling on this thing. I think there's research for a lifetime to continue to pull the thread here. However, we do feel like we're at a cool juncture, where things that 10 or 15 years ago were just impossible – we just didn't have a path – we see a path now. And we've got, internally, a set of AR glasses that are mind-blowing, just stunning, absolutely stunning. And to do that – it's a research vehicle, a product development vehicle – we had to take a critical variable out of the equation, which I haven't even talked about yet, which is cost.
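
[Note: A quick feel for the pixel-pitch math. The 4 mm panel width is hypothetical, chosen only to show the scale; the two-to-one micron step is the one Bosworth mentions.]

```python
def emitters_across(panel_mm: float, pitch_um: float) -> int:
    """Emitters that fit across a panel of the given width at a given
    pixel pitch. Halving the pitch doubles linear resolution, i.e.
    quadruples the emitter count per unit area."""
    return int(panel_mm * 1_000 / pitch_um)

for pitch_um in (2.0, 1.0):
    print(f"{pitch_um} um pitch, 4 mm panel: {emitters_across(4.0, pitch_um)} px across")
# 2000 vs 4000 emitters across a few millimeters of light engine --
# and Bosworth notes the shrink is ~100x the difficulty, not 2x.
```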

To go, yeah, that's right, you have to do all this stuff and then figure out how to do it for the right price, one that a consumer can afford to pay. We're making tremendous progress. And what's cool is that as an industry, we're actually all taking very different approaches, I'd say: Apple with DLP, Google with Raxium [Note: an acquisition] and quantum dots, and we're working on microLEDs. So there's a lot of parallel pathing in the industry on actually very different approaches, which is exciting, because you figure that increases the chances that at some point at the end of this, we're going to have a pretty good and hopefully cost-effective working version of this.

I didn't even get into things like spatialized audio, the custom silicon required to run the graphics pipeline, the wireless connectivity that you need to do this all efficiently. If you think lighting up a pixel is expensive, try sending a bit over the radio; that's even more expensive. There are just all these incredibly expensive things that need to fit into this tiny package. It's really exciting. But it is probably the greatest challenge that our industry has approached in, certainly, my lifetime and my generation. It's as exciting as you'd expect, and also as hard.

Ball: So you've hit on a few different things. You've talked about the timelines you think about when planning future devices, the pipeline of NTIs, the philosophical difference between “costs” and “investments.” Since 2019, Reality Labs has spent about $65 billion. The accumulated accounting loss is about $55 billion. Can you give some sense of what we've seen from that versus what we're yet to see? How much is in products that we haven't had a chance to touch yet, and how much of that is actually allocated towards projects five or ten years out?

Bosworth: I probably can't give you finer detail on the specifics. What I can tell you is there's a pretty general principle that I've operated with for a long time in terms of how you allocate [investment dollars], which is: I always want to invest on a portfolio basis.

You want to have some construct of, hey, call it half, you want your energy to go into things that are creating real value right now, that are tangible for you. And people forget that inside of this investment that we've made over this period of time are things like content. And content is an investment that you make, whether it be acquiring tremendously talented teams like Beat Games, which made Beat Saber; or bringing your own second-party ecosystem into the game, where you're giving them early access in exchange for them working on new features; or just giving some money to third-party developers to bring their titles and make sure that they feel confident enough in the returns that they're going to get on their effort.

And so that's one of the pieces that's in there, and that's the thing that you feel – not that you feel confident about any one given title, but you feel confident that investing in content for the platform is a really worthwhile turn over the long period. You're trying to bootstrap a two-sided ecosystem. That's the product at the end of this for our company. The product at the end of this actually isn't a piece of hardware or a technology; it's an ecosystem, a two-sided ecosystem that everyone is contributing to and that you have a position in, where you're able to connect those two sides together, the consumers and the developers.

If you have too much of your portfolio going into future stuff, you can get pretty unmoored pretty fast from what consumers really want, what they value, what's real. You can kind of tell yourself a story, and I guess you'll find out at the end, but that feels pretty risky. So you want at least half your energy going into things that are relatively more tangible. That doesn't mean they're in market right now; they might even be a couple of years away. It's just that these are based on a really tangible understanding of value.

Now, if you have a very mature product, your portfolio probably is more than 50% – it's probably 80% – near-term or present work, and maybe you have one incremental piece in a small, very future-looking portfolio. That's not us. We're obviously significantly oriented towards the long-term future. But then you look at... and these numbers all sound really big, certainly to the layperson. Look at the investment that companies, including ours, are making in AI. Look at what investment people would've been willing to make to be in, for example, an iPhone-like position, or even an Android-like position. Android's a good example of one that I think is under-credited. Because it's not something that Google charges for, people kind of think, oh, the value is zero. You and I know better than that.

The value of Android to Google is immense in terms of the position it's given them and the amount of information that they're able to use there and the services they're able to provide consumers through other channels. So immense value for them. And you see a little sense of how immense that value is in the amount they pay Apple every year for the position they have there. And so, what is it? $20 billion a year, or something like that. So when you look at those numbers and you say, okay, well how much would you pay to be in a position to control your destiny? I think we're spending well within our means here. I think we're doing pretty good.

“How much would you pay to be in a position to control your destiny? I think we're spending well within our means here. I think we're doing pretty good.”

Ball: My final set of questions extends into AI. One, I'd love to understand how the advances in AI over the last two years have changed your roadmap. What's the role of AI in the Metaverse? And then, what gets you, Andrew Bosworth individually, as a consumer, as a hobbyist or developer, excited in AI?

Bosworth: AI has been a really delightful development for us. It's one that we were invested in for a long time. I think you know this: the fundamental AI research team reported to me until very recently, and I really enjoyed my work with that team. And I'm still, as you can imagine, very, very closely connected to the work that the company's doing in the space. Given the last answer, where I talked about all the tremendous headwinds that we have in developing this technology (we're making good progress, but it's just hard and heavy), AI was a breath of fresh air. It came in much faster than expected, and much more generally useful than expected, with a really tangible set of tools to improve some of the hardest problems that we had yet to crack. When you're walking around with an augmented reality headset, how do you interact with the device?

You need the device to have some, for lack of a better term, common sense. I can't have to teach it about every single thing that comes along; I will never get there. The breadth of human experience out in the world is too great. That's what self-driving cars have taught us. With self-driving cars, it turned out the number of random conditions that you find in traffic is so big, the long tail is so long, that when you're trying to do it by rote, you almost don't get there. So for us, in the headset, we've always had in our architecture diagrams a concept of something we sometimes call the conductor: the idea of an agent that has a sense of your attention and your intention, what you're trying to accomplish, and helps you out.

So if I've got something, a piece of text, in front of my face, and then you walk up to talk to me, it automatically knows, hey, we should move that so you can make really strong eye contact with this person, because that's what you're doing. And then when you walk away, we can bring it back, and have that feel magical. And we assumed we would do it with heuristics and do our best to develop a model. And we have a great data set, our Ego4D data set, that we developed with our Project Aria research vehicle, to try to get sets of data to start to approximate this. It was a very self-driving approach, where you're like, all right, we can get to 80% pretty well, and there'll be a 20% where the consumer's going to have to intervene on their own behalf, and we'll just try to learn over time.
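
[Note: A toy, rule-based version of the “conductor” behavior he describes – the pre-AI, heuristic 80% he mentions. All function names and conditions here are hypothetical.]

```python
def conductor_step(panel_shown: bool, person_approaching: bool,
                   user_looking_at_person: bool) -> str:
    """Toy rule-based 'conductor': dodge virtual UI out of the way when a
    real person approaches and has the user's attention, then restore it
    when the conversation ends. A stand-in for the heuristics-then-ML
    approach described in the interview."""
    if panel_shown and person_approaching and user_looking_at_person:
        return "move_panel_aside"   # clear line of sight for eye contact
    if not panel_shown and not person_approaching:
        return "restore_panel"      # conversation over; bring content back
    return "no_op"

print(conductor_step(True, True, True))    # -> move_panel_aside
print(conductor_step(False, False, False)) # -> restore_panel
```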

I think we'll do much better than 80% now; I think we'll do a lot better than that. Inside of mixed reality, you think about Horizon, this tremendous user-generated construct of world building. And if I think about the early era, you talked about this really cool and important memo... now I'm using the word memo, like we use memos. This document that Jason Rubin wrote about the Metaverse – when he wrote it, I was the first person to read it, I think, and I was like, this is a seminal document. All these ideas were kind of in the soup, they were kind of floating around, but they hadn't formed into a cohesive, coherent construct. And he did that. If we made a mistake from that point, it's that we spent the next couple of years lowering the floor for what it took to create – and we succeeded in doing that – but unfortunately the ceiling was also low.

So, cool, anyone could build something, but nobody could build anything good. That was the mistake we've spent the last two years of Horizon development correcting, to great success. That thing is really having a tremendous moment right now for us on the platform, and on mobile platforms, actually. So that was what we got wrong, but the idea was the right idea. Well, man, AI changes it completely. If you have the ability, in plain language, to describe a scene, and to do so iteratively, so you can edit the scene as you go – this is a tremendous strength of large language models today. They can't yet do it in 3D, let alone in 4D with animations over time, but we're working on that research, we have a team doing that, and we're making progress. What a tremendously exciting way to lower the barrier to entry for people to create a space that they could be proud enough of to invite people over to, or that is receptive enough to them that they use it regularly because it feels really, really responsive to them.

For me, AI was like hitting warp speed on a bunch of concepts that we always had. We always wanted to get there, but it was going to be a slog; it was just going to be this long, slow, linear thing, and suddenly we're seeing paths to getting there much, much sooner than we ever dreamed possible. So it's been tremendous. And then, to the roadmap: these are software concepts, but of course it affects your hardware too. The Ray-Ban Meta glasses are certainly a significant improvement over the first-generation Ray-Ban Stories, and we're very proud of them. But the AI, and particularly the multimodal AI, has been a real revelation on what a device, even one with no screen, could be capable of. And so now, obviously, we believe in these products. We built them, we're selling them, we're excited about them. But we always figured, okay, this is a good stepping stone to get people comfortable with this type of device.

I think they're going to be much more useful than we realized. How often does this happen? You build a consumer electronics device, and there's a use case, a powerful one that drives sales, that you didn't even conceive of when you finalized the hardware. That never happens, man. That is manna from heaven. And so that's the kind of thing that AI represents for us. From a personal standpoint, I think that one of the big things that we have yet to crack as an industry, but that we're all excited about, is personalized AI.

Today we cheat, right? We preload the context window with some information, but that's expensive and slow, and the context windows are tight; if you fill them up with this, then they have shorter memories. And even everything you can put in there is not enough. I think we haven't gotten there yet, and there's a bunch of reasons for that, but we're all kind of looking in the same direction. I think personalized AIs that are fine-tuned, maybe, on my own personal experiences and communication, with all respect paid to protecting privacy and how we use data, I think that's a very exciting thing. I think the ability for that to be acutely useful to me, as opposed to generally useful, is going to be a big leap forward.
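
[Note: A minimal sketch of the context-window “cheat” he describes: rank stored personal snippets against the current query and pack the most relevant into a fixed token budget. The names (embed, memories) and the budget are hypothetical, not any real Meta or LLM-vendor API.]

```python
from typing import Callable, List

def preload_context(query: str, memories: List[str],
                    embed: Callable[[str], List[float]],
                    token_budget: int = 1000) -> str:
    """Rank personal memory snippets by similarity to the current query and
    pack the most relevant into a fixed budget of context-window tokens."""
    def dot(a: List[float], b: List[float]) -> float:
        return sum(x * y for x, y in zip(a, b))

    q = embed(query)
    ranked = sorted(memories, key=lambda m: dot(embed(m), q), reverse=True)
    picked, used = [], 0
    for m in ranked:
        cost = len(m.split())  # crude stand-in for a real tokenizer
        if used + cost > token_budget:
            break
        picked.append(m)
        used += cost
    return "\n".join(picked)  # prepended to the prompt ahead of the query
```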

Follow-Up URLs: Boz interviewing Reality Labs’ CTO Michael Abrash (September 2023); Zuckerberg with Morning Brew on Metaverse and AI (February 2024); Boz interviewing me! (February 2024)

Note: This interview has been edited for clarity.

Now Available: “THE METAVERSE: BUILDING THE SPATIAL INTERNET,” the fully revised and updated edition of my nationally bestselling (US, UK, Canada, China) and award-winning book (Best of 2022 by Amazon, The Guardian, FT China, The Economist’s Global Business Review, Barnes & Noble). Buy at Amazon, Apple, B&N, more.
