The DeanBeat: Nvidia CEO Jensen Huang says AI will auto-populate the 3D imagery of the metaverse



It takes AI to create a virtual world. Nvidia CEO Jensen Huang said this week during a Q&A at the GTC22 online event that AI will automatically populate the metaverse with 3D imagery.

He believes that AI will take the first step in creating the 3D objects that populate the vast virtual worlds of the metaverse—and then human creators will take over and refine them as they see fit. And while that’s a pretty big claim about how smart AI will be, Nvidia has the research to back it up.


Nvidia Research is announcing this morning a new AI model that could make the massive virtual worlds created by a growing number of companies and creators easier to populate with a wide variety of 3D buildings, vehicles, characters and more.

This kind of mundane imagery represents a huge amount of tedious work. Nvidia said the real world is full of variety: streets are lined with unique buildings, with different vehicles whizzing by and diverse crowds passing through. Manually modeling a 3D virtual world that reflects this is extremely time-consuming, which makes it difficult to fill out a detailed digital environment.

It’s this kind of task that Nvidia wants to make easier with its Omniverse tools and cloud service. The company hopes to make life easier for developers when it comes to building metaverse applications. And auto-generated art, as we’ve seen this year with DALL-E and other AI models, is one way to lighten the burden of building a universe of virtual worlds like those in Snow Crash or Ready Player One.

Nvidia CEO Jensen Huang speaks during the GTC22 keynote.

I asked Huang in a Q&A with the press earlier this week what could make the metaverse arrive faster. He hinted at the work of Nvidia Research, although the company did not disclose it until today.

“First of all, as you know, the metaverse is created by the users. And it’s either created by us by hand or created by us using AI,” Huang said. “And in the future, it’s very likely that we’ll describe a feature of a house or a feature of a city or something like that. And it’s like this city, or it’s like Toronto, or it’s like New York, and it creates a new city for us. And maybe we don’t like it. We can give it additional prompts. Or we can just keep hitting ‘Enter’ until it automatically generates one that we want to start from. And then from that, from this world, we’ll modify it. So I think AI to create virtual worlds is happening as we speak.”

Details of GET3D

Nvidia GET3D is trained using only 2D images and generates 3D shapes with high-fidelity textures and intricate geometric details. These 3D objects are created in the same format used by popular graphics software applications, allowing users to immediately import their shapes into 3D renderers and game engines for further editing.

Generated objects can be used in 3D representations of buildings, outdoor spaces or entire cities, designed for industries such as gaming, robotics, architecture and social media.

GET3D can generate a virtually unlimited number of 3D shapes based on the data it is trained on. Like an artist transforming a piece of clay into a detailed sculpture, the model transforms numbers into complex 3D shapes.

“At the heart of this is the very technology that I talked about just a second ago, called large language models,” Huang said. “To be able to learn from all of humanity’s creations and to be able to imagine a 3D world. And so from words, through a large language model, will one day come out triangles, geometry, textures and materials. And from there, we would modify it. And because nothing is pre-baked and nothing is pre-rendered, all this physics simulation and all the lighting simulation must be done in real time. And that’s why the latest technologies we’re creating around RTX neural rendering are so important. Because we can’t do it with brute force. We need the help of artificial intelligence to achieve this.”

With a training dataset of 2D car images, for example, it creates a collection of sedans, trucks, race cars and vans. When trained on animal images, it comes up with creatures such as foxes, rhinos, horses and bears. Given chairs, the model generates a selection of comfortable swivel chairs, dining chairs and recliners.

“GET3D brings us closer to democratizing AI-powered 3D content creation,” said Sanja Fidler, vice president of AI research at Nvidia and head of the Toronto-based AI lab that created the tool. “Its ability to instantly generate textured 3D shapes could be a game-changer for developers, helping them quickly populate virtual worlds with varied and interesting objects.”

GET3D is one of more than 20 Nvidia-authored papers and workshops accepted to the NeurIPS AI conference, which takes place in New Orleans and virtually, Nov. 26-Dec. 4.

Nvidia said that although previous 3D generative AI models were faster than manual methods, they were limited in the level of detail they could produce. Even recent inverse rendering methods can only generate 3D objects based on 2D images taken from different angles, forcing developers to create one 3D shape at a time.

GET3D, by contrast, can produce around 20 shapes per second when running inference on a single Nvidia graphics processing unit (GPU). It works like a generative adversarial network for 2D images, while generating 3D objects. The larger and more diverse the training dataset it has learned from, the more varied and detailed the output.
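To put the quoted throughput in perspective, here is a minimal back-of-the-envelope sketch in Python. The figure of roughly 20 shapes per second on one GPU comes from Nvidia's claim above; the function name `seconds_to_generate` is hypothetical, and the linear scaling across several GPUs is an idealized assumption that the article does not make.

```python
def seconds_to_generate(num_shapes: int,
                        shapes_per_second: float = 20.0,
                        num_gpus: int = 1) -> float:
    """Estimate wall-clock seconds needed to generate num_shapes.

    shapes_per_second defaults to the ~20 shapes/s per GPU quoted above.
    Linear scaling with num_gpus is an assumption made for illustration.
    """
    if num_shapes < 0 or shapes_per_second <= 0 or num_gpus < 1:
        raise ValueError("invalid shape count, throughput or GPU count")
    return num_shapes / (shapes_per_second * num_gpus)


if __name__ == "__main__":
    # Rough time to fill a scene with different numbers of objects.
    for count in (1_000, 100_000, 1_000_000):
        minutes = seconds_to_generate(count) / 60
        print(f"{count:>9,} shapes: about {minutes:.1f} minutes on one GPU")
```

At that rate, a thousand objects would take under a minute on one GPU, which is the scale of saving the article is pointing at.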

Nvidia researchers trained GET3D on synthetic data consisting of 2D images of 3D shapes captured from different camera angles. It took the team just two days to train the model on approximately one million images using Nvidia A100 Tensor Core GPUs.

GET3D gets its name from its ability to Generate Explicit Textured 3D meshes, meaning that the shapes it creates come in the form of a triangle mesh, like a papier-mâché model, covered with a textured material. This allows users to easily import the objects into game engines, 3D modelers and film renderers, and then edit them.
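To make "explicit textured triangle mesh" concrete, here is a minimal Python sketch of such a mesh and how it might be written out as text. It is not GET3D's code: the names `TexturedMesh`, `to_obj` and `make_quad` are hypothetical, and Wavefront OBJ is chosen only as one widely supported interchange format, since the article does not name the format GET3D exports. A real export would also reference a material file and texture image, which are omitted here.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class TexturedMesh:
    """A triangle mesh with one texture coordinate per vertex (illustrative only)."""
    vertices: List[Tuple[float, float, float]]  # 3D positions
    uvs: List[Tuple[float, float]]              # texture coordinates in [0, 1]
    faces: List[Tuple[int, int, int]]           # triangles as 0-based vertex indices


def to_obj(mesh: TexturedMesh) -> str:
    """Serialize the mesh as Wavefront OBJ text."""
    lines = []
    for x, y, z in mesh.vertices:
        lines.append(f"v {x} {y} {z}")
    for u, v in mesh.uvs:
        lines.append(f"vt {u} {v}")
    for a, b, c in mesh.faces:
        # OBJ indices are 1-based; each corner is written as "vertex/uv".
        lines.append(f"f {a + 1}/{a + 1} {b + 1}/{b + 1} {c + 1}/{c + 1}")
    return "\n".join(lines) + "\n"


def make_quad() -> TexturedMesh:
    """A unit square built from two triangles, standing in for a generated shape."""
    return TexturedMesh(
        vertices=[(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (1.0, 1.0, 0.0), (0.0, 1.0, 0.0)],
        uvs=[(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)],
        faces=[(0, 1, 2), (0, 2, 3)],
    )


if __name__ == "__main__":
    print(to_obj(make_quad()))
```

Because the geometry is stored as explicit vertices and triangles rather than an implicit field, a game engine or modeling tool can load it and edit it like any hand-built asset.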

When creators export the shapes generated by GET3D into a graphics program, they can apply realistic lighting effects as the object moves or rotates in a scene. By incorporating another AI tool from Nvidia Research, StyleGAN-NADA, developers can use text prompts to add a specific style to an image, such as turning a rendered car into a burned-out car or a taxi, or turning an ordinary house into a haunted one.

The researchers note that a future version of GET3D could use camera pose estimation techniques to allow developers to train the model on real-world data instead of synthetic datasets. It could also be improved to support universal generation, meaning developers could train GET3D on all kinds of 3D shapes at once, instead of having to train it on one category of objects at a time.

Prologue is Brendan Greene’s next project.

So AI will generate worlds, Huang said. These worlds will be simulations, not just animations. And to handle all this, Huang foresees the need for a “new type of data center in the world.” It is called a GDN, not a CDN. It’s a graphics delivery network, battle-tested through Nvidia’s GeForce Now cloud gaming service. Nvidia has taken this service and is using it to build Omniverse Cloud, a suite of tools that can be used to build Omniverse apps anytime, anywhere. The GDN will host cloud games as well as the Omniverse Cloud metaverse tools.

This type of network could provide the real-time computation that the metaverse requires.

“It’s an interactivity that’s essentially instantaneous,” Huang said.

Are there any game developers asking for this? Well, actually, I know one who is. Brendan Greene, the creator of the battle royale game PlayerUnknown’s Battlegrounds, called for this kind of technology this year when he announced Prologue and then unveiled Project Artemis, an attempt to create a virtual world the size of Earth. He said it could only be built with a combination of game design, user-generated content and AI.

Well, damn it.
