Dr. Jim Fan – Leading the way in AI
Earlier this year, I led the Voyager project, and there is no better game than Minecraft for the endless creativity it supports. Minecraft has 140 million active players and is insanely popular because it’s open-ended and does not have a fixed storyline for you to follow. You can do whatever your heart desires in the game. When we set Voyager free in Minecraft, we saw that it was able to play the game for hours on end without any human intervention. The video here shows snippets from a single episode of Voyager, where it explores terrains, mines materials, fights monsters, crafts recipes, and unlocks new skills, all without any pre-programming. This is what we call lifelong learning, where an agent is forever curious and forever pursuing new adventures. Voyager scales up massively on the number of things it can do, but it still controls only one body in Minecraft.
The Future of AI in Robotics
Can we have an algorithm that works across many different bodies? Enter MetaMorph, an initiative co-developed at Stanford. MetaMorph is a foundation model that can control not just one but thousands of robots with very different arm and leg configurations. It has been shown to control thousands of robots as they climb stairs, cross difficult terrain, and avoid obstacles. Compared to Voyager, MetaMorph takes a big stride towards multi-body control. Taking everything one level further, we have Isaac Sim, Nvidia’s simulation effort. Isaac Sim can accelerate physics simulation to a thousand times faster than real time and can procedurally generate worlds with infinite variations. If an agent can master 10,000 simulations, it may very well generalize to our real physical world, which is simply the 10,001st reality. The goal is to train a foundation agent by scaling it up massively across lots of realities.
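To put the "thousand times faster than real time" figure in perspective, here is a minimal back-of-the-envelope sketch. It only multiplies wall-clock time by the speed-up quoted above; the function name `simulated_hours` and the `num_worlds` parameter are illustrative assumptions, not part of Isaac Sim or any Nvidia API.

```python
# Rough arithmetic only: how much simulated experience fits in one wall-clock day
# at the 1,000x speed-up quoted in the text. Nothing here calls a real simulator.

SPEEDUP = 1_000  # simulated seconds per wall-clock second (figure from the text)


def simulated_hours(wall_clock_hours, speedup=SPEEDUP, num_worlds=1):
    """Hours of simulated experience gathered in the given wall-clock time.

    num_worlds is a hypothetical knob for worlds running in parallel; the
    source does not say how many run at once, so it defaults to 1.
    """
    return wall_clock_hours * speedup * num_worlds


one_day = simulated_hours(24)      # simulated hours per wall-clock day
years = one_day / (24 * 365)       # the same amount expressed in simulated years

print(f"{one_day:,.0f} simulated hours, about {years:.2f} simulated years per wall-clock day")
```

Under these assumptions, a single simulated world accumulates 24,000 hours of experience per day of real time, which is roughly 2.74 simulated years.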
AI in the Real World
“I believe in a future where everything that moves will eventually be autonomous, and all the AI agents, physical or virtual, will just be different prompts to the same foundation agent.” That is the vision of Dr. Jim Fan, one of my favorite AI researchers, who recently announced that his TED Talk is finally live. In it, he proposes a recipe for the foundation agent, a single model that learns how to act in different worlds. Dr. Jim Fan was one of the people behind Voyager, the open-ended embodied agent built on large language models. It was one of the first big AI research studies that blew my mind and opened me up to what was possible. The impressive thing about Voyager was that it was able to learn continuously and keep progressing, unlike other models that plateau. Another groundbreaking achievement by Dr. Jim Fan’s team at Nvidia was teaching a robot hand to spin a pen in its fingers, something that was considered near impossible. They used GPT-4 to write reward functions for various robots in Nvidia’s simulation, and the results were better than those produced by human experts.
The Hyperbolic Time Chamber
This next generation of robotics training in time compression chambers, where time runs faster, draws comparisons to the hyperbolic time chamber from the TV show Dragon Ball Z. Nvidia is not just a chip company; it is a world leader in AI research, simulation, and robotics. The company’s ability to simulate realities that function similarly to our own raises questions about our base reality and whether we are just automatons running and learning new skills for the benefit of others in a higher reality. The deeper we go with simulations, the more we have to question the nature of reality.
In Conclusion
Dr. Jim Fan and his team at Nvidia are leading the way in AI research, simulation, and robotics. Their groundbreaking work with Voyager, MetaMorph, and Isaac Sim is pushing the boundaries of what is possible in AI and robotics. The future they envision, where everything that moves will eventually be autonomous and AI agents will be prompts to the same foundation agent, is both exciting and thought-provoking. As we continue to advance in AI, simulation, and robotics, we must keep questioning the nature of reality and the potential implications of our creations.