Google’s new robotics AI can run without the cloud and still tie your shoes

We sometimes call chatbots like Gemini and ChatGPT “robots,” but generative AI is also playing a growing role in real, physical robots. After announcing Gemini Robotics earlier this year, Google DeepMind has now revealed a new on-device VLA (vision-language-action) model to control robots. Unlike the previous release, it has no cloud component, so robots can operate with full autonomy.

Carolina Parada, head of robotics at Google DeepMind, says this approach to AI robotics could make robots more reliable in challenging situations. This is also the first version of Google’s robotics model that developers can tune for their specific uses.

Robotics is a unique problem for AI because the robot not only exists in the physical world but also changes its environment. Whether you’re having it move blocks around or tie your shoes, it’s hard to predict every eventuality a robot might encounter. The traditional approach of training a robot to perform actions through reinforcement learning was very slow, but generative AI allows for much greater generalization.

“It’s drawing from Gemini’s multimodal world understanding in order to do a completely new task,” explains Carolina Parada. “What that enables is in that same way Gemini can produce text, write poetry, just summarize an article, you can also write code, and you can also generate images. It also can generate robot actions.”

General robots, no cloud needed

In the previous Gemini Robotics release (which is still the “best” version of Google’s robotics tech), the platforms ran a hybrid system with a small model on the robot and a larger one running in the cloud. You’ve probably watched chatbots “think” for measurable seconds as they generate an output, but robots need to react quickly. If you tell the robot to pick up and move an object, you don’t want it to pause while each step is generated. The local model allows quick adaptation, while the server-based model can help with complex reasoning tasks. Google DeepMind is now unleashing the local model as a standalone VLA, and it’s surprisingly robust.
