• Gemini Robotics-ER 1.6 achieves 93% instrument reading accuracy with agentic vision, up from 23%, enabling reliable autonomous industrial inspections.
  • Boston Dynamics’ Spot robot now leverages the model to monitor facilities independently, interpreting gauges and displays without human intervention.
  • The model records a 10% improvement in video-based injury risk perception over Gemini 3.0 Flash, reinforcing safety in human-facing environments.

On April 14, 2026, Google DeepMind announced the launch of Gemini Robotics-ER 1.6, a significant upgrade to its reasoning-first model designed to help robots understand their physical environment with unprecedented precision.

The new model introduces enhanced spatial reasoning, multi-view understanding, and a groundbreaking instrument reading capability that allows robots to interpret gauges, sight glasses, and digital displays—a feature developed in collaboration with Boston Dynamics to enable fully autonomous facility inspections.

The release positions Google deeper in the competitive robotics AI space, where companies like Figure AI, Tesla, and Physical Intelligence are all racing to develop robots capable of reasoning through complex tasks rather than simply executing pre-programmed commands. Gemini Robotics-ER 1.6 is now accessible to developers through the Gemini API and Google AI Studio, marking a strategic push to expand beyond research labs into real-world industrial applications.
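
For developers experimenting with the release, access follows the standard Gemini API pattern. The sketch below uses the google-genai Python SDK; the model identifier string is an assumption for illustration, so check Google AI Studio for the exact ID.

```python
# Minimal sketch of querying the model through the Gemini API with the
# google-genai Python SDK. The model ID below is an assumption.
from google import genai
from google.genai import types

client = genai.Client(api_key="YOUR_API_KEY")

with open("gauge_photo.jpg", "rb") as f:
    image = types.Part.from_bytes(data=f.read(), mime_type="image/jpeg")

response = client.models.generate_content(
    model="gemini-robotics-er-1.6",  # assumed model ID
    contents=[image, "What value is this pressure gauge reading, in bar?"],
)
print(response.text)
```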

Instrument Reading Capabilities for Industrial Inspections

The most notable addition in Gemini Robotics-ER 1.6 is the ability to read industrial instruments—a capability developed specifically for facility inspection use cases. The model can interpret circular pressure gauges with sub-tick accuracy, vertical level indicators, chemical sight glasses, and digital readouts by combining visual reasoning with code execution. When processing a reading, the system zooms into images, identifies needles and markings, and applies mathematical calculations to estimate values with remarkable precision.
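
Google has not published the model's internal computations, but the geometry behind needle-gauge reading is simple to illustrate: once a pointing step has localized the gauge center, the needle tip, and the minimum and maximum tick marks in pixel coordinates, the reading follows from interpolating the needle's angle between the two ticks. A hypothetical sketch:

```python
import math

def gauge_value(needle_xy, center_xy, min_tick_xy, max_tick_xy,
                min_value, max_value):
    """Estimate a circular gauge reading by linearly interpolating the
    needle's angle between the min and max tick marks (all points are
    pixel coordinates, e.g. produced by a pointing step)."""
    def angle(p):
        # Angle of point p as seen from the gauge center, in radians.
        return math.atan2(p[1] - center_xy[1], p[0] - center_xy[0])

    # Sweep from the min tick to the max tick, measured clockwise
    # (image y grows downward, so atan2 angles increase clockwise).
    sweep = (angle(max_tick_xy) - angle(min_tick_xy)) % (2 * math.pi)
    travel = (angle(needle_xy) - angle(min_tick_xy)) % (2 * math.pi)
    return min_value + (travel / sweep) * (max_value - min_value)

# Example: a 0-10 bar gauge whose needle points up and slightly right
# of twelve o'clock, i.e. a bit past mid-scale -> roughly 5.7 bar.
print(gauge_value((120, 40), (100, 100), (40, 140), (160, 140), 0.0, 10.0))
```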

According to Interesting Engineering, instrument reading accuracy jumped from 23% to 93% when agentic vision was enabled, a leap that separates a useful tool from an unreliable one in industrial settings. The model uses intermediate reasoning steps: zooming into images, combining pointing with code execution to estimate proportions, and applying world knowledge to correct for variables such as camera distortion when estimating liquid fill levels in sight glasses.
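
The proportion-estimation step can likewise be illustrated with a toy calculation. Assuming a pointing step has returned pixel coordinates for the top and bottom of a sight glass and for the visible liquid surface, a naive fill fraction is just a ratio of pixel distances; this sketch deliberately omits the camera-distortion correction the article attributes to the model's world knowledge.

```python
def sight_glass_fill(top_y, bottom_y, surface_y):
    """Estimate the fill level of a vertical sight glass as a proportion,
    given pixel y-coordinates (image y grows downward) for the glass top,
    glass bottom, and the visible liquid surface. No distortion correction."""
    if not top_y < surface_y <= bottom_y:
        raise ValueError("surface must lie between the glass top and bottom")
    return (bottom_y - surface_y) / (bottom_y - top_y)

# Example: surface two thirds of the way down the glass -> 33% full.
print(f"{sight_glass_fill(100, 400, 300):.0%}")
```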

The collaboration with Boston Dynamics proved essential for developing this capability. Their Spot robot can now visit instruments throughout facilities and capture images for real-time interpretation, enabling autonomous monitoring without human intervention. Marco da Silva, Vice President and General Manager of Spot at Boston Dynamics, stated that capabilities like instrument reading and more reliable task reasoning will enable Spot to see, understand, and react to real-world challenges completely autonomously.

Gemini Robotics-ER 1.6: Safety Improvements and Developer Availability

Google described Gemini Robotics-ER 1.6 as its safest robotics model yet, with significant improvements in adherence to physical safety constraints and hazard identification. The model demonstrated a 6% improvement in text-based injury risk perception and a 10% improvement in video-based injury risk perception compared to Gemini 3.0 Flash, as measured through the Asimov Benchmark v2. This safety-first approach reflects a broader industry trend toward deploying AI systems in human environments where mistakes could have consequences beyond a failed task.

The model functions as a high-level reasoning brain for robots, executing tasks by natively calling tools such as Google Search, vision-language-action models, and user-defined third-party functions. It specializes in core robotics capabilities including spatial logic, task planning, and success detection: the ability to determine when a task has been completed correctly.
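
The announcement does not specify how user-defined tools are wired up, but the google-genai SDK already supports passing plain Python functions as tools for automatic function calling. The robot skill and model ID below are illustrative assumptions only.

```python
from google import genai
from google.genai import types

def navigate_to_instrument(instrument_id: str) -> str:
    """Drive the robot to the named instrument and return a status string."""
    # A real system would dispatch to the robot's navigation stack here;
    # this stub exists only to demonstrate the tool-calling plumbing.
    return f"arrived at {instrument_id}"

client = genai.Client(api_key="YOUR_API_KEY")
response = client.models.generate_content(
    model="gemini-robotics-er-1.6",  # assumed model ID
    contents="Go to boiler gauge B-7 and report its reading.",
    # Passing a Python callable enables the SDK's automatic function calling.
    config=types.GenerateContentConfig(tools=[navigate_to_instrument]),
)
print(response.text)
```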

Multi-view reasoning allows the system to process multiple camera streams simultaneously, including overhead views and wrist-mounted feeds, creating a more complete understanding of the environment even when occlusion or poor visibility complicate the picture.
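
At the API level, a multi-view request can be approximated by sending several labeled frames in a single call. The file names, labels, and model ID in this sketch are assumptions.

```python
from google import genai
from google.genai import types

client = genai.Client(api_key="YOUR_API_KEY")

def frame(path):
    # Load one camera frame as an inline image part.
    with open(path, "rb") as f:
        return types.Part.from_bytes(data=f.read(), mime_type="image/jpeg")

response = client.models.generate_content(
    model="gemini-robotics-er-1.6",  # assumed model ID
    contents=[
        "Overhead camera:", frame("overhead.jpg"),
        "Wrist camera:", frame("wrist.jpg"),
        "Is the valve handle visible in either view, and which camera"
        " has the clearer line of sight?",
    ],
)
print(response.text)
```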

Google DeepMind’s Gemini Robotics-ER 1.6 marks a turning point in industrial automation, demonstrating that the gap between experimental robotics and real-world deployment is narrowing faster than expected. With instrument reading accuracy reaching 93% and safety benchmarks trending upward, the model signals that autonomous robots capable of operating reliably in complex industrial environments are no longer a distant prospect—they’re arriving now.
