Robotics – Page 2 – All About AI

Genie Envisioner: A Unified Video-Generative Platform for Scalable, Instruction-Driven Robotic Manipulation

RoboticsAugust 26, 202584Views 0Likes 0Comments

Embodied AI agents that can perceive, think, and act in the real world mark a key step toward the future of robotics. A central challenge is building scalable, reliable robotic manipulation, the skill of deliberately interacting with and controlling objects through selective contact. While progress spans analytic methods, model-based approaches, and large-scale data-driven learning, most…

Meet DeepFleet: Amazon’s New AI Models Suite that can Predict Future Traffic Patterns for Fleets of Mobile Robots

RoboticsAugust 21, 202583Views 0Likes 0Comments

Amazon has reached a remarkable milestone by deploying its one-millionth robot across global fulfillment and sortation centers, solidifying its position as the world’s largest operator of industrial mobile robotics. This achievement coincides with the launch of DeepFleet, a groundbreaking suite of foundation models designed to enhance coordination among vast fleets of mobile robots. Trained on…

NVIDIA AI Introduces End-to-End AI Stack, Cosmos Physical AI Models and New Omniverse Libraries for Advanced Robotics

RoboticsAugust 16, 202588Views 0Likes 0Comments

Nvidia made major waves at SIGGRAPH 2025 by unveiling a suite of new Cosmos world models, robust simulation libraries, and cutting-edge infrastructure—all designed to accelerate the next era of physical AI for robotics, autonomous vehicles, and industrial applications. Let’s break down the technological details, what this means for developers, and why it matters to the…

NVIDIA Releases Cosmos-Reason1: A Suite of AI Models Advancing Physical Common Sense and Embodied Reasoning in Real-World Environments

RoboticsAugust 11, 202589Views 0Likes 0Comments

AI has advanced in language processing, mathematics, and code generation, but extending these capabilities to physical environments remains challenging. Physical AI seeks to close this gap by developing systems that perceive, understand, and act in dynamic, real-world settings. Unlike conventional AI that processes text or symbols, Physical AI engages with sensory inputs, especially video, and…

NVIDIA AI Releases GraspGen: A Diffusion-Based Framework for 6-DOF Grasping in Robotics

RoboticsAugust 6, 202582Views 0Likes 0Comments

Robotic grasping is a cornerstone task for automation and manipulation, critical in domains spanning from industrial picking to service and humanoid robotics. Despite decades of research, achieving robust, general-purpose 6-degree-of-freedom (6-DOF) grasping remains a challenging open problem. Recently, NVIDIA unveiled GraspGen, a novel diffusion-based grasp generation framework that promises to bring state-of-the-art (SOTA) performance with unprecedented…

NVIDIA AI Presents ThinkAct: Vision-Language-Action Reasoning via Reinforced Visual Latent Planning

RoboticsAugust 1, 202592Views 0Likes 0Comments

Estimated reading time: 5 minutes Introduction Embodied AI agents are increasingly being called upon to interpret complex, multimodal instructions and act robustly in dynamic environments. ThinkAct, presented by researchers from Nvidia and National Taiwan University, offers a breakthrough for vision-language-action (VLA) reasoning, introducing reinforced visual latent planning to bridge high-level multimodal reasoning…

URBAN-SIM: Advancing Autonomous Micromobility with Scalable Urban Simulation

RoboticsJuly 27, 202595Views 0Likes 0Comments

Micromobility solutions—such as delivery robots, mobility scooters, and electric wheelchairs—are rapidly transforming short-distance urban travel. Despite their growing popularity as flexible, eco-friendly transport alternatives, most micromobility devices still rely heavily on human control. This dependence limits operational efficiency and raises safety concerns, especially in complex, crowded city environments filled with dynamic obstacles like pedestrians and…

VeBrain: A Unified Multimodal AI Framework for Visual Reasoning and Real-World Robotic Control

RoboticsJuly 22, 202578Views 0Likes 0Comments

Bridging Perception and Action in Robotics Multimodal Large Language Models (MLLMs) hold promise for enabling machines, such as robotic arms and legged robots, to perceive their surroundings, interpret scenarios, and take meaningful actions. The integration of such intelligence into physical systems is advancing the field of robotics, pushing it toward autonomous machines that don’t just…

EmbodiedGen: A Scalable 3D World Generator for Realistic Embodied AI Simulations

RoboticsJuly 17, 202590Views 0Likes 0Comments

The Challenge of Scaling 3D Environments in Embodied AI Creating realistic and accurately scaled 3D environments is essential for training and evaluating embodied AI. However, current methods still rely on manually designed 3D graphics, which are costly and lack realism, thereby limiting scalability and generalization. Unlike internet-scale data used in models like GPT and CLIP,…

Google DeepMind Releases Gemini Robotics On-Device: Local AI Model for Real-Time Robotic Dexterity

RoboticsJuly 12, 202588Views 0Likes 0Comments

Google DeepMind has unveiled Gemini Robotics On-Device, a compact, local version of its powerful vision-language-action (VLA) model, bringing advanced robotic intelligence directly onto devices. This marks a key step forward in the field of embodied AI by eliminating the need for continuous cloud connectivity while maintaining the flexibility, generality, and high precision associated with the…