
ByteDance, the parent company of TikTok, has quietly unveiled a groundbreaking AI-powered robot system capable of folding clothes and clearing dining tables in response to natural language commands. At the centre of this innovation is a vision-language-action model called GR‑3, which has been integrated into ByteMini, a two-armed mobile robot prototype.
How the ByteDance AI Robot Works
The core of ByteDance’s system is GR‑3, a large-scale multimodal model pre-trained on image and text data, fine-tuned on human motions captured in virtual reality environments, and further refined on real robot trajectories. When embodied in ByteMini, GR‑3 enables the robot to:
- Hang a shirt onto a clothes rack by inserting a hanger
- Distinguish objects by size or position (for example, a larger plate or an item on the left)
- Quickly pick up items and place them in designated spots
- Execute full tabletop cleaning tasks from a single natural language instruction
The system adapts in real time to unseen objects, even manipulating short-sleeve shirts despite being trained only on long-sleeve garments.
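To make the single-instruction workflow above concrete, the sketch below shows what a closed observe-predict-act loop around such a policy could look like. The `policy` and `robot` objects, their methods, and the control rate are hypothetical stand-ins for illustration, not ByteDance's actual API.

```python
# Hypothetical closed-loop execution of one natural-language command.
# The policy and robot interfaces here are illustrative stand-ins only.
import time

def run_instruction(policy, robot, instruction: str, max_steps: int = 500):
    """Repeatedly observe, predict an action, and execute it until the
    policy signals the task is complete (or a step budget runs out)."""
    for _ in range(max_steps):
        frame = robot.capture_frame()          # RGB image from the head camera
        action, done = policy.predict(frame, instruction)
        robot.apply_action(action)             # joint and mobile-base targets
        if done:                               # policy's own completion signal
            return True
        time.sleep(0.05)                       # roughly a 20 Hz control loop
    return False

# Example usage with hypothetical objects:
# run_instruction(policy, bytemini, "clear everything off the table")
```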
What’s New With ByteDance Robots
For the tech nerds: GR‑3 is a vision-language-action (VLA) model, meaning it fuses visual inputs from ByteMini’s cameras with natural language instructions and translates them directly into real-time robot actions.
It combines a transformer-based vision encoder with a fine-tuned language model to understand tasks and execute them through ByteMini’s two robotic arms and mobile base.
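As a rough illustration of that encoder-plus-language-model pattern, here is a minimal toy policy in PyTorch. The layer sizes, patch handling, and the 14-dimensional action output are assumptions chosen for readability; they do not reflect GR‑3's published architecture.

```python
# A minimal, hypothetical vision-language-action policy sketch.
# Sizes and the action head are assumptions, not ByteDance's GR-3 design.
import torch
import torch.nn as nn

class ToyVLAPolicy(nn.Module):
    def __init__(self, vocab_size=1000, dim=256, action_dim=14):
        super().__init__()
        # "Vision encoder": embed flattened image patches, run a small transformer.
        self.patch_embed = nn.Linear(16 * 16 * 3, dim)
        self.vision_encoder = nn.TransformerEncoder(
            nn.TransformerEncoderLayer(d_model=dim, nhead=8, batch_first=True),
            num_layers=2,
        )
        # "Language model": token embedding plus another small transformer.
        self.text_embed = nn.Embedding(vocab_size, dim)
        self.text_encoder = nn.TransformerEncoder(
            nn.TransformerEncoderLayer(d_model=dim, nhead=8, batch_first=True),
            num_layers=2,
        )
        # Action head: pooled vision + text features -> targets for two arms
        # and a mobile base (action_dim is a guess).
        self.action_head = nn.Linear(2 * dim, action_dim)

    def forward(self, patches, tokens):
        v = self.vision_encoder(self.patch_embed(patches)).mean(dim=1)
        t = self.text_encoder(self.text_embed(tokens)).mean(dim=1)
        return self.action_head(torch.cat([v, t], dim=-1))

policy = ToyVLAPolicy()
patches = torch.randn(1, 196, 16 * 16 * 3)   # one camera frame as 196 patches
tokens = torch.randint(0, 1000, (1, 12))     # tokenized instruction
action = policy(patches, tokens)             # one low-level action vector
print(action.shape)                          # torch.Size([1, 14])
```

In a real VLA system the vision and language backbones would be large pre-trained models, and the action head would typically predict a short sequence of motions rather than a single vector.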
The robot learns through VR-captured motion demos and real-world fine-tuning, allowing it to handle deformable items like clothes and adapt to unfamiliar tasks.
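One plausible way to combine those two data sources during fine-tuning is simple behaviour cloning over a weighted mixture, sketched below. The dataset shapes, the balanced sampling, and the linear stand-in model are assumptions for illustration, not details ByteDance has disclosed.

```python
# Illustrative co-training over VR demos and real robot trajectories.
# Dataset contents and the 50/50 mixing ratio are assumptions.
import torch
from torch.utils.data import ConcatDataset, DataLoader, TensorDataset, WeightedRandomSampler

# Stand-in datasets of (observation, action) pairs.
vr_demos = TensorDataset(torch.randn(800, 64), torch.randn(800, 14))
real_trajs = TensorDataset(torch.randn(200, 64), torch.randn(200, 14))
combined = ConcatDataset([vr_demos, real_trajs])

# Oversample the smaller real-robot set so each source contributes equally.
weights = [1.0 / len(vr_demos)] * len(vr_demos) + [1.0 / len(real_trajs)] * len(real_trajs)
sampler = WeightedRandomSampler(weights, num_samples=len(combined))
loader = DataLoader(combined, batch_size=32, sampler=sampler)

model = torch.nn.Linear(64, 14)              # stand-in for the action head
opt = torch.optim.Adam(model.parameters(), lr=1e-4)
for obs, act in loader:                      # one behaviour-cloning epoch
    loss = torch.nn.functional.mse_loss(model(obs), act)
    opt.zero_grad()
    loss.backward()
    opt.step()
```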
Previous attempts at laundry-folding devices, such as FoldiMate and Laundroid, failed to reach widespread adoption over the past decade. Real-time object recognition and vision-aligned action routines might just be the missing X factor ByteDance has managed to nail. Despite the threat of U.S. restrictions, ByteDance is doubling down on AI and robotics, with its Seed research division leading development.