Tech & Business
MIT researchers open-source VERA 14B video-to-action model for generalist robot policies
MIT researchers have open-sourced VERA, a 14-billion-parameter video-to-action model for generalist robot policies. The project page at vera.csail.mit.edu describes VERA as a closed-loop policy that pairs a 14B video generative model with an embodiment-specific inverse dynamics model built on the robot's Jacobian. According to the site, the approach leaves the video planner untouched and translates the planner's dreamed futures into low-level actions.
According to the page, the same planner drives two demonstrations: zero-shot pick-and-place on a real-world Panda arm, and contact-rich in-hand reorientation of a cube with a 16-DoF multi-fingered hand.
The page lists a paper at arxiv.org/pdf/2605.27817 and code at github.com/sizhe-li/VERA. The preprint is dated 2026. In posts on X, MIT-affiliated researchers announced the open-sourcing, describing a 14B video-to-action system that controls robots across embodiments, skills, and environments, and citing the zero-shot pick-and-place result on a real Panda arm.
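For readers unfamiliar with how a robot's Jacobian can turn a desired motion into joint commands, the sketch below may help. It is not VERA's inverse dynamics model, whose details are not given here. It is a textbook damped least-squares step on a hypothetical planar two-link arm, and every name in it (`forward_kinematics`, `jacobian`, `joint_step`, `track`), the link lengths, the gain, and the damping value are assumptions made for illustration. The target point stands in for a pose that a planner might propose.

```python
import math

def forward_kinematics(q, l1=1.0, l2=1.0):
    """End-effector (x, y) of a hypothetical planar two-link arm."""
    x = l1 * math.cos(q[0]) + l2 * math.cos(q[0] + q[1])
    y = l1 * math.sin(q[0]) + l2 * math.sin(q[0] + q[1])
    return (x, y)

def jacobian(q, l1=1.0, l2=1.0):
    """2x2 Jacobian d(x, y)/d(q0, q1) of the same arm."""
    s1, c1 = math.sin(q[0]), math.cos(q[0])
    s12, c12 = math.sin(q[0] + q[1]), math.cos(q[0] + q[1])
    return [[-l1 * s1 - l2 * s12, -l2 * s12],
            [l1 * c1 + l2 * c12, l2 * c12]]

def joint_step(q, dx, damping=1e-3):
    """Damped least squares: dq = J^T (J J^T + damping^2 I)^-1 dx."""
    J = jacobian(q)
    # A = J J^T + damping^2 I, a symmetric 2x2 matrix [[a, b], [b, d]].
    a = J[0][0] ** 2 + J[0][1] ** 2 + damping ** 2
    b = J[0][0] * J[1][0] + J[0][1] * J[1][1]
    d = J[1][0] ** 2 + J[1][1] ** 2 + damping ** 2
    det = a * d - b * b
    # Solve A y = dx with the closed-form 2x2 inverse.
    y0 = (d * dx[0] - b * dx[1]) / det
    y1 = (-b * dx[0] + a * dx[1]) / det
    # Map back to joint space: dq = J^T y.
    return (J[0][0] * y0 + J[1][0] * y1,
            J[0][1] * y0 + J[1][1] * y1)

def track(q, target, steps=50, gain=0.5):
    """Closed loop: recompute the pose each step, then move toward target."""
    q = list(q)
    for _ in range(steps):
        x, y = forward_kinematics(q)
        dx = (gain * (target[0] - x), gain * (target[1] - y))
        dq = joint_step(q, dx)
        q[0] += dq[0]
        q[1] += dq[1]
    return tuple(q)

if __name__ == "__main__":
    q_final = track((0.3, 0.8), (1.2, 0.9))
    print(forward_kinematics(q_final))
```

Because the loop recomputes the pose at every step, the linearization error in each Jacobian step is corrected by the next one, which is the general idea behind closed-loop control. The damping term keeps the step bounded near singular arm configurations at the cost of slightly slower convergence.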
Sources
Published by Tech & Business, a media brand covering technology and business.
This story was sourced from MIT CSAIL and reviewed by the T&B editorial agent team.