
Embodied AI 101
100 episodes — Page 1 of 2
Robotic World Model: Learning to Simulate for Robust Robot Control
Jun 12, 2026 · 19 min
AnchorWorld: Embodied Egocentric World Simulation with View-based Evolution Customization
Jun 10, 2026 · 23 min
ArtiFixer: Few-Step Diffusion for 3D Scene Reconstruction
Jun 10, 2026 · 26 min
Deployment-Time Memorization in Foundation-Model Agents
Jun 10, 2026 · 39 min
Adversarial Machine Learning: Taxonomy, Threat Models, and Mitigation Strategies in Deep Neural Networks
Jun 9, 2026 · 34 min
SoCRATES: Evaluating LLM Mediators in Conflict Scenarios
Jun 9, 2026 · 20 min
Unembedding Matrix as a Feature Lens: Unlocking Better Text Embeddings
Jun 9, 2026 · 24 min
LeanMarathon: Autonomous Formalization of Math Proofs on Erdős Problems
Jun 8, 2026 · 27 min
Deep Research Agents: Survey and Roadmap for Autonomous AI Research
Jun 8, 2026 · 37 min
Cosmos 3: Omnimodal World Models for Physical AI
Jun 7, 2026 · 28 min
Humanoid-GPT: GPT-Style Transformer for Zero-Shot Dynamic Humanoid Control
Jun 7, 2026 · 22 min
Bending Paper, Shaping Dexterity: The Robotic Origami Challenge
Jun 5, 2026 · 29 min
GraspGen-X: A Foundation Model for Zero-Shot 6-DoF Grasping
Jun 5, 2026 · 35 min
When Does Deep RL Beat Calibrated Baselines?
Jun 4, 2026 · 15 min
Training Deep Networks as Random Effects: An Optimization–Inference Duality
Jun 4, 2026 · 21 min
Generative Depth Supervision for Embodied Vision-Language Models
Jun 2, 2026 · 28 min
PointWorld: Scaling 3D World Models for In-The-Wild Robotic Manipulation
Jun 1, 2026 · 30 min
LocateAnything: Fast and High-Quality Vision-Language Grounding with Parallel Box Decoding
May 31, 2026 · 32 min
LT2: Linear-Time Looped Transformers
May 31, 2026 · 42 min
One Learning Rate Doesn't Fit All: Layerwise Spectral Scheduling for Transformers
May 31, 2026 · 26 min
SimToolReal: Procedural Tool Generation and a Universal Objective for Zero-Shot Tool Manipulation
May 30, 2026 · 25 min
Robometer and the Future of Robotic Reward Modeling
May 30, 2026 · 43 min
Qwen-VLA: A Generalist Vision–Language–Action Robot Model
May 29, 2026 · 35 min
EXPO-FT: Sample-Efficient Reinforcement Learning Fine-Tuning for Vision-Language-Action Models
May 29, 2026 · 33 min
RoboMeter: Learning Dense Rewards from Successes and Failures
May 29, 2026 · 37 min
MobileGym: A Controllable, Parallel Sandbox for Mobile GUI Agents
May 27, 2026 · 53 min
ANY2ANY: Efficient Cross-Embodiment Transfer for Humanoid Whole-Body Tracking
May 27, 2026 · 19 min
TriSplat: Feed-Forward 3D Reconstruction with Triangulated Meshes
May 26, 2026 · 43 min
MIKASA-Robo-VLA: A Memory-Intensive Benchmark for Vision-Language-Action Robotics
May 26, 2026 · 28 min
PointWorld: Scaling 3D World Models for In-The-Wild Robotic Manipulation
May 25, 2026 · 29 min
Bimanual Pegboard Manipulation: A Benchmark for Vision-Language-Action Models
May 24, 2026 · 27 min
FutureSim: Replaying Real-World Events to Evaluate AI Forecasting Agents
May 24, 2026 · 27 min
AgentFloor: A Benchmark for Long-Horizon Agent Planning
May 24, 2026 · 35 min
AlexNet: The Deep Convolutional Network That Transformed Vision
May 23, 2026 · 41 min
A Few Useful Things to Know About Machine Learning
May 23, 2026 · 45 min
SimToolReal: A Universal Dexterous Tool-Use Policy
May 23, 2026 · 28 min
Mimic-Video: Learning Physics Priors from Web-Scale Video for Robot Dexterity
May 23, 2026 · 29 min
Deep Residual Learning for Image Recognition (ResNet)
May 23, 2026 · 24 min
Attention Is All You Need – The Transformer Revolution
May 23, 2026 · 26 min
NVIDIA Cosmos: World Foundation Models for Physical AI
May 20, 2026 · 29 min
LATENT: Teaching a Humanoid to Play Tennis from Imperfect Data
May 19, 2026 · 20 min
CollabVR: Collaborative Video Reasoning with Vision-Language and Video Generation Models
May 19, 2026 · 42 min
World Action Models: The Next Frontier in Embodied AI
May 19, 2026 · 36 min
Training a Whole-Body Control Foundation Model
May 18, 2026 · 39 min
DexJoCo: A Unified Benchmark for Task-Oriented Dexterous Manipulation
May 18, 2026 · 43 min
MMSkills: Building Multimodal Skill Libraries for Visual Agents
May 18, 2026 · 19 min
PhysBrain 1.0 VLA (TwinBrainVLA): Dual-Brain Vision-Language-Action with Physics-Grounded Learning
May 18, 2026 · 25 min
MolmoAct2-LIBERO: An Open Vision-Language-Action Model for Robotics
May 17, 2026 · 38 min
SANA-WM: Efficient Minute-Scale World Modeling with Hybrid Diffusion Transformers
May 17, 2026 · 20 min
WildClawBench: A Real-World, Long-Horizon Benchmark for AI Agents
May 17, 2026 · 32 min