Embodied AI 101

100 episodes — Page 1 of 2

Robotic World Model: Learning to Simulate for Robust Robot Control

Jun 12, 2026 · 19 min

AnchorWorld: Embodied Egocentric World Simulation with View-based Evolution Customization

Jun 10, 2026 · 23 min

ArtiFixer: Few-Step Diffusion for 3D Scene Reconstruction

Jun 10, 2026 · 26 min

Deployment-Time Memorization in Foundation-Model Agents

Jun 10, 2026 · 39 min

Adversarial Machine Learning: Taxonomy, Threat Models, and Mitigation Strategies in Deep Neural Networks

Jun 9, 2026 · 34 min

SoCRATES: Evaluating LLM Mediators in Conflict Scenarios

Jun 9, 2026 · 20 min

Unembedding Matrix as a Feature Lens: Unlocking Better Text Embeddings

Jun 9, 2026 · 24 min

LeanMarathon: Autonomous Formalization of Math Proofs on Erdős Problems

Jun 8, 2026 · 27 min

Deep Research Agents: Survey and Roadmap for Autonomous AI Research

Jun 8, 2026 · 37 min

Cosmos 3: Omnimodal World Models for Physical AI

Jun 7, 2026 · 28 min

Humanoid-GPT: GPT-Style Transformer for Zero-Shot Dynamic Humanoid Control

Jun 7, 2026 · 22 min

Bending Paper, Shaping Dexterity: The Robotic Origami Challenge

Jun 5, 2026 · 29 min

GraspGen-X: A Foundation Model for Zero-Shot 6-DoF Grasping

Jun 5, 2026 · 35 min

When Does Deep RL Beat Calibrated Baselines?

Jun 4, 2026 · 15 min

Training Deep Networks as Random Effects: An Optimization–Inference Duality

Jun 4, 2026 · 21 min

Generative Depth Supervision for Embodied Vision-Language Models

Jun 2, 2026 · 28 min

PointWorld: Scaling 3D World Models for In-The-Wild Robotic Manipulation

Jun 1, 2026 · 30 min

LocateAnything: Fast and High-Quality Vision-Language Grounding with Parallel Box Decoding

May 31, 2026 · 32 min

LT2: Linear-Time Looped Transformers

May 31, 2026 · 42 min

One Learning Rate Doesn't Fit All: Layerwise Spectral Scheduling for Transformers

May 31, 2026 · 26 min

SimToolReal: Procedural Tool Generation and a Universal Objective for Zero-Shot Tool Manipulation

May 30, 2026 · 25 min

Robometer and the Future of Robotic Reward Modeling

May 30, 2026 · 43 min

Qwen-VLA: A Generalist Vision–Language–Action Robot Model

May 29, 2026 · 35 min

EXPO-FT: Sample-Efficient Reinforcement Learning Fine-Tuning for Vision-Language-Action Models

May 29, 2026 · 33 min

RoboMeter: Learning Dense Rewards from Successes and Failures

May 29, 2026 · 37 min

MobileGym: A Controllable, Parallel Sandbox for Mobile GUI Agents

May 27, 2026 · 53 min

ANY2ANY: Efficient Cross-Embodiment Transfer for Humanoid Whole-Body Tracking

May 27, 2026 · 19 min

TriSplat: Feed-Forward 3D Reconstruction with Triangulated Meshes

May 26, 2026 · 43 min

MIKASA-Robo-VLA: A Memory-Intensive Benchmark for Vision-Language-Action Robotics

May 26, 2026 · 28 min

PointWorld: Scaling 3D World Models for In-The-Wild Robotic Manipulation

May 25, 2026 · 29 min

Bimanual Pegboard Manipulation: A Benchmark for Vision-Language-Action Models

May 24, 2026 · 27 min

FutureSim: Replaying Real-World Events to Evaluate AI Forecasting Agents

May 24, 2026 · 27 min

AgentFloor: A Benchmark for Long-Horizon Agent Planning

May 24, 2026 · 35 min

AlexNet: The Deep Convolutional Network That Transformed Vision

May 23, 2026 · 41 min

A Few Useful Things to Know About Machine Learning

May 23, 2026 · 45 min

SimToolReal: A Universal Dexterous Tool-Use Policy

May 23, 2026 · 28 min

Mimic-Video: Learning Physics Priors from Web-Scale Video for Robot Dexterity

May 23, 2026 · 29 min

Deep Residual Learning for Image Recognition (ResNet)

May 23, 2026 · 24 min

Attention Is All You Need – The Transformer Revolution

May 23, 2026 · 26 min

NVIDIA Cosmos: World Foundation Models for Physical AI

May 20, 2026 · 29 min

LATENT: Teaching a Humanoid to Play Tennis from Imperfect Data

May 19, 2026 · 20 min

CollabVR: Collaborative Video Reasoning with Vision-Language and Video Generation Models

May 19, 2026 · 42 min

World Action Models: The Next Frontier in Embodied AI

May 19, 2026 · 36 min

Training a Whole-Body Control Foundation Model

May 18, 2026 · 39 min

DexJoCo: A Unified Benchmark for Task-Oriented Dexterous Manipulation

May 18, 2026 · 43 min

MMSkills: Building Multimodal Skill Libraries for Visual Agents

May 18, 2026 · 19 min

PhysBrain 1.0 VLA (TwinBrainVLA): Dual-Brain Vision-Language-Action with Physics-Grounded Learning

May 18, 2026 · 25 min

MolmoAct2-LIBERO: An Open Vision-Language-Action Model for Robotics

May 17, 2026 · 38 min

SANA-WM: Efficient Minute-Scale World Modeling with Hybrid Diffusion Transformers

May 17, 2026 · 20 min

WildClawBench: A Real-World, Long-Horizon Benchmark for AI Agents

May 17, 2026 · 32 min