Why Sim-to-Real Transfer Still Fails in the Interesting Cases
The gap between simulation performance and real-world results isn't just about physics accuracy. Here's what I've learned after two years of trying to close it on a collaborative arm.
Occasional essays on machine learning, robotics, software craft, and the messy gap between academic research and production systems. Published when I have something worth saying.
After spending three years trying to reproduce results from top-tier RL papers — and failing more often than I'd like to admit — I've come to believe the field has a reproducibility crisis that goes deeper than most people acknowledge. Here's what I think is actually going wrong, and what a more honest culture of experimentation might look like.
Building a motion planning library that researchers actually want to use is harder than getting the algorithms right. Here are the software engineering decisions that got Kepler adopted by three labs.
The race toward scale has produced impressive benchmarks and underwhelming deployment stories. A contrarian argument for why the next decade of ML should go deep rather than large.
A practical guide to Docker-based ROS 2 development with proper IDE integration, GPU passthrough, and a workflow that survives team handoffs. Everything I wish I'd had three years ago.
After working in both research labs and production engineering teams, I've stopped believing the divide between them is as fundamental as both sides claim. Here's why the conversation is more nuanced than that.
A deep dive into temporal distribution shift, evaluation methodology, and why the model that wins on the leaderboard almost never wins in production. Lessons from the Atlas project.
I publish roughly once or twice a month — no filler, no sponsor slots. Just things I've actually thought about and think are worth sharing.