Posts

Experiment setup: RealSense camera over an ArUco-tracked plate held by a Franka Panda arm

Balancing a ball on a plate with a Franka Panda

The task
Our team from KAIRA was invited to the Europe Embodied hackathon, where we picked the robotic arm challenge. The goal was to build a vision-to-robot pipeline that can balance a ball on top of a plate. It’s a classic double-integrator problem that doesn’t rely on AI.

The pipeline
Perception: RealSense D405 overhead → ArUco plate localization → HSV ball detection → homography to plate coords (u,v), rim normalized to 1.0.
State estimation: constant-velocity Kalman filter with outlier gating, a velocity estimate for damping, and a latency lead (readout extrapolated forward along the velocity to cancel camera delay).
Calibration: Kabsch for the camera→robot extrinsic; automatic axis map (tilt→roll), so no manual sign tuning (see Challenge 1).
Control: PD/PID on the ball error → desired tilt (θx, θy) → absolute pose level_pose · R_tilt streamed to a Cartesian-impedance node (see Challenge 2).
Safety: tilt clamp, slew limit, lost-ball re-level, return-to-level on exit.

Challenge 1: aligning the coordinate systems
The ball position is measured in the image frame (u,v), and the robot tilts about its end-effector x/y. These differ by an unknown rotation + sign that changes any time the camera moves. Calibrating this map wasn’t easy. We tried doing it manually in the beginning, but switched to measurements later on: ...
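To make the state-estimation step of the pipeline more concrete, here is a minimal sketch of a constant-velocity Kalman filter with outlier gating and a latency lead. It is not the code from the hackathon. The class name `ConstantVelocityKF`, the noise values `q` and `r`, the gate width, and the choice to run one independent 1-D filter per plate axis are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class ConstantVelocityKF:
    """1-D constant-velocity Kalman filter (hypothetical; one per plate axis)."""
    x: float = 0.0      # position estimate (plate coords, rim = 1.0)
    v: float = 0.0      # velocity estimate
    p_xx: float = 1.0   # entries of the symmetric 2x2 covariance P
    p_xv: float = 0.0
    p_vv: float = 1.0
    q: float = 1.0      # process noise intensity (assumed value)
    r: float = 1e-3     # measurement noise variance (assumed value)
    gate: float = 3.0   # outlier gate in standard deviations (assumed value)

    def predict(self, dt: float) -> None:
        # State: x' = x + v*dt, v' = v
        self.x += self.v * dt
        # Covariance: P' = F P F^T + Q with F = [[1, dt], [0, 1]]
        p_xx = self.p_xx + 2.0 * dt * self.p_xv + dt * dt * self.p_vv
        p_xv = self.p_xv + dt * self.p_vv
        # Q for white-noise acceleration: q * [[dt^3/3, dt^2/2], [dt^2/2, dt]]
        self.p_xx = p_xx + self.q * dt ** 3 / 3.0
        self.p_xv = p_xv + self.q * dt ** 2 / 2.0
        self.p_vv = self.p_vv + self.q * dt

    def update(self, z: float) -> bool:
        """Fuse a position measurement; return False if it was gated out."""
        innovation = z - self.x
        s = self.p_xx + self.r  # innovation variance
        if innovation * innovation > self.gate ** 2 * s:
            return False        # outlier: keep the prediction untouched
        k_x = self.p_xx / s     # Kalman gains for position and velocity
        k_v = self.p_xv / s
        self.x += k_x * innovation
        self.v += k_v * innovation
        # P = (I - K H) P with H = [1, 0]
        p_xv = self.p_xv
        self.p_xx = (1.0 - k_x) * self.p_xx
        self.p_xv = (1.0 - k_x) * p_xv
        self.p_vv = self.p_vv - k_v * p_xv
        return True

    def lead(self, latency: float) -> float:
        """Position extrapolated forward along the velocity by the camera delay."""
        return self.x + self.v * latency
```

In a control loop you would call `predict(dt)` once per frame, then `update(z)` with the detected ball coordinate, and feed `lead(latency)` and `v` to the controller. The latency value itself would have to be measured for the actual camera.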

Workspace setup with the SO-101 robot arm, custom gripper, and peg/keyhole on the table

Training a robot to thread a peg through a keyhole in 36 hours

The setup
RoboHack2026 is the first iteration of a robotics-based hackathon organized by AI Team (https://epflaiteam.ch/). You get one robot and 36 hours, as well as tons of tools to impress a jury. We were given a LeRobot SO-101 that we had to assemble first. The last time I built my own hardware was in high school, so I was glad my teammates were able to offset my weaknesses here. Getting it up and running after 4 hours was such a relief, and playing around with teleoperation is a lot of fun! ...

Diffusion language model architecture: masked input tokens through a Transformer to predicted output

Building My Own Diffusion Language Model

To be, fo hend! First her sense ountier to Jupits, be horse.
Wise words! This is the result of 2 hours of training my very own PyTorch Diffusion Language Model on an M2 MacBook Air. You can check out the code over at GitHub: github.com/Encrux/simple_dlm

Why?
Diffusion Language Models are kind of a hot topic right now in Machine Learning. The basic idea: corrupt some data with noise, then train a model to reverse that corruption over many small steps. ...
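As a rough illustration of the "corrupt, then reverse" idea, the sketch below shows only the corruption half for text, assuming the noise is token masking (as the figure caption suggests). The `MASK` token and the `corrupt` function are hypothetical names and are not taken from the simple_dlm repository.

```python
import random

MASK = "<mask>"  # hypothetical mask token


def corrupt(tokens, t, rng):
    """Mask each token independently with probability t.

    t = 0.0 leaves the sequence clean, t = 1.0 masks everything.
    Returns the noisy sequence and, per position, the original token
    the model should predict (None where nothing was masked).
    """
    noisy, targets = [], []
    for tok in tokens:
        if rng.random() < t:
            noisy.append(MASK)
            targets.append(tok)
        else:
            noisy.append(tok)
            targets.append(None)
    return noisy, targets


rng = random.Random(0)
noisy, targets = corrupt(list("to be"), 0.5, rng)
```

Under this assumption, training would sample a noise level `t` per example and ask the model to fill in the masked positions. Generation would start from a fully masked sequence and unmask it over several steps.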

MuJoCo robot arm executing an LLM-generated trajectory on a tabletop scene

What happens when you let an LLM write robot programs?

In 2022 I was tasked with designing an interactive robot system that takes a natural language instruction and outputs a working robot trajectory. This is a write-up of the design decisions and a showcase of a web demo using MuJoCo. The code in this demo is a rebuild of the original code.
"Prompt: Put all the cubes on the plate"

How can you make robots listen to what a user says?
Human-Robot Interaction (HRI) has come a long way over the last couple of years. Vision-Language-Action Models are the de facto state of the art in robotics research. PhAIL is a leaderboard aimed at measuring real-world performance against humans, which makes it clear that general physical AI is not there yet. ...

Many DDNet bot agents running through a 2D platformer level

Unsuccessfully training AI to play my favorite childhood game

I trained 1,500 parallel bots to play DDNet, a cooperative 2D platformer with grappling hooks, freeze mechanics, and maps that take actual players hours to complete. This is what it looks like: In this video you can see my army of encroaching agents running through the initial section of a Tutorial map that teaches basic movements and adds difficulty incrementally. It’s designed for someone who’s never played before, which lends itself well to reinforcement learning. ...