Achille Triomphe | Scientific Worklog

Deep learning kernels, GPU tiling optimizations, and compiler tools.
https://www.achilletriomphe.com/en-us

LoRA Without Regret
https://www.achilletriomphe.com/blog/lora-without-regret/
An exploration of when low-rank adaptation matches full fine-tuning, covering learning rate invariance, capacity limits, and the geometry of parameter updates.
Mon, 01 Jun 2026 00:00:00 GMT
Tags: LoRA, Fine-Tuning, PEFT, Optimization, Deep Learning
Author: Achille Triomphe

Dissecting ThunderKittens: Anatomy of a Compact DSL for High-Performance AI Kernels
https://www.achilletriomphe.com/blog/dissecting-thunderkittens/
A deep dive into Stanford's embedded DSL for CUDA, exploring how its 16×16 tile abstractions map directly to Tensor Cores, shared memory, and warp-group MMA on Hopper.
Thu, 21 May 2026 00:00:00 GMT
Tags: CUDA, GPU Kernels, High-Performance Computing, ThunderKittens
Author: Achille