<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"><channel><title>Achille Triomphe | Scientific Worklog</title><description>Deep learning kernels, GPU tiling optimizations, and compiler tools.</description><link>https://www.achilletriomphe.com/</link><language>en-us</language><item><title>LoRA Without Regret</title><link>https://www.achilletriomphe.com/blog/lora-without-regret/</link><guid isPermaLink="true">https://www.achilletriomphe.com/blog/lora-without-regret/</guid><description>An exploration of when low-rank adaptation matches full fine-tuning, covering learning rate invariance, capacity limits, and the geometry of parameter updates.</description><pubDate>Mon, 01 Jun 2026 00:00:00 GMT</pubDate><category>LoRA</category><category>Fine-Tuning</category><category>PEFT</category><category>Optimization</category><category>Deep Learning</category><author>Achille Triomphe</author></item><item><title>Dissecting ThunderKittens: Anatomy of a Compact DSL for High-Performance AI Kernels</title><link>https://www.achilletriomphe.com/blog/dissecting-thunderkittens/</link><guid isPermaLink="true">https://www.achilletriomphe.com/blog/dissecting-thunderkittens/</guid><description>A deep dive into Stanford&apos;s embedded DSL for CUDA, exploring how its 16×16 tile abstractions map directly to Tensor Cores, shared memory, and warp-group MMA on Hopper.</description><pubDate>Thu, 21 May 2026 00:00:00 GMT</pubDate><category>CUDA</category><category>GPU Kernels</category><category>High-Performance Computing</category><category>ThunderKittens</category><author>Achille Triomphe</author></item></channel></rss>