Yash Datta — Read the Source
Read the Source takes one mechanism behind modern AI at a time and works it all the way down: the full theory, a runnable implementation, its safety and alignment properties, and what it takes to run it in production. Each piece should leave you understanding why the mechanism works and where it breaks.
From the latest post
Streaming softmax: the recurrence that makes FlashAttention work
A two-state recurrence (log-sum-exp with a running-max correction factor) turned 200K-token context windows from a hardware fantasy into a routine training run.
softmax([1, 2, 3.0]) = [0.090, 0.245, 0.665]
Drag the third value; the three weights recompute live. This is a figure from the article itself, not a mock-up.
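The recurrence the post describes can be sketched in a few lines: keep a running max and a running sum, rescaling the sum whenever the max moves. This is a minimal illustration, not the article's code; the function and variable names are my own.

```python
import math

def streaming_softmax(xs):
    """Two-state online softmax: a running max m and a running sum s
    of exp(x - m), updated in a single pass over the inputs."""
    m = float("-inf")  # running max
    s = 0.0            # running sum of exp(x - m)
    for x in xs:
        m_new = max(m, x)
        # When the max moves, rescale the old sum into the new frame,
        # then fold in the current term.
        s = s * math.exp(m - m_new) + math.exp(x - m_new)
        m = m_new
    # Emit the normalized weights using the final (m, s).
    return [math.exp(x - m) / s for x in xs]

print(streaming_softmax([1, 2, 3.0]))  # ≈ [0.090, 0.245, 0.665]
```

Because each update only needs the previous `(m, s)` pair, the normalizer can be computed block by block without ever materializing the full score vector, which is the property FlashAttention exploits.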
All writing →

Papers
- JavelinGuard: Low-Cost Transformer Architectures for LLM Security
First author. Low-cost transformer architectures for detecting malicious intent in LLM interactions.
- DeepContext: Stateful Real-Time Detection of Multi-Turn Adversarial Intent Drift in LLMs
Originated the core idea; led and executed by Justin Albrethsen.
New pieces by email — saucam.substack.com.