Writing
Essays on how modern language models work, and on building and evaluating the systems around them.
Individual essays, newest first. Multi-part arguments are collected under Series.
Read traces before you write the labeling guide
LLM evaluation, honestly
Stop vibe-checking your agent
LLM evaluation, honestly
Behind the scenes of AI agent frameworks
What an agent actually is
Information extraction didn’t disappear. It moved inside the workflow.
Research foundations of modern LLMs
The fine-tuning stack: one loss, different data
Research foundations of modern LLMs
Retrieval is older than RAG: from DPR to end-to-end
Research foundations of modern LLMs
The encoder didn’t die. It became the embedding model
Research foundations of modern LLMs
Pretraining objectives: why decoder-only won
Research foundations of modern LLMs
PPO is REINFORCE plus five fixes
How LLMs learn to reason
REINFORCE: the world before the gradient
How LLMs learn to reason