Recursive Language Models as procedural scaling
Long-context is often treated like a single knob. Increase the window, improve the model, and the problem goes away. That framing collapses under closer inspection, because “long context” is not one thing. There is the systems problem of making attention and training efficient at larger sequence lengths, and there is a more subtle problem that shows up even when efficiency is not the bottleneck: the data distribution that language models are trained on is not unbounded in length…
1 day ago · 5 min read
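The recursive idea in the title is easiest to see in code. Below is a minimal sketch of the pattern, not the post's actual implementation: instead of forcing one model call to ingest an arbitrarily long context, a controller splits the context and recursively calls the model on pieces whose lengths stay inside the distribution the model was trained on. The `llm` stub and the character budget are assumptions for illustration.

```python
# Minimal sketch of recursive decomposition over a long context.
# `llm` is a hypothetical stand-in for a real model call; here it just
# truncates its prompt so the example runs without a model.

def llm(prompt: str) -> str:
    # Placeholder: a real implementation would call a language model.
    return prompt[:200]

def recursive_answer(question: str, context: str, budget: int = 2000) -> str:
    """Answer `question` over `context`, keeping every model call short."""
    if len(context) <= budget:
        # Base case: the context fits in the length regime the model saw in training.
        return llm(f"Context:\n{context}\n\nQuestion: {question}")
    # Recursive case: split, answer each half, then combine the partial answers.
    mid = len(context) // 2
    left = recursive_answer(question, context[:mid], budget)
    right = recursive_answer(question, context[mid:], budget)
    return llm(f"Combine these partial answers to '{question}':\n1) {left}\n2) {right}")

print(recursive_answer("What is discussed?", "some very long document " * 500))
```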


How RL Changed My Taste in AI Systems
I used to treat reinforcement learning as a mysterious corner of machine learning where agents somehow “figure it out” through trial and error. The more I read, the more I realized that the mystery comes from a single twist: the feedback is delayed, noisy, and often sparse. Once you accept that, RL stops being magic and starts being a very specific kind of optimization problem that punishes sloppy assumptions. What follows is the learning path that actually worked for me…
5 days ago · 6 min read
Reinforcement learning vs “regular” training: the real difference is not the math, it is the loop
Most ML people grow up on a simple mental model: you have a dataset, you define a loss, you run gradient descent, you ship a checkpoint. That covers supervised learning and a lot of self-supervised pretraining. The model is learning from a fixed distribution of examples, and the training pipeline is basically a linear flow from data to gradients. Reinforcement learning (RL) breaks that mental model because the model is not only learning from data, it is also actively creating the data it learns from…
Jan 26 · 7 min read
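The loop-versus-pipeline distinction is concrete enough to sketch. In supervised training the data exists before training starts; in RL the policy's own actions produce the data it next learns from, so every update changes what gets collected. The toy two-armed bandit below illustrates that feedback loop with a REINFORCE-style update; the reward probabilities and learning rate are illustrative values, not anything from the post.

```python
import math, random

# Toy two-armed bandit: the policy itself generates the training data.
rewards = {0: 0.2, 1: 0.8}  # illustrative probability of payoff per arm
logits = [0.0, 0.0]         # policy parameters
lr = 0.1

def sample_action() -> int:
    # Softmax over logits, then sample: the data distribution depends on the policy.
    exps = [math.exp(l) for l in logits]
    return 0 if random.random() < exps[0] / sum(exps) else 1

for step in range(2000):
    a = sample_action()                                # the model creates its own data point
    r = 1.0 if random.random() < rewards[a] else 0.0   # noisy, sparse 0/1 feedback
    # REINFORCE-style update: raise the probability of actions that got reward.
    exps = [math.exp(l) for l in logits]
    z = sum(exps)
    for i in range(2):
        grad = (1.0 if i == a else 0.0) - exps[i] / z  # d log pi(a) / d logit_i
        logits[i] += lr * r * grad

print("learned preference:", logits)  # the logit for arm 1 should dominate
```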


GPT-OSS Safeguard as Policy-Executable Safety, and the Cabinet Briefing Risk Scanner Built on Top of It
Abstract: This article presents a systems-focused account of how GPT-OSS Safeguard can be used as a policy-executable safety component and how that capability can be operationalized into a real workflow for high-stakes government communications. The case study is a Cabinet Briefing Risk Scanner, an AI tool that reviews draft communications prior to distribution by applying an explicit written risk policy, treating the analyzed text as untrusted, and emitting strict structured output…
Jan 3 · 14 min read
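The “policy-executable” pattern the abstract describes can be sketched as an API call: the written risk policy goes in as trusted instructions, the draft goes in as untrusted data, and the model is asked for strict structured output that downstream code can parse. The endpoint, model name, and JSON shape below are assumptions for illustration; the article's actual scanner may differ.

```python
import json
from openai import OpenAI

# Assumes the open-weight safeguard model is served behind an OpenAI-compatible
# endpoint (e.g. via vLLM); the URL and model name are placeholders.
client = OpenAI(base_url="http://localhost:8000/v1", api_key="unused")

RISK_POLICY = """(the explicit written risk policy goes here, verbatim)"""

def scan_draft(draft: str) -> dict:
    """Apply the written policy to an untrusted draft; return a parsed verdict."""
    resp = client.chat.completions.create(
        model="gpt-oss-safeguard-20b",  # placeholder model name
        messages=[
            # The policy is trusted instructions; the draft is data, never instructions.
            {"role": "system", "content": RISK_POLICY},
            {"role": "user", "content": f"Review this draft:\n<draft>\n{draft}\n</draft>\n"
                                        'Reply with JSON: {"violations": [...], "verdict": "pass"|"flag"}'},
        ],
    )
    try:
        return json.loads(resp.choices[0].message.content)
    except json.JSONDecodeError:
        # Fail closed: unparseable model output is treated as a flag, not a pass.
        return {"violations": ["unparseable model output"], "verdict": "flag"}
```
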
2025: The Year I Bet on Myself
On December 30th, 2024, I finished my last day at IBM. It was the kind of ending that looks simple from the outside, but internally it carried years of thought and a lot of quiet pressure. I wasn’t leaving because I hated the work, and I wasn’t leaving because something broke. I was leaving because I could feel myself outgrowing the comfort of a structured path. IBM gave me discipline, exposure, and a solid environment to sharpen my skills, but I kept feeling a stronger pull…
Jan 1, 2026 · 7 min read


Research Imperatives and the Struggle for Algorithmic Dominance
Rethinking the Question of AI Supremacy: In the world of artificial intelligence, the pace of advancement has become dizzying. Breakthroughs that seemed like science fiction only a few years ago are now real. At the center of this whirlwind are three pioneering labs – Google DeepMind, OpenAI, and Anthropic – engaged in a high-stakes race for influence, innovation, and the future of intelligence. A common question asked is, “Which AI company is the best?” However, framing…
Dec 14, 2025 · 39 min read