Articles
Longer technical writeups from lonely guy on reinforcement learning, LLMs, cyber environments, robotics, and systems.
- Project Halide, Building A Small-Model Diagnostic Workbench For Damaged Film: An open-weight film diagnostics workbench built around MiniCPM-V 4.6, Nemotron-Mini-4B, real negative failure cases, and a validator that became as important as the model.
- Teaching a 1.5B LLM to be a SOC Analyst (Without Burning Down the Network): Fine-tuning a 1.5B LLM with GRPO to defend against realistic enterprise attack chains, and the unexpected lessons learned from reward hacking.