Experiments
Technical experiments, benchmarks, and build logs. Things I run on my own compute, written up for anyone who finds them useful.
AI Red-Team Harness Finds Reproducible AgentDojo Injection
A self-healing AI red-team harness promoted a medium-severity AgentDojo prompt-injection finding only after an independent replay reproduced it.
Independent TurboQuant Benchmarks on Consumer GPUs
First published benchmarks of Google's TurboQuant at 7B scale on RTX 4080. 45 data points, 4 models, real VRAM measurements.