Distillation Caltrops: Watermarks That Live in Learned Habits
Plant a useful reasoning habit in a teacher model. A distilled student inherits the habit form without the substance, fails confidently, and leaves a detectable fingerprint that survives paraphrase.
μInference: From Minimal Stack to SL5 Weight Enclave
Can frontier AI inference run on a radically minimized stack while protecting model weights against nation-state adversaries? An seL4-backed Weight Enclave prototype.
OpenWild: A New Conversational Dataset for Modern LLMs
A new conversational dataset for modern language models. Inspired by WildChat, capturing real-world interactions with current-generation systems.
Mechanistic Watchdog: Real-Time Cognitive Interdiction for LLMs
Real-time cognitive interdiction for LLMs. A safety layer that monitors internal activations and halts generation before harmful content is produced.