Nils Durner's Blog Ahas, Breadcrumbs, Coding Epiphanies

In AI Sweet Harmony: Sociopragmatic Guardrail Bypasses and Evaluation-Awareness in OpenAI gpt-oss-20b

Abstract: We probe OpenAI’s open-weights 20-billion-parameter model gpt-oss-20b to study how sociopragmatic framing, language choice, and instruction hierarchy affect refusal behavior. Across 80 seeded iterations per scenario, we test several harm domains including ZIP-bomb construction (cyber threat), synthetic card-number generation, minor-unsa... Read more

AI inference provider performance inconsistencies

Soon after the OpenAI gpt-oss release, the community noticed stark performance inconsistencies across inference providers, particularly with AWS Bedrock, which sometimes produced inconsistent outputs not seen with other providers. Artificial Intelligence quantified underperformance analysis reported results of gpt-oss-120b via AW... Read more

Kaggle OpenAI Red-Teaming Challenge

The OpenAI challenge for the community to surface previously unreported vulnerabilities and harmful behaviors in their new open-weights model gpt-oss-20b has concluded. I was awarded Honorable Mention, and the jury of industry experts lauds my submission as “particularly interesting work in evaluation-aware sandbagging” - an emerging focus area ... Read more

Reka Web Search Benchmark extended

Reka AI has released a hand-curated dataset, benchmark and leaderboard to grade web search and answer generation of LLM systems. Their blog post describes “Research Eval” as Diverse (374 questions with grading guidelines, across a wide range of topics), Discriminative (current frontier models achieve between 26.7% and 59.1% accuracy), and High-q... Read more

OpenAI Codex CLI agent: Major Update

News: OpenAI Codex CLI, the standalone GPT-5 ⇆ computer interface that’s being positioned as a coding assistant, got a major overhaul. It was rewritten from the ground up, and many useful features were added. IDE integration: there is now a Visual Studio Code extension. How to activate: click on the OpenAI bloom logo on the uppe... Read more