The Kaggle safety evaluation “red-teaming” challenge on OpenAI gpt-oss has concluded with a symposium this week. The symposium opened with talks from D. Sculley, our host and OpenAI researcher focused on responsible and reliable ML, and Samuel Marks, an AI safety researcher at Anthropic. After the keynotes, the prize-winning teams and ho... Read more... 11 Oct 2025 - 1 minute read
An Australian lawyer was stripped of his ability to practice after submitting a list of hallucinated citations to court on July 19, 2024. “The list had been prepared using legal software that utilised AI”, according to reporting by The Guardian. Now, a little over a year later, LLM-powered web search in combination with an Agentic ... Read more... 03 Oct 2025 - 1 minute read
Abstract We probe OpenAI’s open-weights 20-billion-parameter model gpt-oss-20b to study how sociopragmatic framing, language choice, and instruction hierarchy affect refusal behavior. Across 80 seeded iterations per scenario, we test several harm domains including ZIP-bomb construction (cyber threat), synthetic card-number generation, minor-unsa... Read more... 29 Sep 2025 - 1 minute read