With ChatGPT offering features not found in the API, like o1-pro previously or o3 with integrated web search, users may want to switch back and forth between ChatGPT and the Prompts Playground, and perhaps archive conversations to Markdown. Enter Chatbot Conversation Converter: a Python utility that converts chat conversations between different formats, inclu... Read more 10 May 2025 - less than 1 minute read
In my note on LLM as a judge, I pointed out a study where GPT‑4 (0613) aligned well with human ratings. A new paper – “Are LLM‑Judges Robust to Expressions of Uncertainty?” – asks what happens once those answers include explicit markers of certainty or doubt. The authors provide “EMBER”, a benchmark that patches existing QA and instruction‑follo... Read more 04 May 2025 - 2 minute read
One problem with current AI assistants, including ChatGPT and Microsoft Copilot Chat, is that they are chronically online: they no longer offer a more or less pure LLM experience, but search the web through tool use. While this helps keep answers up to date beyond the LLM training cut-off date, there’s a disadvantage in that the UIs don’t allow... Read more 04 May 2025 (Updated) - 1 minute read
Artificial Analysis, an “Independent analysis of AI models and hosting providers” outlet, have published a chart plotting “Intelligence Index vs. Price” (X post). This places Grok 3 Mini Reasoning at the upper left corner, inside the “Most attractive quadrant”. According to this, it has the best intelligence/price ratio - even at High-Reasoning ... Read more 19 Apr 2025 - 1 minute read
In addition to the newly released OpenAI models, I have added Web Search to my LLM frontend. This makes it possible to work with up-to-date information: Prompt: when is the new German chancellor going to be sworn in? Response: Friedrich Merz is scheduled to be elected as Germany’s new Chancellor on May 6, 2025. (reuters.com) […] Source references... Read more 28 Jun 2025 (Updated) - 3 minute read