In a recent LinkedIn discussion around a review of PDF-focused AI assistants (ComputerBase article), I touched on limitations, workarounds, and the ongoing tension between domain-specific tooling and general-purpose, ever-evolving AI platforms. The Article in Brief: The ComputerBase article put Adobe’s new AI Assistant, Google’s NotebookLM, and... Read more 12 May 2025 - 3 minute read
With ChatGPT offering features not found in the API, such as o1-pro previously or o3 with integrated web search, users may want to switch back and forth between ChatGPT and the Prompts Playground, and perhaps archive conversations to Markdown. Enter Chatbot Conversation Converter: a Python utility that converts chat conversations between different formats, inclu... Read more 10 May 2025 - less than 1 minute read
In my note on LLM as a judge, I pointed out a study where GPT‑4 (0613) aligned well with human ratings. A new paper – “Are LLM‑Judges Robust to Expressions of Uncertainty?” – asks what happens once those answers include explicit markers of certainty or doubt. The authors provide “EMBER”, a benchmark that patches existing QA and instruction‑follo... Read more 04 May 2025 - 2 minute read
One problem with current AI assistants, including ChatGPT and Microsoft Copilot Chat, is that they are chronically online: they no longer offer a more or less pure LLM experience, but search the web through tool use. While this helps keep answers up to date beyond the LLM training cut-off date, there’s a disadvantage in that the UIs don’t allow... Read more 04 May 2025 (Updated) - 1 minute read
Artificial Analysis, an “Independent analysis of AI models and hosting providers” outlet, have published a chart plotting “Intelligence Index vs. Price” (X post). This places Grok 3 Mini Reasoning at the upper left corner, inside the “Most attractive quadrant”. According to this, it has the best intelligence/price ratio - even at High-Reasoning ... Read more 19 Apr 2025 - 1 minute read