Nils Durner's Blog Ahas, Breadcrumbs, Coding Epiphanies

ChatGPT - Search Engine or not

In discussions about the regulation of AI services under European law, ChatGPT is in focus — this time concerning the Digital Services Act (DSA). Prompted by a commentary by Luca Bertuzzi on MLex, a LinkedIn exchange recently debated whether ChatGPT qualifies as a “Very Large Online Search Engine” (VLOSE) under the DSA framework. My contribution... Read more...

Security Vulnerabilities in Model Context Protocol (MCP) Servers

The Model Context Protocol (MCP), dubbed “USB-C for AI”, but essentially an out-of-process execution variety of the tool-use paradigm originally by Anthropic, has raised security concerns before. Now, a blog post finds that the official Python SDK makes MCP server services available to the world (0.0.0.0), not just locally (127.0.0.1). As a resu... Read more...

PDFs, LLMs, and 'The Bitter Lesson' in Document AI

In a recent LinkedIn discussion around a review of PDF-focused KI-Assistants (ComputerBase article), I touched on limitations, workarounds, and the ongoing tension between domain-specific tooling and general-purpose, ever-evolving AI platforms. Read more...

LLM Chat Conversion Tool

With ChatGPT offering features not found in the API, like o1-pro previously or o3 with integrated web search, users may want to switch back and forth between ChatGPT, the Prompts Playground, and perhaps archive to Markdown. Enter Chatbot Conversation Converter: A Python utility that converts chat conversations between different formats, inclu... Read more...

EMBER: Epistemic Markers as a Stress‑Test for LLM‑based Evaluation

In my note on LLM as a judge, I pointed out a study where GPT‑4 (0613) aligned well with human ratings. A new paper – “Are LLM‑Judges Robust to Expressions of Uncertainty?” – asks what happens once those answers include explicit markers of certainty or doubt. The authors provide “EMBER”, a benchmark that patches existing QA and instruction‑follo... Read more...