Nils Durner's Blog: Ahas, Breadcrumbs, Coding Epiphanies

Document-to-Markdown Converters for LLM Use

Recently, a few open-source tools for converting PDFs, Office documents, and other formats into Markdown have drawn attention. Among these are MarkItDown from Microsoft, Docling from IBM Research, PyMuPDF4LLM, and the Jina AI Reader API. They aim to provide text suitable for downstream tasks, including LLM-driven analysis, without requiring manu... Read more...

o1 Pro Mode & Llama 3.3

Quick notes on last week’s foundation model releases: Read more...

[UPDATED] Amazon Nova foundation model release

[Update 2025-07-21: AWS has added Amazon Bedrock API keys. I haven’t tried them myself yet, but they could be simpler than setting up IAM as described below.] Read more...

AI Agency: Philosophical Foundations

The term “AI Agent” has become increasingly prevalent in discussions about artificial intelligence, yet its meaning remains somewhat ambiguous. This ambiguity stems partly from different conceptualizations of agency across disciplines and languages. A recent LinkedIn discussion, sparked by Maximilian Seeth’s introduction to AI ethics, highlighte... Read more...

German NER experiments: Presidio, spaCy, GLiNER

While experimenting with the Microsoft Presidio live demo for PII detection, I found that neither model handles German text well when the goal is to also identify organization names. Cloning the HuggingFace space that hosts this demo makes it possible to enable other models (by setting the environment variable ALLOW_OTHER_MODELS = 1), b... Read more...