Nils Durner's Blog
Ahas, Breadcrumbs, Coding Epiphanies

[UPDATED] AI Deployment in Germany: Practical Considerations

The recent news about Aleph Alpha’s strategic shift away from developing large language models (LLMs) to focus on AI integration services has sparked discussions about the state of AI deployment in Germany. While this change might seem discouraging for those hoping for a strong European contender in the AI race, it’s worth examining the practica...

Privacy Concerns in ChatGPT's GPT Ecosystem

A recent study has shed light on significant privacy and data protection issues within ChatGPT’s GPT ecosystem.
Key Findings
The study reveals several alarming practices:
Widespread Data Collection: GPTs and their associated Actions (external services) collect extensive user data, often without proper disclosure or consent. ...

Pharia-1-LLM: A Closer Look at Aleph Alpha's Latest Release

Aleph Alpha recently announced the launch of Pharia-1-LLM, a new language model series with two 7B foundation models. After reviewing the available materials, including the Model Card, I’ve been able to answer some questions about this release in online discussions. Here’s a breakdown of key points and considerations:
Model Characteristics and ...

Mitigating LLM Hallucinations: The Power of System Prompts

Factuality hallucinations remain a persistent issue in Large Language Models (LLMs). In a LinkedIn post, Maxime Labonne cited the “Indigo Sock Game” as an example: a non-existent game that, according to him, most models will nonetheless confidently describe when prompted. This phenomenon underscores the ongoing challenges in ensuring LLM reliabi...

LLM Benchmarks: The Impact of Temperature and Sampling

Recent discussions around the Aidan Bench (https://github.com/aidanmclaughlin/Aidan-Bench) have highlighted the significant impact of temperature settings and sampling methods on benchmark results for large language models (LLMs). Sam Paech’s experiments (https://x.com/sam_paech/status/1823295200724398244) with the GPT-4o-mini model demonstrate...