Nils Durner's Blog: Ahas, Breadcrumbs, Coding Epiphanies

Computation with LLMs

Popular wisdom holds that language models are “not made for computation”, and that computation with them is therefore best avoided. This is backed by a study that confirms limitations even with o1 (albeit with much higher limits). However, this does not hold true for “Language Models like ChatGPT”, as claimed e.g. by TechCrunch: as an AI system, it extends beyond the basic Large...

Bosch GenAI Playground

A little over a year ago, Bosch (a German multinational engineering and technology company) announced “BoschGPT”, a custom language model developed in partnership with German startup Aleph Alpha, which was intended to serve as an internal chatbot for Bosch employees to access and query the company’s vast knowledge database. This initiative repr...

Open Source AI: A Reality Check

A WIRED article from August 2023 about “The Myth of ‘Open Source’ AI” resurfaced in my feed yesterday. The article, which focused on Meta’s Llama 2 release, raised concerns about the true openness of AI models. However, the field’s developments over the past 12+ months present a different picture. Since Llama 2, Meta has released Llama 3, 3.1, ...

Visual Bongard Puzzles & VLMs: Technical Review

A recent paper titled “Bongard in Wonderland: Visual Puzzles that Still Make AI Go Mad?” has garnered some attention through German tech media. The study claims to demonstrate fundamental limitations in visual language models’ (VLMs) reasoning capabilities. However, upon closer examination of their methodology and code implementation, several is...

Pelican vs. Llama 3.1 405B and others

Motivated by starkly different results from different Llama 3.1 405B providers on the one hand, and claims (particularly those derived from the Chatbot Arena) that quantized versions are no different on the other, I have been wishing for a telltale sign that 1) conclusively proves otherwise and 2) tells providers apart. Good news: Simon Willison has s...