Nils Durner's Blog Ahas, Breadcrumbs, Coding Epiphanies

Google's Gemini: Predictions and Implications

Google has allegedly given a few companies preview access to Gemini: Reuters. One commenter predicted General Availability in December. If the GPT-4 timeline is any measure, that’s not unreasonable to assume: Read more...

Claude 2: Model vs. System

A comparison chart of various chatbots has raised some questions about Claude 2, in particular the “Free” price tag. As with “ChatGPT”, different people mean different things by it, and it again helps to think in the categories previously established by Miles Brundage: Models, Platforms and Systems. Here is his slide extended with Claude 2: Read more...

ChatGPT Plus: Closer look at Plugins and Advanced Data Analysis

OpenAI have renamed “Code Interpreter” to “Advanced Data Analysis”. That’s still a misnomer to me, so I’ll try to explain: both ChatGPT Plus premium features, “Code Interpreter” and “Plugins”, are basically two use cases of the same underlying LLM/GPT feature: what Microsoft Research described in their “Sparks of AGI” paper as “Tool Usage”. Read more...

Understanding GenAI: Models, Systems, Platforms and Use Cases

Following up on my post about ChatGPT (not) getting dumber, a commenter remarked that Bard did better on this particular math exercise. Read more...

ChatGPT Getting Dumber: A Closer Look

The Wall Street Journal has a piece on “Why ChatGPT Is Getting Dumber at Basic Math”. This is rooted in the same junk science by Zou et al. discussed previously. What happened in the meantime: one of the alleged “model degradations” was determined to be a broken benchmark script by Zou et al. plus a behaviour change by GPT-4. Simon Boehm fixed the... Read more...