Nils Durner's Blog Ahas, Breadcrumbs, Coding Epiphanies

Corporate spending on AI

Ethan Mollick recently shared some intriguing data about AI spending patterns. The numbers, sourced from a Ramp report, seem to paint a rosy picture for OpenAI. But it’s worth digging a bit deeper. According to the report, OpenAI is experiencing impressive retention and growth numbers. A whopping 82% of companies that spent on OpenAI a year ago...

GPT-4o multimodality cookbook

OpenAI have updated the API cookbook to walk through the basics of multimodality and using GPT-4o via the API. However, the code sample there shows that GPT-4o does not (yet?) process video natively and instead relies on frames extracted from the video as images. The code sample extracts them at a fixed sampling rate of 0.5 Hz (one frame every two seconds). My question if there a...
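To make the fixed sampling rate in the excerpt above concrete, here is a minimal sketch of how the kept frames could be selected. It is not the cookbook's code: the function name `sampled_frame_indices` and its parameters are hypothetical, and it only computes frame indices instead of decoding an actual video.

```python
def sampled_frame_indices(total_frames: int, fps: float, sample_hz: float = 0.5) -> list[int]:
    """Return the indices of the frames kept when sampling at sample_hz.

    Hypothetical helper for illustration; assumes a constant frame rate.
    """
    if total_frames < 0 or fps <= 0 or sample_hz <= 0:
        raise ValueError("total_frames must be >= 0; fps and sample_hz must be > 0")
    # Number of source frames between two kept frames, never less than 1.
    step = max(1, round(fps / sample_hz))
    return list(range(0, total_frames, step))


if __name__ == "__main__":
    # A 10-second clip at 30 fps has 300 frames; at 0.5 Hz every 60th frame is kept.
    print(sampled_frame_indices(300, 30.0))
```

Under these assumptions, a 10-second clip at 30 fps is reduced to five still images, which illustrates how much temporal detail a fixed 0.5 Hz rate discards.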

AI-Assisted Summarization: MIRO Boards and GPT for Complex Discussions

In a recent LinkedIn discussion initiated by Alexander Eichler about the European Wallet, an interesting side conversation emerged regarding the use of AI for summarizing complex discussions. I was brought into this conversation to provide insights on potential AI solutions. The Challenge: Alexander Eichler had participated in a discussion invo...

AI Agent 'Aileen 2'

The “NVIDIA & LangChain AI Agents contest” concluded on Tuesday at 2 in the morning, and I’m proud that I crossed the finish line 3 hours early with a little something: Introducing Aileen 2, an AI office agent and my entry into the “NVIDIA and LangChain Generative AI Agents Developer Contest” in the Small Language Model category. AI agent...

WildVision Arena Benchmark

Ethan Mollick remarks that “people realize how capable multimodal AI is, right now, out of the box”. I agree. For those who want to try and compare different models, there is https://huggingface.co/spaces/WildVision/vision-arena. Besides the usual suspects, I recommend trying Reka AI Flash.