OpenAI have updated the API cookbook to walk through the basics of multimodality and using GPT-4o via the API. However, the code sample there shows that GPT-4o does not (yet?) process video natively and instead relies on frames extracted from the video. The code sample extracts them at a fixed sampling rate of 0.5 Hz (one frame every two seconds). My question if there a... Read more 15 May 2024 - less than 1 minute read
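To make the fixed-rate sampling in the teaser above concrete, here is a minimal sketch of how one might pick which frames to keep at 0.5 Hz. It is not the cookbook's code: the function name `sample_frame_indices` and its parameters are hypothetical, and it only computes frame indices, leaving the actual decoding to whatever video library is in use.

```python
def sample_frame_indices(total_frames: int, fps: float, sample_hz: float = 0.5) -> list[int]:
    """Return the indices of frames to keep when sampling a video at sample_hz.

    Hypothetical helper for illustration only. With sample_hz = 0.5,
    one frame is kept for every two seconds of video.
    """
    if fps <= 0 or sample_hz <= 0:
        raise ValueError("fps and sample_hz must be positive")

    # Number of source frames between two kept frames (may be fractional).
    step = fps / sample_hz

    indices = []
    k = 0
    while True:
        idx = int(round(k * step))
        if idx >= total_frames:
            break
        indices.append(idx)
        k += 1
    return indices


if __name__ == "__main__":
    # A 10-second clip at 30 fps has 300 frames; 0.5 Hz keeps one frame every 60.
    print(sample_frame_indices(total_frames=300, fps=30.0))
```

A fixed rate like this is simple, but it ignores how much is happening in the footage: a static scene and a fast-moving one get the same number of frames.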
In a recent LinkedIn discussion initiated by Alexander Eichler about the European Wallet, an interesting side conversation emerged regarding the use of AI for summarizing complex discussions. I was brought into this conversation to provide insights on potential AI solutions. The Challenge Alexander Eichler had participated in a discussion invo... Read more 15 May 2024 - 2 minute read
The “Nvidia & Langchain AI Agents contest” concluded on Tuesday at 2 in the morning, and I’m proud that I crossed the finish line 3 hours early with a little something: Introducing 𝑨𝒊𝒍𝒆𝒆𝒏 𝟐: an AI office agent and my entry into the ‘NVIDIA and LangChain Generative AI Agents Developer Contest’ in the Small Language Model category. AI agent... Read more 14 May 2024 - 1 minute read
(Updated on Jul 18) So OpenAI have (pre-)released a new member of the GPT-4 family. This has been seen around the web under the guise of “gpt2-chatbot” or “GPT-4 Lite”, and is now available on the free tier of ChatGPT (with limits) and via API. It’s important to realize that it’s a whole new model, in several ways. Techniques learnt & devel... Read more 14 May 2024 - 1 minute read
Ethan Mollick remarks that “people realize how capable multimodal AI is, right now, out of the box”. I agree. For those who want to try and compare different models, there is https://huggingface.co/spaces/WildVision/vision-arena. Besides the usual suspects, I recommend trying Reka AI Flash. Read more 07 May 2024 - less than 1 minute read