Nils Durner's Blog Ahas, Breadcrumbs, Coding Epiphanies

Vision model spatiality

Following up on an announcement that LLaVA-NeXT had been merged into Hugging Face Transformers, someone asked for a “vision-language model which can distinguish the left side from the right side of the frame/picture”. My response: There was a preprint paper just recently where they superimpose a grid onto the original image and also add coord...

IBM Research: synthetic Q&A

IBM Research introduces “LAB”: “Large-scale Alignment for chatBots”. From their blogpost: The large language models (LLMs) behind modern chatbots are pre-trained on the raw text to learn an abstract representation of language. This then primes them to learn many tasks quickly once they see labeled, detailed instructions during alignment. But ...

Anthropic Claude 3 released

Anthropic have released the Claude 3 family: Announcement. Early comments from my filter bubble:
- not available in the EU
- hosting options also include Microsoft Azure, in addition to AWS?
- overreaching guardrails: Claude still refuses work at times, e.g. coding a website
- may be gaming and misrepresenting benchmark scores. The comparisons t...

Suno AI at MoCA Taipei

The AI art exhibition Hello, Human currently running at MoCA Taipei features several works created using Suno AI. To my surprise, Suno AI has singing voices! Very beautiful, and only slightly glitchy at times. Image: Visitor enjoying artwork by Koya Matsuo in the “Hello, Human” exhibition created by Keith Lam and Escher Tsai. The artworks “Tori...

AI Use Cases - according to analysts

Business analysts all have their own theories and predictions on GenAI. One evergreen is “summarization”, perhaps gleaned from Model Cards. Same for the recent Forrester event in Milan: The other main reason consumers are using AI, in Italy but especially in Germany, is to summarize something. My observation is different: Anecdotally, ra...