Nils Durner's Blog Ahas, Breadcrumbs, Coding Epiphanies

Gemini 1.5 Pro released

Google Gemini 1.5 Pro, including its large context window of 1 million tokens, is currently available for free at https://aistudio.google.com/. It’s officially not available in the EU, and its Terms of Service, among other things, mandate that: You may only access the Services (or make API Clients available to users) within an available region. (I... Read more

Vision model spatiality

Following up on an announcement that LLaVA-NeXT had been merged into Hugging Face Transformers, someone asked for a “vision-language model which can distinguish the left side from the right side of the frame/picture”. My response: There was a preprint paper just recently where they superimpose a grid onto the original image and also add coord... Read more

IBM Research: synthetic Q&A

IBM Research introduces “LAB”: “Large-scale Alignment for chatBots”. From their blogpost: The large language models (LLMs) behind modern chatbots are pre-trained on the raw text to learn an abstract representation of language. This then primes them to learn many tasks quickly once they see labeled, detailed instructions during alignment. But ... Read more

Anthropic Claude 3 released

Anthropic have released the Claude 3 family: Announcement. Early comments from my filter bubble: not available in the EU; hosting options also include Microsoft Azure, in addition to AWS?; overreaching guardrails: Claude still refuses work at times, e.g. coding a website; may be gaming and misrepresenting benchmark scores. The comparisons t... Read more

Suno AI at MoCA Taipei

The AI art exhibition Hello, Human currently running at MoCA Taipei features several works created using Suno AI. To my surprise, Suno AI has singing voices! Very beautiful, and only slightly glitchy at times. Image: Visitor enjoying artwork by Koya Matsuo in the “Hello, Human” exhibition created by Keith Lam and Escher Tsai. The artworks “Tori... Read more