Nils Durner's Blog Ahas, Breadcrumbs, Coding Epiphanies

Exploring Audio Input with Gemini 1.5 Pro

Simon Willison recently asked about experiments with audio input to Google Gemini 1.5 Pro and Flash models, noting that the ability to query audio files beyond simple transcription is an intriguing and potentially underexplored capability. Read more...

Benchmarking AI Vision

Ethan Mollick, Associate Professor at The Wharton School, recently shared two key developments: Read more...

Gemma-2: Impressive or Just Well-Dressed?

Google recently released its open-source Gemma-2 models (27b and 9b variants), which have been gaining attention in the AI community. In a LinkedIn post, Peter Gostev, Head of AI at Moonpig, highlighted that the 27b variant now ranks slightly higher than Meta’s 70b model, despite being about 2.5 times smaller. Read more...

AI Wiki

A colleague pitched the idea of creating an “AI Wiki” with best practices. Read more...

[UPDATED] Clash Eval

ClashEval: When LLM Safeguards Clash with RAG (2024-05-20, last updated 2024-05-20). Details the Clash evaluation framework for LLM comparison, describing methodology, scoring metrics, and case study results across multiple models. Tags: llm, rag, misinformation, ai-safety, aleph alpha. Read more...