Nils Durner's Blog Ahas, Breadcrumbs, Coding Epiphanies

LLM as a judge

Paper about “A Meeting Assistant Benchmark for Long-Context Language Models” with a remarkable side-note: We also provide a thorough analysis of our GPT-4-based evaluation method, encompassing insights from a crowdsourcing study. Our findings suggest that while GPT-4’s evaluation scores are correlated with human judges’, its ability to differ... Read more

xz Backdoor

A lot has been (and continues to be) written about the xz Backdoor. What is even more troubling, however, is that this is yet another demonstrated open-source supply chain attack, perhaps prepared years in advance. It could have hit(*) any downstream maintainer, just like the faker.js incident, but there were two possible evasion f... Read more

LLM Tokenizer comparison

A poster on LinkedIn highlighted the Xenova Tokenizer Playground as a way to compare tokenizer efficiency. I remarked: There is, however, a difference between what this Playground calculates and what the relevant APIs report as actually used (and therefore billed) input tokens. With a short German sentence: Xenova Playground “Claude”: 13 Tokens Clau... Read more

Apple Chip Flaw

The tech press is busy reporting on an alleged “Apple Chip Flaw Leaks Secret Encryption Keys”. This is not a real concern. Rather, it’s junk science. The researchers used a third-party cryptography tool/library called “OpenSSL”. Apple’s own cryptography library is called “CryptoKit”, and this is commonly used by apps and macOS/iOS/iPadOS itse... Read more

[UPDATED] AI Attribution in Art

Examines AI-generated art attribution methods, including digital watermarking, metadata embedding, and industry standards for provenance tracking. Date: 2024-03-26. Tags: ai, art, software engineering, genai. AI Attribution in Art offers perspectives for tech ... Read more