Nils Durner's Blog
Ahas, Breadcrumbs, Coding Epiphanies

LLM on smartphone

Following up on our conversation about quantized models on smartphones, Stefano Fiorucci wrote a post about how to run a small language model on a smartphone. This involves either the Layla Lite app or Termux. One commenter recommended LLM Farm on iPhone. Read more
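For a rough idea of what the Termux route looks like: once Python and llama-cpp-python are installed there, a quantized GGUF model can be driven from a few lines of Python. A minimal sketch, assuming the package built successfully on the phone; the model file name is a placeholder:

```python
# Minimal sketch: run a small quantized GGUF model via llama-cpp-python
# (assumes `pip install llama-cpp-python` succeeded inside Termux and a
# quantized model file has already been downloaded to the path below).
from llama_cpp import Llama

llm = Llama(
    model_path="./phi-2.Q4_K_M.gguf",  # placeholder: any small quantized GGUF model
    n_ctx=2048,                        # modest context to keep RAM usage phone-friendly
)

out = llm(
    "Q: Name three uses for a local LLM on a phone. A:",
    max_tokens=128,
    stop=["Q:"],
)
print(out["choices"][0]["text"])
```

Apps like Layla Lite or LLM Farm wrap essentially the same llama.cpp machinery behind a UI, so the trade-offs (quantization level vs. RAM, context size vs. speed) carry over.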

Gemini Large context & Video

RAGfluencers are, of course, unhappy with Gemini’s very large 1 million token context window, noting the high costs associated with a large number of input tokens: “It feels like a very niche use case”. The niche for the 1M tokens would be multi-modality in general and video in particular. My modest experiments suggest that the model does ... Read more
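To make the video niche concrete, here is a minimal sketch against the Google Generative AI Python SDK. The file name and prompt are placeholders, and the polling loop is kept deliberately simple:

```python
# Sketch: feed a video into Gemini 1.5 Pro via the google-generativeai SDK.
# Assumes GOOGLE_API_KEY is set and "talk.mp4" is a placeholder local file.
import os
import time
import google.generativeai as genai

genai.configure(api_key=os.environ["GOOGLE_API_KEY"])

video = genai.upload_file(path="talk.mp4")
while video.state.name == "PROCESSING":   # wait until the File API has processed the upload
    time.sleep(5)
    video = genai.get_file(video.name)

model = genai.GenerativeModel("gemini-1.5-pro")
response = model.generate_content([video, "Summarize the key points of this video."])
print(response.text)
print(response.usage_metadata)            # shows how many input tokens the video consumed
```

The usage metadata makes the cost argument tangible: a longer video quickly eats a large share of the 1M-token budget, which is exactly where that budget earns its keep.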

LLM as a judge

A paper on “A Meeting Assistant Benchmark for Long-Context Language Models” with a remarkable side note: We also provide a thorough analysis of our GPT-4-based evaluation method, encompassing insights from a crowdsourcing study. Our findings suggest that while GPT-4’s evaluation scores are correlated with human judges’, its ability to differ... Read more
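For readers unfamiliar with the pattern, a generic LLM-as-a-judge loop looks roughly like the sketch below. This is not the paper's evaluation protocol, just the general technique: ask GPT-4 to score a candidate answer against a reference on a fixed scale (the prompt wording and the 1-5 scale are assumptions):

```python
# Generic LLM-as-a-judge sketch (not the paper's protocol): GPT-4 rates a
# candidate meeting summary against a reference on a 1-5 scale.
from openai import OpenAI

client = OpenAI()  # expects OPENAI_API_KEY in the environment

def judge(reference: str, candidate: str) -> str:
    prompt = (
        "You are grading a meeting summary.\n"
        f"Reference summary:\n{reference}\n\n"
        f"Candidate summary:\n{candidate}\n\n"
        "Rate the candidate from 1 (poor) to 5 (excellent) for factual "
        "consistency with the reference. Reply with the number only."
    )
    response = client.chat.completions.create(
        model="gpt-4",
        messages=[{"role": "user", "content": prompt}],
        temperature=0,
    )
    return response.choices[0].message.content

print(judge("The team agreed to ship on Friday.",
            "The meeting concluded with a Friday release date."))
```

The paper's caveat is precisely about this kind of setup: the scores correlate with human judges, but that correlation alone does not guarantee the judge can discriminate reliably.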

xz Backdoor

A lot has been (and continues to be) written about the xz Backdoor. What is even more troubling, however, is that this is yet another demonstrated open-source supply chain attack, perhaps prepared years in advance. It could have hit(*) any downstream maintainer, just like with the faker.js incident, but there were two possible evasion f... Read more

LLM Tokenizer comparison

A poster on LinkedIn highlighted the Xenova Tokenizer Playground for comparing tokenizer efficiency. I remarked: there is, however, a difference between what this playground calculates and what the relevant APIs report as actually used (and therefore billed) input tokens. With a short German sentence: Xenova Playground „Claude“: 13 tokens, Clau... Read more
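The comparison is easy to reproduce for models whose tokenizers are public. A sketch for the OpenAI side (tiktoken for the local count, the API's usage field for the billed count); the example sentence is a placeholder, and the Claude side is harder to reproduce locally since Anthropic's production tokenizer is not published the same way:

```python
# Sketch: compare a local tiktoken count with the input tokens the
# OpenAI API actually reports (and bills) for the same text.
import tiktoken
from openai import OpenAI

text = "Dies ist ein kurzer deutscher Beispielsatz."  # placeholder German sentence

# Local count, as a playground would compute it
enc = tiktoken.encoding_for_model("gpt-4")
local_count = len(enc.encode(text))

# Billed count, as reported by the API (includes chat-format overhead)
client = OpenAI()
response = client.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": text}],
    max_tokens=1,
)
print("local tokenizer count:", local_count)
print("API-reported input tokens:", response.usage.prompt_tokens)
```

The gap between the two numbers comes from chat-format framing and other per-request overhead, which is exactly why playground counts and billed counts diverge.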