Nils Durner's Blog Ahas, Breadcrumbs, Coding Epiphanies

AI Agent 'Aileen 2'

The “Nvidia & Langchain AI Agents contest” concluded on Tuesday at 2 a.m., and I’m proud that I crossed the finish line three hours early with a little something. Introducing 𝑨𝒊𝒍𝒆𝒆𝒏 𝟐: an AI office agent and my entry into the ‘NVIDIA and LangChain Generative AI Agents Developer Contest’ in the Small Language Model category. AI agent...

WildVision Arena Benchmark

Ethan Mollick remarks that “people realize how capable multimodal AI is, right now, out of the box”. I agree. For those who want to try and compare different models, there is https://huggingface.co/spaces/WildVision/vision-arena. Besides the usual suspects, I recommend trying Reka AI Flash.

Sora: manual post-processing

A lot of background on the practicalities from one of the Sora private beta-testing teams: https://www.fxguide.com/fxfeatured/actually-using-sora/ “Air Head still needed a large amount of editorial and human direction to produce this engaging and funny story film.” This included using Adobe After Effects. N.B.: The article states that Sora is ...

GLiNER NER model

Urchade Zaratiana has released GLiNER, an NER model “that can identify any type of entity using a bidirectional transformer encoder”. One of the to-dos listed is “Allow longer context”, and a closer look reveals that the Colab caps out at 384 tokens and confirms the DebertaV2 architecture. Interesting, but too restricted as of now.
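One common way to live with a short context limit like the 384 tokens mentioned above is to cut long text into overlapping windows and run the model on each window separately. The following is only a minimal sketch of that idea, not something taken from GLiNER itself: the function name `split_into_windows` and the `overlap` parameter are hypothetical, and whitespace-separated words are used as a rough stand-in for model tokens, which is an assumption (a real tokenizer will count differently).

```python
from typing import Iterator, List

MAX_TOKENS = 384  # cap observed in the Colab, per the post


def split_into_windows(
    text: str, max_tokens: int = MAX_TOKENS, overlap: int = 32
) -> Iterator[str]:
    """Yield overlapping chunks of at most `max_tokens` words.

    Assumption: whitespace-separated words approximate model tokens.
    The overlap keeps entities that straddle a boundary visible in
    at least one window.
    """
    if not 0 <= overlap < max_tokens:
        raise ValueError("overlap must be >= 0 and smaller than max_tokens")
    words: List[str] = text.split()
    step = max_tokens - overlap
    for start in range(0, len(words), step):
        yield " ".join(words[start:start + max_tokens])
        # Stop once the current window already reaches the end of the text.
        if start + max_tokens >= len(words):
            break
```

Each window could then be passed to the NER model on its own; merging duplicate entities found in the overlapping regions is left out of this sketch.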

[UPDATED] OBS Stream Recording

How does one record a live stream unattended, perhaps using AWS EC2? A discussion on Reddit led me to base everything on a g5.xlarge instance. Problems: ⚒️ On the Windows Server, there initially was no sound card. Fixed by simply installing Virtual Audio Cable (VAC)? Fixed by running Microsoft Teams in an OBS Source of type “Browser”: this will...