A video that’s currently all over my social media timeline demonstrates a fascinating leap in AI-driven animation. Developed by Chinese research groups, the demo is a significant milestone toward what I’d love to see in a “Generative AI” product or service. The technology, called LivePortrait, animates static images based on a driver v... Read more 09 Jul 2024 - 1 minute read
Maxime Labonne, Staff Machine Learning Scientist at Liquid AI, recently posited that while the models themselves have made significant progress, user interfaces haven’t kept pace. Labonne points out that current LLM interfaces don’t align well with how people typically use these models. Users often engage in back-and-forth conversations, edit p... Read more 05 Jul 2024 - 1 minute read
Recently, Philipp Schmid shared an interactive LLM pricing comparison tool hosted on Hugging Face. This tool allows users to filter providers and models, comparing them side-by-side. It’s an impressive effort that includes a wide range of providers such as Fireworks AI, Groq, Replicate, and IBM. While this tool is undoubtedly useful, I couldn’t... Read more 02 Jul 2024 - 1 minute read
Recently, I came across a LinkedIn post by Armand Ruiz, VP of Product - AI Platform at IBM, discussing the differences between chatbots, copilots, and agents. While the post provided a general overview of these AI categories, it prompted me to ask about a more specific topic: IBM’s Granite models. Having recently participated in the NVIDIA ... Read more 30 Jun 2024 - 1 minute read
Simon Willison recently asked about experiments with audio input to Google Gemini 1.5 Pro and Flash models, noting that the ability to query audio files beyond simple transcription is an intriguing and potentially underexplored capability. Michael Gackstatter reported issues processing German audio with Gemini 1.5 Pro, receiving a “cannot proce... Read more 29 Jun 2024 - 1 minute read