Outlet “TestingCatalog” reports that Microsoft is testing several new features for Copilot. As part of the prompt field, there are “Think Deeper”, “Deep Research” and “Action” (TestingCatalog blog post). The latter may replicate OpenAI Operator and “Computer Use”, but that’s not certain. OpenAI Operator runs inside a cloud VM, but TestingCatalo... Read more 04 May 2025 (Updated) - less than 1 minute read
GPT-4o for image generation has been released as part of ChatGPT and Sora. It supersedes the Dall-E 3 model, which was originally released in October 2023 but remains the best OpenAI image generation model available via their API. Most notable for me, 4o image generation supports not just text prompts (like Dall-E did), but also image prompt... Read more 27 Mar 2025 (Updated) - less than 1 minute read
In the LLM frontend comparison for my article on process visualization, I have included whether or not the particular frontend includes support for PDF files. Some of the frontends considered do so only poorly: either they only use the text portion, or the PDF gets pre-processed through RAG. OpenAI have now added the capability to process PDF fi... Read more 19 Mar 2025 - less than 1 minute read
Several outlets report on Russian propaganda agencies poisoning AI training data, citing Newsguard: “An audit found that the 10 leading generative AI tools advanced Moscow’s disinformation goals by repeating false claims from the pro-Kremlin Pravda network 33 percent of the time.” They only seem to have published 5 out of the 15 questions the... Read more 19 Apr 2025 (Updated) - 3 minute read
OpenAI have made additional agentic features available to developers: Web Search, File Search, Computer Use and an SDK that improves on Swarm (Announcement). The “Computer Use” API gives access to the CUA model, which also powers OpenAI Operator. My notes on this: fixed “Bing search: OpenAI news” with initial screenshot demo script here th... Read more 13 Mar 2025 - less than 1 minute read