Headline at Wired: “Microsoft’s AI Can Be Turned Into an Automated Phishing Machine”. It was even worse than just phishing: someone retrieved a confidential document (signed with DocuSign) from a public Copilot, per a post on X. This is a nice case of OpenAI Miles’ distinction about what gets deployed: it’s not primarily the AI foundation model that’... Read more 10 Aug 2024 - less than 1 minute read
Simon Willison, creator of Datasette and co-creator of Django, recently asked on Twitter for a “vibe check” on Llama 3.1 405B. He was particularly interested in whether it’s becoming a credible self-hosted alternative to the best OpenAI or Anthropic models, and if any companies previously hesitant about sending data to API providers are now usin... Read more 30 Jul 2024 - 1 minute read
Insightful article in the WSJ, as anti-AI sentiment seems to be growing: technology providers increasingly offer kitted-out premium AI products, but these have yet to gain traction among many enterprise customers. Tools like Copilot for Microsoft 365 or Gemini for Google Workspace are turning out to require a lot of hand-holding to make them ... Read more 29 Jul 2024 - 1 minute read
A recent paper in Nature, “AI models collapse when trained on recursively generated data” by Shumailov et al., has sparked a heated debate in the AI community about the risks of training language models on synthetic data. The paper suggests that indiscriminate use of model-generated content in training can cause irreversible def... Read more 28 Jul 2024 - 2 minute read
Ethan Mollick, Associate Professor at The Wharton School, recently noted some significant gaps in current LLM benchmarking: no benchmark for LLM hallucination rates; few benchmarks with human comparisons; and a lack of common benchmarks for use cases like innovation, writing, persuasion, human interaction, education, and creativity. Mollick poi... Read more 20 Jul 2024 - less than 1 minute read