New Post 11-8-2023

Top Story

OpenAI’s blockbuster DevDay

OpenAI continues to lap the field of LLM competitors, moving from the era of homemade hobbyist toy projects to the coming phase of industrial-strength applications and a robust Apple-style app store.

Announcements included:

  1. A turbocharged, cheaper GPT-4 with a 128,000-token context length

  2. Seamless use of vision, data analysis, and text-to-image alongside chat

  3. Roll-your-own tools for custom chatbots

  4. An app store

  5. A “copyright shield” to protect users from copyright lawsuits

Bigger, faster, smarter, cheaper, easier.

AI is coming of age. And OpenAI is widening its moat.

Clash of the Titans

Google has human reviewers read Bard chats, sparking privacy uproar

Google is desperately trying to get back in the running in an area it literally invented. (The state-of-the-art Transformer architecture for Large Language Models was developed by a team at Google, and every single member of that team has since left the company.) But it keeps stubbing its toe. Now GOOG has stirred up a hornet’s nest of controversy over privacy by revealing that human reviewers monitor conversations on its flagship chatbot, Bard. Google says this is necessary to continuously tune Bard to be more helpful, relevant, and truthful, and less likely to go off the rails with racist rants or instructions on how to build a meth lab. Correctly handled, this would have been a minor one-day controversy. Blundering Google has turned it into a brushfire that threatens to spread. Keep your popcorn handy.

Elon Musk’s xAI announces Grok, yet another LLM

Elon loves attention. It’s fun, it feeds his ginormous ego, and (not coincidentally) it distracts from scrutiny of his many colossal failures. Like the dumpster fire at X (the Twitter that dare not say its name). Now he enters the AI wars with Grok, a solid but unspectacular knock-off of ChatGPT (GPT-3.5) with a sassier personality. Handled competently, the Grok LLM ploy could add billions to Elon’s depleted bank account, maybe even make X/Twitter once again worth the $44 billion he paid for it (his own accountants now value it at $19 billion, a loss of more than 55%). Anyone wanna bet Elon handles this competently? Anyone?

This man just lost $25 billion. Why is he laughing?

French team releases tiny LLM that competes with the big boys

Size isn’t everything. And star-studded French AI startup Mistral is out to prove it. Founded by escapees from storied AI labs such as Google’s DeepMind and Facebook’s LLaMA project, Mistral aims to be the OpenAI of Europe, only… better. More refined. More European. Less American. N’est-ce pas?

Now they have developed a 7 billion parameter LLM (GPT-3, the model family behind the original ChatGPT, has 175 billion), fine-tuned it, and shown that it can match the performance of Facebook/Meta’s flagship 70 billion parameter LLaMA model, which itself performs just shy of ChatGPT. Proof that it ain’t all about the parameters, and that smaller models can compete (for now).

Fun News

Survey finds that few Americans use (or fear) ChatGPT

Despite all the hype about AI, Pew Research polling shows only 18% of American adults have ever used ChatGPT. And of those who have heard of ChatGPT, only 19% think it will have a major impact on their job. However, 47% are concerned about the impact AI may have on everyday life.

Scientists use AI to decode cat facial expressions

Cats are famously imperious and aloof, and it can be hard to know what Fluffy’s thinking. Scientists are hard at work using AI to help decode the meaning of the facial expressions of cats. This is part of a much larger field of study in which researchers are using AI to decode the postures, movements, and vocalizations that various species use to communicate with others of their kind. (We reported on work with whales a few weeks ago.) Ultimately, the goal is to be able to decipher their communications, and even communicate back to them in their own “language.”

AI startup Runway demos first physical device for video creation

Runway is a text-to-video startup that is beloved by the Twitterati. On Friday, they announced 1stAI Machine, a physical video editor for AI. The demo clip is intriguing, but the concept seems a bit retro. Isn’t my iPhone a physical video editing device? Check out the clip and decide for yourself.

AI in Medicine

UpToDate gets an AI makeover

Medical knowledge advances quickly, and practicing physicians can find it hard to keep current. That’s why searchable medical databases of best practices, such as the industry-leading UpToDate, are widely used as an in-the-exam-room resource for looking up the latest consensus on how to diagnose and treat a wide range of diseases. Currently, these practice aids are structured like a medical Google that physicians search with keywords. Wolters Kluwer Health, the publisher of UpToDate, is now beta-testing an AI upgrade to its flagship product, which should allow much more efficient and intelligent queries, getting the latest knowledge into the hands of the busy practitioner faster and more effectively.

Paper Chase

Google DeepMind researchers propose AI classification tiers

The big brains at DeepMind want to be able to track humanity’s progress toward Artificial General Intelligence. To do that, they are proposing a classification framework that rates the performance, scope, and autonomy of AI models. The framework is modeled on the widely used classification of autonomous driving systems, from Level 0 (fully manual) to Level 5 (fully autonomous).

Yeah, this is the AV classification, because the paper doesn’t have any cool graphics
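
For readers who think in code, ordered tiers like these map naturally onto an integer enum. Here is a minimal sketch of the driving scale the framework borrows from. It assumes the scale in question is the SAE J3016 one and uses that standard's level names, which the summary above does not spell out; the class and function names are made up for illustration.

```python
from enum import IntEnum


class DrivingAutomationLevel(IntEnum):
    """Driving automation tiers, assuming the SAE J3016 level names."""
    NO_AUTOMATION = 0           # fully manual
    DRIVER_ASSISTANCE = 1
    PARTIAL_AUTOMATION = 2
    CONDITIONAL_AUTOMATION = 3
    HIGH_AUTOMATION = 4
    FULL_AUTOMATION = 5         # fully autonomous


def describe(level):
    """Return a human-readable label such as 'Level 5: Full Automation'."""
    return f"Level {level.value}: {level.name.replace('_', ' ').title()}"


# Because the enum is integer-valued, tiers can be compared directly.
lowest = min(DrivingAutomationLevel)
highest = max(DrivingAutomationLevel)
print(describe(lowest))
print(describe(highest))
```

The appeal of a ladder like this is that "how far along are we?" becomes a simple comparison between two levels, which is presumably what makes it attractive as a template for tracking AI progress.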

Fine Tuning Without Tears

Retraining a base model to fine-tune it for your custom use case can be time-consuming, tedious, and expensive. Now, researchers from Microsoft, UC Berkeley, and Georgia Tech have come up with an approach to optimize your model without the drudgery and expense of changing parameter weights. They call the method PASTA, for Post-hoc Attention Steering Approach. It steers the operation of selected attention heads at inference time using emphasis markers supplied by the user, nudging the model toward a desired result that the base model would be less likely to produce.

Academic papers so rarely have good graphics (sigh)
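
Since the paper is light on pictures, here is a toy illustration of the attention-steering idea in plain Python. It is a rough sketch, not the authors' implementation: it assumes steering means shrinking the attention a head pays to non-emphasized tokens and then renormalizing. The function name, the `alpha` scaling factor, and the example numbers are all hypothetical.

```python
def steer_attention(weights, emphasized, alpha=0.1):
    """Downweight attention to non-emphasized tokens, then renormalize.

    weights:    attention probabilities for one query over all tokens
                (assumed non-negative and summing to 1).
    emphasized: set of token positions the user wants the model to focus on.
    alpha:      hypothetical scaling factor in (0, 1] for every other token.
    """
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in (0, 1]")
    # Keep emphasized tokens as-is and shrink everything else.
    scaled = [w if i in emphasized else alpha * w
              for i, w in enumerate(weights)]
    total = sum(scaled)
    # Renormalize so the row is a probability distribution again.
    return [s / total for s in scaled]


# Toy example: one attention head looking at four tokens with equal weight.
weights = [0.25, 0.25, 0.25, 0.25]
steered = steer_attention(weights, emphasized={2}, alpha=0.1)
print([round(w, 3) for w in steered])
```

In this toy run the emphasized token ends up with most of the attention mass and no model weights are touched, which is the whole appeal: the adjustment happens at inference time, on a handful of heads, rather than in a retraining job.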

That's a wrap! More news next week.