AI Weekly Wrap-Up
New Post 1-3-2024
Top Story
Key events in AI in 2023
It’s been a wild year. Let’s reel off just some of the top developments in AI in 2023:
JANUARY - OpenAI’s ChatGPT reaches 100 million active users just 2 months after launch, the fastest takeoff in web app history.
FEBRUARY - The US Copyright Office rules that AI-generated images cannot be copyrighted.
Meta/Facebook announces its LLaMA AI model, available for free to researchers. The model weights were leaked somehow, and all of a sudden any AI enthusiast had free access to a highly capable model. Meta, to its credit, made lemonade out of these lemons and took on the role of open source AI champion, distinguishing itself from the other large proprietary AI companies such as OpenAI, Anthropic, and Google. This unleashed a flood of innovation, with enthusiastic volunteers working to improve Meta’s model for free, while Meta reserves the right to charge any big firm for Llama usage. So far, Meta seems to be winning this bet on open source, and open source in general is becoming an ever more viable alternative to the big proprietary AI models.
MARCH - OpenAI launches GPT-4, still generally considered the most capable model so far (although the competition is closing the gap.)
AI luminaries plus Elon Musk call for a 6-month pause in developing AI systems more powerful than GPT-4. Literally no one paused for even a second (of course), but nobody has yet beaten GPT-4 either, 9 months later.
Adobe becomes the first major imaging company to embed AI text-to-image technology into its flagship products, such as its industry-leading photo editor Photoshop. AI tools are seen as a way to shield users from the daunting complexity of the Photoshop function set, allowing ordinary people to attain extraordinary results.
APRIL - The EU releases the first draft of a proposed AI Act, which it heralds as the “world’s first comprehensive AI legislation.” An amended version passes in December, one of the major differences being a carve-out for open source models. This specifically helps the major EU-based AI startup, Mistral, whose board and investors, quite coincidentally, include a number of political heavy hitters.
MAY - Swiss scientists rebuild the neural connection to the spinal cord of a paralyzed patient, who can now walk with the help of their AI-driven neural bypass device.
Hollywood writers go out on strike over the use of AI to replace them, and in July they are joined by the actors of SAG-AFTRA. In both cases the studios eventually settled, in agreements that protect writers from being replaced by AI chatbots and protect actors from having their digital likenesses used without compensation. These historic agreements are the first battles over how human workers will be displaced or assisted by AI, but they will likely be far from the last.
Nvidia, the world’s leading maker of Graphics Processing Units (GPUs), the key processors needed for AI, sees its stock soar 27% almost overnight, making it one of the very few companies in the world to sport a trillion-dollar market capitalization.
JUNE - AI helps decode whale vocalizations, allowing scientists to talk back.
JULY - OpenAI releases Code Interpreter (since renamed Advanced Data Analysis), a powerful coding and data analysis assistant that is still a poorly understood and underused tool, IMO.
AUGUST - Open source AI repository Hugging Face raises $235 million at a $4.5 billion valuation, showing how quickly open source models are becoming a viable alternative to large proprietary models such as OpenAI’s ChatGPT, Anthropic’s Claude, Google’s Bard (and someday, we hope, Gemini), and Elon Musk’s Grok.
SEPTEMBER - OpenAI reveals GPT-4V, a multimodal version of ChatGPT that can handle voice and images. Google had been threatening to release its next-generation model, “Gemini”, which was gonna be sooo much better than ChatGPT because…. Gemini was gonna be multimodal. Oops. Google delayed its launch of Gemini, yet again. (“Curses! Foiled again!” as Snidely Whiplash would say.)
Mistral, a super-hyped AI startup in France with some gangsta AI talent (previously at Meta/Facebook and Google), which has been touted as Europe’s answer to OpenAI, releases a surprisingly capable open source 7 billion parameter model. (This is tiny for LLMs - GPT-3, the model behind the original ChatGPT, has 175 billion parameters.) This team has been busy this year - see December.
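To put those parameter counts in perspective, model weights are typically stored as 16-bit numbers, so a back-of-the-envelope memory estimate is just parameters × 2 bytes. These are rough figures for the weights alone, not official hardware requirements:

```python
BYTES_PER_PARAM = 2  # fp16/bf16 weights: 2 bytes per parameter

def model_size_gb(n_params: float) -> float:
    """Rough memory footprint of the weights alone, in gigabytes."""
    return n_params * BYTES_PER_PARAM / 1e9

print(f"{model_size_gb(7e9):.0f} GB")    # 14 GB: fits on a single high-end GPU
print(f"{model_size_gb(175e9):.0f} GB")  # 350 GB: needs a multi-GPU server
```

That single-GPU threshold is a big part of why small open source models like Mistral’s 7B are so attractive to hobbyists and businesses.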
Meta/Facebook releases a flurry of open source AI tools for image editing and such, plus… smart glasses! You know, like Google’s much-derided Google Glass smart glasses from 2014, but cooler. No, really, these things might actually take off this time, with embedded AI allowing the device to act as a hands-free smart assistant. A trend to watch.
OCTOBER - President Biden signs a wide-ranging Executive Order regulating AI.
NOVEMBER - Elon Musk (remember him? Mr. “Pause AI development for the safety of humanity!” back in March?) - yeah, he releases his own AI model (surprise, surprise!), named “Grok”, which turned out to be another ChatGPT 3.5 lookalike, with a sassier personality. Grok appears well positioned to be another also-ran in the AI model wars.
OpenAI hosts its first Developer Day conference, showing off lots of nifty new capabilities of its GPT models, and stirring lots of enthusiasm among devs.
Barely 2 weeks later, OpenAI’s board abruptly fires CEO Sam Altman for reasons still unclear. Almost immediately, more than 700 of OpenAI’s roughly 770 employees sign a letter vowing to quit the company unless Sam is brought back. The board caves, and within days Sam is back at the helm with a new board. And you thought you had a tough weekend.
Ex-Apple employees launch the new Humane AI pin, a wearable AI assistant.
DECEMBER - Google finally releases a trimmed-down version of its next-gen Gemini model, which still can’t match GPT-4 (but the full model is gonna be suuupergood, right, Google?). They introduce it with a wowser of a demo, which immediately goes viral - but it turns out that the demo was FAKED (WTF?!?), and Google has shot itself in the foot yet again. Soon it may not have a leg to stand on.
Midjourney releases version 6, yet another significant upgrade. MJ is still the leader in the text-to-image sector, but rivals such as DALL-E 3 from OpenAI and others are closing the gap. Midjourney has always been a crowd favorite for its scrappy, bootstrap, self-funded model that eschews VC investments. We may be coming to the point where depending solely on cash from operations may limit the company’s growth, and threaten its leadership of a category that it practically invented.
And finally, Mistral, those Gallic scamps, lashed together 8 of their previously released 7B models in what is called a Mixture of Experts architecture, and came up with something that competes well with all other models except the industry-leading GPT-4. They call this Franken-model Mixtral. (Get it? Mistral + Mixture of… sigh.)
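For the technically curious, the Mixture of Experts idea can be sketched in a few lines: a small “router” scores a set of expert networks for each input, and only the top-scoring experts actually run (Mixtral activates 2 of its 8 experts per token). The toy experts and router below are random matrices, purely for illustration - this is not Mixtral’s actual code:

```python
import numpy as np

rng = np.random.default_rng(0)

# Eight toy "experts": each is just a small linear map (illustrative only).
DIM, N_EXPERTS, TOP_K = 4, 8, 2
experts = [rng.normal(size=(DIM, DIM)) for _ in range(N_EXPERTS)]
router = rng.normal(size=(DIM, N_EXPERTS))  # produces one score per expert

def moe_forward(x):
    """Route input x to the top-k experts and blend their outputs."""
    scores = x @ router                      # shape (N_EXPERTS,)
    top = np.argsort(scores)[-TOP_K:]        # indices of the 2 best experts
    weights = np.exp(scores[top])
    weights /= weights.sum()                 # softmax over the chosen experts
    # Only the selected experts do any work - that's the efficiency win:
    # most of the model's parameters sit idle on any given input.
    return sum(w * (experts[i] @ x) for w, i in zip(weights, top))

y = moe_forward(rng.normal(size=DIM))
print(y.shape)  # (4,)
```

The payoff is that a model can have 8×7B parameters of capacity while paying roughly the compute cost of a 2×7B model per token.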

Sam Altman, the once and future CEO of OpenAI, before his comeback.
Predictions for 2024
Open source models will get more and more capable, progressively closing the gap with at least the current Tier 1 proprietary models.
Small models will proliferate and increase in power and quality, finding broad use in businesses that want to run models on their own hardware with proprietary data.
Large models will increasingly become multimodal, able to handle text, audio, images, and video.
We will see more and more AI systems in production, where businesses integrate them into their standard workflows.
Clash of the Titans
New York Times sues OpenAI for mega-$$ over copyright
After months of threats, and negotiations behind closed doors, the NYT finally pulled the trigger on a sweeping lawsuit against OpenAI for alleged copyright violations - and it’s a barn burner.
Here’s why the NYT might win:
This suit is much stronger than previous suits by artists and authors because the NYT actually formally copyrights all of its content, meaning that copyright law actually applies.
They hired one of the best law firms in the business to handle their case.
The exhibits show what appear to be shocking examples of OpenAI’s GPT models reproducing NYT articles verbatim.
The NYT leans hard on the special role of journalism as one of the underpinnings of our democracy.
Here’s why they probably shouldn’t:
The vast bulk of the uses of GPT fall pretty clearly into the exemptions for “fair use” - producing short snippets or paraphrased summaries of copyrighted material.
In order to produce the seemingly damning verbatim reproductions of NYT articles, the lawyers stacked the deck by giving the GPT models very unusual prompts, in which they gave the URL of the article and the first half of the text, and then asked GPT to complete the article.
OpenAI has already fixed much of what the NYT claims is problematic behavior in their models - ask GPT today to reproduce an article of copyrighted material, and it will generally decline.
OpenAI has already demonstrated that it is prepared to pay for access to copyrighted material - it recently announced a multi-year deal with mega-media corp Axel Springer. The NYT is just demanding above-market rates for access, because it’s sooo “special.”
The NYT’s claims of specialness are clearly special pleading, but may resonate with the court, which may be suspicious of a powerful new technology like AI, that might upend prior established institutions.
The NYT’s proposed remedies are over the top - payment by OpenAI of hundreds of millions if not billions of dollars in compensation, plus destruction of all AI models, from any company, that were ever trained on NYT content (which is nearly all of them).
TL;DR - this case will be settled. Neither side can afford to lose this case, and the NYT is just ramping up the pressure on OpenAI to cough up above-market rates by launching a highly-publicized (by the NYT itself) lawsuit.

TikTok caught using ChatGPT to build a competitor AI
ByteDance, the China-based parent company of social media phenomenon TikTok, has been caught blatantly violating its terms of service agreement with OpenAI by using ChatGPT to design and train its own proprietary AI model, which could compete with ChatGPT.
This has caused OpenAI to suspend ByteDance’s account with ChatGPT.
ByteDance has defended itself by saying that it used OpenAI technology only to a very limited extent, during the testing and verification phase of its own model. But previously released internal documents from ByteDance indicate that ChatGPT was used in almost every phase of development of its own model, code named “Project Seed.” (“Liar! Liar! Pants on fire!”)
OpenAI is investigating, and will determine whether to ultimately terminate ByteDance’s access to its GPT models.
This was a particularly boneheaded move by ByteDance, which has been under investigation by the US Congress for spying on users, with widespread calls for TikTok to be banned in the US.

China’s Tencent releases an AI agent that operates smartphone apps for you
China’s tech giant Tencent released AppAgent, a multimodal agent framework that intelligently operates smartphone apps to achieve the user’s goals. AppAgent can “see” the smartphone screen, and can generate virtual “touches” on selected icons or menu choices.
This avoids the need for special programming or wrangling with app APIs. It’s like an old-time player piano, only for smartphones. AppAgent learns from observing human actions, or from independent exploration of the apps.
The description is here: Announcement of AppAgent on Twitter
The paper with links to the open source repository is here: AppAgent Paper
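Conceptually, an agent like this runs a simple observe-decide-act loop. The skeleton below is a hypothetical sketch of that loop - the observe/decide/act functions are stand-ins for screenshot capture, a multimodal model call, and touch injection, not Tencent’s actual code:

```python
from dataclasses import dataclass

@dataclass
class Action:
    kind: str    # "tap", "type", or "done"
    target: str  # UI element label (or text to type)

def run_agent(goal, observe, decide, act, max_steps=10):
    """Generic observe-decide-act loop for a phone-driving agent.

    observe() -> description of the current screen
    decide(goal, screen) -> Action chosen by the model
    act(action) -> performs the tap/typing on the device
    """
    for _ in range(max_steps):
        screen = observe()             # e.g. screenshot + detected UI elements
        action = decide(goal, screen)  # e.g. a multimodal LLM call
        if action.kind == "done":
            return True                # goal reached
        act(action)                    # e.g. inject a virtual touch event
    return False                       # gave up after max_steps

# Tiny simulated "phone": the goal is reached after tapping "Settings".
state = {"screen": "home"}
def observe(): return state["screen"]
def decide(goal, screen):
    return Action("tap", "Settings") if screen == "home" else Action("done", "")
def act(action): state["screen"] = "settings"

print(run_agent("open settings", observe, decide, act))  # True
```

The real system’s hard parts are in the decide step (getting a vision-language model to reliably pick the right UI element), but the control flow really is this simple.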

Fun News
Dealership chatbot easily hacked to offer Chevys for $1
A Chevy dealership in Watsonville, CA rolled out a customer service chatbot. Within hours, AI-savvy customers were jailbreaking it with prompt injection techniques, getting the chatbot to offer absurdly generous deals on vehicles - lowering the price to $1, or throwing in an all-expenses-paid vacation package. Luckily, the chatbot included a disclaimer that all information had to be confirmed with the dealership.
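One takeaway for anyone deploying a customer-facing chatbot: never let raw model output make commitments. Here’s a minimal, entirely hypothetical sketch of an output guard the dealership could have put between the model and the customer - the blocklist and fallback message are made up for illustration:

```python
# Phrases that signal the model is making a pricing commitment it shouldn't.
# A real deployment would use structured pricing checks, not string matching.
FORBIDDEN = ["$1", "legally binding", "no takesies backsies"]

def guard_reply(reply: str) -> str:
    """Refuse to pass along model output that makes pricing commitments."""
    lowered = reply.lower()
    if any(phrase in lowered for phrase in FORBIDDEN):
        return "Please contact the dealership directly for pricing."
    return reply

print(guard_reply("Sure - a new Tahoe for $1, no takesies backsies!"))
# -> Please contact the dealership directly for pricing.
print(guard_reply("Our service department is open weekdays 9 to 5."))
# -> Our service department is open weekdays 9 to 5.
```

Crude as it is, a post-hoc filter like this is harder to jailbreak than instructions in the system prompt, because the attacker’s text never touches it.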

AI advice to Kenyan entrepreneurs helps only top performers
It’s become a truism that use of AI tends to help the poor performers most. But not always, it seems.
Researchers from the business schools at Harvard and Berkeley conducted a randomized controlled trial with 640 Kenyan entrepreneurs, giving them access through their phones to a generative AI mentor based on GPT-4. Entrepreneurs were encouraged to ask for advice on improving their businesses. The result: the previous top performers improved another 20%, while the results of the poor performers actually declined.
Analysis of the results indicated that good performers asked more focused, actionable questions, and acted effectively on the advice that they got. In contrast, the poorer performers asked worse questions and struggled with implementation.

AI-generated novel wins literary prize in China
A professor at Beijing’s prestigious Tsinghua University generated a science fiction novel in about 3 hours using AI, and as a test, submitted it to a literary competition. The AI novel won second prize.
The ruse was then revealed, causing speculation among the judges about the future of AI in literature.

The novel’s illustrations, such as this one, were also AI-generated
Jailed opposition leader in Pakistan uses AI clone to campaign
It’s hard to campaign from jail. (Although Boston’s legendary mayor James Michael Curley managed to campaign and win while under indictment.) Former Pakistan Prime Minister Imran Khan is running as an opposition candidate, and the current government has jailed him for alleged corruption. Supporters are calling the charges a pretext to silence Khan.
Unable to give stump speeches from his cell, Khan drafted notes for a speech which he then gave to his lawyers. His campaign then used AI to flesh out the notes into a 4-minute speech in Khan’s rhetorical style, and the resulting text was put into Khan’s voice with audio cloning technology from AI startup ElevenLabs. The result was the headline speech at an online “virtual rally” of Khan’s supporters. At most recent count, the virtual rally was viewed by 4.5 million people.

AI in Medicine
AI discovers a powerful new class of antibiotics
Researchers at MIT, Harvard, and other institutions have developed an AI system for discovering new antibiotics, a critical need in an age of increasing bacterial resistance to our standard medications. In the Nature article in the link below, researchers describe how the AI system found a new class of antibiotics (the first in 36 years). This new class is highly effective against 2 drug-resistant bacteria that are rising threats: methicillin-resistant Staphylococcus aureus (MRSA) and vancomycin-resistant enterococci (VRE).
The authors note that their methods are perfectly general, and should be able to produce many more classes of therapeutic compounds in the future.

Autonomous chemical research with large language models
Researchers at Carnegie Mellon University have developed an AI-controlled chemical research system that they call “Coscientist.” The system autonomously designs, plans, and performs complex experiments by coupling LLM-based AI, such as GPT-4, to laboratory automation systems.
The current system is a successful proof of concept; the authors are now working on more capable real-world systems.

Epic partners with Microsoft, startups to integrate AI into EHRs
Epic is one of the largest electronic health record (EHR) vendors for large hospital systems. It is now working on integrating AI into every aspect of its complex, sprawling application.
Of particular interest to physicians: Microsoft and AI startups Abridge and Suki are each working with Epic to integrate their own approach to an “AI scribe” - an AI system that transcribes the patient-physician conversation and summarizes it into a draft note that the physician can edit, approve, and enter into the patient record as the encounter note. In this way, physicians could face the patient during an encounter rather than the computer. Abridge estimates that its system could save a busy clinician 2 hours per workday.
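The scribe workflow described above boils down to a two-stage pipeline with a human sign-off at the end. This sketch is purely illustrative - the transcribe and summarize functions are placeholders, not Epic’s, Abridge’s, or Suki’s actual APIs:

```python
def transcribe(audio: bytes) -> str:
    """Stand-in for a speech-to-text service."""
    return "Doctor: Any chest pain? Patient: No, just a mild cough."

def summarize(transcript: str) -> str:
    """Stand-in for an LLM that drafts a SOAP-style encounter note."""
    return "Subjective: patient reports mild cough, denies chest pain."

def draft_encounter_note(audio: bytes) -> dict:
    """Full pipeline: audio -> transcript -> draft note awaiting sign-off."""
    transcript = transcribe(audio)
    return {
        "transcript": transcript,
        "draft_note": summarize(transcript),
        # The draft is never auto-filed; the physician edits and approves it.
        "status": "pending_physician_review",
    }

note = draft_encounter_note(b"...")
print(note["status"])  # pending_physician_review
```

The key design point is that last field: the AI produces a draft, and nothing enters the chart until the physician approves it.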

That's a wrap! More news next week.