New Post 2-5-25
Top Story
OpenAI’s latest AI model is free to all users
In what appears to be a direct clap-back at the massive hype China’s DeepSeek AI model has been garnering, OpenAI released its latest and most advanced model, o3 (don’t ask, they truly stink at naming). o3 pushes the envelope on “reasoning” even further than its predecessor, o1. (There is no o2. Deal with it.) “Reasoning” models cause the AI to slow down in answering a question, proceeding step by step and checking its work. Surprisingly, this apparently simple tweak improves the accuracy and sophistication of the model’s answers significantly, and o3 is scaling new heights on benchmarks based on math, coding, and science. o3 can also connect to the web through search, to get the most up-to-date answers. Best of all, o3 is available free to all users of ChatGPT - just check the box for “Reason.”

OpenAI releases its new o3 “reasoning” model free to all users.
Clash of the Titans
Released a mere 2 weeks ago, China’s highly efficient DeepSeek AI model became an instant sensation in tech circles, where observers were wowed by its clever engineering. Reactions in Western government circles were also swift - and decidedly negative. DeepSeek was officially banned in Italy and in Texas, those two bastions of Western culture (get it?), over concerns about data privacy. Belgium and Ireland have initiated investigations. Several US federal agencies, including the Navy and NASA, have advised employees not to use DeepSeek. And there is even a bill in Congress that would make downloading DeepSeek, or any Chinese AI, a federal crime punishable by 20 years in prison. Touched a nerve, didn’t they?
Stepping back, it’s clear that more is at work here than the very real data security concerns (the Chinese did, after all, recently penetrate the US phone system so thoroughly that the FBI reversed decades of policy and began encouraging phone users to adopt apps with end-to-end encryption). It is also simply fashionable to be a China hawk these days, both on national security grounds and on economic grounds, since AI is likely to produce gushers of economic growth.

When asking AI to help pick a birthday gift for your wife can lead to 20 years in the slammer.
Figure projects making 100,000 humanoid robots over the next 4 years
Humanoid robot startup Figure has just announced a second major customer, after BMW. Figure’s robots are already on the assembly line at BMW’s plant in Spartanburg, South Carolina. Although it is declining to name the new customer yet, Figure now publicly projects that it will manufacture and sell over 100,000 humanoid robots in the next 4 years. Competition in the humanoid robot space is heating up, with several Chinese robotics companies producing impressive models at surprisingly low prices.

Figure’s Figure 02 humanoid robot is already assembling BMWs in Spartanburg, SC.
Fun News
Apple Watch saves skier’s life after 1,000-foot fall, alerting rescuers
AI is weaving its way into everyday life, sometimes in surprising ways. In this case, a skier’s life was saved after a 1,000-foot fall that broke his leg and left him unable to seek help. Luckily, his Apple Watch detected the fall with its sensors and on-device AI algorithms, then sent emergency alerts to local rescuers, who were able to find the victim and get him to safety and much-needed medical care.

Video from the rescue helicopter shows the downed skier’s friend waving them in.
Harvard students learn twice as much in less time with AI tutors
A randomized controlled trial at Harvard compared students’ mastery of physics after active-learning lectures versus an AI tutor. The AI tutor won hands down. Students assigned to the AI tutor learned twice as much material as the active-learning group while spending approximately 18% less time on the tasks. In addition, the AI-tutored students felt significantly more engaged and more motivated than the active-learning group.

Harvard students learned twice as much physics with an AI tutor.
Agents are the new AI obsession
Hard on the heels of the shift in AI to “reasoning” models (see our Top Story above), AI companies are pursuing the dream of AI Agents. Agents are AI applications that can actually do things for you, like reserve a table at a restaurant, or order groceries through Instacart, all without close human supervision. The current examples of agents - Operator from OpenAI, Ask for Me from Google, Computer Use from Anthropic, and Proxy from European AI startup Convergence - are all pretty rudimentary, more proof of concept or prototype than a finished product. But momentum is building as reasoning models mature and connections to online services are built in. Likely there is an agent in your very near future.

The Convergence team projects tech wizardry with their questionable wardrobe choices.
Deep Research from OpenAI. And from Google.
The new AI reasoning models are good at computer coding and at solving tricky math problems, and they are a necessary foundation for AI agents (see above). They are also awesome at producing complex, in-depth analyses, such as a 15-page report on the potential market for high-quality augmented reality glasses costing less than $100. OpenAI and Google have each released a product that produces such reports, and both decided to name it Deep Research. Go figure. Both products are claimed to produce PhD-level results. For Google’s, you’ll have to fork over $20/month for the Gemini Advanced plan, which is similar to OpenAI’s Plus plan, also $20/month. However, to access OpenAI’s Deep Research, you have to pay $200/month for the Pro plan, although OpenAI promises to bring it to Plus users soon.

Popular Science’s view of how Google’s Deep Research works.
Robotics
DARPA test shows that 1 human can control a swarm of 100 robots
The Defense Advanced Research Projects Agency (DARPA) - the agency that sponsored the research that led to the internet - has lately been funding projects in robotic warfare. Recently DARPA commissioned a test of the ability of a single individual to manage swarms of heterogeneous robots - aerial drones as well as land vehicles - in a simulated combat environment. The test was a wild success: it showed that a single well-trained “swarm commander” could successfully manage swarms of up to 100 heterogeneous robots to accomplish complex, time-sensitive missions.

A single “swarm commander” successfully controlled up to 100 robots at a time.
Firefighting robot tackles blazes in Kent, England
The Kent Fire and Rescue Service is trying out a mobile robot that can enter blazing buildings, climb stairs, do reconnaissance with its cameras, and even pump 2,000 liters of water onto a fire when hooked up to a hose, all while being operated remotely by firefighters from a safe 600 meters away. A spokesman for the KFRS said the main motivation for trialing the robot was to protect firefighters in the riskiest blazes.

Modeled on bomb-disposal robots, the Kent firefighting bot keeps firefighters safe in risky blazes.
AI in Medicine
AI helps Swedish physicians detect 24% more breast cancers
Arguments rage over whether AI will replace physicians or enhance their performance. This study is a ringing endorsement of the second view. Over 100,000 women in southwest Sweden were randomly assigned to have their mammograms read either by 2 physicians (the standard practice) or by one physician plus an AI specially trained to recognize breast cancer in mammogram images. The AI read each image first and classified it as low, medium, or high risk for containing cancer. Medium- and high-risk images were tagged with the areas the AI found suspicious, and each of these areas was given its own risk score estimating the likelihood that it contained cancer. The tagged image was then passed on to the human radiologist for the final reading. Under these conditions, the physicians found 24% more cancers, with no increase in the false-positive rate. This Human-in-the-Loop approach to AI in medicine is likely a fruitful direction for at least the short-to-medium term.

AI with Human-in-the-Loop detected 24% more cancers with no increase in false positives.
Australian AI detects lung diseases with 96+% accuracy
Researchers at several Australian universities have developed an AI model that can detect multiple different lung diseases from ultrasound videos, with 96.57% accuracy. This model is yet another example of Human-in-the-Loop architecture, since it is designed to be able to explain to the physician what it found and why it made the diagnoses that it did. This builds physician trust and confidence in the AI, and can be used as a teaching tool as well.

AI detects lung diseases in ultrasound videos.
That's a wrap! More news next week.