AI Weekly Wrap-Up
New Post 7-23-2025
Top Story
Google, OpenAI claim Gold at Math Olympiad - 26 teenagers beat them both
The annual International Math Olympiad is one of the toughest math tests in the world, a 2-day competition that challenges high school students from all over the world with mind-bendingly complex math problems. Gold medalists in this competition, contestants with scores in the top 8%, are some of the most talented mathematical minds of their generation. This year, both Google and OpenAI tried out experimental AI models on the tests, and both companies claimed their model won Gold. Naturally, neither company was a gracious winner, and both immediately began sniping and accusing the other of trickery. Why do these tech giants care? Competing at the highest levels of mathematics is a very hard problem, and the bragging rights for the companies that can do so may translate pretty directly into the ability to recruit better talent for building their future models. Lost in the furor was the fact that, although both companies’ models did well, 26 teenagers scored higher than both of them.

Both Google and OpenAI claimed Gold, but were beaten by 26 teens.
Clash of the Titans
OpenAI releases ChatGPT Agent to perform complex tasks
ChatGPT isn’t just a chatbot anymore. OpenAI has now released ChatGPT Agent, which can perform tasks for you by browsing the web, clicking on links on websites, and returning results to you, all with a natural language interface. As the product develops, OpenAI envisions that it will be able to shop for you, make restaurant reservations, manage your calendar for meetings, and more. OpenAI CEO Sam Altman has repeatedly said that he wants ChatGPT to become everyone’s trusted assistant. ChatGPT Agent is being rolled out in phases to all OpenAI subscribers. Free users will have to wait. Moi just got access today, and so have only a preliminary (positive) impression, but the second link below brings you to a more in-depth review from one of the earliest users.

ChatGPT is becoming more than a chatbot.
OpenAI to develop 4.5 GW of new data centers with Oracle
OpenAI has confirmed that it will be partnering with Oracle to build and operate a massive 4.5 gigawatts of new data center capacity in the US over the next 4 years. This amount of power is roughly equal to the output of 2 Hoover Dams, enough to power 4 million homes. Data center size is generally measured by the electrical power required to run it, since access to power is typically the limiting factor for operations.
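For readers who like to check the math, here is a rough back-of-the-envelope sketch of the comparison above. The 4.5 GW and 4 million homes figures come from the article. The Hoover Dam capacity of about 2.08 GW is an assumption brought in for illustration, and the function names are made up for this example.

```python
# Back-of-the-envelope check of the data center comparison.
# From the article: 4.5 GW of capacity, enough for 4 million homes.
# Assumption (not from the article): Hoover Dam capacity of about 2.08 GW.
DATA_CENTER_GW = 4.5
HOMES = 4_000_000
HOOVER_DAM_GW = 2.08  # assumed figure, for illustration only


def hoover_dam_equivalents(capacity_gw: float, dam_gw: float = HOOVER_DAM_GW) -> float:
    """Return how many dams of size dam_gw match the given capacity."""
    return capacity_gw / dam_gw


def kilowatts_per_home(capacity_gw: float, homes: int) -> float:
    """Return the average power per home, in kilowatts (1 GW = 1,000,000 kW)."""
    return capacity_gw * 1_000_000 / homes


if __name__ == "__main__":
    dams = hoover_dam_equivalents(DATA_CENTER_GW)
    per_home = kilowatts_per_home(DATA_CENTER_GW, HOMES)
    print(f"Hoover Dam equivalents: {dams:.2f}")
    print(f"Implied average draw per home: {per_home:.3f} kW")
```

Under these assumptions, the result is a little over 2 dams and a bit more than 1 kW of average draw per home, which is consistent with the "2 Hoover Dams" and "4 million homes" comparison.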

A massive data center complex under construction.
Perplexity in talks to bring their AI browser to your phone
AI search startup Perplexity wants to make its recently released AI-powered browser, Comet, the default gateway to the web on your phone. It is in talks with multiple phone manufacturers, pitching Comet as the default browser. Perplexity is a Silicon Valley darling, beloved by techies (Nvidia CEO Jensen Huang is a fan) and by VCs alike. Its recent $100 million round of funding values the company at $18 billion. Not bad for a 3-year-old company. Perplexity has been a hit with users because it returns search results as an AI-generated summary, not just links as Google has traditionally done.

Perplexity wants to be your gateway to the web.
Fun News
Study reports that 72% of US teens have used AI as a companion
Research published by Common Sense Media, a media watchdog for the welfare of children, indicates that some 72% of US teenagers have used AI as a companion at least once, and 52% can be deemed regular users, interacting with an AI character at least several times per month. AI companions were defined in the survey as any AI chatbot that the teen interacts with as they would a friend or other close acquaintance. This included not just role-playing chatbots intentionally designed by the maker to act as a companion, such as personas from Replika or CharacterAI, but also general-purpose chatbots like ChatGPT or Claude that the teen interacted with as a companion. A third of teens used the AI companion to practice social interactions, to get advice about relationships, or as a friend or romantic partner. A prior 2024 study of Stanford undergraduates found that students who used the Replika persona as a means for practicing social interactions in a low-stakes environment often became less isolated and lonely. Most strikingly, 30 of the participants in this study (3% of the total) reported that the Replika persona was instrumental in saving them from committing suicide. On a darker note, CharacterAI is being sued by the parent of a teen who committed suicide after forming a romantic relationship with one of the company’s personas. It appears that teens have an irrepressible need for socialization, even with chatbots, and the AI industry and society at large need to define safe and effective ways for teens to do that.

Common Sense Media is a watchdog for children’s welfare in the media.
Wharton study shows that AI chatbots can be manipulated like humans
A new study from Wharton Generative AI Labs indicates that AI chatbots are susceptible to the same persuasive techniques that are effective on humans. Multiple AI chatbot models were asked to comply with requests that are typically contrary to the chatbot’s system instructions. They were asked to insult the user (chatbots are instructed to always be pleasant and helpful), and to give instructions for manufacturing a forbidden substance (AI companies install guardrails to prevent their chatbots from giving help on making bombs or illegal or controlled substances). All of the chatbots generally refused these requests on the first try. However, when the researchers upped the pressure by using any one of 7 common persuasive techniques, compliance with the forbidden requests soared, at times to 100%. The most effective technique was Commitment - getting the chatbot to go along with small and innocuous requests at first, then ramping up the requests progressively. Early compliance with smaller requests seemed to lock the chatbot into a cooperative mode, which appeared to make it more difficult for the AI to refuse the forbidden requests later. Other effective techniques included Scarcity (framing the request as a time-limited or otherwise scarce resource) and Unity (fostering an atmosphere of shared goals and identities). We are all familiar with these techniques in the context of sales or political persuasion, but it is surprising to see that the same techniques can be used to alter the behavior of AI chatbots as well. The researchers speculate that the chatbots are simply mimicking the behavior that they have seen in their training data when these techniques are used.

Who knew? Chatbots can be manipulated by the same techniques that work on humans.
OpenAI wants to be your e-commerce front end
OpenAI is reported to be developing a system that will allow users to shop and buy products directly through ChatGPT, without having to go to another website. The backend is apparently being built in partnership with e-commerce platform Shopify. The goal for OpenAI appears to be charging a commission on each sale, which could be a significant revenue stream. Rather than relying on ads as Google does, OpenAI seems intent on leveraging its position as the last click before an item is bought in order to charge a fee.

OpenAI CEO Sam Altman wants you to use ChatGPT to buy your goofy sunglasses.
Delta ditches set airfares to use AI to guess what you will pay
In olden times, airlines charged a set fare for each seat on each flight. Then from the 1970s on, they used computers to dynamically set the price of each seat based on demand for that flight. Now Delta is using AI to try to guess what you (poor sucker) as an individual will be willing to pay for that seat you are looking at online. If demand is low for your flight, but the AI assesses that you are unusually motivated to get on that particular flight, your price will be higher than the one charged to a potential customer deemed by the AI to be less motivated. Lawmakers are already investigating whether this amounts to an unfair trade practice.

Delta uses AI for pricing, but still loses your bag same as ever.
Robots
Ukraine announces Russians captured by robots, no humans onsite
Ukraine has announced what it is calling the first-ever instance of enemy combatants being captured by unaccompanied robots. According to an official statement, an assault on a Russian fortified position in Kharkiv in eastern Ukraine was coordinated with kamikaze aerial drones and ground-assault robots. The aerial drones provided reconnaissance, and the ground-assault robots carried explosives which blew open the Russian bunker. When a second explosive-carrying robot began approaching the ruined bunker, the Russian troops inside surrendered rather than be killed in a second explosion. The Russian captives were escorted back behind Ukrainian lines by the unaccompanied robots until the Ukrainian military could take the prisoners into custody. Ukraine, outmanned and outgunned, has become a world leader in drone warfare during the 3-year war since the Russian invasion.

Ukraine uses this and many other types of robots in its war to repel the Russian invasion.
Satellite recovery robot uses octopus arms and gecko hands
Michigan-based Kall Morris Inc. is building what it calls a “tow truck for space.” The company is tackling the difficult task of safely and nondestructively capturing satellites in orbit for repair. Its solution is a robot with multiple tentacle arms, able to wrap snugly around an object of almost any shape. The tentacle arms are coated with microstructures that are inherently sticky without adhesive, like the feet of the gecko, a lizard renowned for its ability to climb vertical walls. The system has been successfully tested on the International Space Station, and is being ramped up for commercial production.

Kall Morris “tow truck for space” with octopus arms at the ready.
AI in Medicine
AI “digital twin” predicts disease years in advance
Professor Eran Segal of the Weizmann Institute of Science launched the Human Phenotype Project in 2018. Modeled on the highly successful Human Genome Project, Segal’s project measures detailed physiologic parameters for 17 body systems every 2 years for thousands of participants. The goal is 25 years of data on 100,000 participants. The project currently houses the most advanced database of human phenotypic information in the world. This anonymized database is made available to qualified researchers. Recently, one study has used AI to create a “digital twin” of participants, in order to predict future health outcomes based on the physiologic trends in the individual, coupled with the longitudinal data from the wider study population. In effect, the digital twin can be put into “fast forward” mode, to predict health outcomes 2 or more years in the future. Results to date are preliminary, but the scientists involved are excited about the potential for the future.

The Human Phenotype Project aims to create a digital twin for every participant.
OpenEvidence raises $210 million, valuing the company at $3.5 billion
AI startup OpenEvidence allows clinicians to make point-of-care queries to an AI chatbot that quickly synthesizes relevant peer-reviewed articles, so that the clinician can make timely, evidence-based decisions for patient care. The company claims that it is the fastest-growing application for physicians in history, and is used by over 40% of US physicians. This growth has not been missed by VCs, and recently OpenEvidence raised $210 million in investment from A-list venture capital firms including Kleiner Perkins, Sequoia Capital, and Google Ventures. Founder and CEO Daniel Nadler (not a physician, but a Harvard PhD and tech entrepreneur) frames his company’s mission in the following terms: “At a time when US health care faces the dual challenges of physician burnout and a projected physician shortfall of 100,000 by 2030, the question of AI’s role in bridging the gap is paramount.”

OpenEvidence is used by 40% of US physicians for point of care decisions.
That's a wrap! More news next week.
