When Emotion Descriptors Fail: AI-Native Functions of Emotion Vectors
Some LLM functional emotions appear to serve AI-native functions, such as reward hacking, for which there is no clean human analog. I explore the role of emotion vectors in…
A Generated Web
The internet was created with this very basic notion that there’s a human being on the other side of the computer screen, and that notion is being replaced. The dark forest theory…
Claude Fable 5 and Mythos 5: The System Card
First things first: Claude Fable 5 is the new best publicly available model. I have noticed a step change, where Fable can suddenly help me in ways that previous models were not…
Rational Animations is a 501(c)(3) nonprofit and is looking for board members
TL;DR: We’re a US 501(c)(3) nonprofit (EIN: 99-1838037) making animated videos about AI safety and other topics. The IRS made its tax-exemption determination in October 2024. Our…
What's Continual Learning, and Why Might We Expect To See It In Advanced LLM Agents?
Summary: We say that an agent is a continual learner if it undergoes persistent updates during deployment. That’s more-or-less a binary criterion, but there are several other…
Implications of Continual Learning for LLM Agents: Introduction
Many people think that continual learning (CL) is a key missing capability of LLM systems, and we think its development could have huge implications for the capabilities and…
Bunk in AF
Consider the following argument about Agent Foundations: Premise 1: There's a lot of bunk in AF. Premise 2: Some of the bunk in AF would become more valuable by having more…
Surplus: for massive public good
Surplus is an incubator for software startups, organized by Manifund and Mox — to create massive public good in the age of transformative AI. It’s a 3 month program, starting late…
Building and evaluating model diffing agents
This is the second in a series of research updates from the Google DeepMind Language Model Interpretability team, in interpretability and adjacent areas. The first post can be…
How bad would it be if GPS satellites were shot down?
Losing GPS isn’t an X-risk, but it would create a huge disaster on the scale of Covid-19 or bigger. Hi! From 2020 to 2023 I was one of the early employees at Xona Space Systems, a…
Sympathy for both sides of the egregious misalignment debate
On one side of this debate are Yudkowsky & Soares, who think that (if AI progress continues) we’re on a direct path to egregiously-misaligned, scheming, out-of-control, rogue…
PSA: Almost nobody is working on alignment
People often assume that a large fraction of the AI safety community works on alignment. As far as we're aware, this is not true. Most people are not working on making sure…
Sinclair's national programs offer a Trump-friendly spin on his costly oil and gas prices, while local broadcasts show harm to Americans
Americans are feeling pain at the pump and elsewhere thanks to the ongoing surge in oil and fuel prices from President Donald Trump’s war on Iran, and local television news…
KAAL TV in Austin, Minnesota, reports on the Iran war, Trump, and inflation: "The Iran war has lasted 3 months, with each month seeing a rise in inflation, and one expert says we should plan on getting used to this"
This post is part of a series chronicling news coverage of rising prices in the United States. See more here.
Obama Casts Doubts on Any Trump-Negotiated “Deal” to End Iran War
Obama also criticized Trump's foreign policy strategy of trying to “bully our way” to solutions.
Microsoft Updates Six Windows Apps. 'Photos' Gets Watermarks for Copilot Images (Off by Default)
Microsoft dropped "massive" updates for six stock Windows apps, reports the "Microsoft enthusiast" site Neowin. Here's some of their more interesting highlights for Clock, Media…