Thoughts on the Senate Framework for Mitigating Extreme AI Risks
My response to Senators Romney, Reed, Moran, and King's call for input
Last week Roll Call reported “A handful of lawmakers say they plan to press the issue of the threat to humans posed by generative artificial intelligence after a recent bipartisan Senate report largely sidestepped the matter.”1 Specifically, Senators Mitt Romney (R-Utah), Jack Reed (D-R.I.), Jerry Moran (R-Kan.), and Angus King (I-Maine), joined by a handful of representatives on the House side, have begun actively calling for Congress to take AI risk seriously.
Trying to chart a path forward, the legislators have published a “Framework For Mitigating Extreme AI Risks.” In brief, this framework proposes that developers of models that are trained with an “enormous amount of computing power” and are “broadly capable; general-purpose and able to complete a variety of downstream tasks; or are intended to be used for bioengineering, chemical engineering, cybersecurity, or nuclear development” must:
1. Implement so-called ‘know your customer’ requirements, that is, vet, know, and report customers, especially foreign persons.
2. Notify an “oversight entity” when developing a highly capable model while also incorporating certain safeguards and cybersecurity standards into development.
3. Go through evaluation and obtain a license prior to release, to try to prevent models from yielding “bioengineering, chemical engineering, cybersecurity, or nuclear development” risks. License access would be tiered according to perceived risk levels.
The framework also proposes the creation of a new agency, or the investiture of new powers in an existing agency or council, to implement these regulations.
Fundamentally, the senators fear AI might represent a uniquely dangerous technical inflection point. Their theory: while AI isn’t itself a weapon of mass destruction (WMD), its remarkable abilities might meaningfully decrease the burden of creating or operationalizing a WMD or cyber attack.
At a high level, I worry that action along these lines risks jumping the regulatory gun, yielding questionable or even negative benefits.
Uniquely, the senators have opened this framework to public comment and critique. Below are my submitted thoughts on the nature of these stated risks, along with proposals for an alternative policy path forward.
Prioritizing Innovation
There is no question AI will be one of the most important developments in our lifetimes. While ChatGPT and image generation have ‘wowed,’ the game-changing breakthroughs AI is unleashing across other, more impactful domains often go unrecognized. Here are just a few examples:
1. In healthcare, AlphaFold has effectively solved the protein folding problem, and AI-driven drug discovery has produced the first AI-generated drug to reach FDA Phase II clinical trials. Notably, this drug targets Idiopathic Pulmonary Fibrosis – a terminal condition that impacts five million people globally. With just one AI output, five million lives could be saved. Remarkably, this drug was discovered in 3 years, compared to the industry average of 6-10. Just-in-time biomedicine may be on the horizon.
2. In energy, advanced weather prediction has enabled precise wind energy supply predictions 36 hours in advance. The result: a 20% wind energy profitability jump, lowering both carbon emissions and energy prices.
3. In materials science, AI is enabling broad-scale, rapid materials discovery. Already, DeepMind’s GNoME has discovered more than 380,000 new stable materials, and this broad approach is likely to grow more targeted over time.
This isn’t “promise”; this is the here and now. As Congress considers first-step AI regulations and frameworks, the opportunity cost for innovation must lead decisions. Safety is essential, but if we discourage technologies like automated drug discovery, literal lives could be lost.
The Challenge of Uncertainty
To preserve both innovation and safety, Congress’ principal task will be contending with uncertainty. In the past two years technical change has been non-stop, challenging assumptions and undermining well-laid plans. The risk of legislative failure is uniquely high. Having read the proposed framework, I worry such an approach doesn’t adequately work around current uncertainty. Specifically, the framework underestimates uncertainty in the following areas:
Identifying the Correct Risks
Because AI systems are digital systems, cyber risk is a guarantee. Other risks, including Chemical, Biological, Radiological, and Nuclear (CBRN) risk, are deeply uncertain, however. At this juncture, verifiable evidence that the stated CBRN risks are valid, pressing, or in need of regulation is missing.
While it is often claimed that AI could meaningfully decrease the knowledge burden required for CBRN attacks, this hypothesis is largely unproven. In fact, existing evidence often suggests the opposite. Today it is common knowledge that combining ammonia and bleach yields deadly results. Still, the unique combination of cheap over-the-counter material access and low information burdens has not yielded a wave of chemical attacks. Easy knowledge access is no guarantee of risk. CBRN attacks are hard, and physical constraints matter deeply. No matter how capable a system, CBRN success still requires mechanical skill, often-regulated raw materials, dispersal mechanisms, evasion of authorities, and intent. There is a reason the U.S. has not experienced a successful CBRN attack since the 2001 anthrax attacks.
Also uncertain is the exact nature or range of risks. At the moment of capable AI’s birth, we cannot anticipate with certainty what risks might emerge nor the severity of each risk. If CBRN risk turns out to be a regulatory red herring, we hazard the following:
Overregulation. While sometimes necessary, licensure and regulatory requirements create frictions for innovation and competition. If actors need pre-approval for development, training runs, or deployment, costs will rise, research will slow, and smaller actors will fail. Overregulation also slows safety mitigation. When cyber insecurities are discovered, for instance, the suggested approval process could delay the training runs needed to safeguard systems.
Prioritization. AI’s impact on cyber risk is already a well-evidenced concern, and one worthy of government’s limited time, attention, and resources. If this valid risk, or any others that emerge, must compete with hypothesized CBRN risks, both state and corporate safety resources and attention will be spread thin. We only have so much capacity for action and need to focus limited resources on serious, evidenced challenges.
Identifying Correct Risk Proxy Variables
Today we do not know what proxy variables can reliably identify high-risk systems. The cited 10^26 operations threshold is indeed massive and almost certainly correlates with today’s most capable systems. This figure, however, is rooted in exceedingly recent technical assumptions called “scaling laws”[1] that appeared only in the past four years. Such recency should give pause. If enduring, effective regulation is the goal, recent technical observations offer poor regulatory footing. What seems like a worthy AI risk proxy variable today might not work tomorrow.
Already, the validity of this proxy is questionable. In April 2024, Meta released Llama 3, a language model in the capability class of GPT-4, Claude 3, and Gemini. What distinguishes Llama 3 is scale. While GPT-4 is thought to run on roughly one trillion parameters (a variable tightly correlated with training operations), Llama 3 wields a lean, mean 70 billion, roughly fourteen times smaller. Already the suggested correlation between operational scale and capability is in question. Looking to the future, market forces are likely to further degrade the usefulness of this proxy. When it comes to profit, less is more. Computation is expensive, and industry is racing to minimize training run operations and model size. While using operations as a proxy for capability risk makes sense at present, this assumption is actively breaking down and may fail in as little as a year’s time.
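To make the proxy problem concrete, here is a rough back-of-the-envelope sketch in Python using the commonly cited scaling-law approximation that training compute is roughly 6 × parameters × training tokens. The parameter and token figures below are public numbers or widely reported estimates, used purely for illustration, and the comparison point is the framework’s 10^26 operations threshold.

```python
# Back-of-the-envelope training-compute estimates using the common
# scaling-law approximation: training FLOPs ~= 6 * parameters * tokens.
# Parameter/token figures are public numbers or widely reported
# estimates, used for illustration only.

THRESHOLD = 1e26  # operations threshold cited in the framework

models = {
    # name: (parameters, training tokens)
    "Llama 3 70B (reported ~15T training tokens)": (70e9, 15e12),
    "Rumored ~1T-parameter frontier model (~13T tokens)": (1e12, 13e12),
}

for name, (params, tokens) in models.items():
    flops = 6 * params * tokens
    status = "above" if flops >= THRESHOLD else "below"
    print(f"{name}: ~{flops:.1e} operations ({status} the 1e26 threshold)")
```

By this crude math, models with wildly different parameter counts, and broadly comparable capability, can both sit below the threshold, and a developer can trade parameters against data to push compute in either direction. The proxy simply does not pin down capability or risk.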
Any alternative will unfortunately face similar forward uncertainty. Not only is AI tech transforming, but so is our understanding of the root capabilities needed to yield each stated risk. Without examples, we cannot validate what proxies might correlate with what risks. Failure is probable.
Building Around Uncertainty
Safety is indeed important, but amid such uncertainty it is unclear if action of this type will help. As an alternative, I propose Congress lean into and build around the assumption of present unpredictability, focusing on four things: setting clear priorities, building the state capacity and agility to respond to technical change and emergent risk, investing in AI measurement, and funding innovation prize challenges.
Set Priorities
Rather than hypothesize each and every risk that could emerge, we should instead focus on prioritizing which targets to safeguard regardless of uncertain risks. Critical infrastructure reform is the natural starting point. Today our critical infrastructure designation is far too broad and, according to CISA’s estimates, covers well over 50% of the U.S. economy. Found within are many “optionals,” including casinos, ride share vehicles, zoos, car paint manufacturers, and the cosmetics industry. To ensure the most important assets remain secure, Congress should consider a more select grouping. For AI safety, a narrow aperture will yield immense benefit. By focusing efforts, the U.S. can concentrate its threat surveillance dollars on a few important bets. This will increase the likelihood that any emergent AI risk is spotted. When unexpected threats do emerge, prioritization will enable quick, decisive action. At the time of crisis, the U.S. must be set up for immediate response, rather than waste time analyzing what is truly worth safeguarding.
Invest in State Capacity and Agility
The second step is to build the state capacity needed to rapidly identify, react to, and solve unexpected risks. The first piece of this puzzle is talent. AI is a general-purpose technology and will yield a wide range of threats. Robustly monitoring and responding to AI’s general-purpose impact will therefore require considerable analytic hands on deck. Unfortunately, offices including NIST and CISA are grossly understaffed, handicapping our ability to understand and respond to AI challenges with ease. NIST’s National Vulnerability Database (NVD), for instance, is experiencing backlogs due to low staffing. If AI yields a greater volume of cyber threats, this already underwater cyber infrastructure will be unable to manage.
The second piece of this puzzle is deregulation. If agencies are overburdened with rules that bind quick action and block their ability to develop and adopt the tools we need for the AI age, no amount of capacity or talent will enable success. Rules should be analyzed pre-emptively to identify processes that hold back action so that if the unexpected emerges, bureaucracy will not burden impact. During the COVID pandemic, the confusing, often irreconcilable regulations that bound California’s unemployment insurance bureaucracy made impossible the quick action needed to get assistance to millions in a pinch. Rather than help, excessive rules denied agility and produced billions in waste and months of backlogs. To prevent a similar mess in the face of AI risk, focus should fall on cleaning up the rules so our agencies can respond with rapid, effective force when needed.
Invest in Measurement
The uncertainties of AI have led to a long list of hypotheticals. To ensure issues are spotted early and validated by data, and that subsequent action is targeted appropriately, Congress should invest in data collection and issue monitoring efforts. Congress should consider additional funding and authorizations for the Bureau of Labor Statistics, the Census Bureau, and NIST’s new AI Safety Institute to conduct ad hoc voluntary issue surveys. With better data, issues can be grounded, and responses timed appropriately. Without such data collection, however, real challenges could go unnoticed until it’s too late.
Fund Grand Challenges
When issues are identified, rules, regulations and effort are often not enough. Tools and engineering are key. Congress should fund a series of ad hoc innovation prize challenges, often called “Grand Challenges,” and give NIST or other agencies the authority to implement such challenges without paperwork, process, or approval. When emergent risks are discovered, this flexible model will enable funding and solutions to immediately follow. While this cannot guarantee a result, it will ensure agile action and immediately bring to bear the industrial might of our nation’s many engineers.
Other Directions
These policy thrusts fundamentally lean into the uncertain. We don’t know what threats are worth our focus, but we do know adaptation is needed. By setting the table today, problems can be met head-on when they emerge tomorrow. It’s important to note, however, that these directions are not a panacea. Other routes should be considered. Education is key to both AI safety and AI diffusion; Congress should investigate how to build a more robust workforce. Diplomacy is key to global security and American competitive success; Congress should fund and equip the executive with the tools to make deals and ensure the United States leads the way on AI. Beyond these, there are many discrete investments and choices we can make today to ensure a stable, prosperous tomorrow.
Thank you again for the opportunity to comment. I appreciate the work your offices are putting into this worthy effort and the open mind with which you have approached it.
The side-stepping framework in question is Senators Schumer, Rounds, Young, and Heinrich’s recent AI policy roadmap, Driving U.S. Innovation in Artificial Intelligence. This white paper is the result of a year-long ‘listening’ project involving numerous AI Insight Forums, hearings, and meetings to help the Senate get a grasp on artificial intelligence. While I have quibbles with the document, it’s a great start – I’d say a solid 75% of the content could move things in a positive direction. While I wouldn’t say the idea that ‘AI is a threat to humans’ is missing from the document (perhaps to be explained later), the senators are correct to say it’s not the focus.