Four Types of Flexibility to Make AI Loads Compatible Dinner Guests for Our Nervous Hosts - US Electric Grids
“Don’t Take the Tablecloth with You!”
Planning Models, Interconnection Studies, and Notice-Based, Telemetered Off-Ramps Explained in Terms of Something We All Understand: Gracefully Executed Dinner Parties
CSIS Energy for AI, April 2, 2025 - Event Link
The Worst Dinner Party Ever which inspired this piece - The Residence, on Netflix
There’s been a lot of talk lately about whether AI data centers can offer flexibility to the grid — and most of it veers straight into hopeful banter on voluntary demand response programs. I had the opportunity to correct that misconception today at the CSIS Energy Security and Climate Change Program, alongside colleagues across the build, operate, source, and scale sides of the AI large-load ecosystem.
Here’s the bottom line: there is no commercial or physics pathway to relying on voluntary, price-based curtailment as the singular viable strategy for large, firm loads seeking interconnection. No one is trading some millions in curtailment credits for billions in compute SLA penalties. And that does not even touch the physics problem — short- or no-notice ramps are operational liabilities for both the grid and the sites in question. Regardless of what you think about the vision that 99% uptime expectations will hold at these facilities and continue to demand 8760-hour load profiles, this is a business for which the Value of Lost Load (VOLL) dwarfs that of the residential, mid-size C&I, and large industrial classes combined. There is no rational clearing price at which it makes economic sense for a hyperscale AI data center to engage in market-based load curtailment — not when the cost of interruption exceeds the value of any demand response payment by orders of magnitude. That... is pretty much the writing on the wall when uptime itself is the business model.
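To make the "no rational clearing price" point concrete, here is a back-of-the-envelope sketch. Every figure in it (the credit rate, the curtailment hours, and the value placed on interrupted compute) is a hypothetical assumption for illustration, not market or contract data:

```python
# Back-of-the-envelope comparison with purely hypothetical numbers.
site_mw = 400                 # hypothetical AI campus demand
curtail_hours = 50            # hypothetical hours of curtailment requested per year
dr_credit_per_mwh = 200       # hypothetical, generous demand-response credit ($/MWh)
lost_value_per_mwh = 20_000   # hypothetical value of interrupted compute + SLA exposure ($/MWh)

dr_revenue = site_mw * curtail_hours * dr_credit_per_mwh           # $4,000,000 per year
interruption_cost = site_mw * curtail_hours * lost_value_per_mwh   # $400,000,000 per year

print(f"Curtailment credits:  ${dr_revenue:,.0f}")
print(f"Cost of interruption: ${interruption_cost:,.0f}")
print(f"Interruption cost is {interruption_cost / dr_revenue:.0f}x the credit")
```

Under any remotely similar assumptions, the credit never closes the gap, which is the whole point.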
Put differently: if data center developers and customers are going to band together to drive flexible consumption behaviors, they will not be motivated by curtailment-for-cash programs. They will do it because something blocks their ability to interconnect and operate at all — whether that is unacceptable time-to-market delays, grid planning shocks, emergency legislation, or interconnection study outcomes that make their economics collapse.
At that point, flexibility is not a nice-to-have. It is a survival tactic — a way to keep growing compute footprints in a power-constrained world.
I really appreciated this Substack article from David Myton, which aptly captures the surprise many compute industry readers may feel when reading academic studies about the possibility of rapid demand response from DC workload facilities. They want to know how to make this commercially viable for differentiated workloads, how to adhere to contract commitments on compute speed and quality, how to classify workloads, and how to change applications developed only for certain types of zonal failovers. Indeed, someone needs to be doing that work while we grid nerds figure out the power access side of the equation.
At the CSIS event, I shifted the conversation on load flexibility to focus on what types of flexibility actually help grid operators, and how we design that flexibility into planning studies, interconnection approvals, emergency procedures, and co-location frameworks — not just into after-the-fact load shedding contracts.
To land the point with a broader audience — those who do and don’t speak "RTO" fluently — I used a simple analogy: AI data centers are dinner party guests. The grid is the host. Flexibility is about behaving like a guest who gets invited back to a Very Important but generally chaotic dinner scene.
Here are the four types of flexibility that matter — and the rules of the dinner party which frame those requirements. S/O to fellow Shonda Rhimes lovers — turns out The Residence may have influenced more of my analogies than I realized as I took the stage.
1. N-1 Resilience Requires Defensive, Graceful Ramping, Not Massive Load Drops
Your host needs to trust that when you get up from the dinner table, you do not take the tablecloth with you, tucked into your belt, wiping out the provisioning for everyone else. A large load drop is the single most important N-1 contingency challenge today that must be solved to manage sudden swings in grid stability - in this context, flexibility means leaving the dinner table gracefully.
The grid is built on the N-1 principle: we can lose one thing and keep going. What it is not built for is the instantaneous loss of 400 MW of hyperscale AI load because a data center trips on a voltage blip. Flexibility in this instance is about ensuring the grid operator does not have to simulate the entire site as a potential failure in every planning study - operational flexibility, rather than failure, following a disturbance is the key.
So we need defensive flexibility for disturbances — flexible, grid-aware ramp-downs triggered by disturbance, not disconnection.
What this means in practice:
Ride-through capability so the site does not vanish at the first disturbance.
Smart controls that ramp rather than drop - the ultimate flex grid operators care about (see the sketch after this list).
Batteries or on-site resources that cushion the grid and help it adapt to stability surprises.
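Here is a minimal sketch of what "ramp, not drop" could look like in a site controller, assuming made-up setpoints (the ride-through threshold, ramp rate, and minimum grid draw below are illustrative, not any standard's values):

```python
# Illustrative only: on a voltage disturbance, ride through and ramp grid draw
# down gradually, letting on-site batteries absorb the difference, instead of
# tripping the whole site offline.

RIDE_THROUGH_PU = 0.85     # assumed low-voltage ride-through threshold (per unit)
RAMP_MW_PER_MIN = 20.0     # assumed maximum graceful ramp rate
MIN_GRID_DRAW_MW = 100.0   # assumed floor the site holds rather than disconnecting

def grid_draw_next_minute(voltage_pu: float, current_draw_mw: float,
                          battery_headroom_mw: float) -> float:
    """Return the next minute's grid draw: ramp during disturbances, never drop."""
    if voltage_pu >= RIDE_THROUGH_PU:
        return current_draw_mw                       # normal conditions: hold steady
    # Disturbance: shed no more per minute than the ramp rate or the battery can cover
    step = min(RAMP_MW_PER_MIN, battery_headroom_mw)
    return max(MIN_GRID_DRAW_MW, current_draw_mw - step)

# A 400 MW site sees a 0.80 pu sag with 150 MW of battery headroom:
print(grid_draw_next_minute(0.80, 400.0, 150.0))     # 380.0 -> a 20 MW ramp, not a 400 MW trip
```

The planning payoff is that the operator gets to study a bounded ramp instead of the loss of the entire site.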
2. Study a Portion of Load Requests as Firm and the Rest as Flexible to Enable "Overbooking" of Today's Infrastructure
“I have the appetite of three people. I'll eat everything served. But I have brought my own snack, so I'll only eat for two, and you can serve me a double-entrée faster.” (CSIS Link)
Ireland and the Netherlands figured this out: if a data center can demonstrate that it can curtail part of its load (and brings the dispatch response capability, telemetry, and proof of curtailment through self-supplied generation, etc.), it can connect sooner, avoid triggering massive upgrades, and still serve its core compute needs.
In the Netherlands today, this study process allows overbooking of available transmission infrastructure for Load Interconnection, just like ERCOT overbooks transmission for Generator Interconnection in its "Connect and Manage" Framework.
Source Note: The Dutch National Regulatory Authority permits up to 10% overbooking of the grid's capacity for general assets, and up to 50% for flexible assets. This approach enhances the efficiency of the grid by accommodating more connections without immediate infrastructure expansion.
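To show what those percentages buy, here is a toy calculation. The 10% and 50% factors come from the source note above (read here as allowable connection totals above firm capacity); the substation rating is a made-up example:

```python
# Toy overbooking arithmetic; the 1,000 MW figure is hypothetical.
substation_firm_capacity_mw = 1_000

general_overbooking = 0.10    # per the Dutch NRA note above, for general assets
flexible_overbooking = 0.50   # per the Dutch NRA note above, for flexible assets

print(substation_firm_capacity_mw * (1 + general_overbooking))   # 1100.0 MW connectable if general
print(substation_firm_capacity_mw * (1 + flexible_overbooking))  # 1500.0 MW connectable if demonstrably flexible
```

Under that reading, proving flexibility is worth hundreds of megawatts of connection capacity on the same wires.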
What this means in practice: U.S. grid operators need to agree on a nationwide, adaptable non-firm interconnection study process that gets us these outcomes faster, where flexibility can be modeled into planning studies and interconnection requests for large loads, and curtailment can be proven against a standardized set of parameters with telemetry and other grid operator visibility solutions.

Unfortunately, we do not have in the U.S. the singularity in decision-making that the Dutch grid enjoys. TenneT is the national transmission system operator (TSO), in charge of national-level transmission planning and the operation of all high-voltage lines. Regional Distribution System Operators (DSOs) like Liander, Stedin, and Enexis operate the medium- and low-voltage distribution networks; they are public interest entities, must follow central planning directives, have performance obligations tied to capacity availability, and answer to a single regulator (gasp!): the ACM, which sets uniform national rules and tariffs and enforces coordination between the TSO and DSOs. This means they have nice things we do not have in the States: one set of rules administered top-down for load studies, legal requirements to offer non-firm interconnection options to customers requesting grid connections, and standardized capacity limitation contracts, dynamic tariffs, and flexible planning assumptions for the transmission system.

If there is one augmentation I would like to make to the CSIS report on this topic, it would be this: a wish list of all the nice things I would like us to build from the top down in the U.S. for load-side interconnection, planning, and operational response. (I talk about that here.)
What this type of flexibility looks like in practice:
The site takes less than its maximum draw from the grid at peak hours and serves the rest of that full coincident demand with onsite power, or removes the load through shut-off, workload reallocation, or thermal efficiency measures (like pre-cooling hours in advance of peak).
The curtailed profile is built into the site model and used to approve the interconnection request faster, with the option to draw more from the grid once further upgrades are built and operating.
The site backs up its peak-hour consumption profile with telemetry, showing the grid operator that its appetite during peaks is manageable as modeled and that it can respond to ordered dispatch by ramping back down to expectations (see the sketch after this list).
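Here is a minimal sketch of how a studied firm/flexible split might be checked against telemetry during a peak window; the split, the sample cadence, and the pass/fail rule are assumptions for illustration, not any RTO's actual compliance standard:

```python
# Illustrative compliance check: did the site's telemetered grid draw stay at or
# below the portion studied as firm during the peak window?

STUDIED_FIRM_MW = 300.0   # portion studied as firm grid draw (hypothetical)
STUDIED_FLEX_MW = 100.0   # portion committed to onsite supply or curtailment at peak (hypothetical)

def peak_window_compliant(telemetry_mw: list[float]) -> bool:
    """True if every grid-draw sample in the peak window is within the firm amount."""
    return all(sample <= STUDIED_FIRM_MW for sample in telemetry_mw)

# Example: 5-minute samples across a peak hour
samples = [295.0, 298.5, 299.9, 297.2, 296.8, 294.1, 298.0, 299.5, 296.3, 295.7, 297.9, 298.8]
print(peak_window_compliant(samples))   # True -> the appetite at peak matched the model
```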
3. Emergency Curtailment for Grid Stress Conditions Means Having the Flexibility to Comply When - and on What Magnitude and Timescale - the Grid Operator Needs You to Reduce Load
"You might be asked to quietly leave before the filet mignon is served… because we didn’t make enough, and the Prime Minister of Australia needs one."
What this means in practice:
Pre-committed orchestration logic that responds to system signals and uses every site control, auxiliary power control, and compute workload control solution available to you as the site and to your interconnecting substation entity.
Integration with site controllers and BESS for graceful power-down.
Predictable, pre-approved behaviors the grid can count on.
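A minimal orchestration sketch of that pre-committed behavior is below. The instruction carries a magnitude (MW) and a timescale (minutes), and the site works down a pre-approved response stack; the resource names, sizes, and deployment times are hypothetical:

```python
from dataclasses import dataclass

@dataclass
class CurtailmentOrder:
    reduce_mw: float       # how much the grid operator needs back
    within_minutes: int    # how fast it needs it

# Hypothetical pre-approved response stack: (action, MW available, minutes to deploy)
RESPONSE_STACK = [
    ("discharge BESS",            80.0,  1),
    ("pre-cool / HVAC setback",   30.0,  5),
    ("pause deferrable batch",    60.0, 10),
    ("shift workloads off-site",  50.0, 15),
]

def plan_response(order: CurtailmentOrder) -> list[tuple[str, float]]:
    """Select pre-approved actions that meet the ordered MW within the ordered time."""
    plan, remaining = [], order.reduce_mw
    for action, mw, minutes in RESPONSE_STACK:
        if remaining <= 0:
            break
        if minutes <= order.within_minutes:
            take = min(mw, remaining)
            plan.append((action, take))
            remaining -= take
    return plan

print(plan_response(CurtailmentOrder(reduce_mw=150.0, within_minutes=10)))
# [('discharge BESS', 80.0), ('pre-cool / HVAC setback', 30.0), ('pause deferrable batch', 40.0)]
```

The point is not this particular stack; it is that the behavior is decided, tested, and visible to the operator before the stressed hour arrives.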
4. Self-Limiting Interconnection Rights: Never Exceeding What the Grid Sees as Your Net Load
“This is the guest who always does what they said they would do, and never adds or subtracts burden for the host, even if the behind-the-scenes situation is a potentially dramatic increase in the guest's appetite (unknown, at the time, to either the guest or the host).”
If you declare 100 MW at the POI, you never draw more — even if your internal load grows. That’s self-limiting behavior, enforced by real controls, not just intent.
What this means in practice:
Load orchestration that caps POI draw (firm rights).
On-site gen or batteries that carry excess internal demand and enable growth in tenant use cases - soaking up load growth without showing the grid a change in net load.
Verified telemetry that the grid operator can trust.
In this model, flexibility means that the site achieves scaled compute without forcing new transmission. The loads grow behind the meter, not through the meter, and net loads visible to the grid are firm.
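Here is a minimal sketch of that self-limiting dispatch logic, with illustrative numbers (the 100 MW declared right matches the example above; everything else is assumed):

```python
DECLARED_POI_MW = 100.0   # the amount declared at the point of interconnection

def dispatch(internal_load_mw: float, onsite_available_mw: float) -> dict:
    """Split internal load between grid draw (capped at the POI right) and on-site resources."""
    grid_mw = min(internal_load_mw, DECLARED_POI_MW)
    onsite_mw = internal_load_mw - grid_mw
    curtailed_mw = 0.0
    if onsite_mw > onsite_available_mw:
        # Not enough behind-the-meter supply: curtail internal load rather than exceed the POI cap
        curtailed_mw = onsite_mw - onsite_available_mw
        onsite_mw = onsite_available_mw
    return {"grid_mw": grid_mw, "onsite_mw": onsite_mw, "curtailed_mw": curtailed_mw}

# Internal load has grown to 130 MW, with 40 MW of on-site supply available:
print(dispatch(internal_load_mw=130.0, onsite_available_mw=40.0))
# {'grid_mw': 100.0, 'onsite_mw': 30.0, 'curtailed_mw': 0.0} -> growth stays behind the meter
```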