
Panel: AI in HPC: Another Tool in the Box?

Takeaways from the SCOPE 26 Panel

This post summarizes the panel discussion held on March 11, 2026, at the SCOPE conference (Science at the Convergence of AI and Exascale Computing), hosted at the Institut Henri Poincaré, Paris. The panel brought together Julie Deshayes, Paola Cinnella, Jean-Pierre Vilotte, and Shantenu Jha, moderated by Julien Le Sommer.

The framing question for this panel was deliberately open: is the arrival of AI in high-performance computing a stress, a disruption that forces us to throw away what we have built, or an incentive for change and a genuine invitation to rethink what scientific computing is for? Over ninety minutes, four researchers working in climate modelling, computational fluid dynamics, geophysics infrastructure, and autonomous workflow design gave answers that were technically specific, and at times quite personal.

The ‘in, out, and about’ framework: where is AI actually useful?

A large part of HPC is devoted to running high-fidelity simulations at enormous computational and power costs. The panel identified two concrete AI strategies already in production in climate modelling: full surrogate replacement, where a trained machine-learning model stands in for a costly simulation when sufficient multi-fidelity training data exists; and hybrid simulation, where AI improves physics-based closure models using limited observational data, making them general enough to explore broader parameter spaces.

Based on these examples, a useful framing emerged, ‘in, out, and about’: AI operating inside computation (hybrid solvers, surrogates); AI outside computation (the traditional HPC job with AI analysis); and AI about computation, the reasoning and planning agent that decides which computations to run at all. Right now the community is investing heavily in the first two use cases, and little work considers the third, even though it could be critical in light of a key principle: the most efficient computation is the one you never had to run.

Julie Deshayes added an important distinction between two modes of scientific HPC activity: operational activities, like forecast systems or service continuity, that must be maintained sustainably regardless of the method; and scientific activities, which involve testing hypotheses and understanding the system, where genuine experimentation with novel workflows is not only possible but necessary. The critical step she was forced to take when first introducing AI into her models was to rationalise evaluation: to write down explicitly, with data and numbers, what “ground truth” is and what “good” looks like. That effort, she noted, is beneficial even without AI, because it opens the community to external contribution.

Benchmarking is a governance problem, not just a measurement problem

In CFD and turbulence modelling, Paola Cinnella described a legitimacy crisis: a large number of practitioners are building AI-assisted turbulence models without understanding the physical constraints that experienced turbulence engineers treat as elementary requirements. With such limited understanding and evaluation, models score well on one case and fail to generalise. The proliferation of competing approaches, each with different data and no shared standards, is creating confusion rather than progress.

Her response: a community-governed evolutionary benchmark. This approach promotes a defined suite of cases of increasing complexity, evaluated with several metrics: not mean squared error alone, but also physically meaningful quantities like skin friction, which some popular AI models predict catastrophically badly. The key word is evolutionary: once everyone has adapted to a static benchmark, you must move the bar, otherwise methods overfit to the benchmark rather than to the underlying physics. When asked how communities should aggregate benchmarks, her answer was governance: an advisory board, a steering board, a community-forward process for deciding which cases are most representative. The benchmark must be a community object, not a lab product.
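To make the point about metrics concrete, here is a minimal sketch of how a velocity profile can look fine under mean squared error while getting the skin friction badly wrong. Everything in it is an assumption made for illustration: the sample points, both profiles, the fluid properties, and the function names are hypothetical, and the wall shear is estimated with a simple one-sided finite difference rather than with any method discussed on the panel.

```python
def mean_squared_error(pred, ref):
    """Average squared pointwise difference between two profiles."""
    return sum((p - r) ** 2 for p, r in zip(pred, ref)) / len(ref)


def skin_friction_coefficient(y, u, mu, rho, u_inf):
    """Cf = tau_wall / (0.5 * rho * u_inf^2), with the wall shear stress
    tau_wall = mu * du/dy estimated from the first two samples."""
    du_dy_wall = (u[1] - u[0]) / (y[1] - y[0])
    tau_wall = mu * du_dy_wall
    return tau_wall / (0.5 * rho * u_inf ** 2)


def score(y, u_pred, u_ref, mu=1.0e-3, rho=1.0, u_inf=1.0):
    """Report a global metric and a physically meaningful one side by side."""
    cf_pred = skin_friction_coefficient(y, u_pred, mu, rho, u_inf)
    cf_ref = skin_friction_coefficient(y, u_ref, mu, rho, u_inf)
    return {
        "mse": mean_squared_error(u_pred, u_ref),
        "cf_rel_error": abs(cf_pred - cf_ref) / abs(cf_ref),
    }


# Hypothetical wall-normal positions and velocity profiles (illustrative only).
y = [0.0, 0.001, 0.01, 0.1, 0.5, 1.0]
u_ref = [0.0, 0.05, 0.35, 0.75, 0.95, 1.0]
u_model = [0.0, 0.02, 0.34, 0.75, 0.95, 1.0]  # differs only close to the wall

scores = score(y, u_model, u_ref)
print(f"MSE: {scores['mse']:.2e}")
print(f"Relative error in Cf: {scores['cf_rel_error']:.0%}")
```

In this toy case the two profiles differ only in the two points nearest the wall, so the averaged error stays small while the wall gradient, and with it the skin friction, is off by a large fraction. A benchmark that reports both numbers makes that failure visible.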

HPC infrastructure: legacy, precision, and the allocation problem

Jean-Pierre Vilotte asked one of the most disruptive questions of the session: in four years, will we still talk about “HPC” as a distinct category of computing, or will it simply be called a computing system? The hardware landscape is already being shaped primarily by AI workloads. GPU and accelerator design is market-driven, and that market is not scientific computing. The primary concern of HPC scientists, when you talk to them, is not how to integrate AI; it is “I have a legacy code and I need to keep it running.” Finding ways to integrate HPC workflows with AI tools is not only an opportunity; it is a requirement to ensure that these workflows remain compatible with the new hardware.

The numerical tension at the heart of this is precision: AI hardware is designed for lower-precision, stochastic matrix operations; classical HPC relies on deterministic, high-precision linear algebra. These are genuinely different things, and the software stack does not yet bridge them cleanly.
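A small illustration of why that gap matters, offered as a sketch and not as a model of any real accelerator: the code below accumulates the same sequence once in double precision and once with every intermediate result rounded to IEEE 754 half precision. The helper names `to_half` and `accumulate`, the step size, and the count are all choices made for this example; the rounding uses the half-precision format character `e` from Python's standard `struct` module.

```python
import struct


def to_half(x: float) -> float:
    """Round a Python float (binary64) to the nearest binary16 value."""
    return struct.unpack("e", struct.pack("e", x))[0]


def accumulate(step: float, count: int, half_precision: bool) -> float:
    """Add `step` to a running total `count` times.

    With half_precision=True the total is rounded to binary16 after
    every addition, mimicking a low-precision accumulator.
    """
    total = 0.0
    for _ in range(count):
        total += step
        if half_precision:
            total = to_half(total)
    return total


STEP = 0.01      # illustrative increment
COUNT = 10_000   # the exact sum would be 100

double_total = accumulate(STEP, COUNT, half_precision=False)
half_total = accumulate(to_half(STEP), COUNT, half_precision=True)

print(f"double precision: {double_total}")
print(f"half precision:   {half_total}")
```

Once the running total is large enough, the increment falls below half the spacing between neighbouring half-precision values and is rounded away, so the low-precision sum stalls far short of the double-precision one. Long time integrations and large reductions in classical solvers are exactly the kind of operation that is sensitive to this.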

Shantenu Jha named one of the field’s dirty secrets: large-scale AI-HPC workflows show overall resource utilisation in the single digits. A 4,000-node machine is “fully occupied” in the allocation sense while running at one-tenth of peak flops, because a heterogeneous set of tasks cannot be efficiently mapped onto a homogeneous allocation model. The fix is rethinking allocation entirely, moving from a nodes-to-humans model to a resources-to-projects one. He also noted that the services cloud users take for granted, such as orchestration, scheduling, job monitoring, and failure recovery, are largely absent from HPC. To improve allocation, convergence toward these services on HPC infrastructure is inevitable.
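The arithmetic behind that gap is easy to sketch. The example below compares what the allocation reserves with what a three-stage workflow actually uses. The stage names, node counts, durations, and efficiency figures are hypothetical numbers chosen for illustration, not measurements from the panel, and the stages are assumed to run one after another inside a single fixed allocation.

```python
from dataclasses import dataclass


@dataclass
class Stage:
    name: str
    nodes: int         # nodes the stage actually keeps busy
    hours: float       # wall-clock duration of the stage
    efficiency: float  # fraction of peak flops achieved on those nodes


def utilisation(stages, allocated_nodes):
    """Useful work divided by reserved capacity, in peak node-hours.

    Stages are assumed to run sequentially, so the allocation is held
    for the sum of their durations.
    """
    wall_hours = sum(s.hours for s in stages)
    reserved = allocated_nodes * wall_hours
    useful = sum(s.nodes * s.hours * s.efficiency for s in stages)
    return useful / reserved


# Hypothetical workflow on a 4,000-node allocation (illustrative values).
workflow = [
    Stage("simulation", nodes=4000, hours=2.0, efficiency=0.30),
    Stage("training", nodes=500, hours=6.0, efficiency=0.40),
    Stage("inference", nodes=50, hours=2.0, efficiency=0.20),
]

overall = utilisation(workflow, allocated_nodes=4000)
print(f"Overall utilisation: {overall:.1%}")
```

The machine is reserved in full for the whole run, yet most stages touch only a fraction of the nodes and none runs near peak, so the overall figure lands below ten percent. Allocating resources per stage instead of per job is one way to read the resources-to-projects idea.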

The team problem is structural, not personal

Multiple speakers arrived independently at the same diagnosis: the barrier to multi-disciplinary team science in academia is not cultural reluctance but institutional structure. Funding systems, evaluation metrics, and career paths are built around individuals and actively resist team-based approaches.

Paola Cinnella described the structural isolation of AI researchers in each lab: three or four people, completely cut off from the rest of their institute, while PhD students in high-energy physics three departments over are independently solving almost identical data-handling problems. CNRS had project team structures designed to address this but no longer funds or recognises them.

Paola Cinnella put it most directly: “Let’s at least make teams.” Teamwork is a genuine skill that is no longer taught, no longer funded, and whose rules people have largely forgotten. Publication records, grant attribution, and career advancement are built around the individual. As long as that is true, teams will remain formal objects rather than operational realities. Julie Deshayes added an important dimension: these teams need to be multi-disciplinary, to foster real interaction between people of various backgrounds.

Shantenu Jha extended this to the talent pipeline: academia is losing the ability to retain graduate students. Research universities have one-hundredth of the resources of the frontier AI labs, and the labs have equally interesting problems. The fix requires genuinely incorporating industry into academic team structures, not just collaborating transactionally.

Carbon is not optional

A question from the audience asked whether we are heading toward a world where the best algorithms simply belong to whoever has the cheapest electricity and most water. The exchange that followed was the sharpest of the session.

Julie Deshayes argued that energy and water metrics for AI are largely opaque and that the geographic distribution of compute means carbon intensity varies enormously across runs. The socio-political conflicts between data centers and competing land and water uses will intensify and act as real constraints on unchecked AI scaling.

Shantenu Jha offered the contrarian view: energy is a non-issue. The exascale computing programme was predicted to require a dedicated nuclear reactor. It was delivered at one-tenth of the cost. Invoking Amara’s Law, he argued that we consistently overestimate the challenges of ambitious computing goals.

Deshayes’s response was categorical: carbon is not a non-issue. Climate change is already happening and already killing people. Doing science does not grant a free carbon pass. The obligation is to compute the integrated environmental cost of your AI tool choices, including energy, carbon, and water, and to treat it as a real budget constraint. Paola Cinnella closed the loop: in aeronautics, the entire motivation for using AI-assisted simulation is to reduce environmental impact. AI tools used to design greener aircraft must not have a larger carbon footprint than the emissions they are trying to reduce. She called explicitly for frugal AI and small models: large models are frequently overkill.
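As a rough sketch of what computing that budget can look like, the example below estimates the carbon and water cost of a single run from its energy use. The formula is the common first-order one (IT energy, scaled by the data centre's power usage effectiveness, times the grid's carbon intensity); every number, site name, and function name is a placeholder assumed for illustration, and real values would have to come from the facility and the grid operator.

```python
def carbon_kg(energy_kwh, pue, intensity_g_per_kwh):
    """Operational emissions in kg CO2e: IT energy scaled by PUE,
    multiplied by the grid carbon intensity."""
    return energy_kwh * pue * intensity_g_per_kwh / 1000.0


def water_litres(energy_kwh, wue_l_per_kwh):
    """On-site water use, from a water usage effectiveness figure in L/kWh."""
    return energy_kwh * wue_l_per_kwh


def within_budget(footprint_kg, budget_kg):
    """Treat carbon like any other allocation: a limit the run must fit in."""
    return footprint_kg <= budget_kg


# Hypothetical training run and sites (all values are placeholders).
RUN_ENERGY_KWH = 10_000.0
sites = {
    "low-carbon grid": {"pue": 1.2, "intensity": 50.0, "wue": 0.5},
    "high-carbon grid": {"pue": 1.2, "intensity": 500.0, "wue": 0.5},
}
CARBON_BUDGET_KG = 1_000.0  # hypothetical project budget

footprints = {}
for name, site in sites.items():
    footprints[name] = carbon_kg(RUN_ENERGY_KWH, site["pue"], site["intensity"])
    water = water_litres(RUN_ENERGY_KWH, site["wue"])
    ok = within_budget(footprints[name], CARBON_BUDGET_KG)
    print(f"{name}: {footprints[name]:.0f} kg CO2e, {water:.0f} L water, "
          f"within budget: {ok}")
```

The same run has a tenfold difference in footprint between the two placeholder sites, which is the point made about geographic variation. This sketch covers operational energy only; an integrated cost would also need to account for hardware manufacturing and other life-cycle terms.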

Are we ready for agent-driven HPC?

Current HPC systems are not equipped to handle the elasticity and dynamism that agent-driven scientific inquiry requires. But there is no fundamental technical barrier to adapting. The barriers, both Vilotte and Jha agreed, are policy and governance: HPC systems have been designed for security, access control, and batch scheduling in ways that make them inherently rigid. Those are governance choices, not hardware constraints.

From the audience, George Karniadakis described a concrete preview: a student set up two AI agents on separate laptops, communicating via a messaging app, to run a CFD computation. He woke up to find one agent had spontaneously called the other to help finish the task, without being asked. That is not yet science. But the implication is clear: HPC allocation models built around human job submission are already anachronistic.

Karniadakis also stated, without equivocation, that he believes he has taken his last human PhD student. Paola Cinnella pushed back with equal directness: mentoring has non-rational dimensions, such as eye contact, physical presence, and the unplanned conversation, that cannot be replicated. We need new generations, and those generations need professors.

Calls to action

Invest in the ‘about’ layer. The community is under-investing in the reasoning and planning layer that decides which computations to run. The highest-leverage AI application in HPC is not inside computation, nor in accelerating it; it is the planning agent above it.

Build evolutionary, community-governed benchmarks. Scored on physically meaningful quantities, governed by an advisory and steering board, and designed to move the bar as methods improve: this is the model for how communities can establish standards without fragmenting into incompatible local benchmarks.

Treat open standards as a scientific necessity. Closed proprietary tools, including closed data, closed models, and closed APIs, are a scientific liability for reproducibility, auditability, and independence. Open standards are not idealistic preferences; they are requirements for science.

Compute your carbon budget. The integrated environmental cost of AI tool choices should be treated as a real budget constraint, not an externality. Frugal AI is good science, not a compromise.

Design teams for interdisciplinarity. Academia’s individual-centred incentive systems are the main barrier to the multi-disciplinary team science that AI-enabled research requires. Changing this requires changing evaluation metrics, funding structures, and career paths.

The session closed on an unanswered question, read aloud from the audience vote, and left deliberately open. For 1,500 years, chess was the benchmark of human intelligence. Deep Blue moved the goalposts in 1997. AlphaGo moved them again in 2016. AI weather forecasting now outperforms simulation. Can we admit we have no clear, stable limit for what AI can do, and commit, in that uncertainty, to looking for AI within the real resource constraints of Earth?

The panel had no answer. Neither does the field.

Notes from the session. Quotes are reconstructed from the panel transcription and may not be verbatim.
