Chapter 15
Thermal Management & Cooling
15.1 Overview
Every watt consumed by a GPU becomes a watt of heat that must be removed. An NVIDIA Blackwell B200 GPU has a thermal design power of 700W. A GB200 NVL72 rack generates approximately 120 kW 1 of heat in a space the size of a refrigerator. At these densities, traditional air cooling fails. The data center industry is in the middle of a forced transition from air to liquid cooling, and the companies that manufacture cooling infrastructure are experiencing the same explosive demand growth as the chip and power layers of the AI supply chain.
This chapter covers the thermal management solutions that keep AI data centers operational: direct-to-chip liquid cooling, rear-door heat exchangers, coolant distribution units (CDUs), immersion cooling, and the traditional air cooling systems that still serve the majority of existing data centers. It connects to the power distribution layer (Chapter 14), since cooling consumes 7-40% 2 of total data center electricity depending on efficiency, and to the physical construction layer (Chapter 17), since cooling infrastructure is a major determinant of data center design and site selection. The chapter also covers water infrastructure (supply, treatment, water use efficiency), since data center water consumption has become a material constraint.
A consolidation wave is reshaping this layer. HVAC incumbents are acquiring liquid cooling specialists at premium valuations: Ecolab acquired CoolIT Systems for $4.75 billion 3, Eaton acquired Boyd Thermal for $9.5 billion 4, Trane Technologies acquired LiquidStack, Schneider Electric acquired Motivair, and Daikin acquired Chilldyne. The result is that many of the most innovative cooling companies are no longer available as independent public investments. The technology is moving from private startup to industrial conglomerate subsidiary in a single step.
ZutaCore (Israel, private) has maintained independence and has a long-term partnership with Equinix since 2022, deploying direct-on-chip, waterless, two-phase liquid cooling capable of 100+ kW per rack across Equinix Metal infrastructure. The upstream cooling infrastructure (cooling towers that reject heat from the building) is dominated by SPX Technologies (SPXC, NYSE, ~$8B 5), which manufactures cooling towers under the Marley brand with over 100 years of operating history. ENEXIO (Germany, private) and Baltimore Aircoil Company (private) are the other major cooling tower manufacturers. These are rarely discussed in AI infrastructure analysis, but every data center with evaporative cooling depends on them.
The data center liquid cooling market was valued at approximately $4.8-6.65 billion in 2025, depending on the source and methodology. Dell’Oro Group estimates the market roughly doubled in 2025 to nearly $3 billion in manufacturer revenue (a narrower definition excluding installation and services), and projects it will reach $7 billion by 2029. Growth rates range from 20-32% CAGR depending on scope, making this one of the fastest-growing segments in the AI infrastructure stack 678.
The market segments into three primary cooling architectures. Direct-to-chip (DTC) liquid cooling, which uses cold plates attached directly to processors with liquid circulated through tubes, is the dominant approach. DTC held approximately 42-47% of the liquid cooling market in 2025 and is the technology being mandated by hyperscalers. In February 2025, Microsoft mandated DTC liquid cooling for all new AI and HPC server deployments in Azure. HPE expanded its alliance with CoolIT Systems in March 2025 for next-generation liquid-cooled Cray servers 9.
Immersion cooling, where entire servers are submerged in dielectric fluid, is the second architecture. It offers superior heat removal (handling 100+ kW per rack) but requires fundamentally different server designs and is more complex to maintain. Companies like GRC (Green Revolution Cooling), LiquidStack, and Submer lead this segment. Immersion is growing rapidly but remains a smaller share of deployments than DTC.
The third is rear-door heat exchangers (RDHx), which attach a water-cooled heat exchanger to the back of a standard server rack, removing heat from exhaust air. This is the simplest retrofit path for existing data centers and is offered by Vertiv, Schneider, and others.
The competitive field is fragmented but consolidating. Vertiv leads the overall liquid cooling market with approximately 11.3% share, followed by Schneider Electric (which acquired Motivair in 2024 to add cold plate and CDU technology), CoolIT Systems, nVent, and Boyd Corporation. The top 5 players hold roughly 35% of the market. Dell’Oro Group highlights Aaon as a fast-growing player due to deep hyperscaler partnerships. Daikin Applied acquired Chilldyne (direct-to-chip specialist) in November 2025, signaling traditional HVAC players’ entry into the AI cooling market 6810.
CoolIT Systems (private, based in Calgary) is arguably the most important pure-play in liquid cooling. The company pioneered direct-to-chip cold plate technology and has shipped millions of cold plates to hyperscalers. Its expanded alliance with HPE makes it the default DTC solution for HPE’s Cray AI server line. CoolIT’s cold plates are also designed into multiple NVIDIA reference architectures.
The transition to liquid cooling is not optional. Air cooling reaches practical limits at approximately 30-40 kW per rack. AI racks already operate at 100-120 kW, and next-generation systems (NVIDIA Rubin) are expected to push rack densities beyond 200 kW. Leading-edge GPU TDPs are projected to exceed 4,000W by 2029, making liquid cooling a structural requirement rather than an efficiency upgrade. Liquid cooling can reduce cooling energy costs by more than 30% compared to air, improving a facility’s power usage effectiveness (PUE) 69.
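The PUE claim above can be made concrete with a short calculation. The figures below are illustrative assumptions chosen for the example (a 10 MW IT load with cooling at ~35% of IT power), not measured data from any facility:

```python
# PUE (power usage effectiveness) = total facility power / IT equipment power.
# Sketch: how a ~30% cut in cooling energy from liquid cooling moves PUE.

def pue(it_kw: float, cooling_kw: float, other_kw: float) -> float:
    """Total facility power divided by IT power."""
    return (it_kw + cooling_kw + other_kw) / it_kw

# Hypothetical 10 MW IT load, air-cooled: cooling at ~35% of IT power,
# plus 500 kW of lighting, UPS losses, and other overhead.
air_cooled = pue(it_kw=10_000, cooling_kw=3_500, other_kw=500)

# Same facility after conversion to liquid cooling: cooling energy
# reduced ~30%, from 3,500 kW to 2,450 kW.
liquid_cooled = pue(it_kw=10_000, cooling_kw=2_450, other_kw=500)

print(f"air-cooled PUE:    {air_cooled:.3f}")
print(f"liquid-cooled PUE: {liquid_cooled:.3f}")
```

With these assumed numbers, PUE improves from 1.400 to 1.295; the cooling energy saved scales directly with the IT load, which is why the >30% figure matters at gigawatt campus scale.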
15.2 Market Sizing & Growth
Data center liquid cooling market: Valued at $4.8-6.65 billion in 2025 depending on source. Dell’Oro Group: nearly $3 billion in manufacturer revenue (2025), projected ~$7 billion by 2029. Grand View Research: $6.65 billion (2025), projected $29.46 billion by 2033 at 20.1% CAGR. Global Market Insights: $4.8 billion (2025), projected $27.1 billion by 2035 at 18.2% CAGR 678.
AI-specific liquid cooling: Future Market Insights: $3.2 billion (2025), projected $7.2 billion by 2030. Direct-to-chip held 47% share. Technavio: market growing at 31.7% CAGR from 2025 to 2030, adding $2.48 billion 911.
Direct-to-chip market: Valued at $2.53 billion in 2025, projected to reach $12.76 billion by 2034 at 19.72% CAGR. Single-phase holds 66% share; two-phase growing as TDPs exceed single-phase limits 12.
North America liquid cooling: $0.97-1.29 billion in 2025, projected to reach $12.19 billion by 2034 at 32.47% CAGR. US alone generated $1.29 billion 813.
Vertiv (cooling + power): Total Q3 2025 revenue $2.6 billion (+29% YoY), backlog $9.5 billion. Cooling is a major segment alongside power (not separately reported). Acquired Purge Rite (~$1B) for liquid cooling services. Market leader in liquid cooling at 11.3% share 610. By Q4 2025, CEO Giordano Albertazzi reported the backlog had surged to $15 billion, “more than double last year’s and up 57% sequentially,” with customers requesting 12-18 month delivery windows 15.
Schneider Electric (cooling): Acquired Motivair (2024) for cold plate and CDU technology. Collaborated with NVIDIA on AI cooling reference architectures. EcoStruxure platform integrates cooling with power management 5.
15.3 Supply Chain Flowchart
THERMAL MANAGEMENT & COOLING
|
|---> DIRECT-TO-CHIP (DTC) LIQUID COOLING (~42-47% of liquid market)
| Cold plates attached to CPU/GPU; liquid circulates through tubing
| Single-phase (water/glycol): dominant today, simpler, 66% share
| Two-phase (dielectric fluid boils on chip): higher heat removal, emerging
| |
| |-- DTC SOLUTION PROVIDERS
| | CoolIT Systems (Private): pioneer, HPE Cray alliance, millions shipped
| | Vertiv: Liebert XDU CDUs, cold plate solutions, 11.3% market leader
| | Schneider/Motivair: cold plates + CDUs (acquired 2024)
| | Asetek: desktop/enterprise liquid cooling IP, OEM partnerships
| | Boyd Corporation: acquired Durbin Group for vertical integration
| | nVent Electric: rack-level liquid cooling solutions
| | Chilldyne → Daikin Applied (acquired Nov 2025)
| | ZutaCore: waterless two-phase DTC technology
| |
| +-- CDU (Coolant Distribution Unit) MANUFACTURERS
| Vertiv, Schneider/Motivair, CoolIT, Rittal, Stulz
| CDU sits between facility chilled water and server cold plates
|
|---> IMMERSION COOLING (higher heat removal, more complex)
| Servers submerged in dielectric fluid
| Single-phase immersion: servers in non-boiling fluid
| Two-phase immersion: fluid boils on components, condensed and recirculated
| |
| |-- Green Revolution Cooling (GRC): single-phase immersion leader
| |-- LiquidStack (acquired by Trane Technologies): two-phase immersion
| |-- Submer: single-phase immersion, European focus
| |-- Iceotope Technologies: precision immersion cooling
| +-- Midas Green Technologies: modular immersion
|
|---> REAR-DOOR HEAT EXCHANGERS (RDHx)
| Easiest retrofit for existing air-cooled data centers
| Vertiv, Schneider, nVent, Rittal
| Effective up to ~50 kW per rack; insufficient for 100+ kW AI racks
|
|---> TRADITIONAL AIR COOLING (legacy, reaching limits)
| CRAC/CRAH units: Vertiv, Schneider, Stulz, Rittal
| Effective up to ~30-40 kW per rack
| Still serves majority of existing enterprise data centers
| Being phased out for AI/HPC workloads
|
+---> SUPPORTING COMPONENTS
Chiller plants: Carrier, Trane, Daikin, Johnson Controls
Cooling towers: SPX Cooling Technologies, Baltimore Aircoil (BAC)
Pumps: Grundfos, Xylem
Water treatment: Xylem, Nalco (Ecolab)
Thermal interface materials: Dow, Honeywell, Henkel
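The CDU's role in the flowchart above is ultimately a heat balance: the coolant flow it must deliver follows from Q = ṁ·cp·ΔT. A rough sizing sketch with assumed, typical-order values (a ~1 kW GPU cold plate, a 120 kW rack, a 10 K allowed coolant temperature rise; not any vendor's specification):

```python
# Required coolant flow from the heat balance Q = m_dot * cp * dT.

CP_WATER = 4186.0     # J/(kg*K), specific heat of water
RHO_WATER = 1000.0    # kg/m^3, density of water

def flow_lpm(heat_w: float, delta_t_k: float) -> float:
    """Water flow (litres/minute) needed to absorb heat_w at a delta_t_k rise."""
    m_dot = heat_w / (CP_WATER * delta_t_k)      # mass flow, kg/s
    return m_dot / RHO_WATER * 1000.0 * 60.0     # convert m^3/s -> L/min

# One ~1 kW GPU cold plate with a 10 K coolant rise:
print(f"per cold plate: {flow_lpm(1_000, 10):.1f} L/min")

# A 120 kW NVL72-class rack served by one CDU loop:
print(f"per rack CDU:   {flow_lpm(120_000, 10):.0f} L/min")
```

This works out to roughly 1.4 L/min per cold plate and ~170 L/min per rack, which is why CDU pump reliability (and redundancy, per the Moog and Panasonic entries below) is treated as critical infrastructure.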
15.4 Key Companies
| Company | Ticker | Exchange | Approx. Mkt Cap | Role in Buildout | Key Metric |
|---|---|---|---|---|---|
| Vertiv Holdings | VRT | NYSE | ~$131B | Market leader in DC liquid cooling (11.3% share); UPS, CDUs, thermal | Q3 2025 revenue $2.6B; backlog $9.5B; acquired Purge Rite $1B |
| Schneider Electric | SU | Euronext Paris | ~$170B | Acquired Motivair (cold plates/CDUs); EcoStruxure; NVIDIA partnership | End-to-end power + cooling integration for AI DCs |
| nVent Electric | NVT | NYSE | ~$27.5B | Rack-level liquid cooling, enclosures, busway, power scaling to 200A+/100kW+ | Modular liquid-cooling for high-density AI racks; launched HPC/AI-specific hardware |
| CoolIT Systems | Private | Private | Private | Pioneer in DTC cold plates; HPE Cray alliance; hyperscaler supply | Millions of cold plates shipped; default for HPE AI servers |
| Asetek | ASTK | Oslo Bors | ~$300M | Desktop and enterprise liquid cooling; OEM partnerships | Decades of liquid cooling IP; expanding into DC segment |
| Aaon | AAON | NASDAQ | ~$10.0B | Fast-growing DC cooling; deep hyperscaler partnerships | Rapid growth highlighted by Dell’Oro; custom air handling |
| Stulz | Private | Private (Germany) | Private | Precision air conditioning and liquid cooling for DCs | Major European DC cooling supplier |
| Rittal | Private | Private (Germany) | Private | IT racks, cooling solutions, CDUs | Part of Friedhelm Loh Group; significant DC infrastructure |
| GRC (Green Revolution Cooling) | Private | Private | Private | Single-phase immersion cooling leader | Targeting hyperscale and enterprise with tank-based systems |
| LiquidStack (Trane Technologies) | Private | Subsidiary | Private | Two-phase immersion cooling; acquired by Trane | MOU with Innovo for liquid-cooled modular DCs (Nov 2025) |
| Daikin Industries | 6367 | Tokyo | ~$48.0B | Traditional HVAC; acquired Chilldyne (DTC) in Nov 2025 | Entry into AI DC cooling via DTC acquisition |
| Carrier Global | CARR | NYSE | ~$60.0B | HVAC, chiller plants for data center cooling loops | Major chiller supplier for DC facility cooling |
| Moog Inc | MOG.A | NYSE | ~$8.0B | CoreMotion magnetic pumps for direct-to-chip liquid cooling | Aerospace-heritage precision fluid control; variable-speed, leak detection, zero maintenance (no wear parts); critical for CDU pump reliability |
| Alfa Laval | ALFA | Nasdaq Stockholm | ~$21.4B | Heat exchangers and thermal systems for DC liquid cooling | Industrial heat transfer specialist entering DC segment; plate heat exchangers for CDUs and facility cooling loops |
| Panasonic (cooling pumps) | 6752 | TSE | ~$25.0B (group) | Compact cooling pumps for AI DCs; 75% flow rate increase (40 to 70 L/min) | 70 years pump technology; 400kW and 800kW CDU models for EU market (March 2026); redundant pump systems |
| Jentech Precision | 3653 | TWSE | ~$17.0B | Vapor chamber lids, microchannel lids; top-3 globally in vapor chambers | NVIDIA/AMD/Intel partner. MCL (liquid cooling solution) expected 2027. Critical for next-gen GPU thermal density. |
| Kaori Heat Treatment | 8996 | TWSE | ~$3.0B | Plate heat exchangers (PHE); Asia’s largest manufacturer; CDMs and CDUs for NVIDIA GB200/GB300 | Liquid cooling products reaching ~50% of revenue by 2027 (from 20% currently). Stock +498% last year. |
| Auras Technology | 3324 | TPEx | ~$3.1B | Thermal ground planes (TGPs), 3D vapor chambers for GPU heat management | FY2025 revenue NT$23.3B (+48% YoY). AMD-certified. Critical component inside GPU coolers. |
| Modine Manufacturing | MOD | NYSE | ~$14.7B | Liquid cooling heat exchangers, CDUs, and thermal management products for data centers | Data Center segment revenue growing 40%+; rear-door heat exchangers and facility-level cooling for AI DCs |
| Trane Technologies | TT | NYSE | ~$107B | Industrial chillers, precision HVAC, facility-level cooling systems for data centers | Acquired LiquidStack (two-phase immersion cooling); chiller plants serving hyperscaler DC campuses |
15.5 Bottleneck Analysis
Air cooling reaching physical limits (SEVERE, structural): Air cooling cannot economically remove heat from racks above 30-40 kW. AI racks operate at 100-120 kW today and are heading toward 200+ kW. This is not an optimization problem; it is a physics constraint. Every new AI data center must incorporate liquid cooling. The installed base of air-cooled data centers cannot support next-generation GPU deployments without retrofit or replacement. This structural shift drives the 20-32% CAGR projections for liquid cooling 69.
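The physics constraint can be checked with the same heat balance applied to air, whose heat capacity per unit volume is roughly 3,500x smaller than water's. The temperature rise and rack powers below are assumed, typical values:

```python
# Airflow needed to carry away rack heat: Q = rho * V_dot * cp * dT.

RHO_AIR = 1.2      # kg/m^3 at room conditions
CP_AIR = 1005.0    # J/(kg*K), specific heat of air

def airflow_cfm(heat_w: float, delta_t_k: float) -> float:
    """Airflow in cubic feet per minute to remove heat_w at a delta_t_k rise."""
    v_dot = heat_w / (RHO_AIR * CP_AIR * delta_t_k)  # volumetric flow, m^3/s
    return v_dot * 35.315 * 60.0                     # m^3/s -> CFM

# Assume a 15 K rise between cold aisle and hot aisle.
for rack_kw in (10, 40, 120):
    print(f"{rack_kw:>4} kW rack -> {airflow_cfm(rack_kw * 1000, 15):,.0f} CFM")
```

A 10 kW rack needs on the order of 1,200 CFM, which fans handle easily; a 120 kW rack needs roughly 14,000 CFM through a single rack footprint, which is not practically achievable. That step change, not vendor preference, is what forces the liquid transition.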
Liquid cooling deployment expertise (MODERATE-HIGH): Liquid cooling requires plumbing, leak detection, fluid management, and maintenance skills that traditional data center operators do not possess. The services segment is the fastest-growing part of the North American liquid cooling market (23.5% CAGR) precisely because operators lack in-house expertise. A shortage of qualified liquid cooling installers and technicians could slow deployment even when equipment is available 13.
Cold plate manufacturing scale (MODERATE): CoolIT has shipped millions of cold plates, but demand is growing faster than production capacity. Each GPU in a liquid-cooled system requires its own cold plate with precise mating to the chip surface. As GPU architectures change with each generation (Hopper to Blackwell to Rubin), cold plates must be redesigned and requalified. The 12-18 month design cycle for new cold plates means that cooling solutions must be developed in parallel with GPU designs, creating coordination dependencies between cooling vendors and GPU makers.
Dielectric fluid supply for immersion (MODERATE, rising): Immersion cooling requires large volumes of specialized dielectric fluids. 3M/Solventum began phasing out Novec 7500 fluorinated fluids (January 2025) due to PFAS regulatory pressure, removing the incumbent two-phase immersion fluid from the market. Replacement suppliers (Engineered Fluids, Shell, Chemours, Cargill bio-based fluids) are ramping but have limited production capacity for data center-grade dielectrics. The Chemours/2CRSi partnership (February 2026) signals the industry scrambling for Novec alternatives. This creates a potential supply gap for two-phase immersion deployments in 2026-2027.
Emerging direct-to-chip (DTC) startups (MODERATE, acceleration risk): Beyond incumbent CoolIT and Vertiv, a wave of DTC startups is receiving strategic investment from systems integrators: Accelsius (Bell Labs two-phase NeuCool technology, received Johnson Controls strategic investment October 2025, achieving 250+ W/cm² heat flux removal), JetCool (acquired by Flex, SmartPlate microjet system, 300 kW CDU 14), and DCX (EU-based, open standards approach). These companies address the thermal density challenge of next-gen GPUs (Rubin: 1,000W+ per chip) that incumbent solutions may not handle without architectural changes. If thermal density outpaces cooling innovation, GPU clock speeds or rack densities must be derated.

Water availability (MODERATE in specific locations): Liquid cooling ultimately rejects heat to the environment, often through evaporative cooling towers that consume water. A 100 MW data center can consume 1-2 million gallons of water per day (derived from IEA data; see Chapter 1, Section 1.3). In water-stressed regions (Phoenix, Las Vegas, parts of Texas), this creates siting constraints. Dry cooling alternatives exist but are less efficient and more expensive.
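The 1-2 million gallons/day figure can be cross-checked from the latent heat of vaporization. This sketch assumes all heat is rejected evaporatively and ignores blowdown and drift losses, so actual consumption varies with climate and cycles of concentration:

```python
# Evaporative cooling water use: each kg of water evaporated absorbs
# ~2.26 MJ (latent heat of vaporization at tower conditions).

LATENT_HEAT = 2.26e6   # J/kg
L_PER_GALLON = 3.785   # litres per US gallon

def gallons_per_day(heat_mw: float) -> float:
    """Daily water evaporated to reject heat_mw megawatts of heat."""
    kg_per_s = heat_mw * 1e6 / LATENT_HEAT
    litres_per_day = kg_per_s * 86_400   # 1 kg of water is ~1 L
    return litres_per_day / L_PER_GALLON

# 100 MW facility rejecting all of its heat through cooling towers:
print(f"{gallons_per_day(100):,.0f} gal/day")
```

The result is roughly 1.0 million gallons/day for 100 MW of heat rejection, consistent with the 1-2 million gallon range once blowdown and treatment losses are added on top of pure evaporation.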
15.6 Risks
Liquid cooling commoditization: As the market grows from niche to mainstream, pricing pressure will intensify. Traditional HVAC giants (Daikin, Carrier, Trane/LiquidStack) are entering the market through acquisitions, bringing massive manufacturing scale and distribution networks. This could compress margins for pure-play liquid cooling specialists like CoolIT and Asetek. The acquisitions of Motivair by Schneider, Chilldyne by Daikin, and LiquidStack by Trane all signal this dynamic.
Immersion cooling displaces DTC: If two-phase immersion cooling achieves the reliability and serviceability needed for hyperscale deployment, it could leapfrog direct-to-chip as the preferred architecture for highest-density racks. Immersion offers better heat removal per watt and simplifies the per-server plumbing. However, immersion requires custom server designs and is harder to service (you cannot hot-swap a server in a fluid tank as easily as in a rack). The base case is that DTC dominates through 2028, with immersion growing in niche high-density applications.
Efficiency improvements reduce cooling load: More efficient GPUs (lower watts per FLOP) would reduce the heat generated per unit of compute, potentially extending the life of air cooling and slowing liquid cooling adoption. NVIDIA’s Blackwell architecture delivers better performance per watt than Hopper. However, absolute power consumption per GPU keeps climbing (H100: 700W; B200: ~1,000W) and is projected to increase further with Rubin, and cluster sizes are growing, so total heat generation per data center is still increasing. Efficiency improvements slow the rate of cooling demand growth but do not reverse it.
Retrofit costs limit brownfield adoption: Converting an existing air-cooled data center to liquid cooling is expensive and disruptive: new piping, CDUs, leak detection, and potentially structural modifications to support the weight of water. Many enterprise data center operators will choose to build new liquid-cooled facilities rather than retrofit existing ones. This creates a two-speed market: greenfield AI data centers deploy liquid cooling from day one, while brownfield enterprise data centers remain air-cooled for years.
First principles check: Why can’t we just use bigger fans? Because heat transfer from a solid surface to air is fundamentally limited by the thermal conductivity of air (~0.026 W/mK) versus water (~0.6 W/mK), a 23x difference. To remove 700W from a GPU die measuring ~800 mm², air cooling requires enormous heatsinks and airflow volumes that physically cannot fit in a dense server rack. Liquid cooling is not a preference; it is a thermodynamic necessity at these power densities.
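The fan question can also be answered numerically with Newton's law of cooling, Q = h·A·ΔT. The heat transfer coefficients below are order-of-magnitude assumptions for forced-air convection and single-phase water cold plates, not measured values for any specific product:

```python
# Surface area needed to shed heat at a fixed temperature rise: A = Q / (h * dT).

H_FORCED_AIR = 100.0      # W/(m^2*K), typical forced-air convection (assumed)
H_COLD_PLATE = 10_000.0   # W/(m^2*K), single-phase water microchannel (assumed)

def area_mm2(heat_w: float, h: float, delta_t_k: float) -> float:
    """Heat-shedding surface area (mm^2) required at coefficient h and rise dT."""
    return heat_w / (h * delta_t_k) * 1e6   # m^2 -> mm^2

DIE_MM2 = 800.0  # approximate GPU die area from the text above

# 700 W dissipated at an allowed 40 K surface-to-coolant rise:
for name, h in [("forced air", H_FORCED_AIR), ("water cold plate", H_COLD_PLATE)]:
    a = area_mm2(700, h, 40)
    print(f"{name:>16}: {a:>9,.0f} mm^2  (~{a / DIE_MM2:.0f}x die area)")
```

Under these assumptions, air needs roughly 175,000 mm² of effective surface (about 220x the die, hence massive finned heatsinks), while a water cold plate needs only ~1,750 mm², barely twice the die footprint. The 23x conductivity gap compounds into a two-orders-of-magnitude difference in required surface area.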