The transition toward liquid-cooled AI infrastructure has significantly improved heat dissipation for high-density GPU systems. However, existing thermal management architectures still rely on localized control policies that respond to temperature excursions only after they occur, which limits their ability to optimize cooling efficiency, computational performance, and infrastructure sustainability simultaneously. The absence of an integrated decision-making mechanism that can anticipate thermal evolution across compute, cooling, and facility subsystems remains a major challenge for next-generation AI data centers. This study proposes an Intelligent Thermal Management Framework (ITMF) that unifies physics-informed thermal modeling, spatiotemporal predictive analytics, and optimization-driven cooling orchestration within a closed-loop control architecture. The framework constructs a real-time thermal state representation of GPU clusters by fusing telemetry from liquid-cooling loops, rack-level sensors, power delivery systems, and workload schedulers. A hybrid forecasting model predicts short-term thermal dynamics and identifies emerging hotspots and their propagation before performance degrades. These predictions drive a multi-objective optimization engine that jointly regulates coolant circulation, pump operation, workload distribution, and rack-level power allocation to minimize thermal imbalance, cooling energy demand, and carbon emissions while maintaining application performance and hardware reliability. Unlike existing thermal control approaches that optimize individual cooling components in isolation, the proposed framework coordinates computational and cooling resources as an integrated cyber-physical system, enabling adaptive thermal resilience under continuously changing AI workloads.
The framework establishes a scalable foundation for sustainable AI infrastructure management by improving cooling effectiveness, reducing operational expenditure, extending hardware service life, and supporting environmentally responsible large-scale AI computing.
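The predict-then-optimize loop summarized above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a single rack, a lumped first-order thermal model in place of the hybrid forecasting model, and a weighted-sum cost in place of the multi-objective optimization engine. Every name and constant (`RackState`, `forecast_temp`, `choose_flow`, the heat capacity, conductance, and weights) is hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass

# All constants are illustrative assumptions, not values from the paper.
HEAT_CAPACITY_J_PER_K = 5000.0   # lumped thermal mass of the rack
COOLANT_TEMP_C = 30.0            # coolant supply temperature
FULL_FLOW_CONDUCTANCE_W_PER_K = 60.0  # heat removal per kelvin at 100% flow
TEMP_LIMIT_C = 80.0              # temperature above which a penalty applies
PUMP_MAX_POWER_W = 200.0         # pump power draw at 100% flow


@dataclass
class RackState:
    temp_c: float   # current rack temperature (deg C)
    power_w: float  # current electrical load, assumed to become heat (W)


def forecast_temp(state, flow_fraction, horizon_s=60.0, dt_s=1.0):
    """Forecast temperature by Euler-integrating a first-order thermal model."""
    temp = state.temp_c
    for _ in range(int(horizon_s / dt_s)):
        removed_w = (FULL_FLOW_CONDUCTANCE_W_PER_K * flow_fraction
                     * (temp - COOLANT_TEMP_C))
        temp += dt_s * (state.power_w - removed_w) / HEAT_CAPACITY_J_PER_K
    return temp


def pump_power_w(flow_fraction):
    """Approximate pump power with the cube-of-flow affinity relation."""
    return PUMP_MAX_POWER_W * flow_fraction ** 3


def choose_flow(state, candidates=(0.2, 0.4, 0.6, 0.8, 1.0),
                w_thermal=50.0, w_energy=1.0):
    """Pick the candidate flow with the lowest weighted-sum cost.

    The cost trades predicted temperature overshoot against pump energy,
    so cooling is raised before the limit is crossed rather than after.
    """
    def cost(flow):
        overshoot = max(0.0, forecast_temp(state, flow) - TEMP_LIMIT_C)
        return w_thermal * overshoot + w_energy * pump_power_w(flow)

    return min(candidates, key=cost)


if __name__ == "__main__":
    print(choose_flow(RackState(temp_c=40.0, power_w=500.0)))
    print(choose_flow(RackState(temp_c=78.0, power_w=3000.0)))
```

Under these assumed constants, a lightly loaded rack is assigned the lowest candidate flow because no overshoot is forecast and pump energy dominates the cost, while a heavily loaded rack near the limit is assigned a higher flow before the limit is actually exceeded. A fuller treatment would add workload placement and power capping as further decision variables.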
@article{a1572026ijsea15071007,
Title = "Intelligent Thermal Management Framework Combining Liquid Cooling Technologies and Predictive Analytics for Sustainable AI Infrastructure Operations",
Journal = "International Journal of Science and Engineering Applications (IJSEA)",
Volume = "15",
Issue = "7",
Pages = "29--43",
Year = "2026",
Author = "Abiodun Victor Oyeleke and Blessing Adebimpe Ojajuni and Felix Denkyi and Francis Chukwudi Eze"}