Carbon-aware GPU cloud scheduling places AI workloads in the regions, and at the times, where the electrical grid's carbon intensity is lowest. This article explains the algorithms, infrastructure, and economics behind the approach.

Carbon-aware GPU cloud scheduling represents a fundamental shift in how AI compute infrastructure operates. Traditional GPU schedulers optimize for performance and cost — placing workloads wherever GPUs are available at the lowest price. Carbon-aware scheduling adds a third dimension: the carbon intensity of the electricity powering those GPUs. By integrating real-time grid carbon data into scheduling decisions, carbon-aware systems can reduce the carbon footprint of AI workloads by 50-90% without sacrificing performance or increasing cost.
The approach is particularly impactful for GPU workloads because AI training and inference are among the most energy-intensive computational tasks in modern data centers. A single NVIDIA H100 GPU consumes approximately 700 watts under load, and a training cluster of 256 GPUs draws roughly 180 kilowatts — equivalent to the electricity consumption of about 150 average homes. At this scale, the carbon difference between running on a coal-heavy grid (800 gCO2/kWh) and a renewable-heavy grid (50 gCO2/kWh) is enormous: roughly 3.4 tonnes of CO2 per day versus roughly 0.2 tonnes per day for the same computational output.
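These figures follow from straightforward arithmetic. A minimal back-of-envelope sketch in plain Python, using only the numbers quoted above (GPU power only, ignoring cooling and other system overhead):

```python
# Daily emissions for a 256-GPU H100 cluster at ~700 W per GPU.
GPU_POWER_W = 700
NUM_GPUS = 256

cluster_kw = GPU_POWER_W * NUM_GPUS / 1000   # ~179 kW of GPU draw
daily_kwh = cluster_kw * 24                  # ~4,300 kWh per day

def daily_tonnes_co2(grid_g_per_kwh: float) -> float:
    """Tonnes of CO2 emitted per day at a given grid carbon intensity."""
    return daily_kwh * grid_g_per_kwh / 1_000_000

coal_heavy = daily_tonnes_co2(800)   # coal-heavy grid: ~3.4 t/day
renewable = daily_tonnes_co2(50)     # renewable-heavy grid: ~0.2 t/day
```

The 16x ratio between the two grids is the headline: the same computation, moved to cleaner electricity, emits a small fraction of the CO2.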
Carbon-aware GPU scheduling uses a multi-objective optimization that balances three factors: compute performance (GPU availability, memory capacity, interconnect bandwidth), cost (electricity price, GPU rental rate), and carbon intensity (real-time grid emissions data). The algorithm assigns a composite score to each potential placement and selects the option that minimizes carbon intensity while meeting performance and cost constraints.
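The placement step described above can be sketched as a constrained selection: filter candidates on the performance and cost constraints, then pick the lowest-carbon option. All names and fields below are hypothetical illustrations, not actual HarchOS APIs:

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class Placement:
    """A candidate GPU placement (fields are illustrative)."""
    hub: str
    gpus_free: int
    cost_per_gpu_hour: float   # USD
    carbon_g_per_kwh: float    # real-time grid carbon intensity

def choose_placement(candidates: List[Placement],
                     gpus_needed: int,
                     max_cost: float) -> Optional[Placement]:
    """Select the lowest-carbon placement that meets the
    capacity (performance) and cost constraints."""
    feasible = [p for p in candidates
                if p.gpus_free >= gpus_needed
                and p.cost_per_gpu_hour <= max_cost]
    if not feasible:
        return None
    return min(feasible, key=lambda p: p.carbon_g_per_kwh)

candidates = [
    Placement("Casablanca", 64, 2.0, 300),
    Placement("Dakhla", 128, 2.2, 40),
    Placement("Tangier", 8, 1.8, 120),   # too few free GPUs
]
best = choose_placement(candidates, gpus_needed=32, max_cost=2.5)
```

Treating carbon as the minimization objective and performance/cost as hard constraints is one simple way to realize the multi-objective trade-off; a weighted composite score over all three factors is an equally valid formulation.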
HarchOS implements this through a federated scheduling model where each of the five Moroccan hubs runs an independent scheduler that cooperates with peers. The scheduler receives real-time carbon intensity feeds from Morocco's grid operator, cross-referenced with on-site solar and wind generation data from Harch Energy's renewable installations. When carbon intensity at one hub drops below the fleet average — for example, when midday solar pushes the Dakhla hub to near-zero carbon intensity — the scheduler migrates eligible workloads to that location.
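The migration trigger, a hub dropping below the fleet-average carbon intensity, can be illustrated with a small helper. The hub names and the dict-based intensity feed are assumptions for illustration, not the actual HarchOS interface:

```python
from typing import Dict, List

def migration_targets(hub_intensity: Dict[str, float]) -> List[str]:
    """Return hubs whose current carbon intensity (gCO2/kWh) is
    below the fleet average, cleanest first. These are candidates
    to receive migrated workloads."""
    avg = sum(hub_intensity.values()) / len(hub_intensity)
    below = [hub for hub, ci in hub_intensity.items() if ci < avg]
    return sorted(below, key=hub_intensity.get)

# Midday solar pushes one hub to near-zero intensity (example values).
feed = {"Dakhla": 12.0, "Ouarzazate": 380.0, "Casablanca": 410.0}
targets = migration_targets(feed)
```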
Carbon-aware scheduling operates in two dimensions. Temporal scheduling defers flexible workloads to periods of lower carbon intensity. AI model training, which can run at any time over a period of days or weeks, is an ideal candidate for temporal scheduling — it can be paused during high-carbon periods and resumed when renewable generation peaks. Spatial scheduling routes workloads to regions with lower carbon intensity. A model that needs to train continuously can be migrated between hubs as carbon intensity shifts, following the sun across Morocco's solar corridor.
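A minimal sketch of the temporal dimension: defer a flexible job while the grid is dirty, but only if the forecast promises a cleaner hour within the job's deferral budget. The threshold value and the hourly-forecast list are assumptions for illustration:

```python
from typing import List

def should_run(now_intensity: float,
               forecast: List[float],
               threshold: float = 100.0,
               max_defer_hours: int = 12) -> bool:
    """Decide whether a flexible job runs now or is deferred.

    now_intensity -- current grid carbon intensity (gCO2/kWh)
    forecast      -- predicted hourly intensities, nearest hour first
    """
    if now_intensity <= threshold:
        return True  # grid is already clean enough
    # If no cleaner hour appears within the deferral budget,
    # run now rather than wait for nothing.
    upcoming = forecast[:max_defer_hours]
    return min(upcoming, default=now_intensity) >= now_intensity

run_now = should_run(80.0, [200.0, 300.0])          # already clean
defer = should_run(400.0, [300.0, 200.0, 90.0])     # cleaner hour ahead
forced = should_run(400.0, [500.0, 600.0])          # no cleaner window
```

Spatial scheduling then handles the complementary case: a job that cannot pause is instead moved to whichever hub is cleanest at that moment.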
The combination is powerful: temporal scheduling reduces the average carbon intensity of flexible workloads by 40-60%, while spatial scheduling adds another 20-30% reduction for workloads that must run continuously. Together, they achieve the 89% carbon intensity reduction that Harch Intelligence delivers across its fleet — from the industry average of approximately 450 gCO2/kWh to approximately 47 gCO2/kWh.
The impact of carbon-aware GPU scheduling is measured through carbon intensity metrics reported at the workload level. Every job processed through HarchOS receives a carbon report detailing total energy consumed, average carbon intensity, and total CO2 emissions. This enables customers to include actual emissions data in their sustainability reports rather than relying on industry averages. The measurement methodology follows the GHG Protocol Scope 2 guidelines, using location-based and market-based emissions factors. Harch Intelligence publishes quarterly carbon intensity reports audited by independent third parties, providing transparency that few cloud providers match.
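The per-job report described here reduces to aggregating metered energy against interval-level grid intensity. A minimal sketch, assuming (kWh, gCO2/kWh) metering intervals rather than HarchOS's actual data model:

```python
from typing import Dict, List, Tuple

def carbon_report(intervals: List[Tuple[float, float]]) -> Dict[str, float]:
    """Build a per-job carbon report from metering intervals.

    Each interval is (energy_kwh, grid_intensity_g_per_kwh).
    Returns total energy, energy-weighted average intensity,
    and total CO2 in kilograms.
    """
    total_kwh = sum(kwh for kwh, _ in intervals)
    total_g = sum(kwh * ci for kwh, ci in intervals)
    return {
        "energy_kwh": total_kwh,
        "avg_intensity_g_per_kwh": total_g / total_kwh,
        "co2_kg": total_g / 1000.0,
    }

# A job that ran half its energy on a clean grid, half on a dirtier one.
report = carbon_report([(100.0, 50.0), (100.0, 150.0)])
```

Weighting the average by energy, rather than by wall-clock time, is what makes the figure suitable for GHG Protocol Scope 2 location-based reporting: each kWh is matched to the grid intensity at the moment it was consumed.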