Sustainable Development & Local AI

No greenwashing, just numbers to show you how good we are!

The problem

AI consumes. A lot. That's not a problem, except when the bill arrives.

We're not going to give you the "save the planet with our green AI" pitch when we're just running the same models as everyone else. But we can explain why AEGIS IA local infrastructure consumes significantly less than a cloud subscription over the long term.

Ready for some physics and math?

What cloud LLMs really consume

A recent academic study (Jegham et al., 2025) benchmarked the real-world consumption of major commercial LLMs, and the numbers are scary:

📊 Consumption per query (cloud models)

GPT-4o: 0.43 Wh per short query
o3 (OpenAI): 39.2 Wh per long prompt (over 70x more than GPT-4.1 nano)
DeepSeek-R1: 33.6 Wh per long prompt

Source: Jegham et al. (2025), "How Hungry is AI? Benchmarking Energy, Water, and Carbon Footprint of LLM Inference", arXiv:2505.09598

To put this in context: a single long query to o3 consumes as much electricity as running a 65-inch LED TV for 20-30 minutes.

Now scale GPT-4o alone to 700 million queries per day (a conservative estimate for 2025). The study's annual figures:

Annual electricity consumption: 391,509 to 463,269 MWh — equivalent to 35,000 US homes or nearly 77,000 European homes (comparing Enedis and US EIA data)
Evaporated water (datacenter cooling): 1.33 to 1.58 million kiloliters — enough to fill 500 Olympic pools
Carbon emissions: 138,125 to 163,441 tons of CO₂ — equivalent to 30,000 gasoline cars

And we're talking about one model. Add Gemini, Mistral, all the others...

Jevons Paradox applied to AI

Know the Jevons paradox? The more efficient you make something, the more people use it (think of a widened road that fills right back up at rush hour), so it ends up consuming more despite the optimization.

That's exactly what's happening with cloud LLMs:

GPT-4o is more efficient per query than GPT-3
So people use it 10x more
Result: overall consumption explodes

A Google search consumes 0.30 Wh. A GPT-4o query consumes 0.43 Wh. That's over 40% more. Not huge? Now multiply by billions of daily queries.
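To make the "multiply by billions" step concrete, here is a minimal Python sketch using only the two per-query figures quoted above. The query volume passed in at the bottom is an illustrative assumption, not a measured number.

```python
# Per-query figures quoted in the text, in watt-hours.
GOOGLE_SEARCH_WH = 0.30
GPT4O_QUERY_WH = 0.43


def relative_increase(baseline_wh: float, other_wh: float) -> float:
    """Return how much larger other_wh is than baseline_wh, as a fraction."""
    return (other_wh - baseline_wh) / baseline_wh


def extra_energy_mwh(queries_per_day: float) -> float:
    """Extra daily energy (MWh) when each query costs the GPT-4o figure
    instead of the Google search figure."""
    extra_wh = (GPT4O_QUERY_WH - GOOGLE_SEARCH_WH) * queries_per_day
    return extra_wh / 1_000_000  # Wh -> MWh


if __name__ == "__main__":
    increase = relative_increase(GOOGLE_SEARCH_WH, GPT4O_QUERY_WH)
    print(f"Relative increase per query: {increase:.0%}")
    # 1 billion queries/day is a hypothetical volume chosen for illustration.
    print(f"Extra energy at 1e9 queries/day: {extra_energy_mwh(1e9):.0f} MWh/day")
```

With these inputs the gap is about 43% per query, which at one billion queries a day adds up to roughly 130 MWh of extra electricity every day.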

The hidden datacenter infrastructure

What they don't tell you: the numbers above are significantly UNDERESTIMATED.

Why? Because datacenters don't just consume power to run GPUs. They also consume for:

Cooling: 40-54% of total datacenter consumption
Networks and infrastructure: routers, switches, active cabling
Redundancy: backup power, backup systems
Operational overhead: lighting, security, monitoring
Employee-related overhead (to a lesser extent): food service, personal computers, transportation

The average PUE (Power Usage Effectiveness) of a datacenter is 1.5 to 2.0. PUE is total facility energy divided by IT equipment energy, so for every 1 watt consumed by the GPUs and servers, the facility draws an additional 0.5 to 1 watt just to keep them running.

Source: Patterson et al. (2021), "Carbon Emissions and Large Neural Network Training"
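The PUE arithmetic fits in a few lines. This is a minimal sketch, assuming the standard definition (total facility energy divided by IT equipment energy); the function names and the 1,000 Wh input are ours, chosen for illustration.

```python
def facility_energy_wh(it_energy_wh: float, pue: float) -> float:
    """Total energy drawn by the facility for a given IT load."""
    if pue < 1.0:
        raise ValueError("PUE cannot be below 1.0")
    return it_energy_wh * pue


def overhead_wh(it_energy_wh: float, pue: float) -> float:
    """Energy spent on everything that is not IT equipment
    (cooling, power distribution, lighting, and so on)."""
    return facility_energy_wh(it_energy_wh, pue) - it_energy_wh


if __name__ == "__main__":
    it_load_wh = 1_000  # hypothetical IT load: 1 kWh of GPU/server work
    for pue in (1.5, 2.0):
        total = facility_energy_wh(it_load_wh, pue)
        extra = overhead_wh(it_load_wh, pue)
        print(f"PUE {pue}: total {total:.0f} Wh, overhead {extra:.0f} Wh")
```

At a PUE of 1.5, 1 kWh of GPU work costs 1.5 kWh at the meter; at 2.0, it costs 2 kWh.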

So how much does it really cost? (the crux of the matter)

Let's not kid ourselves: the bill is what motivates people.

💰 Cost comparison Cloud API vs Local (over 3 years)

Scenario: 50 employees, moderate usage (500 req/day/person)

Period	Cloud API	Local infrastructure
Year 1	~€60,000 (tokens)	~€80,000 (hardware + setup)
Year 2	~€60,000	~€15,000 (electricity + maintenance)
Year 3	~€60,000	~€15,000
TOTAL 3 YEARS	€180,000	€110,000

Break-even: 18-24 months

Source: Calculated from Lenovo Press (2025), "On-Premise vs Cloud: Generative AI Total Cost of Ownership"
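You can check the table and the break-even point yourself. The sketch below is a simplification under stated assumptions: the cloud bill is spread evenly over the year, the whole Year 1 local cost (€80,000) is paid up front, and local running costs (€15,000/year) start in month 13, as the table implies. Function names are ours.

```python
from typing import Optional

CLOUD_PER_YEAR = 60_000          # EUR, from the table
LOCAL_UPFRONT = 80_000           # EUR, Year 1 (hardware + setup), assumed paid at month 0
LOCAL_RUNNING_PER_YEAR = 15_000  # EUR, Years 2 and 3 (electricity + maintenance)


def cumulative_cloud(month: int) -> float:
    """Cumulative cloud spend after `month` months, billed evenly."""
    return CLOUD_PER_YEAR / 12 * month


def cumulative_local(month: int) -> float:
    """Cumulative local spend: upfront cost, then running costs from month 13."""
    running_months = max(0, month - 12)
    return LOCAL_UPFRONT + LOCAL_RUNNING_PER_YEAR / 12 * running_months


def break_even_month(horizon_months: int = 36) -> Optional[int]:
    """First month where cumulative cloud spend reaches cumulative local spend."""
    for month in range(1, horizon_months + 1):
        if cumulative_cloud(month) >= cumulative_local(month):
            return month
    return None  # no break-even within the horizon


if __name__ == "__main__":
    print("Cloud total over 36 months:", cumulative_cloud(36))
    print("Local total over 36 months:", cumulative_local(36))
    print("Break-even month:", break_even_month())
```

Under these assumptions the 36-month totals match the table (€180,000 vs €110,000) and the curves cross in month 18, the low end of the 18-24 month range. Spread the hardware payment differently or add personnel costs and the crossing point moves later.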

And be careful: these numbers assume your API prices stay stable. We suspect they won't.

OpenThing can decide tomorrow to double their prices. You can't do anything. You're held hostage.

With local infrastructure, your average cost per query decreases over time, because the hardware is a fixed cost spread over every query you run. The more you use it, the less it costs per query. And that's exactly the opposite of cloud.
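Here is the same idea as a minimal sketch, reusing the scenario above (50 employees, 500 requests/day/person) and the yearly costs from the table. It assumes a constant query volume every day of the year, which is a simplification.

```python
QUERIES_PER_DAY = 50 * 500  # 50 employees x 500 requests/day/person


def queries_after(years: int) -> int:
    """Total queries served after `years` full years at constant volume."""
    return QUERIES_PER_DAY * 365 * years


def local_cost_per_query(years: int) -> float:
    """Average local cost per query: EUR 80,000 in year 1, then EUR 15,000/year."""
    total_cost = 80_000 + 15_000 * (years - 1)
    return total_cost / queries_after(years)


def cloud_cost_per_query(years: int) -> float:
    """Average cloud cost per query at a flat EUR 60,000/year."""
    return 60_000 * years / queries_after(years)


if __name__ == "__main__":
    for years in (1, 2, 3):
        local = local_cost_per_query(years)
        cloud = cloud_cost_per_query(years)
        print(f"After {years} year(s): local {local:.4f} EUR/query, cloud {cloud:.4f} EUR/query")
```

With these inputs the local average falls from just under 0.9 cents per query after one year to about 0.4 cents after three, while cloud stays flat at roughly 0.66 cents (and that is before any price increase).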

Carbon footprint, let's talk seriously

An average US datacenter uses an electricity mix with ~60% fossil fuels (coal, gas). A European datacenter is more like 30-40% depending on the country.

Your local server in Lorraine? You plug into the French electricity mix: ~70% nuclear + 20% renewables. CO₂ emissions: 50-60 g/kWh.

An AWS datacenter in Virginia (us-east-1 region, the most common)? Electricity mix: 40% gas, 35% coal. Emissions: 350-400 g/kWh.

Same workload, roughly 6 to 8 times less CO₂ in France than in Virginia.
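The comparison is a single multiplication: energy used times the carbon intensity of the grid. The sketch below uses the two intensity ranges quoted above; the 10,000 kWh/year server consumption is a hypothetical figure for illustration only.

```python
from typing import Tuple

# Grid carbon intensity ranges quoted in the text, in g CO2 per kWh.
FRANCE_G_PER_KWH = (50, 60)
VIRGINIA_G_PER_KWH = (350, 400)


def emissions_kg(energy_kwh: float, intensity_g_per_kwh: float) -> float:
    """CO2 emitted (kg) for a given energy use and grid intensity."""
    return energy_kwh * intensity_g_per_kwh / 1000  # g -> kg


def ratio_range(high_mix: Tuple[float, float], low_mix: Tuple[float, float]) -> Tuple[float, float]:
    """Smallest and largest possible ratio between two intensity ranges."""
    smallest = high_mix[0] / low_mix[1]
    largest = high_mix[1] / low_mix[0]
    return smallest, largest


if __name__ == "__main__":
    yearly_kwh = 10_000  # hypothetical annual consumption of one local server
    for name, (low, high) in (("France", FRANCE_G_PER_KWH), ("Virginia", VIRGINIA_G_PER_KWH)):
        print(f"{name}: {emissions_kg(yearly_kwh, low):.0f} to "
              f"{emissions_kg(yearly_kwh, high):.0f} kg CO2/year")
    smallest, largest = ratio_range(VIRGINIA_G_PER_KWH, FRANCE_G_PER_KWH)
    print(f"Virginia emits {smallest:.1f}x to {largest:.1f}x more for the same energy")
```

Taking the extreme ends of both ranges, the ratio runs from about 5.8 to 8, so "6 to 7 times" is a fair middle estimate.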

We're not greenwashing here. We're just noting that plugging a server into EDF's grid emits less than an AWS datacenter in Virginia.

Hidden costs of cloud

What we often forget to count in cloud TCO:

Price increases: OpenAI, Anthropic, Google can change their prices whenever they want. You're dependent.
Output tokens: 2-4x more expensive than input tokens. If your LLM generates a lot of text, expect a surprise.
Rate limiting penalties: exceed your quota, pay 2-3x more per token.
Egress fees: getting data out of the cloud costs a fortune (AWS loves this).
Enterprise support: +15-30% on the bill if you want a decent SLA.

Result: your monthly bill of €5,000 can quickly become €8,000 without you changing anything.

Source: MPT Solutions (2025), "The Hidden Infrastructure Cost of Running Local LLMs vs Cloud APIs"

Local infrastructure isn't free either

Let's be honest. Local infrastructure costs too. Here's what you really need to count:

Hardware: €40,000 to €80,000 for a decent config (2-4 professional GPUs like L40S or A100)
Electricity: ~€5,000-10,000/year depending on usage
Cooling: If your server room is poorly ventilated, plan for AC. €2,000-5,000/year.
Maintenance: Hardware refresh every 3-5 years
Personnel: Either you already have IT engineers, or you need to train/recruit

But once amortized (18-24 months), your marginal cost per query is negligible compared to cloud.

And most importantly: you're in control. No surprises, no dependency, no unexpected (and often brutal) increases.

Because increases are coming. Large LLM services like ChatGPT currently bring in about $3.5 for every $5 invested, and that gap has to be closed somehow.

The hybrid model (the real smart idea)

We're not going to lie: 100% local isn't always optimal. We have our challenges too.

Hybrid, real agility:

Simple and recurring tasks (FAQ, classification, extraction) → Local 7B-13B model. Marginal cost almost zero.
Complex and occasional tasks (multi-step reasoning, creativity) → Cloud API if needed. You only pay for the exceptional.
Sensitive data → Local, always. Non-negotiable.
Load spikes → Burst to cloud if your local infrastructure saturates. But it's the exception, not the rule.

Result: you combine the best of both worlds. Controlled costs, optimal performance, preserved sovereignty.
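The four rules above can be written as a small routing function. This is a minimal sketch, not a production router: the `Task` fields, the task kinds and the `local_saturated` flag are all hypothetical names chosen for illustration, and a real deployment would get the saturation signal from its own monitoring.

```python
from dataclasses import dataclass
from enum import Enum


class Target(Enum):
    LOCAL = "local"
    CLOUD = "cloud"


@dataclass
class Task:
    kind: str                # e.g. "faq", "classification", "reasoning"
    sensitive: bool = False  # True if the payload contains sensitive data


# Simple, recurring task kinds that a local 7B-13B model is assumed to handle.
SIMPLE_KINDS = {"faq", "classification", "extraction"}


def route(task: Task, local_saturated: bool = False) -> Target:
    """Pick where a task runs, following the hybrid rules in order of priority."""
    # Rule 1: sensitive data stays local, always, even under load.
    if task.sensitive:
        return Target.LOCAL
    # Rule 2: during a load spike, non-sensitive work bursts to the cloud.
    if local_saturated:
        return Target.CLOUD
    # Rule 3: simple and recurring tasks run on the local model.
    if task.kind in SIMPLE_KINDS:
        return Target.LOCAL
    # Rule 4: complex, occasional tasks go to a cloud API.
    return Target.CLOUD


if __name__ == "__main__":
    print(route(Task("faq")))
    print(route(Task("reasoning")))
    print(route(Task("reasoning", sensitive=True), local_saturated=True))
```

The order of the checks is the design choice that matters: sensitivity is tested first so that no load condition can ever push sensitive data to the cloud.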

AEGIS IA doesn't pretend to be the little green man

We don't claim to save the planet.

What we do:

We deploy local infrastructure sized for your needs (no unnecessary over-equipment)
We optimize models to reduce consumption per query (quantization, distillation if relevant)
We help you calculate your real TCO (cloud vs local) with honest numbers
We use the local electricity mix (in France = low carbon)
We avoid waste: no GPU running idle 90% of the time
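To give an idea of why quantization matters for sizing, here is a back-of-the-envelope sketch of the memory needed just to hold a model's weights. It deliberately ignores everything else (KV cache, activations, runtime overhead), so treat the results as a lower bound. The 7B and 13B sizes are the local model range mentioned earlier; the bit widths are common choices used here as assumptions.

```python
def weight_memory_gb(params_billion: float, bits_per_weight: int) -> float:
    """Approximate memory (GB, decimal) needed to store the weights alone."""
    total_bytes = params_billion * 1e9 * bits_per_weight / 8
    return total_bytes / 1e9


if __name__ == "__main__":
    for params in (7, 13):
        for bits in (16, 8, 4):
            size = weight_memory_gb(params, bits)
            print(f"{params}B model at {bits}-bit: about {size:.1f} GB of weights")
```

A 7B model goes from about 14 GB of weights at 16-bit to about 3.5 GB at 4-bit, which is the difference between needing a professional GPU and fitting on much more modest hardware.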

That's it. No carbon offset certificates. Just efficient infrastructure that consumes what it should consume, no more.

Academic sources and references

Our sources:

Jegham et al. (2025): "How Hungry is AI? Benchmarking Energy, Water, and Carbon Footprint of LLM Inference", arXiv:2505.09598 — Comparative study of 30 commercial LLMs
Patterson et al. (2021): "Carbon Emissions and Large Neural Network Training", arXiv:2104.10350 — Analysis of LLM training carbon footprint
Lenovo Press (2025): "On-Premise vs Cloud: Generative AI Total Cost of Ownership" — Detailed TCO analysis with break-even points
Scientific Reports (2024): "Reconciling the contrasting narratives on the environmental impact of large language models", Nature — LLM vs human work comparison
Strubell et al. (2019): "Energy and Policy Considerations for Deep Learning in NLP", ACL — First major study on LLM energy consumption
Venditti B. (2025): "Ranked: Electricity Use Per Capita in Major Global Economies"

Conclusion: do the math yourself!

Local AI isn't always the solution. But for 80% of companies with stable and predictable usage, it's economically and ecologically more viable than cloud over 2-3 years.

If you're spending more than €2,000/month on cloud APIs, it's high time to look into this.

We can help: no smoke and mirrors, just real numbers and a transparent TCO.

Write to us!