Energy · XAI · Published Paper · Energy & Buildings 2025

Why Is This Building
Wasting Energy?

Buildings account for one-third of global energy consumption, yet anomalies such as broken equipment and inefficient operation can go unnoticed for months. Deep learning can detect these problems, but it cannot explain them. This research gives AI the ability to say why, not just that, something is wrong — and makes those explanations consistent every time they are generated.

Energy & Buildings, Vol. 328, 2025 · DOI: 10.1016/j.enbuild.2024.115177 · Western University · December 2024
38%
Avg. Reduction in Explanation Variability
5
Building Types Tested
10
Deep Learning Models
5
XAI Techniques Evaluated
80%
Max. Variability Reduction Achieved
① The Problem
Buildings Waste Energy — And Nobody Knows Why

Imagine a hospital where a freezer runs at the wrong temperature for weeks, or an office where the HVAC silently overconsumes electricity every night. These aren't dramatic failures — they're quiet anomalies that bleed energy costs and cause equipment damage over time. AI can detect these patterns, but until now, it couldn't explain them in a consistent, trustworthy way.

Buildings Drive Global Energy Use

Residential and commercial buildings account for one-third of global electricity consumption. Anomalies like equipment faults quietly amplify this waste.

🔒

AI Alarms Without Explanations

Deep learning detects anomalies with high accuracy — but it's a black box. An energy manager gets an alert but no answer: what caused this?

Existing Explanations Are Unreliable

SHAP — the standard explainability tool — produces different answers every time it runs on the same anomaly. That inconsistency kills expert trust.

② What Anomalies Look Like in Real Data

Office Building — 7-Day Energy Consumption (kWh)

Four real anomaly types flagged by the AI model, plotted against the normal pattern and expected consumption range.
🌡️
HVAC Malfunction
Heating system ran uncontrolled overnight on Monday — 2.8× normal night consumption.
↑ Night spike · Mon 2am–5am
⚙️
Equipment Fault
Industrial refrigeration unit stuck in continuous-run mode on Wednesday afternoon.
↑ Sustained high · Wed 1pm–8pm
📉
Sensor / Meter Dropout
Smart meter reported near-zero readings on Thursday midday — likely a communication fault, not a real drop.
↓ False low · Thu 11am–2pm
🌙
Overnight Overconsumption
Unknown load left active Friday night. No occupancy detected — possible forgotten equipment or unauthorised use.
↑ Unexplained · Fri 11pm–Sat 4am
🔍

Why Each Anomaly Needs a Different Explanation

A nighttime HVAC spike is caused by temperature and hour-of-day. A Wednesday equipment fault is driven by device runtime and load history. A sensor dropout looks similar to both yet has a completely different root cause. This is why generic, random-sample explanations fail — the AI must use context-matched historical patterns to explain each anomaly correctly.

③ The Data
Five Real Buildings, Thousands of Hours of Energy Data

To make results meaningful and generalizable, the study used five real-world energy consumption datasets spanning entirely different building types — from a private home to a manufacturing plant. Each building has a unique energy rhythm, making the same anomaly look very different depending on context.

🏠
Residence
London, Ontario
2002–2004
Hourly readings
🏭
Manufacturing
Industrial facility
2016–2017
High consumption
🏥
Medical Clinic
24/7 operation
2016–2017
Irregular patterns
🏪
Retail Store
Commercial space
2016–2017
Weekly cycles
🏢
Office Building
Business hours
2016–2017
Regular patterns
Each dataset was enriched with weather data (temperature, humidity, wind speed) and calendar features (hour, day of week, month). Data was split 80/10/10 for training, validation, and testing — keeping time order intact to avoid data leakage.
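The chronological 80/10/10 split can be sketched in a few lines. This is a minimal illustration, not the paper's pipeline; the array and sizes are placeholders for one building's hourly series.

```python
import numpy as np

def chronological_split(data, train=0.8, val=0.1):
    """Split time-ordered data without shuffling, so validation and test
    sets always lie strictly after the training period (no leakage)."""
    n = len(data)
    i = int(n * train)
    j = int(n * (train + val))
    return data[:i], data[i:j], data[j:]

hours = np.arange(8760)  # one year of hourly meter readings
train, val, test = chronological_split(hours)
```

The key design choice is the absence of shuffling: a random split would let the model see "future" readings during training, inflating test accuracy.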
④ How It Works
From Raw Meter Data to Trustworthy Explanations

Think of this approach like a detective who not only identifies a crime but also explains their reasoning by referencing the most relevant past cases — not random ones. The key innovation is choosing the right comparison context for every anomaly before generating an explanation.

Detect

Train a deep learning model to predict energy consumption. Flag points where the prediction error is unusually large.

🔍

Find Context

For each anomaly, find the most similar historical windows using weighted cosine similarity — not random samples.

🧠

Explain

Run SHAP using those context-matched samples as the reference baseline, so explanations reflect the anomaly's real environment.

📋

Clarify

Separate features that caused the anomaly from those that merely offset it — giving managers a clear action list.

① Detect
  • 10 deep learning models
  • IQR-based thresholding
  • Sliding window (48h input)
  • Bayesian hyperparameter tuning
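One common reading of IQR-based thresholding on prediction errors is the Tukey fence; the multiplier `k=1.5` and the synthetic residuals below are illustrative assumptions, not values from the paper.

```python
import numpy as np

def iqr_anomaly_flags(errors, k=1.5):
    """Flag points whose absolute prediction error exceeds Q3 + k * IQR,
    i.e. a Tukey fence applied to the model's residuals."""
    q1, q3 = np.percentile(errors, [25, 75])
    threshold = q3 + k * (q3 - q1)
    return errors > threshold, threshold

# Synthetic residuals with five injected faults at indices 0, 200, ...
errors = np.abs(np.random.default_rng(0).normal(0.0, 1.0, 1000))
errors[::200] = 10.0
flags, threshold = iqr_anomaly_flags(errors)
```

Because the threshold is derived from quartiles rather than the mean, the injected outliers barely move it, which is why IQR-style fences are popular for residual-based anomaly detection.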
② Find Context
  • Random Forest feature importance
  • Exponential importance weighting
  • Weighted cosine similarity
  • Top 100 similar neighbours
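The context-matching step (importance-weighted cosine similarity over historical windows) might look like the sketch below. The exponential weighting form and `alpha` are assumptions; in the paper, the importances come from a Random Forest and `k` is 100.

```python
import numpy as np

def context_neighbours(anomaly, history, importances, k=100, alpha=1.0):
    """Rank historical feature windows by cosine similarity after scaling
    each feature by exp(alpha * importance); return the top-k indices."""
    w = np.exp(alpha * np.asarray(importances))
    a = anomaly * w
    H = history * w
    sims = (H @ a) / (np.linalg.norm(H, axis=1) * np.linalg.norm(a) + 1e-12)
    return np.argsort(sims)[::-1][:k]

rng = np.random.default_rng(1)
history = rng.normal(size=(500, 4))            # past feature windows
history[42] = np.array([3.0, 0.0, 0.0, 0.0])   # window aligned with the anomaly
anomaly = np.array([1.0, 0.0, 0.0, 0.0])
idx = context_neighbours(anomaly, history, np.ones(4), k=10)
```

Scaling by importance before measuring similarity means that windows matching on the features that matter (say, temperature and hour-of-day) outrank windows that merely match on noise.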
③ Explain
  • Kernel SHAP, Partition SHAP
  • Sampling SHAP, LIME
  • Permutation importance
  • Context-matched baseline
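The role of the background set can be illustrated without the full SHAP machinery. The sketch below is a crude single-feature substitution attribution, a simplified stand-in for Kernel SHAP, not the paper's implementation; the toy model and feature layout are invented for clarity.

```python
import numpy as np

def baseline_attribution(model_fn, x, context, n=50, seed=0):
    """For each feature of anomaly x, substitute values drawn from the
    context-matched background set and measure the average change in model
    output. Features whose substitution moves the output most are the ones
    an explanation should highlight."""
    rng = np.random.default_rng(seed)
    base = model_fn(x[None, :])[0]
    attrib = np.zeros(len(x))
    for j in range(len(x)):
        X = np.tile(x, (n, 1))
        X[:, j] = context[rng.integers(0, len(context), n), j]
        attrib[j] = base - model_fn(X).mean()
    return attrib

# Toy model that depends only on feature 0 (e.g. outdoor temperature)
model = lambda X: 2.0 * X[:, 0]
anomaly = np.array([5.0, 1.0, 1.0])
context = np.zeros((10, 3))  # context-matched windows (all-zero for clarity)
scores = baseline_attribution(model, anomaly, context)
```

Swapping a random background for a context-matched one changes only the `context` argument here, yet it determines what "normal" the anomaly is contrasted against, which is exactly the paper's point.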
④ Clarify
  • Positive SHAP = offset features
  • Negative SHAP = anomaly drivers
  • SHAP heatmap visualisation
  • Reduced noise from irrelevant features
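Under the sign convention above (negative SHAP values mark anomaly drivers, positive values mark offsets), the clarification step reduces to a simple partition. Feature names and values here are illustrative.

```python
def split_drivers_offsets(shap_values, feature_names):
    """Partition features by the sign of their SHAP value: negative values
    drove the anomaly, positive values partially offset it."""
    drivers = [n for n, v in zip(feature_names, shap_values) if v < 0]
    offsets = [n for n, v in zip(feature_names, shap_values) if v > 0]
    return drivers, offsets

names = ["temperature", "hour_of_day", "humidity", "wind_speed"]
values = [-0.8, -0.3, 0.5, 0.0]
drivers, offsets = split_drivers_offsets(values, names)
```

The driver list is what becomes the energy manager's action list; offsetting features are reported separately so they are not mistaken for causes.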
💡

The Key Insight: Context Changes Everything

Using a random background dataset is like asking a random doctor to explain a patient's unusual lab result — they might give an answer, but it won't be informed by the patient's history. This method picks similar past patterns as the reference point, so explanations reflect reality — and stay consistent across multiple runs.

⑤ Results
More Consistent Explanations, Across Every Building Type

The proposed method was evaluated across all five buildings, all ten deep learning models, and all five XAI techniques. In every single case, the new approach produced less variable explanations than the standard random baseline.

38%
Average reduction in explanation variability across all datasets & methods
statistically significant in most cases
80.3%
Maximum reduction achieved (manufacturing facility, Partition SHAP)
p < 0.001
250+
Individual model–dataset–XAI combinations evaluated
10 models × 5 datasets × 5 methods
Explanation Variability Reduction per Building — Kernel SHAP
🏆

What This Means in Practice

Before this work, running the same explanation tool on the same anomaly twice could yield completely different feature rankings. Now, with context-matched baselines, the explanation is stable, interpretable, and actionable — pointing to the right features (temperature spikes, overnight loads, weekday patterns) rather than noise.

Results were consistent across all 10 deep learning architectures tested:

LSTM
GRU
BiLSTM
BiGRU
1D-CNN
TCN
DCNN
WaveNet
TFT
TST
⑥ Takeaways
What This Research Means Beyond the Numbers
01
Context-Aware AI Is More Trustworthy
Explanations grounded in relevant historical context are significantly more stable than those using random data. This principle applies far beyond energy — to any domain where AI accountability matters.
02
Generalisable Across Buildings & Models
The method worked for homes, factories, clinics, stores, and offices — and across 10 different model types. This validates it as a practical, deployable tool rather than a narrow case study.
03
Energy Managers Can Finally Act on AI Output
When explanations consistently point to the right features — temperature spikes, overnight loads, weekday patterns — facility teams can diagnose root causes and fix issues with confidence.
04
A Path to Explainable Building Automation
As smart meters and IoT sensors proliferate, reliable XAI for energy systems becomes essential infrastructure. This work provides a validated, open methodology to build on.