Applying Artificial Intelligence to Urban Building Energy Modeling: Opportunities, Challenges, and Lessons from Practice

Yingjie Liu

1. Introduction

The integration of Artificial Intelligence (AI) technology with Urban Building Energy Modeling (UBEM) offers substantial potential to address pressing urban sustainability challenges. Cities are increasingly committed to reducing greenhouse gas emissions, improving building operational efficiency, and developing resilient infrastructure. UBEM has emerged as a critical policy and planning tool. However, UBEM currently faces significant barriers such as limited scalability, data acquisition difficulties, and low computational efficiency. AI, with its strengths in data processing, pattern recognition, and predictive analytics, provides new directions for UBEM development.

Despite substantial advancements in computational modeling over the past decade, AI integration in UBEM remains fragmented and uneven. Current academic literature tends to explore potential applications of AI, such as surrogate modeling, computer vision for data extraction, and behavior prediction. However, there is a lack of in-depth research on how AI technologies are actually understood and applied by practitioners and researchers. This study systematically explores the current state, challenges, and future potential of AI applications in UBEM through interviews with three experts who have extensive experience in both academia and industry.

The research focuses on three core questions:

What are the current specific applications of AI in the UBEM domain?
What technical, institutional, and cognitive challenges are encountered during practical implementation?
How can the use of AI in UBEM be expanded in a technically rational and socially responsive manner?

2. Literature review

Urban Building Energy Modeling (UBEM) simulates energy consumption across large building stocks to support sustainable planning and policy-making. As cities face growing environmental pressures, UBEM helps forecast energy demand, assess retrofit scenarios, and inform carbon mitigation strategies. However, UBEM is challenged by data scarcity, computational complexity, and limited scalability (Chen et al., 2020; Fathi et al., 2020). Artificial Intelligence (AI) has emerged as a promising solution to enhance the efficiency, accuracy, and applicability of UBEM workflows.

2.1 AI Techniques in UBEM

AI technologies—such as machine learning (ML), deep learning, computer vision (CV), reinforcement learning (RL), and surrogate modeling—enable automation and improved predictions in various UBEM stages. ML algorithms are used to estimate energy use, classify buildings, and impute missing data (Fathi et al., 2020). Deep learning models like CNNs and LSTMs support image interpretation and time-series prediction (Pan et al., 2024). CV algorithms extract building footprints, heights, and facade features from aerial or street-view imagery (Han et al., 2021), significantly reducing manual effort. Surrogate models, trained on simulation outputs, approximate energy behavior at a fraction of the computational cost (Chen et al., 2020). RL methods enable adaptive building controls, especially in demand response scenarios (Zhou et al., 2023).

2.2 Data Acquisition and Preprocessing

AI enhances the preprocessing stage by automating the extraction and inference of building data. Semantic segmentation and object detection are applied to aerial and satellite images to generate digital maps of urban form (Han et al., 2021). ML classifiers predict attributes like building age or use type from geometry and location (Fathi et al., 2020). When data is scarce, generative models such as Gaussian Mixture Models (GMMs) can synthesize realistic building energy datasets to augment training data (Han et al., 2021). Additionally, AI aids in cleaning and integrating diverse datasets, including GIS records and utility data.
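To make the data-augmentation idea concrete, the following is a minimal sketch of fitting a Gaussian mixture and sampling synthetic records from it. It is not the method of the cited study: the one-dimensional, two-component setup, the function names fit_gmm_1d and sample_gmm, and the energy use intensity values are all assumptions chosen for illustration.

```python
import math
import random

def fit_gmm_1d(data, n_iter=50):
    """Fit a two-component 1-D Gaussian mixture with expectation-maximization."""
    lo, hi = min(data), max(data)
    means = [lo + 0.25 * (hi - lo), lo + 0.75 * (hi - lo)]
    spread = (hi - lo) / 4 or 1.0
    stds = [spread, spread]
    weights = [0.5, 0.5]
    for _ in range(n_iter):
        # E-step: responsibility of each component for each data point.
        resp = []
        for x in data:
            dens = [w * math.exp(-0.5 * ((x - m) / s) ** 2) / (s * math.sqrt(2 * math.pi))
                    for w, m, s in zip(weights, means, stds)]
            total = sum(dens) or 1e-300
            resp.append([d / total for d in dens])
        # M-step: re-estimate weight, mean, and standard deviation per component.
        for k in range(2):
            nk = max(sum(r[k] for r in resp), 1e-12)
            weights[k] = nk / len(data)
            means[k] = sum(r[k] * x for r, x in zip(resp, data)) / nk
            var = sum(r[k] * (x - means[k]) ** 2 for r, x in zip(resp, data)) / nk
            stds[k] = max(math.sqrt(var), 1e-3)
    return weights, means, stds

def sample_gmm(weights, means, stds, n, rng):
    """Draw n synthetic values: pick a component, then sample from its Gaussian."""
    out = []
    for _ in range(n):
        k = rng.choices(range(len(weights)), weights=weights)[0]
        out.append(rng.gauss(means[k], stds[k]))
    return out

rng = random.Random(0)
# Hypothetical energy use intensities (kWh/m2/yr) for two building groups.
observed = ([rng.gauss(90, 10) for _ in range(200)]
            + [rng.gauss(180, 15) for _ in range(100)])
weights, means, stds = fit_gmm_1d(observed)
synthetic = sample_gmm(weights, means, stds, 500, rng)
```

The synthetic values follow the fitted mixture, so they reproduce the two-group structure of the observed sample without copying any single record. A practical workflow would use multivariate mixtures and check the synthetic data against held-out measurements.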

2.3 Model Generation and Simulation

In this stage, AI enables automatic archetype generation and parameter assignment. Platforms like CityBES demonstrate how archetype models can be automatically instantiated across cities (Chen et al., 2020). Surrogate models, such as LSTM neural networks or gradient boosting machines, replace expensive simulations with fast predictions (Pan et al., 2024). Hybrid models combining physics-based simulation with AI correction layers enhance both speed and interpretability (Fathi et al., 2025). Optimization methods such as genetic algorithms automate parameter tuning, reducing manual calibration time (Haneef et al., 2021).
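The surrogate idea can be sketched in a few lines. The example below is an illustration under stated assumptions, not the approach of any cited platform: simulate_heating_demand is a made-up formula standing in for an expensive physics-based run, and the surrogate is a simple inverse-distance-weighted nearest-neighbor regressor rather than an LSTM or gradient boosting machine.

```python
import math

def simulate_heating_demand(u_value, wwr):
    """Stand-in for an expensive simulation run (hypothetical formula, kWh/m2/yr)."""
    return 40.0 + 55.0 * u_value + 30.0 * wwr + 20.0 * u_value * wwr

U_RANGE = (0.2, 2.0)    # assumed envelope U-value range, W/m2K
WWR_RANGE = (0.1, 0.9)  # assumed window-to-wall ratio range

def _scale(u_value, wwr):
    """Rescale both inputs to [0, 1] so distances treat them equally."""
    return ((u_value - U_RANGE[0]) / (U_RANGE[1] - U_RANGE[0]),
            (wwr - WWR_RANGE[0]) / (WWR_RANGE[1] - WWR_RANGE[0]))

# Run the "simulator" once on a coarse grid to build the training set.
training = []
for i in range(10):
    for j in range(9):
        u, w = 0.2 + 0.2 * i, 0.1 + 0.1 * j
        training.append((_scale(u, w), simulate_heating_demand(u, w)))

def surrogate_predict(u_value, wwr, k=4):
    """Predict demand from the k nearest training runs, weighted by 1/distance."""
    q = _scale(u_value, wwr)
    nearest = sorted((math.dist(q, x), y) for x, y in training)[:k]
    if nearest[0][0] == 0.0:
        return nearest[0][1]  # query coincides with a training point
    weights = [1.0 / d for d, _ in nearest]
    return sum(wt * y for wt, (_, y) in zip(weights, nearest)) / sum(weights)

truth = simulate_heating_demand(0.9, 0.35)
estimate = surrogate_predict(0.9, 0.35)
print(f"simulated {truth:.1f}, surrogate {estimate:.1f}")
```

After the 90 training runs, each surrogate query costs a lookup rather than a new simulation. The trade-off is that accuracy depends on how densely the training grid covers the inputs of interest.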

2.4 Validation and Calibration

AI improves model validation through benchmarking and error detection. ML-based anomaly detection highlights abnormal predictions, prompting targeted correction (Fathi et al., 2020). Calibration methods using surrogate models or Bayesian inference match predicted results to metered data with minimal simulation runs (Chen et al., 2020). Online learning approaches, though still emerging, allow UBEMs to adapt continuously as new sensor data arrives. Explainable AI (XAI) techniques—like SHAP or feature importance visualization—build stakeholder trust by clarifying how models make predictions (Fathi et al., 2025).
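As a rough illustration of calibration against metered data, the sketch below searches a small set of candidate parameter values and keeps the one with the lowest CV(RMSE). Everything specific here is an assumption: the degree-day toy model, the heat-loss coefficient being calibrated, and the monthly values are hypothetical. The error metrics divide by n; some guidelines apply a degrees-of-freedom correction instead.

```python
import math

def cv_rmse(measured, simulated):
    """Coefficient of variation of the RMSE, in percent of the measured mean."""
    n = len(measured)
    mean_m = sum(measured) / n
    rmse = math.sqrt(sum((m - s) ** 2 for m, s in zip(measured, simulated)) / n)
    return 100.0 * rmse / mean_m

def nmbe(measured, simulated):
    """Normalized mean bias error, in percent of the measured mean."""
    n = len(measured)
    mean_m = sum(measured) / n
    return 100.0 * sum(m - s for m, s in zip(measured, simulated)) / (n * mean_m)

# Hypothetical monthly heating degree days and metered use (MWh).
HDD = [620, 540, 430, 260, 110, 20, 0, 5, 90, 270, 450, 590]
metered = [71.8, 64.5, 53.0, 36.1, 21.2, 12.0, 10.1, 10.4, 19.0, 37.2, 55.1, 69.3]

def toy_model(heat_loss_coeff, base_load=10.0):
    """Toy stand-in for a simulation: base load plus a degree-day term."""
    return [base_load + heat_loss_coeff * hdd for hdd in HDD]

# Candidate coefficients from 0.05 to 0.15 in steps of 0.005.
candidates = [0.05 + 0.005 * i for i in range(21)]
best = min(candidates, key=lambda c: cv_rmse(metered, toy_model(c)))

print(f"best coefficient: {best:.3f}")
print(f"CV(RMSE): {cv_rmse(metered, toy_model(best)):.2f}%")
print(f"NMBE: {nmbe(metered, toy_model(best)):.2f}%")
```

In a surrogate-assisted or Bayesian workflow, toy_model would be replaced by a fast approximation of the full simulation, so that many candidate parameter sets can be scored without rerunning the detailed model each time.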

2.5 Scenario Analysis and Optimization

AI enables scenario generation and multi-objective optimization. RL agents learn optimal control strategies under constraints like peak load or carbon targets (Zhou et al., 2023). Optimization algorithms identify retrofit strategies balancing cost and emissions (Haneef et al., 2021). Emerging approaches like GPT-UBEM use large language models to interpret planning questions and automate scenario generation (Parrish et al., 2023). In city-scale digital twins, AI supports real-time forecasting and data assimilation, forming a responsive urban energy model (Li & Feng, 2025).
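The cost-versus-emissions trade-off can be illustrated with a non-dominated (Pareto) filter, the selection idea that multi-objective optimizers build on. This is a sketch only, not NSGA-II or any cited method, and the retrofit options, costs, and emissions are hypothetical.

```python
from typing import NamedTuple

class Retrofit(NamedTuple):
    name: str
    cost: float       # hypothetical capital cost, $/m2
    emissions: float  # hypothetical residual emissions, kgCO2e/m2/yr

def dominates(a, b):
    """True if a is no worse than b on both objectives and better on at least one."""
    return (a.cost <= b.cost and a.emissions <= b.emissions
            and (a.cost < b.cost or a.emissions < b.emissions))

def pareto_front(options):
    """Keep only the options that no other option dominates."""
    return [o for o in options if not any(dominates(p, o) for p in options)]

options = [
    Retrofit("baseline", 0, 42),
    Retrofit("LED lighting", 15, 38),
    Retrofit("window upgrade", 120, 33),
    Retrofit("wall insulation", 90, 30),
    Retrofit("heat pump", 200, 18),
    Retrofit("heat pump + insulation", 290, 12),
    Retrofit("oversized boiler", 150, 40),
]

front = pareto_front(options)
for option in front:
    print(option.name, option.cost, option.emissions)
# Dropped as dominated: "window upgrade" and "oversized boiler".
```

The filter does not pick a single winner. It returns the set of options among which any further emissions reduction costs more, leaving the final choice to planners and their budget or carbon targets.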

2.6 Challenges and Future Directions

Despite its promise, AI-augmented UBEM faces challenges in data availability, model generalizability, software interoperability, and computational requirements. Black-box models may hinder trust among non-expert users. Future trends include the development of hybrid physics-AI models, standardized datasets, and equity-centered modeling. Integrating AI into user-friendly UBEM platforms and policy frameworks will be key to widespread adoption.

3. Method and Research Design

This study employs a qualitative, semi-structured interview approach to explore participants’ experiences and perceptions regarding AI integration in UBEM.

3.1 Participant Selection

Participants were selected based on the following criteria:

Relevant research or practical experience.
Diverse professional backgrounds encompassing academic research and practical application to ensure multiple perspectives.
Current or past involvement in significant UBEM and AI-related projects.

Participants were contacted via LinkedIn. Interviews were conducted via Zoom video calls, lasted between 30 and 60 minutes each, and were structured around eight core questions covering participants’ professional backgrounds, experiences with AI tools in UBEM, key technical and organizational challenges, and future outlooks. With permission, conversations were recorded, transcribed verbatim using otter.ai, and coded using thematic analysis. Cross-case comparisons were used to identify convergences and divergences across the three perspectives.

To protect privacy, participants are anonymized and described as follows:

Participant 1: PhD student in Building Engineering at Concordia University, specializing in occupant behavior modeling and energy simulation. She contributed to digital twin infrastructure and UBEM platforms in Canada.

Participant 2: Licensed architect and educator, PhD student at the University of Washington. His research explores socio-technical challenges in sustainable architecture and organizational dynamics surrounding BIM and digital twin adoption.

Participant 3: Former PhD student at the Free University of Bozen-Bolzano, specializing in sustainable energy and technology. Currently in the private sector focusing on shoebox modeling, surrogate modeling, and AI-assisted UBEM acceleration.

Participant 1’s research focuses on stochastic occupant behavior modeling using AI, particularly Gaussian mixture models and clustering techniques, enabling high-resolution schedule generation. Her work informs a broader UBEM platform (AUM), developed in collaboration between Concordia University and Natural Resources Canada, emphasizing AI for data cleaning, preprocessing automation, and enhancing building archetype classification. She also mentioned using computer vision to extract facade features, such as window-to-wall ratios, from street-level imagery to enhance model granularity.

Participant 2 provided a contrasting perspective, offering insights into social and institutional dynamics often hindering AI tool adoption, although he did not directly apply AI in his work. He recounted early digital twin projects at the University of Washington, focusing on “re-baselining” digital records for existing campus buildings. He emphasized that successful technology implementation closely relates to organizational habits, workforce structures, and cross-departmental collaboration.

Participant 3, a data scientist, specialized in accelerating simulation processes. His doctoral research developed a shoebox algorithm simplifying complex building geometries while maintaining contextual accuracy for rapid urban-scale simulations. He also developed a neural network-based surrogate model estimating annual heating and cooling demands. His insights highlighted surrogate models as viable substitutes for expensive numerical simulations, emphasizing remote sensing’s potential for filling data gaps in urban-scale building inventories.

3.2 Interview Question Design

Part 1: Background & Context

Could you briefly describe your research or professional experience related to energy modeling, building simulation, or digital twins?
How did your interest in applying AI or automation to energy modeling workflows begin?
Have you worked on or observed any projects where AI was used to improve input data quality, automate modeling steps, or support large-scale UBEM efforts?

Part 2: Applications & Challenges

In your experience, what kinds of AI applications have shown the most promise in enhancing urban-scale energy modeling?
What do you see as the main challenges—technical, institutional, or organizational—when it comes to integrating AI tools into real UBEM workflows?
How do teams typically navigate the balance between automated outputs and human interpretation in collaborative modeling settings?

Part 3: Lessons & Outlook

Have you come across any projects, teams, or cities experimenting with AI in UBEM that offer useful lessons or cautionary tales?
What do you think is needed to scale up the use of AI in sustainable urban development and building energy analysis?

4. Discussion

4.1 AI for Data Acquisition

All three experts concurred on the critical importance of high-quality data for effective UBEM and DT implementations. However, their strategies for data acquisition diverged:

Expert A advocated for the use of automated sensing and machine learning (ML) techniques to infer building attributes from indirect sources, such as aerial imagery. This approach aligns with studies demonstrating that ML can effectively extract building geometry and age when direct records are unavailable.
Expert B emphasized the deployment of on-site sensor networks (IoT) as essential inputs, reflecting the growing trend of utilizing ubiquitous building sensors to generate substantial volumes of data.
Expert C highlighted institutional constraints, noting that many existing buildings lack detailed digital records or Building Information Models (BIM). This challenge is well-documented, as older building stocks often have no readily available BIM data, complicating DT creation without new data strategies.

Despite their differing approaches, all experts cautioned that data quality remains a significant bottleneck. For instance, while ML can predict building ages effectively, methods like infrared thermography for retrofit assessment may be unreliable due to measurement errors. Additionally, IoT-driven data collection introduces privacy and cybersecurity risks. In summary, Expert A is optimistic about ML-based gap-filling, Expert B focuses on rigorous sensor data collection, and Expert C underscores the necessity of institutional readiness.

4.2 Model Validation

The experts presented varying viewpoints on validating AI-driven UBEM and DT models:

Expert A favored automated calibration using large datasets, positing that AI can self-adjust models in real-time.
Expert B adopted a more conservative stance, insisting on traditional benchmarking against measured performance and established physical principles.
Expert C emphasized the need for clear validation protocols and standards before relying on these tools for decision-making.

These perspectives reflect broader research concerns. McArthur (2024) cautions that AI models are susceptible to the “garbage in, garbage out” phenomenon, where black-box predictors may fit training data but fail if they disregard building physics. Similarly, Opoku et al. (2023) highlight that validating a digital twin necessitates high-quality real-time data; without it, model authenticity cannot be confirmed. All interviewees agreed on the necessity of human oversight, echoing the sentiment that AI should complement, not replace, human expertise.

4.3 Institutional and Organizational Challenges

Each expert identified significant non-technical barriers to AI integration:

Expert B underscored issues related to data governance, regulation, and security, noting that deploying city-wide sensors raises privacy and cybersecurity concerns.
Expert C focused on capacity and policy challenges, pointing out that many agencies lack the organizational infrastructure, standards, or funding to support AI and DT projects.
Expert A emphasized the need for interdisciplinary coordination and training, highlighting the importance of bridging silos between urban planners, utility providers, and technologists.

These concerns are corroborated by literature indicating that barriers to DT adoption include fragmented data management, lack of interoperability, and limited trust in shared data.

4.4 Surrogate Modeling

The experts’ views on surrogate (reduced-order) models revealed both commonalities and differences:

Expert A was enthusiastic about AI-driven surrogates to expedite analysis, arguing they allow for cost-effective exploration of numerous scenarios.
Expert B expressed skepticism, cautioning that approximations might overlook critical dynamics and must undergo thorough validation.
Expert C found city-scale surrogates intriguing but noted the challenge of representing diverse building types.

These positions are reflected in current research. Surrogate models have proven powerful in practice, with frameworks like BESOS employing ML surrogates to approximate detailed EnergyPlus simulations, enabling rapid exploration of the design space. However, authors also warn that surrogates require extensive training data to be reliable. The experts concurred on the necessity of uncertainty quantification, acknowledging that a surrogate trained on one neighborhood may fail in another without retraining.
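One simple way to attach uncertainty to a surrogate is a bootstrap ensemble: fit the same model on resampled training sets and report the spread of the predictions. The sketch below is an assumption-laden illustration, not a method described by the interviewees. The linear surrogate, the floor-area input, and the neighborhood data are hypothetical.

```python
import random
import statistics

def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return slope, my - slope * mx

def bootstrap_surrogates(xs, ys, n_models, rng):
    """Fit one linear surrogate per bootstrap resample of the training data."""
    models = []
    for _ in range(n_models):
        idx = [rng.randrange(len(xs)) for _ in xs]
        models.append(fit_line([xs[i] for i in idx], [ys[i] for i in idx]))
    return models

def predict_with_spread(models, x):
    """Return the ensemble mean and standard deviation at input x."""
    preds = [slope * x + intercept for slope, intercept in models]
    return statistics.fmean(preds), statistics.stdev(preds)

rng = random.Random(42)
# Hypothetical training neighborhood: floor areas of 100 to 300 m2.
areas = [rng.uniform(100, 300) for _ in range(60)]
demand = [0.12 * a + rng.gauss(0, 2.0) for a in areas]  # MWh/yr, made up

models = bootstrap_surrogates(areas, demand, 200, rng)
mean_in, spread_in = predict_with_spread(models, 200.0)     # inside training range
mean_out, spread_out = predict_with_spread(models, 1200.0)  # far outside it
print(f"in-range spread {spread_in:.2f}, out-of-range spread {spread_out:.2f}")
```

The spread widens sharply for the 1200 m2 building because no training example resembles it. The ensemble only captures sampling uncertainty, however. If buildings in another neighborhood behave differently in ways the model form cannot express, the surrogate can still be confidently wrong, which is why retraining or validation on local data remains necessary.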

4.5 Human–AI Interaction

Trust, interpretability, and user roles were central themes in the discussion of human–AI interaction:

Expert B insisted on transparency and human oversight, expressing concern that black-box models can be misleading if unchecked.
Expert A emphasized AI as a decision-support tool, advocating for its use in augmenting human decision-making processes.
Expert C focused on stakeholder engagement and training, highlighting the importance of involving end-users in the development and implementation of AI tools.

These viewpoints align with recent calls for explainable AI in building energy contexts. McArthur (2024) argues that AI outputs should be visually and physically checked by experts, asserting that AI should complement expertise, not replace it. Similarly, Rempi et al. (2025) demonstrate that explainable AI methods, such as SHAP, can clarify AI decisions for stakeholders, thereby enhancing trust and facilitating adoption. The consensus among the experts is that explainability and user control are essential for the successful integration of AI tools in UBEM and DT applications.
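To show what a SHAP-style explanation reports, the sketch below computes exact Shapley values, the quantity that SHAP approximates, for a tiny model by enumerating every feature ordering. It does not use the SHAP library. The model toy_energy_model, its coefficients, and the feature values are hypothetical, and brute-force enumeration is only feasible for a handful of features.

```python
from itertools import permutations
from math import factorial

def shapley_values(model, instance, baseline):
    """Average each feature's marginal contribution over all feature orderings."""
    names = list(instance)
    phi = {name: 0.0 for name in names}
    for order in permutations(names):
        current = dict(baseline)  # start from the reference building
        prev = model(current)
        for name in order:
            current[name] = instance[name]  # switch one feature to its actual value
            new = model(current)
            phi[name] += new - prev
            prev = new
    n_orderings = factorial(len(names))
    return {name: total / n_orderings for name, total in phi.items()}

def toy_energy_model(f):
    """Hypothetical predictor of energy use intensity (kWh/m2/yr)."""
    return (50.0 + 40.0 * f["u_value"] + 25.0 * f["wwr"]
            + 30.0 * f["u_value"] * f["wwr"] - 0.1 * f["years_since_1950"])

baseline = {"u_value": 0.5, "wwr": 0.3, "years_since_1950": 50}  # reference building
building = {"u_value": 1.5, "wwr": 0.6, "years_since_1950": 10}  # building to explain

phi = shapley_values(toy_energy_model, building, baseline)
gap = toy_energy_model(building) - toy_energy_model(baseline)
for name, value in phi.items():
    print(f"{name}: {value:+.2f} kWh/m2/yr")
print(f"sum of contributions {sum(phi.values()):.2f} equals prediction gap {gap:.2f}")
```

The contributions always add up to the difference between the explained prediction and the baseline prediction. That property lets a stakeholder read the output as "this much of the extra demand comes from the envelope, this much from glazing, this much from building age."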

5. Conclusion

These findings resonate with broader debates in digital urbanism and sustainable development. As cities invest in smart infrastructure, AI-driven tools are increasingly touted as silver bullets. Yet, this case study suggests that the real value of AI in UBEM lies not in complete automation, but in selective augmentation—filling data gaps, enabling scenario analysis, and supporting design decisions. It also highlights the crucial role of organizational readiness, professional training, and interpretability in ensuring AI is responsibly and meaningfully deployed.

Moreover, the discussions call for a new generation of foundational models in UBEM—machine learning architectures that can generalize across urban contexts while remaining transparent and adaptable. Developing such systems requires close collaboration between data scientists, building engineers, architects, and policymakers. Ultimately, AI in UBEM should not merely optimize energy outputs, but also support equitable, livable, and human-centered urban futures.

6. References

Chen, Y., Hong, T., & Luo, N. (2020). Rapid urban building energy model calibration using surrogate models. Energy and Buildings, 209, 109694. https://doi.org/10.1016/j.enbuild.2019.109694

Fathi, A., Vahidinasab, V., & Eslami, M. (2020). A review on data-driven building energy consumption prediction and feature selection methods. Energy Reports, 6, 455–473. https://doi.org/10.1016/j.egyr.2020.11.198

Fathi, A., Ma, Z., & Zhou, Y. (2025). Explainable AI in urban energy modeling: A systematic review. Applied Energy, 328, 120308. https://doi.org/10.1016/j.apenergy.2022.120308

Han, B., Li, L., Sun, H., & Wang, C. (2021). Generating synthetic building energy data using Gaussian Mixture Models. Energies, 14(5), 1217. https://doi.org/10.3390/en14051217

Haneef, M., Wang, Y., & Wang, S. (2021). Urban-scale energy retrofit optimization using CitySim and NSGA-II. Sustainability, 13(12), 6557. https://doi.org/10.3390/su13126557

Li, J., & Feng, Y. (2025). Web-based urban digital twins for real-time greenhouse gas analysis. Buildings and Cities, 6(1), 102–118. https://doi.org/10.5334/bc.181

Pan, H., Chen, Y., Hong, T., & Zhang, J. (2024). Bidirectional LSTM surrogate modeling for UBEM with microclimate effects. Energy and Buildings, 276, 112646. https://doi.org/10.1016/j.enbuild.2023.112646

Parrish, K., Regnier, C., & Cockroft, J. (2023). GPT-UBEM: Towards natural-language-driven urban energy modeling. Buildings and Cities, 4(2), 331–345. https://doi.org/10.5334/bc.222

Zhou, Y., Hong, T., & Chen, Y. (2023). CityLearn: Reinforcement learning environment for demand response in building clusters. Applied Energy, 305, 117771. https://doi.org/10.1016/j.apenergy.2021.117771

Groesdonk, P., Garbasevschi, O., Schmiedt, J. E., & Hoffschmidt, B. (2021). Collecting data for urban building energy modelling by remote sensing and machine learning. Proceedings of Building Simulation 2021. https://www.researchgate.net/publication/355855784

McArthur, J. (2024). Artificial intelligence and decarbonisation. Buildings and Cities. https://www.buildingsandcities.org/insights/commentaries/artificial-intelligence-decarbonisation.html

Cespedes-Cubides, A. S., & Jradi, M. (2024). A review of building digital twins to improve energy efficiency in the building operational stage. Energy Informatics, 7(1), 11. https://energyinformatics.springeropen.com/articles/10.1186/s42162-024-00313-7

Opoku, D.-G. J., Perera, S., Osei-Kyei, R., Rashidi, M., Bamdad, K., & Famakinwa, T. (2023). Barriers to the adoption of digital twin in the construction industry: A literature review. Informatics, 10(1), 14. https://www.mdpi.com/2227-9709/10/1/14

Westermann, P., Christiaanse, T. V., & Evins, R. (2021). besos: Building and Energy Simulation, Optimization and Surrogate Modelling. Journal of Open Source Software, 6(60), 2677. https://doi.org/10.21105/joss.02677

Khan, M. R., Alam, S. B., & Khan, M. J. (2022). Digital twin and artificial intelligence incorporated with surrogate modeling for hybrid and sustainable energy systems. In Handbook of Smart Energy Systems (pp. 1-20). Springer. https://link.springer.com/10.1007/978-3-030-97940-9_147

Rempi, P., Pelekis, S., Tzortzis, A. M., Karakolis, E., Ntanos, C., & Askounis, D. (2025). Explainable AI for building energy retrofitting under data scarcity. arXiv preprint arXiv:2504.06055. https://arxiv.org/abs/2504.06055

License

2025 Innovation in the Construction Industry Copyright © 2025 by Prof. Dossick's CM515 Spring 2025 Class is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, except where otherwise noted.