Study Finds Simple Models Can Beat Complex AI at Predicting Building Temps


By Dave DeFusco

When people think about artificial intelligence today, they often picture massive, complex models that require enormous computing power. These systems, known as transformer models, have helped drive breakthroughs in language translation, chatbots and image generation. New research by Ruslan Gokham, a Katz School Ph.D. student in mathematical sciences, argues that bigger and more complicated is not always better, especially when it comes to predicting temperatures in real buildings.

Gokham presented his paper, “Linear vs Transformer Models for Long-Horizon Exogenous Temperature Forecasting,” at a poster session of the UrbanAI Workshop at NeurIPS 2025 in San Diego, one of the world’s most prestigious AI conferences. The workshop focuses on applying artificial intelligence to real-world urban problems.

At the heart of Gokham’s research is a simple but important question: when do we actually need complex AI models and when can simpler tools do the job just as well or even better?

“There’s a growing question in time-series research about when we really need the complexity of large attention models,” said Gokham, “and when simpler linear models can outperform them, especially for long-range forecasting.”

The problem he studied has clear real-world relevance. Modern “smart buildings” rely on automated systems to control heating, cooling and ventilation. To save energy and keep people comfortable, these systems must make decisions ahead of time. That means predicting indoor temperatures hours or even days in advance.

The challenge is that, in real life, systems often don’t know what the current or future indoor temperature will be. Instead, they rely on outside information, such as weather forecasts, time of day or system settings. This is known as “exogenous-only” forecasting.

“For smart buildings, we want to change air conditioning settings in advance,” said Gokham. “To do that, we need to predict what the approximate temperature will be in the future, using only outside information.”

This setup is much harder than traditional forecasting, where models can look at recent temperature readings. It also better reflects how these systems are used in practice. To test different approaches, Gokham compared six models. Three were simple linear models, called Linear, NLinear and DLinear. These models assume that future temperatures are closely tied to past patterns and trends. 
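As a rough illustration of what such a linear baseline looks like, here is a minimal PyTorch sketch in the spirit of NLinear: a single linear layer maps a window of past inputs directly onto the forecast horizon, after subtracting the window’s last value and adding it back afterward. The window length, horizon and number of input features below are illustrative assumptions, not the paper’s exact configuration, and in the exogenous-only setting the inputs would be outside signals rather than past indoor temperatures.

```python
import torch
import torch.nn as nn

class NLinearBaseline(nn.Module):
    """NLinear-style baseline: one linear map from an input window to the
    forecast horizon, applied per feature after subtracting the last value
    of the window (a simple level normalization)."""

    def __init__(self, input_len: int, horizon: int):
        super().__init__()
        # A single linear projection shared across all input features.
        self.proj = nn.Linear(input_len, horizon)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, input_len, n_features)
        last = x[:, -1:, :]                  # last observed value per feature
        x = x - last                         # NLinear-style normalization
        y = self.proj(x.transpose(1, 2))     # (batch, n_features, horizon)
        return y.transpose(1, 2) + last      # restore the level

# Illustrative shapes only: a 96-step window of 8 exogenous inputs,
# predicting 24 steps ahead.
model = NLinearBaseline(input_len=96, horizon=24)
dummy = torch.randn(32, 96, 8)
print(model(dummy).shape)  # torch.Size([32, 24, 8])
```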

The other three came from the transformer family, which are more complex and designed to capture subtle, nonlinear relationships in data. These included a standard Transformer, Informer and Autoformer.

All six models were tested under the same conditions using data from the Smart Buildings Control Suite, a large benchmark dataset for heating and cooling systems. Each model was trained, validated and tested using identical data splits, including a blind test set that the models never saw during development. This ensured a fair comparison.
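The idea of identical, time-ordered splits with a held-out blind test period can be sketched as follows; the split fractions and data handling here are assumptions for illustration, not the benchmark’s actual protocol.

```python
import numpy as np

def chronological_split(series: np.ndarray, train_frac: float = 0.7, val_frac: float = 0.15):
    """Split a time series in time order so no future data leaks into
    training; the final slice serves as a blind test set that is only
    evaluated once, after model selection."""
    n = len(series)
    train_end = int(n * train_frac)
    val_end = int(n * (train_frac + val_frac))
    return series[:train_end], series[train_end:val_end], series[val_end:]

# Example with a dummy series of 10,000 time steps.
data = np.arange(10_000)
train, val, test = chronological_split(data)
print(len(train), len(val), len(test))  # 7000 1500 1500
```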

The results were surprising. Across every stage—training, validation and testing—the linear models consistently outperformed the transformer-based models. Among them, NLinear performed the best overall, producing the most accurate predictions on unseen data.

“In time series, the structure is very different from language,” said Gokham. “Linear models can often find better connections in this kind of data than models that were originally designed for language.”

The Transformer and Informer models appeared to perform well during training, but their accuracy dropped sharply on the blind test set. According to Gokham, this points to a common problem called overfitting, where a model learns the training data too well but fails to generalize.

“They trained on the same amount of data,” he said, “but for some reason they overfit and didn’t perform well on the test set.”
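The kind of gap Gokham describes can be made concrete by comparing error on the training data with error on the blind test set: a model that fits the training data closely but degrades sharply on held-out data is overfitting. This is a generic illustration, not the paper’s evaluation code.

```python
import numpy as np

def mean_absolute_error(y_true: np.ndarray, y_pred: np.ndarray) -> float:
    """Average absolute forecasting error."""
    return float(np.mean(np.abs(y_true - y_pred)))

def generalization_gap(train_true, train_pred, test_true, test_pred) -> float:
    """Blind-test error minus training error.
    A large positive gap is the classic signature of overfitting."""
    return (mean_absolute_error(test_true, test_pred)
            - mean_absolute_error(train_true, train_pred))
```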

Another key advantage of linear models is interpretability. In practical settings like energy management, engineers need to understand why a model makes certain predictions.

“With linear models, it’s much easier to see which variables are having an impact,” said Gokham. “Large models are much more complicated, especially for time-series data.”
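The interpretability point can be illustrated with an ordinary linear regression: the fitted coefficients directly show how much each exogenous input moves the prediction. The feature names and synthetic data below are hypothetical examples, not fields from the actual dataset.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Hypothetical exogenous inputs: outdoor temperature, hour of day, HVAC setpoint.
feature_names = ["outdoor_temp", "hour_of_day", "setpoint"]
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 3))
y = 0.8 * X[:, 0] + 0.1 * X[:, 1] + 0.5 * X[:, 2] + rng.normal(scale=0.1, size=500)

model = LinearRegression().fit(X, y)
for name, coef in zip(feature_names, model.coef_):
    # Each coefficient is the predicted change in indoor temperature per
    # unit change in that input, holding the others fixed.
    print(f"{name}: {coef:+.3f}")
```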

Even Autoformer, which tries to build knowledge of seasonal patterns directly into a transformer-style model, could not match the performance of the linear baselines. For Gokham, this reinforces the idea that simpler designs can sometimes be more reliable. To ensure his conclusions were robust, he followed a widely used evaluation framework that includes many different models.

“Because we tested several linear models and several transformer models, we can make real conclusions about performance,” he said. “If you compare just one model against another, it’s harder to know what’s really going on.”

Beyond presenting his own research, Gokham found NeurIPS to be an inspiring experience. He connected with researchers exploring ideas like frequency-based methods for AI and spoke with a developer from Meta about their LMFusion framework for combining multiple types of data. Looking ahead, Gokham believes transformer models still have a role to play, but not by default.

“If results are similar, linear models can be preferable,” he said, citing their lower computational cost and better interpretability. “Transformers may be more useful when models are fine-tuned specifically for time-series data.”
