By Dave DeFusco
Diffusion models have become the artistic and scientific darlings of artificial intelligence. They power image generators like DALL·E and Stable Diffusion, producing stunning, lifelike pictures from simple text prompts. But a recent study led by researchers at the Katz School of Science and Health asks a fundamental question: Are these models really creating something new or just rearranging what they’ve already seen?
That question lies at the heart of a study published in the journal Information Fusion by Lakshmikar Polamreddy, a Ph.D. student in mathematics at the Katz School, and Jialu Li, a student in the M.S. in Artificial Intelligence program. Their research challenges a popular belief that diffusion models “imagine” in the same way humans do.
“Diffusion models have been the state of the art for image and video generation,” said Polamreddy. “We wanted to test whether they really generate new data or not. My assumption was that they don’t, that they just replicate the existing content in different forms.”
Diffusion models work by gradually turning random noise into a detailed image, learning from large datasets of real pictures. Because their results can be impressively realistic, it’s easy to assume they’re generating novel ideas. But Polamreddy’s team found otherwise. When they asked a model trained on tens of thousands of images to produce new ones, almost all the results were variations of existing data.
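For readers who want to see the mechanics, the sketch below (Python with NumPy) walks a toy sample backward through a DDPM-style denoising loop. The `denoise` function is a placeholder for a trained network, and the noise-schedule values are illustrative; none of this code comes from the study itself.

```python
# Minimal sketch of diffusion sampling on a toy 1-D "image".
# denoise() stands in for a trained network that predicts the noise at each step.
import numpy as np

rng = np.random.default_rng(0)

T = 1000                                   # number of diffusion steps
betas = np.linspace(1e-4, 0.02, T)         # illustrative DDPM-style noise schedule
alphas = 1.0 - betas
alpha_bars = np.cumprod(alphas)

def denoise(x_t, t):
    """Placeholder for a trained network that predicts the noise in x_t at step t."""
    return np.zeros_like(x_t)              # a real model would return its noise estimate

# Start from pure Gaussian noise and iteratively denoise it into a sample.
x = rng.standard_normal(64)                # a toy 64-pixel "image"
for t in reversed(range(T)):
    eps_hat = denoise(x, t)
    coef = betas[t] / np.sqrt(1.0 - alpha_bars[t])
    x = (x - coef * eps_hat) / np.sqrt(alphas[t])
    if t > 0:                              # add back a little noise except at the final step
        x += np.sqrt(betas[t]) * rng.standard_normal(x.shape)

print(x[:5])                               # the final x is the generated sample
```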
“If I generate 10,000 images,” he said, “maybe only 10 of them contain truly new features not seen in the training data. Those 10 are what we call ‘diverse samples.’”
These diverse samples are special. They contain elements that are different but relevant to the original data. Polamreddy distinguishes them from so-called out-of-distribution samples, which are completely unrelated.
“If I give the model brain images and ask it to generate more, but it produces a heart image, that’s out of distribution,” he said. “We discard those. But if it gives a new kind of brain image with a slightly different structure, that’s a diverse sample and it’s valuable.”
The team’s most striking finding came from applying their method to medical images, where data scarcity is a real problem. Hospitals often can’t share patient scans because of privacy concerns, and collecting new images for training AI diagnostic systems is expensive and time-consuming.
That’s where data augmentation—creating additional training images artificially—comes in. Most augmentation techniques, like flipping or rotating existing images, don’t add new information. Polamreddy’s study suggests that even a small number of truly diverse samples can significantly improve diagnostic models.
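The difference is easy to see in code. In the hedged sketch below, flipping and rotating simply reuse the same pixels, while appending a handful of diverse generated samples, represented here by a placeholder array, actually enlarges what the training set contains. None of the arrays, sizes, or labels come from the paper.

```python
# Contrast standard augmentation (transforms of existing images) with adding
# genuinely new "diverse samples". All data here is random and illustrative.
import numpy as np

rng = np.random.default_rng(0)
train_images = rng.random((1000, 64, 64))          # stand-in for a real training set
train_labels = rng.integers(0, 2, size=1000)

# Classic augmentation: every new image is a transform of an existing one,
# so no new information enters the dataset.
flipped = train_images[:, :, ::-1]                 # horizontal flips
rotated = np.rot90(train_images, k=1, axes=(1, 2)) # 90-degree rotations
augmented = np.concatenate([train_images, flipped, rotated])

# The study's alternative: append the rare generated images that contain
# features absent from the training data (here just a placeholder array).
diverse_samples = rng.random((10, 64, 64))         # e.g., 10 kept out of 10,000 generated
diverse_labels = rng.integers(0, 2, size=10)

enriched_images = np.concatenate([train_images, diverse_samples])
enriched_labels = np.concatenate([train_labels, diverse_labels])
print(augmented.shape, enriched_images.shape)
```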
“Data augmentation is especially critical in the medical field,” said Polamreddy. “Because of privacy concerns, we don’t have enough data. Generating diverse samples with new content helps counter that scarcity and improves downstream tasks, like image classification and disease diagnosis.”
Using chest X-rays and breast ultrasound images, the researchers trained an image-classification model with and without diverse samples. The results were striking: adding diverse samples improved classification accuracy by several percentage points, sometimes more than five points higher than models trained on standard generated images.
“Even a few diverse samples can make a big difference,” said Jialu Li, co-author of the study. “They diversify the training data and help the model generalize better, which means it performs more accurately on real-world medical images.”
To measure novelty, the team turned to information theory, a mathematical framework that studies how information is stored and transmitted. They used metrics like entropy and mutual information to see whether the generated images truly contained new data.
“If there’s no relationship between the training and generated images, entropy will be high,” said Polamreddy. “These measurements help us see whether there’s really new information or just a repetition of what the model already knows.”
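As a rough illustration of that idea, the sketch below computes histogram-based entropy and mutual information between pairs of images. The study's exact formulations may differ, and the images here are random placeholders; the point is only that a near-copy shares far more information with a training image than an unrelated one does.

```python
# Histogram-based entropy and mutual information between images (illustrative only).
import numpy as np

def entropy(x, bins=32):
    """Shannon entropy (in bits) of an image's pixel-intensity histogram."""
    counts, _ = np.histogram(x, bins=bins, range=(0.0, 1.0))
    p = counts / counts.sum()
    p = p[p > 0]
    return -np.sum(p * np.log2(p))

def mutual_information(x, y, bins=32):
    """Mutual information (in bits) between the pixel intensities of two images."""
    joint, _, _ = np.histogram2d(x.ravel(), y.ravel(), bins=bins, range=[(0, 1), (0, 1)])
    pxy = joint / joint.sum()
    px = pxy.sum(axis=1, keepdims=True)
    py = pxy.sum(axis=0, keepdims=True)
    nz = pxy > 0
    return np.sum(pxy[nz] * np.log2(pxy[nz] / (px @ py)[nz]))

rng = np.random.default_rng(0)
train_img = rng.random((64, 64))
near_copy = np.clip(train_img + 0.01 * rng.standard_normal((64, 64)), 0, 1)
unrelated = rng.random((64, 64))

# The near-copy shares much more information with the original than the unrelated image.
print(entropy(train_img), mutual_information(train_img, near_copy), mutual_information(train_img, unrelated))
```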
Their conclusion is that ideal diffusion models don’t create new information at all. Any new content comes from small imperfections in how the model reverses the diffusion process; it is, essentially, a lucky byproduct of noise and complexity.
To find diverse samples, the researchers had to take a brute-force approach. They generated thousands of images repeatedly, filtering each batch to identify the rare few that contained novel features.
“If I want 100 diverse samples,” said Polamreddy, “I might have to run the model many times. Each iteration gives me one or two, so I keep going until I get what I need.”
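In pseudocode terms, the procedure looks like the loop below. Both `generate_batch` and `is_diverse` are hypothetical stand-ins for the trained diffusion model and the team's information-theoretic filter, not functions from the study.

```python
# Brute-force search: keep generating batches and retain only the rare images
# that a novelty filter flags as diverse. All functions here are placeholders.
import numpy as np

rng = np.random.default_rng(0)

def generate_batch(n=1000):
    """Stand-in for sampling n images from a trained diffusion model."""
    return rng.random((n, 64, 64))

def is_diverse(image):
    """Stand-in novelty check; the study filters with information-theoretic measures."""
    return image.mean() > 0.51              # toy criterion that only ~1% of samples pass

needed, diverse = 100, []
while len(diverse) < needed:
    batch = generate_batch()
    diverse.extend(img for img in batch if is_diverse(img))

diverse = diverse[:needed]
print(f"collected {len(diverse)} diverse samples")
```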
That method, while effective, is slow. The team’s next goal is to design “diversity-aware” diffusion models, ones that can produce semantically rich, varied images in a single pass.
“We need better conditioning in the diffusion process,” said Polamreddy. “That’s how we can teach models to generate more diverse samples automatically instead of relying on brute force.”