MIT research suggests AI can learn to identify images using synthetic data

16 Mar 2022

The MIT researchers said their generative model requires less memory to store than datasets, which can cost millions of dollars to create.

MIT researchers have found a way to train image classification models using synthetic data, and they claim the results can rival models trained on real data.

In the study, the team created a special type of machine learning model to generate extremely realistic synthetic data, which can then train another model for vision-related tasks.

The researchers said that massive amounts of data are currently required to train a machine to perform image classification tasks, such as identifying damage in satellite photos following a natural disaster. However, the datasets required to train such a model can cost millions of dollars to generate.

The team said their special machine learning model, known as a generative model, requires far less memory to store or share than a dataset.

The team compared the results of their learning model, which used only this synthetic data, with those of several other image classification models that were trained using real data. The results suggest that their method can sometimes learn visual representations better than the other models.

“We knew that this method should eventually work; we just needed to wait for these generative models to get better and better,” lead author Ali Jahanian said. “But we were especially pleased when we showed that this method sometimes does even better than the real thing.”

AI training AI

The training process involves showing the generative model millions of images that contain objects in a particular class, such as cars or cats, so that it learns what these objects look like and can generate similar objects.

The researchers said these generative models can also learn how to transform the underlying data. For example, a model trained on images of cars can ‘imagine’ how a car would look in different scenarios that it did not observe during training. It could then create images that show a car in different poses, colours or sizes.
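As a rough illustration of how a generative model might produce several variations of one object, the Python sketch below decodes a handful of points sampled close together in a model's input (latent) space. It is not the researchers' code. The `toy_generator` function is a hypothetical stand-in for a real pretrained generator, and the use of small random latent perturbations, along with the `sample_views` and `cosine` helpers and every number used, is an assumption made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

LATENT_DIM, IMAGE_DIM = 32, 256
# Hypothetical stand-in for a pretrained generator: a fixed random map
# from a latent vector to a flat "image" vector. A real generator would
# be a trained neural network.
WEIGHTS = rng.normal(size=(LATENT_DIM, IMAGE_DIM)) / np.sqrt(LATENT_DIM)


def toy_generator(latent):
    """Decode one latent vector, or a batch of them, into toy 'images'."""
    return np.tanh(latent @ WEIGHTS)


def sample_views(anchor_latent, num_views, noise_scale=0.1):
    """Create several 'views' by decoding latents sampled near the anchor."""
    noise = noise_scale * rng.normal(size=(num_views, LATENT_DIM))
    return toy_generator(anchor_latent + noise)


def cosine(u, v):
    """Cosine similarity between two flat vectors."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))


# Two unrelated latent vectors play the role of two different objects.
object_a = rng.normal(size=LATENT_DIM)
object_b = rng.normal(size=LATENT_DIM)
views_a = sample_views(object_a, num_views=4)
views_b = sample_views(object_b, num_views=4)

same_object_similarity = cosine(views_a[0], views_a[1])
different_object_similarity = cosine(views_a[0], views_b[0])
print(same_object_similarity, different_object_similarity)
```

In this toy setup, two views decoded from nearby latents resemble each other far more than views of unrelated latents do, which is the property a downstream learner can exploit.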

Having multiple views of the same image is important for a technique called contrastive learning, where a machine learning model is shown many unlabelled images to learn which pairs are similar or different.
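For readers who want a concrete picture of contrastive learning, the Python sketch below shows one widely used contrastive objective, an InfoNCE-style loss. The article does not specify which loss the MIT team used, so the choice of loss, the `info_nce_loss` name, the temperature value and the random toy embeddings are all assumptions for illustration.

```python
import numpy as np


def info_nce_loss(view_a, view_b, temperature=0.1):
    """InfoNCE-style loss: row i of view_a and row i of view_b form a pair."""
    # L2-normalise rows so that dot products are cosine similarities.
    a = view_a / np.linalg.norm(view_a, axis=1, keepdims=True)
    b = view_b / np.linalg.norm(view_b, axis=1, keepdims=True)
    logits = a @ b.T / temperature  # (n, n) matrix of pairwise similarities
    # Numerically stable log-softmax over each row.
    logits = logits - logits.max(axis=1, keepdims=True)
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    # The matching pairs sit on the diagonal.
    return float(-np.mean(np.diag(log_probs)))


rng = np.random.default_rng(0)
# Toy embeddings standing in for the output of an image encoder.
embeddings = rng.normal(size=(8, 16))
# A second "view" of each item: the same embedding plus a little noise.
matched = embeddings + 0.05 * rng.normal(size=embeddings.shape)
# Shift the rows so that every pair is deliberately mismatched.
mismatched = np.roll(matched, shift=1, axis=0)

loss_matched = info_nce_loss(embeddings, matched)
loss_mismatched = info_nce_loss(embeddings, mismatched)
print(loss_matched, loss_mismatched)
```

The loss is low when paired rows really are two views of the same item and high when they are not, so minimising it pushes a model to give similar representations to different views of one object.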

When the researchers connected a pretrained generative model to a contrastive learning model, the contrastive learner could tell the generative model to produce different images of an object, and then learn to identify that object from multiple angles.

“This was like connecting two building blocks,” Jahanian said. “Because the generative model can give us different views of the same thing, it can help the contrastive method to learn better representations.”

Jahanian cautioned that there are some limitations to using generative models, as they can reveal source data, which can pose privacy risks. They could also amplify biases in the datasets they are trained on if they aren’t properly audited. The research team plans to address those limitations in future studies.

Leigh Mc Gowran is a journalist with Silicon Republic