MIT researchers develop new tool to predict the future using data

15 Apr 2022

The team said a powerful algorithm in their tool can transform multiple time-series datasets into a tensor, which is a multi-dimensional array of numbers. Image: Figure courtesy of the researchers/edited by MIT News

The team said this system is more accurate and efficient than state-of-the-art deep learning methods when predicting future values and filling in missing data points.

Researchers at MIT have made a tool that could help non-experts in various fields make predictive forecasts using collected data.

Making predictions using time-series data usually requires several data-processing steps and complex machine learning algorithms, which have steep learning curves.

This can put such predictions out of reach in the many fields that benefit from them, such as weather forecasting, estimating future stock prices or assessing a patient’s risk of developing a disease.

To make predicting future outcomes easier, the research team created a system that integrates prediction functionality on top of an existing time-series database.

The team said its new interface – called TspDB – takes care of the complex modelling so non-experts can easily generate predictions in seconds.

“Even as the time-series data becomes more and more complex, this algorithm can effectively capture any time-series structure out there,” senior author of the study Prof Devavrat Shah said. “It feels like we have found the right lens to look at the model complexity of time-series data.”

The researchers said a powerful algorithm at the heart of their tool can transform multiple time-series datasets into a tensor, or a multi-dimensional array of numbers. It can also analyse data that has more than one time-dependent variable. In a weather database, for example, temperature, dew point and cloud cover each depend on their own past values.
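To illustrate the tensor idea, the Python sketch below stacks three series into a three-dimensional array. It is an illustration under stated assumptions rather than the researchers' code: the function name `series_to_tensor`, the window length `rows` and the synthetic weather values are all hypothetical.

```python
import numpy as np

def series_to_tensor(series, rows):
    """Stack equal-length series into a tensor of shape (n_series, rows, cols).

    Hypothetical helper for illustration only; not part of TspDB.
    """
    data = np.asarray(series, dtype=float)
    n_series, length = data.shape
    if length % rows != 0:
        raise ValueError("series length must be a multiple of rows")
    cols = length // rows
    # Cut each series into consecutive, non-overlapping windows of length
    # `rows`; each window becomes one column of that series' matrix.
    return data.reshape(n_series, cols, rows).transpose(0, 2, 1)

# Synthetic (made-up) hourly readings for one day.
hours = np.arange(24)
temperature = 15 + 5 * np.sin(2 * np.pi * hours / 24)
dew_point = 10 + 3 * np.sin(2 * np.pi * hours / 24)
cloud_cover = 0.5 + 0.4 * np.cos(2 * np.pi * hours / 24)

tensor = series_to_tensor([temperature, dew_point, cloud_cover], rows=6)
print(tensor.shape)  # (3, 6, 4)
```

Here each of the three variables becomes a 6-by-4 matrix, and the matrices are stacked along a third axis so that a single model can look at all variables at once.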

The team said their model is more accurate and efficient than state-of-the-art deep learning methods when predicting future values and filling in missing data points.
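As a rough illustration of how a simple model can fill in missing points, the sketch below estimates gaps in a matrix of time-series windows from a low-rank approximation computed with a truncated singular value decomposition. This is a generic technique chosen for illustration, and the article does not say it is the method TspDB uses. The function name `low_rank_fill`, the chosen `rank` and the synthetic signal are assumptions.

```python
import numpy as np

def low_rank_fill(matrix, rank):
    """Fill NaN entries using a truncated SVD of the zero-filled matrix.

    Hypothetical helper for illustration only; not part of TspDB.
    """
    matrix = np.asarray(matrix, dtype=float)
    observed = ~np.isnan(matrix)
    if not observed.any():
        raise ValueError("matrix has no observed values")
    # Replace gaps with zeros, then rescale by the observed fraction to
    # compensate for the entries that were zeroed out.
    filled = np.where(observed, matrix, 0.0) / observed.mean()
    u, s, vt = np.linalg.svd(filled, full_matrices=False)
    # Keep only the `rank` largest singular components.
    approx = (u[:, :rank] * s[:rank]) @ vt[:rank]
    # Keep observed values; use the estimate only where data was missing.
    return np.where(observed, matrix, approx)

# Synthetic (made-up) periodic signal, cut into 5 windows of 8 samples.
t = np.arange(40)
signal = np.sin(2 * np.pi * t / 10)
page = signal.reshape(5, 8).T.copy()  # 8 rows x 5 columns, one window per column
page[2, 1] = np.nan                   # simulate two missing readings
page[5, 3] = np.nan

completed = low_rank_fill(page, rank=2)
print(np.isnan(completed).any())  # False
```

The estimates are approximate, and their quality depends on how well the data fits a low-rank structure and on the rank chosen.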

“One reason I think this works so well is that the model captures a lot of time-series dynamics, but at the end of the day, it is still a simple model,” study author Abdullah Alomar said. “When you are working with something simple like this, instead of a neural network that can easily overfit the data, you can actually perform better.”

Shah and his collaborators have been working on the problem of interpreting time-series data for years. Their next goal is to make this algorithm accessible to everyone.

The researchers are also exploring new algorithms that could be incorporated into TspDB, and plan to gather more feedback to see how they can improve the system’s functionality and user-friendliness.

“Our interest at the highest level is to make TspDB a success in the form of a broadly utilisable, open-source system,” Shah said. “Time-series data are very important, and this is a beautiful concept of actually building prediction functionalities directly into the database.”
