
Summary

MIT and Brown University have enhanced the Northstar data analytics platform with a Virtual Data Scientist (VDS) feature, enabling users without technical expertise to perform complex machine learning tasks through a drag-and-drop interface.

Abstract

The Northstar platform, a collaboration between MIT and Brown University, has been upgraded to include an AutoML-based component known as Virtual Data Scientist (VDS). This enhancement allows users, even those without programming or statistical backgrounds, to intuitively explore data and construct machine learning models using a simple drag-and-drop interface on touchscreen devices. The VDS feature is designed to democratize data science by assisting users in generating predictive models for various applications, such as medical diagnostics, sales forecasting, and inventory management. The system's estimation engine processes samples of the data progressively to return high-quality results within seconds, and future updates aim to include alerts for potential data bias or errors. When evaluated on 300 real-world datasets, the platform ranked among the fastest interactive AutoML tools.

Opinions

  • The authors believe that the Northstar platform's VDS feature can significantly lower the barrier to entry for AI and machine learning, making these technologies more accessible to a broader audience.
  • The researchers are optimistic about the future potential of the platform, particularly with the addition of features that could automatically detect and alert users to data bias or errors.
  • The article suggests that VDS can empower professionals across various industries, such as healthcare and business, to leverage machine learning for their specific needs without the necessity of hiring technical consultants.
  • The performance of the Northstar platform with VDS is considered impressive, as it ranks among the fastest interactive AutoML tools, thanks to its custom estimation engine.

MIT Drag-and-Drop Data Analytics: Machine Learning for Everyone

From Andrew Ng’s “AI for everyone” courses on Coursera to tech giants’ open-sourced tools that lower the tech bar for building machine learning models, we are seeing a wide range of efforts aimed at simplifying AI to make it accessible to everyone.

Northstar is an interactive data science cloud platform introduced last year by MIT and Brown University. It enables users without programming experience or a background in statistics to explore and mine data through an intuitive black-and-white user interface on touchscreen devices such as smartphones, tablets or interactive whiteboards. The drag-and-drop interface lets users discover patterns in the data and build machine learning pipelines.

MIT and Brown have now upgraded the Northstar platform with an AutoML-based component called Virtual Data Scientist (VDS), which helps users generate machine learning models to run prediction tasks on datasets. VDS was introduced in the paper “Democratizing Data Science through Interactive Curation of ML Pipelines,” presented this week at the ACM SIGMOD conference in Amsterdam.

It’s believed that VDS could be used, for example, by doctors for disease diagnosis, by business owners for sales forecasts, and even by coffee shop owners for inventory planning, all without requiring a data science background or the hiring of machine learning consultants.

Users can also run predictive analytics tasks with VDS via models customized to their specific objectives, such as data prediction, image classification, or analyzing complex graph structures. For instance, if medical researchers want to predict potential blood disease in patients, they could simply drag and drop “AutoML” from the list of algorithms in the “operators” box on the screen and then add the “blood” feature from under the “target” tab. The system will then automatically recommend the best machine learning pipelines for the task, along with their respective error rates, structure, computations, and so on.
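The recommendation step described above can be pictured with a toy sketch. This is not Northstar's or VDS's code: the three candidate "pipelines", the synthetic single-feature dataset, and every function name below are hypothetical, chosen only to show the general idea of fitting several candidates and ranking them by held-out error rate.

```python
import random

def majority_pipeline(train):
    """Baseline: always predict the most common label seen in training."""
    labels = [label for _, label in train]
    guess = max(set(labels), key=labels.count)
    return lambda value: guess

def threshold_pipeline(train):
    """Predict 1 when the feature exceeds the midpoint between the class means.

    Assumes the positive class has the larger mean, as in the toy data below.
    """
    pos = [value for value, label in train if label == 1]
    neg = [value for value, label in train if label == 0]
    cut = (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2
    return lambda value: 1 if value > cut else 0

def nearest_neighbor_pipeline(train):
    """Predict the label of the closest training row."""
    return lambda value: min(train, key=lambda row: abs(row[0] - value))[1]

def error_rate(model, rows):
    """Fraction of rows the model labels incorrectly."""
    wrong = sum(1 for value, label in rows if model(value) != label)
    return wrong / len(rows)

def recommend_pipelines(rows, candidates, holdout=0.3, seed=0):
    """Fit each candidate on a training split and rank by held-out error rate."""
    rng = random.Random(seed)
    shuffled = rows[:]
    rng.shuffle(shuffled)
    split = int(len(shuffled) * (1 - holdout))
    train, test = shuffled[:split], shuffled[split:]
    ranked = [(error_rate(build(train), test), name)
              for name, build in candidates.items()]
    return sorted(ranked)  # lowest error rate first

def make_toy_rows(n=200, seed=1):
    """Synthetic (feature, label) rows standing in for a real target column."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        label = rng.randint(0, 1)
        rows.append((rng.gauss(2.0 if label else 0.0, 0.5), label))
    return rows

CANDIDATES = {
    "majority": majority_pipeline,
    "threshold": threshold_pipeline,
    "nearest_neighbor": nearest_neighbor_pipeline,
}

ranking = recommend_pipelines(make_toy_rows(), CANDIDATES)
for err, name in ranking:
    print(f"{name:<17} held-out error rate: {err:.1%}")
```

A real AutoML system searches a far larger space of preprocessing steps, models and hyperparameters; the sketch keeps only the "try candidates, report error rates, recommend the best" loop that the article describes.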

Researchers evaluated VDS on 300 real-world datasets, where it ranked among the fastest interactive AutoML tools thanks to its custom “estimation engine.” This engine sits between the interface and the cloud and automatically creates representative samples from a dataset, which can be processed progressively to produce high-quality results in seconds.
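To make the idea of progressive processing concrete, here is a minimal sketch under stated assumptions. It does not reproduce Northstar's estimation engine, whose internals the article does not describe. It assumes uniform random sampling and a simple statistic (the mean), and the function name `progressive_mean` and its parameters are hypothetical. The estimate is refined on growing samples and the loop stops once the standard error is small enough, so a usable answer appears long before the full dataset is scanned.

```python
import random
import statistics

def progressive_mean(values, start=100, growth=2, max_stderr=0.2, seed=0):
    """Yield (sample_size, mean, standard_error) on growing random samples.

    Stops when the standard error drops to max_stderr or the whole
    dataset has been used.
    """
    rng = random.Random(seed)
    shuffled = values[:]
    rng.shuffle(shuffled)  # prefixes of a shuffled list are uniform random samples
    size = min(start, len(shuffled))
    while True:
        sample = shuffled[:size]
        mean = statistics.fmean(sample)
        stderr = statistics.stdev(sample) / len(sample) ** 0.5
        yield size, mean, stderr
        if stderr <= max_stderr or size == len(shuffled):
            return
        size = min(size * growth, len(shuffled))

# Synthetic data standing in for one numeric column of a large dataset.
data_rng = random.Random(42)
data = [data_rng.gauss(50.0, 10.0) for _ in range(50_000)]

steps = list(progressive_mean(data, max_stderr=0.2))
for size, mean, stderr in steps:
    print(f"n={size:>6}  mean={mean:7.3f}  +/- {stderr:.3f}")
```

Each printed row is an intermediate answer that an interface could display immediately and then refine, which is the trade-off an interactive tool makes between latency and precision.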

The researchers say they hope to add a feature that automatically alerts users to potential data bias or errors.

The paper Democratizing Data Science through Interactive Curation of ML Pipelines can be found here. The project demo and test installation/collaboration information is here.

Author: Yuqing Li | Editor: Michael Sarazen
