A New AutoML Model With Mind-Boggling Results

Summary

The website content discusses recent advancements in data science and AI, focusing on a new AutoML model called TabPFN and its impressive performance, as well as other developments like multiplayer Stable Diffusion and a video created using Stable Diffusion.

Abstract

The article provides a concise overview of the latest developments in the fields of data science and artificial intelligence. It highlights the introduction of a groundbreaking AutoML model known as TabPFN, which boasts rapid training and prediction capabilities, requires no hyperparameter tuning, and outperforms gradient boosting models within time constraints. Despite its limitations in data points and classes, the model's ability to perform approximate Bayesian inference and its potential applications in online and embedded systems are noted as significant advancements. Additionally, the article touches on interactive AI applications, such as a multiplayer Stable Diffusion platform and a captivating video generated by iteratively applying Stable Diffusion, showcasing the versatility and creative potential of AI in various domains.

Opinions

The author expresses that the TabPFN model could be a game-changer in the field of data science, simplifying model selection and enabling real-time applications.
There is excitement about the future of AI and data science, with expectations that improvements to models like TabPFN will develop rapidly.
The author is intrigued by the multiplayer Stable Diffusion experiment, suggesting it as an innovative way to engage with AI in a collaborative environment.
The video created with Stable Diffusion is described as "amazing," indicating the author's admiration for the creative use of AI in generating dynamic visual content.
The author seems optimistic about the broader implications of these advancements, particularly in the realm of tabular data, suggesting that pre-trained models may become more prevalent.

The Incredible AutoML model

A tabular classification model called TabPFN was recently published and the results are absolutely mind-boggling. In summary:

Runs (i.e. trains and predicts) in less than one second (!) on a GPU

Requires no hyperparameters

Beats hyperparameter-tuned gradient boosting given time constraints on a benchmark of datasets (up to 2000 samples in each dataset)

Is of relatively small size with less than 26M parameters

Bayesian and thus models uncertainty

The only catch is that it can only take up to 1,000 training points, 100 features and 10 classes.
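
Those limits are easy to check before handing a dataset to the model. Here is a minimal sketch, assuming the limit on points applies to the training rows. The constant and function names are hypothetical and not part of any TabPFN package.

```python
# Limits as quoted in this article; the names are illustrative only.
MAX_TRAIN_POINTS = 1_000
MAX_FEATURES = 100
MAX_CLASSES = 10


def limit_violations(X_train, y_train):
    """Return a list of reasons why the data exceeds the quoted limits."""
    problems = []
    n_points = len(X_train)
    n_features = len(X_train[0]) if X_train else 0
    n_classes = len(set(y_train))

    if n_points > MAX_TRAIN_POINTS:
        problems.append(f"{n_points} training points > {MAX_TRAIN_POINTS}")
    if n_features > MAX_FEATURES:
        problems.append(f"{n_features} features > {MAX_FEATURES}")
    if n_classes > MAX_CLASSES:
        problems.append(f"{n_classes} classes > {MAX_CLASSES}")
    return problems
```

An empty list means the dataset fits; otherwise you would need to subsample rows or drop features first.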

It is a meta-learned transformer model trained on synthetic data to perform approximate Bayesian inference. It takes as input training data (features and labels) and test data (without labels) and then outputs predictions/probabilities in a single forward pass (i.e. no tuning with backpropagation). According to the authors, it is unlikely to overfit.
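
To make the "single forward pass" idea concrete, here is a toy stand-in with the same call pattern. It is not TabPFN: the real model is a pre-trained transformer, while this sketch swaps in a softmax-weighted vote over the training labels. The class name and the `temperature` parameter are made up for illustration. What it does share with the description above is that `fit` only stores the training data, and all the work happens when the test points arrive.

```python
import math


class InContextClassifier:
    """Toy stand-in: fit() stores the data, predict_proba() does all the work."""

    def __init__(self, temperature=1.0):
        # Hypothetical knob controlling how sharply nearby points dominate.
        self.temperature = temperature

    def fit(self, X_train, y_train):
        # No gradient steps and no tuning: the training set is kept as context.
        self.X_train = [[float(v) for v in row] for row in X_train]
        self.y_train = list(y_train)
        self.classes_ = sorted(set(self.y_train))
        return self

    def predict_proba(self, X_test):
        all_probs = []
        for x in X_test:
            # Attention-like scores: negative squared distance to each training row.
            scores = [
                -sum((a - b) ** 2 for a, b in zip(x, row)) / self.temperature
                for row in self.X_train
            ]
            # Numerically stable softmax over the training rows.
            top = max(scores)
            weights = [math.exp(s - top) for s in scores]
            total = sum(weights)
            # Each training row votes for its own label with its softmax weight.
            mass = {c: 0.0 for c in self.classes_}
            for w, label in zip(weights, self.y_train):
                mass[label] += w / total
            all_probs.append([mass[c] for c in self.classes_])
        return all_probs

    def predict(self, X_test):
        # Pick the class with the highest probability for each test point.
        return [
            self.classes_[max(range(len(p)), key=p.__getitem__)]
            for p in self.predict_proba(X_test)
        ]
```

Calling `InContextClassifier().fit(X_train, y_train).predict_proba(X_test)` returns one probability per class for each test row, which mirrors the "training data plus unlabeled test data in, probabilities out" flow described above.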

This could be a game-changer. The limited number of points is of course a large constraint, but this is a great step in the right direction, and future improvements will likely come quickly. Not only could this simplify model selection for data scientists, but its speed and small size also mean it can be applied in many circumstances online without supervision, such as on an embedded device. We’ve seen pre-trained models for images and text, but perhaps these will now become more prominent in the area of tabular data. Personally, I’m very excited about this.

Multiplayer Stable Diffusion

There have been many interactive 2D worlds where each person performs simple actions that together build up a shared world on a grid. For instance, on yourworldoftext (beware, it might not be moderated extensively) you can type text on a grid with other people, and in 2017 Reddit ran an experiment where each user could change the color of a single pixel on a grid before having to wait to place another.

Now a new experiment has been created involving Stable Diffusion! It is hosted as a Hugging Face Space, and the creator announced it in a tweet. You simply choose the position of a box on the grid, type a prompt and then let it generate an image at that position. As of writing this, the grid seems to be invaded by Shrek…

Weekly Findings In Data Science and AI

This is a summary of interesting findings in data science and AI I’ve discovered recently. Hopefully, this can be a recurring type of article that I can publish every week. Let’s get into it.

