2022 Anthology on Statistics and Machine Learning

My year in review of blogging on Statistics and Machine Learning

2022 was the year of the tiger. Cover photo generated by the author using the AI tool Midjourney (licensed under the Creative Commons Noncommercial 4.0 asset license).

It’s the last day of 2022! The first day of 2022 feels as if it were just yesterday, although on 1 January 2022 the end of the year felt farther away than that first day feels now. The past always seems closer than the future. The human mind is mysterious in ways that the mind itself cannot fathom. That same mind has also created stellar artificial intelligence tools in recent years, such as Stable Diffusion, ChatGPT, GitHub Copilot, and AlphaFold.

While tools like ChatGPT and AlphaFold have immense use cases, much of the discovery has also been fuelled by cheaper computing, larger investment, and deep pockets. I even tried running Stable Diffusion on my laptop with 8 GB of RAM and an Intel Core i7 processor, and there was still no output after an hour.

In such a scenario, my larger focus has been on core statistics and the theoretical frameworks that have paved the path for better applications and tools. I picked some of my favorite topics from statistics that I learned during graduate school at the University of Arizona and presented them to readers at the sophomore level and beyond, packaged with coding examples in bite-size, easy-to-digest short reads.

Variable Transformation and New Distribution

Variable transformation in statistics is the precursor to deepfakes and normalizing flows, yet it is often overlooked by overzealous machine learning enthusiasts.

My goal here was to present important concepts for beginners in this direction through the following articles:

Variable Transformation to Generate New Distributions: https://towardsdatascience.com/stat-stories-variable-transformation-to-generate-new-distributions-d4607cb32c30
Why is the Moment Generating Function Important?: https://towardsdatascience.com/stat-stories-why-is-moment-generating-function-important-25bbc17dad68
Common Families of Statistical Distributions (Part 1): https://towardsdatascience.com/stat-stories-common-families-of-statistical-distributions-part-1-2b704dd6a808
Common Families of Statistical Distributions (Part 2): https://towardsdatascience.com/stat-stories-common-families-of-statistical-distributions-part-2-4bdea86c3132
Multivariate transformation for statistical distributions: https://towardsdatascience.com/stat-stories-multivariate-transformation-for-statistical-distributions-7077a374b3b4
Normalizing Flows as an Application of Variable Transformation: https://towardsdatascience.com/stat-stories-normalizing-flows-as-an-application-of-variable-transformation-7b7beda7b03b
Delta Method in Statistics: https://towardsdatascience.com/stat-stories-delta-method-in-statistics-bd681fbbf037

Statistical and Information Geometry

Another topic that interests me is somewhat obscure but remains useful for high-dimensional data analysis and the burgeoning field of quantum information theory: topology for statistics and machine learning, along with information geometry.

Manifold Alignment: https://towardsdatascience.com/manifold-alignment-c67fc3fc1a1c
The Gromov–Wasserstein Distance: https://towardsdatascience.com/the-gromov-wasserstein-distance-835c39d4751d
Mystical World of Information Geometry: https://towardsdatascience.com/mystical-world-of-information-geometry-16b4637d89e8

Of course, this is just a prologue to what is to come in 2023.

Foundation of Reinforcement Learning

The year closes with exploring and developing the foundations of reinforcement learning, driven especially by autonomous driving (pun intended).

Markov States, Markov Chain, and Markov Decision Process: https://towardsdatascience.com/foundational-rl-markov-states-markov-chain-and-markov-decision-process-be8ccc341005
Solving Markov Decision Process: https://towardsdatascience.com/foundational-rl-solving-markov-decision-process-d90b7e134c0b
Dynamic Programming: https://towardsdatascience.com/foundational-rl-dynamic-programming-28f96f6fb40e
Value Iteration and Policy Iteration: https://readmedium.com/foundational-rl-value-iteration-and-policy-iteration-76251e47581b

A number of other topics in RL, such as proximal policy optimization, multi-agent RL, and their applications with coding examples, are still in development; I am hoping to release them soon.

In the end, when I look back, I see this as a beginning, where the idea is to be versatile by learning not merely how to use tools but how to develop them, and that requires learning how to learn. Becoming versatile and robust to breakthrough changes in technology requires going back to the foundations. The year of the rabbit, 2023, is about strengthening those foundations.
