Stop One-Hot Encoding your Time-based Features
Essential guide to feature transformation for cyclic features

Feature engineering is an essential component of the data science model development pipeline. A data scientist spends most of their time analyzing and preparing features to train a robust model. A raw dataset consists of various types of features, including categorical, numerical, and time-based features.
A machine learning or deep learning model understands only numerical vectors, so categorical and time-based features need to be encoded into a numerical format. There are various feature engineering strategies for encoding categorical features, including One-Hot Encoding, Count Vectorizer, and many more.
Time-based features include the day of month, day of week, day of year, and time. These features are cyclic or seasonal in nature. In this article, we will discuss why one-hot encoding (also called dummy encoding) should be avoided for cyclic features, and then discuss and implement a better, more elegant solution.
Why NOT One-Hot Encoding?
One-hot encoding is a feature encoding strategy that converts a categorical feature into a numerical vector. For each feature value, the one-hot transformation creates a new binary feature marking the presence or absence of that value.

One-hot encoding creates a d-dimensional vector for each instance, where d is the number of unique feature values in the dataset.
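As a rough illustration of that definition, the sketch below one-hot encodes a tiny, made-up `day_of_week` column with pandas. The column name, the `dow` prefix, and the values are assumptions chosen only for the example.

```python
import pandas as pd

# Hypothetical toy column; the values are made up for illustration.
df = pd.DataFrame({"day_of_week": ["Mon", "Tue", "Wed", "Mon", "Sun"]})

# One new 0/1 column is created per unique value of the feature.
one_hot = pd.get_dummies(df["day_of_week"], prefix="dow", dtype=int)

# d is the number of unique feature values, which is also the column count.
d = df["day_of_week"].nunique()
print(one_hot)
print(one_hot.shape[1] == d)
```

Each row has exactly one column set to 1, and the width of the encoded frame grows with the number of distinct values rather than with the number of rows.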
For a feature having a large number of unique feature values or categories, one-hot encoding is not a great choice. There are various other techniques to encode the categorical (ordinal or nominal) features.
Time-based features such as day of month, day of week, and day of year have a cyclic nature and many feature values. One-hot encoding the day of month feature results in a vector of up to 31 dimensions, and day of year results in a vector of up to 366 dimensions. One-hot encoding these features is not a great choice, as it may lead to the curse of dimensionality.
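To make the dimensionality concrete, here is a small sketch that counts the one-hot columns produced for every calendar day of one leap year. The year 2024 and the `dom`/`doy` prefixes are arbitrary assumptions for the example.

```python
import pandas as pd

# Every calendar day of a leap year (2024 is an arbitrary choice).
dates = pd.Series(pd.date_range("2024-01-01", "2024-12-31", freq="D"))

# Extract the two time-based features from the timestamps.
day_of_month = dates.dt.day
day_of_year = dates.dt.dayofyear

# One-hot encode each feature and count the resulting columns.
dom_one_hot = pd.get_dummies(day_of_month, prefix="dom")
doy_one_hot = pd.get_dummies(day_of_year, prefix="doy")

print(dom_one_hot.shape[1])  # 31
print(doy_one_hot.shape[1])  # 366
```

A single timestamp column already expands into several hundred sparse columns, before any other feature in the dataset is considered.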
Idea:
An elegant way to encode these cyclic features is to use basic trigonometry. In this article, we will encode the cyclic features by computing the sine and cosine of each feature value.
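A minimal sketch of this idea follows, assuming the common scaling in which a value is first converted to an angle of 2π · value / period before taking the sine and cosine. The helper names `encode_cyclic` and `circle_distance`, the `hour` column, and the period of 24 are hypothetical choices for illustration, not something prescribed by the text above.

```python
import numpy as np
import pandas as pd

def encode_cyclic(values, period):
    """Map values in [0, period) to (sin, cos) coordinates on the unit circle."""
    angle = 2 * np.pi * np.asarray(values, dtype=float) / period
    return np.sin(angle), np.cos(angle)

def circle_distance(a, b, period):
    """Euclidean distance between the encoded points of two values."""
    sin_a, cos_a = encode_cyclic(a, period)
    sin_b, cos_b = encode_cyclic(b, period)
    return float(np.hypot(sin_a - sin_b, cos_a - cos_b))

# Hypothetical hour-of-day feature with a period of 24.
df = pd.DataFrame({"hour": [0, 6, 12, 18, 23]})
df["hour_sin"], df["hour_cos"] = encode_cyclic(df["hour"], period=24)
print(df)

# Hour 23 ends up closer to hour 0 than hour 12 does.
print(circle_distance(23, 0, 24) < circle_distance(0, 12, 24))  # True
```

Both columns are needed: the sine alone maps two different hours to the same value (for example, hours 0 and 12 both give 0), while the (sin, cos) pair identifies each position on the cycle uniquely and uses only two dimensions regardless of the period.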