TrainDataHub



A Step-by-Step Guide to the EDA Process with Python Code

What is EDA?

Exploratory data analysis (EDA) is a thorough investigation of a data set, mainly through visualization plots, that data scientists carry out before building data mining models.

Why is EDA necessary for a data mining project?

Performing EDA reveals the main characteristics of a data set, such as the number of records and attributes, missing values and outliers, correlation between attributes, possible multicollinearity, the need for dimension reduction, skewness, distribution patterns, etc. EDA therefore guides how we clean the data set and helps us get valuable insights from the data we are working with.

EDA Steps with Python

  • Loading the Data set
import pandas as pd

df = pd.read_csv('/Users/users/Desktop/stroke.csv')
df

The output shows that the data set has 5110 rows and 12 columns.
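If you would rather read the dimensions directly than scroll through the printed table, df.shape returns them as a (rows, columns) tuple. The sketch below uses a small made-up DataFrame called toy (an assumption for illustration, not the stroke data) so that it runs on its own.

```python
import pandas as pd

# Hypothetical toy data standing in for the stroke data set.
toy = pd.DataFrame({
    "age": [67, 61, 80],
    "bmi": [36.6, 28.1, 32.5],
})

# shape is an attribute (no parentheses) holding (number of rows, number of columns).
n_rows, n_cols = toy.shape
print(f"{n_rows} rows and {n_cols} columns")
```

On the stroke data, the same call should report the 5110 rows and 12 columns noted above.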

  • Finding out Data Types of Variables and Null Values
df.info() 

df.info() lists the name and data type of each variable, along with its count of non-null values. For example, the bmi variable has only 4909 non-null records. Since the data set has 5110 rows, that tells us there are 201 missing values in the bmi variable.
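The arithmetic behind that reading (5110 rows minus 4909 non-null records gives 201 missing values) can be reproduced in code. Here is a minimal sketch on a made-up DataFrame named toy (an assumption for illustration); count() returns the same non-null figures that df.info() prints.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data standing in for the stroke data set.
toy = pd.DataFrame({
    "age": [67, 61, 80, 49, 79],
    "bmi": [36.6, np.nan, 32.5, np.nan, 24.0],
})

# Non-null entries per column, the figure shown in the df.info() output.
non_null = toy.count()

# Missing values per column = total rows - non-null entries.
missing = len(toy) - non_null
print(missing)
```

In this toy example bmi has 2 missing values and age has none.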

  • Another Way to Find the Total Count of Null Values
df.isna().sum()

Replacing the null values with the column mean, instead of dropping the rows, is often a good strategy since it retains the full set of records.

#replacing bmi na values with the mean
df['bmi'] = df['bmi'].fillna(df['bmi'].mean())
#double checking null values again
df.isna().sum()
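To see what mean imputation does to a column, here is a small self-contained sketch. The DataFrame toy and its values are made up for illustration. It confirms that no rows are lost, and it also shows one side effect worth keeping in mind: filling with the mean shrinks the spread of the column.

```python
import numpy as np
import pandas as pd

# Hypothetical toy column standing in for bmi.
toy = pd.DataFrame({"bmi": [20.0, np.nan, 30.0, np.nan, 40.0]})

rows_before = len(toy)
std_before = toy["bmi"].std()   # computed on the 3 observed values only
bmi_mean = toy["bmi"].mean()    # mean() skips NaN by default

# Fill the gaps with the mean of the observed values.
toy["bmi"] = toy["bmi"].fillna(bmi_mean)
std_after = toy["bmi"].std()

print(toy["bmi"].isna().sum())  # remaining nulls
print(len(toy) == rows_before)  # every record is retained
print(std_before, std_after)    # the spread is smaller after imputation
```

If the column is heavily skewed, the median may be a safer fill value than the mean, so it is worth comparing the two before deciding.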
  • Summary Statistics of the Whole Data Set
df.describe()
  • Unique Values

Let’s look at the number of unique values in each variable. For example, we expect to see 2 unique values in the heart_disease variable (Yes and No).

df.nunique()
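nunique() only gives the count of distinct values. To see the values themselves, unique() can be called on a single column. The sketch below uses a made-up DataFrame toy whose column names mirror the stroke data; the rows are invented for illustration.

```python
import pandas as pd

# Hypothetical toy data; column names mirror the stroke data set.
toy = pd.DataFrame({
    "heart_disease": [0, 1, 0, 0],
    "work_type": ["Private", "Self-employed", "Private", "Govt_job"],
})

# Number of distinct values per column.
unique_counts = toy.nunique()
print(unique_counts)

# The distinct values of one column, in order of first appearance.
levels = toy["work_type"].unique().tolist()
print(levels)
```

A count that is higher than expected for a categorical variable is often a sign of inconsistent spelling or capitalization in the raw data.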

Visualizations

Visualization is an essential step in the EDA process.

To get started, let’s collect the numerical variables in num_col and the categorical variables in cat_col so that we don’t need to list them out every single time.

num_col = df[['age','hypertension','heart_disease','avg_glucose_level','bmi']]
cat_col = df[['gender','ever_married','work_type','Residence_type','smoking_status']]
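Typing the column names by hand works well for a small data set. As an alternative, pandas can split the columns by data type with select_dtypes. The sketch below is one possible approach, not the method used in this article, and the DataFrame toy is made up for illustration.

```python
import pandas as pd

# Hypothetical toy data; column names mirror the stroke data set.
toy = pd.DataFrame({
    "age": [67.0, 61.0, 80.0],
    "hypertension": [0, 0, 1],
    "gender": ["Male", "Female", "Female"],
    "smoking_status": ["never smoked", "smokes", "Unknown"],
})

# Numeric columns (integers and floats).
num_names = toy.select_dtypes(include="number").columns.tolist()

# Everything else, here the text columns.
cat_names = toy.select_dtypes(exclude="number").columns.tolist()

print(num_names)
print(cat_names)
```

Keep in mind that this picks up every numeric column, so an identifier or the target variable would land in the numeric list as well and may need to be dropped by hand.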
  • Visualization of Numerical Variables
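#visualizing numerical variables
import matplotlib.pyplot as plt
import seaborn as sns

plt.figure(figsize=(20, 12))
#iterating over a DataFrame yields its column names
for i, column in enumerate(num_col, 1):
    plt.subplot(3, 2, i)
    #histogram with a KDE curve; the older sns.distplot is deprecated
    sns.histplot(x=df[column], kde=True, stat='density', color='blue')
    plt.xlabel(column)

These plots show that all of the variables are skewed and none of them is normally distributed.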
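Skewness can also be put into numbers to back up what the plots show. A minimal sketch with DataFrame.skew() follows. The DataFrame toy and its two columns are made up so that one column is symmetric and the other has a long right tail.

```python
import pandas as pd

# Hypothetical toy data: one symmetric column, one with a long right tail.
toy = pd.DataFrame({
    "symmetric": [1, 2, 3, 4, 5],
    "right_skewed": [1, 1, 2, 2, 20],
})

# Sample skewness per column: close to 0 means symmetric,
# positive means a longer right tail, negative a longer left tail.
skewness = toy.skew()
print(skewness)
```

On the stroke data, the equivalent call would be num_col.skew().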

Let’s check the distribution of the target variable (stroke).

target = 'stroke'
df.hist(column=target, color='yellow')

The plot shows a class imbalance between the two stroke classes (Yes=1 and No=0).
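To quantify the imbalance rather than judge it from the bars, value_counts() gives the number of records per class, and normalize=True turns the counts into shares. The sketch below uses a made-up stroke column (an assumption for illustration, not the real class ratio).

```python
import pandas as pd

# Hypothetical toy target column: 8 records without stroke, 2 with stroke.
toy = pd.DataFrame({"stroke": [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]})

# Records per class.
counts = toy["stroke"].value_counts()
print(counts)

# Share of each class (the shares sum to 1).
shares = toy["stroke"].value_counts(normalize=True)
print(shares)
```

Knowing the exact shares helps later on, for example when choosing an evaluation metric or a resampling strategy for the model.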

  • Visualization for Correlation with Heatmap
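#numeric_only=True (pandas 1.5 or later) skips the text columns, which cannot be correlated
corr = df.corr(numeric_only=True)
fig, ax = plt.subplots()
fig.set_size_inches(12,11)
sns.heatmap(corr, annot=True, cmap="Purples", center=0, ax=ax)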

No strong correlation is observed among the attributes.
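With many attributes, reading the strongest pairs off a heatmap can get tedious. One way to list them is to keep the upper triangle of the correlation matrix and sort it by absolute value. The sketch below is an illustration on a made-up DataFrame toy, not a step from the original analysis.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data; column names mirror the stroke data set.
toy = pd.DataFrame({
    "age": [20, 35, 50, 65, 80],
    "bmi": [22.0, 27.0, 24.0, 30.0, 26.0],
    "avg_glucose_level": [80.0, 95.0, 110.0, 130.0, 150.0],
})

corr = toy.corr()

# Keep only the upper triangle (k=1 skips the diagonal) so each pair appears once.
mask = np.triu(np.ones(corr.shape, dtype=bool), k=1)

# Flatten to (variable, variable) -> correlation and sort by strength.
pairs = (
    corr.where(mask)
    .stack()
    .dropna()
    .sort_values(key=lambda s: s.abs(), ascending=False)
)
print(pairs)
```

The first row of pairs is the most strongly correlated pair of variables, whether the correlation is positive or negative.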

So these are the steps involved in the EDA process. Another essential step of the EDA process is finding the outliers and removing them. We will talk about that in the next topic. Stay tuned!
