Introduction to Python for Data Engineering
Participating in #100DaysOfCode
If you missed my first article on participating in #100DaysOfCode, you can find it here.
Python has been a bit harder to understand than SQL. Maybe I’m finding it easier with SQL due to my Statistics background. I have a DataCamp subscription (not an affiliate link) and I highly recommend DataCamp for any Data Science newbies.
I just finished an introductory course in Python and I’m currently looking at Writing Functions in Python.
Introduction to Python for Data Science
Four years ago I worked with a Machine Learning and IoT engineer who tried to get me started with NumPy. Four years later, I finally started. So far, I’ve learnt data analysis basics and scientific computing with NumPy.
The numpy package contains the code for array computing. A package is a collection of code, and to use it you have to import it. To import numpy, write import numpy. Instead of repeating the numpy. prefix on every call, you can alias it, i.e. import numpy as np, and then write np.array() instead of numpy.array().
Overall, general imports are preferred over selective imports because they are more readable: the prefix shows where each name comes from. A general import is import numpy as np or import math, while a selective import might look like from numpy import array or from math import pi.
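Here is a small sketch of the two import styles side by side. The variable names and the radius value are made up for illustration; only the import statements come from the text above.

```python
# General imports: every name keeps its package prefix.
import math
import numpy as np

# Selective imports: the names are pulled in directly, without a prefix.
from math import pi
from numpy import array

# With a general import, the prefix shows where array() and pi come from.
general = np.array([1, 2, 3])
circumference_general = 2 * math.pi * 1.5

# With a selective import, the same calls are shorter but less explicit.
selective = array([1, 2, 3])
circumference_selective = 2 * pi * 1.5

print(general, selective)
print(circumference_general == circumference_selective)
```

Both styles give the same results. The difference is only in how easy it is to tell, later in a long script, which package a name belongs to.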
Arithmetic Python
Instructables has a great article on Numbers and Arithmetic operators that I’m not going to get into. However, here are the things that stood out for me:
- Use parentheses to prioritize calculations.
- Use the + operator to combine two lists: if L = [5, 8], then L + [1, 2] gives [5, 8, 1, 2]. A list is a collection of values and is denoted by square brackets [].
- Use the append() method to add an element to the end of a list.
- Use np.random.normal() to simulate normally distributed data. Its arguments are the distribution mean, the standard deviation and the number of samples.
- Use max() and min() to get the maximum and minimum values in a set of numbers.
- The modulo operator % returns the remainder of dividing the number on its left by the number on its right.
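The points above can be tried out in a few lines. This is only a sketch: the numbers (a mean of 170, a standard deviation of 10, five samples) are made-up values, not from any real dataset.

```python
import numpy as np

# Parentheses change the order of calculation.
result = (2 + 3) * 4       # 20, whereas 2 + 3 * 4 would be 14

# The + operator combines two lists into a new list.
L = [5, 8]
combined = L + [1, 2]      # [5, 8, 1, 2]

# append() adds one element to the end of an existing list.
L.append(13)               # L is now [5, 8, 13]

# Simulate data: mean (loc), standard deviation (scale), number of samples (size).
samples = np.random.normal(loc=170, scale=10, size=5)

# max() and min() return the largest and smallest values.
highest = max(combined)    # 8
lowest = min(combined)     # 1

# The modulo operator returns the remainder of a division.
remainder = 17 % 5         # 2

print(result, combined, L, highest, lowest, remainder)
print(samples)             # five random values, different on every run
```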
Converting Data Types
Applying arithmetic after converting data types has been challenging, but I’m getting the hang of it.
You can convert a variable into a different type with the str(), int(), float() or bool() functions. To join a string and an integer, first convert the integer to a string using str(), i.e. print('I traveled to ' + str(3) + ' continents!').
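A short sketch of the four conversion functions in action. The variable names and values are assumptions chosen for illustration.

```python
continents = 3

# Adding a string and an integer directly raises a TypeError,
# so the integer is converted with str() first.
message = 'I traveled to ' + str(continents) + ' continents!'
print(message)                # I traveled to 3 continents!

# int() and float() turn numeric strings into numbers you can do math with.
total = int('4') + 2          # 6
price = float('2.5') * 2      # 5.0

# bool() is False for empty or zero values and True otherwise.
has_items = bool(1)           # True
is_empty = bool('')           # False

print(total, price, has_items, is_empty)
```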
Math with Numpy Arrays
You can perform element-wise operations on 1D and 2D NumPy arrays.
b = np.array([(1, 2), (3, 4)])
print(b * 2)
[[2 4]
 [6 8]]
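To see what element-wise means in practice, here is a small sketch with one 1D and one 2D array, plus a plain Python list for contrast. The array values are arbitrary.

```python
import numpy as np

a = np.array([1, 2, 3])           # 1D array
b = np.array([[1, 2], [3, 4]])    # 2D array

# Each operation is applied to every element separately.
doubled_1d = a * 2                # [2 4 6]
summed_1d = a + a                 # [2 4 6]
doubled_2d = b * 2                # [[2 4] [6 8]]
squared_2d = b * b                # [[1 4] [9 16]]

# A plain list behaves differently: * repeats the list instead.
repeated_list = [1, 2, 3] * 2     # [1, 2, 3, 1, 2, 3]

print(doubled_1d)
print(doubled_2d)
print(squared_2d)
print(repeated_list)
```

The contrast with the plain list is the main reason to reach for an array when doing math on a whole collection of numbers.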
Other than the above, in the 30 days since I began, I’ve also learnt:
- How to subset NumPy arrays.
- List definition.
- Variables and Types.
- Python lists.
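As a quick illustration of subsetting, here is a minimal sketch using made-up height values. It shows indexing, slicing and filtering with a condition.

```python
import numpy as np

# Hypothetical heights in metres, for illustration only.
heights = np.array([1.73, 1.68, 1.71, 1.89, 1.79])

first = heights[0]                # single element by index: 1.73
middle = heights[1:3]             # slice, end index excluded: [1.68 1.71]
tall = heights[heights > 1.75]    # keep only values above 1.75: [1.89 1.79]

# 2D arrays take a row index and a column index.
grid = np.array([[1, 2, 3], [4, 5, 6]])
top_right = grid[0, 2]            # row 0, column 2: 3
second_column = grid[:, 1]        # all rows, column 1: [2 5]

print(first, middle, tall)
print(top_right, second_column)
```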
Next Steps
- Continue learning SQL.
- Continue learning Python for Data Engineering.
- Create a great workspace and environment.
- Document my journey on Medium.
Follow me on Medium and Twitter to keep up with me and my tech journey.






