Sarah Nderi

Summary

The web content outlines an individual's journey through learning Python for Data Engineering, particularly focusing on the use of NumPy, within the context of the #100DaysOfCode challenge, and their intention to continue learning SQL and document the process on Medium.

Abstract

The article titled "Introduction to Python for Data Engineering" details the author's participation in the #100DaysOfCode challenge, emphasizing their recent completion of an introductory Python course and current exploration into writing functions in Python. The author reflects on their previous experience with a Machine Learning and IOT engineer, which has now led them to start learning data analysis basics and scientific computing with NumPy, a package they have recently imported and begun using. The piece also touches on arithmetic operations in Python, the importance of data type conversion, and the use of NumPy arrays for mathematical operations. The author shares insights from an article on Instructables about Python programming basics, including the use of parentheses, list concatenation, the append() method, and the max() and min() functions. They acknowledge the challenges faced with arithmetic operations post data type conversion but express a growing familiarity with these concepts. The article concludes with the author's next steps, which include continuing their learning journey in SQL and Python for Data Engineering, setting up a conducive workspace, and documenting their progress on Medium and Twitter.

Opinions

  • The author finds Python more challenging to understand compared to SQL, potentially due to their Statistics background making SQL more accessible.
  • They highly recommend DataCamp for Data Science newcomers, indicating a positive opinion of the platform's educational value.
  • The author expresses a sense of accomplishment after starting to learn NumPy, which they had postponed for four years.
  • They advocate for general imports over selective imports for readability, suggesting a preference for import numpy as np over from numpy import array.
  • The author is impressed with the Instructables article on Python programming basics, highlighting specific points they found useful.
  • They admit to initial difficulties with arithmetic operations after converting data types but note that they are overcoming these challenges.
  • The author encourages readers to follow their tech journey on Medium and Twitter, indicating a willingness to share their experiences and insights with a broader audience.

Introduction to Python for Data Engineering

Participating in #100DaysOfCode

Photo by Alexandru Acea on Unsplash

If you missed my first article on participating in #100DaysOfCode, you can find it here.

Python has been a bit harder to understand than SQL. Maybe I’m finding it easier with SQL due to my Statistics background. I have a DataCamp subscription (not an affiliate link) and I highly recommend DataCamp for any Data Science newbies.

I just finished an introductory course in Python and I’m currently looking at Writing Functions in Python.

Introduction to Python for Data Science

Four years ago I worked with a Machine Learning and IoT engineer who tried to get me started with NumPy. Four years later, I finally started. So far, I’ve learnt data analysis basics and scientific computing with NumPy.

A package is a collection of code, and the NumPy package contains the code for array computing. To use it, you first have to import it by writing import numpy. Instead of repeating the numpy. prefix on every call, you can alias it, i.e. import numpy as np, and then write np. instead.

Overall, general imports are preferred over selective imports due to their readability. A general import is import numpy as np or import math, while a selective import might look like from numpy import array or from math import pi.
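
To make the difference between the two import styles concrete, here is a minimal sketch. The variable names and the radius of 2 are only illustrative assumptions, not something from the course.

```python
# General import: every call names the package it comes from.
import math
import numpy as np

general_array = np.array([1, 2, 3])
circle_area = math.pi * 2 ** 2   # area of a circle with radius 2

# Selective import: shorter to type, but it is less obvious
# where `array` and `pi` come from when you read the code later.
from numpy import array
from math import pi

selective_array = array([1, 2, 3])
same_area = pi * 2 ** 2

print(general_array, selective_array)
print(circle_area == same_area)
```

Both styles give the same results; the only difference is how clearly the code shows where each name comes from.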

Arithmetic in Python

Instructables has a great article on Numbers and Arithmetic operators that I’m not going to get into. However, here are the things that stood out for me:

  • Use parentheses to prioritize calculations.
  • Use the + operator to combine two lists: if L = [5, 8], then L + [1, 2] gives [5, 8, 1, 2]. A list is a collection of values and is denoted by square brackets [].
  • Use the append() method to add an element to the end of a list.
  • Use np.random.normal() to simulate normally distributed data. Its arguments are the distribution mean, the standard deviation and the number of samples.
  • Use max() and min() to get the maximum and minimum values in a set of numbers.
  • The modulo operator % returns the remainder of dividing the number on its left by the number on its right.
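
The points above can be tried out in a few lines. This is just a sketch: the numbers are made up for illustration, and the mean of 1.75 and standard deviation of 0.2 passed to np.random.normal() are arbitrary assumptions.

```python
import numpy as np

# Parentheses change the order in which a calculation is evaluated.
without_parens = 2 + 3 * 4     # 14
with_parens = (2 + 3) * 4      # 20

# The + operator joins two lists into a new list.
L = [5, 8]
combined = L + [1, 2]          # [5, 8, 1, 2]

# append() adds a single element to the end of the list, in place.
L.append(13)                   # L is now [5, 8, 13]

# max() and min() return the largest and smallest values.
largest = max(combined)        # 8
smallest = min(combined)       # 1

# % returns the remainder of dividing the left number by the right one.
remainder = 17 % 5             # 2

# Arguments: distribution mean, standard deviation, number of samples.
samples = np.random.normal(1.75, 0.2, 1000)
print(samples.shape)           # (1000,)
```

The samples are random, so the values themselves change on every run; only the number of samples stays the same.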

Converting Data Types

Applying arithmetic after converting data types has been challenging, but I’m getting the hang of it.

You can convert a value to a different type with the str(), int(), float() or bool() functions. To join strings and integers, first convert the integer to a string using str(), i.e. print('I traveled to ' + str(3) + ' continents').
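
Here is a small sketch of why the conversion matters and how the four functions behave. The variable names and values are illustrative assumptions.

```python
continents = 3

# Adding a string and an integer directly raises a TypeError,
# so the integer has to be converted with str() first.
try:
    message = 'I traveled to ' + continents
except TypeError:
    message = 'I traveled to ' + str(continents) + ' continents'
print(message)   # I traveled to 3 continents

# int() and float() turn numeric strings into numbers you can calculate with.
total = int('42') + float('3.5')             # 45.5

# int() drops the decimal part instead of rounding.
truncated = int(3.9)                         # 3

# bool() is False for 0 and empty values, and True for everything else,
# including the non-empty string 'False'.
flags = [bool(0), bool(''), bool('False')]   # [False, False, True]
```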

Math with Numpy Arrays

You can perform element-wise operations with 1D and 2D NumPy arrays.

b = np.array([[1, 2], [3, 4]])

print(b * 2)

[[2 4]
 [6 8]]
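
For comparison, here is a slightly longer sketch of element-wise math on 1D and 2D arrays, next to what the same operator does to a plain Python list. The height and weight figures are invented for illustration.

```python
import numpy as np

# 1D arrays: the calculation is applied to each pair of elements.
heights = np.array([1.5, 1.8, 2.0])       # metres (made-up values)
weights = np.array([45.0, 81.0, 100.0])   # kilograms (made-up values)
bmi = weights / heights ** 2              # roughly [20, 25, 25]

# 2D arrays: every element is doubled, and the shape stays the same.
b = np.array([[1, 2], [3, 4]])
doubled = b * 2                           # [[2, 4], [6, 8]]
summed = b + b                            # same result as doubling

# A plain list behaves differently: * repeats the list instead.
repeated = [1, 2] * 2                     # [1, 2, 1, 2]

print(bmi)
print(doubled)
print(repeated)
```

This difference between lists and arrays is the main reason NumPy is convenient for calculations over whole collections of numbers.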

Beyond the above, in the 30 days since I began, I’ve learnt:

  • How to subset NumPy arrays.
  • List definition.
  • Variables and Types.
  • Python lists.
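
As a quick illustration of subsetting, here is a minimal sketch using square brackets on 1D and 2D arrays. The arrays and the threshold of 25 are made-up examples.

```python
import numpy as np

values = np.array([10, 20, 30, 40, 50])

first = values[0]             # 10 (indexing starts at 0)
middle = values[1:4]          # [20, 30, 40] (the end index is excluded)
large = values[values > 25]   # [30, 40, 50] (keep elements meeting a condition)

grid = np.array([[1, 2, 3],
                 [4, 5, 6]])

second_row = grid[1]          # [4, 5, 6]
element = grid[0, 2]          # 3 (row 0, column 2)
second_column = grid[:, 1]    # [2, 5] (all rows, column 1)

print(middle, large, second_column)
```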

Next Steps

  • Continue learning SQL.
  • Continue learning Python for Data Engineering.
  • Create a great workspace and environment.
  • Document my journey on Medium.

Follow me on Medium and Twitter to keep up with me and my tech journey.

Data Engineering
Data Science
Python
Numpy