Sarah Nderi

Summary

The author embarked on a 100-day coding challenge, focusing on SQL and Python for Data Science, and has made significant progress in SQL while also securing a spot in a Data Engineering mentorship program.

Abstract

The author began the #100daysofcode challenge on July 1st, 2022, without a specific plan, exploring programming languages to find the best fit. They chose SQL and Python for Data Science, overcoming past obstacles such as procrastination, unfamiliarity with code editors, lack of structured courses, and a 4GB RAM laptop. The author used DataCamp to learn SQL, attracted by its intuitive platform and structured courses, and was motivated by the competitive XP system. This led to their application and acceptance into a 12-week Data Engineering mentorship program offered by Data Science East Africa (DSEA) and Lux Tech Academy. While proficient in PostgreSQL, the author continues to learn Python and has shared their SQL knowledge and experiences, including challenges with joins and nested queries, and the setup of their coding environment. Future plans include continuing SQL and Python learning, creating a conducive workspace, and documenting the journey on Medium and Twitter.

Opinions

  • The author values structured learning and competition as motivational tools, as evidenced by their preference for DataCamp's XP system.
  • They are self-reflective, acknowledging past challenges with learning code and their strategies to overcome them.
  • The author is proactive in seeking out learning resources and opportunities, such as the mentorship program by DSEA and Lux Tech Academy.
  • They find SQL, particularly PostgreSQL, to be an important skill in their data engineering journey.
  • The author is committed to continuous learning and improvement, with plans to enhance their workspace and document their learning process for others to follow.

Data Engineering 101: Introduction to Data Engineering

Participating in the #100daysofcode

Photo by Christopher Gower on Unsplash

On 1st July 2022, I joined the #100daysofcode challenge.

I didn’t have a plan in mind. I decided to play around with programming languages and figure out which one would suit me, and I settled on SQL and Python for Data Science. I’ve tried to learn to code in the past, but I always hit a roadblock:

  • Procrastinating.
  • Not understanding code editors.
  • No structured courses.
  • Pressuring myself to learn quickly as opposed to learning efficiently.
  • A laptop with 4GB of RAM. This is not a bottleneck if you’re learning online, but it becomes one when you need to install code editors and other tools on your laptop.

I started with SQL and chose DataCamp as my platform. I love DataCamp because the platform is intuitive and its courses are structured. On DataCamp, you earn credits (XP) and you can see how you rank against other learners. I’m highly competitive, and seeing my XP increase motivated me to learn SQL in the first 3 weeks of July.

Image courtesy of the author.

This also gave me the confidence to apply for, and get into, a 12-week Data Engineering mentorship program by Data Science East Africa (DSEA) and Lux Tech Academy.

Image courtesy of the author.

While I’m well versed in PostgreSQL, I’m not as familiar with Python yet. In this article, I will share some of what I’ve learnt about PostgreSQL.

SQL

SQL stands for Structured Query Language. It is used to query relational databases. A relational database contains a collection of tables, and the data stored in one table relates to data in other tables. Within each table, data is arranged so that each column is a field and each row is a record.

A query is a request for data from a table or a combination of tables in a database.

SQL uses keywords such as SELECT and FROM. Keywords are not case sensitive, so SQL doesn’t differentiate between FROM and from, or between SELECT and select. However, it’s good practice to write your keywords in upper case to distinguish them from other parts of your query, like column names and table names.

Each query ends with a semicolon (;), which tells SQL where the query ends.
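Here is a minimal sketch of those two points. It is not run against PostgreSQL: it assumes Python's built-in sqlite3 module as a lightweight stand-in, and the films table and its rows are made up for illustration. The same two SELECT statements should behave the same way in PostgreSQL.

```python
import sqlite3

# In-memory SQLite database, used here as a stand-in for PostgreSQL.
conn = sqlite3.connect(":memory:")

# Hypothetical table: each column is a field, each row is a record.
conn.execute(
    "CREATE TABLE films (id INTEGER PRIMARY KEY, title TEXT, release_year INTEGER);"
)
conn.executemany(
    "INSERT INTO films (title, release_year) VALUES (?, ?);",
    [("Film A", 2019), ("Film B", 2021)],
)

# Keywords in upper case (the recommended style); the query ends with a semicolon.
upper = conn.execute("SELECT title FROM films;").fetchall()

# The same query with lower-case keywords returns the same rows.
lower = conn.execute("select title from films;").fetchall()

print(upper == lower)
```

Both queries ask for the title column of every record in films; only the letter case of the keywords differs.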

In the 52 days since I began, I’ve learnt how to:

  • Select columns with SELECT and SELECT DISTINCT, and count rows with COUNT.
  • Filter rows with WHERE, AND, OR, BETWEEN, IS NULL, IS NOT NULL, LIKE and NOT LIKE.
  • Use aggregate functions with aliasing.
  • Sort and group data with GROUP BY and HAVING.
  • Join data with inner joins and self joins, and write conditional logic with CASE WHEN ... THEN.
  • Write nested queries (subqueries).
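To make the list above concrete, here is a small sketch that touches most of those topics. It again assumes Python's built-in sqlite3 module in place of PostgreSQL, and the departments and employees tables, along with every name and salary in them, are invented for illustration. The queries stick to standard SQL, so they should carry over to PostgreSQL largely unchanged.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite stand-in for PostgreSQL

# Hypothetical sample data.
conn.executescript("""
CREATE TABLE departments (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE employees (
    id INTEGER PRIMARY KEY,
    name TEXT,
    salary INTEGER,
    department_id INTEGER REFERENCES departments (id)
);
INSERT INTO departments (id, name) VALUES (1, 'Data'), (2, 'Sales');
INSERT INTO employees (name, salary, department_id) VALUES
    ('Amina', 900, 1),
    ('Brian', 700, 1),
    ('Carol', 500, 2),
    ('David', NULL, 2);
""")

# Filtering rows with WHERE, BETWEEN, AND and LIKE.
filtered = conn.execute("""
    SELECT name FROM employees
    WHERE salary BETWEEN 600 AND 1000 AND name LIKE 'A%'
    ORDER BY name;
""").fetchall()

# Filtering on missing values with IS NULL.
missing = conn.execute(
    "SELECT name FROM employees WHERE salary IS NULL;"
).fetchall()

# Inner join, aggregate function with aliases, GROUP BY and HAVING.
# AVG ignores NULL salaries, so Sales averages 500 and is filtered out.
per_department = conn.execute("""
    SELECT d.name AS department, AVG(e.salary) AS avg_salary
    FROM employees AS e
    INNER JOIN departments AS d ON e.department_id = d.id
    GROUP BY d.name
    HAVING AVG(e.salary) > 600
    ORDER BY d.name;
""").fetchall()

# Conditional logic with CASE WHEN ... THEN.
bands = conn.execute("""
    SELECT name,
           CASE WHEN salary >= 800 THEN 'high'
                WHEN salary IS NULL THEN 'unknown'
                ELSE 'standard'
           END AS band
    FROM employees
    ORDER BY name;
""").fetchall()

# Nested query: employees earning more than the overall average (700).
above_average = conn.execute("""
    SELECT name FROM employees
    WHERE salary > (SELECT AVG(salary) FROM employees)
    ORDER BY name;
""").fetchall()

print(filtered, missing, per_department, bands, above_average, sep="\n")
```

One difference to keep in mind: LIKE is case-insensitive for ASCII letters in SQLite by default but case-sensitive in PostgreSQL, so a pattern such as 'a%' would match 'Amina' here and not there.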

The most challenging part has been learning joins and nested queries. Setting up the environment hasn’t been easy either. I managed to do it after procrastinating for about a week. I set up pgAdmin, and I’m using the Windows command line interface.

Next Steps

  • Continue learning SQL.
  • Continue learning Python for Data Engineering.
  • Create a great workspace and environment.
  • Document my journey on Medium.

Follow me on Medium and Twitter to keep up with me and my tech journey.
