Christianlauer



Part 1: Database Technology Knowledge

What Skills Does a Data Engineer Need?

How to increase your Market Value and Salary

Photo by Campaign Creators on Unsplash

To be successful as a Data Engineer and thus increase your market value and salary, you need certain skills. I plan to cover these in more detail across several articles, this time with a focus on database technologies.

The most important task for a Data Engineer is to deliver high-quality data from source systems to target systems such as a Data Warehouse, Data Lake or Data Lakehouse. This data often comes from ERP or CRM systems, social media or production systems and is usually stored in relational databases. However, unstructured data from a NoSQL database is also possible.

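That means a good Data Engineer must be proficient in Structured Query Language (SQL) and in topics like relationships between tables, data modelling and the principles of normalization. These articles might also be interesting for you:

What is a Snowflake Schema? (https://readmedium.com/what-is-a-snowflake-schema-d310125c10e2)
What is a Star Schema? (https://readmedium.com/what-is-a-star-schema-5a03e0f9ce6d)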

In addition, Data Engineers have to be familiar with other types of databases, the so-called NoSQL databases, which follow a document-oriented (file-based), column-oriented (often used in modern Data Warehouses) or graph-oriented approach. You can read more about graph databases here: https://readmedium.com/what-is-a-graph-database-b3bff4eb9902. Examples of NoSQL databases are MongoDB and Cassandra; hybrid systems often used in Data Warehousing are Amazon Redshift and Google BigQuery.
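
To make the difference between the relational and the document-oriented approach more tangible, here is a minimal sketch in Python. It uses the built-in sqlite3 module as a stand-in for a relational source system and a plain JSON document as a stand-in for what a document store such as MongoDB might hold. The table names, columns and sample values are made up for illustration.

```python
import json
import sqlite3

# In-memory SQLite database as a stand-in for a relational source system.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")

# Normalized model: customer data is stored once and referenced by orders.
conn.executescript("""
    CREATE TABLE customers (
        customer_id INTEGER PRIMARY KEY,
        name        TEXT NOT NULL
    );
    CREATE TABLE orders (
        order_id    INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customers(customer_id),
        amount      REAL NOT NULL
    );
""")
conn.execute("INSERT INTO customers VALUES (1, 'Acme Corp')")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(100, 1, 250.0), (101, 1, 80.5)],
)

# Relational access: a join brings the related rows back together.
rows = conn.execute("""
    SELECT c.name, o.order_id, o.amount
    FROM customers AS c
    JOIN orders AS o ON o.customer_id = c.customer_id
    ORDER BY o.order_id
""").fetchall()

# Document-style view of the same data: one nested, self-contained record.
customer_document = {
    "name": rows[0][0],
    "orders": [{"order_id": oid, "amount": amt} for _, oid, amt in rows],
}
document_json = json.dumps(customer_document, indent=2)
print(document_json)
```

In the relational model the customer name lives in exactly one place, while the document carries everything needed to read an order history in a single record. Which trade-off fits better depends on the source system and on how the data is queried.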

NoSQL vs. SQL — Image Source: TechTarget

It is also important to know how to access these systems, be it through a very simple CSV export or via direct data connections and interfaces such as REST, ODBC or JDBC. Knowledge of socket connections and client-server architectures will pay off. I also learned that you should study the characteristics and peculiarities of the source systems. An exchange with product owners or IT staff who have the know-how is always useful. I have provided some useful tips in this article: https://readmedium.com/99304b18360e.
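
The two access paths mentioned above, a file export and a direct connection, can be sketched in a few lines of Python. The example parses a CSV export and loads it through a database connection. sqlite3 serves here only as a stand-in for an ODBC or JDBC connection to a real system, and the staging table, column names and sample rows are assumptions for illustration.

```python
import csv
import io
import sqlite3

# Hypothetical CSV export from a source system (normally a file on disk).
csv_export = io.StringIO(
    "customer_id,name,country\n"
    "1,Acme Corp,DE\n"
    "2,Globex,US\n"
)

# Parse the export; DictReader uses the header row for the column names.
records = list(csv.DictReader(csv_export))

# Direct data connection: sqlite3 stands in for an ODBC/JDBC target here.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE stg_customers (customer_id INTEGER, name TEXT, country TEXT)"
)

# Named placeholders map the CSV columns to the table columns.
conn.executemany(
    "INSERT INTO stg_customers VALUES (:customer_id, :name, :country)",
    records,
)
conn.commit()

loaded = conn.execute("SELECT COUNT(*) FROM stg_customers").fetchone()[0]
print(f"Loaded {loaded} rows from the CSV export")
```

Using parameterized statements instead of string concatenation keeps the load robust against quotes and special characters in the source data, which is exactly the kind of peculiarity that tends to surface only in production.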

But to take one point from it in advance: one of the most important things is to know about unwanted behavior in the source systems. If it is not taken into account, this behavior can ruin your data integration processes and cause further problems such as duplicate and inconsistent data. As already said, it's important to get to know the source systems and their pitfalls.
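
One way to catch such problems early is to check a staging table for duplicate keys and conflicting values before loading it further. The following is a minimal sketch, assuming a hypothetical staging table stg_customers with a business key customer_id; the table, columns and sample rows are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE stg_customers (customer_id INTEGER, name TEXT, country TEXT)"
)
# Sample extract: key 1 was delivered twice, key 2 arrives with two countries.
conn.executemany(
    "INSERT INTO stg_customers VALUES (?, ?, ?)",
    [
        (1, "Acme Corp", "DE"),
        (1, "Acme Corp", "DE"),
        (2, "Globex", "US"),
        (2, "Globex", "GB"),
        (3, "Initech", "US"),
    ],
)

# Duplicates: business keys that occur more than once in the extract.
duplicate_keys = conn.execute("""
    SELECT customer_id, COUNT(*) AS occurrences
    FROM stg_customers
    GROUP BY customer_id
    HAVING COUNT(*) > 1
    ORDER BY customer_id
""").fetchall()

# Inconsistencies: the same key carries conflicting attribute values.
inconsistent_keys = conn.execute("""
    SELECT customer_id
    FROM stg_customers
    GROUP BY customer_id
    HAVING COUNT(DISTINCT country) > 1
    ORDER BY customer_id
""").fetchall()

print("Duplicate keys:", duplicate_keys)
print("Inconsistent keys:", inconsistent_keys)
```

Checks like these do not explain why the source system behaves this way, but they make the behavior visible so it can be discussed with the people who know the system.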

If you are interested in what a Data Engineer earns, this article might be of interest to you: https://readmedium.com/salary-of-a-data-engineer-d7a793b27b51.

I hope this article has given you a brief overview of an important topic that a Data Engineer should know about: database knowledge. It is, so to speak, the cornerstone of successful data integration. In the upcoming weeks, I will publish more articles on other areas that are essential for a Data Engineer, so stay tuned.

Sources and Further Readings

[1] TechTarget, NoSQL (Not Only SQL database) (2021)
