Git Repos and Resources for Learning Data Engineering
Are you ready to become a data engineer?
Are you searching for the right place to start?
Then you're in luck.
In this post I am sharing the resources and tools you can use to learn and master data engineering. Let's get started.
Introductions & Tutorials:
- Building a Data Engineering Project in 20 Minutes: https://github.com/damklis/DataEngineeringProject — A beginner-friendly project covering data ingestion, warehousing, and visualization.
- Data-Engineering-HowTo: https://github.com/josephmachado — A curated list of resources for learning data engineering from scratch.
- Learn Data Engineering From These GitHub Repositories: https://github.com/andkret/Cookbook — Collection of categorized projects covering different data engineering tasks.
Specific Technologies & Tools:
- Airbyte: https://github.com/airbytehq/airbyte — Open-source data integration platform for ELT pipelines.
- dbt-examples: https://github.com/dbt-labs — The official dbt Labs GitHub organization, which hosts example dbt projects for various data transformations.
- Kafka Tutorials: https://www.youtube.com/watch?v=hyJZP-rgooc — Official Kafka tutorials and examples.
- Spark Examples: https://github.com/spark-examples — Community-maintained Spark examples (from Spark By {Examples}) covering various functionalities.
- Terraform by Example: https://github.com/futurice/terraform-examples — Extensive collection of Terraform examples for different use cases.
- Airflow Examples: https://github.com/apache/airflow/airflow/example_dags — Example DAGs from the Apache Airflow repository for different use cases.
Advanced Projects:
- CloudQuery: https://github.com/cloudquery/cloudquery — Open-source high-performance data integration platform.
- StreamSets Data Collector: https://github.com/streamsets/datacollector-edge-oss — Open-source data ingestion and transformation platform.
- Luigi Tutorials: https://github.com/spotify/luigi — Official tutorials for learning Luigi workflows.
- Apache Beam Examples: https://beam.apache.org/documentation/programming-guide/ — Official Beam examples for diverse data processing tasks.
Bonus Resources:
- Awesome Data Engineering: https://github.com/igorbarinov/awesome-data-engineering — Extensive list of curated data engineering resources.
- Data Engineering subreddit: https://www.reddit.com/r/dataengineering/ — Community of data engineers discussing topics and sharing resources.
I hope you find this useful. Happy learning!