Rebecca Vickery

Summary

The webpage provides a curated list of six free data science books that cover foundational concepts for beginners, ranging from Python programming to deep learning.

Abstract

The article on the webpage introduces six complimentary books that serve as an essential learning path for novices in data science. It emphasizes the importance of Python as the primary language for data science and suggests starting with "Learn Python the Right Way" for programming basics. The list includes "Practical Statistics for Data Science" for statistical knowledge, "Pandas: Powerful Python Data Analysis Toolkit" for data manipulation, and "Hands-on Machine Learning with Scikit-learn, Keras and TensorFlow" for machine learning applications. Additionally, "Python Data Science Handbook" offers a broad overview of the data science toolkit in Python, while "Deep Learning for Coders with Fastai and PyTorch" introduces deep learning without requiring an advanced math background. The author recommends these resources for a comprehensive foundation in data science without the need for expensive courses.

Opinions

  • The author believes that Python is the most important programming language for data science beginners to learn.
  • Statistical knowledge is crucial for data science, and "Practical Statistics for Data Science" is highlighted as an excellent resource for beginners.
  • Pandas is considered a key tool for data analysis in Python, and the provided book on Pandas is seen as a comprehensive guide.
  • "Hands-on Machine Learning with Scikit-learn, Keras and TensorFlow" is recommended for its practical approach to machine learning concepts and projects.
  • The "Python Data Science Handbook" is praised for its introduction to essential Python libraries like NumPy, Pandas, and Scikit-learn.
  • "Deep Learning for Coders with Fastai and PyTorch" is favored for making deep learning accessible to coders without a PhD in the field.
  • The author suggests that the combination of these free books can provide a learning experience equivalent to paid courses, making data science education more accessible.

6 Free Data Science Books for Complete Beginners

Every book you need to get started for free

Photo by Tom Hermans on Unsplash

Data science is an interdisciplinary field and requires a broad range of skills from several areas including computer science, maths, statistics and machine learning. As a complete beginner to learning data science, it can be hard to know where to start.

Data science degree programs and boot camps can guide you through a structured curriculum. However, if you don’t have thousands of dollars to pay for these courses, an alternative is to develop your own curriculum and learn from free or low-cost resources.

I’ve previously written about how to create your own personalised curriculum for learning data science, and this can be found here. Free resources are widely available online, but it can be hard to find the best ones. Books can be a great source of learning for beginners and, fortunately, several that cover the concepts you need to get started are freely available to read.

In this article, I will introduce six of my favourite free books for beginners. Each book covers one of the most important foundational concepts of data science. I have organised the titles in the order I would personally recommend learning them, which is as follows:

  • The fundamentals of programming in Python.
  • Statistics for data science.
  • Pandas for data analysis.
  • Machine learning.
  • The Python data science toolkit.
  • Deep learning.

1. Learn Python the Right Way

By Peter Wentworth, Jeffrey Elkner, Allen B. Downey and Chris Meyers

Read it for free here.

Python is the programming language of choice for data science, so the fundamentals of programming in this language are one of the first things beginners in the field should learn. This book is not specific to programming for data science but covers the general concepts of writing Python code. I would recommend starting with this book before moving on to the data science Python packages covered in the books recommended later in this article.

This book assumes no prior knowledge of programming and gives an introduction to the Python language and basic general coding principles. Each chapter contains a complementary YouTube video, which helps to further explain the concepts covered.

The authors of the book have also made available a set of hands-on tutorials, containing over 15 practical Python projects, so you can put what you have learned into practice.

2. Practical Statistics for Data Science

By Peter Bruce & Andrew Bruce

Image courtesy of amazon.co.uk

Read it for free here.

This is by far one of the best books I have found for introducing beginners to the most important statistical concepts for data science. It covers a wide range of topics including data sampling, distributions, statistical experiments and machine learning.

All concepts are explained in beginner-friendly terms and include hands-on code examples using a mixture of R and Python.
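To give a flavour of the kind of sampling idea this area of statistics deals with, here is a minimal sketch of resampling with replacement (a bootstrap) to estimate how much a sample mean might vary. It is not taken from the book; the function name `bootstrap_means` and the measurements are made up for illustration, and it uses only the Python standard library.

```python
import random
import statistics


def bootstrap_means(data, n_resamples=1000, seed=0):
    """Resample `data` with replacement and return the mean of each resample."""
    rng = random.Random(seed)  # fixed seed so repeated runs give the same result
    return [
        statistics.fmean(rng.choices(data, k=len(data)))
        for _ in range(n_resamples)
    ]


# Hypothetical sample of ten measurements (illustrative values only).
sample = [12.1, 9.8, 11.4, 10.3, 13.0, 9.5, 10.9, 11.7, 10.1, 12.4]

means = bootstrap_means(sample)
estimate = statistics.fmean(sample)
# The spread of the resampled means approximates the standard error of the mean.
std_error = statistics.stdev(means)

print(f"sample mean: {estimate:.2f}")
print(f"bootstrap standard error: {std_error:.2f}")
```

The spread of the resampled means gives a rough sense of how far the sample mean could be from the true value, without assuming a particular distribution.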

3. Pandas: Powerful Python Data Analysis Toolkit

By Wes McKinney and the Pandas Development Team

Image from pandas.org

Read it for free here.

The Python package known as Pandas is the tool of choice for exploring, transforming, cleaning and processing data for data science. This book is a complete user guide to the tool.

It covers everything including installation, basic usage, plotting and data analysis. It assumes basic knowledge of Python, so it is important to cover the concepts in Learn Python the Right Way first, but it is a fantastic reference guide to this essential tool for beginners and more established learners alike.
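As a rough illustration of the cleaning and transforming work described above, the sketch below removes a duplicate row, fills a missing value and summarises by group. The data and column names are invented for the example, and filling with the overall mean is just one simple choice among many.

```python
import pandas as pd

# Hypothetical raw data with one exact duplicate row and one missing value.
raw = pd.DataFrame({
    "city": ["Leeds", "Leeds", "York", "York", "York"],
    "temp_c": [14.0, 14.0, None, 11.5, 12.5],
})

# Cleaning: drop exact duplicate rows.
clean = raw.drop_duplicates().copy()

# Processing: fill the missing reading with the mean of the remaining readings.
clean["temp_c"] = clean["temp_c"].fillna(clean["temp_c"].mean())

# Exploring: average temperature per city.
summary = clean.groupby("city")["temp_c"].mean()
print(summary)
```

Each step returns a new object, which makes it easy to inspect the data between steps while you are learning.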

4. Hands-on Machine Learning with Scikit-learn, Keras and TensorFlow

By Aurélien Géron

Image courtesy of amazon.co.uk

Read it for free here.

This book provides a detailed and hands-on introduction to machine learning using some of the most popular Python machine learning packages: Scikit-learn, Keras and TensorFlow.

This book does a great job of introducing important theoretical concepts of machine learning including types of machine learning systems, overfitting and underfitting, and descriptions of how the common algorithms work.
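To make underfitting and overfitting a little more concrete, here is a small sketch that fits polynomials of increasing degree to noisy data and compares the error on the training points with the error on held-out points. It is not an example from the book: it uses NumPy rather than Scikit-learn to keep dependencies small, and the data, degrees and noise level are arbitrary assumptions.

```python
import numpy as np
from numpy.polynomial import Polynomial

rng = np.random.default_rng(0)  # fixed seed for repeatable runs


def noisy_sine(n):
    """Hypothetical data: one period of a sine wave plus Gaussian noise."""
    x = rng.uniform(0.0, 1.0, n)
    y = np.sin(2 * np.pi * x) + rng.normal(0.0, 0.2, n)
    return x, y


def mse(model, x, y):
    """Mean squared error of `model` on the points (x, y)."""
    return float(np.mean((model(x) - y) ** 2))


x_train, y_train = noisy_sine(15)   # small training set
x_test, y_test = noisy_sine(200)    # held-out data the model never sees

results = {}
for degree in (1, 3, 9):
    # Least-squares polynomial fit of the given degree.
    model = Polynomial.fit(x_train, y_train, degree)
    results[degree] = (mse(model, x_train, y_train), mse(model, x_test, y_test))
    train_err, test_err = results[degree]
    print(f"degree {degree}: train MSE {train_err:.3f}, test MSE {test_err:.3f}")
```

A model that underfits tends to have high error on both sets. Training error can only go down as the degree increases, so a widening gap between training and held-out error is the usual sign of overfitting.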

There is a lot of practical content including a complete walkthrough of a typical end-to-end machine learning project. This book will provide a complete introduction for a newcomer to this field.

5. Python Data Science Handbook

By Jake VanderPlas

Image from jakevdp.github.io

Read it for free here.

This book is a good, broad introduction to the Python data science toolkit. It covers an introduction to the NumPy library including concepts such as arrays, computations on arrays and data types in Python.
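For readers who have not met NumPy yet, the array concepts mentioned above look roughly like this. The snippet is my own minimal sketch with made-up values, not an excerpt from the book.

```python
import numpy as np

# Hypothetical readings in degrees Celsius, stored as a 1-D array.
temps_c = np.array([14.0, 11.5, 12.5, 9.0])

# Computations apply element-wise, with no explicit Python loop.
temps_f = temps_c * 9 / 5 + 32

# Every element of an array shares a single data type.
print(temps_c.dtype)  # float64
counts = np.array([3, 1, 4], dtype=np.int32)  # dtype chosen explicitly
print(counts.dtype)  # int32

# Aggregations and boolean masks also work on whole arrays at once.
mean_c = temps_c.mean()
warm_f = temps_f[temps_c > 12.0]
print(mean_c)
print(warm_f)
```

Working on whole arrays like this is the pattern that Pandas and Scikit-learn build on.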

It also gives a good, beginner-friendly introduction to data analysis with Pandas and machine learning with Scikit-learn.

6. Deep Learning for Coders with Fastai and PyTorch: AI Applications Without a PhD

By Jeremy Howard and Sylvain Gugger

Image courtesy of amazon.co.uk

Read it for free here.

This book is a practical first introduction to deep learning. It is aimed at coders, so an understanding of Python programming is essential before diving into this book. However, it does not assume a deep understanding of maths and statistics and includes some excellent and simple explanations of the theory behind deep learning.

It is an ideal starting point and will get you up and running with developing your first deep learning models.

The internet has a huge variety of free or low-cost resources for learning data science. These include free courses, university lectures, tutorials, videos, blog posts and books. In this article, I have provided links to six free-to-read books that alone would provide a complete beginner with an excellent foundation of knowledge in the field of data science.

I have previously written several articles covering a wide range of free resources for learning these subjects. If you would like to go beyond these introductory concepts or are looking for resources outside of textbooks, the following links will help to get you started.

Thanks for reading!
