Data Science Must-Reads That Aren’t Textbooks
Books to help you understand data science concepts — without feeling like assigned reading.

Since data science is a hybrid of computer science, statistics and business, there are many textbooks that attempt to teach the discipline by focusing on developing hard skills like statistical concepts and programming. However, if you’re like me, the academic feel of many textbooks makes it difficult to fully engage with the learning material.
Code snippets don’t always cut it.
Luckily, early on in my data science journey, I sought out and discovered reading materials that helped provide a ‘big picture’ understanding of machine learning, data analysis and trends that power our data-driven world.
These selections range from guides on overlooked but important skills like data storytelling and data journalism to gripping narratives surrounding seismic data events like the Edward Snowden leaks and the Cambridge Analytica scandal.
In no particular order, here are my recommendations for the data science student or professional who wants to gain a more holistic understanding of the field and enjoy the journey.
Note: I’m not compensated for any of these selections. They are simply books I’ve enjoyed. I’ve provided Amazon links for your convenience.
Fortune’s Formula

The subtitle of Fortune’s Formula by William Poundstone promises excitement: “The Untold Story of the Scientific Betting System That Beat the Casinos and Wall Street.” While gambling is covered, a significant portion of the book focuses on how MIT mathematician Claude Shannon and his professional partner Edward Thorp developed predictive algorithms rooted in probability theory.
Although the book contains exciting moments covering the exploits of Shannon and Thorp in Vegas casinos, it dedicates ample time to explaining the math and logic behind the team’s approach.
Since the book was written for a non-scientific audience, Shannon and Thorp’s process is presented as simply as possible, making it a great starting point for anyone dipping their toes into machine learning, or organized betting.
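For readers who do want a snippet: the formula the book's title refers to is the Kelly criterion, which sizes a bet according to your edge. Below is a minimal illustrative sketch, not code from the book. It assumes a simple win-or-lose bet, and the function name `kelly_fraction` and its parameters are hypothetical names chosen for this example.

```python
def kelly_fraction(p_win: float, net_odds: float) -> float:
    """Suggested fraction of a bankroll to stake on a simple win-or-lose bet.

    p_win:    estimated probability of winning (between 0 and 1)
    net_odds: profit per unit staked if the bet wins (1.0 means even money)
    """
    if not 0.0 <= p_win <= 1.0:
        raise ValueError("p_win must be between 0 and 1")
    if net_odds <= 0:
        raise ValueError("net_odds must be positive")

    p_lose = 1.0 - p_win
    # Kelly criterion: f = (b * p - q) / b
    fraction = (net_odds * p_win - p_lose) / net_odds
    # A negative result means there is no edge, so the sketch stakes nothing.
    return max(0.0, fraction)


if __name__ == "__main__":
    # A 60% chance of winning an even-money bet suggests staking about 20%.
    print(round(kelly_fraction(0.6, 1.0), 4))
```

The takeaway matches the book's theme: the size of the bet depends on how good your probability estimate is, which is why prediction and bankroll management go hand in hand.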
Storytelling with Data
I’ve referenced Storytelling with Data in a story I wrote for Towards Data Science, in which I noted the often-discounted skill of communicating data-driven insights to non-technical audiences.
Even though most of the books on this list were ‘fun reads’ for me, Storytelling with Data by Cole Nussbaumer Knaflic was actually assigned while I was a graduate data science student taking a Python-based visualization course.
One of the key takeaways was the author’s emphasis on concise but precise storytelling. Knaflic termed this storytelling methodology the three-minute story and noted that it is a data scientist’s equivalent of an elevator pitch.
While Knaflic covers techniques to optimize visualizations, the real value in this book is its emphasis on clean, concise storytelling.
Permanent Record

A bit of a controversial read due to the actions of its author, Edward Snowden, Permanent Record covers important concepts in data privacy and data collection.
Even if you’re not working for the CIA, NSA or any other three-letter federal agency, the book conveys important and relevant ideas about how data must be responsibly wielded, regardless of political affiliation or ideological differences.
If you are entering the data field, it is essential that you build a foundational understanding of data privacy and its associated regulations, and that you know the consequences when consumer trust is violated.
Everybody Lies: Big Data. New Data. And What The Internet Can Tell Us About Who We Really Are
More focused on analysis than prediction, Everybody Lies offers intriguing insights into the information and trends that can be uncovered from an overlooked information source: Internet search data.
Mining Google Trends, author Seth Stephens-Davidowitz ventures into topics as taboo as racism and suicide trends. However, the information he gleans through the free, intuitive Google Trends tool is fascinating.
In addition to learning some interesting facts about what individuals do behind closed doors, this book opened my eyes to the viability of Google Trends as a primary and supplemental data source.
I even used Google Trends data to fuel personal projects I shared with employers during my most recent job search.
The book’s breadth of topics and revelations about the power of publicly available data will be exciting to anyone interested in pursuing a data analysis track.
The Data Journalism Handbook

As I approached graduation, I briefly flirted with the idea of leveraging my undergraduate journalism degree and pursuing data journalism. In researching such a career path, I stumbled upon this neat, niche book, The Data Journalism Handbook, compiled and edited by Liliana Bounegru, Lucy Chambers and Jonathan Gray.
Written and populated by the stories of working data journalists, The Data Journalism Handbook conveys an important idea: Leveraging relevant data, even at the local level, can result in tangible change.
Packed with examples from international data journalists, it was a book that helped contextualize the importance and possibilities of data-driven storytelling.
Mindf*ck: Cambridge Analytica and the Plot to Break America
Next to Permanent Record, Mindf*ck by Christopher Wylie is probably the most controversial pick on this list.
Delving deep into the hidden mechanisms that helped shape the outcome of the 2016 election, Mindf*ck examines how machine learning was used to target and manipulate users and voters in both small nations and, ultimately, the American general election. Mindf*ck covers data science topics like data sourcing, feature engineering and clustering.
As with any first-person account, there is certainly a degree of bias and an attempt by author Chris Wylie to separate himself from the data-driven horrors Cambridge Analytica unleashed on countries around the world.
Nonetheless, Mindf*ck should function as a cautionary tale, reminding data practitioners to wield an increasingly volatile power ethically and responsibly.





