Cassie Kozyrkov

Summary

The web content outlines the distinctions between amateur and professional data analysts, emphasizing the importance of software skills, the ability to handle large data sets, and a skeptical mindset that resists data science bias.

Abstract

The provided web content delves into the transition from being an amateur to a professional data analyst by highlighting ten key differences. It starts by acknowledging that many people engage in basic data analysis without realizing it and then proceeds to outline the advanced skills and perspectives required to become a professional. The article stresses the necessity of learning analytics software like Python and R for handling diverse and voluminous data sets efficiently. It also underscores the importance of developing an analytical mindset that includes skepticism towards data and immunity to what the author calls data science bias: trusting information more simply because it looks data-driven. The content further suggests that learning through doing, such as looking up solutions and experimenting with data, is a practical approach to gaining expertise in data analysis. The article concludes by teasing upcoming topics in the series, including understanding the career, avoiding being a data charlatan, and developing a nuanced view of excellence in data analysis.

Opinions

  • The author believes that the transition from an amateur to a professional analyst involves more than just enthusiasm for data; it requires the development of specific technical and analytical skills.
  • Professionals are distinguished by their adept use of software tools like Python and R, which enable them to work with various data formats and large data sets that would be impractical to analyze using basic tools like MS Paint.
  • The article conveys that true expertise in data analysis comes from a combination of learning professional tools and adopting an analytical mindset that questions the inherent trustworthiness of data.
  • The author suggests that learning by looking up and applying solutions found on the internet is a valid method for developing professional software skills.
  • There is an opinion that data should not be treated as infallible and that a healthy skepticism is crucial for good analytics.
  • The author implies that a career in analytics may not be suitable for those who have not learned to be critical of the information they encounter, whether it's in the form of data or other media.
  • The article expresses that becoming a professional analyst involves continuous learning and the ability to add value through data analysis, beyond just confirming pre-existing beliefs.

Making Data Useful

Becoming a “real” data analyst

10 differences between amateurs and professional analysts

Previously, I introduced you to a few analytics tasks disguised as everyday activities to prove that you’re already a data analyst. For example, consider the image below. Digital photos are stored as a bunch of numbers (left) that make no sense to your brain until you open them with suitable tools (right).

Example of the blue color channel data from a photo of my wooden floor, opened in MS Paint.

Ta-da! You’ve just done data visualization. The music swells as you discover that the power of data analysis was inside you all along.

But does this mean you’re ready to work as a professional analyst?

Not quite. There are some big differences between an amateur and a professional analyst.

Data pro vs amateur difference #1 — Software skills

Unlike most amateurs, the pro knows how to use software (e.g. Python and R) that allows them to interact with more data formats all in one place. While MS Paint only works for images, analytics software can handle images and tables and sounds and text and and and… and the kitchen sink.

Here’s what it looks like when you open that same image with Python:
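As a rough illustration of the "photo as numbers" idea, here is a minimal sketch that uses only Python's standard library. Everything in it is an assumption made for the example: the tiny six-pixel `floor` image is made up, the file is stored in the simple PPM format rather than as a real photo, and the helper names `write_ppm`, `read_ppm`, and `blue_channel` are hypothetical. With a real photo you would more likely reach for an imaging package you looked up and installed.

```python
import tempfile
from pathlib import Path


def write_ppm(path, pixels):
    """Write rows of (red, green, blue) tuples as a binary PPM (P6) file."""
    height, width = len(pixels), len(pixels[0])
    header = f"P6\n{width} {height}\n255\n".encode("ascii")
    body = bytes(value for row in pixels for pixel in row for value in pixel)
    Path(path).write_bytes(header + body)


def read_ppm(path):
    """Read a simple binary PPM (8-bit, no comment lines) into rows of RGB tuples."""
    raw = Path(path).read_bytes()
    # The header is three newline-terminated ASCII lines; the pixel bytes follow.
    magic, size, maxval, body = raw.split(b"\n", 3)
    if magic != b"P6" or maxval != b"255":
        raise ValueError("this sketch only handles plain 8-bit P6 files")
    width, height = (int(n) for n in size.split())
    rows = []
    for y in range(height):
        start = y * width * 3
        rows.append(
            [tuple(body[start + 3 * x : start + 3 * x + 3]) for x in range(width)]
        )
    return rows


def blue_channel(pixels):
    """Keep only the blue value of every pixel."""
    return [[blue for (_red, _green, blue) in row] for row in pixels]


# A made-up 3x2 "photo" in brownish tones, standing in for a real image file.
floor = [
    [(120, 80, 40), (130, 90, 50), (125, 85, 45)],
    [(110, 70, 30), (140, 100, 60), (115, 75, 35)],
]

with tempfile.TemporaryDirectory() as folder:
    image_path = Path(folder) / "floor.ppm"
    write_ppm(image_path, floor)
    loaded = read_ppm(image_path)

blue = blue_channel(loaded)
for row in blue:
    print(" ".join(f"{value:3d}" for value in row))
```

The printed grid is the same kind of "bunch of numbers" as the blue channel data mentioned earlier, just for a much smaller picture.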

And here’s the same image viewed with R:

Data pro vs amateur difference #2 — Handling lots of data with ease

The second difference is that a pro can work with obscene amounts of data. Even though I’ve been playing with data for more than two decades, I still prefer to open a single photo in my browser or even MS Paint rather than in R or Python. So, besides the flexibility of being able to open lots of different data types, what’s the selling point for learning the analytics pro tools? Well, what if you want to make sense of a million photos?

You *could* try to use MS Paint to make sense of them all, but at the speed of 1 second per image, it’ll take you more than a month of full-time work. A pro can do it in minutes with the right tools by using code to process and summarize vast amounts of data.
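To make the arithmetic concrete, here is a small sketch. The 8-hour workday and the 21 working days per month are my assumptions for "full-time work", and the "photos" are randomly generated grids of brightness values rather than real images, so treat it as an illustration of the idea and not a benchmark.

```python
import random

SECONDS_PER_HOUR = 3600
HOURS_PER_WORKDAY = 8     # assumption: full-time means 8-hour days
WORKDAYS_PER_MONTH = 21   # assumption: roughly 21 working days in a month


def manual_review_workdays(n_images, seconds_per_image=1.0):
    """Workdays needed to look at every image by hand, one after another."""
    hours = n_images * seconds_per_image / SECONDS_PER_HOUR
    return hours / HOURS_PER_WORKDAY


def fake_photo(rng, width=4, height=4):
    """A made-up grayscale 'photo': a small grid of values from 0 to 255."""
    return [[rng.randrange(256) for _ in range(width)] for _ in range(height)]


def mean_brightness(photo):
    """Summarize one photo as a single number."""
    values = [value for row in photo for value in row]
    return sum(values) / len(values)


# One million photos at one second each is about 34.7 eight-hour workdays.
workdays = manual_review_workdays(1_000_000)
months = workdays / WORKDAYS_PER_MONTH
print(f"{workdays:.1f} workdays, or about {months:.1f} working months")

# The same kind of summary done in code: one loop over every photo.
rng = random.Random(0)
photos = [fake_photo(rng) for _ in range(10_000)]
summaries = [mean_brightness(photo) for photo in photos]
print(f"summarized {len(summaries)} photos; "
      f"darkest {min(summaries):.1f}, brightest {max(summaries):.1f}")
```

The loop does not care whether the list holds ten photos or ten million; only the running time changes.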

How do you start learning these tools? You look up how to install them (R and Python are free) and start playing with them. Just like MS Paint, but better. Simply do a Google search for whatever task you’re trying to achieve with them and read the results.

Here’s the first result that comes up in response to the search query above:

Boom. That’s all you need.

Well, if you’ve never used R before, your next search will need to be “How do I install a package in R?” but after that, you’re golden. Just copy-paste the code in the answer, replacing “my image” with the filename and filepath for your photo. Not sure what those terms mean? Do a search to look them up. When you’ve run out of things you have to look up, you will have mastered the task you set out to learn. Looking stuff up is how developers develop (pun intended).
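If "filename" and "filepath" are new terms, a tiny sketch may help. The location below is hypothetical (there is no such folder on your machine unless you create it), and `PurePosixPath` only inspects the text of the path without touching any files.

```python
from pathlib import PurePosixPath

# Hypothetical location; replace it with wherever your own photo lives.
image_path = PurePosixPath("/home/me/Pictures/floor.png")

print(image_path)         # the filepath: where the file lives, name included
print(image_path.name)    # the filename: floor.png
print(image_path.suffix)  # the extension, a hint about the data format: .png
print(image_path.parent)  # the folder that contains the file
```

Whatever code you copy from a search result, the filepath is usually the one piece you have to change to make it work on your own data.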

Do a whole bunch of this and one day you’ll wake up to the realization that you’ve accidentally developed pro software skills.

This tweet made me laugh. If you don’t get it, the point is that he’s a real software developer already… copying and pasting *is* the job. Same goes for data analytics code skills. You learn by looking up how to do a task, and then adding it to your toolbox.

One reason I love programming is that it’s a cross between magic spells and LEGO. To learn the abracadabra that gets your task done, you look it up on the internet… which is itself data analytics!

Seriously, you don’t need a course. Simply challenge yourself to look at as many new data formats as you can in R or Python (they’re both good), and, along the way, keep asking the internet how to overcome any hurdles that come up. After you open the data (here’s how to find data to look at), come up with a question that strikes your fancy and try to use the tool to get an answer. Start small and get more ambitious as you go along. There’s nothing stopping you! Have fun!
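Here is what "open the data, then ask it a question" can look like for a different format, a small table. The numbers are made up for the example and stand in for a file you might download; the question ("which day had the most steps?") is just one that struck my fancy.

```python
import csv
import io

# Made-up data standing in for a downloaded CSV file; not real measurements.
raw_text = """day,steps
Mon,4200
Tue,8100
Wed,6500
"""

# Open the data: each row becomes a dictionary keyed by the column names.
rows = list(csv.DictReader(io.StringIO(raw_text)))

# Ask a question: which day had the most steps?
busiest = max(rows, key=lambda row: int(row["steps"]))
print(busiest["day"], busiest["steps"])
```

Once that works, swap in a real file and a slightly harder question, and keep going.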

Data pro vs amateur difference #3 — Immunity to data science bias

In my opinion, learning the tools is the easy part. The hard part is adopting the analytics mindset, which is what the next differences are all about. Starting with this one: the expert has developed an all-encompassing disrespect for data. Yes, you heard me.

Only a newbie pronounces “data” with a capital “D” and treats it as something magical. Professionals have been burned and had their hearts broken enough times to learn the hard way that data is just some stuff that humans decided to write down in electronic form. (More here.)

The advantage of data is memory, not quality.

Sprinkling some numbers into a story to make it more “sciency” might win the trust of amateurs, but seasoned analysts know better. They are immune to what I call data science bias: trusting information more when it smells of the data sciences. Adding a pretty graph to a nonsense report doesn’t fool them.

Experts understand that the advantage of data is memory, not quality, so they’re as skeptical of formal datasets as they are of the sights and sounds they take in by strolling down the street.

“With data, you’re still just another person with an opinion.”

One of my favorite pioneers of statistics, W. Edwards Deming, famously said that “without data, you’re just another person with an opinion.” That is true, but unfortunately so is this: “With data, you’re still just another person with an opinion.” Expert analysts understand this in their very bones.

To start building the same immunity, stop treating data as special. You’ve already (hopefully*) learned how to be sensible and skeptical with photos. For example, you know better than to take anything you see on Instagram as a true, unaltered, unbiased representation of reality. If you didn’t take the photo, you won’t trust the photo. Right? Right.

Stop treating data as special!

All the common sense rules you’ve learned for navigating the sights and sounds you’re exposed to in the wild also apply to structured data (numbers in a table/matrix/spreadsheet).

Equating data with truth is the same thing as believing everything that’s written in a book without knowing anything about the author. If you keep your wits about you and maintain a healthy skepticism, you’re well on your way to good analytics.

*There are some darling people who seem to have reached adulthood without learning that not everything you find online is true. If that’s you, may I gently suggest that analytics might not be the best career choice for you?

In addition to having more practice with professional tools, the professional analyst understands the, ahem, professional aspects of the profession, which we’ll cover in the next article in this series. For a sneak preview, here are the upcoming section headings:

Data pro vs amateur difference #4 — Understanding the career
Data pro vs amateur difference #5 — Refusing to be a data charlatan
Data pro vs amateur difference #6 — Resistance to confirmation bias
Data pro vs amateur difference #7 — Realistic expectations of data
Data pro vs amateur difference #8 — Knowing how to add value
Data pro vs amateur difference #9 — Thinking differently about time
Data pro vs amateur difference #10 — Nuanced view of excellence

If you’ve thought of any other differences that might not fall under these headings, let me know in the comments!
