Top 20 Frequently Asked Questions on Descriptive Statistics for Data Science Aspirants

Whether you’re just starting out or preparing for an interview, having a strong grasp of descriptive statistics is crucial.

In the world of data science, understanding and interpreting data is a fundamental skill. Descriptive statistics provides the tools to summarize and describe the essential features of a dataset.

This blog-plus-assessment will walk you through some of the most frequently asked questions on descriptive statistics, helping you assess your understanding and solidify your knowledge (and also prepare you for conceptual interview questions).

Frequently Asked Questions on Descriptive Statistics

Below are some multiple-choice questions (MCQs) that cover key concepts in descriptive statistics. Take your time to answer each question, and feel free to jot down your answers in a notepad.

1. What is the primary goal of descriptive statistics? A) To make inferences about a population B) To describe and summarize data C) To test hypotheses D) To establish causality

2. Which of the following is a measure of central tendency? A) Variance B) Standard deviation C) Mean D) Range

3. The median is: A) The most frequent value in a dataset B) The average of all data points C) The middle value when the data is ordered D) The difference between the highest and lowest values

4. Which of the following measures the spread of data in a dataset? A) Mode B) Median C) Variance D) Mean

5. What does the standard deviation represent? A) The average value of a dataset B) The square root of the variance C) The difference between the maximum and minimum values D) The middle value of a dataset

6. In a positively skewed distribution, which is true? A) Mean = Median = Mode B) Mean = Median > Mode C) Mode > Median > Mean D) Mean > Median > Mode

7. Which of the following is a formula for calculating the mean of a dataset? A) Sum of all values divided by the number of values B) Difference between maximum and minimum values C) The most frequent value D) Square root of the variance

8. If the mode of a dataset is 15 and the mean is 20, what can you infer about the distribution? A) It is negatively skewed B) It is uniform C) It is symmetric D) It is positively skewed

9. The range of a dataset is: A) The difference between the maximum and minimum values B) The average value of the dataset C) The square of the standard deviation D) The middle value of the dataset

10. Which of the following is not a measure of dispersion? A) Range B) Standard deviation C) Mean D) Variance

11. How do you calculate the median for an even number of observations? A) The value in the middle B) The average of the two middle values C) The most frequent value D) The sum of all values divided by the number of observations

12. What is the primary advantage of using the median over the mean? A) It is easier to calculate B) It is always a whole number C) It considers all data points D) It is less affected by extreme values

13. A dataset has a mean of 50 and a standard deviation of 5. What is the z-score of a value of 60? A) 2 B) -2 C) 1 D) -1

14. What is the formula for variance? A) Sum of squared deviations from the mean divided by the number of observations B) Sum of squared deviations from the mean C) Square root of the standard deviation D) Difference between the maximum and minimum values

15. Which measure is most appropriate for describing the center of a skewed distribution? A) Mean B) Range C) Mode D) Median

16. Which of the following statements is true about the mode? A) It is always greater than the mean B) There can be more than one mode in a dataset C) It is the average of all data points D) It is the middle value of the ordered data

17. In a perfectly normal distribution, the relationship between mean, median, and mode is: A) Mean = Median = Mode B) Mean = Median > Mode C) Mode > Median > Mean D) Mean > Median > Mode

18. Which of the following measures is not sensitive to outliers? A) Median B) Range C) Mean D) Standard deviation

19. A data point that lies far away from the other data points in a dataset is called: A) A median B) A mode C) An outlier D) A central tendency

20. The sum of all frequencies in a frequency distribution is equal to: A) The number of classes B) The number of data points C) The mean of the dataset D) The median of the dataset

Solutions and Explanations

Now that you’ve attempted the questions, it’s time to check your answers. Let’s go through each question and discuss the correct answer.

1. Answer: B) To describe and summarize data Descriptive statistics aims to provide a summary of the main features of a dataset, making it easier to understand.

2. Answer: C) Mean The mean, along with the median and mode, is a measure of central tendency.

3. Answer: C) The middle value when the data is ordered The median is the value that separates the higher half from the lower half of the data.

4. Answer: C) Variance Variance is a measure of the dispersion or spread of data points in a dataset.

5. Answer: B) The square root of the variance Standard deviation is the square root of the variance. It indicates how far data points typically lie from the mean, expressed in the same units as the data.

6. Answer: D) Mean > Median > Mode In a positively skewed distribution, the mean is typically greater than the median, which is greater than the mode.

7. Answer: A) Sum of all values divided by the number of values The mean is calculated by summing all the data points and dividing by the number of data points.

8. Answer: D) It is positively skewed A mode lower than the mean suggests a positive (right) skew: a long right tail pulls the mean above the most frequent value.

9. Answer: A) The difference between the maximum and minimum values. The range is a simple measure of the spread of the dataset.

10. Answer: C) Mean The mean is a measure of central tendency, not dispersion.

11. Answer: B) The average of the two middle values For an even number of observations, the median is the average of the two middle values.

12. Answer: D) It is less affected by extreme values The median is a more robust measure of central tendency when dealing with skewed distributions or outliers.

13. Answer: A) 2 Z-score is calculated as (Value - Mean) / Standard Deviation. Here, (60 - 50) / 5 = 2.

14. Answer: A) Sum of squared deviations from the mean divided by the number of observations. This is the population variance: square the deviations from the mean and average them. For a sample, the sum is usually divided by n - 1 instead of n.

15. Answer: D) Median The median is often used as the measure of central tendency in skewed distributions.

16. Answer: B) There can be more than one mode in a dataset A dataset can be bimodal or multimodal, meaning it has two or more modes.

17. Answer: A) Mean = Median = Mode In a perfectly normal distribution, the mean, median, and mode are all equal.

18. Answer: A) Median The median is largely unaffected by extreme values, making it a robust measure of central tendency, whereas the mean and standard deviation shift noticeably when an outlier is added.

19. Answer: C) An outlier An outlier is a data point that lies significantly outside the range of the other data points.

20. Answer: B) The number of data points. The sum of all frequencies in a frequency distribution equals the total number of observations in the dataset.
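If you would like to verify a few of these answers yourself, here is a minimal sketch using only Python's standard library. The sample values and the helper name z_score are my own assumptions for illustration; they are not part of the quiz. The only numbers taken from the quiz are the ones in question 13.

```python
import math
import statistics as st
from collections import Counter

# Hypothetical sample, chosen only for illustration (not taken from the quiz).
data = [2, 4, 4, 5, 7, 8]

# Central tendency (questions 2, 3, 7, 11, 16)
mean = st.mean(data)        # sum of all values / number of values
median = st.median(data)    # even count: average of the two middle values
modes = st.multimode(data)  # returns a list, since several modes can exist

# Dispersion (questions 4, 5, 9, 14)
data_range = max(data) - min(data)   # maximum minus minimum
pop_variance = st.pvariance(data)    # squared deviations divided by n
sample_variance = st.variance(data)  # squared deviations divided by n - 1
pop_std = math.sqrt(pop_variance)    # standard deviation = sqrt(variance)


def z_score(value, mean, std):
    """Number of standard deviations that `value` lies from `mean`."""
    return (value - mean) / std


# Question 13: mean 50, standard deviation 5, value 60
z = z_score(60, 50, 5)

# Outliers (questions 12 and 18): add one extreme value and compare
with_outlier = data + [100]
mean_shift = st.mean(with_outlier) - mean
median_shift = st.median(with_outlier) - median

# Question 20: the frequencies add up to the number of data points
frequencies = Counter(data)
total_frequency = sum(frequencies.values())

print("mean:", mean, "median:", median, "modes:", modes)
print("range:", data_range, "population variance:", pop_variance)
print("sample variance:", sample_variance, "population std:", pop_std)
print("z-score for question 13:", z)
print("mean shift:", mean_shift, "median shift:", median_shift)
print("sum of frequencies:", total_frequency, "data points:", len(data))
```

Notice how a single extreme value moves the mean by a large amount while the median barely changes. Also note that the statistics module separates pvariance (divide by n) from variance (divide by n - 1), which is the same population versus sample distinction behind question 14.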

Rating Your Performance

Now that you’ve checked your answers, it’s time to assess your performance:

18–20 Correct Answers: Excellent! You have a strong grasp of descriptive statistics.
15–17 Correct Answers: Good job! You have a solid understanding, but there’s room for improvement.
12–14 Correct Answers: Fair. You might want to review some concepts to strengthen your knowledge.
Below 12 Correct Answers: Needs improvement. Consider revisiting the fundamental concepts of descriptive statistics.

To Sum Up

I hope you enjoyed this fun and engaging way to assess your understanding of descriptive statistics. It’s not just about knowing the answers but truly grasping the concepts that will make you a more effective data scientist.

Remember, understanding descriptive statistics is crucial for any aspiring data scientist. It’s the foundation upon which more advanced statistical methods are built.

Whether you aced the quiz or found areas to improve, you’ve taken a valuable step in your learning journey.

I encourage you to reflect on your performance, revisit any challenging areas, and continue honing your skills.

I’d love to hear your thoughts and comments! Was this assessment helpful? Are there other topics you’d like to explore in a similar format? Your feedback will help shape future content, so please don’t hesitate to share what you’d like to learn more about.

Also, if you are working in Python, this blog will help you get up to speed with performing descriptive statistics on credit risk data. Check it out too!

Happy learning!

If you’re as passionate about AI, ML, DS, Strategy and Business Planning as I am, I invite you to connect with me on LinkedIn.