Learn Statistics for data science
Fundamentals of statistics you need to know to start your data science career
What is Statistics? 🤔
Statistics is the practice of collecting, organizing, and analyzing data (pieces of information). In other words, it is about analyzing numerical information and extracting meaning from data.
A statistic is a measure that describes only a sample of the population. Eg: Conducting a quiz by questioning random (voluntary) people.
Descriptive Statistics And Inferential Statistics:
Descriptive Statistics: Methods for summarizing and presenting the information observed in a sample or a population.
Inferential Statistics: Methods for using information from a sample to draw conclusions about the population.
- Statistics deals with population and sample.
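To make the difference concrete, here is a minimal Python sketch. The quiz scores are made-up data generated only for illustration, and the sample size of 200 is an arbitrary assumption. The first part describes the sample we observed (descriptive); the second part uses that sample to estimate something about the whole population (inferential).

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is repeatable

# Hypothetical population: quiz scores (0 to 100) for 10,000 people.
population = [random.randint(0, 100) for _ in range(10_000)]

# We only get to observe a sample of 200 people.
sample = random.sample(population, 200)

# Descriptive statistics: summarize the data we actually observed.
summary = {
    "mean": statistics.mean(sample),
    "median": statistics.median(sample),
    "min": min(sample),
    "max": max(sample),
}
print("Description of the sample:", summary)

# Inferential statistics: use the sample to say something about the population.
# Here the sample mean serves as our estimate of the population mean.
estimated_population_mean = summary["mean"]
print("Estimated population mean:", estimated_population_mean)
```

In practice an inferential result would also report how uncertain the estimate is; this sketch leaves that out to keep the idea simple.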
Parameter:
A parameter (P) describes the entire population. Eg: Conducting a quiz by questioning every person in the population, rather than only random (voluntary) people.
Population Vs sample:
Population :
- A population is the whole group of people or objects being studied; each person or object is a member of the population. The number of items in the population is denoted by N.
- Measuring an entire population is usually extremely difficult: it takes more time, costs more, and some members are hard to observe.
Sample :
A sample is a part of the population that we actually observe; its size is denoted by n. In other words, a sample is a subset of a population. To perform “inferential statistics” we take a sample from the population.
Error in Statistics:
We can face two different types of errors when working with statistics:
- Sampling error
- Non-sampling error
Sampling error: The sample mean differs from the population mean because the sample contains only part of the population. This is not a mistake made by the statistician (analyst); it is a natural result of sampling, and it tends to shrink as the sample gets larger.
Non-sampling error: This error is caused by poor sample design and inaccurate measurement. It is the kind of error we want to avoid.
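Sampling error can be seen in numbers with a short Python sketch. The heights below are made-up data, and `sampling_error` is a hypothetical helper written only for this illustration. It measures how far one sample's mean lands from the population mean.

```python
import random
import statistics

random.seed(1)  # fixed seed so the sketch is repeatable

# Hypothetical population of N = 5,000 heights in centimetres.
population = [random.gauss(170, 10) for _ in range(5_000)]
population_mean = statistics.mean(population)  # a parameter


def sampling_error(n):
    """Absolute gap between one sample's mean (a statistic) and the population mean."""
    sample = random.sample(population, n)
    return abs(statistics.mean(sample) - population_mean)


for n in (10, 100, 1_000, len(population)):
    print(f"n = {n:>5}: sampling error = {sampling_error(n):.3f} cm")
```

When the sample is the whole population (n equal to N), the gap disappears. For smaller samples the gap varies from draw to draw, which is why no exact values are shown here.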
Sampling Frame:
The sampling frame is the list of individuals from which the sample is selected. For example, a list of students enrolled in a college is a sampling frame. The list may be either physical or theoretical.
Types of Sampling
Simple Random Sampling
Stratified Sampling
Systematic Sampling
Convenience Sampling
i. Simple Random Sampling [SRS]:
A simple random sample (SRS) is a randomly selected subset of a population in which “every sample of a given size has an equal chance of being selected”. One practical way to draw it is to assign a number to each member of the population and pick numbers at random; the numbers must be unique, so that no two members share one.
Student ID card numbers are a good example of such numbers, because every individual student has a different ID card number.
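As a minimal sketch of simple random sampling in Python, assume a made-up list of 500 unique student ID numbers (the `S0001` format is invented for this example). The standard library function `random.sample` draws without replacement, so every student has the same chance of being picked and nobody is picked twice.

```python
import random

random.seed(42)  # fixed seed so the sketch is repeatable

# Hypothetical sampling frame: unique student ID numbers S0001 ... S0500.
student_ids = [f"S{number:04d}" for number in range(1, 501)]

# Draw 20 IDs at random, without replacement.
sample = random.sample(student_ids, k=20)
print(sample)
```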

ii. Stratified Sampling:
The population (N) is split into non-overlapping, well-separated groups (strata), then simple random sampling is done within each group to form the sample (n).
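The two steps (split into strata, then sample within each one) can be sketched in Python as follows. The student data and the `stratified_sample` helper are hypothetical. Taking the same fraction from every stratum is one common choice assumed here, not the only way to do it.

```python
import random
from collections import defaultdict

random.seed(7)  # fixed seed so the sketch is repeatable

# Hypothetical population: 600 students, each belonging to exactly one year group.
years = ["first", "second", "third"]
population = [(f"S{i:04d}", years[i % 3]) for i in range(600)]


def stratified_sample(units, fraction):
    """Group units by stratum, then draw a simple random sample inside each stratum."""
    strata = defaultdict(list)
    for unit_id, stratum in units:
        strata[stratum].append(unit_id)

    sample = {}
    for stratum, members in strata.items():
        size = max(1, round(len(members) * fraction))  # at least one per stratum
        sample[stratum] = random.sample(members, size)
    return sample


sample = stratified_sample(population, fraction=0.1)
for stratum, members in sample.items():
    print(stratum, len(members))
```

Because each year group holds 200 students here, a fraction of 0.1 gives 20 students per stratum.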
iii. Systematic Sampling :
- Every k-th individual from the population (N) is placed in the sample (n), where k is a fixed interval.
- Example: Take every 7th value from the data set. In a supermarket, if a staff member gives rewards only to the 7th customer, 14th customer, 21st customer, and so on, they are performing systematic sampling.
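The supermarket example can be sketched in a few lines of Python. The `systematic_sample` helper and the 50 numbered customers are assumptions made for illustration. Here `k` is the fixed interval (7 in the example), and `start` is the position of the first customer chosen.

```python
import random


def systematic_sample(items, k, start=None):
    """Return every k-th item, beginning at 1-based position `start`."""
    if start is None:
        # A common choice: pick a random starting point within the first interval.
        start = random.randint(1, k)
    return items[start - 1::k]


customers = list(range(1, 51))  # customers numbered 1 to 50
rewarded = systematic_sample(customers, k=7, start=7)
print(rewarded)  # [7, 14, 21, 28, 35, 42, 49]
```

Leaving `start` unset picks a random starting point, which avoids always favouring the same positions in the queue.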
iv. Convenience Sampling :
Convenience sampling is one of the easiest sampling methods: you pick the easiest way of getting your sample. In other words, “easily obtained individuals from the population are placed in the sample (n)”.
- Voluntary response sampling, where people choose for themselves whether to take part, is a closely related method.
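A small Python sketch shows what "easily obtained" can look like in practice, and why it needs care. The visitor ages are made-up data, deliberately ordered so that older visitors arrive first. Taking the first 50 people through the door is convenient, but in this invented setup it over-represents older visitors compared with a simple random sample.

```python
import random
import statistics

random.seed(3)  # fixed seed so the sketch is repeatable

# Hypothetical visitor ages in arrival order; in this made-up data,
# older visitors arrive first and younger visitors arrive later.
visitors = sorted((random.randint(18, 80) for _ in range(1_000)), reverse=True)

convenience = visitors[:50]                  # the first 50 people through the door
simple_random = random.sample(visitors, 50)  # 50 people chosen at random

print("population mean age:", statistics.mean(visitors))
print("convenience sample mean age:", statistics.mean(convenience))
print("simple random sample mean age:", statistics.mean(simple_random))
```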
Thank you..




