Probability Theory #7 — F (Fisher) Distribution

The F distribution, also known as the Fisher–Snedecor distribution, is about comparing variances. Let’s have GPT-4 explain it to us in straightforward terms.
Imagine you’re comparing two groups of students from different schools, School A and School B. You want to know if one school consistently has better test scores than the other.
Think of the F distribution as a tool that helps you decide whether the differences in test scores between the two schools are real or just a result of random chance.
As you have read, it is about comparing two groups of data (in this case, students) and asking whether one group has better test scores. We want to answer that question scientifically, using a statistical test.
What is interesting about this distribution is that it weighs two kinds of variation against each other:
when groups are genuinely distinct, members within each group do not differ much among themselves, but the group averages differ a lot from one group to the next.
Personally, this is one of my favourite distributions. It makes total sense: members assigned to a group should be similar to each other, since they all describe the same group. That’s the reason why we have meaningful groups.
But at the same time, groups should differ from each other, as a grouping is only meaningful when the groups differ from one another.
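To make the within-group versus between-group idea concrete, here is a minimal Python sketch of the ratio that the F distribution describes. The test scores are made up for illustration, and the function name `f_statistic` is my own. It follows the standard one-way ANOVA calculation rather than any particular library.

```python
def f_statistic(groups):
    """Ratio of between-group variance to within-group variance (one-way ANOVA)."""
    all_scores = [score for group in groups for score in group]
    grand_mean = sum(all_scores) / len(all_scores)

    # Between groups: how far each group's average sits from the overall average.
    ss_between = sum(
        len(group) * (sum(group) / len(group) - grand_mean) ** 2 for group in groups
    )
    df_between = len(groups) - 1

    # Within groups: how far each member sits from its own group's average.
    ss_within = sum(
        (score - sum(group) / len(group)) ** 2 for group in groups for score in group
    )
    df_within = len(all_scores) - len(groups)

    return (ss_between / df_between) / (ss_within / df_within)


# Hypothetical test scores, invented for this example.
school_a = [72, 75, 78, 74, 76]
school_b = [82, 85, 88, 84, 86]

f_value = f_statistic([school_a, school_b])
print(f_value)  # 50.0
```

A large value means the schools' averages are far apart relative to the spread of scores inside each school. A value near 1 means the two kinds of variation are about the same size.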
Now the question is: how much difference must we observe before we can say, scientifically speaking, that the groups are different?
I have gone to great lengths to discuss chance and non-chance (pattern) in my previous writings. When we say that we want to observe non-chance or pattern, we want the probability of seeing a result at least this extreme by chance alone to be 5% or less. If we compare the ratio of between-group to within-group differences against the model data (technically, the F distribution), do we observe a significant shift away from the model data?
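The 5% rule can be sketched in Python by simulating the model data directly. The snippet below is only an illustration under stated assumptions: two groups of five students each (so 1 and 8 degrees of freedom), a made-up observed ratio of 50.0, and a hypothetical helper `simulate_f` that builds F values from sums of squared standard normal draws. In practice a statistics library would give exact figures instead of simulated ones.

```python
import random


def simulate_f(df_between, df_within, draws=20000, seed=0):
    """Simulate draws from an F distribution as a ratio of scaled chi-square sums."""
    rng = random.Random(seed)
    values = []
    for _ in range(draws):
        # A chi-square value with k degrees of freedom is a sum of k squared standard normals.
        numerator = sum(rng.gauss(0, 1) ** 2 for _ in range(df_between)) / df_between
        denominator = sum(rng.gauss(0, 1) ** 2 for _ in range(df_within)) / df_within
        values.append(numerator / denominator)
    return sorted(values)


observed_f = 50.0  # hypothetical observed ratio, assumed for this example
model = simulate_f(df_between=1, df_within=8)

# Value that only about 5% of chance-only ratios exceed.
critical_value = model[int(0.95 * len(model))]

# Share of chance-only ratios at least as large as the observed one.
p_value = sum(value >= observed_f for value in model) / len(model)

significant = p_value <= 0.05
print(critical_value, p_value, significant)
```

If the observed ratio lands beyond the simulated 5% cut-off, chance alone is an unlikely explanation for the difference between the groups.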
One of the most useful applications of this distribution is identifying what makes groups of nationalities distinct when it comes to immigration. Ideally, we should observe certain similarities among fellow countrymen. However, as we know, Americans who grew up in the USA with two American parents differ quite a lot from Asians who grew up in Asia with two Asian parents. The key question is this:
What factors truly determine these differences?
I’m pretty sure our immigration officers will be keen to know these factors, so as to provide a clearer profile of visitors.

Daniel started off his career as a senior list researcher with a British publishing firm. Back then, his role involved sourcing contacts through the internet and performing data entry into the Microsoft Dynamics CRM system (Microsoft Dynamics CRM 3.0). Progressively, he explored the option of using Visual Basic scripting within Excel to automate the contact sourcing process.
He successfully developed and implemented the scripts, leading to a 95% increase in data entry efficiency. He then moved on to take on the role of a CRM executive with Fuji Xerox Singapore.
As a CRM executive, he liaised with a third-party vendor for technical enhancement of the CRM system (Microsoft Dynamics CRM 4.0 and 365). He also performed functional enhancement of the CRM system for hundreds of end users.
His notable achievement was the development of the CRM boy that led to a 98% improvement in data quality and data integrity in the CRM system. Following his Master's studies in Consumer Insight with Nanyang Business School, he took on the role of an Analytics instructor with Singapore Management University. He prepared class notes and technical walkthroughs, and taught Analytics to undergraduate students from various disciplines. Subsequently, he took on various roles as a consultant in the consultancy, manufacturing and information technology industries in Singapore.

He travelled to Paris, London, Sri Lanka, Japan and Malaysia to fulfill his role as a consultant. The cultural and professional exchanges between local and overseas data analytics practitioners gave him a very good overview of the expectations and motivations of people around the world. He also had a chance to relocate to the United States for one year, particularly focusing on Operations Management.
Prior to his current freelance status, he took on the role of Data Science Lead in a Singaporean software company. His primary role was to develop Artificial Intelligence using logic, data science and machine learning techniques through in-depth, full-stack scripting. He also developed customized reporting for his customers. In his view, 95% of today’s reporting can be automated, which can free up staff from daily manual work.

He holds a Bachelor of Science in Marketing (BSc. Marketing Pass with Merit) from Singapore University of Social Sciences (from which he graduated as valedictorian), a Master of Science in Marketing and Consumer Insights (MSc. Marketing and Consumer Insights) from Nanyang Technological University, and a Doctor of Business Administration (DBA) from Swiss School of Business and Management.






