Dr. Daniel Koh


Abstract

<p id="038a">However, the transformation of the alternative hypothesis to the null hypothesis must not be a translation of chance to probability. As described in Lindley’s paradox, a 95% non-chance does not mean 95% posterior probability for the null hypothesis, and a high posterior probability does not mean a high likelihood for a very small prior (i.e. 5%) in Bayesian analysis. Implying likelihood, in this case, is erroneous. It is very important to note that the 95% as described above is the probability of non-chance (Lindley, 1957) and not a 95% posterior probability.</p><p id="f19c">The way the hypotheses are formed matters in this research paper. The null hypothesis (cradle) follows the Acceptance-Support Null Hypothesis Significance Testing (AS-NHST) approach, which claims that the researcher believes in the null hypothesis to be true, and accepting it will lead to validation of what the researcher believes. The other null hypothesis (candle) follows the Rejection-Support Null Hypothesis Significance Testing (RS-NHST) approach, which claims that the researcher does not believe that the null hypothesis is true, and rejecting it will lead to validation of what the researcher believes. (Nickerson, 2000) Researchers need to know which approach should be adopted in a study.</p><p id="0305">In Hypothesis testing, reproducibility is paramount to the generalizability of the research. A small sample with a small p-value may be effective in the current research, but not generalizable to the population. 
And applicability to the population allows replicability or reproducibility. This brings us to the point about statistical power. High reproducibility (being able to replicate the research and obtain similar results) requires high statistical power and a larger sample size (Schmidt & Hunter, 1997). However, in most social and psychological research, a large sample size is often not a feasible option due to budget constraints and logistical challenges. Nonetheless, a study that allocates sample sizes to subgroups that potentially represent the subgroups of the populations (within a reasonably large sample size based on the confidence level and margin of error) can produce strong statistical power. (<i>ibid.</i>) In this research paper, respondents were asked to choose a flower from a list of options. Each choice allocates one respondent to a sub-group. There are 64 or 65 respondents<a href="#_ftn6">[6]</a> in each sub-group that answer questions for a specific scenario. In total, the sample size in each group adds up to the total sample size, generalizable to the population.</p><p id="0bb3">As a final note, it is unfair to throw the entire hypothesis out of the window when the significance level that determines the critical region, commonly known as alpha (α), is 5% and the p-value is .051<a href="#_ftn7">[7]</a>. Can we truly not reject the null hypothesis simply because the p-value is 0.001 above the significance level? It becomes a contentious topic when we observe marginally significant results and yet we throw the entire hypothesis out of the window by not rejecting the null hypothesis when it is, in fact, false. Failing to reject a false null hypothesis is referred to as a type II error in hypothesis testing, and its probability is denoted “beta”<a href="#_ftn8">[8]</a>.</p><p id="c90c">At the end of the day, a p-value and the rejection of a hypothesis (or not rejecting it) truly depend on the researcher’s observation, data on hand, and experience. It also depends on auxiliary theories or the nature of the psyche in any form of experiment. 
(Meehl, 1990a, 1990b) Some researchers argue that the results arising from hypothesis testing by observing the probability of chance are imprecise and give little information about the event. There is a need to think statistically rather than ritually practice statistics. (Gigerenzer, 1998) A common point of agreement among these researchers is that Null Hypothesis Significance Testing should serve as a guide (Abelson, 1995; McDonald, 2016) rather than a rule, and a smart researcher should know when hypothesis testing makes sense in a given context and scenario rather than using hypothesis testing as a norm. (cf. Dar, 1998; Harlow et al., 1997) The use of the p-value to reject the null hypothesis depends on the planned manipulation designed by the researchers to observe the anticipated effect, and such practice should be made explicit before the start of research work. (Rogers et al., 1993)</p><p id="51ee">It is possible to use the ‘good enough’ non-nil null hypothesis instead of the point null hypothesis to test for the existence of chance. The ‘good enough’ assessment comes from statistical outputs such as Cohen’s <i>d<a href="#_ftn9"><b>[9]</b></a> </i>or confidence level. In this approach, we can better determine not just the magnitude of the association but also the direction based on the variability of the data. It can give us a better picture of the relationship between variables. This method calls for the theory to be assessed using logic (conjoint or disjoint) rather than point (definitive and only one hypothesis to discover the truth). Refer to the paper written by Cohen (1994) for more information on the method of nil hypothesis.</p><p id="cb9e">Finally, evidence suggests that people do not take note of the contrapositive likelihood ratio within the Bayesian framework. 
(Beyth-Marom & Fischhoff, 1983; Griffin & Tversky, 1992; Troutman & Shanteau, 1977) The likelihood ratio refers to the probability of observing the data given a true hypothesis and the contrapositive refers to the probability of observing the data when the hypothesis is not true. Popper (2005) suggested that a scientific theory is not proven to be scientific if falsification is not achieved. Plainly speaking, Popper argued that the contrapositive must be observed together with the positive for a scientific theory to be accepted<a href="#_ftn11">[10]</a>. We tend to display confirmation bias by favoring a particular hypothesis with which we agree and neglect the contrapositive which can likewise provide a confirmatory outcome. Confirmation bias refers to the irrational favoring of a particular position or belief, such that the other positions or beliefs weigh lesser. In our research paper, we focus on the likelihood ratio of observing data (i.e. chance versus non-chance) that follows the hypothesis but have yet to compute the probability of observing the contrapositive. In subsequent research work, researchers may look into the contrapositive within the Bayesian framework.</p><p id="2c08">References</p><p id="f685">Abelson, R. P. (1995). <i>Statistics as Principled Argument</i>. Psychology Press.</p><p id="e8c1">Bakan, D. (1966). The test of significance in psychological research. <i>Psychological Bulletin</i>, <i>66</i>(6), 423.</p><p id="8ef9">Bartko, J. J. (1991). <i>Proving the null hypothesis.</i></p><p id="c51c">Berkson, J. (1942). Tests of significance considered as evidence. <i>Journal of the American Statistical Association</i>, <i>37</i>(219), 325–335.</p><p id="4885">Beyth-Marom, R., & Fischhoff, B. (1983). Diagnosticity and pseudodiagnosticity. <i>Journal of Personality and Social Psychology</i>, <i>45</i>, 1185–1195. <a href="https://doi.org/10.1037/0022-3514.45.6.1185">https://doi.org/10.1037/0022-3514.45.6.1185</a></p><p id="5d52">Brewer, J. K. 
(1985). Behavioral statistics textbooks: Source of myths and misconceptions? <i>Journal of Educational Statistics</i>, <i>10</i>(3), 252–268.</p><p id="40ed">Carver, R. (1978). The case against statistical significance testing. <i>Harvard Educational Review</i>, <i>48</i>(3), 378–399.</p><p id="9a84">Cohen, J. (1994). The earth is round (p < .05). <i>American Psychologist</i>, <i>49</i>, 997–1003. <a href="https://doi.org/10.1037/0003-066X.49.12.997">https://doi.org/10.1037/0003-066X.49.12.997</a></p><p id="8086">Cronbach, L. J. (1975). Beyond the two disciplines of scientific psychology. <i>American Psychologist</i>, <i>30</i>(2), 116.</p><p id="230b">Crosier, P. (n.d.). <i>LibGuides: Quantitative data collection and analysis: Testing hypotheses</i>. Retrieved 21 April 2023, from <a href="https://libguides.tees.ac.uk/quantitative/testing">https://libguides.tees.ac.uk/quantitative/testing</a></p><p id="3a0a">Dagnall, N. A., Parker, A., & Munley, G. (2007). Paranormal belief and reasoning. <i>Personality and Individual Differences</i>, <i>43</i>(6), Article 6.</p><p id="b039">Dar, R. (1998). Null hypothesis tests and theory corroboration: Defending NHSTP out of context. <i>Behavioral and Brain Sciences</i>, <i>21</i>(2), 196–197. <a href="https://doi.org/10.1017/S0140525X98251168">https://doi.org/10.1017/S0140525X98251168</a></p><p id="9075">Estes, W. (1997). On the communication of information by displays of standard errors and confidence intervals. <i>Psychonomic Bulletin & Review</i>, <i>4</i>(3), 330–341.</p><p id="7626">Falk, R. (1986). Misconceptions of statistical significance. <i>Journal of Structural Learning</i>.</p><p id="8065">Falk, R., & Greenbaum, C. W. (1995). Significance tests die hard: The amazing persistence of a probabilistic misconception. <i>Theory & Psychology</i>, <i>5</i>(1), 75–98.</p><p id="9c50">Folger, R. (1989). <i>Significance tests and the duplicity of binary decisions.</i></p><p id="db7f">Frick, R. W. (1995). 
Accepting the null hypothesis. <i>Memory & Cognition</i>, <i>23</i>(1), 132–138.</p><p id="4f68">Frick, R. W. (1996). The appropriate use of null hypothesis testing. <i>Psychological Methods</i>, <i>1</i>, 379–390. <a href="https://doi.org/10.1037/1082-989X.1.4.379">https://doi.org/10.1037/1082-989X.1.4.379</a></p><p id="ae05">Gigerenzer, G. (1998). We need statistical thinking, not statistical rituals. <i>Behavioral and Brain Sciences</i>, <i>21</i>(2), 199–200. <a href="https://doi.org/10.1017/S0140525X98281167">https://doi.org/10.1017/S0140525X98281167</a></p><p id="05bf">Gigerenzer, G., & Murray, D. J. (2015). <i>Cognition as intuitive statistics</i>. Psychology Press.</p><p id="06a7">Good, I. J. (1981). Some Logic and History of Hypothesis Testing. In J. C. Pitt (Ed.), <i>Philosophy in Economics: Papers Deriving from and Related to a Workshop on Testability and Explanation in Economics held at Virginia Polytechnic Institute and State University, 1979</i> (pp. 149–174). Springer Netherlands. <a href="https://doi.org/10.1007/978-94-009-8394-6_10">https://doi.org/10.1007/978-94-009-8394-6_10</a></p><p id="27fa">Grant, D. A. (1962). Testing the null hypothesis and the strategy and tactics of investigating theoretical models. <i>Psychological Review</i>, <i>69</i>(1), 54.</p><p id="b2e7">Griffin, D., & Tversky, A. (1992). The weighing of evidence and the determinants of confidence. <i>Cognitive Psychology</i>, <i>24</i>(3), 411–435. <a href="https://doi.org/10.1016/0010-0285(92)90013-R">https://doi.org/10.1016/0010-0285(92)90013-R</a></p><p id="f5be">Harlow, L. L., Mulaik, S. A., & Steiger, J. H. (Eds.). (1997). There is a Time and Place for Significance Testing. In <i>What If There Were No Significance Tests?</i> Psychology Press.</p><p id="a625">Kupfersmid, J., & Fiala, M. (1991). <i>A survey of attitudes and behaviors of authors who publish in psychology and education journals.</i></p><p id="1c16">Lindley, D. V. (1957). A statistical paradox. <i>Biometrika</i>, <i>44</i>(1/2), 187–192.</p><p id="d9d4">MasterClass. (2022, March 8). <i>Scientific Inquiry Definition: How the Scientific Method Works — 2023</i>. MasterClass. 
<a href="https://www.masterclass.com/articles/scientific-inquiry">https://www.masterclass.com/articles/scientific-inquiry</a></p><p id="d474">McDonald, R. P. (2016). Goodness of Approximation in the Linear Model. In <i>What If There Were No Significance Tests?</i> Routledge.</p><p id="4d53">Meehl, P. E. (1967). Theory-testing in psychology and physics: A methodological paradox. <i>Philosophy of Science</i>, <i>34</i>(2), 103–115.</p><p id="2bd5">Meehl, P. E. (1990a). Appraising and Amending Theories: The Strategy of Lakatosian Defense and Two Principles That Warrant It. <i>Psychological Inquiry</i>, <i>1</i>(2), 108–141.</p><p id="bc3e">Meehl, P. E. (1990b). Why Summaries of Research on Psychological Theories are Often Uninterpretable. <i>Psychological Reports</i>, <i>66</i>(1), 195–244. <a href="https://doi.org/10.2466/pr0.1990.66.1.195">https://doi.org/10.2466/pr0.1990.66.1.195</a></p><p id="8ed8">Nickerson, R. S. (2000). Null hypothesis significance testing: A review of an old and continuing controversy. <i>Psychological Methods</i>, <i>5</i>(2), 241–301. <a href="https://doi.org/10.1037/1082-989X.5.2.241">https://doi.org/10.1037/1082-989X.5.2.241</a></p><p id="7fc2">Popper, K. (2005). <i>The logic of scientific discovery</i>. Routledge.</p><p id="40d2">Pruzek, R. M. (2016). An Introduction to Bayesian Inference and Its Applications. In <i>What If There Were No Significance Tests?</i> Routledge.</p><p id="809f">Ranganathan, P., & Pramesh, C. (2019). An Introduction to Statistics: Understanding Hypothesis Testing and Statistical Errors. <i>Indian Journal of Critical Care Medicine : Peer-Reviewed, Official Publication of Indian Society of Critical Care Medicine</i>, <i>23</i>(Suppl 3), S230–S231. <a href="https://doi.org/10.5005/jp-journals-10071-23259">https://doi.org/10.5005/jp-journals-10071-23259</a></p><p id="fb34">Riedel, S. (2005). Edward Jenner and the history of smallpox and vaccination. <i>Proceedings (Baylor University. 
Medical Center)</i>, <i>18</i>(1), 21–25.</p><p id="b06b">Rogers, J. L., Howard, K. I., & Vessey, J. T. (1993). Using significance tests to evaluate equivalence between two experimental groups. <i>Psychological Bulletin</i>, <i>113</i>(3), 553–565. <a href="https://doi.org/10.1037/0033-2909.113.3.553">https://doi.org/10.1037/0033-2909.113.3.553</a></p><p id="c1da">Rosnow, R. L., & Rosenthal, R. (1992). Statistical procedures and the justification of knowledge in psychological science. In <i>Methodological issues & strategies in clinical research</i> (pp. 295–314). American Psychological Association. <a href="https://doi.org/10.1037/10109-027">https://doi.org/10.1037/10109-027</a></p><p id="9140">Schmidt, F. L., & Hunter, J. E. (1997). Eight common but false objections to the discontinuation of significance testing in the analysis of research data. <i>What If There Were No Significance Tests</i>, 37–64.</p><p id="8f23">Troutman, C. M., & Shanteau, J. (1977). Inferences based on nondiagnostic information. <i>Organizational Behavior and Human Performance</i>, <i>19</i>(1), 43–55. <a href="https://doi.org/10.1016/0030-5073(77)90053-8">https://doi.org/10.1016/0030-5073(77)90053-8</a></p><p id="109d">Tversky, A., & Kahneman, D. (1983). Extensional versus intuitive reasoning: The conjunction fallacy in probability judgment. <i>Psychological Review</i>, <i>90</i>(4), 293.</p><p id="7615"><a href="#_ftnref1">[1]</a> The scientific inquiry refers to the ways in which scientists study nature and provide some explanations based on the evidence of a study. 
(MasterClass, 2022)</p><p id="7bc8"><a href="#_ftnref2">[2]</a> This is usually referred to as the substantive hypothesis.</p><p id="7105"><a href="#_ftnref3">[3]</a> This is usually referred to as the statistical hypothesis.</p><p id="722c"><a href="#_ftnref4">[4]</a> As of now, it is endemic with many countries removing the requirements for mask-wearing.</p><p id="f2be"><a href="#_ftnref5">[5]</a> We are not saying that hypotheses that pose a challenge are not statements to prove. Rather, the motivation to test the hypotheses comes from a challenge more than proof. For example, we know the sun rises from the east every day. There is no need for further proof. But we can challenge this truth for the sake of making a point!</p><p id="9ba7"><a href="#_ftnref6">[6]</a> As the sample size is an odd number of</p><p id="b830">, there will be one group which has 64 respondents.</p><p id="1632"><a href="#_ftnref7">[7]</a> This value is translated to 5.1% for the purpose of comparison.</p><p id="132f"><a href="#_ftnref8">[8]</a> This is the reason why the LAS function uses the b-parameter rather than the ‘beta’ Greek letter. There will be much confusion when the beta is used in both scenarios in a text.</p><p id="ebbf"><a href="#_ftnref9">[9]</a> Cohen’s d considers the variability of the data. It takes the difference between two mean values, and dividing by the data’s standard deviation. As you may have noticed, the variability of the data is considered.</p><p id="64e1"><a href="#_ftnref11">[10]</a> Popper argued his position with the possibility of other explanations that might exists in the contrapositive.</p><figure id="0e43"><img src="https://cdn-images-1.readmedium.com/v2/resize:fit:800/1*hUzXXbWnQht5xBpaeHWtZg.jpeg"><figcaption><b>Dr. Daniel Koh</b></figcaption></figure><p id="77a1">Daniel started off his career as a senior list researcher with a British publishing firm. 
Back then, his role involved contact sourcing through the internet and performed data entry into the Microsoft Dynamic CRM system. (Microsoft Dynamic CRM 3.0) Progressively, he explored the option of using Visual Basic scripting within excel to automate the contact sourcing process.</p><p id="6ece">He successfully developed and implemented the scripts, leading to 95% increase in data entry efficiency. He then moved on to take on the role of a CRM executive with Fuji Xerox Singapore.</p><p id="4040">As a CRM executive, he liaised with third party vendor for technical enhancement of the CRM system (Microsoft Dynamic CRM 4.0 and 365). He also performs functional enhancement of the CRM system for hundreds of end users.</p><p id="5a95">His notable achievement was the development of the CRM boy that led to 98% improvement in data quality and data integrity in the CRM system. Following his Masters studies in Consumer Insight with Nanyang Business School, he took on the role of an Analytics instructor with Singapore Management University. He prepared class notes and technical walkthrough, and taught Analytics to the undergraduate students from various disciplines. Subsequently, he took on various roles as consultants in the consultancy, manufacturing and information technology industries in Singapore.</p><figure id="7ea7"><img src="https://cdn-images-1.readmedium.com/v2/resize:fit:800/0*VWhPhUkYYQx2VBMU.jpeg"><figcaption></figcaption></figure><p id="87c8">He travelled to Paris, London, Sri Lanka, Japan and Malaysia to fulfill his role as a consultant. The cultural and professional exchanges between local and overseas data analytics had given him a very good overview of the expectations and motivations from people around the world. 
He also had a chance to relocate to the United States for one year, particularly focusing on Operations Management.</p><p id="62f9">Prior to his current freelance status, he took on the role of the Data Science Lead in a Singaporean software company. His primary role was to develop Artificial Intelligence using logic, data science and machine learning techniques through in-depth, full-stacked scripting. He also developed customized Reporting for his customers. In his point of view, 95% of today’s reporting can be automated, which can free up staff from daily manual work.</p><figure id="e41d"><img src="https://cdn-images-1.readmedium.com/v2/resize:fit:800/0*FvkpvIpBBAPadr0u.jpeg"><figcaption></figcaption></figure><p id="326e">He holds a Bachelor of Science in Marketing (BSc. Marketing Pass with Merit) from Singapore University of Social Sciences (in which he graduated as a Valedictorian), a Master of Science in Marketing and Consumer Insights (MSc. Marketing and Consumer Insights) from Nanyang Technological University, a Doctor of Business Administration (DBA) from Swiss School of Business and Management.</p></article></body>

Null Hypothesis and Alternative Hypothesis

(The following writeup is taken from my doctoral dissertation)

Photo by Rohan Makhecha on Unsplash

In a typical scientific inquiry[1], null hypotheses are expected to be rejected and if they are not rejected, a researcher is expected to review the scope of the study or the research design. To further complicate the matter, many academic journal reviewers favorably consider papers that reject the null hypothesis. (Kupfersmid & Fiala, 1991) However, we take the view that not all null hypotheses are to be rejected.

First, there are two kinds of null hypotheses in a research study. There are null hypotheses that test the ‘cradle’[2] of the study and there are null hypotheses that test the ‘candle’[3] of the study. The cradle is the object that holds the candle in place and collects the hardened wax once it is melted and cooled. In our context, the cradle is the foundation of the study, and it falls apart if the hypotheses are rejected. And the candle is the test hypothesis, and it fails to prove a point if the hypotheses are not rejected.

Consider this simple example. When COVID-19 became a pandemic[4], the null hypothesis that the world was most interested in was the following: the COVID-19 vaccine does not help subjects resist the virus. Rejecting this null hypothesis (candle) would mean that the vaccine helps the subjects resist the virus. However, underlying this null hypothesis are several other null hypotheses (cradle) that are expected not to be rejected. For example, the vaccine has no disparate effect across demographic groups in helping subjects resist the virus. If this null hypothesis were rejected, we would face a larger problem with the effectiveness of the vaccine. Moreover, if age group proved to produce a statistically significant difference in effect, different vaccines would likely be needed for different age groups. In this simple example, we argue that the ‘candle’ must be rejected but the ‘cradle’ must not be rejected.

However, there’s the other side of the story. Consider the following example. When English physician Edward Jenner discovered vaccination by inoculating an 8-year-old James Phipps with ‘matter collected from a cowpox sore on the hand of a milkmaid’, he had to test the following null hypothesis (candle): the subject will not be reinfected by the same virus when the subject is inoculated with a matter of the virus earlier. This was the hypothesis that existed in the minds of everyone in those days. However, Edward would need to prove the effectiveness of the method by NOT rejecting the null hypothesis: by concluding that the subject will NOT be reinfected by the same virus. Hence, we have an example whereby the ‘candle’ must NOT be rejected. (Bartko, 1991; Frick, 1995)

Unfortunately, not rejecting the null hypothesis (candle) was deemed ‘pure nonsense’ (Riedel, 2005). For this reason, vaccination was not widely accepted. Was it because Edward set out to prove the wrong null hypothesis that the scientific committee of the day rejected his findings? Or was the scientific committee blinded by years, and even centuries, of research work that consistently rejected null hypotheses as a scientific practice? Has the world arrived at a point where significance testing has become ‘a kind of essential mindlessness in the conduct of research’ (Bakan, 1966)? Researchers have argued that it is not always sensible to accept or reject a hypothesis ‘in a sharp sense’ (Good, 1981), and that a sharp p-value has little meaning when the strength of evidence can be better understood from the distribution against which the p-value is assessed (Rosnow & Rosenthal, 1992). In the conventional procedure, we reject the null hypothesis if the test statistic falls within the critical region of its distribution, which is equivalent to the p-value being at or below the significance level. If the test statistic does not fall within the critical region, we do not reject the null hypothesis.
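
To make the decision rule concrete, here is a minimal sketch in Python that states it in both forms for a two-sided z-test. It is an illustration only: the function name `z_test_decision`, the choice of a z-test, and the example value of 2.5 are assumptions made for this sketch and are not taken from the dissertation.

```python
from statistics import NormalDist


def z_test_decision(z: float, alpha: float = 0.05) -> dict:
    """Two-sided z-test decision, expressed in two equivalent ways.

    Hypothetical helper for illustration; z is an already-computed
    standardized test statistic.
    """
    std_normal = NormalDist()  # mean 0, standard deviation 1

    # Critical value: |z| at or beyond this lies in the two-sided critical region.
    critical_value = std_normal.inv_cdf(1 - alpha / 2)
    in_critical_region = abs(z) >= critical_value

    # p-value: probability, under the null, of a statistic at least as extreme as |z|.
    p_value = 2 * (1 - std_normal.cdf(abs(z)))

    return {
        "critical_value": critical_value,
        "in_critical_region": in_critical_region,
        "p_value": p_value,
        "reject_null": p_value <= alpha,
    }


if __name__ == "__main__":
    print(z_test_decision(2.5))
```

Away from the exact boundary, `in_critical_region` and `reject_null` always agree, which is the sense in which the critical region and the p-value describe the same decision.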

At this point, it is important to differentiate rejecting the null hypothesis as a challenge[5] and not rejecting the null hypothesis as proof. The former calls for statistical methods to meet the challenge and establish the foundation of research, whereas the latter calls for statistical methods to validate and prove a point that interests us.

This research paper will not go into much detail about the validity of not rejecting null hypotheses. There are simply too many research papers arguing against the use of significance testing (Bakan, 1966; Brewer, 1985; Cronbach, 1975; Falk, 1986; Falk & Greenbaum, 1995; Folger, 1989; Gigerenzer & Murray, 2015; Grant, 1962; Meehl, 1967). Through simple examples, we see that not rejecting null hypotheses such as the cradle can also produce results that lay the foundation of a research study and invite other researchers to falsify the claims (Popper, 2005). And we expect to reject null hypotheses such as the candle to prove the point of the scientific study.

In this research paper, the ‘cradle’ refers to the differences in loss aversion found between the demographic groups and undertaken roles. Ideally, we should observe no differences. If any differences are observed, the entire study falls apart, and a more extensive study is required, one that includes experimental research into how demographic variables impact the study.

One may argue that the null hypotheses are impacted by typology, taxonomy, or semantics. For example, the objects (typology), the type of objects (taxonomy), and the meaning attached to the objects by linguistics (semantics) impact null hypotheses. For this reason, many researchers cast a light of doubt on how well null hypotheses are designed to explain the research study. However, null hypotheses should be commonly understood and widely accepted in the current civilization. The understanding of the conjectures as laid out in the null hypotheses depends on the language of those days, (Estes, 1997) and the understanding of null hypotheses should remain consistent throughout all ages.

Probably the other argument we face is the understanding of the cradle. Some may argue that the cradle is essentially not part of the scientific study but a validation of the scope of the study. For example, before the actual research study by means of implementing the survey instrument, a researcher may perform the testing of null hypotheses (cradle) to ensure that the subsequent scientific study remains relevant and valid. We believe that the time in which the cradle is tested remains a preference for the researcher. In this research paper, all null hypotheses (cradle and candle) are put together in one chapter, so as to make it easier for readers to follow the research effort. Having the cradle placed in other chapters will require the readers to refer back and forth while reading the results chapter. Nonetheless, the point remains the same, not all null hypotheses are to be rejected, depending on the type of null hypotheses they fall under (cradle or candle?).

The null hypothesis is defined as “the proposition that there will not be a relationship between the variables you are looking at, i.e. any differences are due to chance” (Crosier, n.d.). When it comes to chance, we usually examine and assess it using the p-value. If the p-value is 0.05 or less, we observe chance within the probability of 5% or less; this is a form of ‘odds-against-chance’ fantasy (Carver, 1978) or ‘illusion of attaining improbability’ (Falk & Greenbaum, 1995). Technically, we are saying that the null hypothesis is a proposition which shows that there is no relationship between the variables we are looking at, and even if any relationship is observed, it happens by chance with a probability of 5% or less. For null hypotheses that are ‘candle’ in our context, we want to make sure that the differences arising from the question with obscurity effect and question post-effect are subject to a 5% or less probability of chance. However, on the flip side, we do not want the differences in loss aversion observed among the demographic groups to be attributed to a 5% or less probability of chance. The reason is that the conjecture on which the null hypothesis is based is one that does not expect the differences to be attributed to a 5% or less probability of chance. We are not saying that we want to observe chance. In fact, we want to observe non-chance: the fact that the null hypothesis can’t be rejected remains a long-held truth about the event and by that, we see chance not having a significant role (non-chance) (Abelson, 1995).
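
One way to see what a p-value measures is a permutation test: shuffle the group labels many times and count how often chance alone produces a difference at least as large as the one observed. The sketch below is illustrative only; the function name `permutation_p_value` and the example numbers are hypothetical and do not come from the study. Note that the result is the probability of such data under the null hypothesis, not the probability that the null hypothesis is true.

```python
import random
from statistics import mean


def permutation_p_value(group_a, group_b, n_permutations=10_000, seed=0):
    """Estimate a two-sided p-value for a difference in means by permutation.

    Hypothetical helper for illustration. Under the null hypothesis of no
    relationship, group labels are exchangeable, so we reshuffle them and
    count how often the shuffled difference is at least as large as the
    observed one.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)

    at_least_as_extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed:
            at_least_as_extreme += 1

    # Add-one correction keeps the estimate strictly above zero.
    return (at_least_as_extreme + 1) / (n_permutations + 1)


if __name__ == "__main__":
    # Made-up scores for two groups, purely for demonstration.
    print(permutation_p_value([12, 15, 14, 10, 13], [18, 21, 17, 20, 19]))
```

For a ‘candle’ hypothesis one hopes for a small value here; for a ‘cradle’ hypothesis, such as no difference between demographic groups, one hopes the value stays large.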

Chance is a topic that deals with the unknown, and it is often associated with the conjunction fallacy, whereby chance is attributed to the co-occurrence of events rather than to its single constituents (Dagnall et al., 2007), and with the reasoning error whereby co-occurring events are overestimated (Tversky & Kahneman, 1983). We can say that chance and non-chance are mutually exclusive. An increase in chance does not necessarily mean a decrease in non-chance, and a chance in the sample generalized to the population is mainly directed by the intention of the researcher. For example, in a particular event that is entirely implausible, odds of 1 in 10 are seemingly more significant than odds of 1 in 100, and such differences in odds are often attributed to how the researcher intends the study to be (1 in 10 or 1 in 100?). This is because the implausible event weighs more heavily on the probability of chance, and the odds are a reflection of individual events that the researcher observes with a probability of chance. Fixating on a p-value < 0.05 neglects the weightage on the probability of chance. Hypothetically, we can assume that weightage is equal across all probabilities of chance in every scenario, ceteris paribus. However, in real-life scenarios, this is impossible. An event is influenced by many factors, such that chance happens by mere randomness arising from an extremely complex environment. Can we ascertain the exact weightage of the probability of chance? Unfortunately, humans do not have the capability to do so, and neither do machines or artificial intelligence. No human or machine is capable of looking into the crystal ball, so to speak, and determining the weightage and value of the probability of chance for future events.

When a null hypothesis is rejected, if an alternative hypothesis is presented, we accept the alternative hypothesis. Since the null hypothesis tests for chance happening with a probability of 5% or less, researchers strive to explain the newly discovered (or rediscovered) phenomenon. The alternative hypothesis provides this information. Every null hypothesis should have one or more alternative hypotheses (Ranganathan & Pramesh, 2019) in classical statistics. An alternative hypothesis can be split into two: a left tail and a right tail. This is particularly true for parametric analysis where a Gaussian distribution is assumed for both the response variable and its error. However, an alternative hypothesis is simply a conjecture to explain the unknown phenomenon and at times a highly probable one based on the frequentist view.
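The left-tail and right-tail split can be sketched under the Gaussian assumption mentioned above. The helper `tail_p_values` and the test statistic of 1.96 are assumptions chosen for illustration and are not values from this research paper.

```python
from statistics import NormalDist

def tail_p_values(z):
    """Left-tail, right-tail and two-sided p-values for a z statistic."""
    std_normal = NormalDist()      # standard Gaussian: mean 0, sd 1
    left = std_normal.cdf(z)       # evidence for "smaller than" alternatives
    right = 1.0 - left             # evidence for "greater than" alternatives
    two_sided = 2.0 * min(left, right)
    return {"left": left, "right": right, "two_sided": two_sided}

tails = tail_p_values(1.96)  # hypothetical test statistic
for name, p in tails.items():
    print(f"{name:>9}: {p:.4f}")
```

The same statistic gives a right-tail p-value of about half the two-sided one, which is why the direction of the alternative hypothesis should be stated before looking at the data.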

If an event has occurred, the definitive question is not “Is this an event which would be rare if the null hypothesis is true?” but “Is there an alternative hypothesis under which the event would be relatively frequent?” (Berkson, 1942, p.327)

There can be other explanations if the alternative hypothesis is transformed into a null hypothesis. For example, in this research paper, the alternative hypothesis for the third hypothesis is that the behavioral pattern of loss aversion for the TSVD product follows the LAS function. However, we can transform this alternative hypothesis into a null hypothesis by stating the potential explanation of using other mathematical functions to explain the loss aversion behavior of consumers selling the TSVD product. The null hypothesis becomes “the behavioral pattern of loss aversion for TSVD is not only explained by the LAS function.” Such a transformation allows researchers to further develop the theory. Pruzek (2016) advocated for this transformation approach in the Bayesian inference framework, whereby the posterior probability of the rejected null hypothesis is used as the prior probability for the newly transformed null hypothesis from the previously accepted alternative hypothesis. But as Frick (1996) mentioned in his paper, specifying the prior probabilities becomes a great challenge when the research work is conducted in isolation, and the posterior probabilities from an accepted alternative hypothesis may not translate fittingly to the prior probabilities of a subsequent null hypothesis, especially when the environment or the respondents’ state of mind changes.

However, the transformation of the alternative hypothesis to the null hypothesis must not be a translation of chance to probability. As described in Lindley’s paradox, a 95% non-chance does not mean 95% posterior probability for the null hypothesis, and a high posterior probability does not mean a high likelihood for a very small prior (i.e. 5%) in Bayesian analysis. Implying likelihood, in this case, is erroneous. It is very important to note that the 95% as described above is the probability of non-chance (Lindley, 1957) and not a 95% posterior probability.
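A small numerical sketch can show why a p-value near 5% is not a posterior probability. The model below is an assumption chosen for illustration and is not taken from this research paper: a point null hypothesis (mean equal to zero) is compared with an alternative whose mean has a Normal(0, tau^2) prior, the data have a known standard deviation sigma, and the prior probability of the null is 0.5. The function name `posterior_null_probability` is hypothetical.

```python
import math

def posterior_null_probability(z, n, prior_null=0.5, tau=1.0, sigma=1.0):
    """P(H0 | data) for H0: mean = 0 versus H1: mean ~ Normal(0, tau^2).

    z is the usual z statistic of the sample mean and n is the sample size.
    """
    ratio = n * tau**2 / sigma**2
    # Bayes factor in favour of H0: ratio of the two marginal densities
    # of the sample mean under H0 and under H1.
    bayes_factor_01 = math.sqrt(1.0 + ratio) * math.exp(
        -0.5 * z**2 * ratio / (1.0 + ratio)
    )
    prior_odds = prior_null / (1.0 - prior_null)
    posterior_odds = prior_odds * bayes_factor_01
    return posterior_odds / (1.0 + posterior_odds)

# The same "significant" z statistic (two-sided p of about 0.05)
# at increasing sample sizes.
for n in (10, 1_000, 1_000_000):
    print(f"n = {n:>9,}: P(H0 | data) = {posterior_null_probability(1.96, n):.3f}")
```

Under these assumptions, the z statistic that rejects the null hypothesis at the 5% level leaves the null hypothesis more probable than not once the sample is large, which is the tension that Lindley (1957) described.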

The way the hypotheses are formed matters in this research paper. The null hypothesis (cradle) follows the Acceptance-Support Null Hypothesis Significance Testing (AS-NHST) approach, in which the researcher believes the null hypothesis to be true, and accepting it will lead to validation of what the researcher believes. The other null hypothesis (candle) follows the Rejection-Support Null Hypothesis Significance Testing (RS-NHST) approach, in which the researcher does not believe that the null hypothesis is true, and rejecting it will lead to validation of what the researcher believes (Nickerson, 2000). Researchers need to know which approach should be adopted in a study.

In hypothesis testing, reproducibility is paramount to the generalizability of the research. A small sample with a small p-value may be effective in the current research, but not generalizable to the population. Applicability to the population, in turn, allows replicability or reproducibility. This brings us to the point about statistical power. High reproducibility, that is, being able to replicate the research and obtain similar results, requires high statistical power and a larger sample size (Schmidt & Hunter, 1997). However, in most social and psychology research, a large sample size is not an ideal option due to budget constraints and logistical challenges. Nonetheless, a study that allocates sample sizes to subgroups that potentially represent the subgroups of the population (within a reasonably large sample size based on the confidence level and margin of error) can produce strong statistical power. (ibid.) In this research paper, respondents were asked to choose a flower from a list of options. Each choice allocates one respondent to a sub-group. There are 64 or 65 respondents[6] in each sub-group that answer questions for a specific scenario. The sub-group sample sizes add up to the total sample size, which is generalizable to the population.
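The link between sample size and statistical power can be sketched with a normal approximation for a two-sided, two-sample comparison of means. This is an illustration and not the power analysis of this research paper: the helper `two_sample_power` is hypothetical, and the standardized effect sizes below are assumed values, since the true effect size is not stated here. Only the sub-group size of 64 comes from the text.

```python
from statistics import NormalDist

def two_sample_power(effect_size, n_per_group, alpha=0.05):
    """Approximate power of a two-sided, two-sample z-test.

    effect_size is the standardized mean difference (Cohen's d).
    """
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1.0 - alpha / 2.0)
    # Expected value of the z statistic when the effect is real.
    shift = effect_size * (n_per_group / 2.0) ** 0.5
    # Probability of landing in either rejection region.
    return std_normal.cdf(shift - z_crit) + std_normal.cdf(-shift - z_crit)

for d in (0.2, 0.5, 0.8):  # assumed small, medium and large effects
    print(f"d = {d}: power with 64 per group = {two_sample_power(d, 64):.2f}")
```

Under this approximation, 64 respondents per sub-group gives roughly 80% power for a medium effect, while a small effect would need a much larger sample to be detected reliably.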

As a final note, it is unfair to throw the entire hypothesis out of the window when the significance level that determines the critical region, commonly known as α (alpha), is 5% and the p-value is .051[7]. Can we truly not reject the null hypothesis simply because the p-value is 0.001 above the significance level? It becomes a contentious topic when we observe marginally significant results and yet we throw the entire hypothesis out of the window by not rejecting the null hypothesis when it is, in fact, false. This is often referred to as a type II error in hypothesis testing, whose probability is denoted “beta”[8].
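To see how thin the line at 5% is, the sketch below converts two neighbouring p-values back into z statistics under a Gaussian assumption. The helper `z_for_two_sided_p` is hypothetical, and the p-values .049 and .051 are illustrative.

```python
from statistics import NormalDist

def z_for_two_sided_p(p_value):
    """z statistic whose two-sided Gaussian p-value equals p_value."""
    return NormalDist().inv_cdf(1.0 - p_value / 2.0)

alpha = 0.05
for p in (0.049, 0.051):
    decision = "reject H0" if p <= alpha else "fail to reject H0"
    print(f"p = {p:.3f} -> z = {z_for_two_sided_p(p):.3f} -> {decision}")
```

The two z statistics differ by less than 0.02, yet the decisions are opposite. The evidence is almost identical; only the binary rule separates the two cases.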

At the end of the day, a p-value and the rejection of a hypothesis (or not rejecting it) truly depend on the researcher’s observation, data on hand, and experience. It also depends on auxiliary theories or the nature of the psyche in any form of experiment. (Meehl, 1990a, 1990b) Some researchers argue that the results arising from hypothesis testing by observing the probability of chance are imprecise and give little information about the event, and that there is a need to think statistically rather than ritually practice statistics. (Gigerenzer, 1998) A common agreement among researchers is that Null Hypothesis Significance Testing should serve as a guide (Abelson, 1995; McDonald, 2016) rather than a rule, and a smart researcher should know when hypothesis testing makes sense in a given context and scenario rather than using hypothesis testing as a norm. (cf. Dar, 1998; Harlow et al., 1997) The use of the p-value to reject the null hypothesis depends on the planned manipulation designed by the researchers to observe the anticipated effect, and such practice should be made explicit before the start of research work. (Rogers et al., 1993)

It is possible to use the ‘good enough’ non-nil null hypothesis instead of the point null hypothesis to test for the existence of chance. The ‘good enough’ assessment comes from statistical outputs such as Cohen’s d[9] or confidence level. In this approach, we can better determine not just the magnitude of the association but also the direction based on the variability of the data. It can give us a better picture of the relationship between variables. This method calls for the theory to be assessed using logic (conjoint or disjoint) rather than point (definitive and only one hypothesis to discover the truth). Refer to the paper written by Cohen (1994) for more information on the nil hypothesis.
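As an illustration of the effect-size output mentioned above, here is a minimal sketch of Cohen’s d. It assumes the pooled standard deviation of the two groups as the divisor, which is one common convention, and the data are invented for the example.

```python
from statistics import mean, variance

def cohens_d(group_a, group_b):
    """Cohen's d: mean difference divided by the pooled standard deviation."""
    n_a, n_b = len(group_a), len(group_b)
    # Pooled sample variance, weighting each group by its degrees of freedom.
    pooled_var = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / pooled_var ** 0.5

# Hypothetical data: means 4 and 3, both sample variances equal to 4.
d = cohens_d([2.0, 4.0, 6.0], [1.0, 3.0, 5.0])
print(f"Cohen's d = {d:.2f}")  # 0.50; the sign shows the direction
```

Unlike a p-value, d reports both how large the difference is relative to the spread of the data and which group is higher.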

Finally, evidence suggests that people do not take note of the contrapositive likelihood ratio within the Bayesian framework. (Beyth-Marom & Fischhoff, 1983; Griffin & Tversky, 1992; Troutman & Shanteau, 1977) The likelihood ratio refers to the probability of observing the data given a true hypothesis and the contrapositive refers to the probability of observing the data when the hypothesis is not true. Popper (2005) suggested that a scientific theory is not proven to be scientific if falsification is not achieved. Plainly speaking, Popper argued that the contrapositive must be observed together with the positive for a scientific theory to be accepted[10]. We tend to display confirmation bias by favoring a particular hypothesis with which we agree and neglecting the contrapositive, which can likewise provide a confirmatory outcome. Confirmation bias refers to the irrational favoring of a particular position or belief, such that the other positions or beliefs carry less weight. In our research paper, we focus on the likelihood ratio of observing data (i.e. chance versus non-chance) that follows the hypothesis but have yet to compute the probability of observing the contrapositive. In subsequent research work, researchers may look into the contrapositive within the Bayesian framework.
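The role of the contrapositive can be sketched with a simple Bayesian update. The probabilities below are invented for illustration, and the function names are hypothetical. The point is that the probability of the data under the hypothesis says little on its own until it is divided by the probability of the same data when the hypothesis is not true.

```python
def likelihood_ratio(p_data_given_h, p_data_given_not_h):
    """Ratio of P(data | H) to P(data | not H)."""
    return p_data_given_h / p_data_given_not_h

def posterior_probability(prior, p_data_given_h, p_data_given_not_h):
    """Posterior P(H | data) from prior odds times the likelihood ratio."""
    prior_odds = prior / (1.0 - prior)
    posterior_odds = prior_odds * likelihood_ratio(
        p_data_given_h, p_data_given_not_h
    )
    return posterior_odds / (1.0 + posterior_odds)

# Data that fit the hypothesis well (0.8) but fit its negation equally well.
print(posterior_probability(0.5, 0.8, 0.8))  # belief should not move
# The same fit, but the data are unlikely when the hypothesis is false.
print(posterior_probability(0.5, 0.8, 0.2))  # belief should rise
```

In the first case the data look confirmatory but are non-diagnostic, because the likelihood ratio is 1. Ignoring the second probability is the error described by Beyth-Marom and Fischhoff (1983).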

References

Abelson, R. P. (1995). Statistics as Principled Argument. Psychology Press.

Bakan, D. (1966). The test of significance in psychological research. Psychological Bulletin, 66(6), 423.

Bartko, J. J. (1991). Proving the null hypothesis.

Berkson, J. (1942). Tests of significance considered as evidence. Journal of the American Statistical Association, 37(219), 325–335.

Beyth-Marom, R., & Fischhoff, B. (1983). Diagnosticity and pseudodiagnosticity. Journal of Personality and Social Psychology, 45, 1185–1195. https://doi.org/10.1037/0022-3514.45.6.1185

Brewer, J. K. (1985). Behavioral statistics textbooks: Source of myths and misconceptions? Journal of Educational Statistics, 10(3), 252–268.

Carver, R. (1978). The case against statistical significance testing. Harvard Educational Review, 48(3), 378–399.

Cohen, J. (1994). The earth is round (p < .05). American Psychologist, 49, 997–1003. https://doi.org/10.1037/0003-066X.49.12.997

Cronbach, L. J. (1975). Beyond the two disciplines of scientific psychology. American Psychologist, 30(2), 116.

Crosier, P. (n.d.). LibGuides: Quantitative data collection and analysis: Testing hypotheses. Retrieved 21 April 2023, from https://libguides.tees.ac.uk/quantitative/testing

Dagnall, N. A., Parker, A., & Munley, G. (2007). Paranormal belief and reasoning. Personality and Individual Differences, 43(6), Article 6.

Dar, R. (1998). Null hypothesis tests and theory corroboration: Defending NHSTP out of context. Behavioral and Brain Sciences, 21(2), 196–197. https://doi.org/10.1017/S0140525X98251168

Estes, W. (1997). On the communication of information by displays of standard errors and confidence intervals. Psychonomic Bulletin & Review, 4(3), 330–341.

Falk, R. (1986). Misconceptions of statistical significance. Journal of Structural Learning.

Falk, R., & Greenbaum, C. W. (1995). Significance tests die hard: The amazing persistence of a probabilistic misconception. Theory & Psychology, 5(1), 75–98.

Folger, R. (1989). Significance tests and the duplicity of binary decisions.

Frick, R. W. (1995). Accepting the null hypothesis. Memory & Cognition, 23(1), 132–138.

Frick, R. W. (1996). The appropriate use of null hypothesis testing. Psychological Methods, 1, 379–390. https://doi.org/10.1037/1082-989X.1.4.379

Gigerenzer, G. (1998). We need statistical thinking, not statistical rituals. Behavioral and Brain Sciences, 21(2), 199–200. https://doi.org/10.1017/S0140525X98281167

Gigerenzer, G., & Murray, D. J. (2015). Cognition as intuitive statistics. Psychology Press.

Good, I. J. (1981). Some Logic and History of Hypothesis Testing. In J. C. Pitt (Ed.), Philosophy in Economics: Papers Deriving from and Related to a Workshop on Testability and Explanation in Economics held at Virginia Polytechnic Institute and State University, 1979 (pp. 149–174). Springer Netherlands. https://doi.org/10.1007/978-94-009-8394-6_10

Grant, D. A. (1962). Testing the null hypothesis and the strategy and tactics of investigating theoretical models. Psychological Review, 69(1), 54.

Griffin, D., & Tversky, A. (1992). The weighing of evidence and the determinants of confidence. Cognitive Psychology, 24(3), 411–435. https://doi.org/10.1016/0010-0285(92)90013-R

Harlow, L. L., Mulaik, S. A., & Steiger, J. H. (Eds.). (1997). There is a Time and Place for Significance Testing. In What If There Were No Significance Tests? Psychology Press.

Kupfersmid, J., & Fiala, M. (1991). A survey of attitudes and behaviors of authors who publish in psychology and education journals.

Lindley, D. V. (1957). A statistical paradox. Biometrika, 44(1/2), 187–192.

MasterClass. (2022, March 8). Scientific Inquiry Definition: How the Scientific Method Works — 2023. MasterClass. https://www.masterclass.com/articles/scientific-inquiry

McDonald, R. P. (2016). Goodness of Approximation in the Linear Model. In What If There Were No Significance Tests? Routledge.

Meehl, P. E. (1967). Theory-testing in psychology and physics: A methodological paradox. Philosophy of Science, 34(2), 103–115.

Meehl, P. E. (1990a). Appraising and Amending Theories: The Strategy of Lakatosian Defense and Two Principles That Warrant It. Psychological Inquiry, 1(2), 108–141.

Meehl, P. E. (1990b). Why Summaries of Research on Psychological Theories are Often Uninterpretable. Psychological Reports, 66(1), 195–244. https://doi.org/10.2466/pr0.1990.66.1.195

Nickerson, R. S. (2000). Null hypothesis significance testing: A review of an old and continuing controversy. Psychological Methods, 5(2), 241–301. https://doi.org/10.1037/1082-989X.5.2.241

Popper, K. (2005). The logic of scientific discovery. Routledge.

Pruzek, R. M. (2016). An Introduction to Bayesian Inference and Its Applications. In What If There Were No Significance Tests? Routledge.

Ranganathan, P., & Pramesh, C. (2019). An Introduction to Statistics: Understanding Hypothesis Testing and Statistical Errors. Indian Journal of Critical Care Medicine : Peer-Reviewed, Official Publication of Indian Society of Critical Care Medicine, 23(Suppl 3), S230–S231. https://doi.org/10.5005/jp-journals-10071-23259

Riedel, S. (2005). Edward Jenner and the history of smallpox and vaccination. Proceedings (Baylor University. Medical Center), 18(1), 21–25.

Rogers, J. L., Howard, K. I., & Vessey, J. T. (1993). Using significance tests to evaluate equivalence between two experimental groups. Psychological Bulletin, 113(3), 553–565. https://doi.org/10.1037/0033-2909.113.3.553

Rosnow, R. L., & Rosenthal, R. (1992). Statistical procedures and the justification of knowledge in psychological science. In Methodological issues & strategies in clinical research (pp. 295–314). American Psychological Association. https://doi.org/10.1037/10109-027

Schmidt, F. L., & Hunter, J. E. (1997). Eight common but false objections to the discontinuation of significance testing in the analysis of research data. What If There Were No Significance Tests, 37–64.

Troutman, C. M., & Shanteau, J. (1977). Inferences based on nondiagnostic information. Organizational Behavior and Human Performance, 19(1), 43–55. https://doi.org/10.1016/0030-5073(77)90053-8

Tversky, A., & Kahneman, D. (1983). Extensional versus intuitive reasoning: The conjunction fallacy in probability judgment. Psychological Review, 90(4), 293.

[1] The scientific inquiry refers to the ways in which scientists study nature and provide some explanations based on the evidence of a study. (MasterClass, 2022)

[2] This is usually referred to as the substantive hypothesis.

[3] This is usually referred to as the statistical hypothesis.

[4] As of now, it is endemic with many countries removing the requirements for mask-wearing.

[5] We are not saying that hypotheses that pose a challenge are not statements to prove. Rather, the motivation to test the hypotheses comes from a challenge more than proof. For example, we know the sun rises from the east every day. There is no need for further proof. But we can challenge this truth for the sake of making a point!

[6] As the sample size is an odd number, there will be one group which has 64 respondents.

[7] This value is translated to 5.1% for the purpose of comparison.

[8] This is the reason why the LAS function uses the b-parameter rather than the ‘beta’ Greek letter. There will be much confusion when the beta is used in both scenarios in a text.

[9] Cohen’s d considers the variability of the data. It takes the difference between two mean values and divides it by the data’s standard deviation.

[10] Popper argued his position with the possibility of other explanations that might exist in the contrapositive.

Dr. Daniel Koh

Daniel started off his career as a senior list researcher with a British publishing firm. Back then, his role involved sourcing contacts through the internet and performing data entry into the Microsoft Dynamic CRM system (Microsoft Dynamic CRM 3.0). Progressively, he explored the option of using Visual Basic scripting within Excel to automate the contact sourcing process.

He successfully developed and implemented the scripts, leading to a 95% increase in data entry efficiency. He then moved on to take on the role of a CRM executive with Fuji Xerox Singapore.

As a CRM executive, he liaised with a third-party vendor for technical enhancement of the CRM system (Microsoft Dynamic CRM 4.0 and 365). He also performed functional enhancement of the CRM system for hundreds of end users.

His notable achievement was the development of the CRM boy that led to a 98% improvement in data quality and data integrity in the CRM system. Following his Masters studies in Consumer Insight with Nanyang Business School, he took on the role of an Analytics instructor with Singapore Management University. He prepared class notes and technical walkthroughs, and taught Analytics to undergraduate students from various disciplines. Subsequently, he took on various roles as a consultant in the consultancy, manufacturing and information technology industries in Singapore.

He travelled to Paris, London, Sri Lanka, Japan and Malaysia to fulfill his role as a consultant. The cultural and professional exchanges between local and overseas data analytics had given him a very good overview of the expectations and motivations from people around the world. He also had a chance to relocate to the United States for one year, particularly focusing on Operations Management.

Prior to his current freelance status, he took on the role of the Data Science Lead in a Singaporean software company. His primary role was to develop Artificial Intelligence using logic, data science and machine learning techniques through in-depth, full-stacked scripting. He also developed customized Reporting for his customers. In his point of view, 95% of today’s reporting can be automated, which can free up staff from daily manual work.

He holds a Bachelor of Science in Marketing (BSc. Marketing Pass with Merit) from Singapore University of Social Sciences (in which he graduated as a Valedictorian), a Master of Science in Marketing and Consumer Insights (MSc. Marketing and Consumer Insights) from Nanyang Technological University, and a Doctor of Business Administration (DBA) from Swiss School of Business and Management.
