Chi-square test in Python


Chi-square (χ²) test for independence (Pearson chi-square test)

  • The chi-square test for independence is a non-parametric (distribution-free) method used to test whether two categorical (nominal) variables in a contingency table are associated.
  • For example, given two treatments (treated and nontreated) and two treatment outcomes (cured and noncured), the chi-square test for independence can check whether treatment is related to outcome.
  • Note: The chi-square test for independence is different from the chi-square goodness of fit test.


\( \chi^2 = \sum\limits_{i=1}^{n} \frac{(O_i - E_i)^2}{E_i} \)

Where, \( O_i \) = Observed value in contingency table,
\( E_i \) = Expected value for each cell in contingency table,
\( i \) = 1,2,..., n
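
The formula can be worked through by hand. The following is a minimal sketch using NumPy with illustrative counts for a 2x2 treatment-by-outcome table; the variable names are assumptions made for this example. Each expected count is the row total times the column total divided by the grand total. No continuity correction is applied here, so the statistic can differ from tools that apply Yates' correction to 2x2 tables by default.

```python
import numpy as np

# Illustrative 2x2 table (rows: treated, nontreated; columns: cured, noncured)
observed = np.array([[60, 10],
                     [30, 25]])

row_totals = observed.sum(axis=1, keepdims=True)  # shape (2, 1)
col_totals = observed.sum(axis=0, keepdims=True)  # shape (1, 2)
n = observed.sum()                                # grand total

# Expected count under independence: row total * column total / grand total
expected = row_totals * col_totals / n

# Pearson chi-square statistic summed over all cells (no continuity correction)
chi2_stat = ((observed - expected) ** 2 / expected).sum()

# Degrees of freedom: (rows - 1) * (columns - 1)
dof = (observed.shape[0] - 1) * (observed.shape[1] - 1)

print(expected)
print(round(float(chi2_stat), 4), dof)
```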


  • Null hypothesis: The two categorical variables are independent (no association between the two variables) ( H0: Oi = Ei )
  • Alternative hypothesis: The two categorical variables are dependent (there is an association between the two variables) ( Ha: Oi ≠ Ei )
  • Note: There is no one-tailed or two-tailed p-value. The rejection region of the chi-square test is always in the right tail of the distribution.
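
Because the rejection region lies only in the right tail, the p-value is the upper-tail probability of the chi-square distribution. A minimal sketch with SciPy, assuming an illustrative statistic of 13.3365 with 1 degree of freedom and a 0.05 significance level:

```python
from scipy.stats import chi2

chi2_stat = 13.3365  # illustrative test statistic
dof = 1              # degrees of freedom for a 2x2 table
alpha = 0.05         # assumed significance level

# Upper-tail probability P(X >= chi2_stat); sf is the survival function (1 - cdf)
p_value = chi2.sf(chi2_stat, dof)

# Critical value that cuts off the right-tail rejection region
critical_value = chi2.ppf(1 - alpha, dof)

reject_null = chi2_stat > critical_value
print(p_value, critical_value, reject_null)
```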


  • The two variables are categorical (nominal) and the data are randomly sampled
  • The levels of variables are mutually exclusive
  • The expected frequency count is at least 5 for at least 80% of the cells in the contingency table
  • No cell should have an expected frequency count of less than 1
  • Observations should be independent of each other
  • Observation data should be frequency counts and not percentages or transformed data
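
The two expected-count conditions in the list above can be checked programmatically. This is a rough sketch; `check_expected_counts` is a hypothetical helper written for this example and is not part of bioinfokit or SciPy.

```python
import numpy as np

def check_expected_counts(expected):
    """Check the expected-count rules of thumb for a chi-square test.

    Hypothetical helper: returns the share of cells with an expected
    count >= 5, the smallest expected count, and whether both rules hold.
    """
    expected = np.asarray(expected, dtype=float)
    share_at_least_5 = float((expected >= 5).mean())
    min_expected = float(expected.min())
    return {
        "share_at_least_5": share_at_least_5,
        "min_expected": min_expected,
        # at least 80% of cells >= 5, and no cell below 1
        "ok": bool(share_at_least_5 >= 0.8 and min_expected >= 1),
    }

# Illustrative expected counts for a 2x2 table
result = check_expected_counts([[50.4, 19.6], [39.6, 15.4]])
print(result)
```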

Perform a chi-square test for independence

  • We will use bioinfokit v0.9.5 or later
  • Check the bioinfokit documentation for installation instructions and usage
  • Download a hypothetical dataset for chi-square test for independence
# using the interactive Python interpreter (Python 3.7.4)
>>> from bioinfokit.analys import stat, get_data
# load example dataset
>>> df = get_data('drugdata').data
>>> df.head()
   treatments  cured  noncured
0     treated     60        10
1  nontreated     30        25
# set treatments column as index
>>> df = df.set_index('treatments')
>>> df.head()
            cured  noncured
treated        60        10
nontreated     30        25

# run chi-square test for independence
>>> res = stat()
>>> res.chisq(df=df)

# output
>>> print(res.summary)

Chi-squared test for independence

Test              Df    Chi-square      P-value
--------------  ----  ------------  -----------
Pearson            1       13.3365  0.000260291
Log-likelihood     1       13.4687  0.000242574

>>> print(res.expected_df)

Expected frequency counts

      cured    noncured
--  -------  ----------
 0     50.4        19.6
 1     39.6        15.4


The p value obtained from the chi-square test for independence is significant (p < 0.05). We therefore reject the null hypothesis and conclude that there is a significant association between treatment (treated and nontreated) and treatment outcome (cured and noncured).
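
As a cross-check, the same 2x2 table can be run through SciPy directly. This is a sketch using `scipy.stats.chi2_contingency`, not the bioinfokit implementation. The Pearson statistic of 13.3365 in the output above matches the value SciPy returns with Yates' continuity correction, which it applies by default when there is 1 degree of freedom; without the correction the statistic is larger.

```python
from scipy.stats import chi2_contingency

# Observed counts (rows: treated, nontreated; columns: cured, noncured)
observed = [[60, 10],
            [30, 25]]

# Default behaviour: Yates' continuity correction is applied when dof == 1
chi2_corr, p_corr, dof, expected = chi2_contingency(observed, correction=True)

# Same test without the continuity correction
chi2_raw, p_raw, _, _ = chi2_contingency(observed, correction=False)

print(f"with correction:    chi2={chi2_corr:.4f}, p={p_corr:.6f}, dof={dof}")
print(f"without correction: chi2={chi2_raw:.4f}, p={p_raw:.6f}")
print(expected)
```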

Chi-square (χ²) Goodness of Fit test

  • The chi-square goodness of fit test is a non-parametric (distribution-free) method used to compare the observed and expected counts of one categorical variable. The expected counts are calculated from a known theoretical expectation.
  • For example, suppose we have resistant (A) and susceptible (B) genotypes for some disease. Assuming resistance is a dominant trait, crosses between these two genotypes will produce offspring in a 3:1 Mendelian ratio (75% A and 25% B). Here, we could use the chi-square goodness of fit test to check whether the observed counts of the A and B genotypes agree with the counts expected under the Mendelian ratio.


\( \chi^2 = \sum\limits_{i=1}^{n} \frac{(O_i - E_i)^2}{E_i} \)

Where, \( O_i \) = Observed value for category i,
\( E_i \) = Expected value for category i,
\( i \) = 1,2,..., n
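
As a worked illustration, assume hypothetical observed counts of 155 (A) and 45 (B) among 200 offspring under a 3:1 expectation. The expected counts and the statistic are then:

\( E_A = 200 \times 0.75 = 150, \quad E_B = 200 \times 0.25 = 50 \)

\( \chi^2 = \frac{(155 - 150)^2}{150} + \frac{(45 - 50)^2}{50} = \frac{25}{150} + \frac{25}{50} \approx 0.667 \)

With two categories, the degrees of freedom are \( n - 1 = 1 \).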


  • Null hypothesis: The observed and expected counts in each group are equal ( H0: Oi = Ei )
  • Alternative hypothesis: The observed and expected counts in each group are different ( Ha: Oi ≠ Ei )


  • The variable should be categorical (nominal) and the data are randomly sampled
  • The groups (categories) of the variable are mutually exclusive
  • The expected count should be at least 5 for each group
  • Observations should be independent of each other
  • Observation data should be frequency counts and not percentages or transformed data

Perform a chi-square goodness of fit test in Python

# using the interactive Python interpreter (Python 3.7)
>>> from bioinfokit.analys import stat
>>> import pandas as pd
# create or import pandas dataframe of observed counts
>>> df = pd.DataFrame({'genotypes':['A', 'B'], 'observed':[155, 45]})
>>> df = df.set_index(['genotypes'])
>>> df.head()
           observed
genotypes          
A               155
B                45

# run chi-square test 
>>> res = stat()
# p holds the known theoretical proportions and must sum to 1
>>> res.chisq(df=df, p=(0.75, 0.25))

# output
>>> print(res.summary)

Chi-squared goodness of fit test

  Chi-Square    Df    P-value    Sample size
------------  ----  ---------  -------------
    0.666667     1   0.414216            200

# get expected counts
>>> print(res.expected_df)
           observed  expected_counts
A               155            150.0
B                45             50.0


The p value obtained from the chi-square goodness of fit test is not significant (p > 0.05), so we fail to reject the null hypothesis. We conclude that the observed genotype counts from the crosses do not differ significantly from the counts expected under the Mendelian ratio.
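
The same result can be cross-checked with SciPy. This is a sketch using `scipy.stats.chisquare`, not the bioinfokit implementation; the expected counts are derived from the theoretical proportions and the sample size.

```python
from scipy.stats import chisquare

observed = [155, 45]   # observed counts for genotypes A and B
probs = [0.75, 0.25]   # theoretical 3:1 Mendelian proportions (must sum to 1)

n = sum(observed)
expected = [n * p for p in probs]  # expected counts: 150.0 and 50.0

# chisquare uses k - 1 degrees of freedom by default (k = number of categories)
stat, p_value = chisquare(f_obs=observed, f_exp=expected)

print(f"chi2={stat:.6f}, p={p_value:.6f}, n={n}")
```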


  • Virtanen P, Gommers R, Oliphant TE, Haberland M, Reddy T, Cournapeau D, Burovski E, Peterson P, Weckesser W, Bright J, van der Walt SJ. SciPy 1.0: fundamental algorithms for scientific computing in Python. Nature methods. 2020 Mar;17(3):261-72.

Last updated: November 24, 2020