Solution in R

Question: By using an appropriate hypothesis test, determine if the age of those who have recently donated is at least 10 years older than those who have not recently donated in the population.

Solution: Let \(x_1\) be the vector of ages of those who have recently donated and let \(x_0\) be the vector of ages of those who have not recently donated.

# get data
url <- "http://peopleanalytics-regression-book.org/data/charity_donation.csv"
donation <- read.csv(url)

# subset ages for those who recently donated and those who didn't
x1 <- donation$age[donation$recent_donation == 1]
x0 <- donation$age[donation$recent_donation == 0]

We want to establish whether \(\bar{x}_1 - 10 > \bar{x}_0\) holds in the population, that is, whether the mean age of recent donors exceeds the mean age of non-donors by more than 10 years. Subtracting a constant from every observation shifts the mean by that same constant, so this is equivalent to testing whether \(\overline{x_1 - 10} > \bar{x}_0\), and we can run a \(t\)-test on our sample to compare x1 - 10 with x0. Our null hypothesis is that there is no difference between the means of x1 - 10 and x0. Our alternative hypothesis is that the mean of x1 - 10 is greater than the mean of x0. This requires a one-sided \(t\)-test.
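For readers who prefer symbols, the test can be sketched as follows. The notation here is introduced for illustration and is not part of the original solution: \(\mu_1\) and \(\mu_0\) are the population mean ages of the two groups, \(s_1, s_0\) are the sample standard deviations and \(n_1, n_0\) are the sample sizes.

\[
H_0: \mu_1 - \mu_0 = 10 \qquad \text{versus} \qquad H_1: \mu_1 - \mu_0 > 10
\]

\[
t = \frac{(\bar{x}_1 - 10) - \bar{x}_0}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_0^2}{n_0}}},
\qquad
\nu = \frac{\left(\dfrac{s_1^2}{n_1} + \dfrac{s_0^2}{n_0}\right)^2}{\dfrac{(s_1^2/n_1)^2}{n_1 - 1} + \dfrac{(s_0^2/n_0)^2}{n_0 - 1}}
\]

Because t.test() does not assume equal variances by default, it runs Welch's version of the test: \(t\) is compared against a \(t\)-distribution with \(\nu\) degrees of freedom (the Welch-Satterthwaite approximation), which is why the reported degrees of freedom are usually not a whole number.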

# run one sided t-test to compare means of x1 - 10 and x0
t.test(x1 - 10, x0, alternative = "greater")
## 
##  Welch Two Sample t-test
## 
## data:  x1 - 10 and x0
## t = 5.2268, df = 105.15, p-value = 4.403e-07
## alternative hypothesis: true difference in means is greater than 0
## 95 percent confidence interval:
##  7.162817      Inf
## sample estimates:
## mean of x mean of y 
##  52.68675  42.19188

The p-value of this test is well below an alpha of 0.001, so we reject the null hypothesis. The data support the alternative hypothesis that, on average, those who recently donated are at least ten years older than those who did not.

Alternatively, as submitted by sasnnm, the mu parameter in the t.test() function can be used to test for a specified minimum difference:

# one sided t-test for minimum difference of 10
t.test(x1, x0, alternative = "greater", mu = 10)
## 
##  Welch Two Sample t-test
## 
## data:  x1 and x0
## t = 5.2268, df = 105.15, p-value = 4.403e-07
## alternative hypothesis: true difference in means is greater than 10
## 95 percent confidence interval:
##  17.16282      Inf
## sample estimates:
## mean of x mean of y 
##  62.68675  42.19188

Solution in Python

This solution was submitted by NicoleRL25.

import pandas as pd
import statsmodels.api as sm

# get data
url = 'http://peopleanalytics-regression-book.org/data/charity_donation.csv'
donation = pd.read_csv(url)

# subset ages for those who recently donated and those who didn't
x1 = donation.loc[donation.recent_donation == 1, 'age']
x0 = donation.loc[donation.recent_donation == 0, 'age']

Welch’s t-test can be performed in statsmodels by setting usevar='unequal' in ttest_ind(). The default is 'pooled', which performs Student’s t-test and assumes equal variances in the two groups. The value parameter sets the difference in means under the null hypothesis, so it can be used to test for a specified minimum difference. The function returns a tuple of the t-statistic, the p-value and the degrees of freedom of the hypothesis test.

# one sided t-test for minimum difference of 10
sm.stats.ttest_ind(x1, x0, usevar='unequal', alternative='larger', value=10)
## (5.226802173285962, 4.4030947228348763e-07, 105.150577594637)
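To see why shifting x1 by 10 and passing a hypothesised difference of 10 give the same result, it may help to compute the Welch statistic by hand. The following is a minimal sketch using only the Python standard library. The helper welch_t() is hypothetical, written for illustration and not part of statsmodels, and the ages are synthetic values drawn from assumed normal distributions rather than the charity_donation data, so the numbers will not match the output above.

```python
import math
import random
import statistics


def welch_t(x, y, value=0.0):
    """Return (t, df) for Welch's test of H0: mean(x) - mean(y) = value."""
    vx = statistics.variance(x) / len(x)  # squared standard error of mean(x)
    vy = statistics.variance(y) / len(y)  # squared standard error of mean(y)
    t = (statistics.mean(x) - statistics.mean(y) - value) / math.sqrt(vx + vy)
    # Welch-Satterthwaite approximation to the degrees of freedom
    df = (vx + vy) ** 2 / (vx ** 2 / (len(x) - 1) + vy ** 2 / (len(y) - 1))
    return t, df


# Synthetic ages (assumed group sizes and distributions, NOT the real data)
rng = random.Random(0)
x1 = [rng.gauss(62, 12) for _ in range(80)]
x0 = [rng.gauss(42, 15) for _ in range(270)]

# Approach 1: hypothesised difference of 10, as with value=10 or mu = 10
t_value, df_value = welch_t(x1, x0, value=10)

# Approach 2: subtract 10 from every age in x1 and test for a difference of 0
t_shift, df_shift = welch_t([age - 10 for age in x1], x0)

print(t_value, df_value)
print(t_shift, df_shift)
```

Both approaches print the same statistic and degrees of freedom up to floating-point rounding, because subtracting a constant changes the mean of x1 but not its variance. A one-sided p-value would then be the upper tail of a t-distribution with df degrees of freedom evaluated at t.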
