**Question:** By using an appropriate hypothesis test, determine if the age of those who have recently donated is at least 10 years older than those who have not recently donated in the population.

**Solution:** Let \(x_1\) be the vector of ages of those who have recently donated and let \(x_0\) be the vector of ages of those who have not recently donated.

```
# get data
url <- "http://peopleanalytics-regression-book.org/data/charity_donation.csv"
donation <- read.csv(url)
# subset ages for those who recently donated and those who didn't
x1 <- subset(donation, subset = recent_donation == 1, select = "age")
x0 <- subset(donation, subset = recent_donation == 0, select = "age")
```

We are trying to establish if \(\bar{x_1} - 10 > \bar{x_0}\) in the population. Alternatively stated, we are testing if \(\overline{x_1 - 10} > \bar{x_0}\) in the population, so we are doing a \(t\)-test in our sample to compare `x1 - 10` with `x0`. Our null hypothesis is that there is no difference between the means of `x1 - 10` and `x0`. Our alternative hypothesis is that the mean of `x1 - 10` is greater than the mean of `x0`. This requires a one-sided \(t\)-test.
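For reference, the statistic behind this comparison can be written out. The symbols \(s_1^2, s_0^2\) (sample variances of \(x_1\) and \(x_0\)) and \(n_1, n_0\) (sample sizes) are introduced here for illustration, and unequal variances are assumed, which is the default behavior of `t.test()` in R:

\[
t = \frac{(\bar{x_1} - 10) - \bar{x_0}}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_0^2}{n_0}}}
\]

Subtracting 10 from every element of \(x_1\) lowers its mean by 10 but leaves its variance unchanged, so only the numerator is affected by the shift.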

```
# run one sided t-test to compare means of x1 - 10 and x0
t.test(x1 - 10, x0, alternative = "greater")
```

```
##
## Welch Two Sample t-test
##
## data: x1 - 10 and x0
## t = 5.2268, df = 105.15, p-value = 4.403e-07
## alternative hypothesis: true difference in means is greater than 0
## 95 percent confidence interval:
## 7.162817 Inf
## sample estimates:
## mean of x mean of y
## 52.68675 42.19188
```

The p-value of this test is below an alpha of 0.001, so we reject the null hypothesis in favor of the alternative hypothesis that, in the population, those who have recently donated are on average at least ten years older than those who have not.

Alternatively, as submitted by sasnnm, the `mu` parameter in the `t.test()` function can be used to test for a specified minimum difference:

```
# one sided t-test for minimum difference of 10
t.test(x1, x0, alternative = "greater", mu = 10)
```

```
##
## Welch Two Sample t-test
##
## data: x1 and x0
## t = 5.2268, df = 105.15, p-value = 4.403e-07
## alternative hypothesis: true difference in means is greater than 10
## 95 percent confidence interval:
## 17.16282 Inf
## sample estimates:
## mean of x mean of y
## 62.68675 42.19188
```

This Python solution was submitted by NicoleRL25.

```
import pandas as pd
import statsmodels.api as sm

# get data
url = 'http://peopleanalytics-regression-book.org/data/charity_donation.csv'
donation = pd.read_csv(url)
# subset ages for those who recently donated and those who didn't
x1 = donation.loc[donation.recent_donation == 1, 'age']
x0 = donation.loc[donation.recent_donation == 0, 'age']
```

To perform Welch’s t-test using statsmodels, set `usevar='unequal'`. The default is `'pooled'`, which performs Student’s t-test. The `value` parameter can be used to test for a specific minimum difference. The function returns a tuple of the t-statistic, p-value and degrees of freedom of the hypothesis test.

```
# one sided t-test for minimum difference of 10
sm.stats.ttest_ind(x1, x0, usevar='unequal', alternative='larger', value=10)
```

`## (5.226802173285962, 4.4030947228348763e-07, 105.150577594637)`
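To see where the t-statistic and degrees of freedom in such a result come from, and why testing `x1 - 10` against 0 agrees with testing `x1` against a minimum difference of 10, here is a minimal sketch using only the Python standard library. The helper `welch_t` and the ages below are hypothetical and made up for illustration; they are not the charity donation data and not part of the submitted solution.

```python
import math
import statistics


def welch_t(a, b, mu=0.0):
    """Return (t-statistic, Welch-Satterthwaite df) for H0: mean(a) - mean(b) = mu."""
    va = statistics.variance(a) / len(a)  # squared standard error of mean(a)
    vb = statistics.variance(b) / len(b)  # squared standard error of mean(b)
    t = (statistics.mean(a) - statistics.mean(b) - mu) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


# Made-up ages for illustration only (NOT the charity_donation data)
x1 = [61, 58, 70, 66, 55, 63, 72, 59]
x0 = [40, 45, 38, 50, 42, 47, 36, 44, 41]

# Approach 1: subtract 10 from x1, then test against a difference of 0
shifted = welch_t([age - 10 for age in x1], x0)
# Approach 2: leave x1 unchanged and test against a difference of 10
with_mu = welch_t(x1, x0, mu=10)

print(shifted)
print(with_mu)
```

Both calls should print the same pair up to floating-point rounding, because the shift changes only the numerator of the statistic. The pair corresponds to the first and third entries of the statsmodels tuple; the p-value additionally requires the t distribution, which is why a statistics library is used in practice.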