Sign test
The sign test is a non-parametric statistical test used to determine whether paired observations differ systematically. It is used when the data do not meet the assumptions required for parametric tests such as the paired t-test, for example when the differences are not normally distributed or the data are measured only on an ordinal scale. The sign test relies on the direction of the difference within each pair (positive or negative) rather than on its magnitude.
Purpose
The primary purpose of the sign test is to assess whether one member of a pair tends to be larger (or smaller) than the other. It tests the null hypothesis that the median difference between paired observations is zero, which is equivalent to positive and negative differences being equally likely.
Methodology
The sign test involves the following steps:
- Pair the observations: Data must be in paired form (e.g., measurements taken before and after a treatment on the same individual).
- Calculate the differences: For each pair, subtract one observation from the other, using the same order for every pair (e.g., after minus before).
- Assign signs: Assign a plus (+) sign to positive differences, a minus (-) sign to negative differences, and ignore pairs with zero differences (these ties are typically discarded from the analysis, although there are variations that account for them).
- Count the signs: Count the number of positive signs (N+) and the number of negative signs (N-).
- Determine the test statistic: For a two-tailed test, the test statistic is the smaller of N+ and N-. For a one-tailed test, it is N+ or N-, depending on the direction of the hypothesis.
- Calculate the p-value: Under the null hypothesis, positive and negative signs are equally likely, so the number of positive signs follows a binomial distribution with n = N+ + N- trials and p = 0.5. Compute the p-value from this distribution (or from a normal approximation if the sample size is large enough). The p-value is the probability of observing data at least as extreme as the observed data if the null hypothesis is true.
- Make a decision: If the p-value is less than the chosen significance level (alpha), reject the null hypothesis and conclude that there is a statistically significant difference between the paired observations. Otherwise, fail to reject the null hypothesis.
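The steps above can be sketched in a few lines of Python using only the standard library. This is a minimal illustration, not a reference implementation: the function names `sign_test` and `sign_test_normal_approx`, the `alternative` parameter, and the before/after measurements are all hypothetical. Differences are taken as second sample minus first, zero differences are discarded, and the normal approximation uses a continuity correction and assumes at least one non-zero difference.

```python
from math import comb, sqrt
from statistics import NormalDist


def sign_test(x, y, alternative="two-sided"):
    """Exact sign test for paired samples; differences are y - x.

    Returns (n_pos, n_neg, p_value). Zero differences are discarded.
    """
    if len(x) != len(y):
        raise ValueError("x and y must contain the same number of observations")
    diffs = [b - a for a, b in zip(x, y)]
    n_pos = sum(d > 0 for d in diffs)   # number of plus signs (N+)
    n_neg = sum(d < 0 for d in diffs)   # number of minus signs (N-)
    n = n_pos + n_neg                   # ties are excluded from n
    if n == 0:
        raise ValueError("all differences are zero; the test is undefined")

    def upper_tail(k):
        # P(X >= k) for X ~ Binomial(n, 0.5)
        return sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n

    if alternative == "greater":        # H1: positive differences predominate
        p_value = upper_tail(n_pos)
    elif alternative == "less":         # H1: negative differences predominate
        p_value = upper_tail(n_neg)
    elif alternative == "two-sided":
        # By symmetry this equals 2 * P(X <= min(N+, N-)), capped at 1.
        p_value = min(1.0, 2 * upper_tail(max(n_pos, n_neg)))
    else:
        raise ValueError("alternative must be 'two-sided', 'greater' or 'less'")
    return n_pos, n_neg, p_value


def sign_test_normal_approx(n_pos, n_neg):
    """Two-sided p-value from a normal approximation (continuity-corrected).

    Assumes n_pos + n_neg > 0.
    """
    n = n_pos + n_neg
    z = max(0.0, abs(n_pos - n / 2) - 0.5) / sqrt(n / 4)
    return 2 * (1 - NormalDist().cdf(z))


# Hypothetical measurements on ten individuals before and after a treatment.
before = [10, 12, 9, 14, 11, 13, 8, 15, 10, 12]
after = [12, 14, 9, 15, 13, 12, 11, 17, 13, 14]

n_pos, n_neg, p_value = sign_test(before, after)
print(n_pos, n_neg, p_value)  # 8 1 0.0390625
```

In this made-up example one pair is tied and dropped, leaving nine pairs with eight positive signs. At a significance level of 0.05 the exact two-sided p-value leads to rejecting the null hypothesis. With only nine pairs the normal approximation gives a somewhat larger value, which is why the exact binomial calculation is preferable for small samples.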
Assumptions
The sign test has minimal assumptions:
- The data consist of paired observations.
- The pairs are independent of one another.
- The data are at least ordinal (allowing for the determination of direction).
Advantages
- Simple to understand and apply.
- Non-parametric, making it suitable for data that do not meet parametric assumptions.
- Robust to outliers.
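The robustness to outliers follows from the test using only signs. As a small sketch with hypothetical differences, replacing one value by an extreme value of the same sign changes the mean difference drastically but leaves the sign-test p-value untouched. The helper `sign_test_p` is an illustrative name, not a library function.

```python
from math import comb


def sign_test_p(diffs):
    """Two-sided exact sign-test p-value; zero differences are dropped."""
    n_pos = sum(d > 0 for d in diffs)
    n_neg = sum(d < 0 for d in diffs)
    n = n_pos + n_neg
    k = max(n_pos, n_neg)
    # P(X >= k) for X ~ Binomial(n, 0.5), doubled and capped at 1
    tail = sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n
    return min(1.0, 2 * tail)


diffs = [2, 3, 1, 4, 2, -1, 3, 2]       # hypothetical paired differences
with_outlier = diffs[:-1] + [2000]      # last value replaced by an outlier

p_clean = sign_test_p(diffs)
p_outlier = sign_test_p(with_outlier)
mean_clean = sum(diffs) / len(diffs)
mean_outlier = sum(with_outlier) / len(with_outlier)

print(p_clean, p_outlier)        # 0.0703125 0.0703125
print(mean_clean, mean_outlier)  # 2.0 251.75
```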
Disadvantages
- Less powerful than parametric tests when parametric assumptions are met.
- Ignores the magnitude of the differences, leading to a potential loss of information. Discarding zero differences can also reduce the sample size and power.
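One way to see the cost of discarding ties is to look at the smallest p-value the test can produce at all. With n non-zero differences, the most extreme outcome (all signs identical) has a two-sided p-value of 2 × 0.5^n, so every discarded pair raises this floor. The sketch below uses hypothetical differences and an illustrative helper name, `min_two_sided_p`.

```python
def min_two_sided_p(n):
    """Smallest two-sided p-value attainable with n non-zero differences."""
    return min(1.0, 2 * 0.5 ** n)


diffs = [1, 0, 2, 0, 3, 0, 1, 2]        # hypothetical differences, three ties
n_all = len(diffs)                      # 8 pairs collected
n_used = sum(d != 0 for d in diffs)     # 5 pairs remain after dropping ties

print(min_two_sided_p(n_all))   # 0.0078125
print(min_two_sided_p(n_used))  # 0.0625
```

Here the five remaining pairs can never yield a two-sided p-value below 0.05, even though every non-zero difference points in the same direction.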
Alternatives
If the data meet parametric assumptions, a paired t-test is more powerful. If the differences can be meaningfully ranked by magnitude, the Wilcoxon signed-rank test is usually a better choice, as it uses both the sign of each difference and the rank of its absolute value. It additionally assumes that the differences are distributed symmetrically about their median.