DFFITS

DFFITS, short for "difference in fits," is a statistic used in regression diagnostics to identify influential observations. For each observation i, it measures how much the fitted value for that observation changes when the observation is removed from the dataset and the regression model is refitted. The change is scaled by an estimate of the standard deviation of the fitted value, so DFFITS values are comparable across observations.

DFFITS can be calculated using the formula:

DFFITSi = (ŷi - ŷi(i)) / (s(i) * √(hii))

Where:

  • ŷi is the predicted value for the i-th observation using the full dataset.
  • ŷi(i) is the predicted value for the i-th observation when the i-th observation has been removed from the dataset and the model refitted.
  • s(i) is the estimated standard deviation of the error term when the i-th observation has been removed from the dataset.
  • hii is the leverage of the i-th observation, representing the influence the i-th observation has on its own fitted value. It is the i-th diagonal element of the hat matrix.
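
The definition above can be applied literally by refitting the model once per observation. The following Python sketch does this with NumPy for an ordinary least squares fit. The function name dffits_leave_one_out and the sample data are hypothetical, and the code assumes that the design matrix X already contains an intercept column and that s(i) is computed with n - 1 - p degrees of freedom.

```python
import numpy as np

def dffits_leave_one_out(X, y):
    """DFFITS for each observation, computed by refitting without it.

    X is an (n, p) design matrix that already includes the intercept
    column; y is the response vector of length n. (Hypothetical helper.)
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n, p = X.shape

    # Leverages h_ii: diagonal of the hat matrix H = X (X'X)^-1 X'.
    h = np.diag(X @ np.linalg.pinv(X.T @ X) @ X.T)

    # Fitted values from the full dataset.
    beta_full, *_ = np.linalg.lstsq(X, y, rcond=None)
    y_hat = X @ beta_full

    result = np.empty(n)
    for i in range(n):
        keep = np.arange(n) != i
        # Refit with observation i removed.
        beta_i, *_ = np.linalg.lstsq(X[keep], y[keep], rcond=None)
        resid = y[keep] - X[keep] @ beta_i
        # s(i): error standard deviation from the reduced fit
        # (assumed degrees of freedom: n - 1 - p).
        s_i = np.sqrt(resid @ resid / (n - 1 - p))
        # Prediction for observation i from the reduced fit.
        y_hat_i_del = X[i] @ beta_i
        result[i] = (y_hat[i] - y_hat_i_del) / (s_i * np.sqrt(h[i]))
    return result

# Illustrative data: a straight-line fit with one unusual last point.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
y = np.array([1.2, 1.9, 3.2, 3.8, 5.1, 9.0])
X = np.column_stack([np.ones_like(x), x])
values = dffits_leave_one_out(X, y)
print(values)
```

Refitting n times is slow for large datasets, but it follows the formula term by term and is useful for checking faster implementations.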

DFFITS combines the leverage and residual information of a data point. It can be written equivalently as DFFITSi = ti * √(hii / (1 - hii)), where ti is the externally studentized residual, so either a large residual or a high leverage can produce a large value. A large absolute value of DFFITS indicates that the observation has a strong influence on its own predicted value. A common rule of thumb is that an observation is considered influential if |DFFITSi| > 2√(p/n), where p is the number of parameters in the regression model (including the intercept) and n is the number of observations. This threshold is a guideline for identifying observations that may warrant further investigation or special treatment in the analysis, not a formal test.
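
As an illustration of the rule of thumb, the Python sketch below computes DFFITS without refitting, using the externally studentized residual and the leverage, and then flags observations whose absolute value exceeds 2√(p/n). The function names dffits_closed_form and flag_influential and the sample data are hypothetical. The sketch assumes an ordinary least squares fit with an intercept column already present in X.

```python
import numpy as np

def dffits_closed_form(X, y):
    """DFFITS via the externally studentized residual and the leverage.

    X is an (n, p) design matrix including the intercept column.
    (Hypothetical helper, ordinary least squares assumed.)
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n, p = X.shape
    H = X @ np.linalg.pinv(X.T @ X) @ X.T   # hat matrix
    h = np.diag(H)                          # leverages h_ii
    e = y - H @ y                           # ordinary residuals
    sse = e @ e
    # s(i) for every i at once: residual sum of squares of the fit
    # without observation i, divided by n - p - 1.
    s_del = np.sqrt((sse - e**2 / (1 - h)) / (n - p - 1))
    t = e / (s_del * np.sqrt(1 - h))        # externally studentized residuals
    return t * np.sqrt(h / (1 - h))

def flag_influential(dffits, n, p):
    """Apply the rule of thumb |DFFITS_i| > 2 * sqrt(p / n)."""
    cutoff = 2 * np.sqrt(p / n)
    return np.abs(dffits) > cutoff, cutoff

# Illustrative data: nine points near y = 2x + 1, plus a high-leverage
# point at x = 20 that lies far below that line.
x = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 20], dtype=float)
y = np.array([3.1, 4.9, 7.2, 8.8, 11.1, 12.9, 15.2, 16.8, 19.1, 25.0])
X = np.column_stack([np.ones_like(x), x])

d = dffits_closed_form(X, y)
flags, cutoff = flag_influential(d, n=X.shape[0], p=X.shape[1])
print("cutoff:", cutoff)
print("flagged indices:", np.nonzero(flags)[0])
```

Here the last observation is flagged, because it has both high leverage and a large deleted residual. Flagged points are candidates for closer inspection, not automatic removal.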