
Tuesday, March 31, 2009

Screening of the Data

Careful screening of the data after collection and before analysis is probably the most time-consuming part of data analysis (Tabachnick & Fidell, 2001). This step is, however, of utmost importance, as it provides the foundation for any subsequent analysis and decision-making that rests on the accuracy of the data. Errors made during data purification, including exploratory factor analysis (EFA), and before confirmatory structural equation modeling (SEM) may result in poorly fitting models or, worse, models that are inadmissible.

Data screening is especially important when employing covariance-based techniques such as SEM, where the assumptions are stricter than for the standard t-test. Many of the parametric statistical tests (based on probability distribution theory) involved in this study assume: (a) normality – the data come from a normally distributed population; (b) homogeneity of variance – in correlational designs, the variances should be the same at each level of each variable; (c) interval data – the distance between any two adjacent scale points is equal, which is assumed in this study for the Likert data; and (d) independence – one respondent's scores have no effect on any other respondent's scores.
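As a rough illustration of how the first two assumptions might be checked in practice, the sketch below screens two groups of scores with a Shapiro-Wilk test for normality and Levene's test for homogeneity of variance. The function name, the simulated data, and the .05 cutoff are assumptions for the example, not details from the study described here:

```python
import numpy as np
from scipy import stats

def screen_groups(group_a, group_b, alpha=0.05):
    """Screen two groups for (a) normality and (b) homogeneity of variance.

    Returns a dict of booleans: True means the test found no detectable
    violation of the assumption at the given alpha level.
    """
    _, p_a = stats.shapiro(group_a)        # normality, group A
    _, p_b = stats.shapiro(group_b)        # normality, group B
    _, p_lev = stats.levene(group_a, group_b)  # equal variances
    return {
        "normal_a": p_a > alpha,
        "normal_b": p_b > alpha,
        "equal_var": p_lev > alpha,
    }

# Simulated normally distributed scores for two groups (illustrative only).
rng = np.random.default_rng(0)
checks = screen_groups(rng.normal(50, 10, 300), rng.normal(52, 10, 300))
print(checks)
```

Note that a non-significant test only means no violation was detected; with small samples these tests have little power, so visual checks (histograms, Q-Q plots) are usually recommended alongside them.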

Many of the common estimation methods in SEM (such as maximum likelihood estimation) assume: (a) "all univariate distributions are normal, (b) joint distribution of any pair of the variables is bivariate normal, and (c) all bivariate scatterplots are linear and homoscedastic" (Kline, 2005, p. 49). Unfortunately, SPSS does not offer an assessment of multivariate normality, but Field (2005) and others (Kline, 2005; Tabachnick & Fidell, 2001) recommend assessing univariate normality first. The data were also checked for plausible ranges; no values were out of range.
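The univariate screening described above can be sketched in code: check each item for values inside its plausible range and report skewness and kurtosis as simple indicators of univariate normality. The function name, the 1-to-5 Likert range, and the sample responses are hypothetical, chosen only to illustrate the procedure:

```python
import numpy as np
from scipy import stats

def screen_item(values, lo=1, hi=5):
    """Univariate screening for a single Likert-type item.

    Checks that every response falls in the plausible range [lo, hi]
    and reports skewness and excess kurtosis (both are 0 for a
    perfectly normal distribution).
    """
    values = np.asarray(values, dtype=float)
    return {
        "in_range": bool(((values >= lo) & (values <= hi)).all()),
        "skew": float(stats.skew(values)),
        "kurtosis": float(stats.kurtosis(values)),  # excess (normal = 0)
    }

# Hypothetical responses on a 1-5 Likert item.
item = [3, 4, 2, 5, 3, 4, 1, 3, 4, 2]
report = screen_item(item)
print(report)
```

Skew and kurtosis values far from zero (common rules of thumb flag absolute values above roughly 2 and 7, respectively, though cutoffs vary by author) would suggest a departure from univariate normality worth investigating before estimation.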