
Friday, August 21, 2009

t-test

The t-test for a difference of means involves a single interval dependent variable and a dichotomous independent variable. The t-test can compare the means of two independent samples or of two dependent (paired) samples. Additionally, the t-test can compare a sample mean against a known mean, which is called the one-sample t-test.

Statistics Solutions is the country's leader in t-test and dissertation statistics. Contact Statistics Solutions today for a free 30-minute consultation.

The t-test is a parametric test that rests on a well-known assumption: that the population under consideration is normally distributed. The researcher should note that when all of its assumptions are met, the t-test is the most powerful option, more powerful than any two-sample non-parametric alternative.

The t-test is typically employed when the sample size is less than 30. If the sample size is larger than 30, the t and z distributions are nearly identical, and the researcher often employs the z test instead.

The t-test is based on Student's t distribution. The calculation differs for independent and dependent samples, but the inference drawn from the t-test is the same.

The critical value in the t-test is the value found in a table of the t distribution for the chosen level of significance and the appropriate degrees of freedom. If the calculated t statistic exceeds the critical value, the null hypothesis is rejected; if it does not, the researcher fails to reject the null hypothesis.
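Outside of SPSS, this decision rule can be sketched in a few lines of Python. The code below is a minimal illustration with hypothetical data; the critical value is taken from a standard t table for a two-tailed test at the 0.05 level with 10 degrees of freedom.

```python
import math

# Two small independent samples (hypothetical data, for illustration only)
a = [5.1, 4.9, 5.4, 5.0, 5.3, 4.8]
b = [5.6, 5.8, 5.5, 5.9, 5.7, 5.4]

def pooled_t(a, b):
    """Independent two-sample t statistic, assuming equal variances."""
    na, nb = len(a), len(b)
    ma, mb = sum(a) / na, sum(b) / nb
    # Unbiased sample variances of each group
    sa2 = sum((x - ma) ** 2 for x in a) / (na - 1)
    sb2 = sum((x - mb) ** 2 for x in b) / (nb - 1)
    # Pooled variance combines the two groups
    sp2 = ((na - 1) * sa2 + (nb - 1) * sb2) / (na + nb - 2)
    t = (ma - mb) / math.sqrt(sp2 * (1 / na + 1 / nb))
    return t, na + nb - 2  # t statistic and its degrees of freedom

t, df = pooled_t(a, b)
critical = 2.228  # two-tailed critical value for alpha = 0.05, df = 10 (from a t table)
reject = abs(t) > critical  # reject the null hypothesis of equal means?
```

Here the magnitude of t exceeds 2.228, so the null hypothesis of equal means would be rejected for this hypothetical data.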

The confidence limits in the t-test are the upper and lower bounds on the estimate for a given level of significance; the confidence interval is the range between these bounds. Such limits are reported because they provide additional information on the precision and relative meaningfulness of the estimate.

In SPSS, the t-test is conducted by selecting “Compare Means” from the “Analyze” menu and then clicking the option that matches the design. If two samples are involved, the researcher employs either an independent-samples t-test or a paired-samples t-test, depending on the type of data.

The t-test rests on the following assumptions:

The first assumption in the t-test is that the population under consideration is normally distributed; there are specific tests for normality that can be used to check this. The researcher should note that the t-test can draw invalid conclusions when the two samples come from distributions of widely different shapes, and some statisticians suggest that the normality assumption matters most when the sample size is less than about 15.

The second assumption made in the t-test is homogeneity of the variances of the two samples. SPSS tests this homoscedasticity assumption with “Levene's Test for Equality of Variances,” which reports an F value and its corresponding significance. The researcher should note that the t-test can produce invalid inferences if the two samples are unequal in size and also have unequal variances.
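Levene's test is essentially a one-way ANOVA performed on the absolute deviations of each observation from its group mean. A minimal Python sketch, using hypothetical data, is:

```python
def levene_W(groups):
    """Levene's W statistic: one-way ANOVA on absolute deviations from group means."""
    k = len(groups)
    N = sum(len(g) for g in groups)
    # Transform each observation into its absolute deviation from the group mean
    z = [[abs(x - sum(g) / len(g)) for x in g] for g in groups]
    zbar_i = [sum(zi) / len(zi) for zi in z]          # group means of the deviations
    zbar = sum(sum(zi) for zi in z) / N               # grand mean of the deviations
    between = sum(len(zi) * (zb - zbar) ** 2 for zi, zb in zip(z, zbar_i))
    within = sum((x - zb) ** 2 for zi, zb in zip(z, zbar_i) for x in zi)
    return ((N - k) / (k - 1)) * (between / within)

# Two groups with identical spreads give W = 0; very different spreads give a large W
equal_spread = levene_W([[1, 2, 3, 4, 5], [11, 12, 13, 14, 15]])
unequal_spread = levene_W([[1, 2, 3, 4, 5], [10, 20, 30, 40, 50]])
```

The W statistic is compared against an F distribution with k − 1 and N − k degrees of freedom; a significant result signals unequal variances.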

Third, for the inference it does not matter whether the samples are dependent or independent. The conclusion drawn from the t-test remains the same in either case; only the calculation of the t statistic differs.

Multiple Regression

Multiple regression involves a single dependent variable and two or more independent variables. It is a statistical technique that simultaneously develops a mathematical relationship between two or more independent variables and an interval-scaled dependent variable.

Questions such as how much of the variation in sales can be explained by advertising expenditures, prices, and the level of distribution can be answered by employing the statistical technique called multiple regression.

The general form of multiple regression is given by the multiple regression model and is the following:

Y = β0 + β1X1 + β2X2 + … + βkXk + e.

This multiple regression model is estimated using the following equation:

Ŷ = a + b1X1 + b2X2 + … + bkXk.
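The coefficients a, b1, …, bk are obtained by ordinary least squares. The sketch below solves the normal equations in plain Python on a small hypothetical data set; in practice SPSS, or a library routine, performs this computation.

```python
def ols(X, y):
    """Least-squares coefficients [a, b1, ..., bk] via the normal equations."""
    rows = [[1.0] + list(r) for r in X]  # prepend a column of 1s for the intercept
    p = len(rows[0])
    # Build X'X and X'y
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(p)]
    # Gaussian elimination with partial pivoting
    for c in range(p):
        piv = max(range(c, p), key=lambda i: abs(xtx[i][c]))
        xtx[c], xtx[piv] = xtx[piv], xtx[c]
        xty[c], xty[piv] = xty[piv], xty[c]
        for r in range(c + 1, p):
            f = xtx[r][c] / xtx[c][c]
            for k in range(c, p):
                xtx[r][k] -= f * xtx[c][k]
            xty[r] -= f * xty[c]
    # Back substitution
    beta = [0.0] * p
    for r in range(p - 1, -1, -1):
        beta[r] = (xty[r] - sum(xtx[r][k] * beta[k] for k in range(r + 1, p))) / xtx[r][r]
    return beta

# Hypothetical data generated exactly by Y = 1 + 2*X1 + 3*X2
X = [[0, 0], [1, 0], [0, 1], [1, 1], [2, 1]]
y = [1, 3, 4, 6, 8]
beta = ols(X, y)  # recovers approximately [1.0, 2.0, 3.0]
```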

There are certain statistics that are used while conducting the analysis on multiple regression.

The R² in multiple regression is the coefficient of multiple determination. This coefficient measures the strength of association between the dependent variable and the set of independent variables.

The F test in multiple regression is used to test the null hypothesis that the coefficient of multiple determination in the population is equal to zero.

The partial F test in multiple regression is used to test the significance of a partial regression coefficient. This incremental F statistic is based on the increment in the explained sum of squares that results from adding the independent variable to the regression equation after all the other independent variables have been included.

The partial regression coefficient in multiple regression, denoted for the first predictor by b1, gives the change in the predicted value of Y per unit change in X1 when the other independent variables are held constant.
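These statistics can be computed directly from the sums of squares of two nested fits. The Python sketch below uses hypothetical sums of squares; the numbers are invented purely for illustration:

```python
# Hypothetical sums of squares from nested regression fits on n = 20 cases
n = 20
sst = 100.0          # total sum of squares of Y
sse_reduced = 40.0   # residual SS of a model with k1 = 2 predictors
sse_full = 30.0      # residual SS after adding one more predictor (k2 = 3)
k1, k2 = 2, 3

# Coefficient of multiple determination for the full model
r2_full = 1 - sse_full / sst

# Overall F test of H0: population R-squared = 0
F_overall = (r2_full / k2) / ((1 - r2_full) / (n - k2 - 1))

# Partial (incremental) F test for the added predictor
F_partial = ((sse_reduced - sse_full) / (k2 - k1)) / (sse_full / (n - k2 - 1))
```

Each F value is compared against the critical F with the matching degrees of freedom; here the added predictor contributes significantly if F_partial exceeds the tabled F with 1 and 16 degrees of freedom.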

In SPSS, multiple regression is conducted by selecting “Regression” from the “Analyze” menu and then choosing the “Linear” option. When the Linear Regression dialogue box appears, the researcher enters one numeric dependent variable and two or more independent variables and then runs the analysis.

The following assumptions are made in multiple regression statistical analysis:

The first assumption of multiple regression involves the proper specification of the model. This assumption is important because if relevant variables are omitted from the model, the common variance they share with the included variables is wrongly attributed to those variables, and the error term is inflated.

The second assumption is that the residual errors in multiple regression are normally distributed. In other words, the residual errors should follow a normal distribution with a mean of zero and a constant variance.

The third assumption in multiple regression is that of unbounded data. The regression line produced by OLS (ordinary least squares) can be extrapolated in both directions, but it is meaningful only within the upper and lower natural bounds of the dependent variable.

Thursday, August 20, 2009

Dissertation Data Analysis

The dissertation is the single most important part of any doctoral student's career, because it is the last step a doctoral student must take to receive the doctoral degree and graduate with the title of “Dr.” The dissertation therefore needs to be completed carefully and meticulously, as it will be scrutinized by professors before the student can obtain the degree. One of the most difficult aspects of this lengthy and time-consuming project is the data analysis. The dissertation revolves around data because the doctoral student must actually prove something new and of interest in his or her field; the data analysis supplies that proof and anchors the entire dissertation.


Doctoral students often struggle with dissertation data analysis because it revolves around statistics: to produce an accurate and valid analysis, a student must be proficient in statistical methods. The analysis also takes a great deal of time, which is another reason students often have a hard time with it. If mistakes are made in the data analysis phase, they will severely affect the dissertation's results and conclusions. Thus, it is very important that the data analysis be done in a precise and accurate manner.


Because dissertation data analysis can be so complicated and difficult for doctoral students, help is available. The first source of help, of course, is the student's advisor. The advisor is not always available to answer questions, however, and for that reason dissertation consultants are an excellent solution. Consultants are trained statisticians who specialize in both statistics and dissertations, so they can make sure the analysis is accurate and done correctly. They can look at all of the data the student has gathered over the course of the dissertation and make sense of it, which makes them an immense help at the analysis stage.


Dissertation consultants can do more than help with the analysis itself: they can also verify that the data the doctoral student has collected is valid, accurate, and statistically sound. Accurate analysis requires valid data, and flawed data will of course throw off the analysis. Consultants therefore check that the data was collected properly, that the proper procedures and correct sample sizes were used, and that the data is not biased. Once the data has been double-checked, the consultant and student can get to work on the analysis, and every part of the dissertation data and its analysis will be correct, accurate, valid, and dependable, helping the student secure his or her doctoral degree.

Descriptive Measures

Quantitative data in statistics exhibit some general characteristics, and summarizing those characteristics is the purpose of descriptive measures.

There are four different forms of descriptive measures.

The first form of descriptive measures is the measure of central tendency, also called the averages.

The second form of descriptive measures is the measure of variation or dispersion.

The third form of descriptive measures is the measure of skewness.

The fourth form of descriptive measures is the measure of kurtosis.

The first form of descriptive measures consists of five descriptive measures, namely Arithmetic Mean, Median, Mode, Geometric Mean and Harmonic Mean.

There are certain characteristics, put forth by Professor Yule, that a good descriptive measure should satisfy. They are as follows:

1. Descriptive measures should be rigidly defined.

2. Descriptive measures should be less complicated and easy to calculate.

3. The descriptive measures being calculated must be based upon all the observations under consideration.

4. The descriptive measures must be applicable for further mathematical treatment.

5. The descriptive measures must not get affected by the fluctuations of the sampling.

6. The descriptive measures should not get affected by extreme values.

The descriptive measure called the arithmetic mean is defined as the sum of a set of observations divided by the number of observations in that set. This measure satisfies the first five properties laid down by Professor Yule. Its biggest disadvantages are that it cannot be used with qualitative data and that it is affected by extreme values.

The descriptive measure called the median is defined as the value of the variable that divides the data under consideration into two equal parts. This measure satisfies the first two properties and the sixth property put forth by Professor Yule. It can be used with qualitative (ordinal) data, but it is not amenable to further mathematical treatment.

The descriptive measure called the mode is defined as the value that occurs most often in a particular set of observations. This measure satisfies the second and last properties put forth by Professor Yule. It is the type of measure used to obtain an ideal size in business forecasting and similar applications.

The descriptive measure called the geometric mean is defined as the nth root of the product of the n observations under consideration. Its basic disadvantage is that it can neither be easily understood nor easily calculated by a person without a mathematical background. This measure satisfies the first, third, fourth, and fifth properties put forth by Professor Yule.

The descriptive measure called the harmonic mean is defined as the reciprocal of the arithmetic mean of the reciprocals of the given values provided that none of the observations are zero. The basic disadvantage of this type of descriptive measure is that it cannot be easily understood or be calculated by a person who does not have a mathematical background. This descriptive measure satisfies the first, third, fourth and fifth property put forth by Professor Yule.
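All five measures of central tendency can be computed with Python's standard `statistics` module; the data below are hypothetical:

```python
import statistics as st

data = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical observations

am = st.mean(data)            # arithmetic mean: sum / count
md = st.median(data)          # median: middle value of the ordered data
mo = st.mode(data)            # mode: most frequent value
gm = st.geometric_mean(data)  # nth root of the product of the values
hm = st.harmonic_mean(data)   # reciprocal of the mean of the reciprocals
```

For any data set of positive, non-identical values, the harmonic mean is less than the geometric mean, which in turn is less than the arithmetic mean.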

The second form of descriptive measure is classified into two categories.

The first category expresses the spread of the observations in terms of the distance between the values of selected observations. It includes measures like the range and the inter-quartile range.

The second category expresses the spread of the observations in terms of the average deviation of the observations from some central value. It includes measures like the mean deviation and the standard deviation.
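Both categories can be illustrated in a few lines of Python with the standard `statistics` module; the data are hypothetical:

```python
import statistics as st

data = [4, 8, 15, 16, 23, 42]  # hypothetical observations

# Distance-based measures of dispersion
rng = max(data) - min(data)           # range
q1, q2, q3 = st.quantiles(data, n=4)  # quartiles
iqr = q3 - q1                         # inter-quartile range

# Deviation-based measures of dispersion
m = st.mean(data)
mean_dev = sum(abs(x - m) for x in data) / len(data)  # mean deviation about the mean
sd = st.stdev(data)                                   # sample standard deviation
```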

The third form of descriptive measure consists of three coefficients of skewness, namely Professor Karl Pearson's coefficient of skewness, Professor Bowley's coefficient of skewness, and the moment-based coefficient of skewness.

The fourth form of descriptive measure gives an idea of the flatness or peakedness of the frequency curve. If the curve is neither flat nor peaked, the measure identifies it as a normal, or mesokurtic, curve. If the curve is flatter than the normal curve, it is platykurtic; if it is more peaked than the normal curve, it is leptokurtic.
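The moment-based versions of skewness and kurtosis can be sketched in Python. In this convention a kurtosis of 3 marks a mesokurtic (normal) curve, values below 3 a platykurtic curve, and values above 3 a leptokurtic curve; the data below are hypothetical:

```python
def moments_skew_kurt(data):
    """Moment coefficient of skewness and kurtosis (beta-2) of a data set."""
    n = len(data)
    m = sum(data) / n
    m2 = sum((x - m) ** 2 for x in data) / n  # second central moment (variance)
    m3 = sum((x - m) ** 3 for x in data) / n  # third central moment
    m4 = sum((x - m) ** 4 for x in data) / n  # fourth central moment
    skew = m3 / m2 ** 1.5  # 0 for a symmetric distribution
    kurt = m4 / m2 ** 2    # beta-2: equals 3 for the normal (mesokurtic) curve
    return skew, kurt

# A symmetric, flat-topped data set: skewness 0, kurtosis below 3 (platykurtic)
skew, kurt = moments_skew_kurt([1, 2, 3, 4, 5])
```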