To request a blog written on a specific topic, please email James@StatisticsSolutions.com with your suggestion. Thank you!

Thursday, August 20, 2009

Dissertation Data Analysis

The dissertation is the single most important part of a doctoral student's career: it is the final step a student must complete to receive the doctoral degree and graduate with the title of "Dr." It must therefore be completed carefully and meticulously, as it will be scrutinized by professors before the degree is granted. One of the most difficult aspects of this lengthy, time-consuming project is the data analysis. The dissertation revolves around data because the student must demonstrate something new and of interest in his or her field; the data analysis supplies that proof and anchors the entire dissertation.


Statistics Solutions is the country's leader in dissertation data analysis and dissertation statistics. Contact Statistics Solutions today for a free 30-minute consultation.


Doctoral students often struggle with dissertation data analysis because it revolves around statistics: to produce a valid analysis, a student must be proficient in statistics. The analysis also takes a great deal of time, which is another reason students find it difficult. Mistakes made during the data analysis phase carry through to the dissertation's results and conclusions, so it is very important that the analysis be done in a precise and accurate manner.


Because dissertation data analysis can be so complicated and difficult, help is available. The first source of help, of course, is the student's advisor. Advisors are not always available to answer questions, however, and for that reason dissertation consultants are an excellent alternative. Consultants are trained statisticians who specialize in both statistics and dissertations; they can review all of the data a student has gathered, make sense of it, and ensure the analysis is accurate and done correctly. They can thus be an immense help with the dissertation data analysis.


Beyond helping with the analysis itself, dissertation consultants can also verify that the data the student has collected are valid, accurate and statistically sound. Accurate analysis requires valid data: if the data are flawed, the analysis will be thrown off. Consultants therefore check that the data were collected properly, that the appropriate tests and sample sizes were used, and that the data are not biased. Once the data have been double-checked, the consultant and student can get to work on the analysis. With a consultant's help, every part of the dissertation data and data analysis will be correct, valid and dependable, putting the student on solid ground to obtain his or her doctoral degree.

Descriptive Measures

Quantitative data in statistics exhibit some general characteristics, and descriptive measures are the tools used to summarize those characteristics.

Statistics Solutions is the country's leader in descriptive statistics and dissertation statistics. Contact Statistics Solutions today for a free 30-minute consultation.

There are four different forms of descriptive measures.

The first form of descriptive measures is the measure of central tendency, which is also called the averages.

The second form of descriptive measures is the measure of variation or dispersion.

The third form of descriptive measures is the measure of skewness.

The fourth form of descriptive measures is the measure of kurtosis.

The first form of descriptive measures consists of five descriptive measures, namely Arithmetic Mean, Median, Mode, Geometric Mean and Harmonic Mean.

Professor Yule put forth several characteristics that a good descriptive measure should possess. They are as follows:

1. Descriptive measures should be rigidly defined.

2. Descriptive measures should be simple and easy to calculate.

3. The descriptive measures being calculated must be based upon all the observations under consideration.

4. The descriptive measures must be amenable to further mathematical treatment.

5. The descriptive measures should not be affected by fluctuations of sampling.

6. The descriptive measures should not be unduly affected by extreme values.

The arithmetic mean is defined as the sum of a set of observations divided by the number of observations in the set. It satisfies the first five properties laid down by Professor Yule. Its biggest disadvantages are that it cannot be used with qualitative data and that it is affected by extreme values.

The median is the value of the variable that divides the data under consideration into two equal parts. It satisfies the first, second and sixth of Professor Yule's properties. The median can be used with qualitative data, provided the observations can be ranked, even when they cannot be measured quantitatively.

The mode is defined as the value that occurs most often in a particular set of observations. It satisfies the second and sixth of Professor Yule's properties, and it is used, for example, to determine an ideal size in business forecasting.

The geometric mean is defined as the nth root of the product of the n observations under consideration. Its basic disadvantage is that it is neither easily understood nor easily calculated by a person without a mathematical background. It satisfies the first, third, fourth and fifth of Professor Yule's properties.

The harmonic mean is defined as the reciprocal of the arithmetic mean of the reciprocals of the given values, provided that none of the observations is zero. Like the geometric mean, it is not easily understood or calculated by a person without a mathematical background. It satisfies the first, third, fourth and fifth of Professor Yule's properties.
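As a concrete illustration, here is a minimal Python sketch of all five averages, using the standard library and SciPy; the sample values are made up for this example.

import statistics
from scipy.stats import gmean, hmean

data = [2, 4, 4, 4, 5, 5, 7, 9]

print(statistics.mean(data))    # arithmetic mean: sum of values / number of values
print(statistics.median(data))  # median: middle value of the sorted data
print(statistics.mode(data))    # mode: most frequent value
print(gmean(data))              # geometric mean: nth root of the product of the values
print(hmean(data))              # harmonic mean: reciprocal of the mean of the reciprocals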

The second form of descriptive measures, the measures of variation or dispersion, is classified into two categories.

The first category expresses the spread of the observations in terms of the distance between the values of selected observations. It includes the range, the inter-quartile range, etc.

The second category expresses the spread of the observations in terms of the average deviation of the observations from some central value. It includes the mean deviation, the standard deviation, etc.
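The following minimal Python sketch computes both categories of dispersion measures on made-up data:

import numpy as np

data = np.array([2, 4, 4, 4, 5, 5, 7, 9])

# First category: distance between selected observations
value_range = data.max() - data.min()
q1, q3 = np.percentile(data, [25, 75])
interquartile_range = q3 - q1

# Second category: average deviation from a central value
mean_deviation = np.mean(np.abs(data - data.mean()))
standard_deviation = data.std()

print(value_range, interquartile_range, mean_deviation, standard_deviation)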

The third form of descriptive measures consists of three coefficients of skewness, namely Professor Karl Pearson's coefficient of skewness, Professor Bowley's coefficient of skewness and the coefficient of skewness based on moments.

The fourth form of descriptive measures gives an idea of the flatness or peakedness of the frequency curve. If the curve is neither flat nor peaked, it is a normal, or mesokurtic, curve. If the curve is flatter than the normal curve, it is platykurtic; if it is more peaked, it is leptokurtic.
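A minimal Python sketch of the skewness and kurtosis measures follows. Note that SciPy's kurtosis function returns excess kurtosis by default, so values near zero suggest a mesokurtic (normal) curve, positive values a leptokurtic curve, and negative values a platykurtic curve; the data are simulated for illustration.

import numpy as np
from scipy.stats import skew, kurtosis

rng = np.random.default_rng(0)
sample = rng.normal(size=1000)   # draws from a normal distribution

print(skew(sample))      # near 0 for a symmetric curve
print(kurtosis(sample))  # excess kurtosis: near 0 mesokurtic, > 0 leptokurtic, < 0 platykurtic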

Binomial Test of Significance

The binomial test of significance is a probability test based on the rules of probability. It is used to examine the distribution of a single dichotomous variable when samples are small, and it tests the difference between a sample proportion and a given proportion.

Statistics Solutions is the country's leader in binomial tests of significance and dissertation statistics. Contact Statistics Solutions today for a free 30-minute consultation.


This document discusses certain terms used in the binomial test of significance so that the reader can better understand the test.

The calculation of the binomial test of significance is done in the following manner:

Let p(r) denote the probability of obtaining r observations in one category of the dichotomy and n – r observations in the other category, when the sample size is n. If p is the probability of obtaining the first category, then q = 1 – p is the probability of obtaining the second category. The formula for the binomial probability is the following:

p(r) = nCr p^r q^(n-r) = (n! p^r q^(n-r)) / (r! (n-r)!)

In this formula, nCr denotes the number of combinations of n things taken r at a time.
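As a quick illustration of the formula, here is a minimal Python sketch; the function name binomial_prob and the example numbers are ours, introduced only for this example. (Recent versions of SciPy also provide the complete exact test as scipy.stats.binomtest.)

from math import comb

def binomial_prob(r, n, p):
    # p(r) = nCr * p**r * q**(n - r), with q = 1 - p
    q = 1.0 - p
    return comb(n, r) * p**r * q**(n - r)

# e.g., the probability of exactly 8 observations in the first
# category out of n = 10, when p = 0.5
print(binomial_prob(8, 10, 0.5))  # about 0.0439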

The normal approximation to the binomial test of significance is made in the following manner:

When the sample size n is greater than 25 and the probability p of obtaining the first category is around 0.50, so that the product npq is at least 9, the binomial distribution approximates the normal distribution. In this case, a normal-curve z-test can be used as an approximation to the binomial test of significance. The formula for this approximation is the following:

z = ((r ± 0.5) - np) / √(npq)
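A minimal Python sketch of this approximation, applying the ±0.5 continuity correction toward np, might look as follows; the helper name binomial_z is ours.

from math import sqrt

def binomial_z(r, n, p):
    # apply the 0.5 continuity correction toward np
    q = 1.0 - p
    correction = -0.5 if r > n * p else 0.5
    return ((r + correction) - n * p) / sqrt(n * p * q)

print(binomial_z(60, 100, 0.5))  # z = 1.9 for 60 of 100 in the first category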

The binomial test of significance can be done in SPSS. This nonparametric test is run by selecting "Nonparametric Tests" from the "Analyze" menu and then choosing the binomial test.

There are certain assumptions that are made in the binomial test of significance. The assumptions are the following:

A dichotomous distribution is assumed in the binomial test of significance. In other words, the variable of interest is assumed to be dichotomous in nature, with two values that are mutually exclusive and collectively exhaustive across all cases being considered. The word "binomial" itself suggests that the variable of interest should be dichotomous, as the term means two.

Because the binomial test of significance does not involve any parameter and is therefore nonparametric in nature, the distributional assumptions made in parametric tests are not required.

Finally, it is assumed that the sample has been drawn from the population by random sampling; the sample on which the binomial test of significance is conducted is therefore a random sample.

Chi square test

The chi square statistic used in the chi square test is defined as the square of the standard normal variable.

Statistics Solutions is the country's leader in chi square test and dissertation statistics. Contact Statistics Solutions today for a free 30-minute consultation.

The chi square test is essentially an approximate test for large values of n, where n is the number of observations under consideration.

There are different varieties of the chi square test in which the chi square statistic finds application. They are as follows:

A chi square test is used to test the hypothetical value of the population variance.

A chi square test is used to test the goodness of fit.

A chi square test is used to test the independence of attributes.

A chi square test is used to test the homogeneity of independent estimates of the population variance.

A chi square test is used to test the homogeneity of independent estimates of the population correlation coefficient.

The chi square distribution involved in the chi square test is a continuous distribution, ranging from zero to infinity. The probability density function (pdf) of the statistic is given by the following:

f(χ²) = (e^(-χ²/2) (χ²)^((n/2)-1)) / (2^(n/2) Γ(n/2)),  0 < χ² < ∞

Among all the chi square tests mentioned above, the most popular are the chi square test for goodness of fit and the chi square test for independence of attributes.

The chi square test for independence of attributes is conducted on observations arranged in contingency tables. It should be noted that this type of chi square test is carried out only on categorical variables.

Let us state an example in which the chi square test for independence of attributes is carried out. Suppose two sample polls of votes for two candidates, A and B, for a public office are taken, one from residents of rural areas and one from residents of urban areas. There are two categorical variables: the vote (A or B) and the area (rural or urban). The chi square test examines whether the nature of the area is associated with voting preference in the election; a sketch of the test appears below.
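Here is a minimal sketch of this test in Python, using SciPy's chi2_contingency; the poll counts are made up for illustration.

from scipy.stats import chi2_contingency

#                 candidate A  candidate B
observed = [[620, 380],   # rural voters
            [550, 450]]   # urban voters

chi2, p_value, dof, expected = chi2_contingency(observed)
print(chi2, p_value, dof)
# a small p-value (say, below 0.05) suggests that area and voting
# preference are associated; a large one is consistent with independence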

The second popular test is the chi square test for goodness of fit. This is a very powerful test of the significance of the discrepancy between theory and experiment, introduced by Prof. Karl Pearson. It enables the researcher to find out whether the deviation of experiment from theory has occurred by chance or is due to the inadequacy of the theory.

This test, too, is considered an approximate test for large values of n.
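For instance, here is a minimal Python sketch testing whether 120 made-up die rolls fit the uniform frequencies predicted by theory:

from scipy.stats import chisquare

observed = [25, 17, 15, 23, 24, 16]  # counts of faces 1 through 6 in 120 rolls
expected = [20] * 6                  # theory: 120 rolls / 6 faces

statistic, p_value = chisquare(observed, f_exp=expected)
print(statistic, p_value)
# a large p-value suggests the discrepancy between theory and
# experiment could have occurred by chance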

There are certain conditions that must be satisfied while conducting the chi square test. They are as follows:

The sample observations in the chi square test must be independent from each other.

The constraints on the cell frequencies must be linear in nature; in other words, the sum of the observed frequencies must equal the sum of the expected frequencies.

The total frequency N must be reasonably large, which in practice means greater than 50.

The theoretical (expected) cell frequencies must not be less than five.

Monday, August 17, 2009

Methodology

Methodology refers to the way of doing things in a given field. This document details statistical methodology used in the field of medicine and nursing, in particular the methodology called testing of hypothesis, which is used extensively in these fields. Hypothesis testing is a kind of confirmatory procedure that helps the researcher determine whether or not the hypothesis he or she made is true. The methodology involves several terms the researcher uses when making statistical inferences about, for example, a drug being tested.

Statistics Solutions is the country's leader in methodology and dissertation consulting. Contact Statistics Solutions today for a free 30-minute consultation.

Several terms are used in this methodology. The null hypothesis represents the theory that there is no significant difference between the two products being tested. In medicine and nursing, it would be stated in the following way: there is no statistically significant difference between the new drug and the current drug. The alternative hypothesis is the complement of the null hypothesis; here it would be stated as follows: there is a statistically significant difference between the new drug and the current drug.

A Type I error is the rejection of a true null hypothesis; in medicine and nursing, it would mean concluding that the new drug differs from the current drug when in fact it does not. A Type II error is the acceptance of a false null hypothesis; here, it would mean accepting a defective drug as if it were an effective one. In this field, a Type II error is considered one of the most serious errors.

The test statistic is the value that helps the researcher decide whether the null hypothesis should be accepted or rejected.

The critical region is the set of values of the test statistic for which the null hypothesis is rejected; it is also called the region of rejection.

The level of significance is the probability of falsely rejecting the null hypothesis; it is usually chosen by the researcher to be 0.05.

The power of a test measures its ability to reject the null hypothesis when the null hypothesis is in fact false; in other words, power reflects the test's capacity to lead the researcher to a correct decision. Power ranges from a minimum of 0 to a maximum of 1.

The distinction between one sided and two sided tests can be illustrated with an example. Suppose the researcher wants to test whether there is any statistical difference between the current drug and the new drug. In a one sided test, the alternative hypothesis specifies a direction, for example that the new drug is more effective than the current drug (or, alternatively, that it is less effective). In a two sided test, the alternative hypothesis states only that there is some statistically significant difference between the new drug and the current drug, in either direction. A sketch of both versions appears below.
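Here is a minimal Python sketch contrasting the two versions with SciPy's independent-samples t-test (the alternative keyword requires a recent version of SciPy, and the response scores are invented for illustration):

from scipy.stats import ttest_ind

current_drug = [5.1, 4.8, 5.3, 5.0, 4.9, 5.2]  # made-up response scores
new_drug     = [5.6, 5.4, 5.9, 5.5, 5.7, 5.3]

# two sided: is there any difference, in either direction?
print(ttest_ind(new_drug, current_drug, alternative='two-sided'))

# one sided: is the new drug specifically more effective?
print(ttest_ind(new_drug, current_drug, alternative='greater'))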

Thursday, August 6, 2009

Path Analysis

Path analysis is a generalization of the regression model, used for comparing two or more causal models derived from the correlation matrix. It is represented diagrammatically, in the form of circles and arrows that indicate causation. The task of path analysis is to predict the regression weights, which are then compared to the observed correlation matrix. A goodness of fit test is performed to show that the model is the best possible fit.

Statistics Solutions is the country's leader in statistical consulting and path analysis. Contact Statistics Solutions today for a free 30-minute consultation.

While conducting path analysis, a researcher comes across some key terminologies used during path analysis. The following terminologies are used during path analysis:

The first question a researcher must settle is which estimation method to use. Ordinary least squares (OLS) and maximum likelihood are the methods used to estimate the paths.

Additionally, there is the term path model. A path model is a diagram that indicates the independent, intermediate and dependent variables. A double-headed arrow indicates that a covariance is being calculated between the two variables it connects.

Exogenous variables are those with no arrows pointed toward them, except for measurement error. Endogenous variables can have both incoming and outgoing arrows.

The path coefficient is the same as the standardized regression coefficient; it indicates the direct effect of an independent variable on the dependent variable. The sketch below illustrates this correspondence.
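Here is a minimal Python sketch of that correspondence: standardizing the variables before an ordinary least squares fit yields standardized regression weights, i.e. path coefficients. The three-variable model and data below are simulated purely for illustration.

import numpy as np

rng = np.random.default_rng(1)
x = rng.normal(size=200)                       # exogenous variable
m = 0.5 * x + rng.normal(size=200)             # intermediate variable
y = 0.4 * m + 0.3 * x + rng.normal(size=200)   # dependent variable

def standardize(v):
    return (v - v.mean()) / v.std()

# least squares on standardized variables gives standardized
# regression weights, i.e. the path coefficients for y <- (x, m)
design = np.column_stack([standardize(x), standardize(m)])
coefs, *_ = np.linalg.lstsq(design, standardize(y), rcond=None)
print(coefs)  # direct effects of x and m on y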

When the estimation method is ordinary least squares (OLS), there are also disturbance terms. These are simply residual error terms: they capture the unexplained variance and the measurement errors.

As discussed, a goodness of fit test is used in path analysis, so the chi square statistic is used as well. A chi square value that is not significant indicates a model with good fit.

Path analysis is generally conducted with AMOS (Analysis of Moment Structures), an add-on module for SPSS. Other statistical software, such as SAS and LISREL, can also be used. According to Kline (1998), an adequate sample size is 10 cases per parameter estimated, and the ideal sample size is 20 cases per parameter.

Since path analysis is a statistical method, it has assumptions. The following are the assumptions of path analysis:

In path analysis, the relationships between the variables should be linear, and the data should be measured on an interval scale. To reduce disturbances in the data, the theory of path analysis also assumes that the error terms are not correlated with the variables.

Path analysis, however, also has limitations. Although it can evaluate or test two or more causal hypotheses, it cannot establish the direction of causality.

Path analysis is useful only in cases where a small number of hypotheses, each representable by a single path, are being tested.

Thursday, July 23, 2009

Methodology in Psychology

Methodology refers to the theoretical analysis of the methods appropriate to a particular field of study. The purpose of this paper is to discuss statistical methodology in the field of psychology.

Statistics Solutions is the country's leader in statistical consulting and methodology. Contact Statistics Solutions today for a free 30-minute consultation.

In the field of psychology, statistical methodology such as statistical significance testing is commonly applied. The methodology consists of significance tests such as the t-test, which is used to compare two samples for statistical significance. Suppose one wants to compare the literacy rates of two regions, A and B. The null hypothesis is that there is no statistically significant difference between the literacy rates of the two samples drawn from regions A and B. If the calculated t statistic exceeds the tabulated t value, the null hypothesis is rejected at the chosen level of significance. A sketch of this test appears below.
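Here is a minimal Python sketch of the two-sample t-test just described; the literacy rates for the two regions are made up for illustration.

from scipy.stats import ttest_ind

region_a = [72.1, 68.5, 75.3, 70.2, 69.8, 74.0]  # made-up literacy rates
region_b = [65.4, 63.2, 66.8, 62.9, 64.5, 67.1]

t_statistic, p_value = ttest_ind(region_a, region_b)
print(t_statistic, p_value)
# if p_value falls below the chosen significance level (say 0.05),
# the null hypothesis of no difference is rejected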

A statistical methodology called ANOVA (Analysis of Variance) is used to examine differences in the mean values of a dependent variable associated with the effect of controlled independent variables, after taking the influence of uncontrolled independent variables into account.

One way ANOVA involves only one categorical variable, or a single factor; if two or more factors are involved, the method is termed n way ANOVA. The following two assumptions are made (a brief sketch of a one way ANOVA follows the list):
  • The samples drawn from the population should be random in nature.
  • The variances should be homogeneous in nature.
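Here is that sketch, a minimal one way ANOVA in Python using SciPy's f_oneway; the scores for the three groups are made up for illustration.

from scipy.stats import f_oneway

group_1 = [12, 15, 14, 16, 13]  # made-up scores for three groups
group_2 = [18, 20, 19, 17, 21]
group_3 = [14, 13, 15, 16, 12]

f_statistic, p_value = f_oneway(group_1, group_2, group_3)
print(f_statistic, p_value)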

A statistical methodology called partial correlation is also used in the field of psychology. It measures the relationship between two variables while controlling or adjusting for the effect of one or more additional variables, and it is especially useful in behavioral studies. Since psychology is a branch of social science, such quantitative analysis can be done through SPSS, a statistical package for the social sciences.
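A minimal Python sketch of a first order partial correlation follows: each variable is regressed on the control variable, and the residuals are then correlated. The data are simulated for illustration.

import numpy as np

rng = np.random.default_rng(2)
z = rng.normal(size=300)            # control variable
x = 0.6 * z + rng.normal(size=300)
y = 0.6 * z + rng.normal(size=300)

def residuals(v, control):
    # regress v on the control variable and return what is left over
    design = np.column_stack([np.ones_like(control), control])
    beta, *_ = np.linalg.lstsq(design, v, rcond=None)
    return v - design @ beta

print(np.corrcoef(x, y)[0, 1])                              # raw correlation, inflated by z
print(np.corrcoef(residuals(x, z), residuals(y, z))[0, 1])  # partial correlation, near 0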

There are certain terms used in this methodology that help in understanding it precisely. Control variables are those variables whose variance is removed from the initially correlated variables. The order of the correlation refers to the number of control variables: for example, a first order partial correlation has a single control variable.

Other than quantitative methodology, two qualitative techniques are used in this field: the Delphi Process and the Nominal Group Technique. The prime objective of the Delphi Process is to create a reliable and creative investigation of ideas, producing suitable information for appropriate decision making. It operates as a communication device that facilitates the formation of group judgments. The Nominal Group Technique is a balanced method involving overall participation; the term "balanced" is used because the technique encourages equal participation by all group respondents, so that it draws on the ideas and views of a group of people rather than an individual.

The idea behind the Nominal Group Technique is its biggest advantage over the Delphi Process, and it marks the major difference between the two methodologies: the information obtained using the Nominal Group Technique is considered more reliable because responses are obtained from each and every participant.