Quant: ANOVA and Multiple Comparisons in SPSS

Introduction

The aim of this analysis is to examine the relationship between the dependent variable, respondents’ income (rincdol), and the independent variable, their reported level of happiness (happy).  The independent variable has three levels: Very Happy, Pretty Happy, and Not Too Happy.

From the SPSS outputs the goal is to:

  • Use the One-Way ANOVA procedure to reach an overall conclusion about whether mean income differs across levels of happiness.
  • Use the Bonferroni correction as a post-hoc analysis to determine which specific levels of happiness differ in income.

Hypothesis

  • Null: Mean rincdol does not differ across the levels of happy
  • Alternative: Mean rincdol differs for at least one level of happy
  • Null2: Mean rincdol does not differ between a given pair of happy levels
  • Alternative2: Mean rincdol differs between a given pair of happy levels

Methodology

For this project, the gss.sav file is loaded into SPSS (GSS, n.d.).  The goal is to look at the relationship between the following variables: rincdol (Respondent’s income; ranges recoded to midpoints) and happy (General Happiness). To conduct a parametric analysis, navigate to Analyze > Compare Means > One-Way ANOVA.  The variable rincdol was placed in the “Dependent List” box, and happy was placed in the “Factor” box.  Select “Post Hoc” and, under “Equal Variances Assumed”, select “Bonferroni”.  The procedures for this analysis are provided in video tutorial form by Miller (n.d.). The resulting output is shown in the next two tables.

The relationship between rincdol and happy is plotted using the Chart Builder.  The Chart Builder syntax is shown in the code section, and the resulting image is shown in the results section.

Results

Table 1: ANOVA

Respondent’s income; ranges recoded to midpoints

                Sum of Squares    df   Mean Square     F      Sig.
Between Groups  11009722680.000   2    5504861341.000  9.889  .000
Within Groups   499905585000.000  898  556687733.900
Total           510915307700.000  900

The overall ANOVA in Table 1 is statistically significant, F(2, 898) = 9.889, p < .001, so the first null hypothesis is rejected at the 0.05 level. Thus, mean rincdol differs across at least some levels of happy.  However, the overall test does not identify which levels differ from one another; that requires the multiple comparisons in Table 2.
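As a quick arithmetic check on Table 1, the F statistic can be recomputed from the sums of squares and degrees of freedom. The short Python sketch below is only an illustration: Python is not part of the SPSS workflow described here, and the variable names are made up for this example. The numbers are copied from Table 1.

```python
# Values copied from Table 1 (SPSS one-way ANOVA output).
ss_between = 11009722680.0
ss_within = 499905585000.0
df_between = 2    # number of happiness levels minus 1
df_within = 898   # number of respondents minus number of levels

# A mean square is a sum of squares divided by its degrees of freedom.
ms_between = ss_between / df_between
ms_within = ss_within / df_within

# The F statistic is the ratio of the between-groups mean square
# to the within-groups mean square.
f_stat = ms_between / ms_within

print(f"MS between = {ms_between:.1f}")
print(f"MS within  = {ms_within:.1f}")
print(f"F({df_between}, {df_within}) = {f_stat:.3f}")
```

The ratio should agree with the F value reported in Table 1 to three decimal places; small differences in the mean squares come from rounding in the displayed sums of squares.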

Table 2: Multiple Comparisons

Dependent Variable:   Respondent’s income; ranges recoded to midpoints
Bonferroni

(I) GENERAL HAPPINESS  (J) GENERAL HAPPINESS  Mean Difference (I-J)  Std. Error  Sig.  95% Confidence Interval
                                                                                       Lower Bound  Upper Bound
VERY HAPPY             PRETTY HAPPY           4093.678               1744.832    .058  -91.26       8278.61
VERY HAPPY             NOT TOO HAPPY          12808.643*             2912.527    .000  5823.02      19794.26
PRETTY HAPPY           VERY HAPPY             -4093.678              1744.832    .058  -8278.61     91.26
PRETTY HAPPY           NOT TOO HAPPY          8714.965*              2740.045    .005  2143.04      15286.89
NOT TOO HAPPY          VERY HAPPY             -12808.643*            2912.527    .000  -19794.26    -5823.02
NOT TOO HAPPY          PRETTY HAPPY           -8714.965*             2740.045    .005  -15286.89    -2143.04

*. The mean difference is significant at the 0.05 level.

According to Table 2, the “Very Happy” and “Pretty Happy” pairing does not reject Null2 at the 0.05 level (p = .058). The other two pairings, “Very Happy” with “Not Too Happy” (p < .001) and “Pretty Happy” with “Not Too Happy” (p = .005), do reject Null2 at the 0.05 level.  Thus, two of the three pairs differ significantly in mean income.
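To make the Bonferroni adjustment in Table 2 more concrete, the Python sketch below backs out the multiplier that SPSS applied to each standard error when building the confidence intervals, and compares it with a normal-approximation critical value for alpha divided across three comparisons. This is an illustrative sketch, not the SPSS algorithm itself: the normal approximation is an assumption made here to stay within the standard library (SPSS works from a t distribution with 898 degrees of freedom), and all names are hypothetical.

```python
from statistics import NormalDist

# (pair, mean difference, standard error, CI lower, CI upper) from Table 2.
rows = [
    ("Very Happy vs Pretty Happy", 4093.678, 1744.832, -91.26, 8278.61),
    ("Very Happy vs Not Too Happy", 12808.643, 2912.527, 5823.02, 19794.26),
    ("Pretty Happy vs Not Too Happy", 8714.965, 2740.045, 2143.04, 15286.89),
]

alpha = 0.05
n_comparisons = len(rows)
# Bonferroni splits alpha across the comparisons; halve it again for two tails.
tail_prob = alpha / n_comparisons / 2
# Normal approximation to the adjusted critical value (an assumption of this
# sketch; the exact t-based value is slightly larger).
z_crit = NormalDist().inv_cdf(1 - tail_prob)

multipliers = []
significant = {}
for pair, diff, se, lower, upper in rows:
    # Half-width of the reported interval, in standard-error units.
    multipliers.append((upper - diff) / se)
    # A pair differs at the adjusted level when its interval excludes zero.
    significant[pair] = lower > 0 or upper < 0

print(f"normal-approximation critical value: {z_crit:.3f}")
for (pair, *_), m in zip(rows, multipliers):
    print(f"{pair}: multiplier {m:.3f}, significant: {significant[pair]}")
```

All three intervals use essentially the same multiplier, which is what a single Bonferroni-adjusted critical value would produce, and only the interval for “Very Happy” versus “Pretty Happy” includes zero.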

Figure 1: Graphed means of General Happiness versus incomes.

Mean income rises with general happiness (Figure 1).  Respondents reporting a lower level of general happiness have, on average, lower recorded incomes, and vice versa.  No direction or causality can be inferred from this analysis: it cannot be concluded that higher income causes general happiness, or that happy people earn more because of their positive attitude towards life.

SPSS Code

DATASET NAME DataSet1 WINDOW=FRONT.

ONEWAY rincdol BY happy
  /MISSING ANALYSIS
  /POSTHOC=BONFERRONI ALPHA(0.05).

* Chart Builder.
GGRAPH
  /GRAPHDATASET NAME="graphdataset" VARIABLES=happy MEAN(rincdol)[name="MEAN_rincdol"]
    MISSING=LISTWISE REPORTMISSING=NO
  /GRAPHSPEC SOURCE=INLINE.
BEGIN GPL
  SOURCE: s=userSource(id("graphdataset"))
  DATA: happy=col(source(s), name("happy"), unit.category())
  DATA: MEAN_rincdol=col(source(s), name("MEAN_rincdol"))
  GUIDE: axis(dim(1), label("GENERAL HAPPINESS"))
  GUIDE: axis(dim(2), label("Mean Respondent’s income; ranges recoded to midpoints"))
  SCALE: cat(dim(1), include("1", "2", "3"))
  SCALE: linear(dim(2), include(0))
  ELEMENT: line(position(happy*MEAN_rincdol), missing.wings())
END GPL.

References:

Quant: Group Statistics in SPSS

Introduction

The aim of this analysis is to determine whether a person’s status ten years after a coronary event (alive or dead) is reflected in a significant difference in the diastolic blood pressure taken when that event occurred. The variable “DBP58” is used as the dependent variable and “Vital10” as the independent variable.

From the SPSS outputs the goal is to:

  • Analyze these conditions to determine if there is a significant difference between the DBP levels of those (vital10) who are alive 10 years later compared to those who died within 10 years.

Hypothesis

  • Null: Mean DBP58 does not differ between the two Vital10 groups
  • Alternative: Mean DBP58 differs between the two Vital10 groups

Methodology

For this project, the electric.sav file is loaded into SPSS (Electric, n.d.).  The goal is to look at the relationship between the following variables: DBP58 (Average Diastolic Blood Pressure) and Vital10 (Status at Ten Years). To conduct a parametric analysis, navigate to Analyze > Compare Means > Independent-Samples T Test.  The variable DBP58 was placed in the “Test Variable(s)” box, and Vital10 was placed in the “Grouping Variable” box.  Then select the “Define Groups” button and enter 0 for “Group 1” and 1 for “Group 2”.  The procedures for this analysis are provided in video tutorial form by Miller (n.d.). The resulting output is shown in the next two tables.

Results

Table 1: Group Statistics

                                 Status at Ten Years  N    Mean   Std. Deviation  Std. Error Mean
Average Diast Blood Pressure 58  Alive                178  87.56  11.446          .858
                                 Dead                 61   92.38  16.477          2.110

According to Table 1, the mean diastolic blood pressure of those who had died within ten years was about 5 points higher (92.38 versus 87.56) and had a larger standard deviation (16.477 versus 11.446).  Thus, those who were alive ten years later showed less variation in their diastolic blood pressure.

Table 2: Independent Samples Test

Average Diast Blood Pressure 58

Levene’s Test for Equality of Variances
F      Sig.
8.815  .003

t-test for Equality of Means
                                                                                                      95% Confidence Interval of the Difference
                             t       df      Sig. (2-tailed)  Mean Difference  Std. Error Difference  Lower   Upper
Equal variances assumed      -2.515  237     .013             -4.815           1.915                  -8.587  -1.043
Equal variances not assumed  -2.114  80.735  .038             -4.815           2.277                  -9.347  -.284

Levene’s test in Table 2 is significant (F = 8.815, p = .003), so equal variances cannot be assumed at the 0.05 level. Using the “Equal variances not assumed” row, the null hypothesis is rejected at the 0.05 level because the significance value is 0.038.  Thus, there is a statistically significant difference between the mean diastolic blood pressure of those who are alive and those who have passed away.
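The “Equal variances not assumed” row can be approximately reproduced from the group statistics in Table 1 alone. The Python sketch below is an illustration under stated assumptions: it uses the standard unequal-variance (Welch) formulas, the function name `welch_t` is hypothetical, and the inputs are the rounded values printed in Table 1, so the results may differ from the SPSS output in the last decimal places.

```python
import math

# Group statistics copied from Table 1: (n, mean, standard deviation).
alive = (178, 87.56, 11.446)
dead = (61, 92.38, 16.477)


def welch_t(group_a, group_b):
    """Return (t, df, standard error) for a t-test without pooled variance."""
    n_a, mean_a, sd_a = group_a
    n_b, mean_b, sd_b = group_b
    # Each group's contribution to the variance of the mean difference.
    var_a = sd_a ** 2 / n_a
    var_b = sd_b ** 2 / n_b
    se = math.sqrt(var_a + var_b)
    t = (mean_a - mean_b) / se
    # Welch-Satterthwaite approximation to the degrees of freedom.
    df = (var_a + var_b) ** 2 / (
        var_a ** 2 / (n_a - 1) + var_b ** 2 / (n_b - 1)
    )
    return t, df, se


t_stat, df, se = welch_t(alive, dead)
print(f"t = {t_stat:.3f}, df = {df:.3f}, SE = {se:.3f}")
```

The fractional degrees of freedom in Table 2 come from this approximation, which is why the second row reports a non-integer df while the pooled row reports 237.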

SPSS Code

DATASET NAME DataSet1 WINDOW=FRONT.

T-TEST GROUPS=vital10(0 1)
  /MISSING=ANALYSIS
  /VARIABLES=dbp58
  /CRITERIA=CI(.95).

References:

Quant: Paired Sample Statistics in SPSS

Introduction

The aim of this analysis is to compare productivity under two organizational structures. The data are artificial estimates of productivity, with column 1 representing traditional vertical management and column 2 representing autonomous work teams (ATW). The background is that a company of 100 factory workers had been operating under traditional vertical management and decided to move to ATW. The same employees were involved in both systems, having first worked under vertical management and then been converted to ATW.

From the SPSS outputs the goal is to:

  • Analyze the productivity levels of the 2 management approaches, and decide which is superior.

Hypothesis

  • Null: Mean productivity does not differ between prodpre and prodpost
  • Alternative: Mean productivity differs between prodpre and prodpost

Methodology

For this project, the atw.sav file is loaded into SPSS (ATW, n.d.).  The goal is to look at the relationships between the following variables: prodpre (productivity level preceding the new process) and prodpost (productivity level following the new process). To conduct a parametric analysis, navigate to Analyze > Compare Means > Paired-Samples T Test.  The variable prodpre was placed in the “Paired Variables” box under “Pair” 1 and “Variable 1”, and prodpost was placed under “Pair” 1 and “Variable 2”.  The procedures for this analysis are provided in video tutorial form by Miller (n.d.). The following output was observed in the next three tables.

Results

Table 1: Paired Sample Statistics

                                                      Mean   N    Std. Deviation  Std. Error Mean
Pair 1  productivity level preceding the new process  76.43  100  16.820          1.682
        productivity level following the new process  84.24  100  9.797           .980

Descriptively, mean productivity increased by about 8 points (76.43 to 84.24), and the standard deviation decreased by about 7 points (16.820 to 9.797).  That is, the productivity estimates under traditional vertical management are lower and more widely spread than those under autonomous work teams.  Essentially, these distributions suggest that workers receive higher productivity estimates with less variation under autonomous work teams.

Table 2: Paired Samples Correlation

N Correlation Sig.
Pair 1 productivity level preceding the new process & productivity level following the new process 100 .040 .695

Based on Table 2, the correlation between the productivity estimates under traditional vertical management and under autonomous work teams is very weak (r = 0.040) and not statistically significant (p = .695).  In any case, correlation does not imply causation.

Table 3: Paired Samples Test

Pair 1: productivity level preceding the new process - productivity level following the new process

Paired Differences
                                         95% Confidence Interval of the Difference
Mean    Std. Deviation  Std. Error Mean  Lower    Upper
-7.817  19.126          1.913            -11.612  -4.022

t       df  Sig. (2-tailed)
-4.087  99  .000

Based on the results of the two-tailed paired-samples t-test (Table 3), the null hypothesis can be rejected: t(99) = -4.087, p < .001.  There is a significant difference between prodpre and prodpost at the 0.05 level.  The data, based on 100 workers (99 degrees of freedom), show that productivity estimates differ significantly between traditional vertical management and autonomous work teams, with the estimates under autonomous work teams being higher on average.
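The three tables above are tied together by simple arithmetic, which the Python sketch below walks through. It is an illustration only, with hypothetical variable names. It uses the rounded values printed in Tables 1 to 3, so the recomputed standard deviation of the differences is expected to match Table 3 only approximately.

```python
import math

# Values copied from Tables 1-3 of the paired-samples output.
n = 100
sd_pre, sd_post = 16.820, 9.797         # Table 1
r = 0.040                               # Table 2
mean_diff, sd_diff = -7.817, 19.126     # Table 3 (prodpre minus prodpost)
ci_lower, ci_upper = -11.612, -4.022    # Table 3

# Standard error of the mean difference, then the paired t statistic.
se_diff = sd_diff / math.sqrt(n)
t_stat = mean_diff / se_diff
df = n - 1

# The SD of the differences follows from the two SDs and their correlation:
# var(pre - post) = var(pre) + var(post) - 2 * r * sd(pre) * sd(post).
sd_diff_check = math.sqrt(sd_pre ** 2 + sd_post ** 2 - 2 * r * sd_pre * sd_post)

# Critical t value implied by the reported 95% confidence interval.
t_crit_implied = (ci_upper - mean_diff) / se_diff

print(f"t({df}) = {t_stat:.3f}")
print(f"SD of differences from Tables 1 and 2: {sd_diff_check:.3f}")
print(f"implied critical t: {t_crit_implied:.3f}")
```

Because the correlation between the two measurements is close to zero, pairing removes very little variance here, and the standard deviation of the differences is nearly as large as it would be for two unrelated samples.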

SPSS Code

DATASET NAME DataSet1 WINDOW=FRONT.

T-TEST PAIRS=prodpre WITH prodpost (PAIRED)
  /CRITERIA=CI(.9500)
  /MISSING=ANALYSIS.

References:

Quant: Lack of detail

Concerns about the lack of detail

In this scenario, there is a lack of detail. On getting subjects to participate in research, Miller (n.d.) said: “People need to know the specifics.” The scenario described above gives no indication of who these researchers are or what their credentials are.  Without a short biography on the website, it is hard to judge whether these researchers are qualified to conduct the research. The recruitment of subjects also seems to lack a statement of purpose, which sets out the intent, objectives, and major idea of the study (Creswell, 2014).  The statement of purpose gives the readers (the subjects) the reason why these researchers want to examine the two styles of leadership.  It reflects the problem statement and defines the specific research questions the researchers are studying (Creswell, 2014).  Creswell (2014) stated that effective purpose statements for quantitative research are written in deductive language and should include the variables, the relationships between the variables, the participants, and the research location.  In quantitative research, the purpose statement conveys intent by describing the relationships, or lack thereof, between the variables, to be established through either a survey or an experiment.  Miller (n.d.) and Creswell (2014) stated that identifying a theory or conceptual framework is needed to build a strong statement of purpose.  Miller (n.d.) goes further and explains that there needs to be a statement of which two leadership style theories or dimensions will be evaluated in the study.

There is no mention of whether the recruitment of subjects is part of a pilot study, which is used to develop and try out methods and procedures, or of the main study, in which the actual data are collected (Gall, Gall, & Borg, 2006).  The methodology section of this call for subjects should have addressed this.  It should also state what type of instrument the researchers are using to collect data from the subjects.  There are two main types of data collection: surveys and experiments.  A study recruiting subjects to examine two leadership styles is more likely to use surveys as its means of quantitative data collection.  Creswell (2014) defines surveys as numerical data collected and analyzed from a sample of a population to find out participants’ opinions and attitudes.  If done correctly, statistical inference can be used to generalize the results of this study to the population whose views on these two leadership styles the researchers are trying to understand (Gall et al., 2006). Miller (n.d.) suggested that the survey could ask about the subjects’ opinions or attitudes towards certain leadership style traits, or it could present a few scenarios and have the subjects select a multiple-choice answer.

The survey instrument should be both valid and reliable.  Ideally, it should have been used before in other studies, with only slight modifications to fit the parameters of this study, and it should be listed on the researchers’ website.  Even a slightly modified instrument may not hold the same validity and reliability as the original.  If the study’s instrument lacks validity or reliability, why should subjects participate and waste their time?  Validity and reliability ensure that the data captured through the instrument will provide meaningful results (Creswell, 2014; Miller, n.d.).  If the current instrument is not fully valid and reliable, this could indicate a pilot study intended to refine the instrument and build its validity and reliability (Gall et al., 2006).  According to Creswell (2014), there are internal, external, and statistical conclusion threats to validity that must be controlled or mitigated in order to draw correct inferences about the population.

There is no mention of the population that these researchers are trying to study with respect to the two leadership styles.  Without knowing the conditions that define the population, a prospective subject cannot tell whether applying would be a waste of time.  Creswell (2014) states that, depending on the population of a study, some instruments work better than others, and some are simply not well suited to provide the validity and reliability needed to generalize results to that population.  The researchers could narrow their population by stating, “This study aims to understand the relationship between X, Y, and Z, that are displayed in A & B leadership styles among the Latin(x)-American population in the state of Oklahoma, from the ages of 25-35 and 45-55.”  Subjects who do not fall within this population would then not need to consider applying, saving time for both prospective subjects and the researchers.  The study has not mentioned how the population is narrowed to a few dimensions that fit the study.  Thus, one can assume that these researchers may be trying to study the general population, which has a huge number of diverse dimensions that are impossible to study (Miller, n.d.).  Because the scenario states nothing to the contrary, this assumption cannot be ruled out.  The scenario does not mention how the researchers plan to obtain a random selection from this population, and a call posted on their website would draw only a particular type of respondent, who may or may not represent the population the researchers are trying to study.  The more closely the sample represents the study’s target population, the more powerful the statistical inference is, and the more representative of the population the conclusions are (Gall et al., 2006; Miller, n.d.).

Finally, there is a need for information that would entice subjects to participate: how long the survey will take, whether there is compensation, and whether subjects will be informed of the results at the end of the study.  If the survey takes too much time and the population that these researchers are trying to sample does not have that time readily available, the participation rate will decrease.  The longer an assessment takes to fill out, the greater the need to compensate the subjects.  There are two ways to compensate subjects in a study: hand out a small amount of compensation to each participant, or hold a random drawing at the conclusion of the study to give out 2-3 prizes of substantial size (Miller, n.d.).  Regardless of whether any form of compensation is available, the researchers should consider whether there are at least some results or “lessons learned” that subjects would gain through participation in the study.

References: