SPSS Statistics for Dummies
by: Keith McCormick, Jesus Salcedo
For Dummies, 2015
ISBN: 9781118989029
Language: English
384 pages, Download: 20193 KB
Format: EPUB, also available for online reading
Chapter 1
Introducing SPSS
In This Chapter
- Considering the quality of your data
- Communicating with SPSS
- Seeing how SPSS works
- Finding help when you’re stuck
A statistic is a number. A raw statistic is a measurement of some sort. It’s fundamentally a count of something — occurrences, speed, amount, or whatever. A statistic is calculated using a sample. In a sense, a sample is the keyhole you have to peer through to the population, which is what you’re trying to understand. The value at the population level — the average height of an American male, for instance — is called a parameter. Unless you’ve got all the data there is, and you’ve collected a census of the population, you have to make do with the data in your sample. The job of SPSS is to calculate. Your job is to provide a good sample.
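To see the difference between a statistic and a parameter in action, here is a minimal sketch in Python (not SPSS). Everything in it is an assumption for illustration: the population is 10,000 made-up heights drawn from a bell curve, not real measurements, and the sample size of 200 is arbitrary.

```python
import random
import statistics

rng = random.Random(2015)  # fixed seed so the sketch is repeatable

# Hypothetical population: 10,000 made-up heights in centimeters.
population = [rng.gauss(175.0, 7.0) for _ in range(10_000)]

# The parameter: the average computed from every member of the population.
parameter = statistics.mean(population)

# The statistic: the same calculation, but on a random sample of 200.
sample = rng.sample(population, 200)
statistic = statistics.mean(sample)

print(f"Population mean (parameter): {parameter:.2f} cm")
print(f"Sample mean (statistic):     {statistic:.2f} cm")
print(f"Difference:                  {statistic - parameter:+.2f} cm")
```

In practice you almost never get to compute the first number. The sample mean is your view through the keyhole, and it is only as good as the sample behind it.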
In this chapter, we discuss the importance of having accurate, reliable data, and some of the implications when this is not the case. We also talk about how best to organize your data in SPSS and the different kinds of files that SPSS creates. We take a trip down memory lane and discuss the origins of SPSS, as well as what can be done in the program and different ways of communicating with the software. Finally, we spend some time discussing different ways in which you can get help when navigating SPSS.
Garbage In, Garbage Out: Recognizing the Importance of Good Data
SPSS doesn’t warn you when there is something wrong with your sample. Its job is to work on the data you give it. If what you give SPSS is incomplete or biased, or if there is data that doesn’t belong in there, the resulting calculations won’t reflect the population very well. Not much in the SPSS output will signal to anyone that there is a problem. So, if you’re not careful, you can conclude just about anything from your data and your calculations.
Consider the data in Table 1-1. What if you calculated the survival rate of Titanic passengers based on this small sample? What if you calculated what fraction of the passengers were in each class of service? You can easily see that you’d be in real trouble.
Table 1-1 Sample of Titanic Passengers
Survived or Died | Class | Name | Sex | Age | Fare Paid | Cabin | Embarkation |
Died | 1 | Andrews, Mr. Thomas, Jr. | Male | 39 | 0.00 | A36 | Southampton |
Died | 1 | Parr, Mr. William Henry Marsh | Male |  | 0.00 |  | Southampton |
Died | 1 | Fry, Mr. Richard | Male |  | 0.00 | B102 | Southampton |
Died | 1 | Harrison, Mr. William | Male | 40 | 0.00 | B94 | Southampton |
Died | 1 | Reuchlin, Mr. John George | Male | 38 | 0.00 |  | Southampton |
Died | 2 | Parkes, Mr. Francis “Frank” | Male |  | 0.00 |  | Southampton |
Died | 2 | Cunningham, Mr. Alfred Fleming | Male |  | 0.00 |  | Southampton |
Died | 2 | Campbell, Mr. William | Male |  | 0.00 |  | Southampton |
Died | 2 | Frost, Mr. Anthony Wood “Archie” | Male |  | 0.00 |  | Southampton |
Died | 2 | Knight, Mr. Robert J. | Male |  | 0.00 |  | Southampton |
Died | 2 | Watson, Mr. Ennis Hastings | Male |  | 0.00 |  | Southampton |
Died | 3 | Leonard, Mr. Lionel | Male | 36 | 0.00 |  | Southampton |
Died | 3 | Tornquist, Mr. William Henry | Male | 25 | 0.00 |  | Southampton |
Died | 3 | Johnson, Mr. William Cahoone, Jr. | Male | 19 | 0.00 |  | Southampton |
Died | 3 | Johnson, Mr. Alfred | Male | 49 | 0.00 |  | Southampton |
However, consider this: Would you be tempted to drop these cases from your analysis because their fare information appears to be missing? What if fare information were provided for all the other passengers? You might drop the cases in Table 1-1 but use everyone else. You’d be dropping only a handful of passengers out of hundreds, so that would be okay, right? The answer is no, it would not be okay. As it turns out, there is a good reason that each of these passengers didn’t pay a fare (for example, Mr. Thomas Andrews, Jr., designed the ship), and if this were your data, your job would be to know that.
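The trouble with Table 1-1 is easy to demonstrate. The following is a rough sketch in Python (not SPSS) that runs the two calculations mentioned earlier on the table's 15 rows and then flags the zero fares for review instead of dropping them. The outcome, class, and fare values are transcribed from Table 1-1; the `Passenger` type and the function names are our own assumptions, not anything SPSS provides.

```python
from collections import Counter
from typing import NamedTuple


class Passenger(NamedTuple):
    outcome: str   # "Died" or "Survived"
    pclass: int    # class of service: 1, 2, or 3
    surname: str
    fare: float


# Rows transcribed from Table 1-1 (surnames abbreviated for space).
TABLE_1_1 = [
    Passenger("Died", 1, "Andrews", 0.00),
    Passenger("Died", 1, "Parr", 0.00),
    Passenger("Died", 1, "Fry", 0.00),
    Passenger("Died", 1, "Harrison", 0.00),
    Passenger("Died", 1, "Reuchlin", 0.00),
    Passenger("Died", 2, "Parkes", 0.00),
    Passenger("Died", 2, "Cunningham", 0.00),
    Passenger("Died", 2, "Campbell", 0.00),
    Passenger("Died", 2, "Frost", 0.00),
    Passenger("Died", 2, "Knight", 0.00),
    Passenger("Died", 2, "Watson", 0.00),
    Passenger("Died", 3, "Leonard", 0.00),
    Passenger("Died", 3, "Tornquist", 0.00),
    Passenger("Died", 3, "Johnson, W. C.", 0.00),
    Passenger("Died", 3, "Johnson, A.", 0.00),
]


def survival_rate(rows):
    """Fraction of rows whose outcome is 'Survived'."""
    return sum(1 for p in rows if p.outcome == "Survived") / len(rows)


def class_shares(rows):
    """Fraction of rows in each class of service."""
    counts = Counter(p.pclass for p in rows)
    return {cls: n / len(rows) for cls, n in sorted(counts.items())}


def flag_zero_fares(rows):
    """Return zero-fare rows for a human to review; nothing is deleted."""
    return [p for p in rows if p.fare == 0.0]


rate = survival_rate(TABLE_1_1)
shares = class_shares(TABLE_1_1)
flagged = flag_zero_fares(TABLE_1_1)

print(f"Survival rate in this sample: {rate:.0%}")
for cls, share in shares.items():
    print(f"Class {cls}: {share:.1%} of the sample")
print(f"Zero-fare rows needing an explanation: {len(flagged)}")
```

The calculation runs without complaint and reports a survival rate of 0 percent, which plainly does not describe the ship as a whole. Nothing in the output hints at the problem. Flagging the zero fares rather than deleting them keeps the decision, and the reasoning behind it, in your hands.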
Sampling is a big topic, but here’s the quick version:
- The data points in your sample should be drawn at random from the population.
- There should be enough data points.
- You should be able to justify the removal of any data points.
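The three rules above can be sketched in a few lines of Python (not SPSS syntax). Treat this as an illustration under assumptions: the population is just a list of made-up ID numbers, the sample size of 50 is arbitrary, and `draw_random_sample`, `remove_points`, and the removal rule are hypothetical names and choices of ours.

```python
import random


def draw_random_sample(population, n, seed=None):
    """Draw n distinct members at random, each equally likely to be picked."""
    rng = random.Random(seed)
    return rng.sample(population, n)


def remove_points(sample, should_remove, reason, audit_log):
    """Drop points matching should_remove, recording why each one was dropped."""
    kept = []
    for point in sample:
        if should_remove(point):
            audit_log.append((point, reason))
        else:
            kept.append(point)
    return kept


# Hypothetical population: respondent IDs 1 through 1000.
population = list(range(1, 1001))

# Rules 1 and 2: a random draw, with a sample size you chose on purpose.
sample = draw_random_sample(population, 50, seed=42)

# Rule 3: every removal goes into a log along with its justification.
audit_log = []
cleaned = remove_points(
    sample,
    should_remove=lambda respondent_id: respondent_id > 990,
    reason="hypothetical rule: IDs above 990 were test records",
    audit_log=audit_log,
)

print(f"Sampled {len(sample)}, kept {len(cleaned)}, removed {len(audit_log)}")
for respondent_id, reason in audit_log:
    print(f"  removed {respondent_id}: {reason}")
```

How many data points count as enough depends on the analysis you plan to run, so the sketch leaves that number up to you. The audit log is the important habit: if you can't write down a reason for a removal, you probably shouldn't make it.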
This book is not about the accuracy, correctness, or completeness of the input data. Your data is up to you. This book shows you how to take the numbers you already have, put them into SPSS, crunch them, and display the results in a way that makes sense. Gathering valid data and figuring out which crunch to use is up to you.
Your data is your most valuable possession, so be sure to back it up. Make sure you have multiple backups, with at least one stored offsite. The last thing you want is to lose your data.
The origin of SPSS
SPSS is probably older than you are. In 2018, it will turn 50. That makes it older than Windows and older than the first Apple computer, so in the early days SPSS was run on mainframe computers using punch cards.
At Stanford University in the late 1960s, Norman H. Nie, C. Hadlai (Tex) Hull, and Dale H. Bent developed the original software system named Statistical Package for the Social Sciences (SPSS). They needed to analyze a large volume of social science data, so they wrote software to do it. The software package caught on with other folks at universities, and, consistent with the open-source tradition of the day, the software spread through universities across the country.
The three men produced a manual in the 1970s, and the software’s popularity took off. A version of SPSS existed for each of the different kinds of mainframe computers in existence at the time. Its popularity spread from universities into the public sector, and it began to leak into the private sector as well.
In the 1980s, a version of the software was moved to the personal computer. In 2008, the name was briefly changed to Predictive Analytics Software (PASW). In 2009, SPSS, Inc., was acquired by IBM, and the name of the product was returned to the more familiar SPSS. The official name of the software today is IBM SPSS Statistics.
SPSS is available in several forms — single user, multiuser, client-server, student version, and so on. The software also has a number of special-purpose add-ons. You can find out about them all at www-01.ibm.com/software/analytics/spss/products/statistics.
Talking to SPSS: Can You Hear Me Now?
More than one way exists for you to command SPSS to do your bidding. You can use any of four approaches to perform any of the SPSS functions, and we cover them all in this section. The method you should choose depends not only on which interface you prefer, but also (to an extent) on the task you want performed.
The graphical user interface
SPSS has a window interface. You can issue commands by using the mouse to make menu selections that cause dialog boxes to appear. This is a fill-in-the-blanks approach to statistical analysis that guides you through the process of making choices and selecting values. The advantage of the graphical user interface (GUI) approach is that, at each step, SPSS makes sure you enter everything necessary before you can proceed to the next step. This interface is preferred for those just starting out — and if you don’t go into depth with...