Other than summary counts, means, and crosstabs, one of the most common needs in using data is filtering. You'll want to select subsets of your data to concentrate on, rather than performing calculations on the whole set. For example, to return to the dataset from the earlier SPSS tutorials, you may want to find the average age of females for each occupation category. This is easily done using the Analyze means functions, but first you have to filter males out of your extract.
To filter, go to Data >> Select Cases from the top menu in SPSS. In the dialogue window you'll see a list of variables on the left, and on the right, the options to select cases based on certain parameters. Choose the "If condition is satisfied" radio button and click on "If". The next dialogue box allows you to write an expression using your variables that will filter out certain cases:
The idea is to write an expression in the top box that, where true, will preserve those records and filter out any that don't fit your expression.
For example, if you want to filter out males in your sample, you would type:
SEX = 2
Why 2? Because in this particular dataset, males are coded in the SEX column as "1" and females as "2". Your expression is telling SPSS to keep records where SEX has been coded as 2, and exclude those coded as anything else. What if you didn't know the numerical values? You can always right-click on a variable in the left-hand column and select "Variable Information". This will bring up a summary of the values used to encode that variable in your dataset.
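The keep-where-true idea can also be sketched outside SPSS. The short Python example below is only an illustration of the logic, not what SPSS runs internally: the records, the OCC labels, and the helper name select_cases are all made up for the example, and only the SEX coding (1 = male, 2 = female) comes from the dataset described above.

```python
# Hypothetical records; SEX uses the coding described above (1 = male, 2 = female).
records = [
    {"SEX": 1, "AGE": 34, "OCC": "clerk"},
    {"SEX": 2, "AGE": 28, "OCC": "clerk"},
    {"SEX": 2, "AGE": 41, "OCC": "teacher"},
    {"SEX": 1, "AGE": 52, "OCC": "teacher"},
    {"SEX": 2, "AGE": 36, "OCC": "clerk"},
]


def select_cases(rows, condition):
    """Keep only the rows for which the condition is true (hypothetical helper)."""
    return [row for row in rows if condition(row)]


# Equivalent in spirit to the expression SEX = 2.
females = select_cases(records, lambda r: r["SEX"] == 2)

# Mean age of the kept records, grouped by occupation.
ages_by_occ = {}
for row in females:
    ages_by_occ.setdefault(row["OCC"], []).append(row["AGE"])
mean_age_by_occ = {occ: sum(ages) / len(ages) for occ, ages in ages_by_occ.items()}

print(mean_age_by_occ)
```

Records that fail the condition are simply left out of the later calculation, which is the effect a filter has on the means you compute afterwards.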
What if you wanted to filter out anybody above the age of 30? Enter:
AGE <= 30
This means keep only those whose age is less than or equal to 30. What if you wanted to keep everybody who is 30 or younger and who is resident in Brooklyn?
AGE <= 30 AND CITY = 4611
In other words, you can stack all kinds of expressions using the boolean operators AND and OR. And why 4611? Because, as you'll see in the Variable Information, the city of Brooklyn has been coded in the IPUMS dataset as the number 4611.
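To see how AND and OR change which records survive, here is a small Python sketch of the same two conditions. It is an illustration only: the four records are invented, and the city code 9999 is a made-up stand-in for "somewhere other than Brooklyn".

```python
# Hypothetical records; 4611 is Brooklyn as described above, 9999 is a made-up code.
people = [
    {"AGE": 25, "CITY": 4611},
    {"AGE": 45, "CITY": 4611},
    {"AGE": 30, "CITY": 9999},
    {"AGE": 62, "CITY": 9999},
]

# AGE <= 30 AND CITY = 4611: both conditions must hold.
young_and_brooklyn = [p for p in people if p["AGE"] <= 30 and p["CITY"] == 4611]

# AGE <= 30 OR CITY = 4611: either condition is enough.
young_or_brooklyn = [p for p in people if p["AGE"] <= 30 or p["CITY"] == 4611]

print(len(young_and_brooklyn), len(young_or_brooklyn))
```

AND narrows the selection to the one record that meets both tests, while OR widens it to every record that meets at least one.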
One last example can help show how these boolean operators work. Let's say that you want to investigate only those males who are between the ages of 20 and 60. You would enter:
SEX = 1 AND AGE >= 20 AND AGE <= 60
Note that you cannot enter SEX = 1 AND (20 <= AGE <= 60). SPSS doesn't work with that approach, because it needs the AGE variable to be repeated, like this:
SEX = 1 AND (20 <= AGE AND AGE <= 60)
So a good rule of thumb is to write each comparison separately and link them with AND and OR; use parentheses where you need them, just as in a mathematical expression.
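The range filter can be checked against its boundary values with a short Python sketch. This is only an illustration with invented records and a hypothetical helper named is_selected; the comparison is deliberately written with AGE repeated on each side, mirroring the form SPSS expects.

```python
# Hypothetical records chosen to sit on and around the boundaries 20 and 60.
rows = [
    {"SEX": 1, "AGE": 19},
    {"SEX": 1, "AGE": 20},
    {"SEX": 1, "AGE": 60},
    {"SEX": 1, "AGE": 61},
    {"SEX": 2, "AGE": 40},
]


def is_selected(row):
    """Mirror SEX = 1 AND (20 <= AGE AND AGE <= 60), with AGE repeated."""
    return row["SEX"] == 1 and (20 <= row["AGE"] and row["AGE"] <= 60)


selected_ages = [r["AGE"] for r in rows if is_selected(r)]
print(selected_ages)
```

Because both comparisons use "or equal to", the ages 20 and 60 are kept, while 19, 61, and the record coded SEX = 2 are dropped.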
Order of operations: it is also important to note that SPSS evaluates AND, OR, and NOT in a fixed order. It evaluates NOT first, then AND, then OR. A full summary is here. Be sure to use parentheses liberally to get the proper filter in place.
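Python's not, and, and or operators follow the same NOT, then AND, then OR order, so a short Python sketch can show why parentheses matter. This is an illustration of the precedence rule only, not SPSS code, and the three function names are made up for the example.

```python
from itertools import product


def as_written(a, b, c):
    # No parentheses: precedence decides the grouping.
    return a or b and not c


def explicit(a, b, c):
    # The grouping that NOT, then AND, then OR precedence produces.
    return a or (b and (not c))


def left_to_right(a, b, c):
    # What a naive left-to-right reading would compute instead.
    return (a or b) and (not c)


combos = list(product([False, True], repeat=3))

# The unparenthesized form always matches the precedence-based grouping.
same_as_explicit = all(as_written(*x) == explicit(*x) for x in combos)

# Inputs where the left-to-right reading gives a different answer.
differs = [x for x in combos if as_written(*x) != left_to_right(*x)]

print(same_as_explicit, differs)
```

The two readings disagree whenever the first and third conditions are both true, which is the kind of case where an unparenthesized filter can quietly keep or drop the wrong records.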
Click Continue and then OK, and your filter will be in place until you remove it.