Case Selection
Previous Topic  Next Topic 

To reach the dialog box that allows you to select specific cases or records from your data set, click on the Observations tab at the top of the dialog boxes.  This will bring the Observations dialog box to the front.


The scrolling text box in the upper left corner provides brief, on-screen documentation on how to select particular data records based on conditions that you specify.  The variables of the input data set are listed in the box at the right of the screen.


At the bottom of the screen is the case-selection field in which you enter the case selection, or WHERE, expression that will specify cases.  This expression gives the conditions on the variables that will define the subgroup of the data set that you wish to select.


Variable names can be entered in this field by selecting their names from the variable list box.  When you double-click on a variable name it will be copied to the case-selection box.


Case-Selection Expressions

The WHERE statement is used to give the conditions on the variables that will define the subgroup of the data set that you wish to select.


The case-selection, or WHERE, expression, has the following form:


    WHERE     variable expression     relational operator      selection condition


Here, variable expression consists of a single variable or an expression involving several variables, relational operator is one of the operators listed below, while selection condition gives specifications for the variables to be selected.


Variable Expression

All of the usual arithmetic operators [+ - / * ( ) ] are available for use in this expression.


If variable names used in WHERE expressions contain embedded blanks or characters such as relational operators or arithmetic  like '/', then they must be enclosed in single quotes.


Internal Variable

An internal variable, '_rownum' is available which allows specific rows or records of the data set to be referenced.


Relational Operators

The following relational operators are available:



=

equals


!=

not equal


<

less than

>

greater than


<=

less than or equal


>=

greater than or equal


&

and


|

or


,

or (used in a series)


|

not


The modulus operator is also available:


remainder after division by the operand following


Selection Conditions

If variable values consist of string, then when they contain blanks or characters such as '/', they must be enclosed in double quotes.


Examples

Examples of selection conditions given by WHERE expressions are:


     where educ = 12 & rate > .2


   where (income1 + income2)/famsize < 20000


   where income1 >= 20000 | income2 >= 20000


   where acct != 2001


   where name = smith


   where 'dept-sales' = "auto loan"


   where  id  %  2 = 0   (which selects all even values of 'ID')


   where _rownum < 200 (which selects rows 1 - 199)



Wildcards in Selection Conditions

Wildcards ( * or ? ) are available to select subgroups of string variables.  For example:


   where account = ?3*


   where name = mc* | name = mac*


Note that when the wildcard '?' is used, it replaces a single character, while the wildcard '*' replaces an unspecified number of characters.  Thus the specification '?3*' will select account numbers of any length that have a three in the second place.


Comma Operator

The comma operator ',' is used to list different values of the same variable name that will be used as selection criteria.  It allows you to bypass potentially lengthy OR expressions when selecting lists of values.  For example, the WHERE expression above can be more easily written:


     where name = mc*,mac*


Other examples are:


     where age = 21,31,41,51,61


which will select only the listed ages, and


      where caseid != 22*,30??,4?00


which will select all cases except those id's starting with '22', or four character id's starting with '30', or starting with '4' and ending with '00'.



Preserving WHERE expressions

Ordinarily the WHERE expression is cleared after a transfer operation.  If you wish to apply the same expression to several input files, you can check the box Preserve expression between transfers and your expression will be available for re-use or editing for your next transfer run.



See also:


Missing Values

Sampling Functions