User Missing Values

You have some control over how missing values are treated for input files that contain more than one type of missing value. At present, the User Missing Values options apply to SPSS files (both Data and Portable) and OSIRIS files. The options are selected with the buttons Use All, Use First, or Use None.


Some statistical systems distinguish between "system missing," such as the result of a divide by zero, and "user-missing," a numeric value which is defined as a missing value by the user.  Further, particularly in survey research, distinctions are made between user-defined missing values that represent structurally missing data (such as answers to pregnancy history questions from male respondents), and those that represent categories of non-response or simply the failure of the interviewer to properly collect the data.


Conventionally, zero is used to represent "inapplicable" missing values, and  higher numbers are used to represent such responses as "don't know," "refused" and "not ascertained".  While inapplicable data is analytically equivalent to "system missing", there can be legitimate research interest in the patterns of non-response represented by the other categories of missing data.


Use All  By default, when multiple missing values are allowed (as in SPSS, for example) they are mapped into a single missing value on output.  This corresponds to selection of the Use All button.


The mapping to a missing value on output is determined by the option Map to extended (a-z) missing.


Use First  If you select Use First, the first user-defined missing value will be mapped to the system missing value and the rest will be transferred intact to the target data set.  Use First will often be the most useful of the options, since it allows patterns of non-response to be tabulated in the target package.


Use None  If you choose Use None, all of the user-defined missing values will be transferred to the target data set with their input values retained.
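
As a rough illustration of the three options, the following Python sketch models how the user-defined missing values of a single variable might be treated. The function name apply_user_missing, the use of None to stand in for the system missing value, and the sample codes are all assumptions made for this example. It is not Stat/Transfer's implementation, and it ignores the Map to extended (a-z) missing option.

```python
def apply_user_missing(values, user_missing, mode="all"):
    """Model the Use All / Use First / Use None options (illustrative only).

    values       -- input values for one variable
    user_missing -- user-defined missing values, in the order they were declared
    mode         -- "all", "first", or "none"

    None stands in for the target package's system missing value.
    """
    if mode == "all":
        to_missing = set(user_missing)        # every user missing value is collapsed
    elif mode == "first":
        to_missing = set(user_missing[:1])    # only the first declared value is collapsed
    elif mode == "none":
        to_missing = set()                    # all values pass through unchanged
    else:
        raise ValueError(f"unknown mode: {mode!r}")
    return [None if v in to_missing else v for v in values]


# Hypothetical survey variable: 0 = inapplicable, 8 = don't know, 9 = refused.
data = [1, 0, 3, 8, 9, 2]
missing_codes = [0, 8, 9]

print(apply_user_missing(data, missing_codes, "all"))    # [1, None, 3, None, None, 2]
print(apply_user_missing(data, missing_codes, "first"))  # [1, None, 3, 8, 9, 2]
print(apply_user_missing(data, missing_codes, "none"))   # [1, 0, 3, 8, 9, 2]
```

With "first", the codes 8 and 9 survive as ordinary values, which is what makes it possible to tabulate non-response categories in the target package.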



Map to extended (a-z) missing

By default (when this option is left unchecked), all user missing values that are selected according to the options for user missing values (Use All/Use First/Use None) will go to a single missing value, which will then be converted to the "system" missing value in the target package ('.' in SAS or Stata, for example).


If the option Map to extended (a-z) missing is checked, user missing values will be mapped, if possible, to extended missing values in target packages that support them (SAS, ASCII, or Stata).


By default, this mapping will be done in order, with the first missing value going into .a, the second into .b, and so on.
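
The in-order mapping can be sketched in Python as follows. The function name sequential_extended_map is hypothetical, and raising an error when there are more than 26 values is an assumption of this sketch; the text above says only that values are mapped "if possible".

```python
import string


def sequential_extended_map(user_missing):
    """Assign .a, .b, .c, ... to user missing values in declaration order."""
    letters = string.ascii_lowercase
    if len(user_missing) > len(letters):
        # Assumption: this sketch simply refuses more values than there are letters.
        raise ValueError("more user missing values than extended missing codes")
    return {value: "." + letter for value, letter in zip(user_missing, letters)}


# Hypothetical codes: 0 = inapplicable, 8 = don't know, 9 = refused.
print(sequential_extended_map([0, 8, 9]))  # {0: '.a', 8: '.b', 9: '.c'}
```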


Map using variable labels

If the option Map using variable labels is checked, the first letter of the value label will be used as the missing value. For instance, if the value '0' is a user missing value and is labeled as "inapplicable", it will be mapped to '.i'.  This mapping will only occur for missing values that are computed with an equal operator.


If there is no label, or if the letter has already been used, the missing value will instead be mapped sequentially to '.a' - '.c'.
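
The label-based mapping and its fallback might look like the Python sketch below. The function name label_extended_map and the sample labels are hypothetical. The fallback rule used here (take the first letter, starting from 'a', that has not yet been assigned) is an assumption about what "mapped sequentially" means, not a statement of how Stat/Transfer behaves.

```python
import string


def label_extended_map(user_missing, value_labels):
    """Map each user missing value to '.<first letter of its value label>'.

    Falls back to the first unused letter (an assumption of this sketch)
    when a value has no usable label or its letter is already taken.
    """
    mapping = {}
    used = set()
    for value in user_missing:
        label = value_labels.get(value, "").strip().lower()
        letter = label[:1]
        if not letter or letter not in string.ascii_lowercase or letter in used:
            # No label, a non-letter first character, or a letter already used.
            letter = next((c for c in string.ascii_lowercase if c not in used), None)
            if letter is None:
                raise ValueError("no extended missing codes left")
        used.add(letter)
        mapping[value] = "." + letter
    return mapping


# Hypothetical labels; the code 7 has no label at all.
labels = {0: "inapplicable", 8: "don't know", 9: "interviewer error"}
print(label_extended_map([0, 8, 9, 7], labels))
# {0: '.i', 8: '.d', 9: '.a', 7: '.b'}
```

Here the code 9 cannot use '.i' because the code 0 already took it, so it falls back to '.a', and the unlabeled code 7 then takes '.b'.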


Since current versions of Stata and SAS support the labeling of missing values, this option is less useful than it once was.  If you have value labels, it is best to keep it unchecked and rely on the value labels to differentiate your values.



Note that we believe these options are potentially dangerous.  To avoid the chance of users checking one of these options and then forgetting about it, Stat/Transfer does not save the settings when options are automatically saved at the end of a session.