ASCII/Text Files - Read Options

Delimiter

This option lists the possible delimiters for input ASCII files.  You can have Stat/Transfer sense the delimiter automatically, or you can choose one from the list: commas, tabs, spaces, or semicolons.  If your delimiter is not on the list, click on 'Other' and enter the delimiter you wish to use.
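
To make the idea of delimiter sensing concrete, here is a minimal Python sketch. It is not Stat/Transfer's actual algorithm, which is not documented here. The function name `sense_delimiter` and the rule it uses (pick the candidate that occurs the same nonzero number of times on every line) are assumptions made for illustration, and the sketch ignores quoted strings.

```python
def sense_delimiter(lines, candidates=(",", "\t", ";", " ")):
    """Guess the delimiter of a sample of lines (illustrative heuristic only)."""
    best, best_count = None, 0
    for delim in candidates:
        # Count the candidate on every non-blank line.
        counts = {line.count(delim) for line in lines if line.strip()}
        # A plausible delimiter appears equally often on each line.
        if len(counts) == 1:
            count = counts.pop()
            if count > best_count:
                best, best_count = delim, count
    return best  # None when no candidate fits

print(repr(sense_delimiter(["id;name;age", "1;Ann;34"])))
print(repr(sense_delimiter(["x\ty", "1\t2"])))
```

If a heuristic like this returns nothing useful for your file, that is the situation in which you would pick the delimiter from the list or enter it under 'Other'.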


Combine adjacent blanks

This option is available for space delimited files only.  It is useful if data values are delimited by one or more blanks or tabs.


The default for Combine adjacent blanks is 'off'.  If you turn this option on, you can select 'Spaces', in which case multiple blanks are treated as one blank, or 'Spaces and tabs', in which case any run of tabs and blanks is converted to a single space.
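
The difference between the three settings can be sketched in Python as follows. This is an illustration of the behavior described above, not the program's own code. The function name `split_fields` and the setting labels `"off"`, `"spaces"`, and `"spaces_and_tabs"` are hypothetical.

```python
import re

def split_fields(line, combine="off"):
    """Split a space-delimited line under one of three hypothetical settings."""
    line = line.rstrip("\n")
    if combine == "spaces":
        # A run of blanks counts as one delimiter; tabs are left alone.
        return re.split(r" +", line)
    if combine == "spaces_and_tabs":
        # A run of blanks and tabs counts as one delimiter.
        return re.split(r"[ \t]+", line)
    # Off: every blank is a delimiter, so repeated blanks give empty fields.
    return line.split(" ")

line = "12  7 \t 3"
print(split_fields(line))
print(split_fields(line, "spaces"))
print(split_fields(line, "spaces_and_tabs"))
```

With the option off, the doubled blank in the sample line produces an empty field and the tab ends up as a field of its own. Only 'Spaces and tabs' yields the three values a reader would expect.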


Variable Names

By default, Stat/Transfer will sense whether the first line of your input data set contains field names or data.  You may, if you wish, explicitly override this default.


AutoSense: If this option is set to 'AutoSense', Stat/Transfer will look at the first and second rows of data.  If there is a change from a string to a number for one or more variables between these rows, Stat/Transfer will use the first row as the field names.  This will fail if your first row contains the field names, and all of your variables are of the string type.  In that case you should choose one of the following two options:


First Row: When this option is set to 'First Row', Stat/Transfer uses the data found in your first row as the field names.


Make Up:  When this option is set to 'Make Up', Stat/Transfer treats the first row in your file as data and assigns the field names 'col1' through 'coln', where n is the number of columns.
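
The three settings can be sketched in Python as shown below. This is a simplified illustration of the rule described above (a change from string to number between the first and second rows), not Stat/Transfer's implementation. The function names and the mode labels `"autosense"`, `"first_row"`, and `"make_up"` are assumptions.

```python
def looks_numeric(value):
    """True if the field can be read as a number."""
    try:
        float(value)
        return True
    except ValueError:
        return False

def first_row_is_header(row1, row2):
    """True if some column is a string in row 1 but a number in row 2."""
    return any(
        not looks_numeric(a) and looks_numeric(b)
        for a, b in zip(row1, row2)
    )

def field_names(rows, mode="autosense"):
    """Return (names, data_rows) for a non-empty list of rows."""
    if mode == "autosense":
        has_header = len(rows) > 1 and first_row_is_header(rows[0], rows[1])
        mode = "first_row" if has_header else "make_up"
    if mode == "first_row":
        return list(rows[0]), rows[1:]
    # Make Up: keep every row as data and generate col1..coln.
    names = [f"col{i}" for i in range(1, len(rows[0]) + 1)]
    return names, rows

print(field_names([["id", "name"], ["1", "Ann"]]))
# All-string file: sensing cannot see the header, so names are made up.
print(field_names([["first", "last"], ["Ann", "Lee"]]))
print(field_names([["first", "last"], ["Ann", "Lee"]], mode="first_row"))
```

The second call shows the failure case: every column is a string in both rows, so the header row is treated as data unless 'First Row' is chosen explicitly.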


Numeric Missing Value

It is possible to specify a string that will be interpreted as a missing value when Stat/Transfer reads ASCII files.  For example, your input data set may use the string 'NA' to represent missing values or it may use a period.


Enter the string that represents missing values in the input data in the Numeric Missing Value field.


If you wish to read extended missing values for either delimited or fixed ASCII files, use the Convert extended (a-z) missing values option below or enter the word "extended".
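
The basic case, a single string that stands for a missing value, can be sketched in Python like this. The function name `parse_numeric` is hypothetical, and the exact-match comparison after trimming blanks is an assumption. The documentation above does not say how Stat/Transfer compares the string.

```python
def parse_numeric(field, missing="NA"):
    """Return a float, or None when the field equals the missing-value string."""
    text = field.strip()
    if text == missing:
        return None  # None stands in for a missing value in this sketch
    return float(text)

print([parse_numeric(f) for f in ["3.5", "NA", "10"]])
print(parse_numeric(".", missing="."))
```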


Read Variable Labels from Second Row

If this option is checked, variable labels will be read from the row that follows the variable names (ordinarily the second row).


Rows to Skip

Enter the number of rows you want to skip at the top of the file.  This option is useful for skipping headings and titles.


Convert extended (a-z) missing values

If this option is checked, the keyword 'extended' will be entered into the Numeric Missing Value field.  When 'extended' is entered, extended missing values ('.a' - '.z', '.', and '._')  in either delimited or fixed ASCII files will be read from the file.  Note that reading missing values is case-insensitive (that is, '.a' and '.A', for example, are equivalent).


These extended missing values will be automatically written to the output file for output formats that support them (SAS and Stata).
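
As an illustration of which fields count as extended missing values, the following Python sketch recognizes '.', '._', and '.a' through '.z' without regard to case. The function name `parse_extended` and the tagged-tuple return value are assumptions made for the example, not part of Stat/Transfer.

```python
import re

# '.', '._', and '.a' through '.z', matched case-insensitively.
EXTENDED = re.compile(r"\.[a-z_]?", re.IGNORECASE)

def parse_extended(field):
    """Return ('missing', code) or ('value', number) for one field."""
    text = field.strip()
    if EXTENDED.fullmatch(text):
        # Normalize case, since '.a' and '.A' are equivalent.
        return ("missing", text.lower())
    return ("value", float(text))

for field in [".", "._", ".A", ".5", "12"]:
    print(field, parse_extended(field))
```

Note that '.5' is still read as the number 0.5: only a period alone, or a period followed by a single letter or underscore, is treated as missing.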


String Quote Character

This is the character that is used to enclose string fields in the input data set.  The default is the double quote.  You can choose the character that is appropriate for your input data.  If string variables are not enclosed by any character, you can leave this option set at the default double quote.
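
To see why the quote character matters, consider a string field that contains the delimiter. The sketch below uses Python's standard `csv` module, not Stat/Transfer, and the sample line is invented for the example.

```python
import csv

line = "'Smith, J',42"

# Parsed with the default double quote, the comma inside the name splits the field.
print(next(csv.reader([line])))

# Parsed with the single quote as quote character, the name stays whole.
print(next(csv.reader([line], quotechar="'")))
```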


Maximum Number of Lines to Examine

Stat/Transfer first reads your ASCII data to determine what type of variable is present in each delimited position.  By default it will read the first 1000 lines of your data set.  If your data are consistent, so that the first few lines suffice to show each variable type, and your data have enough rows that it actually takes more than a few seconds to examine 1000 lines, you might want to set this option to a smaller limit, such as 50.  On the other hand, if you don't want to worry about this issue and you do not generally read truly large data sets (gigabytes of data), you might want to just set this to "all".
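
The trade-off can be sketched in Python as follows. This is a simplified model with only two types ('number' and 'string'), not Stat/Transfer's type-sensing logic. The names `sense_types` and `is_number` are hypothetical, and `max_lines=None` plays the role of "all".

```python
from itertools import islice

def is_number(field):
    try:
        float(field)
        return True
    except ValueError:
        return False

def sense_types(rows, max_lines=1000):
    """Guess 'number' or 'string' per column from the first max_lines rows."""
    types = None
    for row in islice(rows, max_lines):  # islice(rows, None) examines every row
        if types is None:
            types = ["number"] * len(row)
        for i, field in enumerate(row[:len(types)]):
            if field.strip() == "":
                continue  # empty fields say nothing about the type
            if types[i] == "number" and not is_number(field):
                types[i] = "string"
    return types or []

rows = [["1", "a"], ["2", "b"], ["x", "c"]]
print(sense_types(rows, max_lines=2))
print(sense_types(rows, max_lines=None))
```

With a limit of 2, the first column looks numeric because the string 'x' in the third row is never examined. Examining all rows classifies it as a string. A small limit is therefore safe only when the early rows are representative of the whole file.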


Decimal Point

If your data set uses a symbol other than the default period to indicate the decimal point in a number (a comma, for example), enter the character on the Decimal Point line.


Thousands Separator

If your data set uses a symbol other than the default comma to mark thousands in a number (a period, for example), enter the character on the Thousands Separator line.
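
The effect of the Decimal Point and Thousands Separator settings together can be illustrated with a short Python sketch. The function name `parse_number` and its parameter names are hypothetical. It shows the idea only and does not validate that separators appear in sensible positions.

```python
def parse_number(text, decimal_point=".", thousands=","):
    """Read a number written with the given decimal point and thousands separator."""
    # Drop the thousands separator first, then normalize the decimal point.
    cleaned = text.strip().replace(thousands, "")
    cleaned = cleaned.replace(decimal_point, ".")
    return float(cleaned)

print(parse_number("1,234.56"))
print(parse_number("1.234,56", decimal_point=",", thousands="."))
```

Both calls give the same value. The second uses the convention common in much of continental Europe, where the roles of the comma and the period are swapped.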