Stata Files

Stat/Transfer will read and write data for any version of Stata, including versions running on Unix and Macintosh.


Standard extension:  dta


Reading Stata Files

Stat/Transfer can read data from any version of Stata.  Character variables and dates are fully supported.  Variable and value labels are transferred in and out of Stata.


Stata Version 13 includes a new strL data type for potentially very long, variable-length strings.  Stat/Transfer will read these variables.  However, strings longer than the Stat/Transfer limit of 32,000 characters will be truncated, and binary strLs will be skipped.


Stata Version 14 supports Unicode, and the input encoding for these files will be automatically set to utf-8.


Writing Stata Files

Versions of Stata later than Version 7 come in two flavors: Standard ("Intercooled") Stata and Stata SE (Special Edition).  These differ in the maximum number of variables they allow.


You can choose the "flavor" of the Stata output file from the list given in the Output File Type box in the Transfer dialog box.


Use the Stata Version option in the Output Options (1) section of the Options dialog box to choose the version of the Stata files written by Stat/Transfer.  The default is Stata Version 13.  Change this option if a different version of Stata will be used to read the file.


Any variable and value labels present in the input data set will be written to Stata files.  Variable labels longer than Stata's 80-character maximum are written to Stata both as truncated variable labels and, in full, as Stata notes.
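The handling of long variable labels described above can be illustrated with a small Python sketch.  This is not Stat/Transfer's own code; the function name split_label and the constant name are hypothetical, and the only facts taken from the text are the 80-character limit and the rule that the full label is kept as a note.

```python
STATA_LABEL_MAX = 80  # Stata's variable-label limit, as stated in the text


def split_label(label: str, limit: int = STATA_LABEL_MAX):
    """Return (variable_label, note).

    The note is None when the label fits within the limit; otherwise the
    label is cut to `limit` characters and the full text is kept as a note.
    """
    if len(label) <= limit:
        return label, None
    return label[:limit], label


short_label, short_note = split_label("Household income")
long_label, long_note = split_label("x" * 100)
```

A label that fits is passed through unchanged with no note, while a 100-character label yields an 80-character label plus a note holding all 100 characters.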


Stata Version 13 includes a new strL data type for potentially very long, variable-length strings.  Stat/Transfer will write strings longer than a threshold value as Stata strLs rather than as fixed-width strings.  The threshold is 32 characters by default, but you can change it on the Options (1) screen.  Strings longer than Stata's width limit of 2045 characters will always be written as strLs.


Dates are written in Stata's internal date format.
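The choice between a fixed-width string and a strL can be sketched as a simple decision rule.  The Python below is only an illustration under stated assumptions: the function name stata_string_type is hypothetical, and it assumes the decision is made from the length of the longest value in a variable, which the text does not spell out.  The default threshold of 32 and the 2045-character width limit come from the paragraph above.

```python
DEFAULT_THRESHOLD = 32  # default strL threshold given in the text
STATA_STR_MAX = 2045    # Stata's fixed-width string limit given in the text


def stata_string_type(max_len: int, threshold: int = DEFAULT_THRESHOLD) -> str:
    """Pick an output type for a string variable.

    max_len is the length of the longest value (an assumption of this sketch).
    Values longer than the threshold, or longer than Stata's width limit
    regardless of the threshold, become strL; the rest become str1..str2045.
    """
    if max_len > threshold or max_len > STATA_STR_MAX:
        return "strL"
    return f"str{max(max_len, 1)}"
```

With the default threshold, a 32-character variable is written as str32 and a 33-character variable as strL.  Raising the threshold above 2045 does not change the outcome for longer strings, which are still written as strLs.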


The output encoding for Stata Version 14 will be automatically set to utf-8, and all data and metadata, such as variable names and labels, will be properly transcoded.


Missing Data

Stata supports missing values.


Versions 8 and above of Stata support SAS-style extended missing values.  The missing values '.A' through '.Z' and '.' are supported.  The missing value '._' found in SAS is not used in Stata.  Stat/Transfer supports these extended missing values.


When a Stata file with extended missing values is transferred to a SAS or ASCII file, the input missing values will transfer to the equivalent SAS or ASCII ones.  When a SAS file or ASCII file is transferred to a Stata file with extended missing values specified, missing values will transfer to equivalent ones, except that '._' in input SAS files is written out as '.' in the output.
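The SAS-to-Stata direction of this mapping can be written out as a short Python sketch.  It is illustrative only: the function name sas_missing_to_stata is hypothetical, and representing missing values as text codes such as '.A' is an assumption made for readability, not a description of how either file format stores them.

```python
import string

# Extended missing codes common to SAS and Stata: '.' and '.A' through '.Z'.
SHARED_CODES = {"."} | {"." + letter for letter in string.ascii_uppercase}


def sas_missing_to_stata(code: str) -> str:
    """Map a SAS missing-value code to the Stata code it is written as."""
    code = code.strip().upper()
    if code == "._":
        # '._' is not used in Stata, so it is written as plain '.'.
        return "."
    if code in SHARED_CODES:
        return code
    raise ValueError(f"not a SAS missing-value code: {code!r}")
```

Every code except '._' maps to itself, so a transfer from SAS to Stata and back preserves all missing values other than '._'.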


For input files that support user missing values (SPSS and OSIRIS), the options User Missing Values and Map to extended (a-z) missing in the Options dialog box can be used to map selected user missing values to extended missing values in the Stata output file.


Output Variable Types

The output variable type that results from each target variable type is given in the following table:


Target Type    Output Type

byte           Byte
int            Int
long           Long
float          Float
double         Double
date           Stata Date
time           Float (fractional part of a day)
date/time      Double (Stata date and fractional part of a day)
string         Character
strl           Character or strL
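

The date, time, and date/time rows of the table can be made concrete with a small Python sketch.  Stata daily dates count days from 1 January 1960; that Stat/Transfer computes its output with exactly this arithmetic is an assumption, and the function names below are hypothetical.

```python
from datetime import date, datetime, time

STATA_EPOCH = date(1960, 1, 1)  # day 0 of Stata's daily date scale
SECONDS_PER_DAY = 86_400


def to_stata_date(d: date) -> int:
    """Days elapsed since 1 January 1960."""
    return (d - STATA_EPOCH).days


def to_day_fraction(t: time) -> float:
    """Time of day expressed as a fraction of a 24-hour day."""
    seconds = t.hour * 3600 + t.minute * 60 + t.second + t.microsecond / 1e6
    return seconds / SECONDS_PER_DAY


def to_stata_datetime(dt: datetime) -> float:
    """Stata date plus the fractional part of the day, as in the table."""
    return to_stata_date(dt.date()) + to_day_fraction(dt.time())
```

Under these assumptions, noon is stored as 0.5, and 6:00 a.m. on 2 January 1960 is stored as 1.25.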