OpenDocument Spreadsheets
Previous Topic  Next Topic 

The OpenDocument format for spreadsheets is an ISO standard XML format for spreadsheet data.  It supported by numerous applications, including, LibreOffice, and Google Docs.

Standard extension: ods

Reading OpenDocument Spreadsheets

Because worksheet files are in general not designed to hold statistical data, only worksheets in certain formats can be read.

Worksheets must be in worksheet database format or in certain modifications of this.  It is most straightforward to read worksheets in database format.

Database Format

Worksheet database files are structured worksheets where each row is a single case and each column contains a variable.  Data can consist of numbers (including serial date numbers), labels, or formulas.

The first non-blank row of a worksheet database file usually has strings in each column that give the names of the variables.  The data then begins in the next row. However, variable names may be in different rows or not present at all.

You have several options to specify what part of an input worksheet to read and how to read variable names.  An option allows you to read variable labels from the row after the variable names. 

Data Range

You can choose different ranges to be read in input worksheets by using the drop-down menu for Data Range in the Worksheets section of the Options dialog box or using the SET command, WKS-DATA-RANGE.

If you use Autosense (the default), Stat/Transfer will read to the first non-blank cell and use that as the upper left corner of the data range.  It will then read data until it encounters an entirely blank line.  This is the default behavior.  You can change the behavior with respect to blank rows by using the Blank Rows option.

Rather than have Stat/Transfer automatically sense the number of rows to be read, you can use the other options for Data Range to specify a range, either by giving a named range or by giving explicit coordinates.

When a range has been specified, Stat/Transfer's treatment of entirely blank rows will also be overridden.  They will be returned in your output data and, in addition, blank rows at the end of the worksheet, through the last row of the specified range, will also be returned.

Determining Variable Names

By default, Stat/Transfer will attempt to determine whether or not the data in the first non-blank row (or the first row of a specified range) are variable names or the first row of data.  It does so by looking for at least one column in which there is a string in the first row and a number in the second.  If this behavior is inappropriate for your worksheet (for example, if you have only string data), you can specify different options in the Field Name Row drop-down menu of the Worksheets section of the Options dialog box.  You can specify that variable names must be taken from the first non-blank row, that they be taken from a specific row or that the worksheet does not contain variable names, so that Stat/Transfer should assign them ('col1' through 'coln'.)

Determining the Data Types and Widths

After identifying the label row, Stat/Transfer will look at the entire column. If any cell contains a string, the entire column will be treated as string data (with numbers and dates converted to strings).  If the column is all numeric or blank, it will be treated as a numeric column.

The width of the column for each numeric variable and the format of the first non-blank data cell in that column are used, where possible, to set the default target, or output, types for the numeric variables.  If the format of the first data row has any decimal places (for example, F(2)), the target type will be 'float'.  On the other hand, if the cell format has no decimal places (for example, F(0)) the target types will be various flavors of integers which depend on the column width.  If the column width is less than three, the target type will be 'byte'.  If the column width is less than five, the target type will by 'int'.  Otherwise the target type will be 'long'.  Any date format in the first data row will set the target type to 'date'.

The maximum width of character variables is determined by examining the widths of all of the strings in a column.

Stat/Transfer is lenient in typing variables from worksheets.  If it is expecting a character variable and it encounters a number it will convert it to a string.

Combining Multiple Input Worksheet to a Single Output File

The option Concatenate Worksheet Pages, found in the Worksheets section of the Options dialog box allows you to combine worksheet pages into a single output file.  This option is appropriate if your worksheet contains many sheets that are identical in structure.  These can be then be combined into a single output file of any type.

For example, you may have a workbook that has 50 sheets, with one sheet for each state and the same variables on each sheet.  If you check this box, Stat/Transfer will effectively combine the sheets into one large input file, dropping the field names, if necessary, on the second and higher sheets. You will end up with the data from all of your worksheet pages in a single output file.

Writing OpenDocument Spreadsheets

On output, Stat/Transfer will write variable labels in the first row of the worksheet.  Data values will be placed in the second and succeeding rows.

Column widths and formats will be determined by the variable information available.  Dates and character variables are straightforward.  For numerical data, information on the width and number of decimal places of variables, where available, is used to set the column widths and formats.

Missing Data

On input, blank cells and cells containing labels consisting of a single dot are read as missing.

If there is a string in your worksheet, such as 'NA', that you would like to have treated as a numeric missing value, you can specify it using the Numeric Missing Value option in the Worksheets section of the Options dialog box. 

When transferring data to worksheets from other formats, missing values will be written out to worksheets as blank cells.

Output Variable Types

The output variable type that results from each target variable type is given in the following table:

Target Type

Output Type






Numeric cell (formatted if information available)




Formatted Date


Formatted Time


Formatted Time Stamp