Data Set File Formats - Rave Documentation

Latest revision as of 16:05, 17 October 2014

Rave can read .txt, .csv, .xls, and .xlsx files. The file must be formatted as a "flat file database," which simply means it has the variable names in the first row, and data records in all remaining rows. Each column must be separated by a comma or tab (or be separate columns in Excel), and each row of the file must have the same number of columns. If data is missing, it should be listed as "nan", not simply left blank.
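As an illustration of the layout described above, here is a minimal Python sketch that writes a flat file with the variable names in the first row, one record per remaining row, and "nan" in place of missing values. The function name `write_flat_file` and the sample variable names are hypothetical and are not part of Rave.

```python
import csv
import io

def write_flat_file(out, header, records, delimiter=","):
    """Write a header row followed by data records, one record per row.

    Hypothetical helper, not part of Rave. Missing values (None or an
    empty string) are written as "nan" instead of being left blank.
    """
    writer = csv.writer(out, delimiter=delimiter, lineterminator="\n")
    writer.writerow(header)
    for record in records:
        # Every row must have the same number of columns as the header.
        if len(record) != len(header):
            raise ValueError(
                f"expected {len(header)} columns, got {len(record)}"
            )
        writer.writerow(
            ["nan" if value is None or value == "" else value for value in record]
        )

# Example with made-up variable names and values.
buffer = io.StringIO()
write_flat_file(
    buffer,
    ["Mach", "Altitude", "Range"],
    [[0.8, 35000, 2100], [0.85, None, 2250]],
)
print(buffer.getvalue())
```

To produce a tab-separated .txt file instead, pass `delimiter="\t"` and write to a file opened with `open(path, "w", newline="")`.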


Allowable Variable Names

In general, a variable name can be any string of text that doesn't contain the delimiter used in your data file. For example, in a .csv file your variable names cannot contain commas, but in other file types commas are OK.

There is currently one additional restriction:

  • Your variable names cannot be enclosed in angle brackets: <Variable Name>. However, angle brackets may otherwise appear in your names. For example, "this<that" is a valid variable name, as is "thisthat>", but "<thisthat>" is not.
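The two naming rules above can be checked before loading a file. The following is a minimal Python sketch, not Rave's own validation code; the function name `is_valid_variable_name` is hypothetical, and treating a lone "<" or ">" as acceptable is an assumption, since the rule only describes names enclosed in a pair of angle brackets.

```python
def is_valid_variable_name(name, delimiter=","):
    """Return True if `name` satisfies the two rules described above.

    Hypothetical helper, not part of Rave.
    """
    # Rule 1: the name must not contain the file's delimiter.
    if delimiter in name:
        return False
    # Rule 2: the name must not be enclosed in angle brackets.
    # Assumption: at least two characters are needed to count as "enclosed".
    if len(name) >= 2 and name.startswith("<") and name.endswith(">"):
        return False
    return True

for candidate in ["this<that", "thisthat>", "<thisthat>", "a,b"]:
    print(repr(candidate), is_valid_variable_name(candidate))
```

For a tab-delimited file, call the function with `delimiter="\t"`, in which case a name such as "a,b" passes the check.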


Fast Load

When creating a data set, you can use the "fast load" option (turned on by default) if your data meets the following requirements:

  • The format of each column can be inferred from the first row of data (the 2nd row of the file, after the column headers). That is, if the value in the first row of a given column is a number, then every entry in that column is a number.
  • The data set has no missing values or NaNs.

Using fast load reduces the number of checks Rave does when loading your data. This makes the loading process significantly faster, but it may cause errors (such as only loading a portion of the data file) if your data does not meet the above requirements.

If your data does not meet these requirements, uncheck the "Fast Load" box when creating the data set. As its name implies, turning off fast load will significantly increase the amount of time needed for Rave to load your data file.
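If you are unsure whether a delimited text file meets the fast load requirements, you can check it before loading. The following is a minimal Python sketch under stated assumptions, not a description of the checks Rave performs: the function names are hypothetical, a value counts as a number if Python's `float()` accepts it, and a column's format is taken to be either "number" or "text". The check is deliberately conservative, so it also rejects a text column that later contains a number.

```python
import csv

def _is_number(text):
    """Assumption: a value is numeric if float() can parse it."""
    try:
        float(text)
        return True
    except ValueError:
        return False

def _is_missing(text):
    """Treat blank entries and "nan" (any letter case) as missing."""
    stripped = text.strip()
    return stripped == "" or stripped.lower() == "nan"

def meets_fast_load_requirements(lines, delimiter=","):
    """Check an iterable of text lines against the two requirements above.

    Hypothetical helper, not part of Rave.
    """
    reader = csv.reader(lines, delimiter=delimiter)
    header = next(reader, None)
    if header is None:
        return False  # empty file: nothing to load
    first_row_numeric = None
    for row in reader:
        if len(row) != len(header):
            return False  # every row needs the same number of columns
        if any(_is_missing(value) for value in row):
            return False  # requirement 2: no missing values or NaNs
        numeric = [_is_number(value) for value in row]
        if first_row_numeric is None:
            first_row_numeric = numeric  # formats inferred from first data row
        elif numeric != first_row_numeric:
            return False  # requirement 1: a later row disagrees
    return True

sample = ["x,y,label\n", "1,2.5,a\n", "3,4,b\n"]
print(meets_fast_load_requirements(sample))
```

To check a file on disk, pass an open file object, for example `meets_fast_load_requirements(open(path, newline=""))`. If the function returns False, uncheck the "Fast Load" box.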

Note: The "Fast Load" check box affects data files loaded by the following buttons in the Manage data GUI: "Create this Data Set", "Append Data", and "Replace Data".

Working with Excel Files

  • If you load an Excel file that contains multiple worksheets, each (non-empty) worksheet will be loaded as a separate data set. The data sets will be named after the corresponding worksheets.
  • Excel files take longer to load than plain text files, so if you're working with large data sets you might want to use .txt files.
