Data sets
Introduction
A data set is best thought of as a table, with rows and columns. The sections below describe how the data is arranged.
Variables
A variable is the basic building block of your data sets in Rave. Variables are defined by the following attributes:
Variable Name
Each variable has a name. When you load a data set from a file, the variable names are taken from the first row of the file. Otherwise, Rave will ask you to name variables as they are created.
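The first-row naming convention can be illustrated outside Rave with a short Python sketch. This is not Rave's loader: the function `load_data_set`, the comma-separated format, and the sample variable names are all assumptions made for illustration.

```python
import csv
import io


def load_data_set(text):
    """Parse CSV text; the first row supplies the variable names."""
    reader = csv.reader(io.StringIO(text))
    names = next(reader)                    # first row -> variable names
    rows = [row for row in reader if row]   # remaining non-empty rows -> data
    # Store the table column-wise: one list of values per variable.
    columns = {name: [row[i] for row in rows] for i, name in enumerate(names)}
    return names, columns


# Hypothetical file contents: three variables, two rows of data.
sample = "span,chord,material\n10.5,1.2,steel\n12.0,1.4,aluminum\n"
names, columns = load_data_set(sample)
print(names)
print(columns["material"])
```

Each name taken from the header becomes one column of the table, which matches the idea that each column in a loaded file becomes one variable.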
See also: Renaming Variables
Dependency
Each variable may be independent or dependent:
Independent Variables are not calculated by functions. Rave can use the values of independent variables stored in the data table to draw graphs, but since Rave has no way of calculating new values of independent variables, they cannot be used in continuous graphs or in most optimizers. When you load a new data set from a file, each column in the file becomes a new independent variable. After loading a data set, you can modify the values of independent variables by editing the main table.
Dependent Variables are calculated by functions that you have loaded from the Model Tab. Dependent variables can be functions of independent variables, other dependent variables, or both. Rave can call the function that calculates a dependent variable's values to generate new data, allowing dependent variables to be used for optimization and continuous graphs. However, you can never directly modify the values of dependent variables; you can only edit the values of the independent variables they are calculated from.
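The distinction between the two kinds of variable can be sketched in a few lines of Python. This is only a conceptual model under assumed names (`table`, `functions`, `evaluate`, `recompute`, and the example variables are hypothetical); it does not show how Rave stores data or calls model functions.

```python
# Independent variables: plain stored columns that the user may edit.
INDEPENDENT = ("length", "width")
table = {
    "length": [2.0, 3.0, 4.0],
    "width": [1.0, 1.5, 2.0],
}

# Dependent variables: each is computed by a function of other variables.
# Insertion order matters here: "cost" depends on the dependent variable "area".
functions = {
    "area": lambda row: row["length"] * row["width"],
    "cost": lambda row: 10.0 * row["area"],
}


def evaluate(inputs):
    """Compute every dependent value for one set of independent values."""
    row = dict(inputs)
    for name, func in functions.items():
        row[name] = func(row)
    return row


def recompute(table):
    """Rebuild every dependent column from the independent columns."""
    n_rows = len(table[INDEPENDENT[0]])
    for name in functions:
        table[name] = []
    for i in range(n_rows):
        row = evaluate({name: table[name][i] for name in INDEPENDENT})
        for name in functions:
            table[name].append(row[name])


recompute(table)

# Editing an independent value and recomputing updates the dependent values.
table["length"][0] = 5.0
recompute(table)

# Because dependent variables come from functions, new data can be generated
# at points that are not stored in the table.
new_point = evaluate({"length": 10.0, "width": 0.5})
```

In this sketch the dependent columns are never edited directly; they change only when an independent value changes and the functions are called again.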
Class
Each variable has one of the following classes that describes the information it encodes. When you load a new data set, Rave will attempt to determine the class of each variable. Typically numerical variables will begin as "continuous" and string variables will begin as "string". You can change these classes from the Manage Data Sets window.
Continuous numerical variables can take any value between their min and max values.
Integer numerical variables can take only integer values between their min and max values.
Discrete numerical variables can only take values from a user-defined list of allowable values.
Logical variables can only take 0 or 1 values. You can customize how these values appear on graphs, for example "True" vs "False" or "Yes" vs "No".
Text variables are like Discrete variables, but instead of numerical values they take arbitrary text strings. Operations that perform math cannot be used with text (string) variables.
Note: Currently, all dependent variables must be continuous. This matters little in practice, because the variable class is really only used for independent variables. (If the output of a function is discrete, it does not matter that Rave treats it as continuous.)
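The rules for each class can be summarized as a small validity check. The Python sketch below is an illustration only: the function `is_valid`, its parameters, and the lowercase class labels are assumptions, not part of Rave.

```python
def is_valid(value, var_class, lo=None, hi=None, allowed=None):
    """Return True if value is permitted for the given variable class.

    lo and hi are required for "continuous" and "integer";
    allowed is required for "discrete".
    """
    if var_class == "continuous":
        return lo <= value <= hi
    if var_class == "integer":
        return float(value).is_integer() and lo <= value <= hi
    if var_class == "discrete":
        return value in allowed          # user-defined list of values
    if var_class == "logical":
        return value in (0, 1)
    if var_class == "text":
        return isinstance(value, str)    # any text string is accepted
    raise ValueError(f"unknown class: {var_class}")


# Logical values are stored as 0 or 1; display labels are a separate mapping.
LABELS = {0: "No", 1: "Yes"}

print(is_valid(2.5, "continuous", lo=0, hi=5))
print(is_valid(2.5, "integer", lo=0, hi=5))
print(LABELS[1])
```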
See also: Changing variable classes
Rows
Modifying Data Sets
There are several actions you can take in Rave to modify a data set. It will always be apparent when you are performing such an action. Working with graphs, optimizers, or surrogate models will never alter your data set without asking you first.
When you modify a data set, all graphs that use that data set will be automatically updated to reflect the new data.
Certain modifications will reset row colors, selection state, or visibility state. For example, if you replace the entire data set, all rows will reset to their default state (visible, unselected, default color).
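The reset behavior for a full replacement can be pictured with the following Python sketch. The classes `RowState` and `DataSet` and the `"default"` color label are hypothetical; the sketch models only the replace-everything case described above and makes no claim about which other modifications reset row state.

```python
from dataclasses import dataclass


@dataclass
class RowState:
    """Per-row display state, with the defaults named in the text."""
    visible: bool = True
    selected: bool = False
    color: str = "default"   # placeholder label for the default color


class DataSet:
    def __init__(self, rows):
        self.replace_all(rows)

    def replace_all(self, rows):
        """Replace every row; all row state returns to its default."""
        self.rows = [list(row) for row in rows]
        self.state = [RowState() for _ in self.rows]


ds = DataSet([[1.0, 2.0], [3.0, 4.0]])
ds.state[0].selected = True      # user selects the first row
ds.state[1].color = "red"        # user recolors the second row

ds.replace_all([[5.0, 6.0]])     # replacing the data set resets row state
```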
Some of the most common ways to modify a data set are:
- Add new columns by loading a function
- Add new rows by appending the results of an optimizer
- Change individual data values by editing the main table
- Delete rows by right-clicking the main table header
- Add new rows by appending a design of experiments, or replace all rows in the data set