...
Info: If missing values (∅) were recorded in the Lake, they are preserved when the data is resampled. You can therefore choose a strategy to “Fill missing values”. The appropriate method depends on the type of data to fill, and the logic for choosing it is similar to the logic for choosing a resampling method.
To create a strategy for filling the missing values of a variable:
- Click Transform > Fill missing values in the menu.
- Enter a name for the new variable set.
  Info (Variable set): Fill missing values creates a variable set that contains several new variables. To check the filled variables, go to the third icon on the left bar.
- Enter the variable name (default prefix: CLEAN_).
- Choose the Method.
- Enter a Value. If you chose the Default value method, enter the Default value.
- Select Record set.
- Enter the variables or the set of variables.
- Enter an Index variable to define the order of the rows (records), e.g. select timestamp if the rows are not in chronological order. The ordering matters for some fill methods: Interpolation, Previous and Next.
- Click Save.
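The role of the Index variable can be illustrated with a minimal Python sketch. It is not the product's implementation: the function name `fill_previous_ordered`, the row layout and the `temp` variable are hypothetical, and missing values are represented as `None`. The sketch sorts the records by the index variable before applying a Previous fill.

```python
from typing import Any, Dict, List

Row = Dict[str, Any]


def fill_previous_ordered(rows: List[Row], index: str, variable: str) -> List[Row]:
    """Sort rows by the index variable, then fill gaps with the previous value.

    Hypothetical helper for illustration only. Missing values are None.
    """
    ordered = sorted(rows, key=lambda row: row[index])
    filled: List[Row] = []
    last_seen = None
    for row in ordered:
        value = row[variable]
        if value is None:
            value = last_seen      # reuse the most recent known value
        else:
            last_seen = value      # remember this value for later gaps
        filled.append({**row, variable: value})
    return filled


# Records stored out of chronological order (assumed sample data).
rows = [
    {"timestamp": 3, "temp": 23.0},
    {"timestamp": 2, "temp": None},
    {"timestamp": 1, "temp": 21.0},
]

result = fill_previous_ordered(rows, index="timestamp", variable="temp")
print([r["temp"] for r in result])  # [21.0, 21.0, 23.0]
```

Without the sort, the gap at timestamp 2 would be filled from the row stored just before it (timestamp 3, value 23.0), which is a later measurement rather than the previous one.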
The available methods are:
- Average: the arithmetic mean of the original variable.
- Default value: the value designated in the Default value field.
- Interpolation: the linear interpolation between the value of the previous record and the value of the next record.
- Previous: the value of the previous record.
- Next: the value of the next record.
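The five methods above can be sketched in plain Python to make their differences concrete. This is an illustration under stated assumptions, not the tool's actual code: all function names are hypothetical, missing values are represented as `None`, the values are assumed to be already ordered by the index variable, interpolation is assumed to be linear by row position, and gaps with no known neighbour on the required side are left missing.

```python
from typing import List, Optional

Values = List[Optional[float]]


def fill_average(values: Values) -> Values:
    """Replace gaps with the arithmetic mean of the known values."""
    known = [v for v in values if v is not None]
    if not known:
        return list(values)  # nothing to average, leave the gaps as they are
    mean = sum(known) / len(known)
    return [mean if v is None else v for v in values]


def fill_default(values: Values, default: float) -> Values:
    """Replace gaps with a fixed default value."""
    return [default if v is None else v for v in values]


def fill_previous(values: Values) -> Values:
    """Replace gaps with the last known value before them."""
    out: Values = []
    last = None
    for v in values:
        if v is not None:
            last = v
        out.append(last)
    return out


def fill_next(values: Values) -> Values:
    """Replace gaps with the first known value after them."""
    return list(reversed(fill_previous(list(reversed(values)))))


def fill_interpolation(values: Values) -> Values:
    """Linearly interpolate between the known neighbours of each gap.

    Assumption: rows are equally spaced, so position is used as the x-axis.
    """
    out = list(values)
    known = [i for i, v in enumerate(values) if v is not None]
    for left, right in zip(known, known[1:]):
        step = (values[right] - values[left]) / (right - left)
        for i in range(left + 1, right):
            out[i] = values[left] + step * (i - left)
    return out


values = [10.0, None, 30.0, None, None, 60.0]
print(fill_default(values, 0.0))   # [10.0, 0.0, 30.0, 0.0, 0.0, 60.0]
print(fill_previous(values))       # [10.0, 10.0, 30.0, 30.0, 30.0, 60.0]
print(fill_next(values))           # [10.0, 30.0, 30.0, 60.0, 60.0, 60.0]
print(fill_interpolation(values))  # [10.0, 20.0, 30.0, 40.0, 50.0, 60.0]
```

Average and Default value do not depend on row order, whereas Previous, Next and Interpolation do, which is why the Index variable matters for those three.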
Tip: You can also filter out missing values using “Record Sets”: if a value is missing for one variable, the whole record (row) is removed from the selection, that is, from the record set. Be aware that a rule that removes all rows where “Profit/hr” is below 3000 also removes any rows where “Profit/hr” is missing, and that this affects all variables, because record sets filter entire rows.
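The row-level effect of such a rule can be shown with a small Python sketch. It assumes a rule of the form “Profit/hr” >= 3000, a second hypothetical variable named `Temperature`, and missing values represented as NaN; none of this is taken from the product itself. Any comparison with NaN is false, so rows with a missing “Profit/hr” fail the rule just like rows below the threshold.

```python
import math

# Assumed sample records; NaN stands in for a missing value.
records = [
    {"Profit/hr": 3500.0, "Temperature": 80.0},
    {"Profit/hr": 2500.0, "Temperature": 82.0},      # below the threshold
    {"Profit/hr": math.nan, "Temperature": 85.0},    # Profit/hr is missing
    {"Profit/hr": 4100.0, "Temperature": math.nan},  # Temperature is missing
]

# Hypothetical rule: keep only rows where Profit/hr is at least 3000.
record_set = [row for row in records if row["Profit/hr"] >= 3000]

print(len(record_set))                       # 2
print([r["Profit/hr"] for r in record_set])  # [3500.0, 4100.0]
```

The third row is dropped together with its valid `Temperature` value of 85.0, while the fourth row is kept even though its `Temperature` is missing, because the rule only tests “Profit/hr” but removes entire rows.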