Univariate Analysis

Statistical Tests

Statistical Tests are univariate classification and regression tests. The available tests are the T test, Spearman, Pearson, and Kruskal-Wallis ANOVA.

Model Function

The goal of a statistical test can be symbolic or numerical. The T test and Kruskal-Wallis ANOVA support a symbolic goal. The Spearman and Pearson tests support a numerical goal.
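The rule above can be sketched as a small lookup. This is only an illustration, not the tool's own logic: the function name `supported_tests` is hypothetical, missing values are assumed to be `None`, and the two-class restriction for the T test is taken from the Welch's T Test description further down.

```python
def supported_tests(goal_values):
    """Return the tests whose goal requirements match the given goal values."""
    # Assumption: missing goal values are encoded as None.
    values = [v for v in goal_values if v is not None]
    if not values:
        return []
    numeric = all(
        isinstance(v, (int, float)) and not isinstance(v, bool) for v in values
    )
    if numeric:
        # Numerical goal: correlation tests.
        return ["Spearman", "Pearson"]
    n_classes = len(set(values))
    if n_classes == 2:
        # Symbolic goal with exactly two classes.
        return ["T test", "Kruskal-Wallis ANOVA"]
    if n_classes > 2:
        # Symbolic goal with more than two classes.
        return ["Kruskal-Wallis ANOVA"]
    return []  # a single class cannot be discriminated


print(supported_tests(["pass", "fail", "pass"]))
print(supported_tests([0.5, 1.2, 3.4]))
```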

To launch this model tool, select Models > Statistical Test from the menu.

Create Statistical Test

  1. Select a Test from the list.
  2. Select a Record set from the list.
  3. Select a Variable Set, if required.
  4. Select variable(s) from the list for the Candidates.
  5. Select a variable for the Goal.
  6. Click Save.

Empty and constant variables

The statistical test ignores empty or constant variables.
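As a rough illustration of this filtering, the sketch below keeps only variables with at least two distinct non-missing values. How the tool itself detects empty or constant variables is not documented here; the function name `usable_variables` and the use of `None` for missing values are assumptions.

```python
def usable_variables(columns):
    """Keep variables that have at least two distinct non-missing values."""
    kept = {}
    for name, values in columns.items():
        # Assumption: a missing value is encoded as None.
        present = [v for v in values if v is not None]
        # Empty: nothing present. Constant: only one distinct value present.
        if len(set(present)) >= 2:
            kept[name] = values
    return kept


columns = {
    "temperature": [20.1, 21.4, 19.8],
    "sensor_off": [None, None, None],  # empty variable
    "line_id": [7, 7, 7],              # constant variable
}
print(list(usable_variables(columns)))  # ['temperature']
```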


Statistical Tests: Tool Tips and Graphs

Welch's T Test

The T test evaluates how well each Candidate variable discriminates between two different Goal variable behaviors.
The t value is based on the distance between the Average values of the two Candidate sub-distributions, relative to their variability.

-Candidates: must be numeric
-Goal: must be symbolic with two different classes/symbols. For more than two different classes, see Kruskal-Wallis ANOVA. 
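For reference, the textbook form of Welch's t statistic can be sketched in a few lines. This is a minimal illustration of the standard formula, not the tool's implementation; the function name `welch_t` and the sample data are made up for the example.

```python
import math
from statistics import mean, variance


def welch_t(group_a, group_b):
    """Return Welch's t statistic and the Welch-Satterthwaite degrees of freedom."""
    n_a, n_b = len(group_a), len(group_b)
    # Squared standard error of each group mean (sample variance / size).
    # Note: if both groups have zero variance, the division below fails.
    se2_a = variance(group_a) / n_a
    se2_b = variance(group_b) / n_b
    t = (mean(group_a) - mean(group_b)) / math.sqrt(se2_a + se2_b)
    df = (se2_a + se2_b) ** 2 / (
        se2_a ** 2 / (n_a - 1) + se2_b ** 2 / (n_b - 1)
    )
    return t, df


# Split one numeric Candidate by the two classes of a symbolic Goal.
candidate = [5.1, 4.9, 5.4, 6.8, 7.1, 6.5]
goal = ["A", "A", "A", "B", "B", "B"]
group_a = [c for c, g in zip(candidate, goal) if g == "A"]
group_b = [c for c, g in zip(candidate, goal) if g == "B"]
t, df = welch_t(group_a, group_b)
print(round(t, 3), round(df, 2))
```

A large absolute t value indicates that the two Average values are far apart compared with the spread inside each class.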

Create a series of Histograms with:

X - one Candidate variable

C - the Goal variable

RS - same record set as selected in the editor window

Note: the statistics are displayed as a line on the graphs and in a table below them.

Spearman

Spearman evaluates the Monotonic (not necessarily linear) correlation between each Candidate variable and the Goal variable.
The Rho value is based on the gaps between the Rank values of the Candidate variable and those of the Goal variable.
A monotonic relationship between a Candidate variable and the Goal variable means that either:
(1) as the value of one variable increases, so does the value of the other variable;
or (2) as the value of one variable increases, the value of the other variable decreases.

-Candidates: must be numeric
-Goal: must be numeric
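A minimal sketch of the standard Spearman computation follows: both variables are replaced by their ranks (ties get the average rank), and the correlation of those ranks is returned. Without ties this equals the classic rank-gap formula 1 - 6 * sum(d^2) / (n * (n^2 - 1)). The function names are hypothetical and the tool's exact tie handling is not documented here.

```python
import math


def average_ranks(values):
    """Rank values from 1 to n, giving tied values the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # positions are 0-based, ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's Rho: the linear correlation of the rank values."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = math.sqrt(sum((a - mx) ** 2 for a in rx))
    sy = math.sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)


# A cubic relationship is monotonic but not linear: Rho is still (about) 1.
x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]
print(round(spearman_rho(x, y), 6))  # 1.0
```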

Create a series of Scatter Plots with:

X - one Candidate variable

Y - the Goal variable

RS - same record set as selected in the editor window

Pearson

Pearson evaluates the Linear correlation between each Candidate variable and the Goal variable.
The Rho value is based on the Covariance (= deviations from Average values) of the Candidate variable and the Goal variable.
This is the same correlation result as provided by the Dendrogram.

-Candidates: must be numeric
-Goal: must be numeric
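The textbook Pearson coefficient can be sketched as the covariance of the two variables divided by the product of their standard deviations. This is an illustration only; the function name `pearson_r` and the sample data are assumptions, not part of the tool.

```python
import math


def pearson_r(x, y):
    """Pearson correlation: covariance scaled by both standard deviations."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    # Sum of products of deviations from the Average values.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


x = [1, 2, 3, 4, 5]
print(pearson_r(x, [2, 4, 6, 8, 10]))              # perfectly linear: 1.0
print(round(pearson_r(x, [v ** 3 for v in x]), 3))  # monotonic but curved: below 1
```

The second call shows why a curved but monotonic relationship scores lower with Pearson than with a rank-based measure.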

Create a series of Scatter Plots with:

X - one Candidate variable

Y - the Goal variable

RS - same record set as selected in the editor window

Kruskal-Wallis ANOVA

Kruskal-Wallis evaluates how well each Candidate variable discriminates between two (or more) different Goal behaviors.
The H value is computed from the Rank values of the Candidate variable within the two (or more) Candidate sub-distributions; a large H is commonly read as a large distance between their Median values.

-Candidates: must be numeric
-Goal: must be symbolic with two (or more) different values/symbols
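A minimal sketch of the standard Kruskal-Wallis H statistic is shown below: all Candidate values are ranked together, the ranks are summed per Goal class, and the usual tie correction is applied. The function name `kruskal_wallis_h` and the sample data are hypothetical, and the tool's own computation may differ in detail.

```python
from collections import defaultdict


def kruskal_wallis_h(candidate, goal):
    """Kruskal-Wallis H for one numeric Candidate grouped by a symbolic Goal."""
    n = len(candidate)
    order = sorted(range(n), key=lambda i: candidate[i])
    ranks = [0.0] * n
    tie_term = 0
    i = 0
    while i < n:
        j = i
        # Extend j over the run of tied values starting at position i.
        while j + 1 < n and candidate[order[j + 1]] == candidate[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average 1-based rank of the tied run
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        t = j - i + 1
        tie_term += t ** 3 - t
        i = j + 1

    rank_sums = defaultdict(float)
    sizes = defaultdict(int)
    for r, g in zip(ranks, goal):
        rank_sums[g] += r
        sizes[g] += 1

    h = 12 / (n * (n + 1)) * sum(
        rank_sums[g] ** 2 / sizes[g] for g in sizes
    ) - 3 * (n + 1)
    # Tie correction; it is zero only if every value is identical
    # (a constant variable, which the tool ignores anyway).
    correction = 1 - tie_term / (n ** 3 - n)
    return h / correction


candidate = [1, 2, 3, 4, 5, 6, 7, 8, 9]
goal = ["a", "a", "a", "b", "b", "b", "c", "c", "c"]
print(round(kruskal_wallis_h(candidate, goal), 3))  # 7.2
```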

Create a series of Box Plots with:

X - the Goal variable

Y - one Candidate variable

RS - same record set as selected in the editor window

Note: the statistics are displayed as a line on the graphs and in a table below them.

In the Results tab, you can jump from the table to any graph. Note that the graphs are ordered by statistical importance. In More Actions, you can create a Variable Set from the top variables on the Pareto.

In the Graph tab, there is one graph per variable. Use the control menu below the chart to modify the zoom level and apply rulers.