Summary

This operator flags or filters data records by comparing their values with absolute limits, statistical reference values (average, average with standard deviation, median, percentiles), and special value sets. Changes in value relative to the previous data record can also be evaluated.

Configuration

Which tables are affected by the operation?

Apply to the following column(s)
[Input] The columns to be analyzed. Note: the operation can only be applied to numeric columns. Text, date, and time values are not evaluated.

Settings

Action
[Input] What should happen? Here you decide whether all data records are kept in the result or only the outlier data records.
[Choice]

  • Retain all data records
  • Only retain data records WITH all criteria
  • Only retain data records WITHOUT criteria

The result column contains a flag under all three actions. You can therefore run the analysis first, inspect the result, and choose the appropriate action afterwards.
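The three actions can be pictured as a filter on the result flag. The following is a minimal sketch in Python, not the operator's implementation: the row layout, the column name `flag`, and the action identifiers are all hypothetical.

```python
from typing import Dict, List

Row = Dict[str, float]


def apply_action(rows: List[Row], action: str, flag_column: str = "flag") -> List[Row]:
    """Return the rows kept by the chosen action (all names are illustrative)."""
    if action == "retain_all":
        return list(rows)
    if action == "only_with_criteria":
        return [r for r in rows if r[flag_column] == 1]
    if action == "only_without_criteria":
        return [r for r in rows if r[flag_column] != 1]
    raise ValueError(f"unknown action: {action}")


rows = [{"B": 3, "flag": 1}, {"B": 7, "flag": 0}, {"B": 4, "flag": 1}]
flagged_only = apply_action(rows, "only_with_criteria")
```

Because the flag is written in every case, running with "retain all" first and filtering afterwards gives the same rows as choosing one of the two restrictive actions directly.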

Value less than

[Input] Enter an absolute value. Each value in the selected columns is compared with it; the criterion is satisfied if the column value is smaller than the entered value.

Value greater than
[Input] Enter an absolute value. Each value in the selected columns is compared with it; the criterion is satisfied if the column value is greater than the entered value.
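As an illustration of the two absolute comparisons, here is a small Python sketch. It assumes strict comparisons ("smaller than", "greater than") and, following the Notes section, combines both limits with AND when both are set. The function and parameter names are hypothetical.

```python
from typing import List, Optional


def flag_absolute(values: List[float],
                  less_than: Optional[float] = None,
                  greater_than: Optional[float] = None) -> List[int]:
    """Flag a value with 1 if every configured comparison holds, else 0."""
    flags = []
    for v in values:
        checks = []
        if less_than is not None:
            checks.append(v < less_than)
        if greater_than is not None:
            checks.append(v > greater_than)
        # With no limit configured, nothing is flagged.
        flags.append(1 if checks and all(checks) else 0)
    return flags


below_five = flag_absolute([3, 5, 7], less_than=5)
```

Note that a value equal to the limit is not flagged in this sketch.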

Valid value set

[Input] Enter special values here (separated by semicolons and/or spaces) that are checked separately. If a match is found, the result column is flagged with 1.

Invalid value set (Error Same as above)
[Input] Enter special values here (separated by semicolons and/or spaces) that are checked separately. If a match is found, the result column is flagged with 1.
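The input format for value sets (semicolons and/or spaces as separators) can be sketched as follows. This is an illustrative Python example with hypothetical function names. It assumes the entries are numeric, since the operator only analyzes numeric columns.

```python
import re
from typing import List, Set


def parse_value_set(text: str) -> Set[float]:
    """Split on semicolons and/or whitespace and convert the tokens to numbers."""
    tokens = [t for t in re.split(r"[;\s]+", text.strip()) if t]
    return {float(t) for t in tokens}


def flag_matches(values: List[float], value_set_text: str) -> List[int]:
    """Flag each value with 1 if it occurs in the entered value set."""
    wanted = parse_value_set(value_set_text)
    return [1 if v in wanted else 0 for v in values]


flags = flag_matches([0, 5, 10, 100], "0;10 100")
```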

Statistic methods
[Input] Select the statistical method for the analysis. The comparison values for the selected method are entered as percentages in the fields 'Value less than xx% of the statistic method' and 'Value greater than xx% of the statistic method'.


[Choice]

  • Average
  • Mean value with standard deviation
  • Median
  • Percentile 1
  • Percentile 5
  • Percentile 95
  • Percentile 99
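To make the reference values concrete, the following Python sketch computes them for a single column with the standard `statistics` module. The percentile interpolation method, the sample (rather than population) standard deviation, and all names are assumptions. This page does not state how the operator defines percentiles or how it combines mean and standard deviation, so the sketch simply reports both.

```python
import statistics
from typing import Dict, List


def reference_values(column: List[float]) -> Dict[str, float]:
    """Compute one reference value per statistic for a single numeric column."""
    # quantiles(n=100) returns the 99 cut points P1..P99, so index k-1 holds percentile k.
    cuts = statistics.quantiles(column, n=100, method="inclusive")
    return {
        "average": statistics.fmean(column),
        "std_dev": statistics.stdev(column),  # sample standard deviation (assumption)
        "median": statistics.median(column),
        "percentile_1": cuts[0],
        "percentile_5": cuts[4],
        "percentile_95": cuts[94],
        "percentile_99": cuts[98],
    }


ref = reference_values(list(range(101)))  # the values 0, 1, ..., 100
```

For the values 0 to 100, the average and median are 50 and percentile k equals k, which makes the sketch easy to check by hand.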

Value less than xx% of the statistic method

[Input] Enter the percentage value to be used with the selected statistical method.

Value greater than xx% of the statistic method
[Input] Enter the percentage value to be used with the selected statistical method.
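One way to read the two percentage fields is as a relative distance from the reference value, which matches the wording of the examples further down ("20% smaller than the median"). The Python sketch below uses that reading. It is an assumption, not a confirmed description of the operator, and the function names are hypothetical. It also assumes a positive reference value; with a negative one the limits would swap sides.

```python
from typing import List


def below_statistic_by(values: List[float], statistic: float, percent: float) -> List[int]:
    """Flag values that lie more than `percent` % below the reference value."""
    limit = statistic * (1 - percent / 100)
    return [1 if v < limit else 0 for v in values]


def above_statistic_by(values: List[float], statistic: float, percent: float) -> List[int]:
    """Flag values that lie more than `percent` % above the reference value."""
    limit = statistic * (1 + percent / 100)
    return [1 if v > limit else 0 for v in values]


values = [70, 85, 100, 115, 130]
low_flags = below_statistic_by(values, statistic=100, percent=20)
high_flags = above_statistic_by(values, statistic=100, percent=20)
```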


Value changed in comparison to the precursor

[Input] Select the method used to evaluate a value against the preceding data record.
[Choice]

Increase (to predecessor) by more than ...

If the new data set value has increased by more than the entered value, then this criterion applies.

Increase (to the predecessor) by no more than ...

If the new data set value has increased by no more than the entered value, then this criterion applies.

Increase (to the predecessor) by more than ... percent

If the new data set value has increased by more than the entered percentage value, this criterion applies.

Increase (to the predecessor) by no more than ... percent

If the new data set value has increased by no more than the entered percentage value, this criterion applies.

Decrease (to the predecessor) by more than ...

If the new data set value has decreased by more than the entered value, then this criterion applies.

Decrease (to the predecessor) by no more than ...

If the new data set value has decreased by no more than the entered value, then this criterion applies.

Decrease (to the predecessor) by more than ... percent

If the new data set value has decreased by more than the entered percentage value, this criterion applies.

Decrease (to the predecessor) by no more than ... percent

If the new data set value has decreased by no more than the entered percentage value, this criterion applies.
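The eight methods differ only in direction (increase or decrease), comparison (more than or no more than), and unit (absolute or percent). The Python sketch below captures that structure. All names are hypothetical, and two points are assumptions: "no more than" is taken to require an actual change in the stated direction, and a zero predecessor is simply not flagged in percent mode. The operator's own handling of zero predecessors is described in the Notes.

```python
from typing import List


def flag_changes(values: List[float], direction: str, comparison: str,
                 threshold: float, percent: bool = False) -> List[int]:
    """Flag each record by its change relative to the preceding record."""
    if direction not in ("increase", "decrease"):
        raise ValueError(f"unknown direction: {direction}")
    if comparison not in ("more_than", "no_more_than"):
        raise ValueError(f"unknown comparison: {comparison}")
    # The first record has no predecessor and is never flagged.
    flags = [0] if values else []
    for prev, cur in zip(values, values[1:]):
        change = cur - prev if direction == "increase" else prev - cur
        if percent:
            if prev == 0:
                flags.append(0)  # simplification: no percentage without a base
                continue
            change = change / abs(prev) * 100
        if comparison == "more_than":
            hit = change > threshold
        else:
            hit = 0 < change <= threshold
        flags.append(1 if hit else 0)
    return flags


series = [10, 12, 20, 18, 18]
jumps = flag_changes(series, "increase", "more_than", 5)
```

The records are processed in the given order, without sorting, as stated in the Notes.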

Example 1

TIS: Example1: Predecessor

Value change in relation to the predecessor

This example shows the values calculated in relation to the predecessor. The left table was calculated with 0-values allowed; the right column was calculated without 0-values. The differences are marked in red.

If 'Ignore values with zero as previous value' is not checked, the result of 'Value change in relation to the predecessor' corresponds to the column 'Diff with 0 values'. If values with 0 as predecessor are ignored, the difference column is shown as '0-W'.




Value change against previous value

[Input] Enter an absolute value for the method selected under 'Value changed in comparison to the precursor'.

Ignore values with zero as previous value
[Input] Select whether values whose predecessor is zero should be ignored.
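The effect of this option on percentage changes can be sketched in Python. According to the Notes, the operator replaces a zero predecessor with a constant to avoid division by zero. The value of that constant is not documented here, so `EPSILON` below is a hypothetical stand-in, as are the function and parameter names.

```python
from typing import Optional

EPSILON = 1e-9  # hypothetical stand-in for the operator's undocumented constant


def percent_change(prev: float, cur: float, ignore_zero_predecessor: bool) -> Optional[float]:
    """Percentage change from prev to cur; None means the record is skipped."""
    if prev == 0:
        if ignore_zero_predecessor:
            return None
        # A tiny divisor avoids a division-by-zero error but yields a huge result.
        return (cur - prev) / EPSILON * 100
    return (cur - prev) / abs(prev) * 100


normal = percent_change(10, 15, ignore_zero_predecessor=False)
skipped = percent_change(0, 5, ignore_zero_predecessor=True)
huge = percent_change(0, 5, ignore_zero_predecessor=False)
```

With the option off, a step from 0 to 5 produces a change in the hundreds of billions of percent in this sketch, so almost any threshold flags it. With the option on, the record is left out of the comparison.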

Result Column
[Input] Name of the result column

Notes

  • This operator can analyze multiple columns in one run.
  • All calculations and comparisons are done separately for each column; for example, the average is calculated per column, not across multiple columns.
  • All comparisons are connected with AND: all conditions have to be met.
  • For more detailed information, see: Statistical methods
  • Value changed in comparison to the precursor: The data records are used in the order given; no sorting is involved. The first data record only serves as a predecessor, so that it is not counted as an outlier.
  • Problems with 0-values:
  • 0-values in the data can produce a large change in value, so that normal values are flagged as outliers. The setting 'Ignore values with zero as previous value' exists for this case.
  • For percentage changes with a 0-value as predecessor, a constant is used to prevent division-by-zero errors. This produces values in the billions, which represent an almost infinite slope.
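The first notes (per-column calculation, AND combination) can be illustrated with a short Python sketch. The table layout and names are hypothetical. How the operator writes the flags of several analyzed columns into its result column is not described on this page, so the sketch keeps one flag list per column.

```python
import statistics
from typing import Callable, Dict, List


def flag_column(values: List[float], criteria: List[Callable[[float], bool]]) -> List[int]:
    """Flag a value with 1 only if ALL criteria hold (AND combination)."""
    return [1 if criteria and all(c(v) for c in criteria) else 0 for v in values]


table: Dict[str, List[float]] = {"B": [1, 2, 3, 50], "C": [100, 110, 90, 105]}
flags: Dict[str, List[int]] = {}
for name, column in table.items():
    avg = statistics.fmean(column)  # computed per column, never across columns
    criteria = [lambda v, a=avg: v > a, lambda v: v > 10]
    flags[name] = flag_column(column, criteria)
```

Column B has an average of 14 and column C an average of 101.25, so the same pair of criteria selects different rows in each column.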

Want to learn more?

Examples

Example: Value smaller than 5

Start

  • In the value column of data node A01, find the data records whose value is smaller than 5:


Add operators and choose the right settings

  • Add the operation 'Outlier search' to data node A01.
  • Under 'Apply to the following column(s)', enter 'B'.
  • Under 'Value less than', enter '5'.


Result

  • Each data record is compared with the entered value; if the value of column B is less than 5, column C is set to 1.


Example: Value greater than 5

Start

  • In the value column of data node A01, find the data records whose value is larger than 5:


Add operators and choose the right settings

  • Add the operation 'Outlier search' to data node A01.
  • Under 'Apply to the following column(s)', enter 'B'.
  • Under 'Value greater than', enter '5'.


Result

  • Each data record is compared with the entered value; if the value of column B is greater than 5, column C is set to 1.


Example: Median (smaller/greater)

Start

  • In the value column of data node A01, find the data records that are 20% smaller than the median:


Add operators and choose the right settings

  • Add the operation 'Outlier search' to data node A01.
  • Under 'Apply to the following column(s)', enter 'B'.
  • Select 'Median' as the statistical method.
  • Under 'Value less than xx% of the statistic method', enter '20'.


Result

  • The median of column B is calculated. For each row, the difference between the value in column B and the median is calculated. If the value lies more than 20 percent below the median, column C is set to 1.


Example: Average (smaller/greater)

Start

  • In the value column of data node A01, find all values that are smaller than the average by at least 21%:


Add operators and choose the right settings

  • Add the operation 'Outlier search' to data node A01.
  • Under 'Apply to the following column(s)', enter 'B'.
  • Select 'Average' as the statistical method.
  • Under 'Value less than xx% of the statistic method', enter '21'.


Result

  • The average of column B is calculated. For each row, the difference between the value in column B and the average is calculated. If the value lies at least 21 percent below the average, column C is set to 1.


Example: Average with Standard deviation (smaller)

Start

  • In the value column of data node A01, find all values that are smaller than the average with standard deviation by at least 20%:


Add operators and choose the right settings

  • Add the operation 'Outlier search' to data node A01.
  • Under 'Apply to the following column(s)', enter 'B'.
  • Select 'Mean value with standard deviation' as the statistical method.
  • Under 'Value less than xx% of the statistic method', enter '21'.


Result

  • First, the average with standard deviation is calculated from column B. The values are then compared with it; if the deviation is too large, the flag in column C is set to 1.


Example: Percentile

Start

  • In column B of data node A04, the data records whose values are 1 percent smaller than percentile 5 are to be marked.



Add operators and choose the right settings

  • Add the operation 'Outlier search' to data node A04.
  • Under 'Apply to the following column(s)', enter 'B'.
  • As the action, select "Only retain data records WITH all criteria".
  • Select 'Percentile 5' as the statistical method.
  • Under 'Value less than xx% of the statistic method', enter '1'.

Result

  • In column B of data node A04, the data records whose values are 1 percent smaller than percentile 5 are marked.


Example: Valid values

Start

  • In the value column of data node A03, the data records with the values 0, 10, or 100 are to be flagged.


Add operators and choose the right settings

  • Add the operation 'Outlier search' to data node A03.
  • Under 'Apply to the following column(s)', enter 'B'.
  • Under 'Valid value set', enter '0;10;100'.
  • Select 'Percentile 5' as the statistical method.
  • Under 'Value less than xx% of the statistic method', enter '1'.


Result

  • The entered values are searched for in column B; if a match is found, column C is flagged.

Example: Increase to predecessor by more than...

Start

  • In column B of data node A02, the data records whose value increases by more than 5 in relation to the predecessor are to be marked:


Add operators and choose the right settings

  • Add the operation 'Outlier search' to data node A02.
  • Under 'Apply to the following column(s)', enter 'B'.
  • Under 'Value changed in comparison to the precursor', select 'Increase (to predecessor) by more than ...'.
  • Under 'Value change against previous value', enter '5'.

Result

All values that have increased by more than 5 compared with their predecessor are flagged.

  • In the left column, 0-values are ignored.
  • In the right column, 0-values are not ignored.

 

Troubleshooting

Nothing known so far.

Related topics

  • ...
