Regression analysis 4.0
Summary
This operator performs a multiple linear regression analysis.
Method
Regression analysis is a statistical process for estimating the relationships among variables. Specifically, it estimates how the value of a criterion variable (dependent variable) changes when a predictor (independent variable) is varied. The estimation target is a function of the independent variables called the regression function. For more information, see, for example, the Wikipedia article on regression analysis.
Source: https://en.wikipedia.org/wiki/Regression_analysis#/media/File:Linear_regression.svg
The operation "Regression Analysis" produces estimates for the coefficients of the independent variables and an evaluation of the regression in the form of a string. Additionally, it can display various statistical measures of the regression and plot the data.
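To make the idea of coefficient estimates concrete, the following is a minimal sketch of ordinary least squares for several predictors, using only the Python standard library. It is not the operator's implementation; the function names (`solve_linear_system`, `fit_linear_regression`, `r_squared`) and the sample data are assumptions made for illustration.

```python
def solve_linear_system(a, b):
    """Solve a * x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix [a | b], copied so the inputs stay untouched.
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system: predictors are linearly dependent")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_linear_regression(rows, y):
    """Return [intercept, b1, ..., bk] that minimise the squared residuals."""
    # Prepend a constant 1 to every observation to model the intercept.
    design = [[1.0] + list(r) for r in rows]
    k = len(design[0])
    # Normal equations: (X^T X) beta = X^T y
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * yi for d, yi in zip(design, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def r_squared(rows, y, coef):
    """Share of the variance of y explained by the fitted regression function."""
    pred = [coef[0] + sum(c * v for c, v in zip(coef[1:], r)) for r in rows]
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - pi) ** 2 for yi, pi in zip(y, pred))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


# Hypothetical data generated from y = 1 + 2*x1 + 3*x2 (no noise).
rows = [(1, 2), (2, 1), (3, 4), (4, 3), (5, 5)]
y = [1 + 2 * x1 + 3 * x2 for x1, x2 in rows]

coef = fit_linear_regression(rows, y)
print("coefficients:", coef)
print("R squared:", r_squared(rows, y, coef))
```

Because the sample data follow the regression function exactly, the estimated coefficients recover the intercept and both slopes up to floating-point rounding. Solving the normal equations directly keeps the sketch short; numerical libraries typically use more stable decompositions.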
Configuration
Input settings of existing table
Settings
Examples
Situation | A company expects a linear relationship between the number of employees and sales. It therefore records the number of employees and the sales figures in different regions. The assumption is then examined by running a linear regression analysis. |
---|---|
Settings | In this example, we chose the following settings: |
Result | |
Project File | - |
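The situation described above can be sketched outside the tool as a simple regression of sales on the number of employees. This is only an illustration under stated assumptions: the regional figures below are hypothetical, and `simple_linear_regression` is a helper written for this sketch, not part of the operator.

```python
def simple_linear_regression(x, y):
    """Fit y = intercept + slope * x by least squares and return
    (slope, intercept, r_squared)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        # Matches the troubleshooting case of a predictor that does not vary.
        raise ValueError("the independent variable does not vary")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return slope, intercept, 1.0 - ss_res / ss_tot


# Hypothetical figures for five regions (not taken from the documentation).
employees = [10, 20, 30, 40, 50]
sales = [120, 190, 310, 380, 500]

slope, intercept, r2 = simple_linear_regression(employees, sales)
print(f"sales = {intercept:.1f} + {slope:.1f} * employees, R squared = {r2:.3f}")
```

For these made-up numbers the fit gives a slope of 9.5 and an intercept of 15, with R squared just above 0.99, which would support the assumed linear relationship.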
Troubleshooting
Problem | Frequent Causes | Solutions |
---|---|---|
Error message or "n. def." | 1. There are too few values to estimate this figure. | Create larger groups or categories (that is, differentiate less by identifier categories). |
Error message or "n. def." | 2. An independent variable takes only one value and does not vary, so no calculation is possible. | Do not use this independent variable; variation is a requirement for regression analysis. |
Error message or "n. def." | 3. Two or more variables are linearly dependent. E.g., using A, B, and TOTAL as independent variables makes it impossible to distinguish the effect of each individual variable. | Do not use these variables together; use only variables that are linearly independent of each other. |
Error message | If the option "Select all numeric columns" is set, the semantics of each column need to be set to "Number". | Use the operator Format columns and change the semantics. |
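Causes 2 and 3 in the table above can be checked on the data before running a regression. The following is a minimal sketch in plain Python, assuming the predictors are available as a dictionary of equally long columns; the helper names (`constant_columns`, `matrix_rank`, `is_linearly_dependent`) and the sample values are hypothetical and not part of the operator.

```python
def constant_columns(columns):
    """Names of predictors that take only a single value."""
    return [name for name, values in columns.items() if len(set(values)) <= 1]


def matrix_rank(rows, tol=1e-9):
    """Rank of a matrix via Gaussian elimination with partial pivoting."""
    m = [list(map(float, r)) for r in rows]
    n_rows, n_cols = len(m), len(m[0])
    rank = 0
    for col in range(n_cols):
        pivot = max(range(rank, n_rows), key=lambda r: abs(m[r][col]), default=None)
        if pivot is None or abs(m[pivot][col]) < tol:
            continue  # no usable pivot: this column depends on earlier ones
        m[rank], m[pivot] = m[pivot], m[rank]
        for r in range(rank + 1, n_rows):
            factor = m[r][col] / m[rank][col]
            for c in range(col, n_cols):
                m[r][c] -= factor * m[rank][c]
        rank += 1
        if rank == n_rows:
            break
    return rank


def is_linearly_dependent(columns):
    """True if the predictors, together with an intercept, are rank deficient."""
    names = list(columns)
    n_obs = len(columns[names[0]])
    rows = [[1.0] + [columns[n][i] for n in names] for i in range(n_obs)]
    return matrix_rank(rows) < len(names) + 1


# Hypothetical data: TOTAL is the sum of A and B, REGION_CODE never varies.
a = [1, 2, 3, 4]
b = [2, 1, 5, 3]
total = [x + y for x, y in zip(a, b)]

print(constant_columns({"A": a, "REGION_CODE": [7, 7, 7, 7]}))
print(is_linearly_dependent({"A": a, "B": b, "TOTAL": total}))
print(is_linearly_dependent({"A": a, "B": b}))
```

The rank check includes an intercept column, so a constant predictor is also reported as linearly dependent. The tolerance `tol` is an arbitrary choice for this sketch and may need adjusting for data on very different scales.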