Page tree
Skip to end of metadata
Go to start of metadata

Summary

Merge columns from the current table with columns from another data node's result table based on one or more key columns. This operator can be applied to any table type. It implements various join operations for relational databases.

Example: Combine two tables using "Intersection"

(Examples for other options see bottom of this page.)

The following tables should be merged:

A01A02
Settings

  

Result

Each row of A01 is compared with each row of A02. If the values of the columns indicated in the first parameter and the third parameter (i.e.,the keys) coincide,
then the respective row of A01 is completed with the values of the matching row of A02 and shown in the result table.

Project-FileConfluence Op Merge Data.gzip

Want to learn more?

 Settings

Merge data from the current table with data from another data node's result table based on one or more key columns. This operation results in a table containing rows with matching keys and the set of columns defined in parameter "How should the data be merged?".

Parameters


ParameterValueOpt.Description
Merge the following columns with columns from the data source

System.String

opt.

A sequence of columns of the current table that serve as keys, i.e., that will be used to match rows of the other table.

The number of columns in this parameter and the type of the columns must equal the number of columns in parameter "Columns used as key in data source 1" and their type.

Data source

System.Int32

-The name of the data node whose result table is to be merged with the current table.
Columns used as key in data source 1

System.String

opt.

A sequence of columns of the result table of the data node specified in parameter "Data source" that serve as keys, i.e., that will be used to match rows of the current table.

The number of columns in this parameter and the type of the columns must equal the number of columns in parameter "Merge the following columns with columns from the data source" and their type.

For further information see also Column references

How should the data be merged?

System.String

  • Intersection (same keys)
  • Add rows in data source 1
  • Add rows in data source 2
  • Add rows in both data sources
  • Append (same table structure required)
  • Keys only used in data source 1?
  • Keys only used in data source 2?
  • Keys only used in one data source
  • Cartesian product
  • Rows with common key in data source 2
  • Rows without common key in data source 2
-

An option that defines which rows and which columns from which source(s) are to be included in the resulting table.

See table "Overview of merging options and comparison to set theory and SQL" below for details.

See table "Examples" at the bottom of this page for examples for each of the options.

Ignore Time?

System.Boolean

-If checked, comparisons will only consider the date and ignore the time of keys that are of type Date.
Generate blank row if nothing has been found

System.Boolean

opt.

If checked, a table containing a single row is returned instead of an empty table.

This row contains NULL in numerical columns and date columns and the text defined in parameter "Text fields content for blank line" in text columns.

Text fields content for blank line

System.String

opt.

The text to be inserted in text columns if parameter "Generate blank row if nothing has been found" is checked.


 Overview of merging options and comparison to set theory and SQL

The following table describes the merge options set in parameter "How should the data be merged?" in detail. These options define which rows and which columns of the two tables are contained in the result table and how they are merged.

Keys play a central role in the merge for all options except "Append (same table structure required)" and "Cartesian product". They are responsible for the selection of rows from the two data sources. The keys are defined in parameters "Merge the following columns with columns from the data source" and "Columns used as key in data source 1". For each data source, more than one column can be defined as key. The number and types of keys set for each data source have to match. Keys given as Integer are treated as Decimals, therefore columns of these types also match.

Note that key values in the data sources are not necessarily unique, i.e., a key can occur in more than one row of a table. In this case, the set theory definition in the table is not adequate for options 1 to 4, 10, and 11. In this case, the result table contains the cartesian product of multiply occurring keys. E.g., if table A contains two occurrences of key A.1 and table B two occurrences of B.1, then the result table contains four rows merging each row with key A.1 with each row with key B.1.

The column "Set theory" in this table describes how the rows are selected according to the keys, i.e., which set of keys is contained in the result table.

The column "Result structure" describes which columns are included in the selected rows (described in column "Set theory"). For better readability, the term "key" is used although a sequence of keys, i.e., multiple columns, can be defined as key. The suffix "fields" refers to all fields of a selected row except those fields that are used as key.

For each option, the table also shows a rough correspondence to an SQL statement,

The current table is referred to by "A" and the table from another data node is referred to by "B".

 How should the data be merged?SQL Set theoryResult column structureDescription
1Intersection (same keys)
SELECT * FROM A INNER JOIN B 
ON A.key = B.key
A.key n B.key (Intersection)

A.key; A.fields; B.fields

The result table contains a row for each key that occurs in both tables. For each row, all columns from both tables are added. Except for the key column, columns with the same name are renamed.
2Add rows in data source 1
SELECT * FROM A LEFT OUTER JOIN B 
ON A.key = B.key
(A.key n B.key) u A.keyA.key; A.fields; B.fieldsThe result table contains a row for each key that occurs in the current table. To each row, all columns from both tables are added. Rows with keys in both tables are merged. Except for the key column, columns with the same name are renamed.
3Add rows in data source 2
SELECT * FROM A RIGHT  OUTER JOIN B 
ON A.key = B.key
 (A.key n B.key) u B.keyA.key; A.fields; B.fieldsThe result table contains a row for each key that occurs in the other data node's table. To each row, all columns from both tables are added. Rows with keys in both tables are merged. Except for the key column, columns with the same name are renamed.
Add rows in both data sources
SELECT * FROM A FULL OUTER JOIN B 
ON A.key = B.key
 A.key u B.key (Union)A.key; A.fields; B.fieldsThe result table contains a row for each key that occurs in either of the two tables. For each row, all columns from both tables are added. Rows with keys in both tables are merged. Except for the key column, columns with the same name are renamed.
5Append (same table structure required)
SELECT * FROM A
 UNION
 SELECT * FROM B
n/a

A.key; A.fields

The result table of the other data node is appended to the current table. Both tables must have the same schema. 
 6Keys only used in data source 1?
SELECT A.key, COUNT(A.key) 
 FROM A 
 WHERE A.key NOT IN (SELECT B.key FROM B) 
 GROUP BY A.key
A.key \ B.key (Difference)A.key; #For all rows of the current table whose key also occurs in the other data node's table, the result table contains the key and its number of occurrences  in the current table.
 7 Keys only used in data source 2?
SELECT B.key, COUNT(B.key)
 FROM B 
 WHERE B.key NOT IN (SELECT A.key FROM A)
 GROUP BY B.key
B.key \ A.key (Difference)B.key; #For all rows of the other data node's table whose key also occurs in the other data node's table, the result table contains the key and its number of occurrences  in the current table.
 8Keys only used in one data source
SELECT A.key, COUNT(A.key) 
 FROM A 
 WHERE A.key NOT IN (SELECT B.key FROM B) 
 GROUP BY A.key
 UNION
 SELECT B.key, COUNT(B.key)
 FROM B 
 WHERE B.key NOT IN (SELECT A.key FROM A)
 GROUP BY B.key

(A.key u B.key) \ (A.key n B.key)

(Symmetric difference)

A.key; #For all rows whose keys occurs only in either of the tables, the result table contains the key and its number of occurrences  in the respective table.
 9Cartesian product
SELECT * FROM A, B
n/aA.key; A.fields; B.key; B.fieldsThe result table contains the cartesian product of the current table and the other data node's table.
10 Rows with common key in data source 2
SELECT * FROM A 
WHERE A.key IN (SELECT B.key FROM B)
A.key n B.keyA.key; A.fieldsThe result table contains all rows of the current table, whose key is contained in the other data node's table.
 11Rows without common key in data source 2
SELECT * FROM A 
WHERE A.key NOT IN (SELECT B.key FROM B)
A.key \ B.keyA.key; A.fieldsThe result table contains all rows of the current table, whose key is not contained in the other data node's table.
More than one key column
SELECT * FROM A LEFT JOIN B ON {A.X <> B.Y AND A.X2 <> B.Y2}
 Examples

Examples


Merge options in the Operator

Calculation methodData nodes to be mergedSettingsResult
Intersection (same keys)

A01:

A02:

 

Each row of A01 is compared with each row of A02. If the values of the columns indicated in the first parameter and the third parameter (i.e.,the keys, columns A of both tables) coincide,  then the respective row of A01 is completed with the values of the matching row of A02 and shown in the result table.


Add rows in data source 1

A01:

A02:

Each row of A01 is compared with each row of A02. If the values of the columns indicated (i.e., the keys, columns A of both tables) coincide, then the respective row of A01 is completed with the values of the matching row of A02 and shown in the result table. All remaining rows of A02 are also added to the result. Fields of these rows whose columns are not contained in A02 are completed with NULL  (denoted by "–", see column "Date").

 

Add rows in data source 2

A01:

A02:

Each row of A01 is compared with each row of A02. If the values of the columns indicated (i.e., the keys, columns A of both tables) coincide, then the respective row of A02 is completed with the values of the matching row of A02 and shown in the result. All remaining rows of A02 are also added to the result. Fields of these rows whose columns are not contained in A02 are completed with NULL (denoted by "–", see column "Value").

Add rows in both data sources

A01:

A02:

Each row of A01 is compared with each row of A02. If the values of the columns indicated (i.e., the keys, columns A of both tables) coincide, then the respective row of A01 is completed with the values of the matching row of A02 and shown in the result table. All remaining rows of A01 and A02 are also added to the result. Fields of these rows whose columns are not contained in the other table are completed with NULL  (denoted by "–", see columns "Date" and "Value").

Append (same table structure required)

A01:

A03:

 

Each row of data node A03 is appended to the table of data node A01.

 

Keys only used in data source 1

A01:

A03:

Each row of A01 is compared with each row of A03 according to their keys (columns A in both tables). Each key which is present only in data node A01 is shown in the result together with its number of occurrences.

 

Keys only used in data source 2

A01:

A03:

Each row of A01 is compared with each row of A03 according to their keys (columns A in both tables). Each key which is present only in data node A03 is shown in the result together with its number of occurrences.

 

Keys only used in one data source

A01:

A03:

Each row of A01 is compared with each row of A03 according to their keys (columns A in both tables). Each key which is either only present in either data node A01 or in data node A03 is shown in the result together with its number of occurrences in column "Number", the table where it was found in column "table", and the row number(s) of the table where the occurrences where found in column "Occurrence".

 

Cartesian product

A01:

A02:

Each row of A02 is appended to each row of A01.

Rows with common key in data source 2

A01:

A02:

Those rows of data node A01 whose values in column A are also present in column A of data node A02 are shown in the result.

Rows without common key in data source 2

A01:

A02:

Those rows of data node A01 whose value in column A is not present in column A of data node A02 are shown in the result.

 

 Troubleshooting
Frequent ProblemsSolutions

Columns are not entered in corresponding order

e.g.

  • node A00: column A = date, column B = number
  • node B00: column A = number, column B = date
  • In this case you can not enter A,B and A,B BUT A,B and B,A

Wrong selection from the list of nodes (this may sound unlikely but it is not)

The tricks explained under Benennung von Knoten (old wiki) can help.
There is a text in the background, even if it does not look like it. e.g. with Fehlerknoten (old wiki) 
There are different data types in the data nodes (e.g. numbers, once imported as text, once imported as number)The Formula operator (rowwise) can help converting the data to a common data type.
TOO MANY colums are marked for comparison. Among them there are columns which do not have common keys (even if e.g. intersection was selected). 

War rot markiert im Wiki: In combination with other operations (e.g. Identifier instances) if it should be used with or without consideration of upper and lower case letters.

 
War rot markiert im Wiki: Column order should be chosen so that the strongest restrictions are located in the beginning. Typically, FROM-TO should be in the beginning of the list of keys.  

Related topics




1 Comment