Join Types
Required input stages determine the type of the join. The Joiner supports the following types of joins:
Inner
Outer
To show an example of each type of join, we’ll join records from the Customer and Purchase source datasets.Â
We’ll set the join condition to customer_id
and customer_id
(with alias set to customer_id2
for the Purchase key).Â
The Customer input includes the following records:
customer_id | state |
1 | CA |
2 | TX |
3 | NY |
4 | FL |
5 | CA |
6 | NV |
The Purchase input includes the following records:
customer_id2 | item | price |
1 | donut | 0.80 |
1 | coffee | 2.05 |
2 | donut | 1.50 |
2 | plate | 0.50 |
3 | tea | 1.99 |
5 | cookie | 0.50 |
Inner Join
Inner join outputs fields that have matching values in both input datasets.
Continuing with our example, when the Joiner performs an inner join on the sample datasets, Customer and Purchase, it results in the following output:
customer _id | state | item | price |
1 | CA | donut | .80 |
1 | CA | coffee | 2.05 |
2 | TX | donut | 1.50 |
2 | TX | plate | .50 |
3 | NY | tea | 1.99 |
5 | CA | cookie | .50 |
Outer Join
Outer join outputs fields from the required inputs and matching fields from the non-required inputs. To configure an outer join, select the required inputs for the join. For example, in the following Joiner, Customer is the required input, which means the output will include all records in the Customer input and matching records in the Purchase input.Â
Continuing with our example, when the Joiner performs an outer join with Customer set as the required input, it results in the following output:
customer _id | state | item | price |
1 | CA | donut | .80 |
1 | CA | coffee | 2.05 |
2 | TX | donut | 1.50 |
2 | TX | plate | .50 |
3 | NY | tea | 1.99 |
4 | FL | Â | Â |
5 | CA | cookie | .50 |
6 | NV | Â | Â |
Likewise, if you select Purchase as the required input, the output includes all records in the Purchase input and matching records from the Customer input.
For outer joins, if you join multiple inputs, you can select multiple required inputs. The join behavior is the same with one required input or many. The required output includes all records from the required inputs and matching records for the non-required inputs.
Created in 2020 by Google Inc.