Introduction
What is a join?
- Take tow relations , and create one new relation out of their matches on the join condition.
- That is, for each Record in , find all records in that match the join condition and write as a new row in the output. All the fields of followed by all the fields of .
Notation Conventions
Join Methods
- Simple Nested Loop Join
- Page Nested Loop Join
- Block Nested Loop Join
- Index Nested Loop Join
- Hash Join
- Sort-Merge Join
Hash Join vs. Sort-Merge Join
Sorting pros:
- Good if input already sorted, or need output sorted
- Not sensitive to data skew or bad hash functions
Hashing pros:
- For join: number of passes depends on size of smaller relation
- E.g. if smaller relation is , naive/hybrid hashing is great
- Good if input already hashed, or need output hashed