You might never have heard of a self join but you might need one. Occasionally, tables in a database are structured so that it makes sense to join a copy of a table to itself. Let’s imagine we work for a bank and want to understand our members’ behavior better. We want to know which credit card they are most likely to switch to after closing an American Express card.
Let’s say all the information about card openings and closures are in a single table like this:
If we want to track cards that open on the same day as a card was closed, we’ll need to join this table to itself. Here’s what that looks like:
We’ll join to make sure accounts match one another, and then join a close date to an open date.
Once you do that, you effectively have the same fields listed twice in your data pane, but they represent different parts of your data (notice I renamed the tables to “Closures” and “Opens”).
That gives us the freedom to filter on the Credit Card Type from the “Closures” table and see the number of cards opened by Credit Card Type from the “Opens” table.