Web Analytics Made Easy - Statcounter

What is a Bridge Table in Data Warehousing?

In data warehousing, a Bridge Table (sometimes called a Junction Table or Associative Table) is a specialized type of dimension table used to resolve many-to-many (M:M) relationships between a fact table and a dimension table, or between two dimension tables.

In a standard star schema, the relationship between a fact table and a dimension table is typically one-to-many (1:M), where one dimension member can be associated with many fact records. However, some real-world scenarios inherently involve many-to-many relationships that cannot be directly modeled with a simple foreign key in the fact table. This is where bridge tables come in.

The Problem a Bridge Table Solves

Consider these common many-to-many scenarios in data warehousing:

  1. A Customer belongs to multiple Demographic Groups:
    • A customer (DimCustomer) can be a “student,” a “loyalty member,” and a “senior citizen” simultaneously.
    • A demographic group (DimDemographicGroup) can have many customers.
    • If you want to analyze sales (FactSales) by DemographicGroup, a direct link from FactSales to DimDemographicGroup doesn’t work through DimCustomer because DimCustomer itself has a M:M relationship with DimDemographicGroup.
  2. A Prescription involves multiple Diagnoses:
    • A single prescription (FactPrescription) can be for multiple diagnoses (DimDiagnosis).
    • A diagnosis can be associated with many prescriptions.
  3. A Project is assigned to multiple Employees:
    • A project (FactProjectHours) can have multiple employees (DimEmployee).
    • An employee can work on multiple projects.

In these cases, if you directly linked the DimDemographicGroup (or DimDiagnosis, DimEmployee) to the FactSales (or FactPrescription, FactProjectHours) table, you’d either have to denormalize the fact table to an extreme degree (violating the grain) or miss out on important analytical paths.

How a Bridge Table Works

A bridge table resolves the many-to-many relationship by breaking it down into two one-to-many relationships. It acts as an intermediary table that contains the keys from both entities involved in the many-to-many relationship.3

Structure of a Bridge Table:

A typical bridge table consists of:

  1. A primary key: Often a simple surrogate key (e.g., CustomerDemographicBridgeKey).
  2. Foreign key 1: Links to the “main” dimension (e.g., CustomerKey from DimCustomer).
  3. Foreign key 2: Links to the “secondary” or “attribute” dimension that has the many-to-many relationship (e.g., DemographicGroupKey from DimDemographicGroup).
  4. Optional attributes:
    • Weighting Factor: If the relationship is proportional (e.g., a customer’s total sales should be proportionally allocated to their multiple demographic groups).
    • Effective/End Dates: If the many-to-many relationship changes over time (e.g., a customer’s group membership changes), similar to Slowly Changing Dimension Type 2 (SCD2) logic, but applied to the relationship itself.

Example: Customer and Demographic Groups

Let’s illustrate with the FactSales, DimCustomer, and DimDemographicGroup example:

  1. DimCustomer Table:
    • CustomerKey (PK)
    • CustomerName
    • Email
    • … (other customer attributes)
  2. DimDemographicGroup Table:
    • DemographicGroupKey (PK)
    • GroupName (e.g., “Student”, “Loyalty Member”, “Senior Citizen”)
    • GroupType
    • … (other group attributes)
  3. FactSales Table:
    • SaleKey (PK)
    • CustomerKey (FK to DimCustomer)
    • DateKey (FK)
    • ProductKey (FK)
    • SalesAmount
    • Quantity
    • … (other sales measures)
  4. The Bridge Table (BridgeCustomerDemographic):This table stores every combination of customer and the demographic groups they belong to.
CustomerDemographicBridgeKey (PK)CustomerKey (FK)DemographicGroupKey (FK)AllocationWeight (Optional)
1101201 (Student)0.5
2101202 (Loyalty Member)0.5
3102201 (Student)1.0
4103203 (Senior Citizen)1.0
5103202 (Loyalty Member)1.0
  • CustomerKey 101 belongs to Student and Loyalty Member groups.
  • CustomerKey 103 belongs to Senior Citizen and Loyalty Member groups.

The flow of analysis is: FactSales -> DimCustomer -> BridgeCustomerDemographic -> DimDemographicGroup.

The Core Star Schema

       +-----------------------+
       |       DimDate         |
       |-----------------------|
       | DateKey (PK)          |
       | Year, Month, Day      |
       | ...                   |
       +-----------+-----------+
                   | 1
                   |
                   M
       +-----------+-----------+
       |       FactSales       |
       |-----------------------|
       | SalesKey (PK)         |
       | DateKey (FK)          |
       | ProductKey (FK)       |  +----------+----------+
       | CustomerKey (FK)  ----->|  DimCustomer         |
       | SalesAmount           |  |--------------------|
       | Quantity              |  | CustomerKey (PK)   |
       +-----------------------+  | CustomerName, Email|
                   | M            +--------------------+
                   |
                   1
       +-----------+-----------+
       |      DimProduct       |
       |-----------------------|
       | ProductKey (PK)       |
       | ProductName           |
       | Category, Brand       |
       +-----------------------+

Extending with a Bridge Table for Many-to-Many

                                  +--------------------+
                                  | DimDemographicGroup|
                                  |--------------------|
                                  | GroupKey (PK)      |
                                  | GroupName          |
                                  | ...                |
                                  +---------+----------+
                                            | 1
                                            |
                                            M
+----------+----------+       +------------------------------------+
|  DimCustomer         |  <--> | [Bridging Many-to-Many]            |
|--------------------|       |  BridgeCustomerDemographic          |
| CustomerKey (PK)   |  <--> |------------------------------------|
| CustomerName, Email|       |  BridgeKey (PK)                    |
+--------------------+       |  CustomerKey (FK) ------------------> 1
      M                      |  GroupKey (FK)  -------------------> 1
      |                      |  (Optional: AllocationWeight)        |
      |                      +------------------------------------+
      1
      |
+-----------------------+
|       FactSales       |
|-----------------------|
| ...                   |
| CustomerKey (FK)      |
| ...                   |
+-----------------------+

How the Diagram Works

  • The standard relationships from FactSales to DimCustomer, DimProduct, and DimDate are all 1:M.
  • The business reality is a Many-to-Many relationship between DimCustomer and DimDemographicGroup (one customer can be in many groups, and one group has many customers).
  • The BridgeCustomerDemographic table resolves this. It links to DimCustomer with a 1:M relationship and to DimDemographicGroup with a 1:M relationship.
  • To query sales by a demographic group, you must traverse from FactSales to DimCustomer, then through the BridgeCustomerDemographic table to reach DimDemographicGroup. This path allows for accurate aggregation and filtering based on the many-to-many relationship.

How to Query with a Bridge Table

To analyze sales by DemographicGroup, you would join the tables in the following sequence:

FactSales -> DimCustomer -> BridgeCustomerDemographic -> DimDemographicGroup

SQL Query Example:

SQL

SELECT
    DDG.GroupName,
    SUM(FS.SalesAmount) AS TotalSales
FROM
    FactSales FS
JOIN
    DimCustomer DC ON FS.CustomerKey = DC.CustomerKey
JOIN
    BridgeCustomerDemographic BCD ON DC.CustomerKey = BCD.CustomerKey
JOIN
    DimDemographicGroup DDG ON BCD.DemographicGroupKey = DDG.DemographicGroupKey
GROUP BY
    DDG.GroupName
ORDER BY
    TotalSales DESC;

Note on Weighting: If AllocationWeight is used, the SUM(FS.SalesAmount * BCD.AllocationWeight) would be more appropriate to distribute the sales proportionally across groups.

Another Scenario

Business Case:

A SalesFact table stores orders, and we want to track multiple sales agents responsible for each order.

  • One order can have multiple agents.
  • One agent can handle multiple orders.

Tables

1️⃣ Fact Table: SalesFact

OrderIDOrderDateKeyTotalAmount
101202301011000
102202301021500

2️⃣ Dimension Table: SalesAgentDim

AgentKeyAgentName
1Alice
2Bob
3John

3️⃣ Bridge Table: SalesFactAgentBridge

OrderIDAgentKeyCommissionPct
101160%
101240%
1023100%

🔹 How It Works

  1. Fact Table → contains orders.
  2. Bridge Table → connects each order to multiple agents.
  3. Dimension Table → contains agent details.

This resolves the many-to-many relationship between orders and agents.

Benefits of Bridge Tables

  1. Resolves Many-to-Many Relationships: The primary benefit, allowing proper dimensional analysis of complex relationships.4
  2. Maintains Star Schema Principles: Keeps fact tables lean and simple (containing only foreign keys and measures) and pushes descriptive attributes into dimensions.
  3. Flexibility: Allows for complex attribute relationships without heavily denormalizing the main dimension or duplicating data.5
  4. Historical Tracking: The bridge table can incorporate date columns (EffectiveDate, EndDate) to track changes in the many-to-many relationship over time (e.g., when a customer joined or left a demographic group).
  5. Weighted Analysis: Facilitates scenarios where the contribution of an item to multiple groups needs to be proportioned.

Considerations/Challenges

  1. Increased Query Complexity: Queries involving bridge tables require more joins, which can be slightly less intuitive for beginners and potentially introduce minor performance overhead if not properly indexed. However, modern database optimizers and BI tools are generally very efficient at handling this pattern.
  2. ETL Complexity: Loading and maintaining the bridge table (especially with weights or historical tracking) adds complexity to the ETL process.6
  3. Interpretation: Users need to understand that a single fact may be counted multiple times if a weighting factor isn’t used (e.g., if a customer’s $100 sale is associated with two demographic groups, it will appear as $100 in each group’s total unless weighted).

In essence, a bridge table is a powerful and necessary design pattern in dimensional modeling that allows data warehouses to accurately model and analyze real-world many-to-many relationships, providing a more complete and insightful view of business performance.7


Discover more from Technology with Vivek Johari

Subscribe to get the latest posts sent to your email.

Leave a Reply

Scroll to Top

Discover more from Technology with Vivek Johari

Subscribe now to keep reading and get access to the full archive.

Continue reading