Data Modeling – Fact Table Effective Practices

Database Table

Database Table

Here are a few guidelines for modeling and designing fact tables.

Fact Table Effective Practices

  • The table naming convention should identify it as a fact table. For example:
    • Suffix Pattern:
      • <<TableName>>_Fact
      • <<TableName>>_F
    • Prefix Pattern:
      • FACT_<TableName>>
      • F_<TableName>>
    • Must contain a temporal dimension surrogate key (e.g. date dimension)
    • Measures should be nullable – this has an impact on aggregate functions (SUM, COUNT, MIN, MAX, and AVG, etc.)
    • Dimension Surrogate keys (srky) should have a foreign key (FK) constraint
    • Do not place the dimension processing in the fact jobs

Related References

Data Modeling – Dimension Table Effective Practices

Database Table

Database Table

I’ve had these notes laying around for a while, so, I thought I consolidate them here.   So, here are few guidelines to ensure the quality of your dimension table structures.

Dimension Table Effective Practices

  • The table naming convention should identify it as a dimension table. For example:
    • Suffix Pattern:
      • <<TableName>>_Dim
      • <<TableName>>_D
    • Prefix Pattern:
      • Dim_<TableName>>
      • D_<TableName>>
  • Have Primary Key (PK) assigned on table surrogate Key
  • Audit fields – Type 1 dimensions should:
    • Have a Created Date timestamp – When the record was initially created
    • have a Last Update Timestamp – When was the record last updated
  • Job Flow: Do not place the dimension processing in the fact jobs.
  • Every Dimension should have a Zero (0), Unknown, row
  • Fields should be ‘NOT NULL’ replacing nulls with a zero (0) numeric and integer type fields or space ( ‘ ‘ ) for Character type files.
  • Keep dimension processing outside of the fact jobs

Related References

 

 

Where do data models fit in the Software Development Life Cycle (SDLC) Process?

Data Model SDLC Relationship Diagram

Data Model SDLC Relationship Diagram

In the classic Software Development Life Cycle (SDLC) process, Data Models are typically initiated, by model type, at key process steps and are maintained as data model detail is added and refinement occurs.

The Concept Data Model (CDM) is, usually, created in the Planning phase.   However,  creation the Concept Data Model  can slide forwarded or backward,  somewhat , within the System Concept Development, Planning, and Requirements Analysis phases, depending upon  whether the application being modeled is a custom development effort or a modification of a Commercial-Off-The-Shelf (COTS) application.  The CDM is maintained, as necessary, through the remainder of the SDLC process.

The Logical Data Model (LDM) is created in the Requirement Analysis phase and is a refinement of the information entities of the Concept Data Model. The LDM is maintained, as necessary, through the remainder of the SDLC process.

The Physical Data Model (PDM) is created in the Design phase to facilitate creation of accurate detail technical designs and actual database creation. The PDM is maintained, as necessary, through the remainder of the SDLC process.

Related References: