Netezza / PureData – How to add a Foreign Key

DDL (Data Definition Language), Netezza PureData How to add a Foreign Key
DDL (Data Definition Language)

Adding a forging key to tables in Netezza / PureData is a best practice; especially, when working with dimensionally modeled data warehouse structures and with modern governance, integration (including virtualization), presentation semantics (including reporting, business intelligence and analytics).

Foreign Key (FK) Guidelines

  • A primary key must be defined on the table and fields (or fields) to which you intend to link the foreign key
  • Avoid using distribution keys as foreign keys
  • Foreign Key field should not be nullable
  • Your foreign key link field(s) must be of the same format(s) (e.g. integer to integer, etc.)
  • Apply standard naming conventions to constraint name:
    • FK_<<Constraint_Name>>_<<Number>>
    • <<Constraint_Name>>_FK<<Number>>
  • Please note that foreign key constraints are not enforced in Netezza

Steps to add a Foreign Key

The process for adding foreign keys involves just a few steps:

  • Verify guidelines above
  • Alter table add constraint SQL command
  • Run statistics, which is optional, but strongly recommended

Basic Foreign Key SQL Command Structure

Here is the basic syntax for adding Foreign key:

ALTER TABLE <<Owner>>.<<NAME_OF_TABLE_BEING_ALTERED>>

ADD CONSTRAINT <<Constraint_Name>>_fk<Number>>

FOREIGN KEY (<<Field_Name or Field_Name List>>) REFERENCES <<Owner>>.<<target_FK_Table_Name>.(<<Field_Name or Field_Name List>>) <<On Update | On Delete>> action;

Example Foreign Key SQL Command

This is a simple one field example of the foreign key (FK)

ALTER TABLE Blog.job_stage_fact

ADD CONSTRAINT job_stage_fact_host_dim_fk1

FOREIGN KEY (hostid) REFERENCES Blog.host_dim(hostid) ON DELETE cascade ON UPDATE no action;

Related References

Alter Table

PureData System for Analytics, PureData System for Analytics 7.2.1, IBM Netezza database user documentation, Netezza SQL command reference, Alter Table, constraints

Database – What is a foreign key?

Acronyms, Abbreviations, Terms, And Definitions, DDL (Data Definition Language), What is a foreign key
Acronyms, Abbreviations, Terms, And Definitions

Definition of a Foreign Key

  • A foreign Key (FK) is a constraint that references the unique primary key (PK) of another table.

Facts About Foreign Keys

  • Foreign Keys act as a cross-reference between tables linking the foreign key (Child record) to the Primary key (parent record) of another table, which establishing a link/relationship between the table keys
  • Foreign keys are not enforced by all RDBMS
  • The concept of referential integrity is derived from foreign key theory
  • Because Foreign keys involve more than one table relationship, their implementation can be more complex than primary keys
  • A foreign-key constraint implicitly defines an index on the foreign-key column(s) in the child table, however, manually defining a matching index may improve join performance in some database
  • The SQL, normally, provides the following referential integrity actions for deletions, when enforcing foreign-keys

Cascade

  • The deletion of a parent (primary key) record may cause the deletion of corresponding foreign-key records.

No Action

  • Forbids the deletion of a parent (primary key) record, if there are dependent foreign-key records.   No Action does not mean to suppress the foreign-key constraint.

Set null

  • The deletion of a parent (primary key) record causes the corresponding foreign-key to be set to null.

Set default

  • The deletion of a record causes the corresponding foreign-keys be set to a default value instead of null upon deletion of a parent (primary key) record

Related References

Database – What is a Composite Primary Key?

Database Table
Database Table

What is a Composite Primary Key?

A Composite Primary key is Primary key What a primary key, which is defined by having multiple fields (columns) in it.  Like a Primary Key what a composite Primary Key is depends on the database.  Essentially a Composite Primary Key:

  • Is a combination of Fields (columns) which uniquely identifies every row.
  • Is an index in database systems which use indexes for optimization
  • Is a type of table constraint
  • Is applied with a data definition language (DDL) alter command
  • And may define parent-Child relationship between tables

Related References

Database – What is a Primary Key?

Database Table
Database Table

What is a primary Key?

What a primary key is depends, somewhat, on the database.  However, in its simplest form a primary key:

  • Is a field (Column) or combination of Fields (columns) which uniquely identifies every row.
  • Is an index in database systems which use indexes for optimization
  • Is a type of table constraint
  • Is applied with a data definition language (DDL) alter command
  • And, depending on the data model can, define parent-Child relationship between tables

Related References

Data Modeling – Dimension Table Effective Practices

Database Table
Database Table

I’ve had these notes laying around for a while, so, I thought I consolidate them here.   So, here are few guidelines to ensure the quality of your dimension table structures.

Dimension Table Effective Practices

  • The table naming convention should identify it as a dimension table. For example:
    • Suffix Pattern:
      • <<TableName>>_Dim
      • <<TableName>>_D
    • Prefix Pattern:
      • Dim_<TableName>>
      • D_<TableName>>
  • Have Primary Key (PK) assigned on table surrogate Key
  • Audit fields – Type 1 dimensions should:
    • Have a Created Date timestamp – When the record was initially created
    • have a Last Update Timestamp – When was the record last updated
  • Job Flow: Do not place the dimension processing in the fact jobs.
  • Every Dimension should have a Zero (0), Unknown, row
  • Fields should be ‘NOT NULL’ replacing nulls with a zero (0) numeric and integer type fields or space ( ‘ ‘ ) for Character type files.
  • Keep dimension processing outside of the fact jobs

Related References

Infosphere DataStage – Designer Client Repository Structure

Default Repository Structure

When a project is created, there is a default repository structure created for use in the DataStage designer client.

Default DataStage Repository Structure
Default DataStage Repository Structure

However, some additional organization will be required for most DataStage projects.  Usually, this organization occurs in in these areas:

  • Addition of structure within the “Jobs” folder
  • Addition of a “Parameter Sets” folder
  • Addition of structure within the “Table Definitions” folder
  • Addition of a “Developer Work Area” folder

Repository Structure within the “Jobs” folder

Below is a sample of a folder structure for multiple applications that share a common Repository.  Pattern includes, but does not illustrate all other delivered folders. In addition to the core folder structure, developers can create individual working, test, and in progress folders, which do not migrate, but keep work segregated.

Jobs Folder Pattern Datastage Repository Structure
Jobs Folder Pattern Datastage Repository Structure

Parameter Sets Folders

The parameter set folders or for two sets of information.

  • First, are the database parameters, which include data connections and the attached parameter sets.
  • The second, for job parameters, which may include parameter sets, for things like e-mail parameters, surrogate key file paths, etc.; which is a best practice, rather creating them as project level parameters.
Parameter Sets Folder Pattern Datastage Repository Structure
Parameter Sets Folder Pattern Datastage Repository Structure

Table Definitions

The Tables Definition folder have folders added to segregate the imported meta data for source and target system and, in some case, may need folders to logically organize imported meta which may reside within the same database and/or schema, but belong to different logical layer.

Table Definitions Folder Pattern DataStage Repository Structure
Table Definitions Folder Pattern DataStage Repository Structure