Data Modeling – Column Data Classification

Data Modeling, Column Data Classification, Field Data Classification

Data Modeling

 

Column Data Classification

When analyzing individual column data, at its most foundational level, column data can be classified by their fundamental use/characteristics.  Granted, when you start rolling up the structure into multiple columns, table structure and table relationship, then other classifications/behaviors, such as keys (primary and foreign), indexes, and distribution come into play.  However, many times when working with existing data sets it is essential to understand the nature the existing data to begin the modeling and information governance process.

Column Data Classification

Generally, individual columns can be classified into the classifications:

  • Identifier — A column/field which is unique to a row and/or can identify related data (e.g., Person ID, National identifier, ). Basically, think primary key and/or foreign key.
  • Indicator — A column/field, often called a Flag, that has a binary condition (e.g., True or False, Yes or No, Female or Male, Active or Inactive). Frequently used to identify compliance with complex with a specific business rule.
  • Code — A column/field that has a distinct and defined set of values, often abbreviated (e.g., State Code, Currency Code)
  • Temporal — A column/field that contains some type date, timestamp, time, interval, or numeric duration data
  • Quantity — A column/field that contains a numeric value (decimals, integers, etc.) and is not classified as an Identifier or Code (e.g., Price, Amount, Asset Value, Count)
  • Text — A column/field that contains alphanumeric values, possibly long text, and is not classified as an Identifier or Code (e.g., Name, Address, Long Description, Short Description)
  • Large Object (LOB)– A column/field that contains data traditional long text fields or binary data like graphics. The large objects can be broadly classified as Character Large Objects (CLOBs), Binary Large Objects (BLOBs), and Double-Byte Character Large Object (DBCLOB or NCLOB).

Related References