InfoSphere DataStage – Boolean Handling for Netezza

DataStage Director Message – Numeric string expected

Beware of this message when working with Boolean fields in DataStage. It is displayed as informational (at least it was for me), not as a warning or an error. Although it seems innocuous, in my job it meant that the Boolean values (‘true’ / ‘false’) were not being interpreted, and everything posted as ‘false’.

In DataStage, the Netezza ‘Boolean’ data type maps to the ‘Bit’ SQL type, which expects a numeric input of zero (0) or one (1). My solution, once I detected the problem during unit testing, was to add Transformer Stage logic that converts the Boolean input to the expected numeric value.
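To make the failure mode concrete, here is a minimal Python sketch of the behavior described above. It is not DataStage code and does not claim to show how the engine is implemented; the helper name `to_bit_with_default` and the default of 0 are assumptions chosen to mirror the symptom (a non-numeric input falls back to a default, so every row lands as false).

```python
def to_bit_with_default(value, default=0):
    """Mimic a Bit column that expects a numeric string such as '0' or '1'.

    Hypothetical helper: non-numeric input is replaced by `default`,
    which is the effect the 'Numeric string expected' message describes.
    """
    try:
        # Any non-zero number is treated as 1, zero as 0.
        return 1 if int(value) != 0 else 0
    except (TypeError, ValueError):
        # 'true' / 'false' (and None) are not numeric strings, so the default is used.
        return default


for raw in ["true", "false", "1", "0"]:
    print(raw, "->", to_bit_with_default(raw))
```

In this sketch both ‘true’ and ‘false’ come out as 0, which is why the input has to be converted to 1 or 0 before it reaches the Bit column.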

 

Netezza to DataStage Data Type Mapping

Netezza data type | InfoSphere DataStage data type (SQL type) | Expected input value
BOOLEAN | Bit | 0 or 1 (1 = true, 0 = false)

Transformer Stage Boolean Handling Logic

A Netezza Boolean field can store true values, false values, and null, so give some thought to your desired data outcome for nulls.

This first example sets nulls to a specific value, which can support a specific business rule for null handling and also provides null handling for non-nullable fields. Here we are setting nulls to the numeric value for ‘true’ and all other non-true inputs to ‘false’.

If IsNull(Lnk_Src_In.USER_ACTIVE) Then 1 Else If Lnk_Src_In.USER_ACTIVE = 'true' Then 1 Else 0

The second set of examples lets nulls take the Else value, so make sure the direction of your logic produces the value you want for nulls. This still provides null handling for non-nullable fields.

  • If Lnk_Src_In.USER_ACTIVE = 'true' Then 1 Else 0

  • If Lnk_Src_In.USER_ACTIVE = 'false' Then 0 Else 1
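The three derivations above differ only in where a null ends up. The following Python sketch restates them so the outcomes can be compared side by side. It is an illustration, not DataStage code: the function names are hypothetical, `None` stands in for a null, the source is assumed to deliver lowercase ‘true’ / ‘false’ strings, and a null is assumed to simply fail the comparison and fall through to the Else branch, as the text describes.

```python
def nulls_as_true(user_active):
    """First example: nulls are explicitly mapped to 1 (true)."""
    if user_active is None:
        return 1
    return 1 if user_active == "true" else 0


def match_true_else_false(user_active):
    """Second example: anything that is not 'true', including null, becomes 0."""
    return 1 if user_active == "true" else 0


def match_false_else_true(user_active):
    """Third example: anything that is not 'false', including null, becomes 1."""
    return 0 if user_active == "false" else 1


# Compare the three rules for each possible input.
for value in ("true", "false", None):
    print(
        repr(value),
        nulls_as_true(value),
        match_true_else_false(value),
        match_false_else_true(value),
    )
```

All three agree on ‘true’ and ‘false’; only the null row differs, so the choice between them comes down to the business rule for nulls.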

Director Log Message

Message ID

  • IIS-DSEE-TBLD-00008

Message Text

  • <<Link Name Where Message Occurred>>: Numeric string expected. Use default value.

Or something like this:

  • <<Link Name Where Message Occurred>>: Numeric string expected for input column '<<Field Name Here>>'. Use default value.

Related References

Boolean

PureData System for Analytics, PureData System for Analytics 7.2.1, IBM Netezza user-defined functions, UDX data types reference information, Supported data types, Boolean

https://www.ibm.com/support/knowledgecenter/en/SSULQD_7.2.1/com.ibm.nz.udf.doc/r_udf_boolean_datatype.html

Data types and aliases

PureData System for Analytics, PureData System for Analytics 7.2.1, IBM Netezza stored procedures, NZPLSQL statements and grammar, Variables and constants, Data types and aliases

https://www.ibm.com/support/knowledgecenter/en/SSULQD_7.2.1/com.ibm.nz.sproc.doc/c_sproc_data_types_aliases.html

Logical data types

PureData System for Analytics, PureData System for Analytics 7.2.1, IBM Netezza database user documentation, Netezza SQL basics, Data types, Logical data types

https://www.ibm.com/support/knowledgecenter/en/SSULQD_7.2.1/com.ibm.nz.dbu.doc/r_dbuser_data_types_logical.html

Data type conversions from Netezza to DataStage

InfoSphere Information Server, InfoSphere Information Server 11.5.0, Connecting to data sources, Databases, Netezza Performance Server, Netezza connector, Designing jobs by using the Netezza connector, Defining a Netezza connector job, Data type conversions, Data type conversions from Netezza to DataStage

https://www.ibm.com/support/knowledgecenter/en/SSZJPZ_11.5.0/com.ibm.swg.im.iis.conn.netezza.use.doc/topics/nzcc_mappingdatatypes.html

InfoSphere DataStage – Data type conversions from Netezza to DataStage

IBM Infosphere Information Server (IIS)

Some recent research to eliminate ETL data conversion issues made me want to enhance the documentation provided by IBM, so that the research does not have to be repeated.

Netezza data types and their equivalent InfoSphere DataStage data types

Netezza data type | InfoSphere DataStage data type (SQL type) | Notes
BYTEINT | TinyInt |
SMALLINT | SmallInt |
INT | Integer |
BIGINT | BigInt |
NUMERIC(p, s) | Numeric, Decimal, Double | Decimal and Double are aliases of Numeric
FLOAT(p) | Float |
REAL | Real |
DOUBLE PRECISION | Double |
CHAR(n) | Char |
VARCHAR(n) | VarChar |
NCHAR(n) | NChar |
NVARCHAR(n) | NVarChar |
BOOLEAN | Bit |
DATE | Date |
TIME | Time |
TIME WITH TIME ZONE | VarChar | The value of time with time zone will be returned without the time zone information.
TIMESTAMP | TimeStamp |
INTERVAL | VarChar | You cannot load the interval data type from an external table.
ROWID | BigInt |
TRANSACTION ID | BigInt |
DATASLICE | Integer |
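If you want to check column definitions against this table in a script, the mapping can be captured as a simple lookup. The Python sketch below is only an illustration: the dictionary name and the `datastage_type` helper are hypothetical, and where the table lists several DataStage types for one Netezza type, only the first is kept.

```python
# Netezza type -> first DataStage SQL type listed in the table above.
NETEZZA_TO_DATASTAGE = {
    "BYTEINT": "TinyInt",
    "SMALLINT": "SmallInt",
    "INT": "Integer",
    "BIGINT": "BigInt",
    "NUMERIC": "Numeric",
    "FLOAT": "Float",
    "REAL": "Real",
    "DOUBLE PRECISION": "Double",
    "CHAR": "Char",
    "VARCHAR": "VarChar",
    "NCHAR": "NChar",
    "NVARCHAR": "NVarChar",
    "BOOLEAN": "Bit",
    "DATE": "Date",
    "TIME": "Time",
    "TIME WITH TIME ZONE": "VarChar",
    "TIMESTAMP": "TimeStamp",
    "INTERVAL": "VarChar",
    "ROWID": "BigInt",
    "TRANSACTION ID": "BigInt",
    "DATASLICE": "Integer",
}


def datastage_type(netezza_type):
    """Look up the DataStage SQL type for a Netezza type such as 'varchar(20)'."""
    # Drop any length/precision suffix, e.g. "(20)" or "(10, 2)".
    base = netezza_type.split("(", 1)[0]
    # Normalise case and internal whitespace before the lookup.
    key = " ".join(base.upper().split())
    return NETEZZA_TO_DATASTAGE[key]


print(datastage_type("BOOLEAN"))
print(datastage_type("numeric(10, 2)"))
```

An unknown type raises a `KeyError`, which is usually what you want when validating a job design against the mapping.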

Related References

Data type conversions from Netezza to DataStage

InfoSphere Information Server, InfoSphere Information Server 11.5.0, Connecting to data sources, Databases, Netezza Performance Server, Netezza connector, Designing jobs by using the Netezza connector, Defining a Netezza connector job, Data type conversions, Data type conversions from Netezza to DataStage

Netezza / PureData – What is the maximum length of CHAR, VARCHAR, and NVARCHAR fields

IBM Netezza / IBM PureData for Analytics

During a recent project, I had reason to investigate the maximum character field lengths in Netezza, to be sure the data ETL’ed from the source would fit. The limits on Netezza character fields are shorter than in some other databases, so I thought I would document them in a quick post for future reference. Knowing these limits will help with your ETL design and construction, and will tell you where you may need to truncate your field input data.

 

Data type | Alias names | InfoSphere DataStage data types (SQL type) | Maximum length (characters) | Notes
CHAR | CHARACTER, CHAR(n), CHARACTER(n) | Char | 64,000 | Fixed-length character string, blank padded to length n. If you do not specify n, the default is an unsized CHAR value.
VARCHAR | CHARACTER VARYING, VARCHAR(n), CHARACTER VARYING(n), CHAR VARYING(n) | VarChar, LongVarChar | 64,000 | Variable length character string to a maximum length of n. If you do not specify n, the default is an unsized VARCHAR value. There is no blank padding, and the value is stored as entered.
NCHAR | NATIONAL CHARACTER, NATIONAL CHAR(n), NCHAR(size) | NChar, Char | 16,000 | Fixed-length character string, blank padded to length n. If you do not specify n, the default is an unsized NCHAR value.
NVARCHAR | NATIONAL CHARACTER VARYING, NATIONAL CHAR VARYING, NATIONAL CHAR VARYING(n), NATIONAL CHARACTER VARYING(n), NVARCHAR(n) | VarChar, NVarChar, LongVarChar, LongNVarChar | 16,000 | Variable length character string to a maximum length of n. If you do not specify n, the default is an unsized NVARCHAR value.
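As a rough illustration of using these limits during ETL design, the Python sketch below truncates a value to the smaller of the Netezza maximum and the declared column length. The names `MAX_LENGTH` and `fit_to_column` are hypothetical, and the sketch assumes that Python's character count is an acceptable stand-in for how the target column measures length.

```python
# Maximum lengths from the table above.
MAX_LENGTH = {"CHAR": 64000, "VARCHAR": 64000, "NCHAR": 16000, "NVARCHAR": 16000}


def fit_to_column(value, netezza_type, declared_length=None):
    """Truncate `value` to fit the column; returns (text, was_truncated)."""
    limit = MAX_LENGTH[netezza_type.upper()]
    if declared_length is not None:
        # A column declared as VARCHAR(50) is limited to 50, not 64,000.
        limit = min(limit, declared_length)
    if value is None or len(value) <= limit:
        return value, False
    return value[:limit], True


text, truncated = fit_to_column("x" * 20000, "NVARCHAR")
print(len(text), truncated)
```

Returning the `was_truncated` flag alongside the text makes it easy to count or log affected rows rather than truncating silently.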

Related References

Data types and aliases

PureData System for Analytics, PureData System for Analytics 7.2.1, IBM Netezza stored procedures, NZPLSQL statements and grammar, Variables and constants, Data types and aliases

Data type conversions from DataStage to Netezza

InfoSphere Information Server, InfoSphere Information Server 11.5.0, Connecting to data sources, Databases, Netezza Performance Server, Netezza connector, Designing jobs by using the Netezza connector, Defining a Netezza connector job, Data type conversions, Data type conversions from DataStage to Netezza