Parallel jobs on Windows fail with APT_IOPort::readBlkVirt;error

APT_IOPort::readBlkVirt Error Screenshot
APT_IOPort::readBlkVirt Error Screenshot

This a known error for windows systems and applies to DataStage and DataQuality jobs using the any or all the three join type stages (Join, Merge, and Lookup).

Error Message

  • <<Link name>>,0: APT_IOPort::readBlkVirt: read for block header, partition 0, [fd 4], returned -1 with errno 10,054 (Unknown error)

Message ID

  • IIS-DSEE-TFIO-00223

Applies To

  • Windows systems only
  • Parallel Engine Jobs the three join type stages (Join, Merge, and Lookup). It does not apply to Server Engine jobs.
  • Infosphere Information Server (IIS), Datastage and DataQuality 9.1 and higher

The Fix

  • Add the APT_NO_IOCOMM_OPTIMIZATION in project administrator and set to blank or 0. I left it blank so it would not impact other jobs
  • Add the environment variable to the job producing the error and set to 1

What it APT_NO_IOCOMM_OPTIMIZATION Does

  • Sets the use of shared memory as the transport type, rather than using the default sockets transport type.
  • Note that in most cases sockets transport type is faster, so, you likely will not to set this across the project as the default for all job. It is best to apply it as necessary for problematic jobs.

Related References

InfoSphere DataStage and QualityStage, Version 9.1 Job Compatibility

IBM Support, JR54078: PARALLEL JOBS ON WINDOWS FAIL WITH APT_IOPORT::READBLKVIRT; ERROR

IBM Support, Information Server DataStage job fails with unknown error 10,054.

 

Netezza / PureData – How To Quote a Single Quote in Netezza SQL

How To Quote a Single Quote in Netezza SQL?

The short answer is to use four single quotes (””), which will result in a single quote within the select statement results.

How to Assemble the SQL to Quote a Single Quote in a SQL Select Statement

Knowing how to construct a list to embed in a SQL where clause ‘in’ list or to add to an ETL job can be a serious time saver eliminating the need to manually edit large lists.  In the example below, I used the Select result set to create a rather long list of values, which needed to be included in an ELT where clause.  By:

  • Adding the comma delimiter (‘,’) and a Concatenate (||) on the front
  • Followed by adding a quoted single Quote (entered as four single quotes (””)) and a Concatenate (||)
  • The Field I which to have delaminated and Quoted (S1.ORDER_NUM)
  • And closed with a quoted single Quote (entered as four single quotes (””))

This results in a delimited and quoted list ( ,’116490856′) which needs only to have the parentheses added and the first comma removed, which is much less work than manually editing the 200 item that resulted from this select.

Example SQL:

SELECT Distinct

‘,’||””|| S1.ORDER_NUM||”” as Quoted_Order_Number

FROM Sales S1

How to Quote A Single Quote Example SQL
How to Quote A Single Quote Example SQL

Related Reference

DataStage – How to use single quoted parameter list in an Oracle Connector

Data Integration
Data Integration

While working with a client’s 9.1 DataStage version, I ran into a situation where they wanted to parameterize SQL where clause lists in an Oracle Connector stage, which honestly was not very straight forward to figure out.  First, if the APT_OSL_PARAM_ESC_SQUOTE is not set and single quotes are used in the parameter, the job creates unquoted invalid SQL when the parameter is populated.  Second, I found much of the information confusing and/or incomplete in its explanation.   After some research and some trial and error, here is how I resolved the issue.  I’ll endeavor to be concise, but holistic in my explanation.

When this Variable applies

This where I know this process applies, there may be other circumstances to which is this applicable, but I’m listing the ones here with which I have recent experience.

Infosphere Information Server Datastage

  • Versions 91, 11.3, and 11.5

Oracle RDBMS

  • Versions 11g and 12c

Configurations process

Here is a brief explanation of the steps I used to implement the where clause as a parameter.  Please note that in this example, I am using a job parameter to populate on a portion of the where clause, you can certainly pass the entire where clause as a parameter, if it is not too long.

Configure Project Variable in Administrator

  • Add APT_OSL_PARAM_ESC_SQUOTE to project in Administrator
  • Populate the APT_OSL_PARAM_ESC_SQUOTE Variable \

APT_OSL_PARAM_ESC_SQUOTE Project Variable
APT_OSL_PARAM_ESC_SQUOTE Project Variable

Create job parameter

Following your project name convention or standard practice, if you customer and/or project do not have established naming conventions, create the job parameter in the job. See jp_ItemSource parameter in the image below.

Job Parameter In Oracle Connector
Job Parameter In Oracle Connector

Add job parameter to Custom SQL in Select Oracle Connector Stage

On the Job parameter has been created, add the job parameter to the SQL statement of the job.

Job Parameter In SQL
Job Parameter In SQL

Related References

IBM Knowledge Center > InfoSphere Information Server 11.5.0

Connecting to data sources > Databases > Oracle databases > Oracle connector

IBM Support > Limitation of the Parameter APT_OSL_PARAM_ESC_SQUOTE on Plugins on Parallel Canvas

IBM Knowledge Center > InfoSphere Information Server 11.5.0

InfoSphere DataStage and Quality > Stage > Reference > Parallel Job Reference > Environment Variables > Miscellaneous > APT_OSL_PARAM_ESC_SQUOTE

 

What is the convert function in Datastage?

Algorithm
Algorithm

 

What is the convert function in Datastage?

In its simplest form, the convert function in Infosphere DataStage is a string replacement operation.  Convert can be used to replace a specific character, a list of characters, or a unicode character (e.g. thumbs Up Sign or Grinning Face).

Convert Syntax

convert(‘<<Value to be replaced’,'<<Replacement value >>’,<<Input field>>)

Using the Convert Function to remove a list of Characters

Special Characters in DataStage Handles/converts special characters in a transformer stage, which can cause issues in XML processing and certain databases.

Convert a list of General Characters

Convert(“;:?\+&,*`#’$()|^~@{}[]%!”,”, TrimLeadingTrailing(Lnk_In.Description))

Convert Decimal and Double Quotes

Convert(‘ ” . ‘,”, Lnk_In.Description)

Convert Char(0)

This example replaces Char(0) with nothing essentially removing it as padding and/or space.

convert(char(0),”,Lnk_In.Description)

 

Related References

String functions

InfoSphere Information Server, InfoSphere Information Server 11.5.0, InfoSphere DataStage and QualityStage, Developing parallel jobs, Parallel transform functions, String functions

Datastage – When checking operator: Operator of type “APT_TSortOperator”: will partition despite the preserve-partitioning flag on the data set on input port 0

APT_TSortOperator Warning
APT_TSortOperator Warning

The APT_TSortOperator  warning happens when there is a conflict in the portioning behavior between stages.  Usually, because the successor (down Stream) stage has the ‘Partitioning / Collecting’ and ‘Sorting’ property set in a way that conflicts with predecessor (upstream) stage’s properties, which it is set to preserver.  This can occur when the successor stage has the “Preserve Partitioning” property set to:

  • ‘Default (Propagate)’
  • ‘Propagate’, or
  • ‘Set’

Preserve Partitioning Property - list
Preserve Partitioning Property – list

Message ID

  • IIS-DSEE-TFOR-00074

Message Text

  • <<Link Name Where Warning Occurred>>: When checking operator: Operator of type “APT_TSortOperator”: will partition despite the preserve-partitioning flag on the data set on input port 0.

Warning Fixes

  • First, if the verify that the partitioning behaviors of both stages are correct
  • If so, set the predecessor ‘Preserve Partitioning’ property to “Clear”
  • If not, then correct the partitioning behavior of the stage which is in error

Clear Partitioning Property Screenshot

Preserve Partitioning Property - Set To Clear
Preserve Partitioning Property – Set To Clear

Infosphere DataStage – Boolean Handling for Netezza

Datastage Director Message - Numeric string expected
Datastage Director Message – Numeric string expected

 

Beware when you see this message when working with Boolean in DataStage, the message displays as informational (at list it did for me) not as a warning or an error.  Even though it seems innocuous, what it meant for my job, was the Boolean (‘true’ / ‘false’) was not being interpreted and everything posted to ‘false’.

In DataStage the Netezza ‘Boolean’ field/Data SQL type maps to the ‘Bit’ SQL type, which expects a numeric input of Zero (0) or one (1).  So, my solution (once I detected the problem during unit testing) was to put Transformer Stage logic in place to convert the Boolean input to the expected number value.

 

Netezza to Datastage Data Type Mapping

Netezza data types

InfoSphere DataStage

data types (SQL types)

Expected Input value

BOOLEAN Bit 0 or 1 (1 = true, 0 = false)

 

Transformer Stage logic Boolean Handling Logic

A Netezza Boolean field can store: true values, false values, and null. So, some thought should be given to you desired data outcome for nulls

This first example sets a that the nulls are set to a specific value, which can support a specific business rule for null handling and, also, provide null handling for non-nullable fields.  Here we are setting nulls to the numeric value for ‘true’ and all other non-true inputs to ‘false’.

If isnull(Lnk_Src_In.USER_ACTIVE) then 1 Else if Lnk_Src_In.USER_ACTIVE = ‘true’ Then 1 Else 0

These second examples sets a that the nulls are set by the Else value, if your logic direction is correct value and still provides null handling for non-nullable fields.

  • If  Lnk_Src_In.USER_ACTIVE = ‘true’ Then 1 Else 0

  • If  Lnk_Src_In.USER_ACTIVE = ‘False’ Then 0 Else 1

Director Log Message

Message ID

  • IIS-DSEE-TBLD-00008

Message Text

  • <<Link Name Where Message Occurred>>: Numeric string expected. Use default value.

Or something like this:

  • <<Link Name Where Message Occurred>>: Numeric string expected for input column ‘<<Field Name Here>>‘. Use default value.

Related References

Boolean

PureData System for Analytics, PureData System for Analytics 7.2.1, IBM Netezza user-defined functions, UDX data types reference information, Supported data types, Boolean

https://www.ibm.com/support/knowledgecenter/en/SSULQD_7.2.1/com.ibm.nz.udf.doc/r_udf_boolean_datatype.html

Data types and aliases

PureData System for Analytics, PureData System for Analytics 7.2.1, IBM Netezza stored procedures, NZPLSQL statements and grammar, Variables and constants, Data types and aliases

https://www.ibm.com/support/knowledgecenter/en/SSULQD_7.2.1/com.ibm.nz.sproc.doc/c_sproc_data_types_aliases.html

Logical data types

PureData System for Analytics, PureData System for Analytics 7.2.1, IBM Netezza database user documentation, Netezza SQL basics, Data types, Logical data types

https://www.ibm.com/support/knowledgecenter/en/SSULQD_7.2.1/com.ibm.nz.dbu.doc/r_dbuser_data_types_logical.html

Data type conversions from Netezza to DataStage

InfoSphere Information Server, InfoSphere Information Server 11.5.0, Connecting to data sources, Databases, Netezza Performance Server, Netezza connector, Designing jobs by using the Netezza connector, Defining a Netezza connector job, Data type conversions, Data type conversions from Netezza to DataStage

https://www.ibm.com/support/knowledgecenter/en/SSZJPZ_11.5.0/com.ibm.swg.im.iis.conn.netezza.use.doc/topics/nzcc_mappingdatatypes.html

Infosphere DataStage – Designer Client Repository Structure

Default Repository Structure

When a project is created, there is a default repository structure created for use in the DataStage designer client.

Default DataStage Repository Structure
Default DataStage Repository Structure

However, some additional organization will be required for most DataStage projects.  Usually, this organization occurs in in these areas:

  • Addition of structure within the “Jobs” folder
  • Addition of a “Parameter Sets” folder
  • Addition of structure within the “Table Definitions” folder
  • Addition of a “Developer Work Area” folder

Repository Structure within the “Jobs” folder

Below is a sample of a folder structure for multiple applications that share a common Repository.  Pattern includes, but does not illustrate all other delivered folders. In addition to the core folder structure, developers can create individual working, test, and in progress folders, which do not migrate, but keep work segregated.

Jobs Folder Pattern Datastage Repository Structure
Jobs Folder Pattern Datastage Repository Structure

Parameter Sets Folders

The parameter set folders or for two sets of information.

  • First, are the database parameters, which include data connections and the attached parameter sets.
  • The second, for job parameters, which may include parameter sets, for things like e-mail parameters, surrogate key file paths, etc.; which is a best practice, rather creating them as project level parameters.
Parameter Sets Folder Pattern Datastage Repository Structure
Parameter Sets Folder Pattern Datastage Repository Structure

Table Definitions

The Tables Definition folder have folders added to segregate the imported meta data for source and target system and, in some case, may need folders to logically organize imported meta which may reside within the same database and/or schema, but belong to different logical layer.

Table Definitions Folder Pattern DataStage Repository Structure
Table Definitions Folder Pattern DataStage Repository Structure