Essbase Connector Error – Client Commands are Currently Not Being Accepted


DataStage Essbase Connector

While investigating a recent InfoSphere Information Server (IIS) DataStage Essbase Connector error, I found the explanations of the probable causes of the error not to be terribly meaningful.  So, now that I have run our error to ground, I thought it might be nice to jot down a quick note on the potential causes of the 'Client Commands are Currently Not Being Accepted' error, which I gleaned from the process.

Error Message Id

  • IIS-CONN-ESSBASE-01010

Error Message

An error occurred while processing the request on the server. The error information is 1051544 (message on contacting or from application:[<<DateTimeStamp>>]Local////3544/Error(1013204) Client Commands are Currently Not Being Accepted.

Possible Causes of The Error

This error indicates a problem with access to the Essbase object or to the security within the Essbase object.  It can be the result of multiple issues, such as:

  • Object doesn't exist – the Essbase object doesn't exist in the location specified
  • Communications – the location is unavailable or cannot be reached
  • Path security – security prevents access to the Essbase object location
  • Essbase security – security within the Essbase object does not support the user or filter being submitted; the Essbase object security may also be corrupted or incomplete
  • Essbase object structure – the Essbase object is not structured to support the filter, or the Essbase filter is malformed for the current structure

Related References

IBM Knowledge Center, InfoSphere Information Server 11.7.0, Connecting to data sources, Enterprise applications, IBM InfoSphere Information Server Pack for Hyperion Essbase


What Are The DataStage / QualityStage Join Stages?

Three Stages Which Join Records


While chasing an error that applied only to join-type stages, I thought it might be nice to identify what the InfoSphere Information Server DataStage / QualityStage join stages are.  There are three of them, as you can see from the picture above:

  • Join Stage,
  • Lookup Stage,
  • And, Merge Stage.

All three stages join data based on the values of identified key columns.
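For readers who think in SQL, here is a minimal sketch of the key-based join these stages perform (the table and column names are hypothetical, for illustration only):

-- A key-based join: rows are matched on the identified key column
SELECT c.customer_id,
       c.customer_name,
       o.order_id,
       o.order_total
FROM   customers c
JOIN   orders    o
  ON   o.customer_id = c.customer_id;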

Related References

IBM Knowledge Center, InfoSphere Information Server 11.7.0, InfoSphere DataStage and QualityStage, Developing parallel jobs, Processing Data, Lookup Stage

IBM Knowledge Center, InfoSphere Information Server 11.7.0, InfoSphere DataStage and QualityStage, Developing parallel jobs, Processing Data, Join Stage

IBM Knowledge Center, InfoSphere Information Server 11.7.0, InfoSphere DataStage and QualityStage, Developing parallel jobs, Processing Data, Merge Stage

https://www.ibm.com/support/knowledgecenter/SSZJPZ_11.7.0/com.ibm.swg.im.iis.ds.parjob.dev.doc/topics/c_deeref_Merge_Stage.html

DataStage – How to Pass the Invocation ID from one Sequence to another

DataStage Invocation ID Passing Pattern Overview


When you are controlling a chain of sequences in a job stream and taking advantage of reusable (multiple-instance) jobs, it is useful to be able to pass the Invocation ID from the master controlling sequence and have it passed down and assigned to the job run.  This can easily be done without needing to manually enter the values in each of the sequences, by leveraging the DSJobInvocationId variable.  For this to work:

  • The job must have 'Allow Multiple Instance' enabled
  • The Invocation Id must be provided in the parent sequence (the Job Activity must have the Invocation Name entered)
  • The receiving child sequence must have the invocation variable entered
  • At runtime, an invocation-id instance of the multi-instance job will be generated, with its own logs.

Variable Name

  • DSJobInvocationId

Note

This approach allows for the reuse of jobs and the assignment of meaningful instance extension names, which are managed from a single point of entry in the object tree.

Related References: 

IBM Knowledge Center > InfoSphere Information Server 11.5.0

InfoSphere DataStage and QualityStage > Designing DataStage and QualityStage jobs > Building sequence jobs > Sequence job activities > Job Activity properties

DataStage – How to use single quoted parameter list in an Oracle Connector

Data Integration


While working with a client's 9.1 DataStage version, I ran into a situation where they wanted to parameterize SQL where-clause lists in an Oracle Connector stage, which honestly was not very straightforward to figure out.  First, if APT_OSL_PARAM_ESC_SQUOTE is not set and single quotes are used in the parameter, the job generates unquoted, invalid SQL when the parameter is populated.  Second, I found much of the available information confusing and/or incomplete in its explanation.   After some research and some trial and error, here is how I resolved the issue.  I'll endeavor to be concise, but holistic, in my explanation.

When this Variable applies

This is where I know this process applies; there may be other circumstances to which it is applicable, but I'm listing the ones with which I have recent experience.

Infosphere Information Server Datastage

  • Versions 9.1, 11.3, and 11.5

Oracle RDBMS

  • Versions 11g and 12c

Configuration process

Here is a brief explanation of the steps I used to implement the where clause as a parameter.  Please note that in this example I am using a job parameter to populate only a portion of the where clause; you can certainly pass the entire where clause as a parameter, if it is not too long.

Configure Project Variable in Administrator

  • Add APT_OSL_PARAM_ESC_SQUOTE to the project in Administrator
  • Populate the APT_OSL_PARAM_ESC_SQUOTE variable with \

APT_OSL_PARAM_ESC_SQUOTE Project Variable

Create job parameter

Following your project naming convention or standard practice (or your own, if your customer and/or project do not have established naming conventions), create the job parameter in the job. See the jp_ItemSource parameter in the image below.

Job Parameter In Oracle Connector


Add the job parameter to the Custom SQL in the select Oracle Connector stage

Once the job parameter has been created, add it to the SQL statement of the job.

Job Parameter In SQL

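To illustrate, here is roughly what the custom SQL might look like with the job parameter embedded. The table, columns, and example parameter value are hypothetical, and the #jp_ItemSource# reference assumes the connector's job-parameter substitution syntax; with APT_OSL_PARAM_ESC_SQUOTE set, the single quotes in a parameter value such as 'WEB','STORE','CATALOG' survive into the generated SQL.

-- Custom SQL in the Oracle Connector stage (illustrative only)
SELECT itm.item_id,
       itm.item_name,
       itm.item_source
FROM   item_master itm
WHERE  itm.item_source IN (#jp_ItemSource#)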

Related References

IBM Knowledge Center > InfoSphere Information Server 11.5.0

Connecting to data sources > Databases > Oracle databases > Oracle connector

IBM Support > Limitation of the Parameter APT_OSL_PARAM_ESC_SQUOTE on Plugins on Parallel Canvas

IBM Knowledge Center > InfoSphere Information Server 11.5.0

InfoSphere DataStage and QualityStage > Reference > Parallel Job Reference > Environment Variables > Miscellaneous > APT_OSL_PARAM_ESC_SQUOTE

 

OLTP vs Data Warehousing


Database

OLTP Versus Data Warehousing

I've tried to explain the difference between OLTP systems and a Data Warehouse to my managers many times, as I've worked at a hospital as a Data Warehouse Manager / data analyst for many years. Why was the list that came from the operational applications different from the one that came from the Data Warehouse? Why couldn't I just get a list of patients that are lying in the hospital right now from the Data Warehouse? So I explained, and explained again, and explained to another manager, and another. You get the picture.
In this article I will explain this very same thing to you, so you know how to explain it to your manager. Or, if you are a manager, you might understand what your data analyst can and cannot give you.

OLTP

OLTP stands for Online Transactional Processing. In other words: getting your data directly from the operational systems to make reports. An operational system is a system that is used for the day-to-day processes.
For example: when a patient checks in, his or her information gets entered into a Patient Information System. The doctor puts scheduled tests, a diagnosis, and a treatment plan in there as well. Doctors, nurses, and other people working with patients use this system on a daily basis to enter and get detailed information on their patients.
The way the data is stored within operational systems is such that the data can be used efficiently by the people working directly on the product, or with the patient in this case.

Data Warehousing

A Data Warehouse is a big database that fills itself with data from operational systems. It is used solely for reporting and analytical purposes. No one uses this data for day-to-day operations. The beauty of a Data Warehouse is, among other things, that you can combine the data from the different operational systems. You can actually combine the number of patients in a department with the number of nurses, for example. You can see how far a doctor is behind schedule and find the cause of that by looking at the patients. Does he run late with elderly patients? Is there a particular diagnosis that takes more time? Or does he just oversleep a lot? You can use this information to look at the past, see trends, and so plan for the future.

The difference between OLTP and Data Warehousing

This is how a Data Warehouse works:

How a Data Warehouse works


The data gets entered into the operational systems. Then the ETL processes Extract this data from those systems, Transform the data so it will fit neatly into the Data Warehouse, and then Load it into the Data Warehouse. After that, reports are built with a reporting tool from the data that lies in the Data Warehouse.
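As a rough illustration (with hypothetical staging and warehouse tables), a single ETL load step might look something like this in SQL:

-- Extract: read rows landed from the operational system (a staging table)
-- Transform: translate natural keys to surrogate keys and aggregate to the reporting grain
-- Load: insert the result into the Data Warehouse fact table
INSERT INTO dw_patient_visit_fact (visit_date_key, department_key, patient_count)
SELECT d.date_key,
       dep.department_key,
       COUNT(*)
FROM   stg_patient_visits v
JOIN   dw_date_dim       d   ON d.calendar_date     = v.visit_date
JOIN   dw_department_dim dep ON dep.department_code = v.department_code
GROUP BY d.date_key, dep.department_key;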

This is how OLTP works:

How OLTP works


Reports are made directly from the data inside the database of the operational systems. Some operational systems come with their own reporting tool, but you can always use a standalone reporting tool to make reports from the operational databases.

Pros and Cons

Data Warehousing

Pros:

  • There is no strain on the operational systems during business hours
    • As you can schedule the ETL processes to run during the hours when the fewest people are using the operational system, you won't disturb the operational processes. And when you need to run a large query, the operational systems won't be affected, as you are working directly on the Data Warehouse database.
  • Data from different systems can be combined
    • It is possible to combine finance and productivity data, for example, because the ETL process transforms the data so it can be combined.
  • Data is optimized for making queries and reports
    • You use different data in reports than you use on a day-to-day basis. A Data Warehouse is built for this. For instance, most Data Warehouses have a separate date table where the weekday, day, month, and year are saved. You can make a query to derive the weekday from a date, but that takes processing time. By using a separate table like this you'll save time and decrease the strain on the database (see the sketch after this list).
  • Data is saved longer than in the source systems
    • The source systems need to have their old records deleted when they are no longer used in the day-to-day operations, so they get deleted to gain performance.
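To illustrate that last point about the date table, compare deriving the weekday on the fly with reading it pre-computed from a date dimension (hypothetical tables and columns; date-function syntax varies by database):

-- Deriving the weekday from a date at query time (costs processing on every run)
SELECT EXTRACT(DOW FROM visit_date) AS weekday, COUNT(*) AS visit_count
FROM   operational_visits
GROUP BY EXTRACT(DOW FROM visit_date);

-- Reading the weekday pre-computed from the warehouse date dimension
SELECT d.weekday_name, COUNT(*) AS visit_count
FROM   visit_fact f
JOIN   date_dim   d ON d.date_key = f.visit_date_key
GROUP BY d.weekday_name;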

Cons:

  • You always look at the past
    • A Data Warehouse is updated once a night, or even just once a week. That means that you never have the latest data. Staying with the hospital example: you never know how many patients are in the hospital right now, or which surgeon didn't show up on time this morning.
  • You don’t have all the data
    • A Data Warehouse is built for discovering trends, showing the big picture. The little details, the ones not used in trends, get discarded during the ETL process.
  • Data isn’t the same as the data in the source systems
    • Because the data is older than that of the source systems, it will always be a little different. But also because of the Transformation step in the ETL process, the data will be a little different. It doesn't mean one or the other is wrong; it's just a different way of looking at the data. For example: the Data Warehouse at the hospital excluded all transactions that were marked as cancelled. If you try to get the same reports from both systems, and don't exclude the cancelled transactions in the source system, you'll get different results.

Online Transactional Processing (OLTP)

Pros

  • You get real time data
    • If someone is entering a new record now, you’ll see it right away in your report. No delays.
  • You’ve got all the details
    • You have access to all the details that the employees have entered into the system. No grouping, no skipping records, just all the raw data that’s available.

Cons

  • You are putting strain on an application during business hours.
    • When you are making a large query, you can take processing space that would otherwise be available to the people that need to work with this system for their day to day operations. And if you make an error, by for instance forgetting to put a date filter on your query, you could even bring the system down so no one can use it anymore.
  • You can’t compare the data with data from other sources.
    • Even when the systems are similar. Like an HR system and a payroll system that use each other to work. Data is always going to be different because it is granulated on a different level, or not all data is relevant for both systems.
  • You don’t have access to old data
    • To keep the applications at peak performance, old data, that’s irrelevant to day to day operations is deleted.
  • Data is optimized to suit day to day operations
    • And not for report making. This means you’ll have to get creative with your queries to get the data you need.

So what method should you use?

That all depends on what you need at that moment. If you need detailed information about things that are happening now, you should use OLTP.
If you are looking for trends, or insights on a higher level, you should use a Data Warehouse.


SFDC – Using a date literal in a where clause

Salesforce Connector

I found working with date literals in InfoSphere SFDC Connector SOQL to be counterintuitive, at least compared to how I normally use SQL.  I spent a little time running trials in Workbench before I finally locked on to the where-clause criteria date pattern.  So, here is a quick example.

SOQL DATE String Literals Where Clause Rules

Basically, the date pattern is straightforward. The basic rules for a SOQL where clause are:

  • No quotes
  • No functions
  • No casting functions or casting syntax in the where clause

Example SOQL DATE String Literals

So, here are a couple of date string literal examples in SOQL:

  • 1901-01-01
  • 2016-01-31
  • 9999-10-31

Example SOQL with Date String Literal Where Clause

 

Select
    t.id,
    t.Name,
    t.Target_Date__c,
    t.User_Active__c
From Target_and_Segmentation__c t
where t.Target_Date__c > 2014-10-31

 

Related References

Salesforce Developer Documentation

Home, Developer Documentation, Force.com SOQL and SOSL Reference

https://developer.salesforce.com/docs/atlas.en-us.soql_sosl.meta/soql_sosl/sforce_api_calls_soql_select_dateformats.htm

Salesforce Workbench

Home, Technical Library, Workbench

 

InfoSphere / DataStage – What are the supported connector stages for dashDB?

dashDB


In a recent discussion, the question came up concerning which InfoSphere DataStage connectors and/or stages are supported by IBM for dashDB.  So, it seems appropriate to share the insight gained from having the question answered.

Which DataStage connectors and/or stages are supported for dashDB?

You have three choices of connector; which best meets your needs depends on the nature of your environment and the configuration choices that have been applied:

  1. The DB2 Connector Stage
  2. The JDBC Connector stage
  3. The ODBC Stage

Related References

Connecting to IBM dashDB

InfoSphere Information Server, InfoSphere Information Server 11.5.0, Information Server on Cloud offerings, Connecting to other systems, Connecting to IBM dashDB

DB2 connector

InfoSphere Information Server, InfoSphere Information Server 11.5.0, Connecting to data sources, Databases, IBM DB2 databases, DB2 connector

ODBC stage

InfoSphere Information Server, InfoSphere Information Server 11.5.0, Connecting to data sources, Older stages for connectivity, ODBC stage

JDBC data sources

InfoSphere Information Server, InfoSphere Information Server 11.5.0, Connecting to data sources, Multiple data sources, JDBC data sources

What is the convert function in Datastage?

Algorithm


 


In its simplest form, the Convert function in InfoSphere DataStage is a string replacement operation.  Convert can be used to replace a specific character, a list of characters, or a Unicode character (e.g., Thumbs Up Sign or Grinning Face).

Convert Syntax

Convert('<<Value to be replaced>>', '<<Replacement value>>', <<Input field>>)

Using the Convert Function to remove a list of Characters

This example handles/converts special characters in a Transformer stage that can cause issues in XML processing and in certain databases.

Convert a list of General Characters

Convert(";:?\+&,*`#'$()|^~@{}[]%!", "", TrimLeadingTrailing(Lnk_In.Description))

Convert Decimal and Double Quotes

Convert(' " . ', '', Lnk_In.Description)

Convert Char(0)

This example replaces Char(0) with nothing, essentially removing it as padding and/or space.

Convert(Char(0), '', Lnk_In.Description)

 

Related References

String functions

InfoSphere Information Server, InfoSphere Information Server 11.5.0, InfoSphere DataStage and QualityStage, Developing parallel jobs, Parallel transform functions, String functions

Data Modeling – Fact Table Effective Practices

Database Table


Here are a few guidelines for modeling and designing fact tables; a minimal DDL sketch follows the list.

Fact Table Effective Practices

  • The table naming convention should identify it as a fact table. For example:
    • Suffix Pattern:
      • <<TableName>>_Fact
      • <<TableName>>_F
    • Prefix Pattern:
      • FACT_<<TableName>>
      • F_<<TableName>>
  • Must contain a temporal dimension surrogate key (e.g. date dimension)
  • Measures should be nullable – this has an impact on aggregate functions (SUM, COUNT, MIN, MAX, AVG, etc.)
  • Dimension surrogate keys (srky) should have a foreign key (FK) constraint
  • Do not place the dimension processing in the fact jobs
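Here is a minimal DDL sketch of these practices (hypothetical table, columns, and data types; adapt to your database and model):

-- Fact table named with a suffix pattern, carrying a date-dimension surrogate key,
-- nullable measures, and FK constraints on the dimension surrogate keys
CREATE TABLE sales_fact (
    date_dim_srky     INTEGER       NOT NULL,
    product_dim_srky  INTEGER       NOT NULL,
    sales_amount      DECIMAL(18,2) NULL,
    sales_quantity    INTEGER       NULL,
    CONSTRAINT fk_sales_fact_date    FOREIGN KEY (date_dim_srky)
        REFERENCES date_dim (date_dim_srky),
    CONSTRAINT fk_sales_fact_product FOREIGN KEY (product_dim_srky)
        REFERENCES product_dim (product_dim_srky)
);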


Data Modeling – Dimension Table Effective Practices

Database Table


I've had these notes lying around for a while, so I thought I'd consolidate them here.  Here are a few guidelines to ensure the quality of your dimension table structures; a minimal DDL sketch follows the list.

Dimension Table Effective Practices

  • The table naming convention should identify it as a dimension table. For example:
    • Suffix Pattern:
      • <<TableName>>_Dim
      • <<TableName>>_D
    • Prefix Pattern:
      • Dim_<<TableName>>
      • D_<<TableName>>
  • Have a Primary Key (PK) assigned on the table surrogate key
  • Audit fields – Type 1 dimensions should:
    • Have a Created Date timestamp – when the record was initially created
    • Have a Last Update timestamp – when the record was last updated
  • Job flow: do not place the dimension processing in the fact jobs
  • Every dimension should have a zero (0), 'Unknown', row
  • Fields should be 'NOT NULL', replacing nulls with a zero (0) for numeric and integer type fields or a space (' ') for character type fields
  • Keep dimension processing outside of the fact jobs
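Here is a minimal DDL sketch of these practices (hypothetical table, columns, and data types; adapt to your database and model):

-- Dimension table named with a suffix pattern, with a surrogate primary key,
-- audit timestamps, NOT NULL columns defaulted to space, and a zero (0) 'Unknown' row
CREATE TABLE customer_dim (
    customer_dim_srky  INTEGER      NOT NULL,
    customer_code      VARCHAR(30)  NOT NULL DEFAULT ' ',
    customer_name      VARCHAR(100) NOT NULL DEFAULT ' ',
    created_date       TIMESTAMP    NOT NULL,
    last_update_date   TIMESTAMP    NOT NULL,
    CONSTRAINT pk_customer_dim PRIMARY KEY (customer_dim_srky)
);

INSERT INTO customer_dim (customer_dim_srky, customer_code, customer_name, created_date, last_update_date)
VALUES (0, ' ', 'Unknown', CURRENT_TIMESTAMP, CURRENT_TIMESTAMP);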


Datastage – When checking operator: Operator of type “APT_TSortOperator”: will partition despite the preserve-partitioning flag on the data set on input port 0

APT_TSortOperator Warning


The APT_TSortOperator warning happens when there is a conflict in the partitioning behavior between stages, usually because the successor (downstream) stage has its 'Partitioning / Collecting' and 'Sorting' properties set in a way that conflicts with the predecessor (upstream) stage's properties, which are set to preserve partitioning.  This can occur when the predecessor stage has the 'Preserve Partitioning' property set to:

  • ‘Default (Propagate)’
  • ‘Propagate’, or
  • ‘Set’
Preserve Partitioning Property - list


Message ID

  • IIS-DSEE-TFOR-00074

Message Text

  • <<Link Name Where Warning Occurred>>: When checking operator: Operator of type “APT_TSortOperator”: will partition despite the preserve-partitioning flag on the data set on input port 0.

Warning Fixes

  • First, verify that the partitioning behaviors of both stages are correct
  • If so, set the predecessor ‘Preserve Partitioning’ property to “Clear”
  • If not, then correct the partitioning behavior of the stage which is in error

Clear Partitioning Property Screenshot

Preserve Partitioning Property - Set To Clear


Infosphere DataStage – Boolean Handling for Netezza

Datastage Director Message - Numeric string expected


 

Beware when you see this message when working with Booleans in DataStage; the message displays as informational (at least it did for me), not as a warning or an error.  Even though it seems innocuous, what it meant for my job was that the Boolean ('true' / 'false') was not being interpreted, and everything posted as 'false'.

In DataStage, the Netezza 'Boolean' field/data type maps to the 'Bit' SQL type, which expects a numeric input of zero (0) or one (1).  So, my solution (once I detected the problem during unit testing) was to put Transformer stage logic in place to convert the Boolean input to the expected numeric value.

 

Netezza to Datastage Data Type Mapping

Netezza data type | InfoSphere DataStage data type (SQL type) | Expected input value
BOOLEAN | Bit | 0 or 1 (1 = true, 0 = false)

 

Transformer Stage Boolean Handling Logic

A Netezza Boolean field can store true values, false values, and null, so some thought should be given to your desired data outcome for nulls.

This first example sets nulls to a specific value, which can support a specific business rule for null handling and also provides null handling for non-nullable fields.  Here we are setting nulls to the numeric value for 'true' and all other non-true inputs to 'false'.

If IsNull(Lnk_Src_In.USER_ACTIVE) Then 1 Else If Lnk_Src_In.USER_ACTIVE = 'true' Then 1 Else 0

This second set of examples lets the Else value handle nulls; if your logic direction matches your desired outcome, this still provides null handling for non-nullable fields.

  • If Lnk_Src_In.USER_ACTIVE = 'true' Then 1 Else 0

  • If Lnk_Src_In.USER_ACTIVE = 'false' Then 0 Else 1

Director Log Message

Message ID

  • IIS-DSEE-TBLD-00008

Message Text

  • <<Link Name Where Message Occurred>>: Numeric string expected. Use default value.

Or something like this:

  • <<Link Name Where Message Occurred>>: Numeric string expected for input column ‘<<Field Name Here>>‘. Use default value.

Related References

Boolean

PureData System for Analytics, PureData System for Analytics 7.2.1, IBM Netezza user-defined functions, UDX data types reference information, Supported data types, Boolean

https://www.ibm.com/support/knowledgecenter/en/SSULQD_7.2.1/com.ibm.nz.udf.doc/r_udf_boolean_datatype.html

Data types and aliases

PureData System for Analytics, PureData System for Analytics 7.2.1, IBM Netezza stored procedures, NZPLSQL statements and grammar, Variables and constants, Data types and aliases

https://www.ibm.com/support/knowledgecenter/en/SSULQD_7.2.1/com.ibm.nz.sproc.doc/c_sproc_data_types_aliases.html

Logical data types

PureData System for Analytics, PureData System for Analytics 7.2.1, IBM Netezza database user documentation, Netezza SQL basics, Data types, Logical data types

https://www.ibm.com/support/knowledgecenter/en/SSULQD_7.2.1/com.ibm.nz.dbu.doc/r_dbuser_data_types_logical.html

Data type conversions from Netezza to DataStage

InfoSphere Information Server, InfoSphere Information Server 11.5.0, Connecting to data sources, Databases, Netezza Performance Server, Netezza connector, Designing jobs by using the Netezza connector, Defining a Netezza connector job, Data type conversions, Data type conversions from Netezza to DataStage

https://www.ibm.com/support/knowledgecenter/en/SSZJPZ_11.5.0/com.ibm.swg.im.iis.conn.netezza.use.doc/topics/nzcc_mappingdatatypes.html

InfoSphere DataStage – Operations Manual Template

Documentation


When projects transition to production, the receiving team needs some transition documentation to help with knowledge transfer and to guide them while they get accustomed to operating the application and can form their own documentation.  This template is a shell, which usually provides enough detail, with some updating, to serve as a quick reference for the application.  Also, having a template can be a real time saver, as opposed to writing one from scratch.


InfoSphere DataStage – DataStage Parallel Job Peer Code Review Checklist Template

SDLC Development Phase


Peer code review happens during the development phase and focuses on the overall quality of, and compliance to standards of, code and configuration artifacts. However, the hard part of a peer code review isn't performing the review, but rather achieving consistency and thoroughness in the review.   This is where a checklist can contribute significantly, providing a list of things to check and a relative weight for the findings.  I hope this template assists with your DataStage job review process.

 

Infosphere Datastage – Standard Practice- Sequence Naming Conventions

Standards


Standard practices help you and others understand your work.  This can be very important when working on large teams, working across team boundaries, or when large, complex sets of processes and objects are involved.  When you consider the importance of naming conventions coupled with standard practice, the benefit should be obvious, but often teams don't execute or document their conventions.  So, these standard naming conventions may help when none exist or when you need to assemble your own.

<<SomeIdentifier>> should be replaced with the appropriate information

  1. Sequence Object Naming Conventions

Entity | Convention
Master Control Sequence (parent) | Master_<<Application>>_<<Application Job Stream Name>>_Seq
Sequence | <<Application>>_<<Job_Name>>_Seq

  2. Sequence Stage Naming Conventions

Entity | Convention
End Loop | EL_<<PrimaryFunction>>
Error Handler | EH_<<PrimaryFunction>>
Execute Command | EC_<<PrimaryFunction>>
Job Activity | Job_<<PrimaryFunction>>
Nested Condition | NC_<<PrimaryFunction>>
Notify | Notify_<<PrimaryFunction>>
Routine Activity | Rtn_<<PrimaryFunction>>
Sequence Links (messages) | Msg_<<Number or Description>>
Sequence Links (non-messages) | Lnk_<<Number or Description>>
Sequencer | Seqr_<<Number or Description>>
Sequencer (All) | SeqAll_<<Identifier>>
Sequencer (Any) | SeqAny_<<Identifier>>
Start Loop | SL_<<PrimaryFunction>>
Terminator Activity | TA_<<PrimaryFunction>>
User Variables | UV_<<PrimaryPurpose>>
Wait For File | WFF_<<PrimaryFunction>>

 

 


InfoSphere DataStage – Ways to Create a Datastage Parameter Set

Parameter Sets


There are three primary ways to create a parameter set, which is a different practice from adding 'User Defined' variables in InfoSphere DataStage Administrator. The ways to create a parameter set are:

  • Create a parameter set from a data connection stage
  • Create a Parameter Set from the navigation of DataStage designer, and
  • Create a Parameter Set from a job

Create a parameter set from Data Connection Stage

This is used to create parameter sets for database connection parameters.

To create a new Parameter Set from a Data Connection

  • Select: File > New > Other and select “Data Connection”
  • Complete the Data Connection stage properties, then save the stage
  • Open the Connection stage and navigate to the “Parameters” tab
  • Then, click on the “Associated Parameter Set” button and choose the “Create & Attach” menu item
  • This will launch a dialog
  • Fill out the appropriate information on the General tab and then proceed to the Parameters tab
  • In the Parameters tab, enter the parameters you wish to include in this Parameter Set
  • On the Values tab, specify a Value File name (please follow naming convention standards to prevent rework and other problems). This is the name of the file that will automatically be created on the Engine tier. This tab also allows you to view/edit values located in the value file.
  • Click OK to save the Parameter Set.

Create a Parameter Set from the navigation of DataStage designer

This is, perhaps, the more traditional way of creating a parameter set.

To create a new Parameter Set

  • Select: File > New > Other and select “Parameter Set”
  • This will launch a dialog
  • Fill out the appropriate information on the General tab and then proceed to the Parameters tab
  • In the Parameters tab, enter the parameters you wish to include in this Parameter Set. Note: existing environment variables can also be added.
  • On the Values tab, specify a Value File name (please follow naming convention standards to prevent rework and other problems). This is the name of the file that will automatically be created on the Engine tier. This tab also allows you to view/edit values located in the value file.
  • Click OK to save the Parameter Set.

Create a Parameter Set from a job

This approach is, perhaps, less traditional, but it is equally effective if you find yourself creating additional jobs and now need to share the same parameters.  It is a quick and easy way to generate a parameter set from an existing job.

To create a new Parameter Set from a job

  • Open the job that you want to create a parameter set for.
  • Click “Edit > Job Properties” to open the “Job Properties” window.
  • Click the “Parameters” tab.
  • Press and hold the Ctrl key, then select the parameters that you want to include in the parameter set.
  • With your parameters highlighted, click “Create Parameter Set”.  The Parameter Set window opens.
    • Enter a name and short description for your parameter set.
    • Click the “Parameters” tab; the parameters that you selected are listed.
    • Click the “Values” tab.
    • Enter a name in the Value File name field, then press Enter.  The value for each of your parameters is automatically populated with the path name that you entered.
    • If a default value is not already set, enter a value for each parameter. For example, if the variable is a Pathname type, enter a default path name.
    • Click “OK” to close the Parameter Set window.
    • In the Save Parameter Set As window, select the folder where you want to save your parameter set and click Save. When prompted to replace the selected parameters with the parameter set, click Yes.
  • Click “OK” to close the Job Properties window.


Infosphere Datastage – Data type conversions from Netezza to DataStage

IBM Infosphere Information Server (IIS)


 

Some recent research to eliminate some ETL data conversion issues made me want to enhance the documentation provided by IBM, to avoid repeating the research.

Netezza data types and their equivalent InfoSphere DataStage data types

Netezza data type | InfoSphere DataStage data type (SQL type) | Notes
BYTEINT | TinyInt |
SMALLINT | SmallInt |
INT | Integer |
BIGINT | BigInt |
NUMERIC(p, s) | Numeric, Decimal, Double | Decimal and Double are aliases of Numeric
FLOAT(p) | Float |
REAL | Real |
DOUBLE PRECISION | Double |
CHAR(n) | Char |
VARCHAR(n) | VarChar |
NCHAR(n) | NChar |
NVARCHAR(n) | NVarChar |
BOOLEAN | Bit |
DATE | Date |
TIME | Time |
TIME WITH TIME ZONE | VarChar | The value of time with time zone will be returned without the time zone information.
TIMESTAMP | TimeStamp |
INTERVAL | VarChar | You cannot load the interval data type from an external table.
ROWID | BigInt |
TRANSACTION ID | BigInt |
DATASLICE | Integer |

Related References

Data type conversions from Netezza to DataStage

InfoSphere Information Server, InfoSphere Information Server 11.5.0, Connecting to data sources, Databases, Netezza Performance Server, Netezza connector, Designing jobs by using the Netezza connector, Defining a Netezza connector job, Data type conversions, Data type conversions from Netezza to DataStage

Netezza / PureData – What is the maximum length of CHAR, VARCHAR, and NVARCHAR fields

IBM Netezza / IBM PureData for Analytics


 

 

During a recent project, I had reason to investigate the maximum character field lengths for Netezza, to be sure the data being ETL'ed from the source would fit.  The limits on Netezza character fields are not as large as in some other databases, so I thought I would document them in a quick post for future reference.   Knowing these limits will help with your ETL design and construction, so you know where you may need to truncate your field input data.  A short DDL illustration follows the table below.

 

Data type | Alias names | InfoSphere DataStage data type (SQL type) | Maximum length | Notes
CHAR | CHARACTER, CHAR(n), CHARACTER(n) | Char | 64,000 | Fixed-length character string, blank padded to length n. If you do not specify n, the default is an unsized CHAR value.
VARCHAR | CHARACTER VARYING, VARCHAR(n), CHARACTER VARYING(n), CHAR VARYING(n) | VarChar, LongVarChar | 64,000 | Variable-length character string to a maximum length of n. If you do not specify n, the default is an unsized VARCHAR value. There is no blank padding, and the value is stored as entered.
NCHAR | NATIONAL CHARACTER, NATIONAL CHAR(n), NCHAR(size) | NChar, Char | 16,000 | Fixed-length character string, blank padded to length n. If you do not specify n, the default is an unsized NCHAR value.
NVARCHAR | NATIONAL CHARACTER VARYING, NATIONAL CHAR VARYING, NATIONAL CHAR VARYING(n), NATIONAL CHARACTER VARYING(n), NVARCHAR(n) | VarChar, NVarChar, LongVarChar, LongNVarChar | 16,000 | Variable-length character string to a maximum length of n. If you do not specify n, the default is an unsized NVARCHAR value.
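As a quick, hedged illustration of these limits, the following Netezza-style DDL (hypothetical table and column names) declares columns at the documented maximums; they are shown in separate tables so that overall row-size limits are not a concern:

-- Character columns declared at their maximum lengths (illustrative only)
CREATE TABLE demo_char_limits     ( description_c   CHAR(64000)     );
CREATE TABLE demo_varchar_limits  ( description_vc  VARCHAR(64000)  );
CREATE TABLE demo_nchar_limits    ( description_nc  NCHAR(16000)    );
CREATE TABLE demo_nvarchar_limits ( description_nvc NVARCHAR(16000) );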

Related References

Data types and aliases

PureData System for Analytics, PureData System for Analytics 7.2.1, IBM Netezza stored procedures, NZPLSQL statements and grammar, Variables and constants, Data types and aliases

Data type conversions from DataStage to Netezza

InfoSphere Information Server, InfoSphere Information Server 11.5.0, Connecting to data sources, Databases, Netezza Performance Server, Netezza connector, Designing jobs by using the Netezza connector, Defining a Netezza connector job, Data type conversions, Data type conversions from DataStage to Netezza

*DataStage*DSR_PROJECT (Action=8); check DataStage is set up correctly in project

Error


 

Basically, the Action=8 error, which I normally see when opening the DataStage Director client application, means that one or more of the RT_LOG files have become corrupted.  Usually, this problem occurs in relation to disk space issues, although there can be other causes.

Error Message

Error calling subroutine: *DataStage*DSR_PROJECT (Action=8); check DataStage is set up correctly in project

(Subroutine failed to complete successfully (30107))

The Cleanup approach

The cleanup process really consists of three primary steps:

  • Free disk space
  • Restart application process
  • And, fix corrupted log

Free Disk Space

This can consist of:

  • Cleaning ‘/tmp’ Space
  • Removing any large unnecessary files
  • Enlarging ‘/tmp’ space allocation
  • Adding additional disk space, if necessary

Restart Application Processes

Once you have freed up disk space, restarting the VM/server is recommended. However, if that is not a realistic option, then at least restart the InfoSphere DataStage engine to ensure the newly freed space is registering with the applications and that everything is restarted and running.

Fix Corrupted logs

Perhaps the cleanest way to reset all the logs is to perform a 'Multiple Job Compile'.  Running the jobs will also overwrite the logs, but that is a little more hit and miss if not all the jobs are in job streams/batches that can be run at this time.  The logs can also be overwritten manually by compiling a job or performing a reset; the trick with a manual reset is that you have to know which job to reset, so this could take a while to get them all. The logs can also be manually dropped and recreated, but I recommend that approach only as a last resort.

 

 

*DataStage*DSR_SELECT (Action=3); check DataStage is set up correctly in project

Error


Having encountered this DataStage client error in Linux a few times recently, I thought I would document the solution, which has worked for me.

Error Message:

Error calling subroutine: *DataStage*DSR_SELECT (Action=3); check DataStage is set up correctly in project

(Subroutine failed to complete successfully (30107))

Probable Cause of Error

  • NodeAgents has stopped running
  • Insufficient /tmp disk space

Triage Approach

To fix this error in Linux:

  • Ensure disk space is available; you may want to clean up the /tmp directory of any excess, non-required files.
  • Start the NodeAgents.sh, if it is not running

Command to verify Node Agent is running

ps -ef | grep java | grep Agent

 

Command to Start Node Agent

This example command assumes the shell script is in its normal location; if not, you will need to adjust the path.

/opt/IBM/InformationServer/ASBNode/bin/NodeAgents.sh start

Node Agent Logs

These logs may be helpful:

  • asbagent_startup.err
  • asbagent_startup.out

Node Agent Logs Location

This command will get you to where the logs are normally located:

cd /opt/IBM/InformationServer/ASBNode/