How the IBM Common SQL Engine (CSE) Improves DB2

Common SQL Engine (CSE)

Today, newfound efficiencies and innovation are key to the success of any business, whether small, medium, or large. In the rapidly evolving field of data analytics, innovative approaches to handling data are particularly important, since data is among the most valuable resources a business has. The IBM Common SQL Engine delivers application and query compatibility that helps companies turn their data into actionable insights and use their databases without constraints.

But, is this really important?

Yes. Many businesses have accumulated large amounts of data over the years. This data now resides in higher volumes, in greater variety, and in more locations throughout an enterprise, both on-premises and in the cloud. Ideally, this data should be a huge advantage, providing enterprises with actionable insights. Often, though, this doesn’t happen.

IBM Hybrid Data Management

With such a massive volume of complex legacy data, many organizations struggle to decide what to do with it, or where to start. Migrating all that data into new systems is simply a non-starter. As a solution, enterprises are turning to IBM Db2, a hybrid data management approach that brings data and analytics together. IBM Db2 hybrid data management allows flexible cloud and on-premises deployment of data.

However, such flexibility typically requires organizations to rewrite or restructure the queries and applications that use the diverse, ever-changing data. These changes may even require you to license new software, which is costly and often unfeasible. The Common SQL Engine (CSE) is meant to bridge this gap.

How Is the IBM Common SQL Engine Positioning Db2 for the Future?

The IBM Common SQL Engine inserts a single layer of data abstraction at the very data source. This means that, instead of migrating the data all at once, you can now apply data analytics wherever the data resides – whether on private, public or hybrid cloud – by using the Common SQL Engine as a bridge.

IBM’s Common SQL Engine provides portability and consistency of SQL commands, meaning that the SQL is functionally portable across multiple implementations. It allows workloads to move to the cloud and supports multiplatform integration and configurations regardless of programming language.

The Common SQL Engine is the heart of query processing and the foundation of application compatibility, but it does much more.

Its compatibility extends beyond data analytics applications to include security, governance, data management, and other functionality as well.

How does this improve the quality, flexibility, and portability of Db2?

By allowing integration across multiple platforms, workloads, and programming languages, the Common SQL Engine ultimately leads to a “data without limits” environment for the Db2 hybrid data management family through:

  1. Query and application compatibility

The Common SQL Engine (CSE) ensures that users can write a query and be confident that it will work across the Db2 hybrid data management family of offerings. With the CSE, you can change your data infrastructure and location, on-cloud or on-premises, without having to worry about license costs and application compatibility.

  2. Data virtualization and integration

The Common SQL Engine has a built-in data virtualization service that lets you access your data across all of your sources. This service spans the Db2 family of offerings, including IBM Db2, IBM Db2 Warehouse, and IBM Db2 Big SQL, among others.

The service also applies to IBM Integrated Analytics System, Teradata, Oracle, PureData, and Microsoft SQL Server. In addition, you can work with open-source solutions such as Hive and cloud sources such as Amazon Redshift.

By allowing users to pull data from Db2 data stores and integrate it with data from non-IBM stores using a single query, the Common SQL Engine places Db2 in a strong position compared to other data stores.

  3. Flexible licensing

Licensing is one of the hardest nuts to crack, especially for organizations that rely on technologies such as the cloud to deliver their services. While application compatibility and data integration save you time, flexible licensing saves you money.

IBM’s Common SQL Engine allows flexible licensing, meaning that you can purchase one license model and deploy it wherever needed, or as your data architecture evolves. Using IBM’s FlexPoint licensing, you can purchase FlexPoints and use them across all Db2 data management offerings, which keeps licensing in one place.

Flexible licensing not only simplifies the adoption and exchange of platform capabilities, it also makes your business more agile. Your data managers can access the tools they need on the fly, without going through a slow and tedious procurement process.

IBM Db2 Data Management Family Is Supported by Common SQL Engine (CSE)

IBM Db2 is a family of customizable, deployable databases that allows enterprises to leverage existing investments. IBM Db2 allows businesses to use any type of data, from either a structured or an unstructured database (or data warehouse). It provides a data foundation with industry-leading data compression, on-premises and cloud deployment options, modern data security, robust performance for mixed workloads, and the ability to adjust and scale without redesigning.

The IBM Db2 family enables businesses to adapt, scale quickly, and remain competitive without compromising security, risk levels, or privacy. It features:

  • Always-on availability
  • Deployment flexibility: on-premises, scale-on-demand, and private or cloud deployments
  • Compression and performance
  • Embedded IoT technology that allows businesses to act quickly

Some of the Db2 family offerings supported by the Common SQL Engine include:

  • Db2 Database
  • Db2 Hosted
  • Db2 Big SQL
  • Db2 on Cloud
  • Db2 Warehouse
  • Db2 Warehouse on Cloud
  • IBM Integrated Analytics System (IIAS)

Db2 Family Offerings and Beyond

Since the Common SQL Engine focuses mainly on data federation and portability, other sources, including non-IBM databases, can also plug into the engine for SQL processing. These other offerings include:

  • Watson Data Platform
  • Oracle
  • Hadoop
  • Microsoft SQL Server
  • Teradata
  • Hive

Conclusion

The IBM Common SQL Engine allows organizations to make full use of data analytics to future-proof their business while remaining agile and competitive. Besides the benefits of the robust tools woven into the CSE, the engine is well positioned for analytics and machine learning. Data processing is claimed to run 2X to 5X faster. The IBM Common SQL Engine adds important capabilities to Db2, including freedom of location, freedom of use, and freedom of assembly.

An Overview of DB2 Federation

DB2 Federation

Data analytics has changed: data is no longer manageable in relational databases alone. Data flows in from various sources in differing formats, which means it is not practical to store all of it in the same repository. Some data is best suited to relational databases, some to Apache Hadoop, and some to NoSQL databases.

During analysis, much of the time goes into bringing the distributed data together instead of obtaining insights. Db2 federation comes to the rescue of data analysts. The federation concept in Db2 eliminates the need to consolidate data from different repositories into one and reduces the hassle of getting insights.

What is DB2 Federation?

DB2 federation is a data integration technology that permits remote database objects to be accessed as local DB2 database objects. This technology connects multiple databases and makes them appear like one database.

How does DB2 federation work?

Federation allows you to access all of your data on multiple distributed databases using a single query. When implemented in an organization, this technology can be used to access data in any of the organization’s Db2 databases, whether local or in the cloud.
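As a rough sketch of what this can look like in practice, the statements below assume that federation is already enabled and that a server definition for the remote database already exists. The names REMOTE_SRV, SALES.ORDERS, APP.REMOTE_ORDERS, and APP.CUSTOMERS are hypothetical and used only for illustration. A nickname makes the remote table addressable as a local object, and a single query can then join it with local data.

```sql
-- Assumption: a federated server definition named REMOTE_SRV (hypothetical)
-- already points at the remote database.
CREATE NICKNAME APP.REMOTE_ORDERS FOR REMOTE_SRV.SALES.ORDERS;

-- One query spanning a local table and the remote table behind the nickname.
SELECT c.CUSTOMER_ID,
       c.CUSTOMER_NAME,
       COUNT(*) AS ORDER_COUNT
FROM APP.CUSTOMERS c
JOIN APP.REMOTE_ORDERS o
  ON o.CUSTOMER_ID = c.CUSTOMER_ID
GROUP BY c.CUSTOMER_ID, c.CUSTOMER_NAME;
```

The query itself does not need to know where the orders physically live, which is the point of the nickname.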

Why use DB2 federation?

So, why should you use federation? It brings data of all formats into one virtual source. With data retrieved from one virtual source, analyzing it becomes more cost-effective and efficient.

What are the primary use cases for DB2 federation?

Merging of various sources of data

DB2 federation facilitates consolidating data from local and cloud sources into one virtual data source. This avoids migrating data, which can be expensive and troublesome.

Increase the capacity of a repository beyond the fixed limits

Physical storage capacity is bound to have a limit, which is one reason an organization may have distributed its data across various repositories. With federation, the combined store is virtual and therefore is not bound by the limit of any single repository. This technology can help if a physical data store is running low on space.

Linking up to Db2 Warehouse on Cloud

People who use Db2 products can federate data from Db2 on Cloud and Db2 Warehouse on Cloud. This gives them a joint interface where they can access, add, query, and analyze data without going through complex ETL processes. Better still, no additional code is required to run these processes, which makes it easy for people with little technical know-how to use these products.

Split data across different servers

At times, you might choose to partition your data. With federation integration technology, partitioned data can be queried with a unified interface. Federation allows you to better balance your workloads, scale precise parts of an app, and create micro-services that work harmoniously.

In general, Db2 federation makes it easier to access data by bringing it together into a single virtual source, which saves cost and time. When you want to analyze data, you can get insights immediately instead of spending a lot of time querying individual repositories.

DataStage – Netezza Connector Action Column

Over the years I have occasionally used the action column feature; however, over the last month or so I have found myself using it quite a lot. This is especially true in relation to Netezza, and not just in relation to the change capture stage.

The first thing you need to know is that, if you want to avoid the ‘no action column found’ notice on the target stage, you need to ensure that the action column is defined as a single-character field, char(1). Otherwise, the Netezza connector stage will not recognize your field as an action column.

While most developers will commonly work with the action column feature in relation to the change capture stage, it can also be very useful if you have created a field from one or more inputs to tell you what behavior the row requires. I have found that this approach can be very useful and efficient under the right circumstances.

Example Pattern for Action Column Using Multiple Source Selects

Action column configuration example

Action Column Field Type

 Change Code Values Mapping To Action Column

  • Here’s a quick reference table mapping each change type code to the one-character action column value it needs to be translated into.

Change Code Type | Change Type Code | Action Column Value
Copy (Data Without Changes) | 0 | No value for this Change Type
Insert | 1 | I
Delete | 2 | D
Update | 3 | U

Example Transformer Stage, Derivation

  • Here is a quick transformer stage derivation example that takes advantage of the action column capabilities. If you haven’t already handled the removal of the copy rows, you may also want to add a constraint.
  • The combination I most frequently find myself using is the insert and update combination.
If Lnk_Out_To_Tfm.change_code = 1 Then 'I'
Else If Lnk_Out_To_Tfm.change_code = 2 Then 'D'
Else If Lnk_Out_To_Tfm.change_code = 3 Then 'U'
Else ''
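When the action value is derived in a source select rather than in a transformer, the same mapping can be sketched in SQL. This is only an illustration: STAGE_TABLE and CHANGE_CODE are hypothetical names, and the cast reflects the char(1) requirement described earlier.

```sql
-- Hypothetical source select: STAGE_TABLE and CHANGE_CODE are placeholder names.
SELECT s.*,
       CAST(CASE s.CHANGE_CODE
                 WHEN 1 THEN 'I'   -- insert
                 WHEN 2 THEN 'D'   -- delete
                 WHEN 3 THEN 'U'   -- update
            END AS CHAR(1)) AS ACTION_COLUMN
FROM STAGE_TABLE s
WHERE s.CHANGE_CODE <> 0;          -- skip copy rows, which need no action
```

Filtering out code 0 in the where clause plays the same role as the transformer constraint that removes copy rows.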

Related References

Home > InfoSphere Information Server 11.7.0 > InfoSphere DataStage and QualityStage > Developing parallel jobs > Introduction to InfoSphere DataStage Balanced Optimization > Job design considerations  > Specific considerations for the Netezza connector

Netezza / PureData – List of Views against a table

PureData Powered by Netezza

I have found myself using this simple but useful SQL several times in recent weeks to research different issues and to help with impact analysis. So, I thought I would post it while I’m thinking about it. It just gives a list of views using a table, which can be handy to know. This SQL is simple and could be converted to an equi-join. I used the like statement mostly because I sometimes want to know if there are other views of a similar nature in the same family (by naming convention) of tables.

Select All Fields From The _V_View

This is the simplest form of the SQL to list the views that use a table.

Select * from _v_view
where DEFINITION like '%<<TABLE_NAME>>%';

Select Minimal Fields From The _V_View

This is the version of the SQL that I normally use to list the views that use a table.

Select VIEWNAME, OWNER from _v_view
where DEFINITION like '%<<TABLE_NAME>>%';
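If the case of the table name inside the stored view text is uncertain, a case-insensitive variant may help. This is a small sketch along the same lines, assuming only the columns already used above; the ordering is an added convenience.

```sql
-- Case-insensitive variant; <<TABLE_NAME>> is the same placeholder as before.
Select VIEWNAME, OWNER
from _v_view
where UPPER(DEFINITION) like UPPER('%<<TABLE_NAME>>%')
order by OWNER, VIEWNAME;
```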

Netezza / Puredata – How to replace or trim CHR(0) (NULL) characters in a field

PureData Powered by Netezza

Occasionally, one runs into the problem of hidden field values breaking join criteria. I have had to clean up bad archive and conversion data with hidden characters several times over the last couple of weeks, so I thought I might as well capture this note for future use.

I tried the Replace command, which is the prevalent answer to this issue for Netezza on the web, but my client’s version does not support that command. So, I needed to use the Translate command instead. It took a couple of searches for the usual bad actors to find the character causing the issue, which on this day was chr(0). Here is a quick mockup of the command I used to solve this issue.

Example Select Statement

Here is a quick example select SQL to identify problem rows.

SELECT TRANSLATE(F.BLOGTYPE_CODE, CHR(0), '') AS BLOGTYPE_CODE, BT.BLOG_TYP_ID, LENGTH(BT.BLOG_TYP_ID) AS LNGTH_BT, LENGTH(F.BLOGTYPE_CODE) AS LNGTH_BLOGTYPE
FROM BLOGS_TBL F, BLOG_TYPES BT
WHERE TRANSLATE(F.BLOGTYPE_CODE, CHR(0), '') = BT.BLOG_TYP_ID AND LENGTH(BT.BLOG_TYP_ID) <> LENGTH(F.BLOGTYPE_CODE);

Example Update Statement

Here is a quick shell update statement to remove the Char(0) characters from the problem field.

Update <<Your Table Name>> A
Set <<Your Field Name>> = TRANSLATE(A.<<Your Field Name>>, CHR(0), '')
where length(A.<<Your Field Name>>) <> length(TRANSLATE(A.<<Your Field Name>>, CHR(0), '')) And <<Additional criteria>>;

SQL Server – how to know when a stored procedure ran last

Microsoft SQL Server 2017

This week I needed to know if a stored procedure was running when expected during our batch. So, here are a couple of quick SQL statements to answer the question:

When a Stored Procedure was run last

This version of the SQL gives the date of the last time the stored procedure was run:

select distinct top 1 s.last_execution_time
from sys.dm_exec_query_stats s
cross apply sys.dm_exec_query_plan (s.plan_handle) p
where object_name(p.objectid, db_id('<<DATABASE_NAME>>')) = '<<STORED_PROCEDURE_NAME>>'
order by s.last_execution_time desc

Get a list of when Stored Procedure has been run

This version of the SQL provides a list of the dates on which the stored procedure has been run:

select distinct s.last_execution_time
from sys.dm_exec_query_stats s
cross apply sys.dm_exec_query_plan (s.plan_handle) p
where object_name(p.objectid, db_id('<<DATABASE_NAME>>')) = '<<STORED_PROCEDURE_NAME>>'
order by s.last_execution_time desc
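A possible alternative, offered here as a sketch rather than the approach used above, is the sys.dm_exec_procedure_stats view, which keeps one row per cached stored procedure. Like the query stats view, it only reports on procedures whose plans are still in the plan cache, so a missing row does not prove the procedure never ran.

```sql
-- Same placeholders as above; returns the last run time and the run count
-- for a stored procedure while its plan remains cached.
select object_name(ps.object_id, ps.database_id) as procedure_name
     , ps.last_execution_time
     , ps.execution_count
from sys.dm_exec_procedure_stats ps
where ps.database_id = db_id('<<DATABASE_NAME>>')
  and object_name(ps.object_id, ps.database_id) = '<<STORED_PROCEDURE_NAME>>'
order by ps.last_execution_time desc;
```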

 

 

Netezza / PureData – How to Substring on a Character

PureData Powered by Netezza

I had a reason this week to perform a substring on a character in Netezza, something I have not had a need to do before. The process was not as straightforward as I would have thought, since the command is explained as a static position command, and the IBM documentation, honestly, wasn’t much help. Since text strings vary in length, having to provide a static position is not terribly useful in and of itself. So, we need to use an expression to make the substring command flexible and dynamic.

I did get it to work the way I needed, but it took two commands to make it happen:

  • The first was the 'instr' command, to identify the field and character I wanted to substring on: instr(<<FIELD_NAME>>,'~'). This provides the position number of the tilde (~).
  • The second was the 'substr' command, in which I embedded the 'instr' command: substr(<<FIELD_NAME>>,0,instr(<<FIELD_NAME>>,'~'))

This worked nicely for what I needed, which was to pick out a file name from the beginning of a string that was delimited with a tilde (~).

Substring on a Character Command Format

  • This format example uses zero (0) as the first positional argument of the substring command (the start) and the position of the first tilde (~) as the second positional argument (the length).

Select <<FIELD_NAME>>
, instr(<<FIELD_NAME>>,'~') as pos2
, substr(<<FIELD_NAME>>,0,instr(<<FIELD_NAME>>,'~')) as Results
From <<Table_Name>>
where <<Where_Clause>>;
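A more explicit variant, sketched here as an alternative and not the form I originally ran, starts at position 1 and subtracts one from the tilde position so the delimiter itself is excluded. The case expression is an added guard for rows that contain no tilde at all, where instr returns 0.

```sql
Select <<FIELD_NAME>>
, case when instr(<<FIELD_NAME>>,'~') > 0
       then substr(<<FIELD_NAME>>, 1, instr(<<FIELD_NAME>>,'~') - 1)
       else <<FIELD_NAME>>   -- no tilde found: keep the whole value
  end as Results
From <<Table_Name>>
where <<Where_Clause>>;
```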

 

 

Related references

IBM Knowledge Center, Home, PureData System for Analytics 7.2.1, IBM Netezza database user documentation, Netezza SQL basics, Netezza SQL extensions, Character string functions

IBM Knowledge Center, Home PureData System for Analytics 7.0.3, IBM Netezza Database User’s Guide, Netezza SQL basics, Netezza SQL extensions, Character functions

Oracle TO_CHAR to SQL Server CONVERT Equivalents to change Date to String

Transact SQL (T-SQL)

When it comes to SQL, I tend to lean on the SQL I have used the most over the years, which is Oracle. Today was no exception: I found myself trying to use the TO_CHAR command in SQL Server to format a date, which of course does not work. So, after a little thought, here are some examples of how you can use the SQL Server CONVERT command to achieve the equivalent result.

Example SQL Server Date Conversion SQL Code

This example SQL runs as is; no from table is required.

Select
CONVERT(VARCHAR(10), GETDATE(), 20) as 'YYYY-MM-DD'
,CONVERT(VARCHAR(19), GETDATE(), 20) as 'YYYY-MM-DD HH24:MI:SS'
,CONVERT(VARCHAR(8), GETDATE(), 112) as YYYYMMDD
,CONVERT(VARCHAR(6), GETDATE(), 112) as YYYYMM
,CONVERT(VARCHAR(12), DATEPART(YEAR, GETDATE())) + RIGHT('0'+CAST(MONTH(GETDATE()) AS VARCHAR(2)),2) as YYYYMM_Method_2
,CONVERT(VARCHAR(4), GETDATE(), 12) as YYMM
,CONVERT(VARCHAR(4), GETDATE(), 112) as YYYY
,CONVERT(VARCHAR(4), DATEPART(YEAR, GETDATE())) as YYYY_Method_2
,CONVERT(VARCHAR(4), YEAR(GETDATE())) as YYYY_Method_3
,RIGHT('0'+CAST(MONTH(GETDATE()) AS VARCHAR(2)),2) as Two_Digit_Month
,SUBSTRING(ltrim(CONVERT(VARCHAR(4), GETDATE(), 12)),3,2) as Two_Digit_Month_2
,CONVERT(VARCHAR(10), GETDATE(), 111) as 'YYYY/MM/DD'
,CONVERT(VARCHAR(5), GETDATE(), 8) as 'HH24:MI'
,CONVERT(VARCHAR(8), GETDATE(), 8) as 'HH24:MI:SS'

Map TO_CHAR formats to SQL Server

You can map Oracle TO_CHAR formats to SQL Server alternatives as follows:

TO_CHAR String | VARCHAR Length | SQL Server Convert Style
YYYY-MM-DD | VARCHAR(10) | 20, 21, 120, 121, 126 and 127
YYYY-MM-DD HH24:MI:SS | VARCHAR(19) | 20, 21, 120 and 121
YYYYMMDD | VARCHAR(8) | 112
YYYYMM | VARCHAR(6) | 112
YYMM | VARCHAR(4) | 12
YYYY | VARCHAR(4) | 112
MM | VARCHAR(4), then SUBSTRING from position 3 for 2 characters | 12
YYYY/MM/DD | VARCHAR(10) | 111
HH24:MI | VARCHAR(5) | 8, 108, 14 and 114
HH24:MI:SS | VARCHAR(8) | 8, 108, 14 and 114

Translating the formats commands

Here are some examples of translating the format commands.

Format | SQL Server
YYYY-MM-DD | CONVERT(VARCHAR(10), GETDATE(), 20)
YYYY-MM-DD HH24:MI:SS | CONVERT(VARCHAR(19), GETDATE(), 20)
YYYYMMDD | CONVERT(VARCHAR(8), GETDATE(), 112)
YYYYMM | CONVERT(VARCHAR(6), GETDATE(), 112)
YYMM | CONVERT(VARCHAR(4), GETDATE(), 12)
YYYY | CONVERT(VARCHAR(4), GETDATE(), 112)
YYYY | CONVERT(VARCHAR(4), DATEPART(YEAR, GETDATE()))
YYYY | CONVERT(VARCHAR(4), YEAR(GETDATE()))
MM | RIGHT('0'+CAST(MONTH(GETDATE()) AS VARCHAR(2)),2)
MM | SUBSTRING(ltrim(CONVERT(VARCHAR(4), GETDATE(), 12)),3,2)
YYYY/MM/DD | CONVERT(VARCHAR(10), GETDATE(), 111)
HH24:MI | CONVERT(VARCHAR(5), GETDATE(), 8)
HH24:MI:SS | CONVERT(VARCHAR(8), GETDATE(), 8)
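On SQL Server 2012 and later, the FORMAT function is another option that reads a little closer to TO_CHAR. The sketch below is an alternative to the CONVERT approach above, not a replacement for it; CONVERT is usually the lighter-weight choice on large result sets. Note that FORMAT patterns are case sensitive.

```sql
-- FORMAT uses .NET-style patterns: MM = month, mm = minute, HH = 24-hour clock.
SELECT FORMAT(GETDATE(), 'yyyy-MM-dd')          AS [YYYY-MM-DD]
     , FORMAT(GETDATE(), 'yyyy-MM-dd HH:mm:ss') AS [YYYY-MM-DD HH24:MI:SS]
     , FORMAT(GETDATE(), 'yyyyMMdd')            AS YYYYMMDD
     , FORMAT(GETDATE(), 'yyyyMM')              AS YYYYMM
     , FORMAT(GETDATE(), 'HH:mm')               AS [HH24:MI];
```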

Related Reference

Microsoft Docs, SQL, T-SQL Functions, GETDATE (Transact-SQL)

Microsoft Docs, SQL, T-SQL Functions, Date and Time Data Types and Functions (Transact-SQL)

Microsoft Docs, SQL, T-SQL Functions, DATEPART (Transact-SQL)

 

 

SQL server table Describe (DESC) equivalent

 

Transact SQL (T-SQL)

Microsoft SQL Server doesn’t seem to have a describe command, and folks usually seem to want to build a stored procedure to get the describe behavior. However, this is not always practical, depending on your permissions. So, the simple SQL below will provide describe-like information in a pinch. You may want to dress it up a bit, but I usually just use it raw, as shown below, by adding the table name.

Describe T-SQL Equivalent

Select *
From INFORMATION_SCHEMA.COLUMNS
Where TABLE_NAME = '<<TABLENAME>>';
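For a slightly dressed-up version, the following sketch selects only the columns that a describe command would typically show and keeps them in table order. The <<SCHEMANAME>> placeholder is an addition for illustration; drop that condition if the table name is unique across schemas.

```sql
Select ORDINAL_POSITION
     , COLUMN_NAME
     , DATA_TYPE
     , CHARACTER_MAXIMUM_LENGTH
     , NUMERIC_PRECISION
     , NUMERIC_SCALE
     , IS_NULLABLE
     , COLUMN_DEFAULT
From INFORMATION_SCHEMA.COLUMNS
Where TABLE_SCHEMA = '<<SCHEMANAME>>'
  And TABLE_NAME = '<<TABLENAME>>'
Order by ORDINAL_POSITION;
```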

Microsoft SQL Server – Useful links

Microsoft SQL Server 2017

Here are a few references for the Microsoft SQL Server 2017 database, which may be helpful.

Table Of Useful Microsoft SQL Server Database References

Reference Type | Link
SQL Server 2017 Download Page | https://www.microsoft.com/en-us/sql-server/sql-server-downloads
SQL SERVER version, edition, and update level | https://support.microsoft.com/en-us/help/321185/how-to-determine-the-version–edition-and-update-level-of-sql-server-a
SQL Server 2017 Release Notes | https://docs.microsoft.com/en-us/sql/sql-server/sql-server-2017-release-notes
SQL Server Transact SQL Commands | https://technet.microsoft.com/en-us/library/ms189826(v=sql.90).aspx

Netezza / PureData – How To Get A List Of When A Stored Procedure Was Last Changed Or Created

Netezza / Puredata - SQL (Structured Query Language)

In the continuing journey to track down impacted objects and to determine when the code in a database was last changed or added, here is another quick SQL, which can be used in Aginity Workbench for Netezza to retrieve a list of when stored procedures were last updated or created.

SQL List of When A Stored Procedure was Last Changed or Created

select t.database -- Database
, t.OWNER -- Object Owner
, t.PROCEDURE -- Procedure Name
, o.objmodified -- The Last Modified Datetime
, o.objcreated -- Created Datetime

from _V_OBJECT o
, _v_procedure t
where
o.objid = t.objid
and t.DATABASE = '<<Database Name>>'
order by o.objmodified Desc, o.objcreated Desc;

 

Netezza / PureData – How To Get a SQL List of When View Was Last Changed or Created

Netezza / PureData SQL (Structured Query Language)

Sometimes it is handy to be able to get a quick list of when a view was last changed. It could be for any number of reasons, but sometimes folks just lose track of when a view was last updated, or need to verify that it hasn’t been changed recently. So here is a quick SQL, which can be dropped into Aginity Workbench for Netezza to create a list of when each view was created or last updated. Update the database name in the SQL and run it.

SQL List of When A view was Last Changed or Created

select t.database -- Database
, t.OWNER -- Object Owner
, t.VIEWNAME -- View Name
, o.objmodified -- The Last Modified Datetime
, o.objcreated -- Created Datetime

from _V_OBJECT o
, _V_VIEW_XDB t
where
o.objid = t.objid
and t.DATABASE = '<<Database Name>>'
order by o.objcreated Desc, o.objmodified Desc;

Netezza / PureData – How To Quote a Single Quote in Netezza SQL

How To Quote a Single Quote in Netezza SQL?

The short answer is to use four single quotes (''''), which will result in a single quote within the select statement results.

How to Assemble the SQL to Quote a Single Quote in a SQL Select Statement

Knowing how to construct a list to embed in a SQL where clause ‘in’ list, or to add to an ETL job, can be a serious time saver, eliminating the need to manually edit large lists. In the example below, I used the Select result set to create a rather long list of values, which needed to be included in an ELT where clause. The expression is built by:

  • Adding the comma delimiter (',') and a concatenate (||) on the front
  • Followed by a quoted single quote (entered as four single quotes ('''')) and a concatenate (||)
  • The field I wish to have delimited and quoted (S1.ORDER_NUM)
  • And closed with a concatenate (||) and a quoted single quote (entered as four single quotes (''''))

This results in a delimited and quoted list ( ,'116490856') which needs only to have the parentheses added and the first comma removed, which is much less work than manually editing the 200 items that resulted from this select.

Example SQL:

SELECT Distinct
',' || '''' || S1.ORDER_NUM || '''' as Quoted_Order_Number
FROM Sales S1;
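To see why four single quotes are needed, it may help to look at the pieces separately. In this small sketch, the outer pair of quotes delimits the string literal and the inner pair is a doubled quote, which SQL reads as one literal quote character. The column aliases are made up for illustration.

```sql
SELECT '''' AS Single_Quote                  -- four single quotes: one quote character
     , 'O''Brien' AS Embedded_Quote          -- a doubled quote inside a longer literal
     , '''' || 'ABC' || '''' AS Quoted_Text; -- wraps ABC in single quotes
```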

 

How to Quote A Single Quote Example SQL

DataStage – How to use single quoted parameter list in an Oracle Connector

Data Integration

While working with a client’s DataStage 9.1 installation, I ran into a situation where they wanted to parameterize SQL where clause lists in an Oracle Connector stage, which honestly was not very straightforward to figure out. First, if APT_OSL_PARAM_ESC_SQUOTE is not set and single quotes are used in the parameter, the job creates invalid, unquoted SQL when the parameter is populated. Second, I found much of the available information confusing or incomplete. After some research and some trial and error, here is how I resolved the issue. I’ll endeavor to be concise but complete in my explanation.

When this Variable applies

This is where I know this process applies. There may be other circumstances in which it is applicable, but I’m listing the ones with which I have recent experience.

Infosphere Information Server Datastage

  • Versions 9.1, 11.3, and 11.5

Oracle RDBMS

  • Versions 11g and 12c

Configurations process

Here is a brief explanation of the steps I used to implement the where clause as a parameter. Please note that in this example I am using a job parameter to populate only a portion of the where clause; you can certainly pass the entire where clause as a parameter, if it is not too long.

Configure Project Variable in Administrator

  • Add APT_OSL_PARAM_ESC_SQUOTE to the project in Administrator
  • Populate the APT_OSL_PARAM_ESC_SQUOTE variable with the value \ (backslash)
APT_OSL_PARAM_ESC_SQUOTE Project Variable

Create job parameter

Create the job parameter in the job, following your project naming convention or, if your customer or project does not have established naming conventions, standard practice. See the jp_ItemSource parameter in the image below.

Job Parameter In Oracle Connector

Add job parameter to Custom SQL in Select Oracle Connector Stage

Once the job parameter has been created, add the job parameter to the SQL statement of the job.

Job Parameter In SQL
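As a rough sketch of what the custom SQL might look like in text form: ITEM_TBL, ITEM_ID, and ITEM_SOURCE are hypothetical names used only for illustration, while #jp_ItemSource# is the job parameter from the previous step, referenced with DataStage's usual hash-delimited parameter syntax.

```sql
-- ITEM_TBL, ITEM_ID, and ITEM_SOURCE are hypothetical names.
-- #jp_ItemSource# is replaced at run time with the job parameter value,
-- for example a single-quoted list such as 'WEB','STORE'.
SELECT ITEM_ID, ITEM_SOURCE
FROM ITEM_TBL
WHERE ITEM_SOURCE IN (#jp_ItemSource#)
```

Only the list portion of the where clause is parameterized here, which matches the approach described in the configuration steps.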

Related References

IBM Knowledge Center > InfoSphere Information Server 11.5.0

Connecting to data sources > Databases > Oracle databases > Oracle connector

IBM Support > Limitation of the Parameter APT_OSL_PARAM_ESC_SQUOTE on Plugins on Parallel Canvas

IBM Knowledge Center > InfoSphere Information Server 11.5.0

InfoSphere DataStage and Quality > Stage > Reference > Parallel Job Reference > Environment Variables > Miscellaneous > APT_OSL_PARAM_ESC_SQUOTE

 

Netezza / PureData – How to build a multi table drop command from a select

Database Management

How to Quick Drop Multiple Tables

Occasionally, there is a need to quickly drop a list of tables, and you don’t always want to write or generate each command individually in Aginity. So, here is a quick example of how you can use a ‘Select’ SQL statement to generate a list of drop commands for you. This approach assumes there is a common naming convention, so you may need to adapt it to your needs.

An outline of the Drop Multiple Tables Process

Here is a quick summary of the steps to generate the drop statements from _V_Table:

  1. Build the required Netezza SQL select, paying particular attention to the where clause criteria to exclude any unnecessary tables
  2. Execute the SQL statement
  3. Copy from the Aginity Results tab without headers
  4. Paste into a new Aginity Query window
  5. Validate that only the intended tables are in the list, with no extras
  6. Click within the SQL drop command list and execute it as a single batch

Example generate the drop statements

select 'Drop table '||tablename||';'
from _V_TABLE
where tablename like 'NZCC_TT_%';
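For the validation step, it can help to run the same where clause as a plain listing first. This is a small sketch; it assumes _V_TABLE exposes an OWNER column in the same way _v_view does. Keep in mind that the underscore in a like pattern matches any single character, so the pattern may pick up more tables than the literal prefix suggests.

```sql
-- Preview of the tables the generated drop commands would affect.
select tablename, owner
from _V_TABLE
where tablename like 'NZCC_TT_%'
order by tablename;
```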

 

Related References

IBM Knowledge Center > PureData System for Analytics 7.2.1

IBM Netezza database user documentation > Netezza SQL command reference > Drop Table

How to know if your Oracle Client install is 32 Bit or 64 Bit

Sometimes you just need to know if your Oracle Client install is 32 bit or 64 bit. But how do you figure that out? Here are two methods you can try.

The first method

Go to the %ORACLE_HOME%\inventory\ContentsXML folder and open the comps.xml file.
Look for <DEP_LIST>, roughly on the second screen of the file.

If you see PLAT="NT_AMD64", then your Oracle Home is 64 bit.
If you see PLAT="NT_X86", then your Oracle Home is 32 bit.

It is possible to have both the 32-bit and the 64-bit Oracle Homes installed.

The second method

This method is a bit faster. Windows has a different lib directory for 32-bit and 64-bit software. Look under the ORACLE_HOME folder: if you see both a “lib” and a “lib32” folder, you have a 64 bit Oracle Client. If you see just the “lib” folder, you’ve got a 32 bit Oracle Client.

OLTP vs Data Warehousing

OLTP Versus Data Warehousing

I’ve tried to explain the difference between OLTP systems and a Data Warehouse to my managers many times, as I’ve worked at a hospital as a Data Warehouse Manager/data analyst for many years. Why was the list that came from the operational applications different from the one that came from the Data Warehouse? Why couldn’t I just get a list of patients that were lying in the hospital right now from the Data Warehouse? So I explained, and explained again, and explained to another manager, and another. You get the picture.
In this article I will explain this very same thing to you, so you know how to explain it to your manager. Or, if you are a manager, you might understand what your data analyst can and cannot give you.

OLTP

OLTP stands for Online Transaction Processing. In this context, it means getting your data directly from the operational systems to make reports. An operational system is a system that is used for the day-to-day processes.
For example: when a patient checks in, his or her information gets entered into a Patient Information System. The doctor puts scheduled tests, a diagnosis, and a treatment plan in there as well. Doctors, nurses, and other people working with patients use this system on a daily basis to enter and get detailed information on their patients.
The data in operational systems is stored so that it can be used efficiently by the people working directly on the product, or with the patient in this case.

Data Warehousing

A Data Warehouse is a big database that is filled with data from operational systems. It is used solely for reporting and analytical purposes. No one uses this data for day to day operations. The beauty of a Data Warehouse is, among other things, that you can combine the data from the different operational systems. You can combine the number of patients in a department with the number of nurses, for example. You can see how far a doctor is behind schedule and find the cause of that by looking at the patients. Does he run late with elderly patients? Is there a particular diagnosis that takes more time? Or does he just oversleep a lot? You can use this information to look at the past and see trends, so you can plan for the future.

The difference between OLTP and Data Warehousing

This is how a Data Warehouse works:

[Figure: How a Data Warehouse works]

The data gets entered into the operational systems. The ETL processes then Extract this data from these systems, Transform it so it fits neatly into the Data Warehouse, and Load it into the Data Warehouse. After that, a reporting tool builds reports from the data in the Data Warehouse.
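The three ETL steps can be sketched with two in-memory SQLite databases standing in for the operational system and the Data Warehouse. Everything here is an assumption made for illustration: the table names, the columns, the sample rows, and the very simple transformation. A real ETL process is far more involved.

```python
import sqlite3


def run_etl(source, warehouse):
    """Copy admissions from the operational database into the warehouse."""
    # Extract: read the raw rows from the operational system.
    rows = source.execute(
        "SELECT patient_id, department, admitted_on FROM admissions"
    ).fetchall()

    # Transform: tidy department names so rows from different systems line up.
    cleaned = [(pid, dept.strip().upper(), day) for pid, dept, day in rows]

    # Load: write the cleaned rows into the warehouse table.
    warehouse.execute(
        "CREATE TABLE IF NOT EXISTS fact_admissions "
        "(patient_id INTEGER, department TEXT, admitted_on TEXT)"
    )
    warehouse.executemany("INSERT INTO fact_admissions VALUES (?, ?, ?)", cleaned)
    warehouse.commit()
    return len(cleaned)


# Tiny in-memory stand-ins for the operational system and the warehouse.
source = sqlite3.connect(":memory:")
source.execute(
    "CREATE TABLE admissions (patient_id INTEGER, department TEXT, admitted_on TEXT)"
)
source.executemany(
    "INSERT INTO admissions VALUES (?, ?, ?)",
    [
        (1, " cardiology", "2024-01-02"),
        (2, "Cardiology ", "2024-01-02"),
        (3, "oncology", "2024-01-03"),
    ],
)
warehouse = sqlite3.connect(":memory:")
loaded = run_etl(source, warehouse)

# Report: the kind of query a reporting tool would run against the warehouse.
report = warehouse.execute(
    "SELECT department, COUNT(*) FROM fact_admissions "
    "GROUP BY department ORDER BY department"
).fetchall()
```

The report query never touches the source database, which is the point of the design: the reporting load lands on the warehouse only.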

This is how OLTP works:

[Figure: How OLTP works]

Reports are made directly from the data inside the database of the operational systems. Some operational systems come with their own reporting tool, but you can always use a standalone reporting tool to make reports from the operational databases.

Pros and Cons

Data Warehousing

Pros:

  • There is no strain on the operational systems during business hours
    • Because you can schedule the ETL processes to run during the hours when the fewest people are using the operational system, you won’t disturb the operational processes. And when you need to run a large query, the operational systems won’t be affected, as you are working directly on the Data Warehouse database.
  • Data from different systems can be combined
    • It is possible to combine finance and productivity data, for example, because the ETL process transforms the data so it can be combined.
  • Data is optimized for making queries and reports
    • You use different data in reports than you use on a day to day basis. A Data Warehouse is built for this. For instance: most Data Warehouses have a separate date table where the weekday, day, month and year are stored. You can write a query that derives the weekday from a date, but that takes processing time. By using a separate table like this you’ll save time and decrease the strain on the database.
  • Data is saved longer than in the source systems
    • The source systems need to have their old records deleted once they are no longer used in the day to day operations, to keep performance up. The Data Warehouse keeps those records.
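The separate date table mentioned in the list above can be generated once and then joined to, instead of deriving the weekday in every query. Here is a minimal sketch; the column names and the function name are assumptions, not a standard layout.

```python
from datetime import date, timedelta

# Names indexed by date.weekday(), where Monday is 0.
WEEKDAY_NAMES = [
    "Monday", "Tuesday", "Wednesday", "Thursday",
    "Friday", "Saturday", "Sunday",
]


def build_date_table(start, end):
    """Return one row per calendar day from start to end, inclusive."""
    rows = []
    current = start
    while current <= end:
        rows.append(
            {
                "date": current.isoformat(),
                "weekday": WEEKDAY_NAMES[current.weekday()],
                "day": current.day,
                "month": current.month,
                "year": current.year,
            }
        )
        current += timedelta(days=1)
    return rows


date_table = build_date_table(date(2024, 1, 1), date(2024, 1, 7))
```

In a real Data Warehouse these rows would be loaded into a table once, covering many years, so reports can group by weekday or month with a simple join.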

Cons:

  • You always look at the past
    • A Data Warehouse is updated once a night, or even just once a week. That means that you never have the latest data. Staying with the hospital example: you never know how many patients are in the hospital right now, or which surgeon didn’t show up on time this morning.
  • You don’t have all the data
    • A Data Warehouse is built for discovering trends, showing the big picture. The little details, the ones not used in trends, get discarded during the ETL process.
  • Data isn’t the same as the data in the source systems
    • Because the data is older than the data in the source systems, it will always be a little different. The Transformation step in the ETL process also changes the data a little. It doesn’t mean one or the other is wrong. It’s just a different way of looking at the data. For example: the Data Warehouse at the hospital excluded all transactions that were marked as cancelled. If you try to get the same report from both systems, and don’t exclude the cancelled transactions in the source system, you’ll get different results.
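The cancelled-transaction example can be shown with a handful of made-up rows. This is only an illustration; the status values and function names are assumptions, not the hospital's actual rules.

```python
# Made-up transactions as they might sit in the source system.
transactions = [
    {"id": 1, "status": "completed"},
    {"id": 2, "status": "cancelled"},
    {"id": 3, "status": "completed"},
    {"id": 4, "status": "cancelled"},
    {"id": 5, "status": "scheduled"},
]


def count_in_source(rows):
    """Count every row, as a naive report on the source system would."""
    return len(rows)


def count_in_warehouse(rows):
    """Count rows the way the warehouse sees them: cancelled ones were dropped by the ETL."""
    return sum(1 for row in rows if row["status"] != "cancelled")


source_total = count_in_source(transactions)
warehouse_total = count_in_warehouse(transactions)
```

Both numbers are correct for their own definition. To reconcile the two reports, apply the same exclusion in the source query.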

Online Transaction Processing (OLTP)

Pros:

  • You get real time data
    • If someone is entering a new record now, you’ll see it right away in your report. No delays.
  • You’ve got all the details
    • You have access to all the details that the employees have entered into the system. No grouping, no skipping records, just all the raw data that’s available.

Cons:

  • You are putting strain on an application during business hours.
    • When you run a large query, you can take processing capacity that would otherwise be available to the people who need to work with this system for their day to day operations. And if you make an error, for instance by forgetting to put a date filter on your query, you could even bring the system down so no one can use it anymore.
  • You can’t compare the data with data from other sources.
    • Even when the systems are similar, like an HR system and a payroll system that depend on each other to work. Data is always going to be different because it is stored at a different level of granularity, or because not all data is relevant for both systems.
  • You don’t have access to old data
    • To keep the applications at peak performance, old data that’s irrelevant to day to day operations is deleted.
  • Data is optimized to suit day to day operations
    • And not for report making. This means you’ll have to get creative with your queries to get the data you need.

So what method should you use?

That all depends on what you need at that moment. If you need detailed information about things that are happening now, you should use OLTP.
If you are looking for trends, or insights on a higher level, you should use a Data Warehouse.


Oracle – How to get a list of user permission grants


Because the InfoSphere Information Server repository has to be installed manually with the scripts provided in the IBM software, you sometimes run into difficulties. So, here’s a quick script, which I have found useful in the past, to identify the permissions granted to the IAUSER on Oracle databases and to help run down discrepancies in user permissions.

 

-- Oracle stores unquoted user names in upper case
SELECT *
FROM ALL_TAB_PRIVS
WHERE GRANTEE = 'IAUSER';

 

If we cannot run against the ALL_TAB_PRIVS view, then we can try the USER_TAB_PRIVS view while connected as that user:

 

-- Oracle stores unquoted user names in upper case
SELECT *
FROM USER_TAB_PRIVS
WHERE GRANTEE = 'IAUSER';
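If you run this check from a script, the fallback from one view to the other can be automated. The sketch below assumes a Python DB-API cursor that accepts named bind variables passed as a dictionary, as the python-oracledb driver does; the function name is hypothetical, and catching every exception is a simplification.

```python
def list_table_grants(cursor, grantee="IAUSER"):
    """Return (view_name, rows) from the first privilege view that can be queried."""
    # Unquoted Oracle user names are stored in upper case, so normalize first.
    params = {"grantee": grantee.upper()}
    last_error = None
    for view in ("ALL_TAB_PRIVS", "USER_TAB_PRIVS"):
        try:
            cursor.execute(
                f"SELECT * FROM {view} WHERE GRANTEE = :grantee", params
            )
            return view, cursor.fetchall()
        except Exception as exc:  # assumption: the driver raises if the view is unavailable
            last_error = exc
    raise RuntimeError("neither privilege view could be queried") from last_error
```

The view names are fixed in the code and only the grantee is passed as a bind variable, so no user input ends up in the SQL text.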

 

Related References

Oracle Help Center > Database Reference > ALL_TAB_PRIVS view

Netezza / PureData – Aginity for Netezza shortcut key list


Recently, while working with a couple of my teammates on different projects, I picked up a couple of shortcut keys for Aginity for Netezza that I did not know existed. So, I thought it would be nice to put together a list of shortcut keys for future reference. I don’t use most of them very often, so I have flagged with an X the ones that I have found to be frequently useful. I hope you find this useful as well.

Frequently Used By Me | Shortcut Keystrokes | Shortcut Description

  | Alt-C | Complete Code Snippet
  | Alt-F4 | Exit
  | Alt-Q | Go to Query
  | Alt-R | Go to Results
  | Alt-T | Go to Tree
  | Alt-H | User Query History
  | Ctrl-Alt-0 | Goto Bookmark 0
  | Ctrl-Alt-1 | Goto Bookmark 1
  | Ctrl-Alt-2 | Goto Bookmark 2
  | Ctrl-Alt-3 | Goto Bookmark 3
  | Ctrl-Alt-4 | Goto Bookmark 4
  | Ctrl-Alt-5 | Goto Bookmark 5
  | Ctrl-Alt-6 | Goto Bookmark 6
  | Ctrl-Alt-7 | Goto Bookmark 7
  | Ctrl-Alt-8 | Goto Bookmark 8
  | Ctrl-Alt-9 | Goto Bookmark 9
X | Ctrl-Alt-C | Comment Selection
  | Ctrl-Alt-Left | Goto Previous Bookmark
  | Ctrl-Alt-Right | Goto Next Bookmark
  | Ctrl-Alt-Shift-0 | Set Bookmark 0
  | Ctrl-Alt-Shift-1 | Set Bookmark 1
  | Ctrl-Alt-Shift-2 | Set Bookmark 2
  | Ctrl-Alt-Shift-3 | Set Bookmark 3
  | Ctrl-Alt-Shift-4 | Set Bookmark 4
  | Ctrl-Alt-Shift-5 | Set Bookmark 5
  | Ctrl-Alt-Shift-6 | Set Bookmark 6
  | Ctrl-Alt-Shift-7 | Set Bookmark 7
  | Ctrl-Alt-Shift-8 | Set Bookmark 8
  | Ctrl-Alt-Shift-9 | Set Bookmark 9
  | Ctrl-Alt-Shift-U | Change select case
X | Ctrl-Alt-U | Uncomment Selection
  | Ctrl-Alt-W | Word Wrap
X | Ctrl-A | Select All
  | Ctrl-B | Toggle Object Browser
X | Ctrl-C | Copy
X | Ctrl+Double click object name | Find object in browser panel for current database
X | Ctrl-F | Find
  | Ctrl-F5 | Execute as Single Batch
  | Ctrl-F6 | Next Query Tab
  | Ctrl-G | Goto Line
  | Ctrl-H | Replace
  | Ctrl-N | New Query Window
  | Ctrl-O | Open SQL File
  | Ctrl-P | Print
  | Ctrl-R | Toggle
  | Ctrl-S | Save Query
  | Ctrl-Shift-F6 | Previous Query Tab
  | Ctrl-Shift-U | Make selection UPPER case
  | Ctrl-T | Add New Query Editor
  | Ctrl-U | Make selection LOWER case
  | Ctrl-V | Paste
X | Ctrl-X | Cut
X | Ctrl-Y | Redo
X | Ctrl-Z | Undo
  | F11 | Toggle full screen
  | F12 | Select Query at Cursor
X | F3 | Find Again
  | F5 | Execute
  | F8 | Explain
  | F9 | Toggle Bookmark
  | Shift-F5 | Execute All


Netezza / PureData – Table Describe SQL


If you want to describe a PureData / Netezza table in SQL, it can be done, but Netezza doesn’t have a describe command. Here is a quick SQL query, which will give the basic structure of a table or a view. Honestly, if you have Aginity, generating the DDL is faster and more informative, at least to me. If you have permission to use nzsql, you can also use the slash commands (e.g. \d).

Example Netezza Table Describe SQL

select  name as Table_Name,
        owner as Table_Owner,
        createdate as Table_Created_Date,
        type as Table_Type,
        database as Database_Name,
        schema as Database_Schema,
        attnum as Field_Order,
        attname as Field_Name,
        format_type as Field_Type,
        attnotnull as Field_Not_Null_Indicator,
        attlen as Field_Length
from _v_relation_column
where name = '<<Table Name Here>>'
order by attnum;

 

Related References

IBM Knowledge Center, PureData System for Analytics, Version 7.2.1: IBM Netezza database user documentation, Command-line options for nzsql, Internal slash options

IBM Knowledge Center, PureData System for Analytics, Version 7.2.1: IBM Netezza getting started tips, About the Netezza data warehouse appliance, Commands and queries, Basic Netezza SQL information, Commonly used nzsql internal slash commands

IBM Knowledge Center, PureData System for Analytics, Version 7.2.1: IBM Netezza database user documentation, Netezza SQL introduction, The nzsql command options, Slash options