An Overview of DB2 Federation


Data analytics has changed: data is no longer manageable in relational databases alone. Data flows in from various sources and in various formats, which makes it impractical to store everything in a single repository. Some data is best suited to relational databases, some to Apache Hadoop, and some to NoSQL databases.

During data analysis, much of the time is spent bringing the distributed data together instead of obtaining insights. Db2 Federation comes to the rescue of data analysts: the federation concept in Db2 removes the need to copy data between repositories and reduces the hassle of getting insights.

What is DB2 Federation?

DB2 federation is a data integration technology that permits remote database objects to be accessed as if they were local DB2 database objects. It connects multiple databases and makes them appear as one database.

How does DB2 federation work?

Federation allows you to access data spread across multiple distributed databases with a single query. When implemented in an organization, it can be used to reach data in any of the organization's Db2 databases, whether local or in the cloud.
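As a rough sketch, the statements below show how a remote table can be exposed as a local nickname. The wrapper, server, schema, and credential names are hypothetical, and the exact options vary by Db2 version and data source, so treat this as illustrative rather than definitive.

-- Minimal Db2 federation sketch (names and credentials are hypothetical)
CREATE WRAPPER DRDA;

-- Define the remote server the wrapper should connect to
CREATE SERVER REMOTE_SALES TYPE DB2/UDB VERSION '11.5' WRAPPER DRDA OPTIONS (DBNAME 'SALESDB');

-- Map the local user to credentials on the remote server
CREATE USER MAPPING FOR local_user SERVER REMOTE_SALES OPTIONS (REMOTE_AUTHID 'sales_user', REMOTE_PASSWORD '********');

-- Create a nickname, a local alias for the remote table
CREATE NICKNAME SALES_ORDERS FOR REMOTE_SALES.SALES.ORDERS;

-- The remote table can now be queried as if it were local
SELECT * FROM SALES_ORDERS;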

Why use DB2 federation?

So, why should you use federation? It brings data of all formats together into one virtual source, and with data retrieved from a single virtual source, analysis becomes more cost-effective and efficient.

What are the primary use cases for DB2 federation?

Merging of various sources of data

DB2 federation facilitates consolidating data from local and cloud sources into one virtual data source. This eliminates the need to migrate data, which can be expensive and troublesome.

Increase the capacity of a repository beyond the fixed limits

Physical storage capacity is bound to have a limit, which is one reason you may find an organization has distributed its data across various repositories. With federation, the source you query is virtual, so it is not bound by the limits of any single repository. This can help greatly if your physical repository is running low on space.

Linking up to Db2 Warehouse on Cloud

People who use Db2 products can federate data from Db2 on Cloud and Db2 Warehouse on Cloud. This gives them a joint interface to access, add, query, and analyze data without encountering complex ETL processes. Better still, no additional code is required to execute these processes, which makes it easy for people with limited technical know-how to use these products smoothly.

Split data across different servers

At times, you might choose to partition your data. With federation, partitioned data can be queried through a unified interface. Federation thereby allows you to better balance your workloads, scale specific parts of an app, and create micro-services that work together harmoniously.
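For example, once each partition is exposed as a nickname, one query can read across servers. The nicknames below are hypothetical and assume federation objects were created as sketched earlier.

-- Hypothetical single query over data split across two servers,
-- where ORDERS_EAST and ORDERS_WEST are nicknames for remote tables
SELECT region, SUM(order_total) AS total_sales
FROM (SELECT region, order_total FROM ORDERS_EAST
      UNION ALL
      SELECT region, order_total FROM ORDERS_WEST) AS all_orders
GROUP BY region;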

Generally, Db2 federation eases data access by bringing data together into a single virtual source, which saves both cost and time. When you want to analyze data, you can get insights immediately instead of spending time querying repository after repository.


End of Support for IBM InfoSphere Information Server 9.1.0

IBM InfoSphere Information Server 9.1.0 will reach end of support on 2018-09-30. If you are still on InfoSphere Information Server (IIS) 9.1.0, I hope you have a plan to migrate to an 11-series version soon. InfoSphere Information Server (IIS) 11.7 is worth considering if you don't already own an 11-series license; it will allow you to take advantage of the evolving thin-client tools and other capabilities in the 2018 release pipeline without needing to perform another upgrade.

Related References

IBM Support, End of support notification: InfoSphere Information Server 9.1.0

IBM Support, Software lifecycle, InfoSphere Information Server 9.1.0

IBM Knowledge Center, Home, InfoSphere Information Server 11.7.0, IBM InfoSphere Information Server Version 11.7.0 documentation

Essbase Connector Error – Client Commands are Currently Not Being Accepted

While investigating a recent InfoSphere Information Server (IIS) DataStage Essbase Connector error, I found the documented explanations of the probable causes not terribly meaningful. So, now that I have run our error to ground, I thought it might be nice to jot down a quick note on the potential causes of the 'Client Commands are Currently Not Being Accepted' error, which I gleaned from the process.

Error Message Id

  • IIS-CONN-ESSBASE-01010

Error Message

An error occurred while processing the request on the server. The error information is 1051544 (message on contacting or from application:[<<DateTimeStamp>>]Local////3544/Error(1013204) Client Commands are Currently Not Being Accepted.

Possible Causes of The Error

This error is a problem with access to the Essbase object or with the security within the Essbase object. It can be the result of multiple issues, such as:

  • Object doesn't exist – the Essbase object didn't exist in the location specified;
  • Communications – the location is unavailable or cannot be reached;
  • Path security – security prevents access to the Essbase object location;
  • Essbase security – security within the Essbase object does not support the user or filter being submitted; the Essbase object security may also be corrupted or incomplete;
  • Essbase object structure – the Essbase object was not structured to support the filter, or the Essbase filter is malformed for the current structure.

Related References

IBM Knowledge Center, InfoSphere Information Server 11.7.0, Connecting to data sources, Enterprise applications, IBM InfoSphere Information Server Pack for Hyperion Essbase


Oracle TO_CHAR to SQL Server CONVERT Equivalents to change Date to String


When it comes to SQL, I tend to lean on the dialect I have used the most over the years, which is Oracle. Today was no exception: I found myself trying to use the TO_CHAR command in SQL Server to format a date, which of course does not work. So, after a little thought, here are some examples of how you can use the SQL Server CONVERT command to achieve the equivalent result.

Example SQL Server Date Conversion SQL Code

This example SQL runs as-is; no FROM table is required.

 

SELECT
    CONVERT(VARCHAR(10), GETDATE(), 20) AS 'YYYY-MM-DD'
    ,CONVERT(VARCHAR(19), GETDATE(), 20) AS 'YYYY-MM-DD HH24:MI:SS'
    ,CONVERT(VARCHAR(8), GETDATE(), 112) AS YYYYMMDD
    ,CONVERT(VARCHAR(6), GETDATE(), 112) AS YYYYMM
    ,CONVERT(VARCHAR(12), DATEPART(YEAR, GETDATE())) + RIGHT('0' + CAST(MONTH(GETDATE()) AS VARCHAR(2)), 2) AS YYYYMM_Method_2
    ,CONVERT(VARCHAR(4), GETDATE(), 12) AS YYMM
    ,CONVERT(VARCHAR(4), GETDATE(), 112) AS YYYY
    ,CONVERT(VARCHAR(4), DATEPART(YEAR, GETDATE())) AS YYYY_Method_2
    ,CONVERT(VARCHAR(4), YEAR(GETDATE())) AS YYYY_Method_3
    ,RIGHT('0' + CAST(MONTH(GETDATE()) AS VARCHAR(2)), 2) AS Two_Digit_Month
    ,SUBSTRING(LTRIM(CONVERT(VARCHAR(4), GETDATE(), 12)), 3, 2) AS Two_Digit_Month_2
    ,CONVERT(VARCHAR(10), GETDATE(), 111) AS 'YYYY/MM/DD'
    ,CONVERT(VARCHAR(5), GETDATE(), 8) AS 'HH24:MI'
    ,CONVERT(VARCHAR(8), GETDATE(), 8) AS 'HH24:MI:SS';

Map TO_CHAR formats to SQL Server

You can map Oracle TO_CHAR formats to SQL Server equivalents as follows:

TO_CHAR String | VARCHAR Length | SQL Server Convert Style
YYYY-MM-DD | VARCHAR(10) | 20, 21, 120, 121, 126, and 127
YYYY-MM-DD HH24:MI:SS | VARCHAR(19) | 20, 21, 120, and 121
YYYYMMDD | VARCHAR(8) | 112
YYYYMM | VARCHAR(6) | 112
YYMM | VARCHAR(4) | 12
YYYY | VARCHAR(4) | 112
MM | VARCHAR(2) | 12
YYYY/MM/DD | VARCHAR(10) | 111
HH24:MI | VARCHAR(5) | 8, 108, 14, and 114
HH24:MI:SS | VARCHAR(8) | 8, 108, 14, and 114
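To make the mapping concrete, here is one row of the table worked as a side-by-side sketch; the Oracle statement is shown only for comparison.

-- Oracle: format the current date as YYYY-MM-DD
-- SELECT TO_CHAR(SYSDATE, 'YYYY-MM-DD') FROM DUAL;

-- SQL Server equivalent, using CONVERT style 20
SELECT CONVERT(VARCHAR(10), GETDATE(), 20) AS Formatted_Date;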

Translating the format commands

Here are some examples of translating the format commands.

Format | SQL Server
YYYY-MM-DD | CONVERT(VARCHAR(10), GETDATE(), 20)
YYYY-MM-DD HH24:MI:SS | CONVERT(VARCHAR(19), GETDATE(), 20)
YYYYMMDD | CONVERT(VARCHAR(8), GETDATE(), 112)
YYYYMM | CONVERT(VARCHAR(6), GETDATE(), 112)
YYMM | CONVERT(VARCHAR(4), GETDATE(), 12)
YYYY | CONVERT(VARCHAR(4), GETDATE(), 112)
YYYY | CONVERT(VARCHAR(4), DATEPART(YEAR, GETDATE()))
YYYY | CONVERT(VARCHAR(4), YEAR(GETDATE()))
MM | RIGHT('0' + CAST(MONTH(GETDATE()) AS VARCHAR(2)), 2)
MM | SUBSTRING(LTRIM(CONVERT(VARCHAR(4), GETDATE(), 12)), 3, 2)
YYYY/MM/DD | CONVERT(VARCHAR(10), GETDATE(), 111)
HH24:MI | CONVERT(VARCHAR(5), GETDATE(), 8)
HH24:MI:SS | CONVERT(VARCHAR(8), GETDATE(), 8)

Related References

Microsoft Docs, SQL, T-SQL Functions, GETDATE (Transact-SQL)

Microsoft Docs, SQL, T-SQL Functions, Date and Time Data Types and Functions (Transact-SQL)

Microsoft Docs, SQL, T-SQL Functions, DATEPART (Transact-SQL)

 

 

What Are The DataStage / QualityStage Join Stages?

Three Stages Which Join Records

While chasing an error which applied only to join-type stages, I thought it might be nice to identify what the InfoSphere Information Server DataStage / QualityStage join stages are. There are three of them, as you can see from the picture above:

  • Join Stage,
  • Lookup Stage,
  • and Merge Stage.

All three stages join data based on the values of identified key columns.

Related References

IBM Knowledge Center, InfoSphere Information Server 11.7.0, InfoSphere DataStage and QualityStage, Developing parallel jobs, Processing Data, Lookup Stage

IBM Knowledge Center, InfoSphere Information Server 11.7.0, InfoSphere DataStage and QualityStage, Developing parallel jobs, Processing Data, Join Stage

IBM Knowledge Center, InfoSphere Information Server 11.7.0, InfoSphere DataStage and QualityStage, Developing parallel jobs, Processing Data, Merge Stage


Parallel jobs on Windows fail with APT_IOPort::readBlkVirt;error

APT_IOPort::readBlkVirt Error Screenshot

This is a known error on Windows systems; it applies to DataStage and QualityStage jobs using any or all of the three join-type stages (Join, Merge, and Lookup).

Error Message

  • <<Link name>>,0: APT_IOPort::readBlkVirt: read for block header, partition 0, [fd 4], returned -1 with errno 10,054 (Unknown error)

Message ID

  • IIS-DSEE-TFIO-00223

Applies To

  • Windows systems only
  • Parallel Engine jobs using the three join-type stages (Join, Merge, and Lookup); it does not apply to Server Engine jobs
  • InfoSphere Information Server (IIS) DataStage and QualityStage 9.1 and higher

The Fix

  • Add APT_NO_IOCOMM_OPTIMIZATION to the project in Administrator and set it to blank or 0 (I left it blank so it would not impact other jobs)
  • Add the environment variable to the job producing the error and set it to 1

What APT_NO_IOCOMM_OPTIMIZATION Does

  • Sets shared memory as the transport type, rather than the default sockets transport type.
  • Note that in most cases the sockets transport type is faster, so you likely will not want to set this across the project as the default for all jobs. It is best to apply it only to problematic jobs.

Related References

InfoSphere DataStage and QualityStage, Version 9.1 Job Compatibility

IBM Support, JR54078: PARALLEL JOBS ON WINDOWS FAIL WITH APT_IOPORT::READBLKVIRT; ERROR

IBM Support, Information Server DataStage job fails with unknown error 10,054.

 

SQL Server table Describe (DESC) equivalent

 


Microsoft SQL Server doesn't seem to have a describe command, and folks usually want to build a stored procedure to get describe-like behavior. However, that is not always practical given your permissions. The simple SQL below will provide describe-like information in a pinch. You may want to dress it up a bit, but I usually just use it raw, as shown below, adding the table name.

Describe T-SQL Equivalent

SELECT *
FROM INFORMATION_SCHEMA.COLUMNS
WHERE TABLE_NAME = '<<TABLENAME>>';
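If you prefer output closer to Oracle's DESC, a trimmed-down variant such as the sketch below may help; the selected columns are standard INFORMATION_SCHEMA.COLUMNS columns, and the table name is a placeholder.

-- Describe-like subset, in column order
SELECT COLUMN_NAME,
       DATA_TYPE,
       CHARACTER_MAXIMUM_LENGTH,  -- NULL for non-character types
       IS_NULLABLE
FROM INFORMATION_SCHEMA.COLUMNS
WHERE TABLE_NAME = '<<TABLENAME>>'
ORDER BY ORDINAL_POSITION;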


Microsoft SQL Server – Useful links


Here are a few references for the Microsoft SQL Server 2017 database, which may be helpful.

Table Of Useful Microsoft SQL Server Database References

Reference Type

Link

SQL Server 2017 Download Page

https://www.microsoft.com/en-us/sql-server/sql-server-downloads

SQL SERVER version, edition, and update level

https://support.microsoft.com/en-us/help/321185/how-to-determine-the-version–edition-and-update-level-of-sql-server-a

SQL Server 2017 Release Notes

https://docs.microsoft.com/en-us/sql/sql-server/sql-server-2017-release-notes

SQL Server Transact SQL Commands

https://technet.microsoft.com/en-us/library/ms189826(v=sql.90).aspx


DataStage – How to Pass the Invocation ID from one Sequence to another

DataStage Invocation ID Passing Pattern Overview

When you are controlling a chain of sequences in a job stream and taking advantage of reusable (multiple-instance) jobs, it is useful to be able to pass the Invocation ID from the master controlling sequence down the chain so that it is assigned to the job run. This can easily be done, without manually entering the values in each of the sequences, by leveraging the DSJobInvocationId variable. For this to work:

  • The job must have 'Allow Multiple Instance' enabled
  • The parent sequence must provide the Invocation ID (its Job Activity must have the Invocation Name entered)
  • The receiving child sequence must have the invocation variable entered
  • At runtime, an invocation-id instance of the multi-instance job is generated, with its own logs.

Variable Name

  • DSJobInvocationId

Note

This approach allows for the reuse of jobs and the assignment of meaningful instance extension names, which are managed from a single point of entry in the object tree.

Related References: 

IBM Knowledge Center > InfoSphere Information Server 11.5.0

InfoSphere DataStage and QualityStage > Designing DataStage and QualityStage jobs > Building sequence jobs > Sequence job activities > Job Activity properties

Netezza / PureData – How To Get A List Of When A Stored Procedure Was Last Changed Or Created


In the continuing journey to track down impacted objects and to determine when the code in a database was last changed or added, here is another quick SQL which can be used in Aginity Workbench for Netezza to retrieve a list of when stored procedures were last updated or created.

SQL List of When a Stored Procedure Was Last Changed or Created

select t.DATABASE      -- Database
     , t.OWNER         -- Object Owner
     , t.PROCEDURE     -- Procedure Name
     , o.objmodified   -- Last Modified Datetime
     , o.objcreated    -- Created Datetime
from _V_OBJECT o
   , _v_procedure t
where o.objid = t.objid
  and t.DATABASE = '<<Database Name>>'
order by o.objmodified Desc, o.objcreated Desc;

 


Netezza / PureData – How To Get a SQL List of When a View Was Last Changed or Created


Sometimes it is handy to be able to get a quick list of when a view was last changed. It could be for any number of reasons; sometimes folks just lose track of when a view was last updated, or even need to verify that it hasn't been changed recently. So here is a quick SQL which can be dropped into Aginity Workbench for Netezza to create a list of when each view was created or last updated. Update the database name in the SQL and run it.

SQL List of When a View Was Last Changed or Created

select t.DATABASE      -- Database
     , t.OWNER         -- Object Owner
     , t.VIEWNAME      -- View Name
     , o.objmodified   -- Last Modified Datetime
     , o.objcreated    -- Created Datetime
from _V_OBJECT o
   , _V_VIEW_XDB t
where o.objid = t.objid
  and DATABASE = '<<Database Name>>'
order by o.objcreated Desc, o.objmodified Desc;


DataStage – How to use single quoted parameter list in an Oracle Connector


While working with a client's 9.1 DataStage version, I ran into a situation where they wanted to parameterize SQL where-clause lists in an Oracle Connector stage, which honestly was not very straightforward to figure out. First, if APT_OSL_PARAM_ESC_SQUOTE is not set and single quotes are used in the parameter, the job creates unquoted, invalid SQL when the parameter is populated. Second, I found much of the available information confusing and/or incomplete in its explanation. After some research and some trial and error, here is how I resolved the issue. I'll endeavor to be concise, but holistic, in my explanation.

When this Variable Applies

This is where I know the process applies; there may be other circumstances in which it is applicable, but I am listing the ones here with which I have recent experience.

InfoSphere Information Server DataStage

  • Versions 9.1, 11.3, and 11.5

Oracle RDBMS

  • Versions 11g and 12c

Configuration process

Here is a brief explanation of the steps I used to implement the where clause as a parameter. Please note that in this example I am using a job parameter to populate only a portion of the where clause; you can certainly pass the entire where clause as a parameter, if it is not too long.

Configure Project Variable in Administrator

  • Add APT_OSL_PARAM_ESC_SQUOTE to the project in Administrator
  • Populate the APT_OSL_PARAM_ESC_SQUOTE variable with the escape character \

APT_OSL_PARAM_ESC_SQUOTE Project Variable

Create job parameter

Following your project naming convention or standard practice (if your customer and/or project do not have established naming conventions, follow your own), create the job parameter in the job. See the jp_ItemSource parameter in the image below.

Job Parameter In Oracle Connector

Add job parameter to Custom SQL in the Oracle Connector Stage

Once the job parameter has been created, add the job parameter to the SQL statement of the job.

Job Parameter In SQL
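For illustration, the custom SQL might look something like the sketch below; the table and column names are hypothetical, and at run time DataStage replaces #jp_ItemSource# with the parameter value, e.g., the single-quoted list 'RETAIL','WHOLESALE' that APT_OSL_PARAM_ESC_SQUOTE lets through intact.

-- Hypothetical Oracle Connector custom SQL using the job parameter
SELECT item_id,
       item_name,
       item_source
FROM item_master
WHERE item_source IN (#jp_ItemSource#)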

Related References

IBM Knowledge Center > InfoSphere Information Server 11.5.0

Connecting to data sources > Databases > Oracle databases > Oracle connector

IBM Support > Limitation of the Parameter APT_OSL_PARAM_ESC_SQUOTE on Plugins on Parallel Canvas

IBM Knowledge Center > InfoSphere Information Server 11.5.0

InfoSphere DataStage and Quality > Stage > Reference > Parallel Job Reference > Environment Variables > Miscellaneous > APT_OSL_PARAM_ESC_SQUOTE

 

How to know if your Oracle Client install is 32 Bit or 64 Bit


Sometimes you just need to know whether your Oracle Client install is 32-bit or 64-bit. But how do you figure that out? Here are two methods you can try.

The first method

Go to the %ORACLE_HOME%\inventory\ContentsXML folder and open the comps.xml file.
Look for <DEP_LIST> on roughly the second screen.

If you see PLAT="NT_AMD64", your Oracle Home is 64-bit.
If you see PLAT="NT_X86", your Oracle Home is 32-bit.

It is possible to have both the 32-bit and the 64-bit Oracle Homes installed.

The second method

This method is a bit faster. Windows has a different lib directory for 32-bit and 64-bit software. Look under the ORACLE_HOME folder: if you see both a "lib" AND a "lib32" folder, you have a 64-bit Oracle Client; if you see just the "lib" folder, you've got a 32-bit Oracle Client.


OLTP vs Data Warehousing


I've tried to explain the difference between OLTP systems and a Data Warehouse to my managers many times, as I've worked at a hospital as a Data Warehouse manager/data analyst for many years. Why was the list that came from the operational applications different from the one that came from the Data Warehouse? Why couldn't I just get a list of the patients lying in the hospital right now from the Data Warehouse? So I explained, and explained again, and explained to another manager, and another. You get the picture.
In this article I will explain this very same thing to you, so you know how to explain it to your manager. Or, if you are a manager, you might understand what your data analyst can and cannot give you.

OLTP

OLTP stands for OnLine Transaction Processing. In other words: getting your data directly from the operational systems to make reports. An operational system is a system that is used for the day-to-day processes.
For example: when a patient checks in, his or her information gets entered into a Patient Information System. The doctor puts scheduled tests, a diagnosis, and a treatment plan in there as well. Doctors, nurses, and other people working with patients use this system on a daily basis to enter and get detailed information on their patients.
The data within operational systems is stored so that it can be used efficiently by the people working directly on the product, or with the patient in this case.

Data Warehousing

A Data Warehouse is a big database that fills itself with data from operational systems. It is used solely for reporting and analytical purposes. No one uses this data for day-to-day operations. The beauty of a Data Warehouse is, among other things, that you can combine the data from the different operational systems. You can, for example, relate the number of patients in a department to the number of nurses. You can see how far a doctor is behind schedule and find the cause by looking at the patients. Does he run late with elderly patients? Is there a particular diagnosis that takes more time? Or does he just oversleep a lot? You can use this information to look at the past and see trends, so you can plan for the future.

The difference between OLTP and Data Warehousing

This is how a Data Warehouse works:

How a Data Warehouse works

The data gets entered into the operational systems. The ETL processes then Extract this data from those systems, Transform the data so it fits neatly into the Data Warehouse, and Load it into the Data Warehouse. After that, reports are built with a reporting tool from the data that lies in the Data Warehouse.

This is how OLTP works:

How OLTP works

Reports are made directly from the data inside the database of the operational systems. Some operational systems come with their own reporting tool, but you can always use a standalone reporting tool to make reports from the operational databases.

Pros and Cons

Data Warehousing

Pros:

  • There is no strain on the operational systems during business hours
    • As you can schedule the ETL processes to run during the hours when the fewest people are using the operational system, you won't disturb the operational processes. And when you need to run a large query, the operational systems won't be affected, as you are working directly on the Data Warehouse database.
  • Data from different systems can be combined
    • It is possible to combine finance and productivity data, for example, as the ETL process transforms the data so it can be combined.
  • Data is optimized for making queries and reports
    • You use different data in reports than you use on a day-to-day basis. A Data Warehouse is built for this. For instance, most Data Warehouses have a separate date table where the weekday, day, month, and year are saved. You can make a query to derive the weekday from a date, but that takes processing time; by using a separate table like this you save time and decrease the strain on the database (see the sketch after this list).
  • Data is saved longer than in the source systems
    • The source systems need to have their old records deleted when they are no longer used in the day-to-day operations, so those records get deleted to maintain performance.
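As a hedged illustration of the date-table point above (all table and column names are hypothetical, and the derived version is shown in T-SQL):

-- Warehouse style: the weekday was pre-computed when dim_date was loaded
SELECT f.sale_amount, d.weekday_name
FROM fact_sales f
JOIN dim_date d ON d.date_key = f.date_key;

-- OLTP style: the weekday is derived per row, at query time
SELECT sale_amount, DATENAME(WEEKDAY, sale_date) AS weekday_name
FROM sales;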

Cons:

  • You always look at the past
    • A Data Warehouse is updated once a night, or even just once a week, so you never have the latest data. Staying with the hospital example: you never know how many patients are in the hospital right now, or which surgeon didn't show up on time this morning.
  • You don't have all the data
    • A Data Warehouse is built for discovering trends and showing the big picture. The little details, the ones not used in trends, get discarded during the ETL process.
  • Data isn't the same as the data in the source systems
    • Because the data is older than that of the source systems, it will always be a little different. The Transformation step in the ETL process also changes the data a little. That doesn't mean one or the other is wrong; it's just a different way of looking at the data. For example: the Data Warehouse at the hospital excluded all transactions that were marked as cancelled. If you try to get the same report from both systems, and don't exclude the cancelled transactions in the source system, you'll get different results.

Online Transaction Processing (OLTP)

Pros:

  • You get real-time data
    • If someone is entering a new record now, you'll see it right away in your report. No delays.
  • You've got all the details
    • You have access to all the details that the employees have entered into the system. No grouping, no skipped records, just all the raw data that's available.

Cons:

  • You are putting strain on an application during business hours
    • When you run a large query, you can take processing capacity that would otherwise be available to the people who need the system for their day-to-day operations. And if you make an error, for instance by forgetting to put a date filter on your query, you could even bring the system down so no one can use it anymore.
  • You can't compare the data with data from other sources
    • Even when the systems are similar, like an HR system and a payroll system that work together, data is always going to be different because it is granulated on a different level, or because not all data is relevant to both systems.
  • You don't have access to old data
    • To keep the applications at peak performance, old data that's irrelevant to day-to-day operations is deleted.
  • Data is optimized to suit day-to-day operations
    • And not report making. This means you'll have to get creative with your queries to get the data you need.

So which method should you use?

That all depends on what you need at that moment. If you need detailed information about things that are happening now, you should use OLTP.
If you are looking for trends, or insights at a higher level, you should use a Data Warehouse.


Oracle – How to get a list of user permission grants


Since the InfoSphere Information Server repository has to be installed manually with the scripts provided in the IBM software, you sometimes run into difficulties. So, here's a quick script which I have found useful in the past to identify user permissions for the IAUSER on Oracle databases, to help run down discrepancies in user permissions.

 

SELECT *
FROM ALL_TAB_PRIVS
WHERE GRANTEE = 'IAUSER';  -- Oracle stores unquoted user names in uppercase

 

If we cannot run against the ALL_TAB_PRIVS view, then we can try the USER_TAB_PRIVS view:

 

SELECT *
FROM USER_TAB_PRIVS
WHERE GRANTEE = 'IAUSER';

 

Related References

Oracle Help Center > Database Reference > ALL_TAB_PRIVS

Netezza / PureData – Table Describe SQL


If you want to describe a PureData / Netezza table in SQL, it can be done, but Netezza doesn't have a describe command. Here is a quick SQL which will give the basic structure of a table or a view. Honestly, if you have Aginity, generating the DDL is faster and more informative, at least to me. If you have permission to access nzsql, you can also use the slash commands (e.g., \d).

Example Netezza Table Describe SQL

select name        as Table_Name,
       owner       as Table_Owner,
       createdate  as Table_Created_Date,
       type        as Table_Type,
       database    as Database_Name,
       schema      as Database_Schema,
       attnum      as Field_Order,
       attname     as Field_Name,
       format_type as Field_Type,
       attnotnull  as Field_Not_Null_Indicator,
       attlen      as Field_Length
from _v_relation_column
where name = '<<Table Name Here>>'
order by attnum;

 

Related References

IBM Knowledge Center, PureData System for Analytics, Version 7.2.1

IBM Netezza database user documentation, Command-line options for nzsql, Internal slash options

IBM Knowledge Center, PureData System for Analytics, Version 7.2.1

IBM Netezza getting started tips, About the Netezza data warehouse appliance, Commands and queries, Basic Netezza SQL information, Commonly used nzsql internal slash commands

IBM Knowledge Center, PureData System for Analytics, Version 7.2.1

IBM Netezza database user documentation, Netezza SQL introduction, The nzsql command options, Slash options

 

 

Netezza / PureData – Substring Function Example


The Substring function (SUBSTRING or SUBSTR) in Netezza PureData provides the capability to parse character-type fields based on position within a character string.

Substring Functions Basic Syntax

SUBSTRING Function Syntax

SUBSTRING(<<CharacterField>>, <<StartingPosition integer>>, <<NumberOfCharacters integer, optional>>)

 

SUBSTR Function Syntax

SUBSTR(<<CharacterField>>, <<StartingPosition integer>>, <<NumberOfCharacters integer, optional>>)

 

Example Substring SQL

Netezza / PureData Substring Example

Substring SQL Used In Example

SELECT LOCATIONTEXT

-- From the left of the string, using the SUBSTRING function
, '==SUBSTRING From the Left==' as Divider1
, SUBSTRING(LOCATIONTEXT, 1, 5) as Beginning_Using_SUBSTRING_LFT
, SUBSTRING(LOCATIONTEXT, 7, 6) as Middle_Using_SUBSTRING_LFT
, SUBSTRING(LOCATIONTEXT, 15) as End_Using_SUBSTRING_LFT

-- Using the SUBSTR function
, '==SUBSTR From the Left==' as Divider2
, SUBSTR(LOCATIONTEXT, 1, 5) as Beginning_Using_SUBSTR_LFT
, SUBSTR(LOCATIONTEXT, 7, 6) as Middle_Using_SUBSTR_LFT
, SUBSTR(LOCATIONTEXT, 15) as End_Using_SUBSTR_LFT

-- From the right of the string
, '==SUBSTRING From the Right==' as Divider3
, SUBSTRING(LOCATIONTEXT, LENGTH(LOCATIONTEXT) - 18, 8) as Beginning_Using_SUBSTRING_RGT
, SUBSTRING(LOCATIONTEXT, LENGTH(LOCATIONTEXT) - 9, 6) as Middle_Using_SUBSTRING_RGT
, SUBSTRING(LOCATIONTEXT, LENGTH(LOCATIONTEXT) - 1) as End_Using_SUBSTRING_RGT

, '==SUBSTR From the Right==' as Divider4
, SUBSTR(LOCATIONTEXT, LENGTH(LOCATIONTEXT) - 18, 8) as Beginning_Using_SUBSTR_RGT
, SUBSTR(LOCATIONTEXT, LENGTH(LOCATIONTEXT) - 9, 6) as Middle_Using_SUBSTR_RGT
, SUBSTR(LOCATIONTEXT, LENGTH(LOCATIONTEXT) - 1) as End_Using_SUBSTR_RGT

FROM BLOG.D_ZIPCODE
where STATE = 'PR'
  AND CITY = 'REPTO ROBLES';

Related References

IBM Knowledge Center, PureData System for Analytics, Version 7.2.1

IBM Netezza database user documentation, Netezza SQL basics, Netezza SQL extensions, Character string functions

IBM Knowledge Center, PureData System for Analytics, Version 7.1.0

IBM Netezza Database User’s Guide, Netezza SQL basics, Functions and operators, Functions, Standard string functions

IBM Knowledge Center, PureData System for Analytics, Version 7.2.1

IBM Netezza database user documentation, Netezza SQL command reference, Functions

Netezza / PureData – Substring Function On Specific Delimiter


The Substring function (SUBSTRING or SUBSTR) in Netezza PureData parses character-type fields based on position within a character string. However, with a little creativity, it is possible to substring based on the position of a specific character in the string. This approach gives the substring function more flexibility and makes it more useful in many cases; it works fine with either the SUBSTRING or the SUBSTR function. In this example, I used the POSITION function to provide the numbers for the substring command.

 

Example Substring SQL

Netezza PureData Substring Function On Specific Character In String

 

Substring SQL Used In Example

select LOCATIONTEXT

, position(',' in LOCATIONTEXT) as Comma_Position_In_String

-- Without adjustment
, SUBSTRING(LOCATIONTEXT, position(',' in LOCATIONTEXT)) as Substring_On_Comma

-- Adjusted to account for the extra space
, SUBSTRING(LOCATIONTEXT, position(',' in LOCATIONTEXT) + 2) as Substring_On_Comma_Adjusted

, '==Breaking_Up_The_String==' as Divider

-- Breaking up the string
, SUBSTRING(LOCATIONTEXT, 1, position(' ' in LOCATIONTEXT) - 1) as Beginning_Of_String
, SUBSTRING(LOCATIONTEXT, position(' ' in LOCATIONTEXT) + 1, position(' ' in LOCATIONTEXT) - 1) as Middle_Of_String
, SUBSTRING(LOCATIONTEXT, position(',' in LOCATIONTEXT) + 2) as End_Of_String

FROM Blog.D_ZIPCODE
where STATE = 'PR'
  AND CITY = 'REPTO ROBLES';

Related References

IBM Knowledge Center, PureData System for Analytics, Version 7.2.1

IBM Netezza database user documentation, Netezza SQL basics, Netezza SQL extensions, Character string functions

IBM Knowledge Center, PureData System for Analytics, Version 7.1.0

IBM Netezza Database User’s Guide, Netezza SQL basics, Functions and operators, Functions, Standard string functions

IBM Knowledge Center, PureData System for Analytics, Version 7.2.1

IBM Netezza database user documentation, Netezza SQL command reference, Functions

Netezza / PureData – Position Function


 

The POSITION function in Netezza is simple enough: it returns the position of a specified character within a string (char, varchar, nvarchar, etc.), or zero if the character is not found. The real power of this function comes when you embed it in character functions that require a numeric argument, but where the character's position is inconsistent from row to row in a field.

The Position Function’s Basic Syntax

position(<<character or Character String>> in <<CharacterFieldName>>)

 

Example Position Function SQL

Netezza PureData Position Function

 

Position Function SQL Used in Example

select LOCATIONTEXT, CITY

, '==Position Function Return Values==' as Divider

, position(',' in LOCATIONTEXT) as Position_In_Nbr_String

, position('-' in LOCATIONTEXT) as Position_Value_Not_Found

, '==Position Combined with Substring Function==' as Divider2

, SUBSTRING(LOCATIONTEXT, position(',' in LOCATIONTEXT) + 2) as Position_Used_In_Substring_Function

FROM Blog.D_ZIPCODE where STATE = 'MN' AND CITY = 'RED WING' limit 1;

 

 

Related References

IBM Knowledge Center, PureData System for Analytics, Version 7.1.0

IBM Netezza Database User’s Guide, Netezza SQL basics, Functions and operators, Functions, Standard string functions

IBM Knowledge Center, PureData System for Analytics, Version 7.2.1

IBM Netezza database user documentation, Netezza SQL command reference, Functions

 

Data Modeling – Column Data Classification


 

Column Data Classification

When analyzing individual column data at its most foundational level, column data can be classified by its fundamental use and characteristics. Granted, when you start rolling the structure up into multiple columns, table structures, and table relationships, other classifications and behaviors come into play, such as keys (primary and foreign), indexes, and distribution. However, many times when working with existing data sets it is essential to understand the nature of the existing data to begin the modeling and information governance process.

Generally, individual columns can be classified into the following classifications (a table sketch illustrating each one follows the list):

  • Identifier — A column/field which is unique to a row and/or can identify related data (e.g., Person ID, National Identifier). Basically, think primary key and/or foreign key.
  • Indicator — A column/field, often called a flag, that has a binary condition (e.g., True or False, Yes or No, Female or Male, Active or Inactive). Frequently used to indicate compliance with a specific business rule.
  • Code — A column/field that has a distinct and defined set of values, often abbreviated (e.g., State Code, Currency Code)
  • Temporal — A column/field that contains some type of date, timestamp, time, interval, or numeric duration data
  • Quantity — A column/field that contains a numeric value (decimals, integers, etc.) and is not classified as an Identifier or Code (e.g., Price, Amount, Asset Value, Count)
  • Text — A column/field that contains alphanumeric values, possibly long text, and is not classified as an Identifier or Code (e.g., Name, Address, Long Description, Short Description)
  • Large Object (LOB) — A column/field that contains traditional long-text data or binary data like graphics. Large objects can be broadly classified as Character Large Objects (CLOBs), Binary Large Objects (BLOBs), and Double-Byte Character Large Objects (DBCLOBs or NCLOBs).
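As a hedged illustration, the hypothetical table below assigns each column one of the classifications above; the names and types are invented for the example.

-- Hypothetical table annotated with the column classifications
CREATE TABLE patient_visit (
    visit_id          INTEGER        NOT NULL,  -- Identifier (primary key)
    patient_id        INTEGER        NOT NULL,  -- Identifier (foreign key)
    active_flag       CHAR(1),                  -- Indicator (Y/N)
    state_code        CHAR(2),                  -- Code
    admit_ts          TIMESTAMP,                -- Temporal
    total_charge      DECIMAL(12,2),            -- Quantity
    visit_note        VARCHAR(4000),            -- Text
    discharge_summary CLOB                      -- Large Object (CLOB)
);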
