Netezza / PureData – How To Quote a Single Quote in Netezza SQL

How To Quote a Single Quote in Netezza SQL?

The short answer is to use four single quotes (””), which will result in a single quote within the select statement results.

How to Assemble the SQL to Quote a Single Quote in a SQL Select Statement

Knowing how to construct a list to embed in a SQL where clause ‘in’ list or to add to an ETL job can be a serious time saver eliminating the need to manually edit large lists.  In the example below, I used the Select result set to create a rather long list of values, which needed to be included in an ELT where clause.  By:

  • Adding the comma delimiter (‘,’) and a Concatenate (||) on the front
  • Followed by adding a quoted single Quote (entered as four single quotes (””)) and a Concatenate (||)
  • The Field I which to have delaminated and Quoted (S1.ORDER_NUM)
  • And closed with a quoted single Quote (entered as four single quotes (””))

This results in a delimited and quoted list ( ,’116490856′) which needs only to have the parentheses added and the first comma removed, which is much less work than manually editing the 200 item that resulted from this select.

Example SQL:

SELECT Distinct

‘,’||””|| S1.ORDER_NUM||”” as Quoted_Order_Number

FROM Sales S1

How to Quote A Single Quote Example SQL
How to Quote A Single Quote Example SQL

Related Reference

How to know if your Oracle Client install is 32 Bit or 64 Bit

Oracle Database, How to know if your Oracle Client install is 32 Bit or 64 Bit
Oracle Database

 

How to know if your Oracle Client install is 32 Bit or 64 Bit

Sometimes you just need to know if your Oracle Client install is 32 bit or 64 bit. But how do you figure that out? Here are two methods you can try.

The first method

Go to the %ORACLE_HOME%\inventory\ContentsXML folder and open the comps.xml file.
Look for <DEP_LIST> on the ~second screen.

If you see this: PLAT=”NT_AMD64” then your Oracle Home is 64 bit
If you see this: PLAT=”NT_X86” then your Oracle Home is 32 bit.

It is possible to have both the 32-bit and the 64-bit Oracle Homes installed.

The second method

This method is a bit faster. Windows has a different lib directory for 32-bit and 64-bit software. If you look under the ORACLE_HOME folder if you see a “lib” AND a “lib32” folder you have a 64 bit Oracle Client. If you see just the “lib” folder you’ve got a 32 bit Oracle Client.

Related References

 

OLTP vs Data Warehousing

OLTP Versus Data Warehousing

I’ve tried to explain the difference between OLTP systems and a Data Warehouse to my managers many times, as I’ve worked at a hospital as a Data Warehouse Manager/data analyst for many years. Why was the list that came from the operational applications different than the one that came from the Data Warehouse? Why couldn’t I just get a list of patients that were laying in the hospital right now from the Data Warehouse? So I explained, and explained again, and explained to another manager, and another. You get the picture.
In this article I will explain this very same thing to you. So you know  how to explain this to your manager. Or, if you are a manager, you might understand what your data analyst can and cannot give you.

OLTP

OLTP stands for OLine Transactional Processing. With other words: getting your data directly from the operational systems to make reports. An operational system is a system that is used for the day to day processes.
For example: When a patient checks in, his or her information gets entered into a Patient Information System. The doctor put scheduled tests, a diagnoses and a treatment plan in there as well. Doctors, nurses and other people working with patients use this system on a daily basis to enter and get detailed information on their patients.
The way the data is stored within operational systems is so the data can be used efficiently by the people working directly on the product, or with the patient in this case.

Data Warehousing

A Data Warehouse is a big database that fills itself with data from operational systems. It is used solely for reporting and analytical purposes. No one uses this data for day to day operations. The beauty of a Data Warehouse is, among others, that you can combine the data from the different operational systems. You can actually combine the number of patients in a department with the number of nurses for example. You can see how far a doctor is behind schedule and find the cause of that by looking at the patients. Does he run late with elderly patients? Is there a particular diagnoses that takes more time? Or does he just oversleep a lot? You can use this information to look at the past, see trends, so you can plan for the future.

The difference between OLTP and Data Warehousing

This is how a Data Warehouse works:

The data gets entered into the operational systems. Then the ETL processes Extract this data from these systems, Transforms the data so it will fit neatly into the Data Warehouse, and then Loads it into the Data Warehouse. After that reports are formed with a reporting tool, from the data that lies in the Data Warehouse.

This is how OLTP works:

Reports are directly made from the data inside the database of the operational systems. Some operational systems come with their own reporting tool, but you can always use a standalone reporting tool to make reports form the operational databases.

Pro’s and Con’s

Data Warehousing

Pro’s:

  • There is no strain on the operational systems during business hours
    • As you can schedule the ETL processes to run during the hours the least amount of people are using the operational system, you won’t disturb the operational processes. And when you need to run a large query, the operational systems won’t be affected, as you are working directly on the Data Warehouse database.
  • Data from different systems can be combined
    • It is possible to combine finance and productivity data for example. As the ETL process transforms the data so it can be combined.
  • Data is optimized for making queries and reports
    • You use different data in reports than you use on a day to day base. A Data Warehouse is built for this. For instance: most Data Warehouses have a separate date table where the weekday, day, month and year is saved. You can make a query to derive the weekday from a date, but that takes processing time. By using a separate table like this you’ll save time and decrease the strain on the database.
  • Data is saved longer than in the source systems
    • The source systems need to have their old records deleted when they are no longer used in the day to day operations. So they get deleted to gain performance.

Con’s:

  • You always look at the past
    • A Data Warehouse is updated once a night, or even just once a week. That means that you never have the latest data. Staying with the hospital example: you never knew how many patients are in the hospital are right now. Or what surgeon didn’t show up on time this morning.
  • You don’t have all the data
    • A Data Warehouse is built for discovering trends, showing the big picture. The little details, the ones not used in trends, get discarded during the ETL process.
  • Data isn’t the same as the data in the source systems
    • Because the data is older than those of the source systems it will always be a little different. But also because of the Transformation step in the ETL process, data will be a little different. It doesn’t mean one or the other is wrong. It’s just a different way of looking at the data. For example: the Data Warehouse at the hospital excluded all transactions that were marked as cancelled. If you try to get the same reports from both systems, and don’t exclude the cancelled transactions in the source system, you’ll get different results.

online transactional processing (OLTP)

Pro’s

  • You get real time data
    • If someone is entering a new record now, you’ll see it right away in your report. No delays.
  • You’ve got all the details
    • You have access to all the details that the employees have entered into the system. No grouping, no skipping records, just all the raw data that’s available.

Con’s

  • You are putting strain on an application during business hours.
    • When you are making a large query, you can take processing space that would otherwise be available to the people that need to work with this system for their day to day operations. And if you make an error, by for instance forgetting to put a date filter on your query, you could even bring the system down so no one can use it anymore.
  • You can’t compare the data with data from other sources.
    • Even when the systems are similar. Like an HR system and a payroll system that use each other to work. Data is always going to be different because it is granulated on a different level, or not all data is relevant for both systems.
  • You don’t have access to old data
    • To keep the applications at peak performance, old data, that’s irrelevant to day to day operations is deleted.
  • Data is optimized to suit day to day operations
    • And not for report making. This means you’ll have to get creative with your queries to get the data you need.

So what method should you use?

That all depends on what you need at that moment. If you need detailed information about things that are happening now, you should use OLTP.
If you are looking for trends, or insights on a higher level, you should use a Data Warehouse.

 Related References

Aginity For Netezza – How to Generate DDL

Aginity, Aginity for Netezza, Netezza, PureData, DDL, SQL
Aginity

How to Generate Netezza Object DDL

In ‘Aginity for Netezza’ this process is easy, if you have a user with sufficient permissions.

The basic process is:

  • In the object browser, navigate to the Database
  • select the Object (e.g. table, view, stored procedure)
  • Right Click, Select ‘Script’ > ‘DDL to query window’
  • The Object DDL will appear in the query window

Create DDL to Query Window
Create DDL to Query Window

Related References

 

Aginity for Netezza – How to Display Query Results in a Single Row Grid

Aginity
Aginity

Displaying your Netezza query results in a grid can be useful.  Especially, when desiring to navigation left and right to see an entire rows data and to avoid the distraction of other rows being displayed on the screen. I use this capability in Aginity when I’m proofing code results and/or validating data in a table.

How To switch to the Single Row Grid

  • Execute your Query
  • When the results return, right click on the gray bar above your results (where you see the drag a column box
  • Choose the ‘Show a Single Row Grid’ Menu item

    Aginity Show Single Row Grid
    Aginity Show Single Row Grid

 

Grid View Change

  • Your result display will change from a horizontal row to a vertical grid as shown below

Aginity Single Row Grid Display
Aginity Single Row Grid Display

How to Navigate in the Single Row Grid

  • To navigate in the single row grid, use the buttons provided at the bottom of the results section.

Aginity Single Row Grid Navigation Buttons
Aginity Single Row Grid Navigation Buttons

Related References

 

Netezza / PureData – ERROR [HY000] ERROR: Bad timestamp external representation ‘0000-00-00’

SQL (Structured Query Language),  Bad timestamp external representation '0000-00-00'
Bad timestamp external representation ‘0000-00-00’

While working on a recent data conversion a data timestamp error occurred, where the legacy code and data was ‘0000-00-00’ default stamp, which was causing errors in Netezza/PureData Analytics (PDA).  ‘0000-00-00’ is not a valid timestamp in Netezza and will produce a ‘ ERROR [HY000] ERROR:  Bad timestamp external representation ‘0000-00-00’’ notice.

When handling default dates in SQL and ETL’s, generally speaking, ‘1901-01-01’ or ‘0001-01-01’ as use as defaults for null and/or invalid dates.  Both of these dates and or timestamps will insert into Netezza, which I used a quick, simple, proof table to demonstrate, which were SQL extracted and in the table below.

Valid Defaults dates in Netezza

TEST_SKTEST_TS
11901-01-01 00:00:00
2 00:00:00

Related References

Temporal data types

PureData System for Analytics, PureData System for Analytics 7.2.1, IBM Netezza database user documentation, Netezza SQL basics, Data types, Temporal data types

Netezza date/time data type representations

PureData System for Analytics, PureData System for Analytics 7.2.1, IBM Netezza user-defined functions, Data type helper API reference, Temporal data type helper functions, Netezza date/time data type representations

Date/time functions

PureData System for Analytics, PureData System for Analytics 7.2.1, IBM Netezza database user documentation, Netezza SQL basics, Netezza SQL extensions, Date/time functions

Netezza / PureData – Row number within a group

Netezza / PureData – row number within a group
SQL (Structured Query Language)

Sometimes it is useful to be able to number data with its groups, which can be quickly and easily done using the Row Number function in Netezza. However, you do need to pay particular attention to you ‘partition by’ (conceptually a group by operation) field list and field order, as well as, you ‘Order by’ field list.  Otherwise, you can get some bizarre results.  it finds it helpful to think of them as having a parent, child, relationship.  In this line of thought, your ‘partition by’ field list should usually have one less more field than your ‘order by’.  This is not an absolute rule but under normal circumstances your  ‘order by’ field list should contain all the field in your ‘partition by’  and, occasionally, provide an order by direction (ascending or descending).

SQL Netezza Row Number Format

ROW_NUMBER() OVER( PARTITION BY <<Partition_By_Field_list>> ORDER BY <<Sort_By_Field_List [direction asc/desc]>> ) as ROW_NUM

Example Row Number SQL

Row Number Within A Group
Row Number Within A Group

SELECT

qh_database

,qh_sessionid

, qh_tend

,ROW_NUMBER() OVER( PARTITION BY qh_database, qh_sessionid

ORDER BY qh_database, qh_sessionid, qh_tend desc ) as ROW_NUM

FROM nz_query_history

where qh_database = ‘system’

order by qh_database, qh_sessionid, qh_tend desc

Related References

IBM Knowledge Center, PureData System for Analytics, Version 7.2.1

IBM Netezza database user documentation, Netezza SQL basics, Netezza SQL extensions, Character string functions

IBM Knowledge Center, PureData System for Analytics, Version 7.1.0

IBM Netezza Database User’s Guide, Netezza SQL basics, Functions and operators, Functions, Standard string functions