DataStage – Netezza Connector Action Column

Over the years have occasionally use the action column feature, however, the last month or so I have found myself using it quite a lot. This is especially true in relation to the tea set and not just in relation to the change capture stage.

The first thing you need to know is, if you want to prevent getting the ‘no action column found’ notice on the target stage, need to ensure that the action column has been coded to be a single character field char (1). Otherwise, the Netezza connector stage will not recognize your field as an action column.

While most developers will commonly work with the action column feature in relation to the change capture stage, it can also be very useful if you have created a field from one or more inputs to tell you what behavior the row requires. I have found that this approach can be very useful and efficient under the right circumstances.

Example Pattern for Action Column Using Multiple Source Selects

Example Pattern for Action Column Using Multiple Source Selects

Action column configuration example

Action Column Field Type

Action Column Field Type

 Change Code Values Mapping To Action Column

  • Here’s a quick reference table to provide the interpretation of the change type code to the actual one character action column value to which it will need to be interpreted.

Change Code Type

Change Type Code

Action Column Value

Copy (Data Without Changes)

0

No
value for this Change Type

Insert

1

I

Delete

2

D

Update

3

U

Example Transformer Stage, Derivation

  •  Here is a quick transformer stage derivation coding example to take advantage of the action call capabilities. If you haven’t already handled the removal of the copy rows, you may also want to add a constraint.
  • The combination I most frequently find myself using is the insert and update combination.
if Lnk_Out_To_Tfm.change_code=1 then ‘I’

Else if Lnk_Out_To_Tfm.change_code=2 then ‘D’

Else if Lnk_Out_To_Tfm.change_code=3 then ‘U’

Related References

Home > InfoSphere Information Server 11.7.0 > InfoSphere DataStage and QualityStage > Developing parallel jobs > Introduction to InfoSphere DataStage Balanced Optimization > Job design considerations  > Specific considerations for the Netezza connector

Netezza / PureData – List of Views against a table

PureData Powered by Netezza

PureData Powered by Netezza

I have found myself using this simple, but useful SQL time in recent weeks to research different issues and to help with impact analysis.  So, I thought I would post it while I’m thinking about it.  It just gives a list of views using a table, which can be handy to know.  This SQL is simple and could be converted to an equi-join.  I used the like statement mostly because I sometimes want to know if there are other views a similar nature in the same family (by naming convention) of tables.

Select All Fields From The _V_View

This is the simplest form of this SQL to views, which a table.

Select * from _v_view

where DEFINITION like ‘%<<TABLE_NAME>>%’ ;

Select Minimal Fields From The _V_View

This is the version of the SQL, which I normally use, to list the views, which use a table.

Select VIEWNAME, OWNER from _v_view

where DEFINITION like ‘%<<TABLE_NAME>>%’;

Related References

Major Cloud Computing Models

Cloud Computing

Cloud Computing

Cloud computing enables convenient, ubiquitous, measures, and on-demand access to a shared pool of scalable and configurable resources, such as servers, applications, databases, networks, and other services. Also, these resources can be provisioned and released rapidly with minimum interaction and management from the provider.

The rapidly expanding technology is rife with obscure acronyms, with major ones being SaaS, PaaS, and IaaS. These acronyms distinguish the three major cloud computing models discussed in this article. Notably, cloud computing virtually meets any imaginable IT needs in diverse ways. In effect, the cloud computing models are necessary to show the role that a cloud service provides and how the function is accomplished. The three main cloud computing paradigms can be demonstrated on the diagram shown below.

The three major cloud computing models

The three major cloud computing models

Infrastructure as a Service (IaaS)

In infrastructure as a service model, the cloud provider offers a service that allows users to process, store, share, and user other fundamental computing resources to run their software, which can include operating systems and applications. In this case, a consumer has minimum control over the underlying cloud infrastructure, but has significant control over operating systems, deployed applications, storage, and some networking components, such as the host firewalls.

Based on its description, IaaS can be regarded as the lowest-level cloud service paradigm, and possibly the most crucial one. With this paradigm, a cloud vendor provides pre-configured computing resources to consumers via a virtual interface. From the definition, IaaS pertains underlying cloud infrastructure but does not include applications or an operating system. Implementation of the applications, operating system, and some network components, such as the host firewalls is left up to the end user. In other words, the role of the cloud provider is to enable access to the computing infrastructure necessary to drive and support their operating systems and application solutions.

In some cases, the IaaS model can provide extra storage for data backups, network bandwidth, or it can provide access to enhanced performance computing which was traditionally available using supercomputers. IaaS services are typically provided to users through an API or a dashboard.

Features of IaaS

  • Users transfer the cost of purchasing IT infrastructure to a cloud provider
  • Infrastructure offered to a consumer can be increased or reduced depending on business storage and processing needs
  • The consumer will be saved from challenges and costs of maintaining hardware
  • High availability of data is in the cloud
  • Administrative tasks are virtualized
  • IaaS is highly flexible compared to other models
  • Highly scalable and available
  • Permits consumers to focus on their core business and transfer critical IT roles to a cloud provider
Infrastructure as a Service (IaaS)

Infrastructure as a Service (IaaS)

IaaS Use Cases

A series of use cases can explore the above benefits and features afforded by IaaS. For instance, an organization that lacks the capital to own and manage their data centers can purchase an IaaS offering to achieve fast and affordable IT infrastructure for their business. Also, the IaaS can be expanded or terminated based on the consumer needs. Another set of companies that can deploy IaaS include traditional organizations seeking large computing power with low expenditure to run their workloads. IaaS model is also a good option for rapidly growing enterprises that avoid committing to specific hardware or software since their business needs are likely to evolve.

Popular IaaS Services

Major IT companies are offering popular IaaS services that are powering a significant portion of the Internet even without users realizing it.

Amazon EC2: Offers scalable and highly available computing capacity in the cloud. Allows users to develop and deploy applications rapidly without upfront investment in hardware

IBM’s SoftLayer: Cloud computing services offering a series of capabilities, such as computing, networking, security, storage, and so on, to enable faster and reliable application development. The solution features bare-metal, hypervisors, operating systems, database systems, and virtual servers for software developers.

NaviSite: offers application services, hosting, and managed cloud services for IT infrastructure

ComputeNext: the solution empowers internal business groups and development teams with DevOps productivity from a single API.

Platform as a Service (PaaS)

Platform as a service model involves the provision of capabilities that allow users to create their applications using programming languages, tools, services, and libraries owned and distributed by a cloud provider. In this case, the consumer has minimum control over the underlying cloud computing resources such as servers, storage, and operating system. However, the user has significant control over the applications developed and deployed on the PaaS service.

In PaaS, cloud computing is used to provide a platform for consumers to deploy while developing, initializing, implementing, and managing their application. This offering includes a base operating system and a suite of development tools and solutions. PaaS effectively eliminates the needs for consumers to purchase, implement and maintain the computing resources traditionally needed to build useful applications. Some people use the term ‘middleware’ to refer to PaaS model since the offering comfortably sits between SaaS and IaaS.

Features of PaaS

  • PaaS service offers a platform for development, tasking, and hosting tools for consumer applications
  • PaaS is highly scalable and available
  • Offer cost effective and simple way to develop and deploy applications
  • Users can focus on developing quality applications without worrying about the underlying IT infrastructure
  • Business policy automation
  • Many users can access a single development service or tool
  • Offers database and web services integration
  • Consumers have access to powerful and reliable server software, storage capabilities, operating systems, and information and application backup
  • Allows remote teams to collaborate, which improves employee productivity
Platform as a Service (PaaS)

Platform as a Service (PaaS)

PaaS Use Cases

Software development companies and other enterprises that want to implement agile development methods can explore PaaS capabilities in their business models. Many PaaS services can be used in application development. PaaS development tools and services are always updated and made available via the Internet to offer a simple way for businesses to develop, test, and prototype their software solutions. Since developers’ productivity is enhanced by allowing remote workers to collaborate, PaaS consumers can rapidly release applications and get feedback for improvement. PaaS has led to the emergence of the API economy in application development.

Popular PaaS Offerings

There exist major PaaS services that are helping organizations to streamline application development. PaaS offering is delivered over the Internet and allows developers to focus more on creating quality and highly functional application while not worrying about the operating system, storage, and other infrastructure.

Google’s App Engine: the solution allows developers to build scalable mobile and web backends in any language in the cloud. Users can bring their own language runtimes, third-party libraries, and frameworks

IBM BlueMix: this PaaS solution from IBM allows developers to avoid vendor lock-in and leverage the flexible and open cloud environment using diverse IBM tools, open technologies, and third-party libraries and frameworks.

Heroku: the solution provides companies with a platform where they can build, deliver, manage, and scale their applications while abstracting and bypassing computing infrastructure hassles

Apache Stratos: this PaaS offering offers enterprise-ready quality service, security, governance, and performance that allows development, modification, deployment, and distribution of applications.

Red Hat’s OpenShift: a container application platform that offers operations and development-centric tools for rapid application development, easy deployment, scalability, and long-term maintenance of applications

Software as a Service (SaaS)

Software as a service model involves the capabilities provided to users by using a cloud vendor’s application hosted and running on a cloud infrastructure. Such applications are conveniently accessible from different platforms and devices through a web browser, a thin client interface, or a program interface. In this model, the end user has minimum control of the underlying cloud-based computing resources, such as servers, operating system, or the application capabilities

SaaS can be described as software licensing and delivery paradigm that features a complete and functional software solutions provided to users on a metered and subscription basis. Since users access the application via browsers or thin client and program interfaces, SaaS makes the host operating system insignificant in the operation of the product. As mentioned, the service is metered. In this case, SaaS customers are billed based on their consumption, while others pay a flat monthly fee.

Features of SaaS

  • SaaS providers offer applications via subscription structure
  • User transfer the need to develop, install, manage, or upgrade applications to SaaS vendors
  • Applications and data is securely stored in the cloud
  • SaaS is easily managed from a central location
  • Remote serves are deployed to host the application
  • Users can access SaaS offering from any location with Internet access
  • On-premise hardware failure does not interfere with an application or cause data loss
  • Users can reduce or increase use of cloud-based resources depending on their processing and storage needs
  • Applications offered via SaaS model are accessible from any location and almost all Internet-enabled devices
Software as a Service (SaaS)

Software as a Service (SaaS)

SaaS Use Cases

SaaS use case is a typical use case for many companies seeking to benefit from quality application usage without the need to develop, maintain and upgrade the required components. Companies can acquire SaaS solutions for ERP, mail, office applications, collaboration tool, among others. SaaS is also crucial for small companies and startups that wish to launch e-commerce service rapidly but lack the time and resource to develop and maintain the software or buy servers for hosting the platform. SaaS is also used by companies with short-term projects that require collaboration from different members located remotely.

Popular SaaS Services

SaaS offerings are more widespread as compared to IaaS and PaaS. In fact, a majority of consumers use SaaS services without realizing it.

Office365: the cloud-based solution provides productivity software for subscribed consumers. Allows users to access Microsoft Office tools on various platforms, such as Android, MacOS, and Windows, etc.

Box: the SaaS offers secure file storage, sharing, and collaboration from any location and platform

Dropbox: modern application designed for collaboration and for creating, storing, and accessing files, docs, and folders.

Salesforce: the SaaS is among the leading customer relationship management platform that offers a series of capabilities for sales, marketing, service, and more.

Today, cloud computing models have revolutionized the way businesses deploy and manage computing resources and infrastructure. With the advent and evolution of the three major cloud computing models, that it IaaS, PaaS, and SaaS, consumers will find a suitable cloud offering that satisfies virtually all IT needs. These models’ capabilities coupled with competition from popular cloud computing service providers will continue availing IT solutions for consumers demanding for availability, enhanced performance, quality services, better coverage, and secure applications.

Consumers should review their business needs and do a cost-benefit analysis to approve the best model for their business. Also, consumers should conduct thorough workload assessment while migrating to a cloud service.

Big Data vs. Virtualization

Big Data Information Approaches

Big Data Information Approaches

Globally, organizations are facing challenges emanating from data issues, including data consolidation, value, heterogeneity, and quality. At the same time, they have to deal with the aspect of Big Data. In other words, consolidating, organizing, and realizing the value of data in an organization has been a challenge over the years. To overcome these challenges, a series of strategies have been devised. For instance, organizations are actively leveraging on methods such as Data Warehouses, Data Marts, and Data Stores to meet their data assets requirements. Unfortunately, the time and resources required to deliver value using these legacy methods is a distressing issue. In most cases, typical Data Warehouses applied for business intelligence (BI) rely on batch processing to consolidate and present data assets. This traditional approach is affected by the latency of information.

Big Data

As the name suggests, Big Data describes a large volume of data that can either be structured or unstructured. It originates from business processes among other sources. Presently, artificial intelligence, mobile technology, social media, and the Internet of Things (IoT) have become new sources of vast amounts of data. In Big Data, the organization and consolidation matter more than the volume of the data. Ultimately, big data can be analyzed to generate insights that can be crucial in strategic decision making for a business.

Features of Big Data

The term Big Data is relatively new. However, the process of collecting and preserving vast amounts of information for different purposes has been there for decades. Big Data gained momentum recently with the three V’s features that include volume, velocity, and variety.

Volume: First, businesses gather information from a set of sources, such as social media, day-to-day operations, machine to machine data, weblogs, sensors, and so on. Traditionally, storing the data was a challenge. However, the requirement has been made possible by new technologies such as Hadoop.

Velocity: Another defining nature of Big Data is that it flows at an unprecedented rate that requires real-time processing. Organizations are gathering information from RFID tags, sensors, and other objects that need timely processing of data torrents.

Variety: In modern enterprises, information comes in different formats. For instance, a firm can gather numeric and structured data from traditional databases as well as unstructured emails, video, audio, business transactions, and texts.

Complexity: As mentioned above, Big Data comes from diverse sources and in varying formats. In effect, it becomes a challenge to consolidate, match, link, cleanse, or modify this data across an organizational system. Unfortunately, Big Data opportunities can only be explored when an organization successfully correlates relationships and connects multiple data sets to prevent it from spiraling out of control.

Variability: Big Data can have inconsistent flows within periodic peaks. For instance, in social media, a topic can be trending, which can tremendously increase collected data. Variability is also common while dealing with unstructured data.

Big Data Potential and Importance

The vast amount of data collected and preserved on a global scale will keep growing. This fact implies that there is more potential to generate crucial insights from this information. Unfortunately, due to various issues, only a small fraction of this data actually gets analyzed. There is a significant and untapped potential that businesses can explore to make proper and beneficial use of this information.

Analyzing Big Data allows businesses to make timely and effective decisions using raw data. In reality, organizations can gather data from diverse sources and process it to develop insights that can aid in reducing operational costs, production time, innovating new products, and making smarter decisions. Such benefits can be achieved when enterprises combine Big Data with analytic techniques, such as text analytics, predictive analytics, machine learning, natural language processing, data mining and so on.

Big Data Application Areas

Practically, Big Data can be used in nearly all industries. In the financial sector, a significant amount of data is gathered from diverse sources, which requires banks and insurance companies to innovate ways to manage Big Data. This industry aims at understanding and satisfying their customers while meeting regulatory compliance and preventing fraud. In effect, banks can exploit Big Data using advanced analytics to generate insights required to make smart decisions.

In the education sector, Big Data can be employed to make vital improvements on school systems, quality of education and curriculums. For instance, Big Data can be analyzed to assess students’ progress and to design support systems for professors and tutors.

Healthcare providers, on the other hand, collect patients’ records and design various treatment plans. In the healthcare sector, practitioners and service providers are required to offer accurate and timely treatment that is transparent to meet the stringent regulations in the industry and to enhance the quality of life. In this case, Big Data can be managed to uncover insights that can be used to improve the quality of service.

Governments and different authorities can apply analytics to Big Data to create the understanding required to manage social utilities and to develop solutions necessary to solve common problems, such as city congestion, crime, and drug use. However, governments must also consider other issues such as privacy and confidentiality while dealing with Big Data.

In manufacturing and processing, Big Data offers insights that stakeholders can use to efficiently use raw materials to output quality products. Manufacturers can perform analytics on big data to generate ideas that can be used to increase market share, enhance safety, minimize wastage, and solve other challenges faster.

In the retail sector, companies rely heavily on customer loyalty to maintain market share in a highly competitive market. In this case, managing big data can help retailers to understand the best methods to utilize in marketing their products to existing and potential consumers, and also to sustain relationships.

Challenges Handling Big Data

With the introduction of Big Data, the challenge of consolidating and creating value on data assets becomes magnified. Today, organizations are expected to handle increased data velocity, variety, and volume. It is now a business necessity to deal with traditional enterprise data and Big Data. Traditional relational databases are suitable for storing, processing, and managing low-latency data. Big Data has increased volume, variety, and velocity, making it difficult for legacy database systems to efficiently handle it.

Failing to act on this challenge implies that enterprises cannot tap the opportunities presented by data generated from diverse sources, such as machine sensors, weblogs, social media, and so on. On the contrary, organizations that will explore Big Data capabilities amidst its challenges will remain competitive. It is necessary for businesses to integrate diverse systems with Big Data platforms in a meaningful manner, as heterogeneity of data environments continue to increase.

Virtualization

Virtualization involves turning physical computing resources, such as databases and servers into multiple systems. The concept consists of making the function of an IT resource simulated in software, making it identical to the corresponding physical object. Virtualization technique uses abstraction to create a software application to appear and operate like hardware to provide a series of benefits ranging from flexibility, scalability, performance, and reliability.

Typically, virtualization is made possible using virtual machines (VMs) implemented in microprocessors with necessary hardware support and OS-level implementations to enhance computational productivity. VMs offers additional convenience, security, and integrity with little resource overhead.

Benefits of Virtualization

Achieving the economics of wide-scale functional virtualization using available technologies is easy to improve reliability by employing virtualization offered by cloud service providers on fully redundant and standby basis. Traditionally, organizations would deploy several services to operate at a fraction of their capacity to meet increased processing and storage demands. These requirements resulted in increased operating costs and inefficiencies. With the introduction of virtualization, the software can be used to simulate functionalities of hardware. In effect, businesses can outstandingly eliminate the possibility of system failures. At the same time, the technology significantly reduces capital expense components of IT budgets. In future, more resources will be spent on operating, than acquisition expenses. Company funds will be channeled to service providers instead of purchasing expensive equipment and hiring local personnel.

Overall, virtualization enables IT functions across business divisions and industries to be performed more efficiently, flexibly, inexpensively, and productively. The technology meaningfully eliminates expensive traditional implementations.

Apart from reducing capital and operating costs for organizations, virtualization minimizes and eliminates downtime. It also increases IT productivity, responsiveness, and agility. The technology provides faster provisioning of resources and applications. In case of incidents, virtualization allows fast disaster recovery that maintains business continuity.

Types of Virtualization

There are various types of virtualization, such as a server, network, and desktop virtualization.

In server virtualization, more than one operating system runs on a single physical server to increase IT efficiency, reduce costs, achieve timely workload deployment, improve availability and enhance performance.

Network virtualization involves reproducing a physical network to allow applications to run on a virtual system. This type of virtualization provides operational benefits and hardware independence.

In desktop virtualization, desktops and applications are virtualized and delivered to different divisions and branches in a company. Desktop virtualization supports outsourced, offshore, and mobile workers who can access simulate desktop on tablets and iPads.

Characteristics of Virtualization

Some of the features of virtualization that support the efficiency and performance of the technology include:

Partitioning: In virtualization, several applications, database systems, and operating systems are supported by a single physical system since the technology allows partitioning of limited IT resources.

Isolation: Virtual machines can be isolated from the physical systems hosting them. In effect, if a single virtual instance breaks down, the other machine, as well as the host hardware components, will not be affected.

Encapsulation: A virtual machine can be presented as a single file while abstracting other features. This makes it possible for users to identify the VM based on a role it plays.

Data Virtualization – A Solution for Big Data Challenges

Virtualization can be viewed as a strategy that helps derive information value when needed. The technology can be used to add a level of efficiency that makes big data applications a reality. To enjoy the benefits of big data, organizations need to abstract data from different reinforcements. In other words, virtualization can be deployed to provide partitioning, encapsulation, and isolation that abstracts the complexities of Big Data stores to make it easy to integrate data from multiple stores with other data from systems used in an enterprise.

Virtualization enables ease of access to Big Data. The two technologies can be combined and configured using the software. As a result, the approach makes it possible to present an extensive collection of disassociated and structured and unstructured data ranging from application and weblogs, operating system configuration, network flows, security events, to storage metrics.

Virtualization improves storage and analysis capabilities on Big Data. As mentioned earlier, the current traditional relational databases are incapable of addressing growing needs inherent to Big Data. Today, there is an increase in special purpose applications for processing varied and unstructured big data. The tools can be used to extract value from Big Data efficiently while minimizing unnecessary data replication. Virtualization tools also make it possible for enterprises to access numerous data sources by integrating them with legacy relational data centers, data warehouses, and other files that can be used in business intelligence. Ultimately, companies can deploy virtualization to achieve a reliable way to handle complexity, volume, and heterogeneity of information collected from diverse sources. The integrated solutions will also meet other business needs for near-real-time information processing and agility.

In conclusion, it is evident that the value of Big Data comes from processing information gathered from diverse sources in an enterprise. Virtualizing big data offers numerous benefits that cannot be realized while using physical infrastructure and traditional database systems. It provides simplification of Big Data infrastructure that reduces operational costs and time to results. Shortly, Big Data use cares will shift from theoretical possibilities to multiple use patterns that feature powerful analytics and affordable archival of vast datasets. Virtualization will be crucial in exploiting Big Data presented as abstracted data services.

 

Netezza / Puredata – How to replace or trim CHAR(0) is NULL characters in a field

PureData Powered by Netezza

PureData Powered by Netezza

Occasionally, one runs into the problem of hidden field values breaking join criteria.  I have had to clean up bad archive and conversion data with hidden characters serval times over the last couple of weeks, so, I thought I might as well capture this note for future use.

I tried the Replace command which is prevalent for Netezza answers to this issue on the web, but my client’s version does not support that command.  So, I needed to use the Translate command instead to accomplish it.  It took a couple of searches of the usual bad actors to find the character causing the issue, which on this day was chr(0).  Here is a quick mockup of the command I used to solve this issue.

Example Select Statement

Here is a quick example select SQL to identify problem rows.

SELECT TRANSLATE(F.BLOGTYPE_CODE, CHR(0), ”) AS BLOGTYPE_CODE, BT.BLOG_TYP_ID, LENGTH(BT.BLOG_TYP_ID) AS LNGTH_BT, LENGTH(F.BLOGTYPE_CODE) AS LNGTH_ BLOGTYPE

FROM  BLOGS_TBL F,  BLOG_TYPES BT WHERE TRANSLATE(F.BLOGTYPE_CODE, CHR(0), ”) =  BT.BLOG_TYP_ID AND LENGTH(BT.BLOG_TYP_ID) <>Length(LENGTH(F.BLOGTYPE_CODE) ;

 

Example Update Statement

Here is a quick shell update statement to remove the Char(0) characters from the problem field.

Update <<Your Table Name>> A

Set A.<<Your Field Name>> = TRANSLATE(A.<<Your FieldName>>, CHR(0), ”)

where length(A.<<Your Field Name>>) <> Length(A.<<Your FieldName>>) And << Additional criteria>>;

 

 

 

SQL Server – how to know when a stored procedure ran last

Microsoft SQL Server 2017

Microsoft SQL Server 2017

This week I needed to know if a stored procedure was running when expected during our batch.  So, here is a quick couple of SQL to answer the question:

When a Stored Procedure was run last

This version of the SQL gives the date for the last time the Stored Procure was run:

select distinct   top 1     s.last_execution_time

from  sys.dm_exec_query_stats s

cross apply sys.dm_exec_query_plan (s.plan_handle) p

where  object_name(p.objectid, db_id(‘<<DATABASE_NAME>>’)) = ‘<<STORED_PROCEDURE_NAME>>’

Order by s.last_execution_time desc

Get a list of when Stored Procedure has been run

This version of the SQL provides a list of dates of when Stored Procure has been run:

select distinct   s.last_execution_time

from  sys.dm_exec_query_stats s

cross apply sys.dm_exec_query_plan (s.plan_handle) p

where object_name(p.objectid, db_id(‘<<DATABASE_NAME>>’)) = ‘<<STORED_PROCEDURE_NAME>>’

Order by s.last_execution_time desc

 

 

Data Warehousing vs. Data Virtualization

Information Management

Information Management

Today, a business heavily depends on data to gain insights into their processes and operations and to develop new ways to increase market share and profits. In most cases, data required to generate the insights are sourced and located in diverse places, which requires reliable access mechanism. Currently, data warehousing and data virtualization are two principal techniques used to store and access the sources of critical data in a company. Each approach offers various capabilities and can be deployed for particular use cases as described in this article.

Data Warehousing

A data warehouse is designed and developed to secure host historical data from different sources. In effect, this technique protects data sources from performance degradation caused by the impact of sophisticated analytics and enormous demands for reports. Today, various tools and platforms have been developed for data warehouse automation in companies. They can be deployed to quicken development, automate testing, maintenance, and other steps involved in data warehousing. In a data warehouse, data is stored as a series of snapshots, where a record represents data at a particular time. In effect, companies can analyze data warehouse snapshots to compare data between different periods. The results are converted into insights required to make crucial business decisions.

Moreover, a data warehouse is optimized for other functions, such as data retrieval. The technology duplicates data to allow database de-normalization that enhances query performance. The solution is further deployed to create an enterprise data warehouse (EDW) used to service the entire organization.

Data Warehouse Information Architecture

Data Warehouse Information Architecture

Features of a Data Warehouse

A data warehouse is subject-oriented, and it is designed to help entities analyze data. For instance, a company can start a data warehouse focused on sales to learn more about sales data. Analytics on this warehouse can help establish insights such as the best customer for the period. The data warehouse is subject oriented since it can be defined based on a subject matter.

A data warehouse is integrated. Data from various sources is first out into a consistent format. The process requires the firm to resolve some challenges, such as naming conflicts and inconsistencies on units of measure.

A data warehouse in nonvolatile. In effect, data entered into the warehouse should not change after it is stored. This feature increases accuracy and integrity in data warehousing.

A data warehouse is time variant since it focuses on data changes over time. Data warehousing discovers trends in business by using large amounts of historical data. In effect, a typical operation in a data warehouse scans millions of rows to return an output.

A data warehouse is designed and developed to handle ad hoc queries. In most cases, organizations may not predict the amount of workload of a data warehouse. Therefore, it is recommendable to optimize the data warehouse to perform optimally over any possible query operation.

A data warehouse is regularly updated by the ETL process using bulk data modification techniques. Therefore, end users cannot directly update the data warehouse.

Advantages of Data Warehousing

The primary motivation for developing a data warehouse is to provide timely information required for decision making in an organization. A business intelligence data warehouse serves as an initial checkpoint for crucial business data. When a company stores its data in a data warehouse, tracking it becomes natural. The technology allows users to perform quick searches to be able to retrieve and analyze static data.

Another driver for companies investing in data warehouses involves integrating data from disparate sources. This capability adds value to operational applications like customer relationship management systems. A well-integrated warehouse allows the solution to translate information to a more usable and straightforward format, making it easy for users to understand the business data.

The technology also allows organizations to perform a series of analysis on data.

A data warehouse reduces the cost to access historical data in an organization.

Data warehousing provides standardization of data across an organization. Moreover, it helps identify and eliminate errors. Before loading data, the solution shows inconsistencies to users and corrects them.

A data warehouse also improves the turnaround time for analysis and report generation.

The technology makes it easy for users to access and share data. A user can conduct a quick search on a data warehouse to find and analyze static data without wasting time.

Data warehousing removes informational processing load from transaction-oriented databases.

Disadvantages of Data Warehousing

While data warehousing technology is undoubtedly beneficial to many organizations, not all data warehouses are relevant to a business. In some cases, a data warehouse can be expensive to scale and maintain.

Preparing a data warehouse is time-consuming since it requires users to input raw data, which has to be achieved manually.

A data warehouse is not a perfect choice for handing unstructured and complex raw data. Moreover, it faces difficulties incompatibility. Depending on the data sources, companies may require a business intelligence team to ensure compatibility is achieved for data coming from sources running distinct operating systems and programs.

The technology requires a maintenance cost to continue working correctly. The solution needs to be updated with latest features that might be costly. Regularly maintaining a data warehouse will need a business to spend more on top of the initial investment.

A data warehouse use can be limited due to information privacy and confidentiality issues. In most cases, businesses collect and store sensitive data belonging to their clients. Viewing it is only allowed to individual employees, which limits the benefits offered by a data warehouse.

Data Warehousing Use Case

There are a series of ways organizations use data warehouses. Businesses can optimize the technology for performance by identifying the type of data warehouse they have.

  1. A data warehouses can be used by an organization that is struggling to report efficiently on business operations and activities. The solution makes it possible to access the required data
  2. A data warehouse is necessary for an organization where data is copied separately by different divisions for analysis in spreadsheets that are not consistent with one another.
  3. Data warehousing is crucial in organizations where uncertainties about data accuracy are causing executives to question the veracity of reports.
  4. A data warehouse is crucial for business intelligence acceleration. The technology delivers rapid data insights to analysts at different scales, concurrency, and without requiring manual tuning or optimization of a database.
Data Virtualization Information Architecture

Data Virtualization Information Architecture

Data Virtualization

Data virtualization technology does not require transfer or storage of data. Instead, users employ a combination of application programming interfaces (APIs) and metadata (data about data) to interface with data in different sources. Users use joined queries to gain access to the original data sources. In other words, data virtualization offers a simplified and integrated view to business data in real-time as requested by business users, applications, and analytics. In effect, the technology makes it possible to integrate data from distinct sources, formats, and locations, without replication. It creates a unified virtual data layer that delivers data services to support users and various business applications.

Data virtualization performs many of the same data integration functions, that is, extract, transform, and load, data replication, and federation. It leverages modern technology to deliver real-time data integration with agility, low cost, and high speed. In effect, data virtualization eliminates traditional data integration and reduces the need for replicated data warehouses and data marts in most cases.

Capabilities and Benefits of Data Virtualization

There are various benefits of implementing data virtualization in an organization.

Firstly, data virtualization allows access and leverage of all information that helps a firm achieve a competitive advantage. The solution offers a unified virtual layer that abstracts the underlying source complexity and presents disparate data sources as a single source.

Data virtualization is cheaper since it does not require actual hardware devices to be installed. In other words, organizations no longer need to purchase and dedicate a lot of IT resources and additional monetary investment to create on-site resources, similar to the one used in a data warehouse.

Data virtualization allows speedy deployment of resources. In this solution, resource provisioning is fast and straightforward. Organizations are not required to set up physical machines or to create local networks or install other IT components. Users have a single point of access to a virtual environment that can be distributed to the entire company.

Data virtualization is an energy-efficient system since the solution does not require additional local hardware and software. Therefore, an organization will not be required to install cooling systems.

Disadvantages of Data Virtualization

Data virtualization creates a security risk. In the modern world, having information is a cheap way to make money. In effect, company data is frequently targeted by hackers. Implementing data virtualization from disparate sources may give an opportunity to malicious users to steal critical information and use it for monetary gain.

Data virtualization requires a series of channels or links that must work in cohesion to perform the intended task. In this cases, all data sources should be available for virtualization to work effectively.

Data Virtualization Use Cases

  • Companies that rely on business intelligence require data virtualization for rapid prototyping to meet immediate business needs. Data virtualization can create a real-time reporting solution that unifies access to multiple internal databases.
  • Provisioning data services for single-view applications, such as in customer service and call center applications require data virtualization.

 

End Of Support For IBM InfoSphere 9.1.0

IBM Information Server (IIS)

IBM Information Server (IIS)

End of Support for IBM InfoSphere Information Server 9.1.0

IBM InfoSphere Information Server 9.1.0 will reach End of Support on 2018-09-30.  If you are still on the InfoSphere Information Server (IIS) 9.1.0, I hope you have a plan to migrate to an 11-series version soon.  InfoSphere Information Server (IIS) 11.7 would be worth considering if you don’t already own an 11-series license. InfoSphere Information Server (IIS) 11.7 will allow you to take advantage of the evolving thin client tools and other capabilities in the 2018 release pipeline without needing to perform another upgrade.

Related References

IBM Support, End of support notification: InfoSphere Information Server 9.1.0

IBM Support, Software lifecycle, InfoSphere Information Server 9.1.0

IBM Knowledge Center, Home, InfoSphere Information Server 11.7.0, IBM InfoSphere Information Server Version 11.7.0 documentation

What Is Machine Learning?

Machine Learning

Machine Learning

Machine learning is Artificial Intelligence (AI) which enables a system to learn from data rather than through explicit programming.  Machine learning uses algorithms that iteratively learn from data to improve, describe data, and predict outcomes.  As the algorithms ingest training data to produce a more precise machine learning model. Once trained, the machine learning model, when provided data will generate predictions based on the data that taught the model.  Machine learning is a crucial ingredient for creating modern analytics models.

Netezza / PureData – How to Substring on a Character

PureData Powered by Netezza

PureData Powered by Netezza

 

I had a reason this week to perform a substring on a character in Netezza this week, something I have not had a need to do before.  The process was not as straightforward as I would have thought, since the command is explained as a static position command, and the IBM documentation, honestly, wasn’t much help.  Knowing full well, that text strings are variable having to provide a static position is not terribly useful in and of itself.  So, we need to use an expression to make the substring command flexible and dynamic.

I did get it work the way I needed, but it took two commands to make it happen:

  • The First was the ’instr’ command to identify the field and character I wanted to substring on: instr(<<FIELD_NAME>>,’~’) as This provides the position number of the tilde (~).
  • The second was the ‘substr’ command in which I embedded the ‘instr’ command: substr(<<FIELD_NAME>>,0,instr(<<FIELD_NAME>>,’~’) )

This worked nicely for what I needed, which was to pick out a file name from the beginning of a string, which was delimited with a tilde (~)

Substring on a Character Command Format

  • This format example starts with position zero (0) as position 1 of substring command and goes to the first tilde (~) as position 2 of the substring command.
Select  <<FIELD_NAME>>

, instr(<>,’~’) as pos2

, substr(<<FIELD_NAME>>,0,instr(<<FIELD_NAME>>,’~’) ) as Results

From <<Table_Name>>

where  <<Where_Clause>>;

 

 

Related references

IBM Knowledge Center, Home, PureData System for Analytics 7.2.1, IBM Netezza database user documentation, Netezza SQL basics, Netezza SQL extensions, Character string functions

IBM Knowledge Center, Home PureData System for Analytics 7.0.3, IBM Netezza Database User’s Guide, Netezza SQL basics, Netezza SQL extensions, Character functions

Empowered

Empowered

Empowered

Power is one of the most sought-after, yet all too often misunderstood, aspects of the human condition. Everyone strives for greater power, be it in business, love or life in general. We all vie for the next advantage or seek to improve our own circumstances. This is a normal facet of humanity and should be treated as such. Power is, by itself and its definition, ambiguous – neither good or evil. It just is, yet its importance cannot be overstated. Personal power is the force that creates business titans and top managers, star athletes and household names. Yet cultivating the kind of Personal Power that galvanizes the mundane into the extraordinary means looking beyond the accepted concepts of Power to the underpinnings of what personal power really is, and what it isn’t.

Personal Power is not a position of authority or the ability to influence. Instead, Personal Power lies in the capacity, ability, and willingness to act – regardless of the environment. This book breaks down the barriers, so many of us face when we try to manifest our own personal power, explaining the roadblocks in our way and how to move past them.

From understanding the many forms of power to utilizing what you have and learning skills that can take you farther, this book is both a guide and a map, one that will help you unleash your potential by finding, and cultivating, the Personal Power within.

Empowered Kindle Edition

Empowered (Paperback)

Infosphere Information Server (IIS) – Where you can view DataStage and QualityStage Logs?

During the course of the week, the discussion happened regarding the different places where a person might read the DataStage and QualityStage logs in InfoSphere. I hadn’t really thought about it, but here are a few places that come to mind:

  • IBM InfoSphere DataStage and QualityStage Operations Console
  • IBM InfoSphere DataStage and QualityStage Director client
  • IBM InfoSphere DataStage and QualityStage Designer client by pressing Ctrl+L

Printable PDF Version of this Article

Related Reference

IBM Knowledge Center> InfoSphere Information Server 11.7.0 > InfoSphere DataStage and QualityStage > Monitoring jobs

IBM Knowledge Center > InfoSphere Information Server 11.7.0 > Installing > Troubleshooting software installation > Log files

Essbase Connector Error – Client Commands are Currently Not Being Accepted

DataStage Essbase Connector, Essbase Connector Error, Client Commands are Currently Not Being Accepted

DataStage Essbase Connector

While investigating a recent Infosphere Information Server (IIS), Datastage, Essbase Connect error I found the explanations of the probable causes of the error not to be terribly meaningful.  So, now that I have run our error to ground, I thought it might be nice to jot down a quick note of the potential cause of the ‘Client Commands are Currently Not Being Accepted’ error, which I gleaned from the process.

Error Message Id

  • IIS-CONN-ESSBASE-01010

Error Message

An error occurred while processing the request on the server. The error information is 1051544 (message on contacting or from application:[<<DateTimeStamp>>]Local////3544/Error(1013204) Client Commands are Currently Not Being Accepted.

Possible Causes of The Error

This Error is a problem with access to the Essbase object or accessing the security within the Essbase Object.  This can be a result of multiple issues, such as:

  • Object doesn’t exist – The Essbase object didn’t exist in the location specified,
  • Communications – the location is unavailable or cannot be reached,
  • Path Security – Security gets in the way to access the Essbase object location
  • Essbase Security – Security within the Essbase object does not support the user or filter being submitted. Also, the Essbase object security may be corrupted or incomplete.
  • Essbase Object Structure –  the Essbase object was not properly structured to support the filter or the Essbase filter is malformed for the current structure.

Related References

IBM Knowledge Center, InfoSphere Information Server 11.7.0, Connecting to data sources, Enterprise applications, IBM InfoSphere Information Server Pack for Hyperion Essbase

Printable PDF Version of This Article

 

Professional Emails include a signature Block

eMail, Professional Emails include a signature Block

eMail

I encountered, what I will admit is a pet peeve today, which is why I’m writing this article.  I needed contact someone whom I correspond with regularly, but I have no reason to call or be called by them.  So, after checking my phone, went to their email thinking this would be a fast and easy way to gather the contact information.  Well, not true.  I did eventually gather the information and contact the person, but what a waste of time, which is time they are being billed for one way or another.

Example Signature Block

Ewing A. BusinessProfessional

Senior, Technical Generalist

Favinger Enterprises, Inc.

100 Spacious Sky, Ice Flats, AZ 85001

Phone: (800) 900-1000 | http://www.favingerentprises.com

 

Which email should have a signature block?

  • The signature block should be on every email (both initiated by you and replied to by you), this was true even before the days of remote work, but for remote workers, contingent works, and works who travel frequently it can be a productive enhancer.
  • Plus, it is simply the professional thing to do and saves everyone time and frustration. Not to mention it makes you look unprofessional not having one. do you really want to do that to your personal brand?
  • As if that were not enough, including your signature block is free advertising for you and the company you represent.
  • Additionally, most email accounts let you build one or more signature block, which can be embedded in your email.

Where to place your Signature Block?

  • The signature block should go at the bottom of your email. I still use the five lines below the last line of the body of the email to provide white space before the closing, as I learned when writing business letters decades ago.

What should be in a signature Block?

  • The signature block should be compact and informative and at a minimum should include:

The Closing

  • The closing is simply a polite way of saying I’m ending my message now. I usually go with the tried and true ‘Sincerely’, but others go with ‘Thank you’, ‘Best Regards’, or ‘Best Wishes,’. The main points, it should be short, polite, and professional.
  • This section should be followed by two lines

Your Name

  • This line is your professional name (First Name, Middle Initial, and Last name) and designations (Ph.D.…etc.)
  • This is your chance to say who you are and brand yourself to the reader, in a way which your email address cannot. Especially, when you consider that many of us don’t control what work email address is assigned to us.

Your Business Title

  • Including your business title provides some insight into your role and professional expertise.

Your Company Name

  • Much like your title, providing the Company Name and Address lets the reader know who you represent and, perhaps, more importantly, it is free advertising for the company.

Your Phone Numbers

  • Including your phone numbers, both office and cell (if different) enable people to quickly reach out to you if they need or want to. Not everybody keeps all their infrequent business contacts in the phone directory.
  • Putting your phone numbers on your signature block, also, enable the potential caller to verify that the numbers which they may have are still correct.

There are other items are sometimes included, such as:

  • A company logo to enhance the appearance and quality of a signature block
  • The Company’s website to help customer find out more about the company and to direct business to the company
  • The senders email to reinforce the email address in the header of the email.

However, the guidance provided above will make you look a lot more professional in a hurry if you have not been including a signature block in your emails.

Printable PDF Version of This Article

My Most Used Windows 10 Keyboard Shortcuts

Shortcut Keystrokes

Shortcut Keystrokes

While there are a great number of useful windows 10 shortcuts, I have the list below the combination, which I use daily.  Many of the shortcuts can be used across multiple applications (e.g. Notepad++, MS Word, SQL Server, Aginity, etc.) and save a considerable amount of mouse work.  Overall, these shortcut keys are more efficient and faster than using the mouse to perform the same task on a repetitive basis.

You may want to investigate the numerous other Windows 10 shortcuts keys, which best apply to your daily activities, but these are the ones, which I have found most useful and which I have committed to memory.

Table of My Most Used Windows Shortcuts

Key
Strokes

Behavior

Alt
+ Tab

Switch
between open apps

Ctrl
+ A

Select
all items in a document or window

Ctrl
+ Alt + Tab

Use
the arrow keys to switch between all open apps

Ctrl
+ C

Copy
the selected item

Ctrl
+ D

Delete
the selected item and move it to the Recycle Bin

Ctrl
+ F

Select
the search box

Ctrl
+ V

Paste
the selected item

Ctrl
+ X

Cut
the selected item

Esc

Stop
or leave the current task

F5

Refresh
the active window

F11

Maximize
Window

Related References

 Microsoft > Windows Support > Keyboard shortcuts in Windows

 

 

 

 

 

 

The Benefits of Having a Project Cart

Home Office Project Cart

Home Office Project Cart

I never really thought about how to make my home office more convenient even though I have worked from home for several years. However, I recently purchased a nice three-level cart and have found it extraordinarily useful. Especially, as I have my books and other reference materials scattered throughout my house in multiple bookcases and some even in the garage on shelves.

Advantages of a project cart

Among the advantages of a project cart that I have found are:

  • ability to declutter my office
  • put my research materials and notes into organized buckets
  • ability to swap out my reference materials according to my work priorities for the day
  • the ability to hide the cart and what the contents of it are

Qualities of a good home office cart

here are some things I would recommend you consider before buying your office cart, which may make the cart more valuable to you:

  • first, the cart should be sturdy, you really don’t want a rickety cart, if you’re going to be rolling around your house and loading it up with heavy books or other project materials according to your interests.
  • the cart should be small enough to tuck away someplace when you wanted out of sight in case you have visitors or company or something. For example, when my cart is not in use, I roll the cart into a nice little closet in the hall out of the way.
  • Consider carefully the wheels; my house has tile throughout, so, small wheels are fine. However, if you have carpeting or uneven edges that you must cross with the cart you may want to consider larger and wider wheels.
  • Of course, there are the aesthetics, do you care if the cart has a rustic look, or a modern office look, or even what color it is.

 

Oracle TO_CHAR to SQL Server CONVERT Equivalents to change Date to String

Transact SQL (T-SQL)

Transact SQL (T-SQL)

When it comes to SQL I tend to lean on the SQL I have used the most over the years, which is Oracle.  Today was no exception, I found myself trying to use the TO_CHAR command in SQL Server to format a date, which of course does not work. So, after a little thought, here are some examples of how you can the SQL Server Convert Command the achieve the equivalent result.

Example SQL Server Date Conversion SQL

Example SQL Server Date Conversion SQL

Example SQL Server Date Conversion SQL Code

This SQL of examples runs, as is, no from table required.

 

Select

CONVERT(VARCHAR(10), GETDATE(), 20) as
‘YYYY-MM-DD’

,CONVERT(VARCHAR(19), GETDATE(), 20) as ‘YYYY-MM-DD HH24:MI:SS’

,CONVERT(VARCHAR(8), GETDATE(), 112) as YYYYMMDD

,CONVERT(VARCHAR(6), GETDATE(), 112) as YYYYMM

,CONVERT(VARCHAR(12), DATEPART(YEAR, GETDATE()))+ RIGHT(‘0’+CAST(MONTH(GETDATE()) AS VARCHAR(2)),2)
as
YYYYMM_Method_2

,CONVERT(VARCHAR(4), GETDATE(), 12) as YYMM

,CONVERT(VARCHAR(4), GETDATE(), 112) as YYYY

,CONVERT(VARCHAR(4), DATEPART(YEAR, GETDATE())) as YYYY_Method_2

,CONVERT(VARCHAR(4), YEAR(GETDATE())) as YYYY_Method_3

,RIGHT(‘0’+CAST(MONTH(GETDATE()) AS VARCHAR(2)),2) as Two_Digit_Month

,SUBSTRING(ltrim(CONVERT(VARCHAR(4), GETDATE(), 12)),3,2) as Two_Digit_Month_2

,CONVERT(VARCHAR(10), GETDATE(), 111) as ‘YYYY/MM/DD’

,CONVERT(VARCHAR(5), GETDATE(), 8) as ‘HH24:MI’

,CONVERT(VARCHAR(8), GETDATE(), 8) ‘HH24:MI:SS’

Map TO_CHAR formats to SQL Server

You can map an Oracle TO_CHAR formats to SQL Server alternative commands as follows:

TO_CHAR
String

VARCHAR
Length

SQL
Server Convert Style

YYYY-MM-DD

VARCHAR(10)

20,
21, 120, 121, 126 and 127

YYYY-MM-DD
HH24:MI:SS

VARCHAR(19)

20,
21, 120 and 121

YYYYMMDD

VARCHAR(8)

112

YYYYMM

VARCHAR(6)

112

YYMM

VARCHAR(4)

12

YYYY

VARCHAR(4)

112

MM

VARCHAR(2)

12

YYYY/MM/DD

VARCHAR(10)

111

HH24:MI

VARCHAR(5)

8,
108, 14 and 114

HH24:MI:SS

VARCHAR(8)

8,
108, 14 and 114

Translating the formats commands

Here are some example of translating the formats commands.

Format

SQL
Server

YYYY-MM-DD

CONVERT(VARCHAR(10),
GETDATE(), 20)

YYYY-MM-DD
HH24:MI:SS

CONVERT(VARCHAR(19),
GETDATE(), 20)

YYYYMMDD

CONVERT(VARCHAR(8),
GETDATE(), 112)

YYYYMM

CONVERT(VARCHAR(6),
GETDATE(), 112)

YYMM

CONVERT(VARCHAR(4),
GETDATE(), 12)

YYYY

CONVERT(VARCHAR(4),
GETDATE(), 112)

YYYY

CONVERT(VARCHAR(4),
DATEPART(YEAR, GETDATE()))

YYYY

CONVERT(VARCHAR(4),
YEAR(GETDATE()))

MM

RIGHT(‘0’+CAST(MONTH(GETDATE())
AS VARCHAR(2)),2)

MM

SUBSTRING(ltrim(CONVERT(VARCHAR(4),
GETDATE(), 12)),3,2)

YYYY/MM/DD

CONVERT(VARCHAR(10),
GETDATE(), 111)

HH24:MI

CONVERT(VARCHAR(5),
GETDATE(), 8)

HH24:MI:SS

CONVERT(VARCHAR(8),
GETDATE(), 8)

Related Reference

Microsoft Docs, SQL, T-SQL Functions, GETDATE (Transact-SQL)

Microsoft Docs, SQL, T-SQL Functions, Date and Time Data Types and Functions (Transact-SQL)

Microsoft Docs, SQL, T-SQL Functions, DATEPART (Transact-SQL)

 

 

What Are The DataStage / QualityStage Join Stages?

Three Stages Which Join Records

Three Stages Which Join Records

While chasing an error to which only applied to join type stages, I thought it might be nice to identify what the InfoSphere Information Server DataStage / QualityStage are.  There are three of them, as you can see from the picture above, which are the:

  • Join Stage,
  • Lookup Stage,
  • And, Merge Stage.

All three stages that join data based on the values of identified key columns.

Related References

IBM Knowledge Center, InfoSphere Information Server 11.7.0, InfoSphere DataStage and QualityStage, Developing parallel jobs, Processing Data, Lookup Stage

IBM Knowledge Center, InfoSphere Information Server 11.7.0, InfoSphere DataStage and QualityStage, Developing parallel jobs, Processing Data, Join Stage

IBM Knowledge Center, InfoSphere Information Server 11.7.0, InfoSphere DataStage and QualityStage, Developing parallel jobs, Processing Data, Merge Stage

https://www.ibm.com/support/knowledgecenter/SSZJPZ_11.7.0/com.ibm.swg.im.iis.ds.parjob.dev.doc/topics/c_deeref_Merge_Stage.html

Parallel jobs on Windows fail with APT_IOPort::readBlkVirt;error

APT_IOPort::readBlkVirt Error Screenshot

APT_IOPort::readBlkVirt Error Screenshot

This a known error for windows systems and applies to DataStage and DataQuality jobs using the any or all the three join type stages (Join, Merge, and Lookup).

Error Message

  • <<Link name>>,0: APT_IOPort::readBlkVirt: read for block header, partition 0, [fd 4], returned -1 with errno 10,054 (Unknown error)

Message ID

  • IIS-DSEE-TFIO-00223

Applies To

  • Windows systems only
  • Parallel Engine Jobs the three join type stages (Join, Merge, and Lookup). It does not apply to Server Engine jobs.
  • Infosphere Information Server (IIS), Datastage and DataQuality 9.1 and higher

The Fix

  • Add the APT_NO_IOCOMM_OPTIMIZATION in project administrator and set to blank or 0. I left it blank so it would not impact other jobs
  • Add the environment variable to the job producing the error and set to 1

What it APT_NO_IOCOMM_OPTIMIZATION Does

  • Sets the use of shared memory as the transport type, rather than using the default sockets transport type.
  • Note that in most cases sockets transport type is faster, so, you likely will not to set this across the project as the default for all job. It is best to apply it as necessary for problematic jobs.

Related References

InfoSphere DataStage and QualityStage, Version 9.1 Job Compatibility

IBM Support, JR54078: PARALLEL JOBS ON WINDOWS FAIL WITH APT_IOPORT::READBLKVIRT; ERROR

IBM Support, Information Server DataStage job fails with unknown error 10,054.

 

SQL server table Describe (DESC) equivalent

 

Transact SQL (T-SQL)

Transact SQL (T-SQL)

Microsoft SQL Server doesn’t seem have a describe command and usually, folks seem to want to build a stored procedure to get the describe behaviors.  However, this is not always practical based on your permissions. So, the simple SQL below will provide describe like information in a pinch.  You may want to dress it up a bit; but I usually just use it raw, as shown below by adding the table name.

Describe T-SQL Equivalent

Select *

From INFORMATION_SCHEMA.COLUMNS

Where TABLE_NAME = ‘<<TABLENAME>>’;

Related References