Cloud computing is a service driven model for enabling ubiquitous, convenient, on demand network access to a shared pool computing resources that can be rapidly provisioned and released with minimal administrative effort or service provider interaction.
A Road Map for Migrating to A Public Cloud Environment
Today, most organizations are looking for ways to cut down their sprawling IT budgets and define efficient paths for new developments. Making the move to the cloud is being seen as a more strategic and an economically viable idea, that is primarily allowing organizations to gain quick access to new platforms, services, and toolsets. But the migration of applications to the cloud environment needs a clear, and well-thought-out cloud migration strategy.
We are past the year of confusion and fear on matters cloud environment. In fact, almost everyone now agrees that the cloud is a key element of any company’s IT investment. What is not yet clear is what to move, how to move, and industry best practices to protect your investment in a public cloud environment. Therefore, a solid migration plan is an essential part of any cloud migration process.
Here are a few things you should pay close attention to when preparing a cloud migration planning template:
When planning to migrate to the cloud, it is paramount to remember that it is not a good idea to migrate every application. As you learn the baby steps, keep your legacy apps and other sensitive data such as private banking info off the cloud. This will ensure that, in case of a breach on your public cloud, your sensitive data and legacy systems will not fall into the hands of unsavory individuals.
Security of the data being migrated to the cloud should be just as important as on the cloud. Any temporary storage locations used during the cloud migration process should be secure from unauthorized intrusions.
Although security can be hard to quantify, it is one of the key components and considerations of any cloud service. The very basic security responsibility includes getting it right around your password security. Remember that, while you can massively increase the security around your applications, it is practically very different to deal with on-cloud threats and breaches since you technically don’t own any of the cloud software.
Some of the security concerns that you’ll need to look into include:
Is your data securely transferred and stored in the cloud?
Besides the passwords, does your cloud provider offer some type of 2-factor authentication?
How else are the users authenticated?
Does your provider meet the industry’s regulatory requirements?
Backup and Disaster recovery strategies
A backup and disaster strategy ensure that your data will be protected in case of a disaster. These strategies are unique to every organization depending on its application needs and the relevance of those applications to their organization.
To devise a foolproof DR strategy, it is important to identify and prioritize applications and determine the downtime acceptable for each application, services, and data.
Some of the things to consider when engineering your backup and disaster recovery blueprint include:
Availability of sufficient bandwidth and network capacity to redirect all users in case of a disaster.
Amount of data that may require backup.
Type of data to be protected
How long can it take to restore your systems from the cloud?
Communications Capacity enablement
Migrating to a cloud environment should make your business more agile and responsive to the market. Therefore, a robust communications enablement should be provided. Ideally, your cloud provider should be able to provide you with a contact center, unified messaging, mobility, presence, and integration with other business applications.
While the level of sophistication and efficiency of on-premise communications platforms depends on the capabilities of the company IT’s staff, cloud environments should offer communication tools with higher customizations to increase productivity.
A highly customized remote communications enablement will allow your company to refocus its IT resources to new innovation, spur agility, cut down on hardware costs and allow for more engagements with partners and customers.
Simply put, cloud communications:
Increase efficiency and productivity
Enables reimagined experience
Are designed for a seamless interaction.
legal liability and protection
Other important considerations when developing your cloud migration planning template are compliance with regulatory requirements and software licensing. For many businesses, data protection and regulatory compliance with HIPPA and GDPR is a constant concern, especially when dealing with identifiable data. Getting this right, the first time will allow you to move past the compliance issue blissfully.
When migrating, look for a cloud provider with comprehensive security assurance programs and governance-focused features. This will help your business operate more secure and in line with industry standards.
Ready to migrate your processes to a public cloud environment? Follow these pointers develop a comprehensive cloud migration planning template.
A public cloud strategy refers to a situation where you utilize cloud resources on a shared platform. Examples of shared or public cloud solutions include Microsoft Azure, Amazon Web Services and Google cloud. There are several benefits associated with cloud solutions. On the other hand, a private cloud strategy refers to a situation where you can decide to have an infrastructure which is dedicated to serving your business. It is sometimes referred to as homegrown where you employ experts to run the services so that your business can access different features. There are several advantages of using a public cloud over private cloud which you should know before you make an informed decision on the right platform to invest. Some of the benefits of the public cloud strategy include the following:
Availability and scale of Expertise
If you compare the public cloud and the private cloud services, the public cloud
allows you to access more experts. Remember the companies which offer the cloud services have enough employees who are ready to help several clients. In most cases, the other clients whom the service providers serve will not experience problems at the same time. It implies that human resource will be directed toward solving your urgent issue. You can as well scale up or down at any given time as the need arises which is unlike a case of private cloud solutions where you will have to invest in infrastructure each time you will like to upgrade.
Downgrading on a private cloud system can expose you to lose because you will leave some resources underutilized.
The volume of Technical Resources to apply
You access more technical resources in a public cloud platform. Remember the companies which offer the public cloud solutions are fully equipped with highly experienced experts. They also have the necessary tools and resources which
they can apply to assure you the best technical solutions each time you need them. It is unlike a private arrangement where you will have to incur more costs if the technical challenges will need advanced tools and highly qualified experts.
The price of a private cloud is high when compared to a public arrangement. If you are looking for ways you can save money, then the best way to go about it is to involve a public cloud solution. In the shared platform, you will only pay for
what you need. If you do not need a lot of resources at a given time, you can downgrade the services and enjoy fair prices. Services such as AWS offer great cost containment across the time which makes it easy to access the services at fair prices. For any business to grow, it should invest in the right package which brings the return on investment. The services offered by the public cloud systems allow businesses to save and grow. You should as well take into consideration other factors such as ecosystems for cloud relationships before you make an informed decision. There are some business models which prefer private cloud solutions while others can work well under public cloud-based solutions.
Cloud computing enables convenient, ubiquitous, measures, and on-demand access to a shared pool of scalable and configurable resources, such as servers, applications, databases, networks, and other services. Also, these resources can be provisioned and released rapidly with minimum interaction and management from the provider.
The rapidly expanding technology is rife with obscure acronyms, with major ones being SaaS, PaaS, and IaaS. These acronyms distinguish the three major cloud computing models discussed in this article. Notably, cloud computing virtually meets any imaginable IT needs in diverse ways. In effect, the cloud computing models are necessary to show the role that a cloud service provides and how the function is accomplished. The three main cloud computing paradigms can be demonstrated on the diagram shown below.
Infrastructure as a Service (IaaS)
In infrastructure as a service model, the cloud provider offers a service that allows users to process, store, share, and user other fundamental computing resources to run their software, which can include operating systems and applications. In this case, a consumer has minimum control over the underlying cloud infrastructure, but has significant control over operating systems, deployed applications, storage, and some networking components, such as the host firewalls.
Based on its description, IaaS can be regarded as the lowest-level cloud service paradigm, and possibly the most crucial one. With this paradigm, a cloud vendor provides pre-configured computing resources to consumers via a virtual interface. From the definition, IaaS pertains underlying cloud infrastructure but does not include applications or an operating system. Implementation of the applications, operating system, and some network components, such as the host firewalls is left up to the end user. In other words, the role of the cloud provider is to enable access to the computing infrastructure necessary to drive and support their operating systems and application solutions.
In some cases, the IaaS model can provide extra storage for data backups, network bandwidth, or it can provide access to enhanced performance computing which was traditionally available using supercomputers. IaaS services are typically provided to users through an API or a dashboard.
Features of IaaS
Users transfer the cost of purchasing IT infrastructure to a cloud provider
Infrastructure offered to a consumer can be increased or reduced depending on business storage and processing needs
The consumer will be saved from challenges and costs of maintaining hardware
High availability of data is in the cloud
Administrative tasks are virtualized
IaaS is highly flexible compared to other models
Highly scalable and available
Permits consumers to focus on their core business and transfer critical IT roles to a cloud provider
IaaS Use Cases
A series of use cases can explore the above benefits and features afforded by IaaS. For instance, an organization that lacks the capital to own and manage their data centers can purchase an IaaS offering to achieve fast and affordable IT infrastructure for their business. Also, the IaaS can be expanded or terminated based on the consumer needs. Another set of companies that can deploy IaaS include traditional organizations seeking large computing power with low expenditure to run their workloads. IaaS model is also a good option for rapidly growing enterprises that avoid committing to specific hardware or software since their business needs are likely to evolve.
Popular IaaS Services
Major IT companies are offering popular IaaS services that are powering a significant portion of the Internet even without users realizing it.
Amazon EC2: Offers scalable and highly available computing capacity in the cloud. Allows users to develop and deploy applications rapidly without upfront investment in hardware
IBM’s SoftLayer: Cloud computing services offering a series of capabilities, such as computing, networking, security, storage, and so on, to enable faster and reliable application development. The solution features bare-metal, hypervisors, operating systems, database systems, and virtual servers for software developers.
NaviSite: offers application services, hosting, and managed cloud services for IT infrastructure
ComputeNext: the solution empowers internal business groups and development teams with DevOps productivity from a single API.
Platform as a Service (PaaS)
Platform as a service model involves the provision of capabilities that allow users to create their applications using programming languages, tools, services, and libraries owned and distributed by a cloud provider. In this case, the consumer has minimum control over the underlying cloud computing resources such as servers, storage, and operating system. However, the user has significant control over the applications developed and deployed on the PaaS service.
In PaaS, cloud computing is used to provide a platform for consumers to deploy while developing, initializing, implementing, and managing their application. This offering includes a base operating system and a suite of development tools and solutions. PaaS effectively eliminates the needs for consumers to purchase, implement and maintain the computing resources traditionally needed to build useful applications. Some people use the term ‘middleware’ to refer to PaaS model since the offering comfortably sits between SaaS and IaaS.
Features of PaaS
PaaS service offers a platform for development, tasking, and hosting tools for consumer applications
PaaS is highly scalable and available
Offer cost effective and simple way to develop and deploy applications
Users can focus on developing quality applications without worrying about the underlying IT infrastructure
Business policy automation
Many users can access a single development service or tool
Offers database and web services integration
Consumers have access to powerful and reliable server software, storage capabilities, operating systems, and information and application backup
Allows remote teams to collaborate, which improves employee productivity
PaaS Use Cases
Software development companies and other enterprises that want to implement agile development methods can explore PaaS capabilities in their business models. Many PaaS services can be used in application development. PaaS development tools and services are always updated and made available via the Internet to offer a simple way for businesses to develop, test, and prototype their software solutions. Since developers’ productivity is enhanced by allowing remote workers to collaborate, PaaS consumers can rapidly release applications and get feedback for improvement. PaaS has led to the emergence of the API economy in application development.
Popular PaaS Offerings
There exist major PaaS services that are helping organizations to streamline application development. PaaS offering is delivered over the Internet and allows developers to focus more on creating quality and highly functional application while not worrying about the operating system, storage, and other infrastructure.
Google’s App Engine: the solution allows developers to build scalable mobile and web backends in any language in the cloud. Users can bring their own language runtimes, third-party libraries, and frameworks
IBM BlueMix: this PaaS solution from IBM allows developers to avoid vendor lock-in and leverage the flexible and open cloud environment using diverse IBM tools, open technologies, and third-party libraries and frameworks.
Heroku: the solution provides companies with a platform where they can build, deliver, manage, and scale their applications while abstracting and bypassing computing infrastructure hassles
Apache Stratos: this PaaS offering offers enterprise-ready quality service, security, governance, and performance that allows development, modification, deployment, and distribution of applications.
Red Hat’s OpenShift: a container application platform that offers operations and development-centric tools for rapid application development, easy deployment, scalability, and long-term maintenance of applications
Software as a Service (SaaS)
Software as a service model involves the capabilities provided to users by using a cloud vendor’s application hosted and running on a cloud infrastructure. Such applications are conveniently accessible from different platforms and devices through a web browser, a thin client interface, or a program interface. In this model, the end user has minimum control of the underlying cloud-based computing resources, such as servers, operating system, or the application capabilities
SaaS can be described as software licensing and delivery paradigm that features a complete and functional software solutions provided to users on a metered and subscription basis. Since users access the application via browsers or thin client and program interfaces, SaaS makes the host operating system insignificant in the operation of the product. As mentioned, the service is metered. In this case, SaaS customers are billed based on their consumption, while others pay a flat monthly fee.
Features of SaaS
SaaS providers offer applications via subscription structure
User transfer the need to develop, install, manage, or upgrade applications to SaaS vendors
Applications and data is securely stored in the cloud
SaaS is easily managed from a central location
Remote serves are deployed to host the application
Users can access SaaS offering from any location with Internet access
On-premise hardware failure does not interfere with an application or cause data loss
Users can reduce or increase use of cloud-based resources depending on their processing and storage needs
Applications offered via SaaS model are accessible from any location and almost all Internet-enabled devices
SaaS Use Cases
SaaS use case is a typical use case for many companies seeking to benefit from quality application usage without the need to develop, maintain and upgrade the required components. Companies can acquire SaaS solutions for ERP, mail, office applications, collaboration tool, among others. SaaS is also crucial for small companies and startups that wish to launch e-commerce service rapidly but lack the time and resource to develop and maintain the software or buy servers for hosting the platform. SaaS is also used by companies with short-term projects that require collaboration from different members located remotely.
Popular SaaS Services
SaaS offerings are more widespread as compared to IaaS and PaaS. In fact, a majority of consumers use SaaS services without realizing it.
Office365: the cloud-based solution provides productivity software for subscribed consumers. Allows users to access Microsoft Office tools on various platforms, such as Android, MacOS, and Windows, etc.
Box: the SaaS offers secure file storage, sharing, and collaboration from any location and platform
Dropbox: modern application designed for collaboration and for creating, storing, and accessing files, docs, and folders.
Salesforce: the SaaS is among the leading customer relationship management platform that offers a series of capabilities for sales, marketing, service, and more.
Today, cloud computing models have revolutionized the way businesses deploy and manage computing resources and infrastructure. With the advent and evolution of the three major cloud computing models, that it IaaS, PaaS, and SaaS, consumers will find a suitable cloud offering that satisfies virtually all IT needs. These models’ capabilities coupled with competition from popular cloud computing service providers will continue availing IT solutions for consumers demanding for availability, enhanced performance, quality services, better coverage, and secure applications.
Consumers should review their business needs and do a cost-benefit analysis to approve the best model for their business. Also, consumers should conduct thorough workload assessment while migrating to a cloud service.
Globally, organizations are facing challenges emanating from data issues, including data consolidation, value, heterogeneity, and quality. At the same time, they have to deal with the aspect of Big Data. In other words, consolidating, organizing, and realizing the value of data in an organization has been a challenge over the years. To overcome these challenges, a series of strategies have been devised. For instance, organizations are actively leveraging on methods such as Data Warehouses, Data Marts, and Data Stores to meet their data assets requirements. Unfortunately, the time and resources required to deliver value using these legacy methods is a distressing issue. In most cases, typical Data Warehouses applied for business intelligence (BI) rely on batch processing to consolidate and present data assets. This traditional approach is affected by the latency of information.
As the name suggests, Big Data describes a large volume of data that can either be structured or unstructured. It originates from business processes among other sources. Presently, artificial intelligence, mobile technology, social media, and the Internet of Things (IoT) have become new sources of vast amounts of data. In Big Data, the organization and consolidation matter more than the volume of the data. Ultimately, big data can be analyzed to generate insights that can be crucial in strategic decision making for a business.
Features of Big Data
The term Big Data is relatively new. However, the process of collecting and preserving vast amounts of information for different purposes has been there for decades. Big Data gained momentum recently with the three V’s features that include volume, velocity, and variety.
Volume: First, businesses gather information from a set of sources, such as social media, day-to-day operations, machine to machine data, weblogs, sensors, and so on. Traditionally, storing the data was a challenge. However, the requirement has been made possible by new technologies such as Hadoop.
Velocity: Another defining nature of Big Data is that it flows at an unprecedented rate that requires real-time processing. Organizations are gathering information from RFID tags, sensors, and other objects that need timely processing of data torrents.
Variety: In modern enterprises, information comes in different formats. For instance, a firm can gather numeric and structured data from traditional databases as well as unstructured emails, video, audio, business transactions, and texts.
Complexity: As mentioned above, Big Data comes from diverse sources and in varying formats. In effect, it becomes a challenge to consolidate, match, link, cleanse, or modify this data across an organizational system. Unfortunately, Big Data opportunities can only be explored when an organization successfully correlates relationships and connects multiple data sets to prevent it from spiraling out of control.
Variability: Big Data can have inconsistent flows within periodic peaks. For instance, in social media, a topic can be trending, which can tremendously increase collected data. Variability is also common while dealing with unstructured data.
Big Data Potential and Importance
The vast amount of data collected and preserved on a global scale will keep growing. This fact implies that there is more potential to generate crucial insights from this information. Unfortunately, due to various issues, only a small fraction of this data actually gets analyzed. There is a significant and untapped potential that businesses can explore to make proper and beneficial use of this information.
Analyzing Big Data allows businesses to make timely and effective decisions using raw data. In reality, organizations can gather data from diverse sources and process it to develop insights that can aid in reducing operational costs, production time, innovating new products, and making smarter decisions. Such benefits can be achieved when enterprises combine Big Data with analytic techniques, such as text analytics, predictive analytics, machine learning, natural language processing, data mining and so on.
Big Data Application Areas
Practically, Big Data can be used in nearly all industries. In the financial sector, a significant amount of data is gathered from diverse sources, which requires banks and insurance companies to innovate ways to manage Big Data. This industry aims at understanding and satisfying their customers while meeting regulatory compliance and preventing fraud. In effect, banks can exploit Big Data using advanced analytics to generate insights required to make smart decisions.
In the education sector, Big Data can be employed to make vital improvements on school systems, quality of education and curriculums. For instance, Big Data can be analyzed to assess students’ progress and to design support systems for professors and tutors.
Healthcare providers, on the other hand, collect patients’ records and design various treatment plans. In the healthcare sector, practitioners and service providers are required to offer accurate and timely treatment that is transparent to meet the stringent regulations in the industry and to enhance the quality of life. In this case, Big Data can be managed to uncover insights that can be used to improve the quality of service.
Governments and different authorities can apply analytics to Big Data to create the understanding required to manage social utilities and to develop solutions necessary to solve common problems, such as city congestion, crime, and drug use. However, governments must also consider other issues such as privacy and confidentiality while dealing with Big Data.
In manufacturing and processing, Big Data offers insights that stakeholders can use to efficiently use raw materials to output quality products. Manufacturers can perform analytics on big data to generate ideas that can be used to increase market share, enhance safety, minimize wastage, and solve other challenges faster.
In the retail sector, companies rely heavily on customer loyalty to maintain market share in a highly competitive market. In this case, managing big data can help retailers to understand the best methods to utilize in marketing their products to existing and potential consumers, and also to sustain relationships.
Challenges Handling Big Data
With the introduction of Big Data, the challenge of consolidating and creating value on data assets becomes magnified. Today, organizations are expected to handle increased data velocity, variety, and volume. It is now a business necessity to deal with traditional enterprise data and Big Data. Traditional relational databases are suitable for storing, processing, and managing low-latency data. Big Data has increased volume, variety, and velocity, making it difficult for legacy database systems to efficiently handle it.
Failing to act on this challenge implies that enterprises cannot tap the opportunities presented by data generated from diverse sources, such as machine sensors, weblogs, social media, and so on. On the contrary, organizations that will explore Big Data capabilities amidst its challenges will remain competitive. It is necessary for businesses to integrate diverse systems with Big Data platforms in a meaningful manner, as heterogeneity of data environments continue to increase.
Virtualization involves turning physical computing resources, such as databases and servers into multiple systems. The concept consists of making the function of an IT resource simulated in software, making it identical to the corresponding physical object. Virtualization technique uses abstraction to create a software application to appear and operate like hardware to provide a series of benefits ranging from flexibility, scalability, performance, and reliability.
Typically, virtualization is made possible using virtual machines (VMs) implemented in microprocessors with necessary hardware support and OS-level implementations to enhance computational productivity. VMs offers additional convenience, security, and integrity with little resource overhead.
Benefits of Virtualization
Achieving the economics of wide-scale functional virtualization using available technologies is easy to improve reliability by employing virtualization offered by cloud service providers on fully redundant and standby basis. Traditionally, organizations would deploy several services to operate at a fraction of their capacity to meet increased processing and storage demands. These requirements resulted in increased operating costs and inefficiencies. With the introduction of virtualization, the software can be used to simulate functionalities of hardware. In effect, businesses can outstandingly eliminate the possibility of system failures. At the same time, the technology significantly reduces capital expense components of IT budgets. In future, more resources will be spent on operating, than acquisition expenses. Company funds will be channeled to service providers instead of purchasing expensive equipment and hiring local personnel.
Overall, virtualization enables IT functions across business divisions and industries to be performed more efficiently, flexibly, inexpensively, and productively. The technology meaningfully eliminates expensive traditional implementations.
Apart from reducing capital and operating costs for organizations, virtualization minimizes and eliminates downtime. It also increases IT productivity, responsiveness, and agility. The technology provides faster provisioning of resources and applications. In case of incidents, virtualization allows fast disaster recovery that maintains business continuity.
Types of Virtualization
There are various types of virtualization, such as a server, network, and desktop virtualization.
In server virtualization, more than one operating system runs on a single physical server to increase IT efficiency, reduce costs, achieve timely workload deployment, improve availability and enhance performance.
Network virtualization involves reproducing a physical network to allow applications to run on a virtual system. This type of virtualization provides operational benefits and hardware independence.
In desktop virtualization, desktops and applications are virtualized and delivered to different divisions and branches in a company. Desktop virtualization supports outsourced, offshore, and mobile workers who can access simulate desktop on tablets and iPads.
Characteristics of Virtualization
Some of the features of virtualization that support the efficiency and performance of the technology include:
Partitioning: In virtualization, several applications, database systems, and operating systems are supported by a single physical system since the technology allows partitioning of limited IT resources.
Isolation: Virtual machines can be isolated from the physical systems hosting them. In effect, if a single virtual instance breaks down, the other machine, as well as the host hardware components, will not be affected.
Encapsulation: A virtual machine can be presented as a single file while abstracting other features. This makes it possible for users to identify the VM based on a role it plays.
Data Virtualization – A Solution for Big Data Challenges
Virtualization can be viewed as a strategy that helps derive information value when needed. The technology can be used to add a level of efficiency that makes big data applications a reality. To enjoy the benefits of big data, organizations need to abstract data from different reinforcements. In other words, virtualization can be deployed to provide partitioning, encapsulation, and isolation that abstracts the complexities of Big Data stores to make it easy to integrate data from multiple stores with other data from systems used in an enterprise.
Virtualization enables ease of access to Big Data. The two technologies can be combined and configured using the software. As a result, the approach makes it possible to present an extensive collection of disassociated and structured and unstructured data ranging from application and weblogs, operating system configuration, network flows, security events, to storage metrics.
Virtualization improves storage and analysis capabilities on Big Data. As mentioned earlier, the current traditional relational databases are incapable of addressing growing needs inherent to Big Data. Today, there is an increase in special purpose applications for processing varied and unstructured big data. The tools can be used to extract value from Big Data efficiently while minimizing unnecessary data replication. Virtualization tools also make it possible for enterprises to access numerous data sources by integrating them with legacy relational data centers, data warehouses, and other files that can be used in business intelligence. Ultimately, companies can deploy virtualization to achieve a reliable way to handle complexity, volume, and heterogeneity of information collected from diverse sources. The integrated solutions will also meet other business needs for near-real-time information processing and agility.
In conclusion, it is evident that the value of Big Data comes from processing information gathered from diverse sources in an enterprise. Virtualizing big data offers numerous benefits that cannot be realized while using physical infrastructure and traditional database systems. It provides simplification of Big Data infrastructure that reduces operational costs and time to results. Shortly, Big Data use cares will shift from theoretical possibilities to multiple use patterns that feature powerful analytics and affordable archival of vast datasets. Virtualization will be crucial in exploiting Big Data presented as abstracted data services.
How to know if your Oracle Client install is 32 Bit or 64 Bit
Sometimes you just need to know if your Oracle Client install is 32 bit or 64 bit. But how do you figure that out? Here are two methods you can try.
The first method
Go to the %ORACLE_HOME%\inventory\ContentsXML folder and open the comps.xml file.
Look for <DEP_LIST> on the ~second screen.
If you see this: PLAT=”NT_AMD64” then your Oracle Home is 64 bit
If you see this: PLAT=”NT_X86” then your Oracle Home is 32 bit.
It is possible to have both the 32-bit and the 64-bit Oracle Homes installed.
The second method
This method is a bit faster. Windows has a different lib directory for 32-bit and 64-bit software. If you look under the ORACLE_HOME folder if you see a “lib” AND a “lib32” folder you have a 64 bit Oracle Client. If you see just the “lib” folder you’ve got a 32 bit Oracle Client.
I’ve tried to explain the difference between OLTP systems and a Data Warehouse to my managers many times, as I’ve worked at a hospital as a Data Warehouse Manager/data analyst for many years. Why was the list that came from the operational applications different than the one that came from the Data Warehouse? Why couldn’t I just get a list of patients that were laying in the hospital right now from the Data Warehouse? So I explained, and explained again, and explained to another manager, and another. You get the picture. In this article I will explain this very same thing to you. So you know how to explain this to your manager. Or, if you are a manager, you might understand what your data analyst can and cannot give you.
OLTP stands for On Line Transactional Processing. With other words: getting your data directly from the operational systems to make reports. An operational system is a system that is used for the day to day processes. For example: When a patient checks in, his or her information gets entered into a Patient Information System. The doctor put scheduled tests, a diagnoses and a treatment plan in there as well. Doctors, nurses and other people working with patients use this system on a daily basis to enter and get detailed information on their patients. The way the data is stored within operational systems is so the data can be used efficiently by the people working directly on the product, or with the patient in this case.
A Data Warehouse is a big database that fills itself with data from operational systems. It is used solely for reporting and analytical purposes. No one uses this data for day to day operations. The beauty of a Data Warehouse is, among others, that you can combine the data from the different operational systems. You can actually combine the number of patients in a department with the number of nurses for example. You can see how far a doctor is behind schedule and find the cause of that by looking at the patients. Does he run late with elderly patients? Is there a particular diagnoses that takes more time? Or does he just oversleep a lot? You can use this information to look at the past, see trends, so you can plan for the future.
The difference between OLTP and Data Warehousing
This is how a Data Warehouse works:
The data gets entered into the operational systems. Then the ETL processes Extract this data from these systems, Transforms the data so it will fit neatly into the Data Warehouse, and then Loads it into the Data Warehouse. After that reports are formed with a reporting tool, from the data that lies in the Data Warehouse.
This is how OLTP works:
Reports are directly made from the data inside the database of the operational systems. Some operational systems come with their own reporting tool, but you can always use a standalone reporting tool to make reports form the operational databases.
Pro’s and Con’s
There is no strain on the operational systems during business hours
As you can schedule the ETL processes to run during the hours the least amount of people are using the operational system, you won’t disturb the operational processes. And when you need to run a large query, the operational systems won’t be affected, as you are working directly on the Data Warehouse database.
Data from different systems can be combined
It is possible to combine finance and productivity data for example. As the ETL process transforms the data so it can be combined.
Data is optimized for making queries and reports
You use different data in reports than you use on a day to day base. A Data Warehouse is built for this. For instance: most Data Warehouses have a separate date table where the weekday, day, month and year is saved. You can make a query to derive the weekday from a date, but that takes processing time. By using a separate table like this you’ll save time and decrease the strain on the database.
Data is saved longer than in the source systems
The source systems need to have their old records deleted when they are no longer used in the day to day operations. So they get deleted to gain performance.
You always look at the past
A Data Warehouse is updated once a night, or even just once a week. That means that you never have the latest data. Staying with the hospital example: you never knew how many patients are in the hospital are right now. Or what surgeon didn’t show up on time this morning.
You don’t have all the data
A Data Warehouse is built for discovering trends, showing the big picture. The little details, the ones not used in trends, get discarded during the ETL process.
Data isn’t the same as the data in the source systems
Because the data is older than those of the source systems it will always be a little different. But also because of the Transformation step in the ETL process, data will be a little different. It doesn’t mean one or the other is wrong. It’s just a different way of looking at the data. For example: the Data Warehouse at the hospital excluded all transactions that were marked as cancelled. If you try to get the same reports from both systems, and don’t exclude the cancelled transactions in the source system, you’ll get different results.
online transactional processing (OLTP)
You get real time data
If someone is entering a new record now, you’ll see it right away in your report. No delays.
You’ve got all the details
You have access to all the details that the employees have entered into the system. No grouping, no skipping records, just all the raw data that’s available.
You are putting strain on an application during business hours.
When you are making a large query, you can take processing space that would otherwise be available to the people that need to work with this system for their day to day operations. And if you make an error, by for instance forgetting to put a date filter on your query, you could even bring the system down so no one can use it anymore.
You can’t compare the data with data from other sources.
Even when the systems are similar. Like an HR system and a payroll system that use each other to work. Data is always going to be different because it is granulated on a different level, or not all data is relevant for both systems.
You don’t have access to old data
To keep the applications at peak performance, old data, that’s irrelevant to day to day operations is deleted.
Data is optimized to suit day to day operations
And not for report making. This means you’ll have to get creative with your queries to get the data you need.
So what method should you use?
That all depends on what you need at that moment. If you need detailed information about things that are happening now, you should use OLTP. If you are looking for trends, or insights on a higher level, you should use a Data Warehouse.