Public Cloud Versus Private Cloud

Cloud Computing

A public cloud strategy means running your workloads on cloud resources hosted on a shared platform. Examples of public cloud providers include Microsoft Azure, Amazon Web Services, and Google Cloud. A private cloud strategy, on the other hand, means running infrastructure that is dedicated to serving your business. It is sometimes described as homegrown, because you employ your own experts to run the services your business relies on. A public cloud has several advantages over a private cloud, and you should understand them before deciding which platform to invest in. Some of the benefits of the public cloud strategy include the following:

Availability and Scale of Expertise

Compared with a private cloud, a public cloud gives you access to more experts. Public cloud providers employ enough staff to support many clients, and because those clients rarely run into problems at the same time, staff can usually be directed toward your urgent issue. You can also scale up or down at any time as the need arises, unlike a private cloud, where you have to invest in new infrastructure each time you want to upgrade.

Downgrading a private cloud system can expose you to losses, because some of the resources you have already paid for will be left underutilized.

Volume of Technical Resources

You have access to more technical resources on a public cloud platform. Public cloud providers employ highly experienced experts and have the tools and resources needed to deliver sound technical solutions whenever you need them. In a private arrangement, by contrast, you incur additional costs whenever a technical challenge calls for advanced tools or highly qualified experts.

Price Point

A private cloud typically costs more than a public arrangement. If you are looking for ways to save money, a public cloud solution is usually the better option, because on a shared platform you pay only for what you need. If you do not need many resources at a given time, you can scale the services down and pay less. Services such as AWS offer cost containment over time, which keeps the services affordable. For a business to grow, it should invest in the package that delivers a return on investment, and public cloud services allow businesses to save and grow. You should also consider other factors, such as the ecosystem of cloud relationships, before making a decision. Some business models are better suited to private cloud solutions, while others work well on public cloud-based solutions.
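
The pay-for-what-you-need argument can be illustrated with a little arithmetic. The sketch below is only an illustration, not a pricing model: the demand figures, the rate, and both function names are hypothetical, and it assumes the same price per unit-hour in both cases, which will rarely hold in practice.

```python
def pay_as_you_go_cost(hourly_demand, rate_per_unit_hour):
    """Cost when you are billed only for the units actually used each hour."""
    return sum(units * rate_per_unit_hour for units in hourly_demand)


def fixed_capacity_cost(hourly_demand, rate_per_unit_hour):
    """Cost when capacity is sized for the peak hour and paid for every hour."""
    peak = max(hourly_demand)
    return peak * rate_per_unit_hour * len(hourly_demand)


# Hypothetical day: 20 quiet hours at 2 units, 4 busy hours at 10 units.
demand = [2] * 20 + [10] * 4
rate = 0.5  # assumed price per unit-hour, identical in both cases for simplicity

on_demand = pay_as_you_go_cost(demand, rate)
provisioned = fixed_capacity_cost(demand, rate)
print(on_demand, provisioned)
```

With these made-up numbers, the peak-sized capacity costs three times as much because it sits idle for most of the day. When demand is steady and close to the peak, the gap narrows, which is one reason some business models still favor a private cloud.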

Major Cloud Computing Models

Cloud Computing

Cloud computing enables convenient, ubiquitous, measured, on-demand access to a shared pool of scalable and configurable resources, such as servers, applications, databases, networks, and other services. These resources can be provisioned and released rapidly with minimal management effort or interaction with the provider.

This rapidly expanding technology is rife with obscure acronyms, the major ones being SaaS, PaaS, and IaaS. These acronyms distinguish the three major cloud computing models discussed in this article. Cloud computing can meet virtually any imaginable IT need in diverse ways, so the models are useful for showing what role a cloud service plays and how that role is fulfilled. The three main cloud computing paradigms are illustrated in the diagram below.

The three major cloud computing models

Infrastructure as a Service (IaaS)

In the infrastructure as a service model, the cloud provider offers a service that allows users to process, store, share, and use other fundamental computing resources to run their software, which can include operating systems and applications. In this case, a consumer has minimal control over the underlying cloud infrastructure but has significant control over operating systems, deployed applications, storage, and some networking components, such as host firewalls.

Based on this description, IaaS can be regarded as the lowest-level cloud service paradigm, and possibly the most crucial one. With this paradigm, a cloud vendor provides pre-configured computing resources to consumers via a virtual interface. By this definition, IaaS covers the underlying cloud infrastructure but does not include applications or an operating system. Implementing the applications, the operating system, and some network components, such as host firewalls, is left to the end user. In other words, the role of the cloud provider is to enable access to the computing infrastructure that consumers need to drive and support their operating systems and application solutions.

In some cases, the IaaS model can provide extra storage for data backups or additional network bandwidth, or it can provide access to high-performance computing that was traditionally available only on supercomputers. IaaS services are typically provided to users through an API or a dashboard.

Features of IaaS

  • Users transfer the cost of purchasing IT infrastructure to a cloud provider
  • Infrastructure offered to a consumer can be increased or reduced depending on business storage and processing needs
  • The consumer will be saved from challenges and costs of maintaining hardware
  • Data in the cloud is highly available
  • Administrative tasks are virtualized
  • IaaS is highly flexible compared to other models
  • Highly scalable and available
  • Permits consumers to focus on their core business and transfer critical IT roles to a cloud provider
Infrastructure as a Service (IaaS)

IaaS Use Cases

Several use cases draw on the benefits and features of IaaS described above. For instance, an organization that lacks the capital to own and manage its own data centers can purchase an IaaS offering to obtain fast and affordable IT infrastructure for its business. The IaaS offering can also be expanded or terminated according to the consumer’s needs. Traditional organizations that need large amounts of computing power to run their workloads at low cost can also deploy IaaS. The IaaS model is also a good option for rapidly growing enterprises that want to avoid committing to specific hardware or software, since their business needs are likely to evolve.

Popular IaaS Services

Major IT companies are offering popular IaaS services that are powering a significant portion of the Internet even without users realizing it.

Amazon EC2: offers scalable and highly available computing capacity in the cloud. It allows users to develop and deploy applications rapidly without upfront investment in hardware.

IBM’s SoftLayer: cloud computing services that offer a range of capabilities, such as computing, networking, security, and storage, to enable faster and more reliable application development. The solution features bare-metal servers, hypervisors, operating systems, database systems, and virtual servers for software developers.

NaviSite: offers application services, hosting, and managed cloud services for IT infrastructure.

ComputeNext: empowers internal business groups and development teams with DevOps productivity from a single API.

Platform as a Service (PaaS)

The platform as a service model provides capabilities that allow users to create their applications using programming languages, tools, services, and libraries owned and distributed by a cloud provider. In this case, the consumer has minimal control over the underlying cloud computing resources, such as servers, storage, and the operating system. However, the user has significant control over the applications developed and deployed on the PaaS service.

In PaaS, cloud computing is used to provide a platform on which consumers develop, initialize, deploy, and manage their applications. This offering includes a base operating system and a suite of development tools and solutions. PaaS effectively eliminates the need for consumers to purchase, implement, and maintain the computing resources traditionally needed to build useful applications. Some people use the term ‘middleware’ to refer to the PaaS model, since the offering sits between SaaS and IaaS.

Features of PaaS

  • PaaS offers a platform with tools for developing, testing, and hosting consumer applications
  • PaaS is highly scalable and available
  • Offers a cost-effective and simple way to develop and deploy applications
  • Users can focus on developing quality applications without worrying about the underlying IT infrastructure
  • Business policy automation
  • Many users can access a single development service or tool
  • Offers database and web services integration
  • Consumers have access to powerful and reliable server software, storage capabilities, operating systems, and information and application backup
  • Allows remote teams to collaborate, which improves employee productivity
Platform as a Service (PaaS)

PaaS Use Cases

Software development companies and other enterprises that want to implement agile development methods can explore PaaS capabilities in their business models. Many PaaS services can be used in application development. PaaS development tools and services are kept up to date and made available via the Internet, offering businesses a simple way to develop, test, and prototype their software solutions. Because remote workers can collaborate, developer productivity improves, and PaaS consumers can rapidly release applications and get feedback for improvement. PaaS has led to the emergence of the API economy in application development.

Popular PaaS Offerings

Several major PaaS services are helping organizations to streamline application development. A PaaS offering is delivered over the Internet and allows developers to focus on creating high-quality, highly functional applications without worrying about the operating system, storage, and other infrastructure.

Google’s App Engine: allows developers to build scalable mobile and web backends in any language in the cloud. Users can bring their own language runtimes, third-party libraries, and frameworks.

IBM BlueMix: this PaaS solution from IBM allows developers to avoid vendor lock-in and leverage a flexible and open cloud environment using diverse IBM tools, open technologies, and third-party libraries and frameworks.

Heroku: provides companies with a platform where they can build, deliver, manage, and scale their applications while abstracting away computing infrastructure hassles.

Apache Stratos: this PaaS offering provides enterprise-ready quality of service, security, governance, and performance for the development, modification, deployment, and distribution of applications.

Red Hat’s OpenShift: a container application platform that offers operations and development-centric tools for rapid application development, easy deployment, scalability, and long-term maintenance of applications.

Software as a Service (SaaS)

The software as a service model gives users access to a cloud vendor’s application hosted and running on cloud infrastructure. Such applications are conveniently accessible from different platforms and devices through a web browser, a thin client interface, or a program interface. In this model, the end user has minimal control over the underlying cloud-based computing resources, such as servers and operating systems, or over the application’s capabilities.

SaaS can be described as a software licensing and delivery paradigm in which a complete, functional software solution is provided to users on a metered or subscription basis. Since users access the application via browsers, thin clients, or program interfaces, SaaS makes the host operating system largely irrelevant to the operation of the product. Where the service is metered, SaaS customers are billed based on their consumption; other customers pay a flat monthly fee.
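
The difference between metered and flat-fee billing comes down to a break-even usage level. The following is a minimal sketch under assumed prices; the function names and figures are hypothetical and do not reflect any vendor’s actual pricing.

```python
def metered_bill(units_used, price_per_unit):
    """Consumption-based billing: pay for each unit actually used."""
    return units_used * price_per_unit


def break_even_units(flat_monthly_fee, price_per_unit):
    """Usage level above which the flat fee becomes the cheaper option."""
    return flat_monthly_fee / price_per_unit


# Assumed prices, for illustration only.
PRICE_PER_UNIT = 0.25
FLAT_MONTHLY_FEE = 50.0

light_user = metered_bill(100, PRICE_PER_UNIT)  # below the flat fee
heavy_user = metered_bill(400, PRICE_PER_UNIT)  # above the flat fee
threshold = break_even_units(FLAT_MONTHLY_FEE, PRICE_PER_UNIT)
print(light_user, heavy_user, threshold)
```

Under these assumptions, a customer who expects to stay below the threshold is better off on the metered plan, while a heavy user is better off paying the flat fee.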

Features of SaaS

  • SaaS providers offer applications via a subscription structure
  • Users transfer the need to develop, install, manage, or upgrade applications to SaaS vendors
  • Applications and data are securely stored in the cloud
  • SaaS is easily managed from a central location
  • Remote servers are deployed to host the application
  • Users can access SaaS offerings from any location with Internet access
  • On-premise hardware failure does not interfere with an application or cause data loss
  • Users can reduce or increase use of cloud-based resources depending on their processing and storage needs
  • Applications offered via the SaaS model are accessible from any location and almost all Internet-enabled devices
Software as a Service (SaaS)

SaaS Use Cases

SaaS suits many companies that want to benefit from quality applications without having to develop, maintain, and upgrade the required components. Companies can acquire SaaS solutions for ERP, mail, office applications, and collaboration tools, among others. SaaS is also crucial for small companies and startups that wish to launch an e-commerce service rapidly but lack the time and resources to develop and maintain the software or buy servers to host the platform. SaaS is also used by companies with short-term projects that require collaboration among team members in different locations.

Popular SaaS Services

SaaS offerings are more widespread than IaaS and PaaS. In fact, a majority of consumers use SaaS services without realizing it.

Office365: a cloud-based solution that provides productivity software to subscribers. It allows users to access Microsoft Office tools on various platforms, such as Android, macOS, and Windows.

Box: offers secure file storage, sharing, and collaboration from any location and platform.

Dropbox: a modern application designed for collaboration and for creating, storing, and accessing files, docs, and folders.

Salesforce: one of the leading customer relationship management platforms, offering a range of capabilities for sales, marketing, service, and more.

Today, cloud computing models have revolutionized the way businesses deploy and manage computing resources and infrastructure. With the advent and evolution of the three major cloud computing models, that is, IaaS, PaaS, and SaaS, consumers can find a suitable cloud offering that satisfies virtually any IT need. The capabilities of these models, coupled with competition among popular cloud service providers, will continue to make IT solutions available to consumers who demand availability, enhanced performance, quality services, better coverage, and secure applications.

Consumers should review their business needs and do a cost-benefit analysis to choose the best model for their business. Consumers should also conduct a thorough workload assessment when migrating to a cloud service.

Big Data vs. Virtualization

Big Data Information Approaches

Globally, organizations face challenges arising from data issues, including data consolidation, value, heterogeneity, and quality. At the same time, they have to deal with Big Data. In other words, consolidating, organizing, and realizing the value of data has been a challenge for organizations over the years. To overcome these challenges, a series of strategies have been devised. For instance, organizations actively rely on methods such as Data Warehouses, Data Marts, and Data Stores to meet their data asset requirements. Unfortunately, these legacy methods take considerable time and resources to deliver value. In most cases, typical Data Warehouses used for business intelligence (BI) rely on batch processing to consolidate and present data assets, so the information they provide is subject to latency.

Big Data

As the name suggests, Big Data describes a large volume of data that can either be structured or unstructured. It originates from business processes among other sources. Presently, artificial intelligence, mobile technology, social media, and the Internet of Things (IoT) have become new sources of vast amounts of data. In Big Data, the organization and consolidation matter more than the volume of the data. Ultimately, big data can be analyzed to generate insights that can be crucial in strategic decision making for a business.

Features of Big Data

The term Big Data is relatively new. However, the practice of collecting and preserving vast amounts of information for different purposes has existed for decades. Big Data gained momentum recently with the three V’s: volume, velocity, and variety. Two further characteristics, complexity and variability, are often added to the list.

Volume: First, businesses gather information from a range of sources, such as social media, day-to-day operations, machine-to-machine data, weblogs, sensors, and so on. Traditionally, storing this data was a challenge. However, new technologies such as Hadoop have made it practical.

Velocity: Another defining nature of Big Data is that it flows at an unprecedented rate that requires real-time processing. Organizations are gathering information from RFID tags, sensors, and other objects that need timely processing of data torrents.

Variety: In modern enterprises, information comes in different formats. For instance, a firm can gather numeric and structured data from traditional databases as well as unstructured emails, video, audio, business transactions, and texts.

Complexity: As mentioned above, Big Data comes from diverse sources and in varying formats. In effect, it becomes a challenge to consolidate, match, link, cleanse, or modify this data across an organizational system. Unfortunately, Big Data opportunities can only be explored when an organization successfully correlates relationships and connects multiple data sets to prevent it from spiraling out of control.

Variability: Big Data can have inconsistent flows with periodic peaks. For instance, a topic that starts trending on social media can tremendously increase the volume of data collected. Variability is also common when dealing with unstructured data.

Big Data Potential and Importance

The vast amount of data collected and preserved on a global scale will keep growing. This fact implies that there is more potential to generate crucial insights from this information. Unfortunately, due to various issues, only a small fraction of this data actually gets analyzed. There is a significant and untapped potential that businesses can explore to make proper and beneficial use of this information.

Analyzing Big Data allows businesses to make timely and effective decisions using raw data. In reality, organizations can gather data from diverse sources and process it to develop insights that help them reduce operational costs and production time, develop new products, and make smarter decisions. Such benefits can be achieved when enterprises combine Big Data with analytic techniques, such as text analytics, predictive analytics, machine learning, natural language processing, data mining, and so on.

Big Data Application Areas

In practice, Big Data can be used in nearly all industries. In the financial sector, a significant amount of data is gathered from diverse sources, which requires banks and insurance companies to find new ways to manage Big Data. This industry aims to understand and satisfy its customers while meeting regulatory compliance requirements and preventing fraud. In effect, banks can exploit Big Data using advanced analytics to generate the insights required to make smart decisions.

In the education sector, Big Data can be employed to make vital improvements on school systems, quality of education and curriculums. For instance, Big Data can be analyzed to assess students’ progress and to design support systems for professors and tutors.

Healthcare providers, on the other hand, collect patients’ records and design various treatment plans. In the healthcare sector, practitioners and service providers are required to offer accurate and timely treatment that is transparent to meet the stringent regulations in the industry and to enhance the quality of life. In this case, Big Data can be managed to uncover insights that can be used to improve the quality of service.

Governments and different authorities can apply analytics to Big Data to create the understanding required to manage social utilities and to develop solutions necessary to solve common problems, such as city congestion, crime, and drug use. However, governments must also consider other issues such as privacy and confidentiality while dealing with Big Data.

In manufacturing and processing, Big Data offers insights that stakeholders can use to efficiently use raw materials to output quality products. Manufacturers can perform analytics on big data to generate ideas that can be used to increase market share, enhance safety, minimize wastage, and solve other challenges faster.

In the retail sector, companies rely heavily on customer loyalty to maintain market share in a highly competitive market. In this case, managing big data can help retailers to understand the best methods to utilize in marketing their products to existing and potential consumers, and also to sustain relationships.

Challenges Handling Big Data

With the introduction of Big Data, the challenge of consolidating and creating value on data assets becomes magnified. Today, organizations are expected to handle increased data velocity, variety, and volume. It is now a business necessity to deal with traditional enterprise data and Big Data. Traditional relational databases are suitable for storing, processing, and managing low-latency data. Big Data has increased volume, variety, and velocity, making it difficult for legacy database systems to efficiently handle it.

Failing to act on this challenge means that enterprises cannot tap the opportunities presented by data generated from diverse sources, such as machine sensors, weblogs, social media, and so on. By contrast, organizations that explore Big Data capabilities despite the challenges will remain competitive. It is necessary for businesses to integrate diverse systems with Big Data platforms in a meaningful manner, as the heterogeneity of data environments continues to increase.

Virtualization

Virtualization involves turning physical computing resources, such as databases and servers, into multiple logical systems. The concept consists of simulating the function of an IT resource in software so that it behaves like the corresponding physical object. Virtualization uses abstraction to make a software application appear and operate like hardware, which provides benefits including flexibility, scalability, performance, and reliability.

Typically, virtualization is made possible using virtual machines (VMs) implemented in microprocessors with the necessary hardware support and OS-level implementations to enhance computational productivity. VMs offer additional convenience, security, and integrity with little resource overhead.

Benefits of Virtualization

Virtualization makes it practical to achieve economies of scale, and reliability can be improved by using virtualized resources that cloud service providers offer on a fully redundant, standby basis. Traditionally, organizations would deploy several servers, each operating at a fraction of its capacity, to meet peaks in processing and storage demand. This resulted in higher operating costs and inefficiencies. With virtualization, software can be used to simulate the functionality of hardware. In effect, businesses can greatly reduce the likelihood of system failures. At the same time, the technology significantly reduces the capital expense component of IT budgets. In the future, more resources will be spent on operating expenses than on acquisition expenses, as company funds are channeled to service providers instead of being spent on expensive equipment and local personnel.

Overall, virtualization enables IT functions across business divisions and industries to be performed more efficiently, flexibly, inexpensively, and productively. The technology reduces the need for expensive traditional implementations.

Apart from reducing capital and operating costs for organizations, virtualization minimizes or eliminates downtime. It also increases IT productivity, responsiveness, and agility. The technology provides faster provisioning of resources and applications. In case of incidents, virtualization allows fast disaster recovery that maintains business continuity.

Types of Virtualization

There are various types of virtualization, such as server, network, and desktop virtualization.

In server virtualization, more than one operating system runs on a single physical server to increase IT efficiency, reduce costs, achieve timely workload deployment, improve availability and enhance performance.

Network virtualization involves reproducing a physical network to allow applications to run on a virtual system. This type of virtualization provides operational benefits and hardware independence.

In desktop virtualization, desktops and applications are virtualized and delivered to different divisions and branches in a company. Desktop virtualization supports outsourced, offshore, and mobile workers, who can access a simulated desktop on tablets such as iPads.

Characteristics of Virtualization

Some of the features of virtualization that support the efficiency and performance of the technology include:

Partitioning: In virtualization, several applications, database systems, and operating systems are supported by a single physical system since the technology allows partitioning of limited IT resources.

Isolation: Virtual machines can be isolated from the physical systems hosting them. In effect, if a single virtual instance breaks down, the other virtual machines, as well as the host hardware components, will not be affected.

Encapsulation: A virtual machine can be presented as a single file while abstracting other features. This makes it possible for users to identify the VM based on a role it plays.
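
Partitioning and isolation can be sketched with a toy model. The classes below are hypothetical and track only CPU counts; real hypervisors also manage memory, storage, and devices, so treat this as an illustration of the idea rather than of how any particular product works.

```python
from dataclasses import dataclass, field


@dataclass
class VirtualMachine:
    name: str
    cpus: int
    running: bool = True


@dataclass
class Host:
    total_cpus: int
    vms: list = field(default_factory=list)

    def free_cpus(self):
        """CPUs not yet assigned to any VM."""
        return self.total_cpus - sum(vm.cpus for vm in self.vms)

    def create_vm(self, name, cpus):
        """Partitioning: carve a slice of the host's CPUs out for a new VM."""
        if cpus > self.free_cpus():
            raise ValueError(f"not enough free CPUs for {name}")
        vm = VirtualMachine(name, cpus)
        self.vms.append(vm)
        return vm

    def crash(self, name):
        """Isolation: a failure stops one VM and leaves the others untouched."""
        for vm in self.vms:
            if vm.name == name:
                vm.running = False


host = Host(total_cpus=16)  # assumed host size
host.create_vm("web", 4)
host.create_vm("db", 8)
host.crash("web")
still_running = [vm.name for vm in host.vms if vm.running]
print(still_running, host.free_cpus())
```

In this model, the crashed VM keeps its CPU allocation until it is removed, and the remaining VM continues to run on the same physical host.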

Data Virtualization – A Solution for Big Data Challenges

Virtualization can be viewed as a strategy that helps derive value from information when it is needed. The technology can be used to add a level of efficiency that makes big data applications a reality. To enjoy the benefits of big data, organizations need to abstract data from its different sources. In other words, virtualization can be deployed to provide partitioning, encapsulation, and isolation that abstract away the complexities of Big Data stores, making it easy to integrate data from multiple stores with data from other systems used in an enterprise.

Virtualization enables ease of access to Big Data. The two technologies can be combined and configured in software. As a result, the approach makes it possible to present an extensive collection of disassociated structured and unstructured data, ranging from application logs and weblogs, operating system configuration, and network flows to security events and storage metrics.

Virtualization improves storage and analysis capabilities on Big Data. As mentioned earlier, traditional relational databases are incapable of addressing the growing needs inherent to Big Data. Today, there is an increase in special-purpose applications for processing varied and unstructured big data. These tools can be used to extract value from Big Data efficiently while minimizing unnecessary data replication. Virtualization tools also make it possible for enterprises to access numerous data sources by integrating them with legacy relational data centers, data warehouses, and other files that can be used in business intelligence. Ultimately, companies can deploy virtualization to achieve a reliable way to handle the complexity, volume, and heterogeneity of information collected from diverse sources. The integrated solutions will also meet other business needs for near-real-time information processing and agility.
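
The idea of integrating several sources without replicating their data can be sketched as a virtual view that is computed at query time. This is a minimal sketch under stated assumptions: an in-memory SQLite table stands in for a legacy relational store, a string stands in for a weblog file, and the table, column, and function names are all hypothetical.

```python
import csv
import io
import sqlite3

# Source 1: a relational store (in-memory SQLite stands in for a legacy database).
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
db.executemany("INSERT INTO customers VALUES (?, ?)", [(1, "Ada"), (2, "Lin")])

# Source 2: semi-structured weblog lines (a string stands in for a log file).
WEBLOG = "customer_id,page\n1,/home\n2,/pricing\n1,/checkout\n"


def page_views_by_customer():
    """Virtual view: join both sources at query time, storing no combined copy."""
    names = dict(db.execute("SELECT id, name FROM customers"))
    views = {}
    for row in csv.DictReader(io.StringIO(WEBLOG)):
        name = names[int(row["customer_id"])]
        views[name] = views.get(name, 0) + 1
    return views


result = page_views_by_customer()
print(result)
```

Because the view reads both sources each time it is called, a change in either source shows up in the next query, at the cost of doing the join work on every read.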

In conclusion, it is evident that the value of Big Data comes from processing information gathered from diverse sources in an enterprise. Virtualizing big data offers numerous benefits that cannot be realized with physical infrastructure and traditional database systems. It simplifies Big Data infrastructure, which reduces operational costs and time to results. Soon, Big Data use cases will shift from theoretical possibilities to multiple use patterns that feature powerful analytics and affordable archival of vast datasets. Virtualization will be crucial in exploiting Big Data presented as abstracted data services.

 

Data Warehousing vs. Data Virtualization

Information Management

Today, businesses depend heavily on data to gain insights into their processes and operations and to develop new ways to increase market share and profits. In most cases, the data required to generate these insights is sourced from and located in diverse places, which calls for a reliable access mechanism. Currently, data warehousing and data virtualization are two principal techniques used to store and access the sources of critical data in a company. Each approach offers various capabilities and can be deployed for particular use cases, as described in this article.

Data Warehousing

A data warehouse is designed and developed to securely host historical data from different sources. In effect, this technique protects the data sources from the performance degradation caused by sophisticated analytics and heavy reporting demands. Today, various tools and platforms have been developed for data warehouse automation in companies. They can be deployed to speed up development and to automate testing, maintenance, and other steps involved in data warehousing. In a data warehouse, data is stored as a series of snapshots, where each record represents data at a particular time. In effect, companies can analyze data warehouse snapshots to compare data between different periods. The results are converted into the insights required to make crucial business decisions.
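
Comparing snapshots between periods can be sketched with a small table in which every row carries its snapshot date. The example uses an in-memory SQLite database purely for illustration; the table name, columns, dates, and revenue figures are invented.

```python
import sqlite3

wh = sqlite3.connect(":memory:")
wh.execute(
    "CREATE TABLE sales_snapshot (snapshot_date TEXT, region TEXT, revenue REAL)"
)
# Hypothetical month-end snapshots for two regions.
wh.executemany(
    "INSERT INTO sales_snapshot VALUES (?, ?, ?)",
    [
        ("2023-01-31", "north", 100.0),
        ("2023-01-31", "south", 80.0),
        ("2023-02-28", "north", 130.0),
        ("2023-02-28", "south", 70.0),
    ],
)


def revenue_change(earlier, later):
    """Return the revenue difference per region between two snapshot dates."""
    rows = wh.execute(
        """
        SELECT a.region, b.revenue - a.revenue
        FROM sales_snapshot AS a
        JOIN sales_snapshot AS b ON a.region = b.region
        WHERE a.snapshot_date = ? AND b.snapshot_date = ?
        ORDER BY a.region
        """,
        (earlier, later),
    )
    return dict(rows)


changes = revenue_change("2023-01-31", "2023-02-28")
print(changes)
```

Each snapshot is only ever appended, so earlier periods stay available for comparison no matter how the source systems change afterward.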

Moreover, a data warehouse is optimized for other functions, such as data retrieval. The technology duplicates data to allow database de-normalization that enhances query performance. The solution is further deployed to create an enterprise data warehouse (EDW) used to service the entire organization.

Data Warehouse Information Architecture

Features of a Data Warehouse

A data warehouse is subject-oriented, and it is designed to help entities analyze data. For instance, a company can start a data warehouse focused on sales to learn more about sales data. Analytics on this warehouse can help establish insights such as the best customer for the period. The data warehouse is subject oriented since it can be defined based on a subject matter.

A data warehouse is integrated. Data from various sources is first out into a consistent format. The process requires the firm to resolve some challenges, such as naming conflicts and inconsistencies on units of measure.
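
Resolving naming conflicts and unit inconsistencies can be sketched as a pair of small mapping functions, one per source. The source systems, field names, and records below are hypothetical; only the pound-to-kilogram factor is a real constant.

```python
# Hypothetical source records: two systems name the same fields differently
# and report weight in different units.
crm_rows = [{"cust_name": "Ada", "weight_kg": 2.0}]
erp_rows = [{"CustomerName": "Lin", "weight_lb": 10.0}]

KG_PER_LB = 0.45359237  # definition of the international pound


def from_crm(row):
    """The CRM already uses kilograms; only the field name needs mapping."""
    return {"customer_name": row["cust_name"], "weight_kg": row["weight_kg"]}


def from_erp(row):
    """The ERP uses a different field name and pounds; convert both."""
    return {
        "customer_name": row["CustomerName"],
        "weight_kg": round(row["weight_lb"] * KG_PER_LB, 3),
    }


integrated = [from_crm(r) for r in crm_rows] + [from_erp(r) for r in erp_rows]
print(integrated)
```

After this step, every record uses the same field names and the same unit, so downstream queries do not need to know which system a row came from.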

A data warehouse is nonvolatile. In effect, data entered into the warehouse should not change after it is stored. This feature increases accuracy and integrity in data warehousing.

A data warehouse is time variant since it focuses on data changes over time. Data warehousing discovers trends in business by using large amounts of historical data. In effect, a typical operation in a data warehouse scans millions of rows to return an output.

A data warehouse is designed and developed to handle ad hoc queries. In most cases, organizations cannot predict the workload of a data warehouse. Therefore, it is advisable to tune the data warehouse to perform well across a wide range of possible queries.

A data warehouse is regularly updated by the ETL process using bulk data modification techniques. Therefore, end users cannot directly update the data warehouse.
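The extract, transform, and load steps can be sketched as three small functions. This is only an illustration using sqlite3; the sales_history table, the sample rows, and the cleanup rules are hypothetical.

```python
import sqlite3


def extract():
    # Hypothetical raw rows from an operational system.
    return [("2023-03-01", " widget ", "12.50"), ("2023-03-01", "GADGET", "7.25")]


def transform(rows):
    # Trim and lowercase product names; convert amounts to numbers.
    return [(day, product.strip().lower(), float(amount)) for day, product, amount in rows]


def load(conn, rows):
    # Bulk insert inside a single transaction rather than row-by-row commits.
    with conn:
        conn.executemany("INSERT INTO sales_history VALUES (?, ?, ?)", rows)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales_history (sale_date TEXT, product TEXT, amount REAL)")
load(conn, transform(extract()))
loaded = conn.execute(
    "SELECT product, amount FROM sales_history ORDER BY product"
).fetchall()
```

End users would only ever read from sales_history; the load function is the single path by which rows arrive.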

Advantages of Data Warehousing

The primary motivation for developing a data warehouse is to provide timely information required for decision making in an organization. A business intelligence data warehouse serves as an initial checkpoint for crucial business data. When a company stores its data in a data warehouse, tracking it becomes natural. The technology allows users to perform quick searches to be able to retrieve and analyze static data.

Another driver for companies investing in data warehouses involves integrating data from disparate sources. This capability adds value to operational applications like customer relationship management systems. A well-integrated warehouse allows the solution to translate information to a more usable and straightforward format, making it easy for users to understand the business data.

The technology also allows organizations to perform a series of analyses on data.

A data warehouse reduces the cost to access historical data in an organization.

Data warehousing provides standardization of data across an organization. Moreover, it helps identify and eliminate errors. Before loading data, the solution shows inconsistencies to users and corrects them.

A data warehouse also improves the turnaround time for analysis and report generation.

The technology makes it easy for users to access and share data. A user can conduct a quick search on a data warehouse to find and analyze static data without wasting time.

Data warehousing removes informational processing load from transaction-oriented databases.

Disadvantages of Data Warehousing

While data warehousing technology is undoubtedly beneficial to many organizations, not all data warehouses are relevant to a business. In some cases, a data warehouse can be expensive to scale and maintain.

Preparing a data warehouse is time-consuming, since raw data often has to be entered and prepared manually.

A data warehouse is not an ideal choice for handling unstructured and complex raw data. Moreover, it faces compatibility difficulties. Depending on the data sources, companies may need a business intelligence team to ensure compatibility for data coming from sources that run distinct operating systems and programs.

The technology requires a maintenance cost to continue working correctly. The solution needs to be updated with latest features that might be costly. Regularly maintaining a data warehouse will need a business to spend more on top of the initial investment.

Use of a data warehouse can be limited by information privacy and confidentiality issues. In most cases, businesses collect and store sensitive data belonging to their clients. Only specific employees are allowed to view it, which limits the benefits offered by a data warehouse.

Data Warehousing Use Case

There are a series of ways organizations use data warehouses. Businesses can optimize the technology for performance by identifying the type of data warehouse they have.

  1. A data warehouse can be used by an organization that is struggling to report efficiently on business operations and activities. The solution makes it possible to access the required data.
  2. A data warehouse is necessary for an organization where data is copied separately by different divisions for analysis in spreadsheets that are not consistent with one another.
  3. Data warehousing is crucial in organizations where uncertainties about data accuracy are causing executives to question the veracity of reports.
  4. A data warehouse is crucial for business intelligence acceleration. The technology delivers rapid data insights to analysts at different scales, concurrency, and without requiring manual tuning or optimization of a database.

Data Virtualization Information Architecture

Data Virtualization

Data virtualization technology does not require transfer or storage of data. Instead, users employ a combination of application programming interfaces (APIs) and metadata (data about data) to interface with data in different sources. Users use joined queries to gain access to the original data sources. In other words, data virtualization offers a simplified and integrated view to business data in real-time as requested by business users, applications, and analytics. In effect, the technology makes it possible to integrate data from distinct sources, formats, and locations, without replication. It creates a unified virtual data layer that delivers data services to support users and various business applications.
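The idea of a virtual layer that reads from the sources on request, without replicating them, can be sketched in plain Python. The two sources, the VirtualCustomerView class, and its method are hypothetical stand-ins; a real product would reach the sources through APIs and metadata rather than in-memory objects.

```python
# Two hypothetical sources that stay where they are: a CRM and a billing system.
crm_source = {101: {"name": "Acme Ltd"}, 102: {"name": "Globex"}}
billing_source = [
    {"customer_id": 101, "amount": 250.0},
    {"customer_id": 102, "amount": 75.5},
    {"customer_id": 101, "amount": 100.0},
]


class VirtualCustomerView:
    """Unified, read-only view that queries the sources on every request."""

    def __init__(self, crm, billing):
        # Only references are kept; no rows are copied into the view.
        self._crm = crm
        self._billing = billing

    def customer_totals(self):
        # Join billing rows to CRM names at query time.
        totals = {}
        for invoice in self._billing:
            cid = invoice["customer_id"]
            totals[cid] = totals.get(cid, 0.0) + invoice["amount"]
        return [
            {"customer_id": cid, "name": self._crm[cid]["name"], "total": total}
            for cid, total in sorted(totals.items())
        ]


view = VirtualCustomerView(crm_source, billing_source)
before = view.customer_totals()
billing_source.append({"customer_id": 102, "amount": 24.5})  # the source changes
after = view.customer_totals()  # the view reflects it without any reload step
```

Since nothing is copied, a change in a source shows up on the next request, which is the real-time behavior described above.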

Data virtualization performs many of the same functions as traditional data integration approaches, that is, extract, transform, and load (ETL), data replication, and federation. It leverages modern technology to deliver real-time data integration with agility, low cost, and high speed. In effect, data virtualization can replace traditional data integration and reduce the need for replicated data warehouses and data marts in most cases.

Capabilities and Benefits of Data Virtualization

There are various benefits of implementing data virtualization in an organization.

Firstly, data virtualization allows access and leverage of all information that helps a firm achieve a competitive advantage. The solution offers a unified virtual layer that abstracts the underlying source complexity and presents disparate data sources as a single source.

Data virtualization is cheaper since it does not require actual hardware devices to be installed. In other words, organizations no longer need to purchase and dedicate a lot of IT resources and additional monetary investment to create on-site resources, similar to the one used in a data warehouse.

Data virtualization allows speedy deployment of resources. In this solution, resource provisioning is fast and straightforward. Organizations are not required to set up physical machines or to create local networks or install other IT components. Users have a single point of access to a virtual environment that can be distributed to the entire company.

Data virtualization is an energy-efficient system since the solution does not require additional local hardware and software. Therefore, an organization will not be required to install cooling systems.

Disadvantages of Data Virtualization

Data virtualization creates a security risk. In the modern world, having information is a cheap way to make money. In effect, company data is frequently targeted by hackers. Implementing data virtualization from disparate sources may give an opportunity to malicious users to steal critical information and use it for monetary gain.

Data virtualization requires a series of channels or links that must work together to perform the intended task. In this case, all data sources must be available for virtualization to work effectively.

Data Virtualization Use Cases

  • Companies that rely on business intelligence require data virtualization for rapid prototyping to meet immediate business needs. Data virtualization can create a real-time reporting solution that unifies access to multiple internal databases.
  • Provisioning data services for single-view applications, such as customer service and call center applications, requires data virtualization.

 

What Is Machine Learning?

Machine Learning

Machine learning is a form of Artificial Intelligence (AI) that enables a system to learn from data rather than through explicit programming.  Machine learning uses algorithms that iteratively learn from data to improve, describe data, and predict outcomes.  As the algorithms ingest training data, they produce a more precise machine learning model. Once trained, the model, when provided with new data, generates predictions based on the data that taught it.  Machine learning is a crucial ingredient for creating modern analytics models.
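To make the train-then-predict cycle concrete, here is a minimal sketch that fits a straight line by ordinary least squares in plain Python. The function names and the training data are hypothetical, and real projects would normally rely on a machine learning library instead.

```python
def fit_line(xs, ys):
    """Learn slope and intercept from training data by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    return slope, mean_y - slope * mean_x


def predict(model, x):
    """Apply the trained model to a new input."""
    slope, intercept = model
    return slope * x + intercept


# Hypothetical training data: hours of machine use vs. units produced.
model = fit_line([1, 2, 3, 4], [3, 5, 7, 9])
forecast = predict(model, 5)
```

The model here is just two numbers learned from the data; nothing about the relationship between x and y was programmed explicitly.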

My Most Used Windows 10 Keyboard Shortcuts

Shortcut Keystrokes

While there are a great number of useful Windows 10 shortcuts, the list below contains the combinations that I use daily.  Many of the shortcuts can be used across multiple applications (e.g. Notepad++, MS Word, SQL Server, Aginity, etc.) and save a considerable amount of mouse work.  Overall, these shortcut keys are more efficient and faster than using the mouse to perform the same task on a repetitive basis.

You may want to investigate the numerous other Windows 10 shortcut keys that best apply to your daily activities, but these are the ones that I have found most useful and have committed to memory.

Table of My Most Used Windows Shortcuts

Key Strokes | Behavior
Alt + Tab | Switch between open apps
Ctrl + A | Select all items in a document or window
Ctrl + Alt + Tab | Use the arrow keys to switch between all open apps
Ctrl + C | Copy the selected item
Ctrl + D | Delete the selected item and move it to the Recycle Bin
Ctrl + F | Select the search box
Ctrl + V | Paste the selected item
Ctrl + X | Cut the selected item
Esc | Stop or leave the current task
F5 | Refresh the active window
F11 | Maximize window

Related References

 Microsoft > Windows Support > Keyboard shortcuts in Windows

 

 

 

 

 

 

What is Source Control?

Acronyms, Abbreviations, Terms, And Definitions

Source control is an information technology environment management system for storing, tracking, and managing changes to software. This is commonly done by creating branches (copies for safely developing new features) off the stable master version of the software, then merging stable feature branches back into the master version. It is also known as version control or revision control.

Database – What is DDL?

SQL (Structured Query Language), Database, What is DDL?

SQL (Structured Query Language)

What is DDL (Data Definition Language)?

DDL (Data Definition Language) is the set of statements used to manage tables, schemas, domains, indexes, views, and privileges.  The major actions performed by DDL commands are: create, alter, drop, grant, and revoke. Note that some references classify grant and revoke separately as Data Control Language (DCL).
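The create, alter, and drop actions can be tried out with Python's built-in sqlite3 module, as in the sketch below. The table and index names are hypothetical. SQLite has no GRANT or REVOKE statements, so those two actions are not shown; they would need a server database that supports user privileges.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# CREATE: define a new table (all names here are hypothetical).
conn.execute(
    "CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, name TEXT NOT NULL)"
)
# ALTER: change the definition of an existing table.
conn.execute("ALTER TABLE customer ADD COLUMN email TEXT")
# CREATE INDEX: indexes are also managed through DDL.
conn.execute("CREATE INDEX idx_customer_name ON customer (name)")

# Column names as the database now sees them.
columns = [row[1] for row in conn.execute("PRAGMA table_info(customer)")]

# DROP: remove the table, and with it the index, from the schema.
conn.execute("DROP TABLE customer")
remaining = conn.execute(
    "SELECT name FROM sqlite_master WHERE type IN ('table', 'index')"
).fetchall()
```

None of these statements touch rows of data; they only change the structure that holds the data.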

 

Data Modeling – Fact Table Effective Practices

Database Table

Here are a few guidelines for modeling and designing fact tables.

Fact Table Effective Practices

  • The table naming convention should identify it as a fact table. For example:
    • Suffix Pattern:
      • <<TableName>>_Fact
      • <<TableName>>_F
    • Prefix Pattern:
      • FACT_<<TableName>>
      • F_<<TableName>>
  • Must contain a temporal dimension surrogate key (e.g. date dimension)
  • Measures should be nullable – this has an impact on aggregate functions (SUM, COUNT, MIN, MAX, and AVG, etc.)
  • Dimension surrogate keys (srky) should have a foreign key (FK) constraint
  • Do not place the dimension processing in the fact jobs
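The guidelines above on nullable measures and foreign key constraints can be illustrated with a small sqlite3 sketch. The Sales_Fact and Date_Dim tables and their columns are hypothetical examples that follow the suffix naming pattern, and the behavior shown is SQLite's; other databases may differ in detail.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
conn.execute(
    "CREATE TABLE Date_Dim (date_srky INTEGER PRIMARY KEY, calendar_date TEXT NOT NULL)"
)
conn.execute(
    "CREATE TABLE Sales_Fact ("
    " date_srky INTEGER NOT NULL REFERENCES Date_Dim (date_srky),"
    " sales_amount REAL)"  # the measure is left nullable
)
conn.execute("INSERT INTO Date_Dim VALUES (20230101, '2023-01-01')")
conn.executemany(
    "INSERT INTO Sales_Fact VALUES (?, ?)",
    [(20230101, 100.0), (20230101, None), (20230101, 50.0)],
)

# Aggregates skip NULL measures: COUNT(column), SUM and AVG ignore the missing value.
row_count, measure_count, total, average = conn.execute(
    "SELECT COUNT(*), COUNT(sales_amount), SUM(sales_amount), AVG(sales_amount)"
    " FROM Sales_Fact"
).fetchone()

# The FK constraint rejects a fact row whose date key has no dimension row.
try:
    conn.execute("INSERT INTO Sales_Fact VALUES (19990101, 10.0)")
    fk_rejected = False
except sqlite3.IntegrityError:
    fk_rejected = True
```

If the missing measure had been stored as zero instead of NULL, the average would be computed over three rows rather than two, which is the impact on aggregate functions noted above.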

Data Modeling – Dimension Table Effective Practices

Database Table

I’ve had these notes lying around for a while, so I thought I would consolidate them here.   Here are a few guidelines to ensure the quality of your dimension table structures.

Dimension Table Effective Practices

  • The table naming convention should identify it as a dimension table. For example:
    • Suffix Pattern:
      • <<TableName>>_Dim
      • <<TableName>>_D
    • Prefix Pattern:
      • Dim_<<TableName>>
      • D_<<TableName>>
  • Have a Primary Key (PK) assigned on the table surrogate key
  • Audit fields – Type 1 dimensions should:
    • Have a Created Date timestamp – when the record was initially created
    • Have a Last Update timestamp – when the record was last updated
  • Job Flow: Do not place the dimension processing in the fact jobs.
  • Every dimension should have a Zero (0), Unknown, row
  • Fields should be ‘NOT NULL’, replacing nulls with a zero (0) for numeric and integer type fields or a space ( ‘ ‘ ) for character type fields.
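Several of the guidelines above (surrogate primary key, audit timestamps, NOT NULL fields with defaults, and the zero Unknown row) can be sketched together with sqlite3. The Customer_Dim table, its columns, and the lookup_srky helper are hypothetical names used only for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Customer_Dim ("
    " customer_srky INTEGER PRIMARY KEY,"                     # PK on the surrogate key
    " customer_name TEXT NOT NULL DEFAULT ' ',"               # space instead of NULL
    " credit_limit INTEGER NOT NULL DEFAULT 0,"               # zero instead of NULL
    " created_ts TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP,"    # audit field
    " last_update_ts TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP)"  # audit field
)
# The zero (0) Unknown row gives facts a valid key when no match exists.
conn.execute(
    "INSERT INTO Customer_Dim (customer_srky, customer_name) VALUES (0, 'Unknown')"
)
conn.execute(
    "INSERT INTO Customer_Dim (customer_srky, customer_name) VALUES (1, 'Acme Ltd')"
)


def lookup_srky(conn, name):
    """Return the surrogate key for a name, falling back to the Unknown row (0)."""
    row = conn.execute(
        "SELECT customer_srky FROM Customer_Dim WHERE customer_name = ?", (name,)
    ).fetchone()
    return row[0] if row else 0


known = lookup_srky(conn, "Acme Ltd")
unknown = lookup_srky(conn, "No Such Customer")
default_limit = conn.execute(
    "SELECT credit_limit FROM Customer_Dim WHERE customer_srky = 1"
).fetchone()[0]
```

With the Unknown row in place, a fact load can always resolve a key, so the foreign key constraint on the fact table never has to be relaxed.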

 

 

Data Modeling – Database Table Field Ordering Effective Practices

Database Table Field Ordering Effective Practices

Database Table

Field ordering can help the performance of inserts and updates and also keeps developers and users from having to search the entire table structure to be sure they have all the keys, etc.

Table Field Ordering

  1. Distribution field or fields (if no distribution field is set, the first field will be used by default)
  2. Primary Key Columns (including Parent and Child key fields)
  3. Foreign Key Columns (Not Null)
  4. Not Null Columns
  5. Nullable Columns
  6. Created Date Timestamp
  7. Modified (or Last Updated) Date Timestamp
  8. Large text Fields
  9. Large binary Columns or Binary Field references
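The ordering in the numbered list above can be applied mechanically. The sketch below sorts a set of fields by category; the category labels and the sample field names are hypothetical, chosen only to mirror the list.

```python
# Rank of each category, following the numbered list above.
ORDER = [
    "distribution",
    "primary_key",
    "foreign_key",
    "not_null",
    "nullable",
    "created_ts",
    "modified_ts",
    "large_text",
    "large_binary",
]


def order_fields(fields):
    """Sort (name, category) pairs into the recommended physical order."""
    rank = {category: position for position, category in enumerate(ORDER)}
    # sorted() is stable, so fields in the same category keep their input order.
    return [name for name, category in sorted(fields, key=lambda f: rank[f[1]])]


ordered = order_fields(
    [
        ("notes", "large_text"),
        ("middle_name", "nullable"),
        ("customer_srky", "primary_key"),
        ("created_ts", "created_ts"),
        ("region_srky", "foreign_key"),
        ("last_name", "not_null"),
    ]
)
```

A helper like this could be run against a draft table definition before the DDL is generated, so the keys always appear at the top of the structure.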

Data Modeling – What is Data Modeling?

Data Models, Data Modeling, What is data Modeling, logical Model, Conceptual Model, Physical Model

Data Models

Data modeling is the documenting of data relationships, characteristics, and standards based on the intended use of the data.   Data modeling techniques and tools capture and translate complex system designs into easily understood representations of the data, creating a blueprint and foundation for information technology development and reengineering.

A data model can be thought of as a diagram that illustrates the relationships between data. Although capturing all the possible relationships in a data model can be very time-intensive, a well-documented model allows stakeholders to identify errors and make changes before any programming code has been written.

Data modelers often use multiple models to view the same data and ensure that all processes, entities, relationships and data flows have been identified.

There are several different approaches to data modeling, including:

Concept Data Model (CDM)

  • The Concept Data Model (CDM) identifies the high-level information entities and their relationships, which are organized in the Entity Relationship Diagram (ERD).

Logical Data Model (LDM)

  • The Logical Data Model (LDM) defines detailed business information (in business terms) within each of the Concept Data Model's information entities and is a refinement of those entities.   Logical data models are non-RDBMS-specific business definitions of the tables, fields, and attributes contained within each information entity, from which the Physical Data Model (PDM) and Entity Relationship Diagram (ERD) are produced.

Physical Data Model (PDM)

  • The Physical Data Model (PDM) provides the actual technical details of the model and database objects (e.g. table names, field names, etc.) to facilitate creation of accurate detailed technical designs and actual database creation.  Physical Data Models are RDBMS-specific definitions of the logical model, used to build the database, create deployable DDL statements, and produce the Entity Relationship Diagram (ERD).

 

Information Management Unit Testing

Information Management Unit Testing, UT, Unit Test

 

Information management projects generally have the following development work:

  • Data movement software;
  • Data conversion software;
  • Data cleansing routines;
  • Database development DDL; and
  • Business intelligence and reporting analytical solutions.

Module testing validates that each module’s logic satisfies requirements specified in the requirements specification.

Effective  Practices

  1. Should focus on testing individual modules to ensure that they perform to specification, handle all exceptions as expected, and produce the appropriate alerts to satisfy error handling.
  2. Should be performed in the development environment.
  3. Should be conducted by the software developer who develops the code.
  4. Should validate the module’s logic, adherence to functional requirements and adherence to technical standards.
  5. Should ensure that all module source code has been executed and each conditional logic branch followed.
  6. Test data and test results should be recorded and form part of the release package when the code moves to production.
  7. Should include a code review, which should:
    • Focus on reviewing code and test results to provide additional verification that the code conforms to data movement best practices and security requirements; and
    • Verify that test results confirm that all conditional logic paths were followed and that all error messages were tested properly.
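As one possible illustration of these practices, the sketch below unit tests a small data cleansing routine with Python's built-in unittest module. The cleanse_phone routine and its 10-digit rule are hypothetical; the point is that each conditional branch, including the error paths, gets its own test.

```python
import unittest


def cleanse_phone(raw):
    """Hypothetical cleansing routine: keep digits only, require exactly 10."""
    if raw is None:
        raise ValueError("phone number is missing")
    digits = "".join(ch for ch in raw if ch.isdigit())
    if len(digits) != 10:
        raise ValueError(f"expected 10 digits, got {len(digits)}")
    return digits


class CleansePhoneTest(unittest.TestCase):
    def test_strips_punctuation(self):
        # Normal path: formatting characters are removed.
        self.assertEqual(cleanse_phone("(555) 123-4567"), "5551234567")

    def test_missing_value_raises(self):
        # Exception path: missing input.
        with self.assertRaises(ValueError):
            cleanse_phone(None)

    def test_wrong_length_raises(self):
        # Exception path: wrong number of digits.
        with self.assertRaises(ValueError):
            cleanse_phone("12345")


def run_tests():
    """Run the test case and return the result object for recording."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(CleansePhoneTest)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

Calling run_tests() returns a result object whose counts can be saved alongside the test data as part of the release package.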

Testing Procedures

  1. Review design specification with designer.
  2. Prepare test plan before coding.
  3. Create test data and document expected test results.
  4. Ensure that test data validates the module’s logic, adherence to functional requirements and adherence to technical standards.
  5. Ensure that test data tests all module source code and each conditional logic branch.
  6. Conduct unit test in personal schema.
  7. Document test results.
  8. Place test data and test results in project documentation repository.
  9. Check code into code repository.
  10. Participate in code readiness review with Lead Developer.
  11. Schedule code review with appropriate team members.
  12. Assign code review roles as follows:
    • Author, the developer who created the code;
    • Reader, a developer who will read the code during the code review (the reader may also be the author); and
    • Scribe, a developer who will take notes.

Code Review Procedures

  1. Validate that code readiness review has been completed.
  2. Read code.
  3. Verify that code and test results conform to data movement best practices.
  4. Verify that all conditional logic paths were followed and that all error messages were tested properly.
  5. Verify that coding security vulnerability issues have been addressed.
  6. Verify that test data and test results have been placed in project documentation repository.
  7. Verify that code has been checked into code repository.
  8. Document action items.

Testing Strategies

  1. Unit test data should be created by the developer and should be low volume.
  2. All testing should occur in a developer’s personal schema.

Summary

Unit testing is generally conducted by the developer who develops the code and validates that each module’s logic satisfies requirements specified in the requirements specification.

Where do data models fit in the Software Development Life Cycle (SDLC) Process?

Data Model SDLC Relationship Diagram

In the classic Software Development Life Cycle (SDLC) process, Data Models are typically initiated, by model type, at key process steps and are maintained as data model detail is added and refinement occurs.

The Concept Data Model (CDM) is usually created in the Planning phase.   However, creation of the Concept Data Model can slide forward or backward somewhat within the System Concept Development, Planning, and Requirements Analysis phases, depending upon whether the application being modeled is a custom development effort or a modification of a Commercial-Off-The-Shelf (COTS) application.  The CDM is maintained, as necessary, through the remainder of the SDLC process.

The Logical Data Model (LDM) is created in the Requirements Analysis phase and is a refinement of the information entities of the Concept Data Model. The LDM is maintained, as necessary, through the remainder of the SDLC process.

The Physical Data Model (PDM) is created in the Design phase to facilitate creation of accurate detail technical designs and actual database creation. The PDM is maintained, as necessary, through the remainder of the SDLC process.
