Category Archives: Centerprise Data Integrator

Server Resiliency Improved in Centerprise 7.5

We are releasing Centerprise Data Integrator 7.5 very soon, and as with every new release, our focus is on improving the product experience for our customers. We have added new features and improved existing ones in the upcoming build.

One of the key changes we have made in version 7.5 is greatly improved server resilience. This ensures trouble-free server operations even when connection issues occur between the Astera server and the MSSQL server hosting the repository databases. Enhanced server resilience will not only improve the overall performance of Centerprise, but will also prove significantly beneficial in scenarios where 24/7 operation with high uptime is required, as is the case for most of our enterprise customers.

Key Benefits of Improved Server Resilience

With improved server resilience, users will get the following benefits:

Faster, more reliable server recovery

With improved server resilience, the Astera server will no longer enter a permanent error state after a database outage. Instead, it will recover as soon as the connection is restored, allowing customers to continue normal operations without downtime. The server can now survive most repository database outages without any user action required.

Auto-recovery mode – no manual restarts required

Manually restarting the server will no longer be necessary in case of lost connectivity; the server will recover automatically. If the server starts up while a connection to the repository database is unavailable, it will enter a paused state and wait for the connection. When connectivity is restored, the server will return to its normal operational state without a manual restart.

Improved performance

Improved server resilience means improved performance. If an outage occurs while the server is running flows, the flows will not be terminated; instead, they will be paused while waiting for the connection. This means the server will be able to complete most flows successfully even in the presence of multiple random network and/or database connection issues.

Activity/Error tracking

Logging in general, and logging of database connection issues in particular, has been greatly improved in the new release. Moreover, the server writes database connection issues to the Windows event log and includes a link to the error file for easier troubleshooting. An entry is also added to the server trace log when the connection is restored.


TDWI Anaheim Conference 2018: Post Event Highlights


TDWI Anaheim Conference 2018 has been a truly transformative journey for Team Astera. We attended several sessions, had one-on-one meetings with thought leaders and industry experts, and received positive responses to our products, especially our end-to-end data warehouse automation solution, DWAccelerator.

Event Overview


TDWI is the focal point of education, research, and insights when it comes to data management, analytics, and the latest trends and technologies within the big data realm. TDWI Anaheim Conference 2018 was held at the happiest place on Earth, the Disneyland® Hotel, from August 5 to August 10, drawing data professionals from a whole host of renowned companies across North America.

The five-day conference featured over 65 half- and full-day sessions by TDWI instructors and industry experts, geared towards providing hands-on training and practical experience to data professionals and aspiring individuals. The scope of the conference revolved around four major areas: data infrastructure and technologies, modern data management, machine learning and advanced analytics, and data strategy and leadership.

Highlights of the Event

TDWI Anaheim Conference 2018 enabled us to explore new avenues and learn about the latest happenings in the data management industry. Here are our highlights from the conference:

Our CEO Shared Data Warehousing Automation Insights


Ibrahim Surani, CEO of Astera Software, delivered a session on ‘Model-Driven Data Warehouse Development.’ He began with the importance of data and highlighted the major challenges businesses generally face when executing a data warehousing or integration project. He shed light on the indispensability of data warehouse automation in enhancing the quality, speed, and accuracy of data warehouse development. He went on to discuss the ingredients for achieving data warehouse automation, including a source data modeler, a dimension modeler, seamless connectivity, and a robust, high-performance ETL engine.

The crux of the talk was the metadata-driven model, which he explained as a four-step process, with each step illustrating a key aspect of automating the data warehouse development process. Ibrahim emphasized the benefits of model-driven data warehouse automation, which allows businesses to cut maintenance costs, reduce time to market, and minimize handoffs between users and software tools, without compromising the flexibility and power of the solution.

Discussions with Industry Experts on DWAccelerator

TDWI gathers industry experts and thought leaders from renowned research firms and Fortune 500 companies, some of whom also serve as TDWI instructors. We gave several exclusive product demos and received positive feedback on DWAccelerator’s capabilities.

John L. Myers, a TDWI instructor and Managing Research Director at EMA, showed great interest in DWAccelerator’s automation. The product can drastically reduce the time-consuming process of developing enterprise data warehouse architecture and designing ETL processes. He was impressed with several of DWAccelerator’s features, such as automatic joins, load policy configuration, and flow generation.

Our CEO and COO have been invited to participate in the data warehouse automation panel at the upcoming TDWI conference in Orlando. Be on the lookout for updates regarding our booth and free conference passes for the TDWI Orlando Conference.

Final Words

For over 20 years, TDWI has been tracking the trends and technologies shaping data and educating companies and professionals on how to utilize data to its maximum potential. TDWI Anaheim Conference 2018 was an insightful experience for our team and proved to be a great platform to connect with renowned industry names and clients looking for an automated data warehousing solution like DWAccelerator.


An Automated Approach to Modeling Your Slowly Changing Dimensions

Business data inherently changes with the passage of time, and those changes impact the business in different ways. In data warehouses, the effect of time on our dimensions and facts requires careful study if the repository is to meet the business intelligence objective of delivering up-to-date information to decision makers.

The question is: how best to handle these changes?

Developing a dimensional model that captures the different states of your data with respect to time is a key objective of an Enterprise Data Warehouse. For measures in our fact tables, we can use date dimensions and link them using foreign keys. For dimensions, however, the complexity of handling changes increases greatly. Traditionally, each step of the Slowly Changing Dimension (SCD) flow must be hand-coded using multiple, complex SQL statements (a sketch of one such step follows). The implementation is lengthy and complex, and it affects the business’s ability to maintain its data quickly and reliably – which is always a critical consideration.
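To make this concrete, here is a hedged sketch of just one step of such a hand-coded flow, the change-detection (routing) step, written in SQL Server syntax; all table and column names are hypothetical, and a complete implementation would need several more statements to expire, insert, and update rows:

-- One step of a hand-coded SCD flow: classifying incoming rows as new,
-- changed, or unchanged. All names here are illustrative only.
SELECT s.CustomerBusinessKey,
       CASE
           WHEN d.CustomerBusinessKey IS NULL  THEN 'INSERT'    -- new record
           WHEN d.ContactName <> s.ContactName THEN 'CHANGED'   -- needs SCD handling
           ELSE 'UNCHANGED'
       END AS ChangeType
FROM StagingCustomer AS s
LEFT JOIN DimCustomer AS d
       ON d.CustomerBusinessKey = s.CustomerBusinessKey
      AND d.IsCurrent = 1;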

Slowly Changing Dimensions in Centerprise

Compared to the traditional hand-coded approach to the slowly changing dimension flow, Astera offers an automated implementation with a completely drag-and-drop interface. Source data is mapped to an SCD object in Centerprise, which pushes system-generated SQL statements directly to the target data warehouse (read: Pushdown Optimization Mode in Centerprise) based on the field layouts defined by the user. Each column in the user’s table can be designated as Surrogate Key, Business Key, SCD1, SCD2, and so on (see below) within the component’s properties in Centerprise. The platform handles the update strategy, performance considerations, routing, and complex joins automatically on the backend, as long as the SCD field types in the screen below are defined correctly.

Screenshot: SCD object properties in Centerprise, showing the field layout for the Slowly Changing Dimensions component

Automating Type 1 & 2 Slowly Changing Dimension Implementation

Centerprise supports both Type 1 and Type 2 SCDs, updating records either without preserving history (Type 1) or while maintaining it (Type 2).

SCD Type 1

This type deals with updates to the dimension table for cases when preserving history is not a consideration and you need to replace the old values in your table with recent ones.

To use SCD Type 1 in Centerprise, mark your column as ‘SCD1 – Update’ in the Layout Fields menu of the SCD object, as seen in the screenshot above for the ‘Contact Title’ column.
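To illustrate the kind of logic this setting reduces to, here is a minimal sketch of a Type 1 overwrite in SQL Server syntax; the table and column names are hypothetical, not Centerprise’s actual generated code:

-- Type 1: overwrite the value in place; no history is kept.
-- Table and column names are illustrative only.
UPDATE d
SET    d.ContactTitle = s.ContactTitle
FROM   DimCustomer AS d
INNER JOIN StagingCustomer AS s
        ON d.CustomerBusinessKey = s.CustomerBusinessKey
WHERE  d.ContactTitle <> s.ContactTitle;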

SCD Type 2

This type deals with changes in your dimension that need to be tracked. A new record is inserted with each change, and the existing record is marked as expired, by date, version, or status.

To use SCD Type 2 in Centerprise, mark your chosen column as ‘SCD2 – Update and Insert’, as seen in the screenshot above for the ‘ContactName’ column.
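Conceptually, a Type 2 change expires the current row and inserts a new one. Below is a hedged sketch of that expire-and-insert pattern in SQL Server syntax, using a hypothetical effective-date and current-flag scheme; it illustrates the pattern rather than Centerprise’s actual generated code:

-- Step 1: expire the current version of every changed record.
UPDATE d
SET    d.EffectiveEndDate = GETDATE(),
       d.IsCurrent        = 0
FROM   DimCustomer AS d
INNER JOIN StagingCustomer AS s
        ON d.CustomerBusinessKey = s.CustomerBusinessKey
WHERE  d.IsCurrent = 1
  AND  d.ContactName <> s.ContactName;

-- Step 2: insert a new current version for every record that has no
-- current row left (the rows expired above, plus brand-new keys).
-- The surrogate key is assumed to be an identity column.
INSERT INTO DimCustomer
       (CustomerBusinessKey, ContactName, EffectiveStartDate,
        EffectiveEndDate, IsCurrent)
SELECT s.CustomerBusinessKey, s.ContactName, GETDATE(), NULL, 1
FROM   StagingCustomer AS s
WHERE NOT EXISTS (SELECT 1
                  FROM   DimCustomer AS d
                  WHERE  d.CustomerBusinessKey = s.CustomerBusinessKey
                    AND  d.IsCurrent = 1);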

Push-Down Optimization

Once the layout is defined and the flow is executed, the Astera SCD transformation generates the SQL code necessary to compare, join, route, and insert data in your target dimension, and pushes the transformation logic down to the database for processing.

With this approach, the maintenance of large dimensions is significantly faster because all the processing is done by the database, rather than the Centerprise server performing the operations and going back and forth to the database to read, compare, and write the data.

To learn more about the automated Slowly Changing Dimensions component in Centerprise and how to use it to manage your dimensions, download the white paper: How to Manage Slowly Changing Dimensions Using Centerprise.

Pushdown Optimization Mode in Centerprise Data Integrator

How does Pushdown Optimization mode work in Centerprise?

Moving data containing millions of records between a source, the ETL server, and a target database can be a time-consuming process. When the source and target databases reside on the same server, unnecessary data movement and delays can be prevented by applying transformations to data in pushdown optimization mode.

Pushdown optimization mode pushes the transformation logic down to the source or target database. The Centerprise integration server translates the applied transformation logic into automatically generated SQL queries. This eliminates the need to extract data from the source, migrate it to staging tables on an ETL server for transformation, and then load the transformed data into the target database. As a result, performance is significantly improved and data is made readily available to end users.
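To make the contrast concrete, here is a hedged sketch of the kind of statement such a translation might produce for a simple filter-and-aggregate job; the table names and expressions are hypothetical, not Centerprise’s actual output:

-- In pushdown mode, the whole job runs as one set-based statement inside
-- the database instead of round-tripping rows through the ETL server.
-- All names below are illustrative only.
INSERT INTO SalesSummary (Region, OrderYear, TotalAmount)
SELECT   Region,
         YEAR(OrderDate) AS OrderYear,
         SUM(Amount)     AS TotalAmount
FROM     Orders
WHERE    OrderStatus = 'Shipped'        -- filter evaluated in the database
GROUP BY Region, YEAR(OrderDate);       -- aggregation pushed down as well

Because the rows never leave the database, the only traffic between the Centerprise server and the database is the statement itself.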


Types of Pushdown Mode

There are two types of pushdown optimization modes:

  1. Full pushdown optimization mode
  2. Partial pushdown optimization mode

In full pushdown optimization mode, the Centerprise integration server executes the job entirely in pushdown mode. In partial pushdown mode, the transformation logic is pushed down to either the target database or the source database, depending on the transformation logic and the database provider.

Database Providers Supported in Pushdown Mode by Centerprise

Centerprise supports the following database providers in pushdown mode:

  1. MySQL
  2. SQL
  3. Oracle
  4. Postgres
  5. MSSQL

Verify Pushdown Mode

Certain transformation logic cannot be executed in pushdown mode. The ‘Verify Pushdown Mode’ feature in Centerprise identifies the transformation logic that can be pushed down to the source or destination database.

To learn more about Pushdown Optimization mode in Centerprise and its use cases, download the white paper Centerprise Automated Pushdown Optimization.

Optimizing Business Capabilities with Data Integration Software

Businesses are increasingly adopting a data-driven culture. The significant surge in the volume of exchanged data indicates that the trend is creating a paradigm shift, a shift from a manufacturing economy to an information economy. To put this in perspective, Google processes petabytes of information every hour, and The Economist recently declared data the world’s most valuable resource, ahead of even oil.


“The world’s most valuable resource is no longer oil, but data.”

-The Economist

But the true utility of any resource comes from its consumption or the value it delivers to the consumers. The same principle applies to data. To gain maximum utility out of data, businesses must be able to (quickly and reliably) integrate incoming data from disparate sources and make that information available to the relevant stakeholders, both internally and externally. Your business needs a data integration tool to perform this task efficiently.

A data integration tool can help you optimize your current business capabilities in the following ways:

By extracting data from structured and unstructured sources

Incoming data can be structured, semi-structured, poly-structured, or unstructured. For instance, many organizations use text-based PDF files, PDF forms, and scanned PDF images as a medium for exchanging information. But the data contained in PDF files is unstructured and must be extracted before it can inform crucial business decisions. A data integration tool can automate the data extraction process and integrate the extracted data with internal systems for further processing and analysis.

By integrating data from hierarchical files

Integrating data from flat files is comparatively easy, but business users face challenges when they try to extract, parse, and integrate information from hierarchical data files such as XML, JSON, EDI, and COBOL. To perform hierarchical data integration, business users rely on IT, which increases the burden on IT teams. A data integration tool can effectively bridge this gap between business executives and IT.

Learn how Centerprise Data Integrator enables business users to work with hierarchical data, without the need for custom coding and programming, by downloading the whitepaper Hierarchical Data Integration for Business Users.

By making data readily available to business users

A data integration tool with a user-friendly interface and a comprehensive library of built-in functions can help limit the reliance on IT. It makes data readily available to business users, who can then work with the available information and gain business insights without delay. Additionally, data integration tools can automate the ETL process, which eliminates the need for manual integration and significantly reduces the chance of errors.

A business’s performance is optimized when its executives are focused on making critical business decisions rather than on collecting and integrating data.

By checking for data quality

A data integration tool cleanses, validates, and ensures the trustworthiness of incoming data. Poor-quality data can adversely affect business insights, which can prove expensive for the business.

Overall, a data integration tool that simplifies the ETL process for users is an investment organizations should make to stay relevant in today’s data-driven business environment. It can benefit the business in more than one way. By bridging the gap between IT and business executives, it enables an efficient division of workload. It empowers business users to derive insights from data by giving them prompt access to it. And when executives delegate the task of data integration and extraction to software, they can focus on more critical aspects of the business. The result is faster and more accurate business decisions, minimized costs, and increased revenue.

Astera’s Centerprise Data Integrator is a complete data integration solution that provides these benefits to its users, and more. Its user-friendly interface and visual drag-and-drop environment eliminate the need for manual scripting and enable business users to work with data without relying on IT. Contact Astera’s sales and support teams to get more information.


Common Challenges of COBOL Data Extraction and How Centerprise Addresses Them

Although technologies like Ruby, Hadoop, and Cloud Computing continue to dominate headlines, there are still a large number of businesses that rely on legacy technologies. Many businesses, particularly those operating in the banking and insurance sector, use solutions that are COBOL-based.

According to Reuters, over 220 billion lines of COBOL code are in use today. As a result, a tremendous amount of data remains tied up in legacy systems. For any legacy modernization and BI initiative to be successful, it is essential that this data be integrated, transformed, and offloaded onto an analytics platform.

While extracting data from COBOL-based legacy applications is essential for improved decision-making, it remains a challenge for most businesses due to two primary reasons:

  • Shortage of COBOL Skills

There is a growing gap between the number of skilled COBOL programmers and the organizations relying on the programming language. The average age of COBOL programmers is 55 years, and 70 percent of universities favor newer languages and platforms like Java, C++, Linux, and UNIX over COBOL.

  • Need for Custom Programming

Analyzing data by directly querying the mainframe is a complex process. It requires custom development and can therefore be time-consuming and costly, with billing based on MIPS.

To address these two challenges, businesses need a solution that can fuel their data integration efforts while ensuring data quality and reducing the need for hand-coding the processes.

How Centerprise Facilitates COBOL Data Extraction

Centerprise is a complete data integration solution that allows users to import data from a variety of sources, including legacy systems, transform it, and write it to a destination of their choice. With its user-friendly drag-and-drop interface and unparalleled data mapping capabilities, Centerprise makes the process of extracting data from COBOL-based systems simple, quick, and cost-effective.


Centerprise offers complete support for COBOL data extraction with the functionality to:

  • Read a COBOL File — Centerprise features a high-speed COBOL file reader that can efficiently process large COBOL files.
  • Parse a Copybook — The built-in copybook parser reads a COBOL copybook and automatically builds the layout. When a copybook is not available, users can import a COBOL data file as a fixed-length file and manually define field markers, data types, and numeric formats.
  • Identify USAGE, REDEFINES, and OCCURS — Centerprise offers support for the different clauses used in a COBOL data file, including REDEFINES, OCCURS, and USAGE clauses such as COMP, COMP-3, and COMP-5.

Once a COBOL data file has been imported, users can leverage the code-free, drag-and-drop environment of Centerprise to transform and write data to a destination of their choice.

Download our whitepaper to learn how Centerprise can help you combine legacy COBOL data with modern data streams and get a unified view of your information assets.