Centerprise 6: Processing Hierarchical Data Using Scoped Transformations

Hierarchical data formats play a major role in B2B data exchange, most of which takes place using formats such as XML or EDI. Parsing, transforming, and building hierarchical documents is a complex process. Centerprise 6 Data Integrator brings major new functionality for working with hierarchical data formats.

Overview

Centerprise 6 features a large number of built-in transformations that enable users to develop the sophisticated flows needed for complex data integration. These transformations can be broadly classified into single transformations and set transformations.

Single transformations usually operate on a single record at a time and are used for looking up or computing values. Examples of single transformations include lookups, expressions, functions, and others.

Set transformations operate on a record set and may alter the sequence and number of records in that set. Examples of set transformations include sort, filter, join, merge, normalize, denormalize, and union. Some transformations can be classified as either single or set, depending on whether they return a single record or multiple records. These transformations include lookups, subflows, and text parsers.

Generally, set transformations operate on the entire data set. For instance, a sort transformation sorts the entire data set before passing along records to the next object. Similarly, aggregate transformations use the entire data set to construct aggregated records.
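
The distinction can be illustrated outside Centerprise with a minimal Python sketch. This is not Centerprise code: the record fields and the function names (add_tax, single_transform, sort_transform) are hypothetical and exist only to show that a single transformation works record by record, while a set transformation such as sort needs the whole record set before it can emit anything.

```python
from typing import Iterable, Iterator

# Hypothetical flat records, for illustration only.
records = [
    {"order_id": 3, "amount": 20.0},
    {"order_id": 1, "amount": 35.5},
    {"order_id": 2, "amount": 10.0},
]

def add_tax(record: dict, rate: float = 0.1) -> dict:
    """Single transformation: computes a new value from one record."""
    return {**record, "amount_with_tax": round(record["amount"] * (1 + rate), 2)}

def single_transform(rows: Iterable[dict]) -> Iterator[dict]:
    """Streams records one at a time; sequence and count are unchanged."""
    for row in rows:
        yield add_tax(row)

def sort_transform(rows: Iterable[dict], key: str) -> list:
    """Set transformation: consumes the entire set before emitting output."""
    return sorted(rows, key=lambda r: r[key])

taxed = list(single_transform(records))
ordered = sort_transform(taxed, "order_id")
```

Here sort_transform cannot return its first record until it has seen the last one, which is the "entire data set" behavior described above.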

This model, while perfectly suitable for flat structures, does not provide adequate semantics for dealing with hierarchical data.

Scoped Transformations

Centerprise 6 introduces the concept of Scoped Transformations: set transformations that can be limited to a specific node in the source tree, enabling the building and manipulation of complex hierarchical data structures.

A set transformation can be designated as a Scoped Transformation by creating a scope map between a source object node and the top node in the transformation. This is accomplished by pressing the Alt key while dragging a source node to the target. In the following example, the filter is a Scoped Transformation whose scope is the Order node. This action attaches the result of the Filter to the Order node. For onward mapping, this action can be thought of as implicitly creating a Filter node collection inside the Order node.
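
Outside the designer, the effect of scoping a filter to the Order node can be approximated in plain Python. This is a sketch under stated assumptions, not how Centerprise implements the feature: the tree layout, the field names, and the scoped_filter helper are all hypothetical. The point is that the filter runs once per Order over that order's children only, and its result is attached to the Order as a new collection.

```python
# Hypothetical order tree; field names are illustrative only.
orders = [
    {"order_id": 1, "details": [{"product": "A", "quantity": 5},
                                {"product": "B", "quantity": 0}]},
    {"order_id": 2, "details": [{"product": "C", "quantity": 0}]},
]

def scoped_filter(parents, child_key, predicate, result_key="Filter"):
    """Apply a filter within the scope of each parent node.

    The filter sees only one parent's children at a time, and the kept
    children are attached to that parent under result_key.
    """
    scoped = []
    for parent in parents:
        kept = [child for child in parent[child_key] if predicate(child)]
        scoped.append({**parent, result_key: kept})
    return scoped

result = scoped_filter(orders, "details", lambda d: d["quantity"] > 0)
```

Note that the number of Order nodes is unchanged; only the contents of each attached Filter collection vary, which is the difference from a filter applied to the entire flattened data set.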

Building a Simple Tree

We start with an example of using scoped transformations to build a simple sales order tree structure. While Centerprise offers many different ways of building tree structures, a Scoped Transformation will be used to accomplish the task in this instance.

This dataflow builds an order tree using Microsoft’s sample Northwind database. A tree can also be built in Centerprise using the Database Table Source and Tree Join transformations; Scoped Transformations are used here to illustrate their usefulness. The steps are described below:

Order – Database Table Source

The Order source retrieves data from the Order table in the database.

OrderDetail – Database Lookup

OrderDetail is a database lookup with the ‘Return All’ option selected. This means that the lookup will return all rows that match the lookup keys. In this case, it will return all instances of OrderDetail for the specified Order object. This step creates a tree in which each Order contains a collection of OrderDetail data.

OrderTotals – Aggregate

OrderTotals is an aggregate transformation with no Group By fields defined. This step creates order level totals. The result is implicitly attached to the Order node as a single instance object.

OrderTree – Passthru

OrderTree is merely a passthru transformation that consolidates all the nodes into a unified tree object.
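
The four steps above can be mirrored in a short Python sketch. It is only an analogy for the dataflow, not Centerprise code: the sample rows, the OrderID, UnitPrice, and Quantity column names, and the helpers build_lookup and build_order_tree are assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical stand-ins for the Order and OrderDetail tables.
order_rows = [{"OrderID": 10248}, {"OrderID": 10249}]
order_detail_rows = [
    {"OrderID": 10248, "UnitPrice": 14, "Quantity": 12},
    {"OrderID": 10248, "UnitPrice": 10, "Quantity": 10},
    {"OrderID": 10249, "UnitPrice": 19, "Quantity": 9},
]

def build_lookup(rows, key):
    """Index rows by key so a lookup can return all matching rows."""
    index = defaultdict(list)
    for row in rows:
        index[row[key]].append(row)
    return index

def build_order_tree(orders, details):
    """Attach detail rows and order-level totals to each order."""
    detail_index = build_lookup(details, "OrderID")
    tree = []
    for order in orders:
        # 'Return All' lookup: every detail row matching the order key.
        matches = detail_index.get(order["OrderID"], [])
        # Aggregate with no Group By, scoped to this order: one totals record.
        totals = {
            "LineCount": len(matches),
            "TotalQuantity": sum(d["Quantity"] for d in matches),
            "TotalAmount": sum(d["UnitPrice"] * d["Quantity"] for d in matches),
        }
        # Passthru: consolidate the nodes into one tree record per order.
        tree.append({**order, "OrderDetail": matches, "OrderTotals": totals})
    return tree

order_tree = build_order_tree(order_rows, order_detail_rows)
```

Each element of order_tree holds a collection (OrderDetail) and a single-instance object (OrderTotals), matching the shape described in the steps.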

This simple scenario illustrates the power and versatility of the Centerprise 6 scoped transformation feature. Most Centerprise set transformations can be used as Scoped Transformations. These include filter, sort, distinct, lookups, aggregate, join, tree join, union, merge, normalize, denormalize, and others.

Centerprise 6: Detached Transformations Enable Higher Productivity and Performance

Centerprise 6 will be out in the next few weeks and our blog is going to focus in the coming months on getting users up to speed with the many new features and enhancements so that you can get the most out of your investment.

Centerprise Data Integrator features a large number of built-in transformations that enable users to develop sophisticated flows to meet complex data integration demands. New in Centerprise 6 are detached transformations, a key feature that enables you both to use transformations conditionally and to reuse them in multiple expressions. You can also create composite transformations in your subflows that can be reused in a variety of scenarios, adding modularity to your flows.

What is a detached transformation?

Centerprise transformations can be broadly classified into single transformations and set transformations. Single transformations usually operate on a single record at a time and are used for looking up or computing values. Examples of single transformations include lookups, expressions, functions, and others. Set transformations, on the other hand, operate on sets of records and may alter the sequence and number of records passing through them. Examples of set transformations include sort, filter, join, merge, normalize, denormalize, and union, among others.

Until now, all Centerprise transformations have been attached. That is, they receive input from other objects via inbound maps and send output via outbound maps. Centerprise 6 now introduces the concept of detached transformations—single transformations that are not mapped to any other objects.

Instead, these transformations behave similarly to built-in or custom functions and can be used in expressions. When a single transformation is marked as detached, Centerprise disallows mapping to and from that transformation.

In the Expression Builder dialog, detached transformations are listed in the Function List box. These functions are under the Detached Action category and are prefixed by the “$” sign. This feature enables you to invoke lookups, expressions, subflows, and others from within expressions.
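
As a rough analogy, a detached transformation behaves like an ordinary function that an expression may or may not call. The Python sketch below is not Centerprise expression syntax: the COUNTRY_NAMES table and the country_lookup and display_country functions are hypothetical, with country_lookup standing in for a “$”-prefixed detached lookup.

```python
# Hypothetical lookup data standing in for a database or file lookup.
COUNTRY_NAMES = {"US": "United States", "DE": "Germany"}

lookup_calls = []  # records each lookup invocation, to show conditional use

def country_lookup(code):
    """Plays the role of a detached lookup: inputs in, one output value out."""
    lookup_calls.append(code)
    return COUNTRY_NAMES.get(code)

def display_country(record):
    """An 'expression' that invokes the lookup only when it is needed."""
    if record.get("country_name"):
        return record["country_name"]
    return country_lookup(record["country_code"]) or "Unknown"

names = [
    display_country({"country_code": "DE"}),
    display_country({"country_code": "US", "country_name": "USA"}),
    display_country({"country_code": "ZZ"}),
]
```

Because the lookup is called from inside the expression rather than wired in with maps, it runs only for the records that need it, and the same function can be reused in any number of expressions.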

The following actions can be designated as detached transformations:

1. Database Lookup
2. File Lookup
3. SQL Statement Lookup
4. Code Lookup
5. Sequence Generator
6. Expression
7. Subflow (only those containing single transformations)

Detached transformations can have only one output element and a maximum of five input elements. For complete flexibility, they can be used in any expression inside a dataflow or workflow, including expression transformation, filter, route, workflow decision, data-driven write strategy, and data quality rules.

With the introduction of detached transformations in Centerprise 6, the Astera team provides a new way to help you save time and money by increasing your productivity, and, in some cases, the performance of your data integration scenarios.

Coming Soon—Centerprise 6!

Centerprise 6, the latest major release of Astera’s innovative data integration software, will be introduced in September. This version represents a major, across-the-board upgrade that combines over 150 new features and enhancements that support our trademark ease of use and extensibility within a powerful, fully functional environment.

This release builds on Centerprise’s impressive complex data mapping capabilities. Hierarchical data mapping has been significantly strengthened to ensure that Centerprise is the best platform for overcoming the challenges of complex hierarchical structures such as extensible markup language (XML), electronic data interchange (EDI), web services, and more.

The server functionality has received a major upgrade that includes scalability enhancements, high availability features, workflow restart, and server and job management improvements.

Over the next few months, this blog will highlight some of the most significant new features and capabilities and show how they can help your organization use its information assets more effectively.

Check back often for new information!