Data transformation in a data integration job using next Centerprise version “Hermes”

Most data integration jobs  revolve  around  gathering  data from disparate sources, cleaning it, making it conform to the standards by means of data mapping and data transformations, and then sending it to a destination such as a Data warehouse, a Data mart, or an Operational Data Store. One of the key challenges here  is the transformation or the ‘T’ of the ETL. Hermes  offers an exhaustive set of inbuilt data field mapping and data record  transformation constructs. In this blog I am going to talk about the field mapping. I’ll talk about record level transformations on a later day.

Data Field Mapping

In a regular migration job, one needs to map the source and the target. Based on the layout of the destination, the user picks what field of source is going to be mapped to which field of the destination. But this straight movement is not enough in most of the cases. Hermes offers various ways for data field mapping, apart from the field-to-field direct map.

To illustrate other forms of mapping, I have taken an example of a simple data migration job. This scenario involves transfer of a company’s customers’ information from the first quarter of the year to a database table. A snapshot of the visual data-flow design looks like this -

I am going to describe the functionality of each of the data field mapping types, followed by how I have used them in this sample.

Expression Map

Backed by Astera’s rules engine, expression map provides the capability to write Excel-like expressions and do complex calculations based on source fields and then assign them to a destination field.

In our example, source data comes with individual pieces of information about the customer address, but our destination data table has only one column for address and it takes the full address. I have used the expression map for this purpose and combined the five source fields – Street_address, city, state, zip, and country – into one full address and then assigned it to the destination.

List Lookup:

A list lookup stores the lookup list with the meta and uses it for looking up a destination value for incoming source values.

The source data in the example contains the full country names, but the destination table accepts only a two letter code for the country. I have used a list lookup for this purpose and created a lookup with entries like, UK for United Kingdom, US for United States, and so on and so forth. This lookup is attached to the source field of country and the result of the lookup goes to the country field of the destination.

Hermes offers another way to store this lookup information, called a database table lookup. A database lookup essentially works the same way as a list lookup, the only difference being that lookup list is stored in a database table.

SQL Statement Map

Using the SQL Statement map, you can  run a SQL query or a stored procedure with some of its parameters taking value from source fields and the output will be assigned to the destination.

For the example, one requirement was to get the total sales amount for the quarter for each of the customer records, where sales data is stored in a separate table. I have created an SQL query for this purpose that reads –

select TotalSales from ActiveCustomers where contactname = ‘@CustomerId’

Now, I connect the CustomerId from the source to this SQL Map, which works as the parameter for this SQL query, get the total sales for the customer and assigns to the destination field TotalSales.

Function Map

A function map offers a list of Financial, Name Parsing, Regular Expression, Date time, String, Logical, and Conversion functions, where these functions take input from the source fields and the output of the function can be assigned to a destination field.

I have used a function GetLastName from the functions’ list to get the last name of the customer. As you can see in the picture, this map is getting the ContactName from the source and assigning the result, last name — that is obtained by parsing the contact’s full name –  to the LastName field of the destination.

Constant Value Map

A constant value map can be used to assign a constant value to a destination field.

For the example transfer of customer records, we needed to add the information about which quarter this data belongs to, and I have used a constant value map. I am assigning a constant value “Q1 2010″ as data period to each of the customers.

Using this sample, we looked at different field mapping options available in Hermes to transform the source data at the field level before putting it into a destination. Hermes offers several record level transformations, such as merge, sort, union, join, distinct, etc.  I’ll talk about the record level transformations in my next blog entry.

Tags: , , , , ,

Comments are closed.