Dataflows – The Cornerstone of Data Integration Part 2

Part 2 – Creating a Dataflow

To create a new dataflow, go to File -> New -> Dataflow on the main menu. Alternatively, expand the New Dataflow dropdown on the main toolbar and select Dataflow.

Adding Objects

A dataflow normally has one or more sources, and may have zero, one, or more destinations. Sources, destinations, most types of maps, transformations, and logs are represented as objects on the dataflow. Depending on the type of the object, an object can be added to the dataflow in one of the following ways:

For flat file sources or destinations:

1. Flow Toolbox. You can add a source or destination object by selecting it from the appropriate category in the Flow Toolbox. For example, to add a comma-delimited file source object, expand the Sources group in the Flow Toolbox, then drag and drop the Delimited File Source tool onto the dataflow.

To add a destination object, press and hold the Shift key while performing the drag and drop. Note that an object added this way initially has no properties defined. To define its properties, double-click the object title, or right-click and select Properties from the context menu. In the Properties screen that opens, select the File Path of the file that will be associated with the object. The field layout and other properties can then be populated based on the file’s content. An example of a source delimited file Properties screen is shown below.

2. Drag and Drop. Excel, delimited, and fixed-length files can be dragged from an Explorer window and dropped onto an open dataflow tab in Centerprise. By default, a file dropped on a dataflow is added as a source object. To add the file as a destination, press and hold the Shift key while dropping the file.

The advantage of using drag and drop compared to other methods is that many of the object’s properties are pre-populated based on the file’s content. For example, the field layout is automatically filled out so there is no need to manually create it.

3. Copy and Paste. If the source or destination is already defined in a dataflow or workflow, the existing object can be copied and pasted into the same or a different dataflow or workflow. The pasted copy retains the properties of the original object and is assigned a unique new name to distinguish it from the original. The designation of an object as a source or a destination cannot be changed using this method.
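
To make the idea of a pre-populated field layout more concrete, here is a minimal Python sketch of how a layout could be derived from a delimited file's header row and first data row. It is an illustration only and not Centerprise's actual logic: the function name infer_field_layout and the type labels "integer", "real", and "string" are assumptions made for this example.

```python
import csv
import io


def infer_field_layout(text, delimiter=","):
    """Return (field_name, guessed_type) pairs for delimited text.

    Hypothetical helper: field names come from the header row and
    types are guessed from the first data row only.
    """
    rows = csv.reader(io.StringIO(text), delimiter=delimiter)
    header = next(rows)
    first = next(rows, [])  # empty if the file holds only a header

    def guess(value):
        # Try the narrowest type first, then fall back to string.
        for caster, label in ((int, "integer"), (float, "real")):
            try:
                caster(value)
                return label
            except ValueError:
                pass
        return "string"

    # Pad a short data row so every header field gets a type.
    padded = first + [""] * (len(header) - len(first))
    return [(name.strip(), guess(value.strip()))
            for name, value in zip(header, padded)]


sample = "id,name,price\n1,Widget,9.95\n"
print(infer_field_layout(sample))
# [('id', 'integer'), ('name', 'string'), ('price', 'real')]
```

A real tool would likely sample more than one row before settling on a type; the single-row guess here only keeps the sketch short.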

For XML sources or destinations:

1. Flow Toolbox. To add an XML source or destination to the dataflow, use the XML File Source or XML File Destination tool in its appropriate group in the Flow Toolbox. Note that the XML file object initially has no properties defined. To define its properties, double-click the object’s title, or right-click and select Properties from the context menu. In the Properties screen that opens, select the file path of the XML file that will be associated with the object, and provide the path to the XSD schema that controls the layout of the XML file. An example of a source XML file Properties screen is shown below.

As with flat files, an existing XML object can be copied and pasted into the same or a different dataflow or workflow. The pasted copy retains the properties of the original object and is assigned a unique new name to distinguish it from the original.

For databases:

1. Drag and Drop. A database table or view can be dragged from the Data Source Browser and dropped onto an open dataflow tab. To open the Data Source Browser, go to View -> Data Source Browser. Connect to the appropriate server, expand the Database tree, and then expand the Tables (or Views) tree to select the table (or view). Drag and drop the selected table or view onto the dataflow.

By default, the table or view is added as a Database Table Source object. Holding a modifier key while dragging and dropping from the Data Source Browser changes the type of object that is created: hold the Shift key to add the table or view as a destination, hold the Control key to add a data model source, or hold the Alt key to add a database lookup object.

2. Flow Toolbox. To add a database table source or destination to the dataflow, use the Database Table Source or Database Table Destination tool in its appropriate group in the Flow Toolbox. The database table object initially has no properties defined. To define its properties, double-click the object’s title, or right-click and select Properties from the context menu. An example of a source database table Properties screen is shown below.

As with files, an existing database table object can be copied and pasted into the same or a different dataflow or workflow. The pasted copy retains the properties of the original object and is assigned a unique new name to distinguish it from the original.
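
The modifier keys used for drag and drop are easy to mix up, so the following Python sketch restates them as a lookup table. It only summarizes the rules described above and is not Centerprise code. The function name object_for_drop and the string labels are hypothetical, and rejecting unlisted key combinations is an assumption, since the behavior for those is not described here.

```python
# Object created for a file dropped from an Explorer window.
FILE_DROP = {
    None: "file source",
    "shift": "file destination",
}

# Object created for a table or view dropped from the Data Source Browser.
TABLE_DROP = {
    None: "database table source",
    "shift": "database table destination",
    "ctrl": "data model source",
    "alt": "database lookup",
}


def object_for_drop(dragged, modifier=None):
    """Return the kind of object a drop creates.

    dragged  -- "file" or "table" (a view behaves like a table)
    modifier -- None, "shift", "ctrl", or "alt"
    """
    rules = {"file": FILE_DROP, "table": TABLE_DROP}[dragged]
    if modifier not in rules:
        # Assumption: combinations not covered in the text are rejected.
        raise ValueError(f"no rule for {dragged!r} with {modifier!r}")
    return rules[modifier]


print(object_for_drop("table"))           # database table source
print(object_for_drop("table", "alt"))    # database lookup
print(object_for_drop("file", "shift"))   # file destination
```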

For any other types of objects such as maps, transformations, or logging objects:

1. Flow Toolbox. An object can be added by selecting it from the appropriate category in the Flow Toolbox.

An object added this way initially has no properties defined. To define its properties, double-click the object title, or right-click and select Properties from the context menu.

2. Copy and Paste. If the object is already defined in the same or a different dataflow or workflow, the existing object can be copied and pasted into the dataflow or workflow. The pasted copy retains the properties of the original object and is assigned a unique new name to distinguish it from the original.

Unlimited Undo/Redo

The dataflow designer supports unlimited undo and redo. You can undo or redo the last action, or undo or redo several actions at once. To undo the last action, open the View menu and select Undo, click the Undo icon on the Dataflow Toolbar, or use the CTRL+Z shortcut. To redo the last action, open the View menu and select Redo, click the Redo icon on the Dataflow Toolbar, or use the CTRL+Y shortcut.

To undo several actions at once, select the earliest action to be undone from the dropdown menu; all actions performed after it are undone as well. To redo several actions at once, select the first action to be redone from the dropdown menu; the subsequent actions are redone as well.
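
Multi-step undo and redo are commonly modeled with two stacks. The Python sketch below shows that general technique, including undoing everything back to a chosen action as the dropdown menu does. It is a conceptual illustration, not the designer's implementation: the History class, its method names, and the rule that a new action clears the redo history are all assumptions.

```python
class History:
    """Two-stack undo/redo history with no fixed size limit."""

    def __init__(self):
        self._undo = []  # performed actions, most recent last
        self._redo = []  # undone actions, next to redo last

    def record(self, action):
        self._undo.append(action)
        # Assumption: performing a new action discards the redo history.
        self._redo.clear()

    def undo(self, count=1):
        """Undo up to `count` actions and return them, newest first."""
        undone = []
        for _ in range(min(count, len(self._undo))):
            action = self._undo.pop()
            self._redo.append(action)
            undone.append(action)
        return undone

    def redo(self, count=1):
        """Redo up to `count` actions and return them in redo order."""
        redone = []
        for _ in range(min(count, len(self._redo))):
            action = self._redo.pop()
            self._undo.append(action)
            redone.append(action)
        return redone

    def undo_through(self, action):
        """Undo `action` and every action performed after it."""
        # Locate the most recent occurrence of the chosen action.
        position = len(self._undo) - 1 - self._undo[::-1].index(action)
        return self.undo(len(self._undo) - position)


history = History()
for step in ("add source", "add destination", "map fields"):
    history.record(step)

print(history.undo_through("add destination"))
# ['map fields', 'add destination']
print(history.redo())
# ['add destination']
```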

Copying Objects

Using the copy and paste feature, an object on the dataflow can be replicated as a new object, which is given a different name to distinguish it from the original. The copy can be pasted into the same dataflow or a different dataflow.

To copy several objects at once, click the desired objects while pressing the CTRL key, or draw a rectangle around the objects while holding down the left mouse button. Right-click the selected object or objects and select Copy from the context menu. Then right-click on white space in the same or a different dataflow and select Paste from the context menu. You can also use the CTRL+C shortcut to copy the selection to the clipboard, CTRL+V to paste from the clipboard, and CTRL+X to cut the selection to the clipboard.

To move an object or a set of objects, cut and paste them using the same context menu or the shortcuts described above. When objects are moved, they keep their original names.
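
The difference between copying (the new object gets a unique name) and moving (the object keeps its name) can be sketched in a few lines of Python. This is a toy model, not Centerprise's behavior in detail: a dataflow is represented as a plain dictionary of object names to properties, the helper names are hypothetical, and the "_1", "_2" suffix scheme is an assumption, since the naming rule for copies is not described here.

```python
import copy


def unique_name(base, taken):
    """Return `base` if it is free, else `base` plus a numeric suffix."""
    if base not in taken:
        return base
    n = 1
    while f"{base}_{n}" in taken:
        n += 1
    return f"{base}_{n}"  # hypothetical naming scheme


def copy_object(flow, name):
    """Put a copy of the object on the clipboard; the flow is unchanged."""
    return name, copy.deepcopy(flow[name])


def cut_object(flow, name):
    """Remove the object from the flow and put it on the clipboard."""
    return name, flow.pop(name)


def paste_object(flow, clipboard):
    """Add the clipboard object to the flow and return the name it received."""
    name, properties = clipboard
    new_name = unique_name(name, flow)
    flow[new_name] = copy.deepcopy(properties)
    return new_name


flow = {"Customers": {"path": "customers.csv"}}

# Copy and paste: the original stays, so the copy needs a new name.
print(paste_object(flow, copy_object(flow, "Customers")))  # Customers_1

# Cut and paste (move): the name is free again, so it is kept.
other_flow = {}
print(paste_object(other_flow, cut_object(flow, "Customers")))  # Customers
```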

Next week we’ll discuss how to manage a dataflow layout.