Saturday, March 21, 2020

Creating a Custom Integration Knowledge Module to Backup Table before inserting new rows - ODI11g

Today I will be modifying an existing IKM SQL Control Append and implement + test it in ODI11g.


*This case study was demonstrated via ODI Substitution API in oracle's website, for further references direct to this link:*



To implement this, I will duplicate the original IKM SQL Control Append Knowledge Module and add additional steps.

2 Additional steps will be added to the Knowledge Module:


  • Drop Backup Table (if it already exists one);
  • Create Backup Table of the target's table before inserting new rows, tables will be suffixed with 'Target table's name'_BKP.

I will name this IKM as 'IKM SQL CONTROL APPEND/ wCreateBKP':

No need to specify target technology.


In Details tab, each step of the knowledge module is specified, and the Order is ranked in a top down approach, so the upper most step starts the first, and so on.

Typically the additional step for Dropping an existing BKP_ table with the target's table name should be the first step in this knowledge module, additionally after this step we want to Create a BKP_ prefixed table with all the existing records in the target table.

So the dependency here is that both of the steps should execute before Truncate target table or Delete target table step. Hence I will add both of these steps as the first to execute in the KM.


The typical SQL query for dropping a table in Oracle is:

  • Drop table 'Table_name'_BKP;

And the typical SQL query for creating a table postfixed with '_BKP' as a copy of another table including rows is:

  • Create table 'Table_name'||'_BKP' as Select * from 'Table_name';


ODI KM's use substitution API's. As cited in oracle's website (link above): KMs are written as templates by using the Oracle Data Integrator substitution API. The API methods are java methods that return a string value. They all belong to a single object instance named "odiRef". The same method may return different values depending on the type of KM that invokes it. That's why they are classified by type of KM.


Query:

  • Drop table 'Table_name'_BKP;
  • Drop table <%=odiRef.getTable("L","TARG_NAME","A")%>_BKP
And,

  • Create table 'Table_name'||'_BKP' as Select * from 'Table_name';
  • Create table <%=odiRef.getTable("L","TARG_NAME","A")%>_BKP as select * from <%=odiRef.getTable("L","TARG_NAME","A")%>

Now that we have the substitution API's ready, create two additional steps in Details tab of KM's using the green plus '+' on the top right corner, and rearrange the order using the up and down arrows on the top right corner:

New steps in KM

Note that when Dropping the backup table (if existing) the checkbox for ignore errors is marked, meaning that if the _BKP backup table doesn't already exist, do not throw an error and proceed with Creating it.

Open the step for Drop Backup Table and make sure to check that box and paste the substitution API's code under Command on Target section. No need to specify Technology of Context whatsoever.

Drop Backup Table (if existing)

Navigate to Create Backup Table step, this time I won't check Ignore Errors checkbox because I wan't to throw an error whenever a table cannot be created, make sure to paste the code under Command on Target Section.


Create Backup Table


Now that both of these steps are created, I want to create two Options under this Knowledge module, so if I want to disable this feature of the KM, I can do that by selecting it as False.

Navigate to Knowledge Modules in Designer Tab, Expand Knowledge Modules, expand IKM SQL CONTROL APPEND/ wCreateBKP and right click on it to add a new option:

New Option



When Creating the options, check for Position, the Options Position should refer to the Order position of the step inside Details tab, in my case Dropping the Backup table should have position 10 and Creating it should have position 20.

Edit Option

Edit Option

Now that we have the Knowledge module in place, it is time to test it.

To demonstrate it's execution and step I will use a simple interface which uses a file to populate an Oracle table.

Under flow tab of this interface, I will choose the custom IKM we've built, and I will make sure that DROP_BACKUP_TBL option and CREATE_BACKUP_TBL options are selected as true (which it should be by default) and Truncate as True.

Flow

It is time to execute the interface and check the steps.

Execution steps
As you can observe, Drop Backup Table has a warning because a previous table with _BKP name did not exist, and then Create Backup Table executed correctly.

If I run it again, it won't show any warning on Drop Backup Table because now a _BKP table exists:

Execution steps





Wednesday, March 4, 2020

How to perform Group By in ODI11g interfaces

Aggregation is a must in Data Warehousing and Data Integration, as of ODI 12c there exists a designated tool for Group by operations, but in ODI 11g you have to perform a workaround.

Aggregate functions in SQL include:

MIN
MAX
SUM
COUNT
AVG

To perform a query with the above functions, a group by statement has to be used in order to aggregate data.


I will be using a random source file created randomly with ID and different numerical attributes:


Source file set


And a target table created manually in Oracle with identical attributes such as source file:

Target table


Now I will create an interface (GROUP_BY_EXP) which uses the data store's created from above file and table:

Initial Interface


Assuming that I want to find the SUM of Height grouped by ID only, typically this will be achieved with this query:


Mapping only ID and Height in Mappings expressions this is pretty easy:
Mapping Expressions


Executing this interface will generate this code, which indeed is exactly what we expect it to generate:
Code generated in first run

What if we map all of the columns on the target table? And we need the same task Sum(Height) and Group by ID? 







Let's firstly map all of the columns:
Mapping Expressions
Let's execute it:

Code generated in second run

Now this brings an issue because in our case we need only ID used in Group By expression.

In order to avoid this, each of unneeded columns in group by statement: NAME, LENGTH, WIDTH should have aggregate function used in expression, if that does not pose a problem in whatever you are using, then encapsulate each column with MIN/MAX:


Mapping Expressions
Let's execute it:

Code generated in third run
As you can see NAME, HEIGHT and WIDTH are removed from Group By statement whenever there's an aggregate function used on them.

Note: This is a workaround, data should be analyzed after the execution due to the fact that MIN/MAX functions might generate data which can cause trouble on your business logic.