ODI as a tool offers data quality assurance through various scopes.
Today I will demonstrate how to add ODI level constraints and isolate errors which violate a constraint, in error tables that ODI creates automatically (known as E$ tables).
In principle, ODI has 2 kind of data controls known as:
- Flow Control - which validates data before inserting into a target table (validation is done within I$ tables). What happens is that it leaves only the records which do not violate a constraint into the target table.
- Static Control - Validation occurs after inserted in target table, though insertion is done in E$ table for validating records but they won't be deleted from target table.
Enabling these controls in ODI, one must use the IKM's Flow Control/Static Control options of an IKM in the Flow tab of an interface.
But first we need to define what constraints do columns have in order to apply Flow Control.
Todays scenario:
- Populate a table in which the primary key of the table cannot be null (ID).
- Column height cannot be more than 200
- Column date has to be in format YYYYMMDD (i.e 20191101)
Let's dig into the problem.
Create an interface which populates a table consisting in my case columns mentioned above.
I will be reading a .txt source file which is tab delimited. Data populated in the target table looks like this:
Area hand circled with red, marks the violations which will be declared in the datastore of the target table now. We can observe that the ID is null, Height is bigger that 200 and DATE is not in format of YYYYMMDD.
First let us write the SQL statements which validate the conditions:
As per ID: WHERE ID IS NOT NULL will do the work
As per Height: WHERE to_number(HEIGHT) > 200
As per Date: REGEXP_LIKE(DATE, '^\d{4}(0[1-9]|(1[0-2]))(0[1-9]|[1-2][0-9]|3[0-1])$')
Navigate to Quick-Edit tab and select CN and UK (Check Not Null) for ID:
Note: UK has to be selected for the primary key because flow control requires one.
Now we need to create constraints in target table's datastore. Navigate to: Designer - Models - Model where your datastore resides - Right Click Constraints - New Condition:
In my case, I created 2 Conditions
1. CHECK_HEIGHT
2. CHECK_DATE
Given the names, I also choose Type: Oracle Data Integrator Condition
In the Where area: paste the SQL expressions.
Message: the error message which you want to prompt
Note: In Control tab on top left corner Flow and Static Control can be allowed for these constraints. I have checked both, but in my case I will be using only Flow Control.
Save the constraints and now open the interface, go to Flow tab, select IKM_SQL_Control_Append and mark the option FLOW_CONTROL to True.
Under Control Tab next to Flow you can select the CKM which does the controlling.
Besides selecting the CKM it allows you to disable or enable specific constraints (True or False).

Also the maximum number of errors allowed can be specified, meaning that if that number is exceeded, interface will not throw warnings instead it will throw an error!
Run the interface and check the Operator:
Besides selecting the CKM it allows you to disable or enable specific constraints (True or False).
Also the maximum number of errors allowed can be specified, meaning that if that number is exceeded, interface will not throw warnings instead it will throw an error!
Run the interface and check the Operator:
In DBMS you can see that a new table has been created in the schema, the E$ table consists of the constraint violating records:
Only one record did not violate the constraint and that record is inserted into the target table:


No comments:
Post a Comment