5)
JKM Journalization Knowledge module:
This JKM we are using CDC Change data capture
feature. If we are using CDC that time we can go for this JKM.
6)SKM
(Service Knwoledge Module):
This SKM we are using for public data
webservices.
like SOA or other public webserivces products.
INTERFACE:
Interface is object for mapping source and
target datastores(tables) and we will use interface for loading data from
source to target databases.
In interface below Knowledge modules we will
use.
1)LKM
2)IKM
3)CKM
Note:
In ODI 11g we will call Interface and ODI 12c we will call mapping.
Creating New Interface:
Goto=>Designer=>Projects=>NEw
Project=>Folder=>Interface=>Right Click=> New Interface.
Before creating interface first we need to
identify KM's based source and target databases.
source : Oracle
Target : Oracle
LKM:
LKM SQL to SQL
IKM: IKM SQL Incremental update or IKM SQL
Control Append
CKM:
CKM SQL
we need to import all above knowledge modules
in your project.
Right Click on Knowledge Modules=> Import
Knowledge Modules.
All KM's XML files will be available in below
ODI installed location.
C:\Oracle\product\11.1.1\Oracle_ODI_1\oracledi\xml-reference
Create Interface:
Name: User defined EMP_INTF
Optimization Context:we need to select
development server context.
Select Target Logical Schema if you are using
staging in target. if you to maintain staging other than target you have to
select that logical schema.
Here we are maintaing staging in target bcz of
that we are selecting target logical schema.
Goto=>Mapping Tab.
Here we can see two panels. left side will be
for source tables and right side will be for target tables.
Drag n drop target table from target model to
interface right side target datastore area.
Drag and drop source table from source model
to interface left sdie source datastore area.
KM's Selection:
Goto=>FLow tab=>Click on Source
Diagram=>Select LKM SQL to SQL in properties window.
Click on Target Diagram =>Select IKM SQL
Control Append in property window.
Goto=> Controls Tab=>Select CKM SQL
Enable or Diable Flow Control:
Goto=>Flow Tab=> click on Target
Diagram=> Goto Property Inspector
select FLOW_CONTROL: TRUE/FALSE.
if you enable FLOW_CONTROL: TRUE target table
should contain PK or SK
Interface
practice with example:
Interface
from File to SQL (Oracle):
Step1:
Identifying KM's.
1)LKM File to SQL
2)IKM SQL Incremental update or control append
3)CKM SQL or Oracle
step2:
Creating Data server and physical schema for
File Technology.
Goto=>Topology=>Physical
Architecture=>Technologies=>File=>Right click =>New Data Server
NAME: SOURCE_DS
username/password : not required for file
Goto=>JDBC Tab=> Select ODI FILE jdbc
driver
JDBC URL: jdbc:snps:dbfile
Creating Physical Schema for above file Data
Server.
Right Click on SOURCE_DS=>New Physical
Schema.
Schema NAme should be your source file folder
path including folder name :
C:\SOURCE_FILES
Step3:
Creating Logical Schema.
Goto=>Logical
Architecture=>Technologies=>File=>Right Click=>New Logical Schema.
NAme:
SOURCE_FILE_LS
Step4:
Creating Model for File Technology.
Goto=>Designer=>Models=>New
Model=>
Name: SOURCE_FILE_MDL
Technology: File
Logical Schema: SOURCE_FILE_LS
Save.
Note: For File technologies we are not doing
direct reverse engineering. we have to add manually each file (datastore).
Adding File
in Model.
Right Click on Model=>New DataStore
NAme: EMPF
Browse =>Resource NAme=> Select source
file from source folder.
Goto=>Files Tab
Select File Format: Delimited
Header No of rows: 1 ( header means columns).
Field Seperator: Others (,)
Goto=> Columns Tab=> click on Reverse
Engineering.
Save File.
Step5:
Creating Interface for loading data from file
to sql.
Projects=>Interfaces=>Right
Click=>New Interface.
NAme: FILE_TO_SQL_INTF.
Select Target Logical Schema.
Goto=>Mapping Tab=>Drag n drop Target
table into target mapping area.
Source file into source mapping area.
If source file data having any spaces we have
to trim function in mapping.
use TRIM() function in all mappings.
TRIM() its will remove both side spaces
LTRIM() it will remove left side spaces
RTRIM() it will remove right side spaces.
Goto=> Flow Tab=>Click on Source
Diagram=> Select LKM File to SQL
Click on Target Diagram select IKM SQL Control
append or Incremental Update.
Save your interface and run.
after running we can see data in target table.
EXCEL TO DATABASE (SQL):
Step1: Creating source xls data file
open excel file and change work book name as
EMP
and select all data with columns and goto
Formulas=>Define name=>
Select workbook=>Define name as EMP.
Save
Step2: Creating System DSN for excel
goto=>Control Panel=>Administrative
tools=>ODBC Data sources=>Click
Goto=>System DSN Tab=>Click Add
select Microsoft Excel driver(xls.,xlsx,....)
DSN (Data source name): XL_DSN
Select workbook (select emp excel file) and
click on ok.
step3: Creating Data server for Excel
Goto=>Topology=>Physical
Architecture=>Microsoft Excel=>Right click =>NEW Dataserver.
Name:
XL_DS
Goto=> JDBC
Tab=>Select ODBC JDBC bridge driver
In JDBC URL Profile created System DSN: XL_DSN
Creating Physical Schema:
Right click Created
Dataserver=>XL_DS=>New Physical Schema
Select Default schema and save.
Creating Logical Schema:
Goto=>Logical
Architecture=>Technologies=>Mircosoft Excel=>Right click=>
New Logical Schema.
Name: XL_LS
Assign Physical schemas to all contexts.
GLOBAL=>XL_DS.Defaultschema
TEST=>XL_DS.DefaultSchema
PROD=>XL_DS.DefualtSchema
step4: Creating Model For Excel:
Goto=>Designer =>Models=>New Model
NAme:
XL_MDL
Technology: Mircosoft Excel
Logical Schema: XL_LS
Save
Goto=>Reverse Engineering Tab=>Select
Objects to reverse engineer as
SYSTEM TABLE
Note:
For excel and xml data files we need to select object type as SYSTEM
TABLE
for reverse engineering.
Click on Reverse engineering button. it will
import excel sheet into models with name as EMP that we are provided in Excel
Defiine name in formulas tab.
Knowledge modules for EXCEL TO DATABASE:
1) LKM SQL to SQL
2) IKM SQL or Oracle Control append
3)CKM SQL or Oracle
Note:
For excel and xml we are not using File type KM's we have to use
SQL KM's only.
Note:
SQL To FILE
while loading data from sql(database) to file
stagning should not be in target
stagning always should be in other than
target. stagning we can maintain in source or other databases not in target.
KM's identification for SQL to FILE:
1) LKM SQL to SQL ( source sql and stagning
sql)
2) IKM SQL to FILE Append ( Stagning sql and
target file)
kM's Identifiecation for FILE TO SQL:
1) LKM FILE to SQL ( Source file and stagning
sql)
2) IKM SQL control append ( Target is sql)
KM's identification for File to File:
1) LKM File to SQL ( Source file and stagning
sql)
2) IKM SQL to FILE Append ( Stagning sql and
target file)
CDC (Change Data Capture):
CDC we are using to capture only changed data
from source tables. For implementing CDC
we need to use JKM(Journalization Knowledge
Module).
This CDC will work for only Database Tables
not for files, excel and xml files.
Note: If we want to implement CDC on source
table ,source table should satify below requirement.
1)Source table should contain Primary key
2)Source table should allow to created DML
Triggers on that table.
What is DML PLSQL Triggers: Whenver event occuers (Delete,insert or
update) on
source table this trigger will fire and it
will insert modified rows into our stagning tables.
Enable CDC
at Table level in ODI:
step1: There are two types Journalizatioin
1)Simple
2)Consistent set
JKM simple we will use individual tables
without data dependency.
JKM Consistent set we will use for set of
tables with having data dependency. like
parent and child tables.
there are two JKM's will be available if
simple we can go for JKM ORacle Simple
if it is Consistent we can go for JKM Oracle
Consistent.
Importing JKM Oracle Simple.
step2: Enable CDC at Table level in source Model
Goto=>Designer=>Source Model=>Right
click on EMP Table=>Change Data Capture=>
select Add To CDC
step3: After
adding CDC then we need to start journal.
before starting Journal we need to enable JKM
at Model level.
Open Source Model=>goto Journalizing
Tab=>Select simple=>Select JKM Oracle Simple.
Right click on EMP=>Change Data
Capture=>Start Journal.
In this Journalization process it will create
below objects in Source schem for EMP table.
1) T$EMP DML trigger on Source emp table
2)J$EMP table for maintain changed data
3)JV$DEMP
view it will create on both EMP and J$EMP tables.
step4: Enabling CDC at your Interface.
Goto=> Interface=>Source mapping
datastore area=>Click on source emp table=>
goto Property inspector=>Select
Journalization Data only.
After enable this journalization flag at
interface it will select data from
JV$DEMP view.
this JV$DEMP having is selecting data from
both EMP and J$EMP tables.
this view will return only modified data.
After finishing your interface it will delete
data from J$table.
Note:
Whenever changes happing on source table this DML trigger will fire and
if it is update or delete or inserting on source that data trigger will insert
into J$EMP table
JNR_FLAG='I' or 'U' or 'D' and
JNR_DATE=modified date and PK Column vlaue EMPNO.
while running our interface it will select
data from JV$DVIEW, this view having only
modified data. after loading data into target
table it will delete from J$EMP table.
SCD (Slowly Changing Dimension):
OLTP: Online transaction processing
if any database having day to day transactions
and update or inserting data frequently in database that database we will call
OLTP Database. here only normlised tables will be there.
OLAP: Online analytics or analysis processing:
If any database having or maintaing history
data and they are using for only
data analysis (reporting) not for manipulation
(transaction insert or update ) that type
of databases we are calling OLAP Database or
Warehouse.
In OLAP
database tables, we will call different names.
1)Dimension tables
2)Fact tables
1)Dimention tables: if any table maintaining master data (primary
key information) or parent data that type of we will call dimension tables.
2) Fact Tables: if any table maintaining
summary data (aggregated data) or child table.
that type of tables we will call facts tables.
Note:
In OLTP we will call PK(parent) and FK9child) Tables and in OLAP we will
call Dimention (parent having PK) and Fact tables (child having FK).
In datawarehousing there are different types
of dimensions available. i,e., one type
we are going to discuss Slowly Changing
Dimention.
SLOWLY CHANING DIMENTION:
If any dimention table data getting changed
that type of dimentions we will
slowly chaning dimention.
there are three type of SCD.
1) SCD TYPE1:
2)SCD TYPE2
3) SCD TYPE3:
Table:
EMP
ENAME
D.O.B D.O.J Experice Address SAL
RAM
13-Jun-1984 05-Jyl-2011 3.5 5 lac p.a
he got a hike in 2012. how we will maintain
this data in three types of dimentions.
1) SCD TYPE1: This type 1 we will use if we dont required
history for particular dimention table that time we will go for SCD Type 1. SCD
Type 1 will alwasy overwrite
data if we are receiving changed data.
Table: EMP ( he got hike 5 to 6 lack in 2012
ENAME D.O.B D.O.J Experice Address
SAL modifieddate
RAM 13-Jun-1984 05-Jyl-2011 3.5 6 lac p.a 05-jul-12
Note: TYPE1 will alwasy overwrite data in
existig data. it wont maintain history.
if you dont want history we can go for type1.
2)SCD TYPE2: This type 2 we are using to maintain history
with no of times. type 2
will maintain history with no of times
whenever you are gettig changed information and it will insert as new record.
example if one employee got hike in 10 times it will insert
10 rows in employee table.
If we want to maintain scd type 2 that table
should contain below additional columns.
1)Timestamp startdate
2)timestamp enddate
3)Active reocrd flag
Table: EMP ( he got hike 5 to 6 lack in 2012
ENAME D.O.B D.O.J Experice Address SAL
t_startdate t_enddate
active_flag
RAM 13-Jun-1984
05-Jyl-2011
3.5 5 lac
p.a 05-jul-11
01-01-2100 Y
2012 he got hike. so scd type we will insert
new record and we will update active_flag='N' for and endingtimestamp we will
update
as enddate.
ENAME D.O.B D.O.J Experice Address SAL
t_startdate t_enddate active_flag
RAM 13-Jun-1984 05-Jyl-2011
3.5 5 lac
p.a 05-jul-11
04-jul-2012
N
RAM 13-Jun-1984 05-Jyl-2011
4.5 6 lac
p.a 05-jul-12
01-01-2100 Y
againg he got hike in 2013
ENAME D.O.B
D.O.J Experice Address SAL
t_startdate t_enddate active_flag
RAM 13-Jun-1984 05-Jyl-2011
3.5 5 lac
p.a 05-jul-11 04-jul-2012 N
RAM 13-Jun-1984 05-Jyl-2011
4.5 6 lac
p.a 05-jul-12 04-jul-2013 N
RAM 13-Jun-1984 05-Jyl-2011
5.5 7 lac
p.a 05-jul-13 01-01-2100 Y
like this whenever data getting changed it
will insert new row and it will update old records status as N (inactive) and
endingtimestamp as enddate.
3)SCD Type3:
this type3 we are using for if we want to
maintain history only one time. this type 3 can maintain history only one time
and
not a new row (record) it will maintain in
same row(record) with additional column. for example if we want maintain
2 columns data for one time history we to
create adidional two columns in that table.
Table: EMP
ENAME D.O.B D.O.J Experice Address PREV_SAL
CUR_SAL modified_date
RAM 13-Jun-1984 05-Jyl-2011
3.5 0 5 lac p.a 05-jul-11
he got a hike in 2012.
Table: EMP
ENAME D.O.B D.O.J Experice Address PREV_SAL CUR_SAL modified_date
RAM 13-Jun-1984 05-Jyl-2011 4.5 5 lac p.a
6 lac p.a 05-jul-12
he got a hike in 2013.
Table: EMP
ENAME D.O.B D.O.J Experice Address PREV_SAL CUR_SAL modified_date
RAM 13-Jun-1984
05-Jyl-2011
3.5 6 lac p.a 7 lac p.a 05-jul-13
In ODI we have separate IKM for SCD: IKM Oracle Slowly Changing Dimention.
SCD
Properties:
1) Surrogate key (PK): we need one PK key on
SCD table
2) Overwrite on Change: whcih columns dont
need history we will use this for that column.
3) Add row on change: which column need a history data what columns
we will use this . TYPE2.
4) Starting timestamp: SCD TYPE 2 only we will use
5) Ending Timestamp: SCD Type 2 only we will use
6) Current record Flag: SCD Type 2 only we will use.
Steps for implementing SCD TYPE (1 & 2) in
ODI.
Step1:
for training i am creating new table for
supporting scd type 2 with below
additional columns.
SCD2:
------
CREATE TABLE SCD2
(
EMPNO NUMBER(4, 0) ,
ENAME VARCHAR2(10 BYTE) ,
JOB VARCHAR2(9 BYTE) ,
MGR NUMBER(4, 0) ,
HIREDATE DATE ,
SAL NUMBER(7, 2) ,
COMM NUMBER(7, 2) ,
DEPTNO NUMBER(2, 0) ,
START_DATE
DATE ,
END_DATE DATE ,
ACTIVE_FLAG VARCHAR2(1 BYTE)
) ;
After creating this table we have to do
reverse engineering.
Step2: Changing OLAP Table Type as : Slowly Changing Dimention
and enabling SCD behaviour at each column in
Model=>SCD Table=>Column=>Description tab.
enable all scd behaviour for all columns it
should not leave empty.
EMPNO:
Natural Key
ENAME:
Overwrite on change
JOB:
Overwrite on change
MGR : overwrite on change
hiredate:
overwrite on change
sal: add row on change
comm:
overwrite on change
deptno :
overwrite on change
start_date: starting timestamp
end-date:
ending timestamp
active_falg:
current record flag
this all properties we need to enable at
Model=>Target Model=>SCD2 table=>attributes(columns)=>Open
column=>
goto Description tab=>Select Slowly
changing dimention behaviour.
step3: Imporing IKM Slowloy changing dimention.
step4: creating
interface.
No comments:
Post a Comment