Wednesday, December 24, 2014

Oracle Data Integrator Interview Quetions and Answers part 2



5) JKM Journalization Knowledge module:
This JKM we are using CDC Change data capture feature. If we are using CDC that time we can go for this JKM.

6)SKM (Service Knwoledge Module):
This SKM we are using for public data webservices.
like SOA or other public webserivces products.

INTERFACE:
Interface is object for mapping source and target datastores(tables) and we will use interface for loading data from source to target databases.
In interface below Knowledge modules we will use.
1)LKM
2)IKM
3)CKM
Note:  In ODI 11g we will call Interface and ODI 12c we will call mapping.

Creating New Interface:
Goto=>Designer=>Projects=>NEw Project=>Folder=>Interface=>Right Click=>  New Interface.

Before creating interface first we need to identify KM's based source and target databases.
source : Oracle
Target : Oracle
LKM:  LKM SQL to SQL
IKM: IKM SQL Incremental update or IKM SQL Control Append
CKM:  CKM SQL
we need to import all above knowledge modules in your project.

Right Click on Knowledge Modules=> Import Knowledge Modules.

All KM's XML files will be available in below ODI installed location.
C:\Oracle\product\11.1.1\Oracle_ODI_1\oracledi\xml-reference

Create Interface:
Name: User defined EMP_INTF
Optimization Context:we need to select development server context.

Select Target Logical Schema if you are using staging in target. if you to maintain staging other than target you have to select that logical schema.
Here we are maintaing staging in target bcz of that we are selecting target logical schema.

Goto=>Mapping Tab.
Here we can see two panels. left side will be for source tables and right side will be for target tables.

Drag n drop target table from target model to interface right side target datastore area.
Drag and drop source table from source model to interface left sdie source datastore area.
KM's Selection:
Goto=>FLow tab=>Click on Source Diagram=>Select LKM SQL to SQL in properties window.
Click on Target Diagram =>Select IKM SQL Control Append in property window.
Goto=> Controls Tab=>Select CKM SQL

Enable or Diable Flow Control:
Goto=>Flow Tab=> click on Target Diagram=> Goto Property Inspector
select FLOW_CONTROL: TRUE/FALSE.

if you enable FLOW_CONTROL: TRUE target table should contain PK or SK

Interface practice with example:
Interface from File to SQL (Oracle):
Step1:
Identifying KM's.
1)LKM File to SQL
2)IKM SQL Incremental update or control append
3)CKM SQL or Oracle

step2:
Creating Data server and physical schema for File Technology.

Goto=>Topology=>Physical Architecture=>Technologies=>File=>Right click =>New Data Server

NAME: SOURCE_DS
username/password : not required for file
Goto=>JDBC Tab=> Select ODI FILE jdbc driver
JDBC URL: jdbc:snps:dbfile

Creating Physical Schema for above file Data Server.

Right Click on SOURCE_DS=>New Physical Schema.

Schema NAme should be your source file folder path including folder name :  C:\SOURCE_FILES

Step3:
Creating Logical Schema.
Goto=>Logical Architecture=>Technologies=>File=>Right Click=>New Logical Schema.

NAme:  SOURCE_FILE_LS


Step4:
Creating Model for File Technology.

Goto=>Designer=>Models=>New Model=>

Name: SOURCE_FILE_MDL
Technology: File
Logical Schema: SOURCE_FILE_LS

Save.

Note: For File technologies we are not doing direct reverse engineering. we have to add manually each file (datastore).

Adding File  in Model.

Right Click on Model=>New DataStore
NAme: EMPF
Browse =>Resource NAme=> Select source file from source folder.

Goto=>Files Tab
Select File Format: Delimited
Header No of rows:  1 ( header means columns).

Field Seperator:  Others (,)

Goto=> Columns Tab=> click on Reverse Engineering.
Save File.

Step5:
Creating Interface for loading data from file to sql.

Projects=>Interfaces=>Right Click=>New Interface.

NAme: FILE_TO_SQL_INTF.
Select Target Logical Schema.

Goto=>Mapping Tab=>Drag n drop Target table into target mapping area.
Source file into source mapping area.

If source file data having any spaces we have to trim function in mapping.

use TRIM() function in all mappings.
TRIM() its will remove both side spaces
LTRIM() it will remove left side spaces
RTRIM() it will remove right side spaces.

Goto=> Flow Tab=>Click on Source Diagram=> Select LKM File to SQL
Click on Target Diagram select IKM SQL Control append or Incremental Update.
Save your interface and run.
after running we can see data in target table.




EXCEL TO DATABASE (SQL):
Step1: Creating source xls data file
open excel file and change work book name as EMP
and select all data with columns and goto Formulas=>Define name=>
Select workbook=>Define name as EMP.
Save

Step2: Creating System DSN for excel
goto=>Control Panel=>Administrative tools=>ODBC Data sources=>Click
Goto=>System DSN Tab=>Click Add
select Microsoft Excel driver(xls.,xlsx,....)
DSN (Data source name):  XL_DSN
Select workbook (select emp excel file) and click on ok.

step3:  Creating Data server for Excel
Goto=>Topology=>Physical Architecture=>Microsoft Excel=>Right click =>NEW Dataserver.
Name:  XL_DS
Goto=> JDBC  Tab=>Select ODBC JDBC bridge driver
In JDBC URL Profile created System DSN:  XL_DSN

Creating Physical Schema:
Right click Created Dataserver=>XL_DS=>New Physical Schema
Select Default schema and save.

Creating Logical Schema:
Goto=>Logical Architecture=>Technologies=>Mircosoft Excel=>Right click=>
New Logical Schema.
Name: XL_LS

Assign Physical schemas to all contexts.
GLOBAL=>XL_DS.Defaultschema
TEST=>XL_DS.DefaultSchema
PROD=>XL_DS.DefualtSchema

step4: Creating Model For Excel:
Goto=>Designer =>Models=>New Model
NAme:  XL_MDL
Technology: Mircosoft Excel
Logical Schema:  XL_LS
Save
Goto=>Reverse Engineering Tab=>Select Objects to reverse engineer as
SYSTEM TABLE

Note:  For excel and xml data files we need to select object type as SYSTEM TABLE
for reverse engineering.

Click on Reverse engineering button. it will import excel sheet into models with name as EMP that we are provided in Excel Defiine name in formulas tab.


Knowledge modules for EXCEL TO DATABASE:
1) LKM SQL to SQL
2) IKM SQL or Oracle Control append
3)CKM SQL or Oracle

Note:  For excel and xml we are not using File type KM's we have to use
SQL KM's only.


Note:  SQL To FILE
while loading data from sql(database) to file stagning should not be in target
stagning always should be in other than target. stagning we can maintain in source or other databases not in target.

KM's identification for SQL to FILE:
1) LKM SQL to SQL ( source sql and stagning sql)
2) IKM SQL to FILE Append ( Stagning sql and target file)

kM's Identifiecation for FILE TO SQL:
1) LKM FILE to SQL ( Source file and stagning sql)
2) IKM SQL control append ( Target is sql)

KM's identification for File to File:
1) LKM File to SQL ( Source file and stagning sql)
2) IKM SQL to FILE Append ( Stagning sql and target file)


CDC (Change Data Capture):
CDC we are using to capture only changed data from source tables. For implementing CDC
we need to use JKM(Journalization Knowledge Module).

This CDC will work for only Database Tables not for files, excel and xml files.


Note: If we want to implement CDC on source table ,source table should satify below requirement.
1)Source table should contain Primary key
2)Source table should allow to created DML Triggers on that table.

What is DML PLSQL Triggers:  Whenver event occuers (Delete,insert or update) on
source table this trigger will fire and it will insert modified rows into our stagning tables.

Enable CDC at Table level in ODI:
step1:  There are two types Journalizatioin
1)Simple
2)Consistent set

JKM simple we will use individual tables without data dependency.

JKM Consistent set we will use for set of tables with having data dependency. like
parent and child tables.
there are two JKM's will be available if simple we can go for JKM ORacle Simple
if it is Consistent we can go for JKM Oracle Consistent.

Importing JKM Oracle Simple.



step2:  Enable CDC at Table level in source Model
Goto=>Designer=>Source Model=>Right click on EMP Table=>Change Data Capture=>
select Add To CDC

step3: After adding CDC then we need to start journal.
before starting Journal we need to enable JKM at Model level.
Open Source Model=>goto Journalizing Tab=>Select simple=>Select JKM Oracle Simple.
Right click on EMP=>Change Data Capture=>Start Journal.

In this Journalization process it will create below objects in Source schem for EMP table.

1) T$EMP DML trigger on Source emp table
2)J$EMP table for maintain changed data
3)JV$DEMP  view it will create on both EMP and J$EMP tables.

step4:  Enabling CDC at your Interface.

Goto=> Interface=>Source mapping datastore area=>Click on source emp table=>
goto Property inspector=>Select Journalization Data only.

After enable this journalization flag at interface it will select data from  JV$DEMP view.

this JV$DEMP having is selecting data from both EMP and J$EMP tables.
this view will return only modified data.
After finishing your interface it will delete data from J$table.

Note:  Whenever changes happing on source table this DML trigger will fire and if it is update or delete or inserting on source that data trigger will insert into J$EMP table
JNR_FLAG='I' or 'U' or 'D' and JNR_DATE=modified date and PK Column vlaue EMPNO.
while running our interface it will select data from JV$DVIEW, this view having only
modified data. after loading data into target table it will delete from J$EMP table.




SCD (Slowly Changing Dimension):

OLTP: Online transaction processing
if any database having day to day transactions and update or inserting data frequently in database that database we will call OLTP Database. here only normlised tables will be there.

OLAP: Online analytics or analysis processing:
If any database having or maintaing history data and they are using for only
data analysis (reporting) not for manipulation (transaction insert or update ) that type
of databases we are calling OLAP Database or Warehouse.
 In OLAP database tables, we will call different names.
1)Dimension tables
2)Fact tables

1)Dimention tables:  if any table maintaining master data (primary key information) or parent data that type of we will call dimension tables.
2) Fact Tables: if any table maintaining summary data (aggregated data) or child table.
that type of tables we will call facts tables.

Note:  In OLTP we will call PK(parent) and FK9child) Tables and in OLAP we will call Dimention (parent having PK) and Fact tables (child having FK).

In datawarehousing there are different types of dimensions available. i,e., one type
we are going to discuss Slowly Changing Dimention.

SLOWLY CHANING DIMENTION:
If any dimention table data getting changed that type of dimentions we will
slowly chaning dimention.
there are three type of SCD.
1) SCD TYPE1:
2)SCD TYPE2
3) SCD TYPE3:

Table:  EMP
ENAME  D.O.B              D.O.J               Experice Address       SAL
RAM     13-Jun-1984  05-Jyl-2011    3.5                         5 lac p.a
he got a hike in 2012. how we will maintain this data in three types of dimentions.

1) SCD TYPE1:  This type 1 we will use if we dont required history for particular dimention table that time we will go for SCD Type 1. SCD Type 1 will alwasy overwrite
data if we are receiving changed data.
Table:  EMP  ( he got hike 5 to 6 lack in 2012
ENAME  D.O.B              D.O.J               Experice  Address       SAL        modifieddate
RAM     13-Jun-1984     05-Jyl-2011    3.5                         6 lac p.a  05-jul-12

Note: TYPE1 will alwasy overwrite data in existig data.  it wont maintain history.
if you dont want history we can go  for type1.

2)SCD TYPE2:  This type 2 we are using to maintain history with no of times. type 2
will maintain history with no of times whenever you are gettig changed information and it will insert as new record. example if one employee got hike in 10 times it will insert
10 rows in employee table.
If we want to maintain scd type 2 that table should contain below additional columns.
1)Timestamp startdate
2)timestamp enddate
3)Active reocrd flag

Table:  EMP  ( he got hike 5 to 6 lack in 2012
ENAME   D.O.B              D.O.J          Experice Address       SAL      t_startdate   t_enddate     active_flag
RAM        13-Jun-1984   05-Jyl-2011    3.5                         5 lac p.a   05-jul-11         01-01-2100    Y
2012 he got hike. so scd type we will insert new record and we will update active_flag='N' for and endingtimestamp we will update
as enddate.
ENAME   D.O.B               D.O.J          Experice Address   SAL        t_startdate   t_enddate         active_flag
RAM     13-Jun-1984  05-Jyl-2011    3.5                         5 lac p.a   05-jul-11     04-jul-2012          N
RAM     13-Jun-1984  05-Jyl-2011    4.5                         6 lac p.a   05-jul-12      01-01-2100         Y
againg he got hike in 2013
ENAME    D.O.B              D.O.J          Experice Address    SAL        t_startdate   t_enddate         active_flag
RAM     13-Jun-1984  05-Jyl-2011    3.5                         5 lac p.a  05-jul-11       04-jul-2012         N
RAM     13-Jun-1984  05-Jyl-2011    4.5                         6 lac p.a  05-jul-12       04-jul-2013         N
RAM     13-Jun-1984  05-Jyl-2011    5.5                         7 lac p.a  05-jul-13       01-01-2100         Y
like this whenever data getting changed it will insert new row and it will update old records status as N (inactive) and endingtimestamp as enddate.

3)SCD Type3:
this type3 we are using for if we want to maintain history only one time. this type 3 can maintain history only one time and
not a new row (record) it will maintain in same row(record) with additional column. for example if we want maintain
2 columns data for one time history we to create adidional two columns in that table.
Table:  EMP 
ENAME    D.O.B              D.O.J             Experice Address    PREV_SAL     CUR_SAL        modified_date
RAM     13-Jun-1984      05-Jyl-2011       3.5                            0                 5 lac p.a        05-jul-11 
he got a hike in 2012.
Table:  EMP 
ENAME    D.O.B                D.O.J               Experice Address PREV_SAL       CUR_SAL        modified_date
RAM         13-Jun-1984    05-Jyl-2011         4.5                        5 lac p.a          6 lac p.a         05-jul-12 
he got a hike in 2013.
Table:  EMP 
ENAME    D.O.B              D.O.J               Experice Address    PREV_SAL       CUR_SAL        modified_date
RAM        13-Jun-1984     05-Jyl-2011    3.5                             6 lac p.a           7 lac p.a           05-jul-13 

In ODI we have separate IKM for SCD:  IKM Oracle Slowly Changing Dimention.

SCD Properties:
1) Surrogate key (PK): we need one PK key on SCD table
2) Overwrite on Change: whcih columns dont need history we will use this for that column.
3) Add row on change:  which column need a history data what columns we will use this . TYPE2.
4) Starting timestamp:  SCD TYPE 2 only we will use
5) Ending Timestamp:  SCD Type 2 only we will use
6) Current record Flag:  SCD Type 2 only we will use.

Steps for implementing SCD TYPE (1 & 2) in ODI.

Step1:
for training i am creating new table for supporting scd type 2  with below additional columns.
SCD2:
------
CREATE TABLE SCD2
  (
    EMPNO       NUMBER(4, 0) ,
    ENAME       VARCHAR2(10 BYTE) ,
    JOB         VARCHAR2(9 BYTE) ,
    MGR         NUMBER(4, 0) ,
    HIREDATE    DATE ,
    SAL         NUMBER(7, 2) ,
    COMM        NUMBER(7, 2) ,
    DEPTNO      NUMBER(2, 0) ,
    START_DATE  DATE ,
    END_DATE    DATE ,
    ACTIVE_FLAG VARCHAR2(1 BYTE)
  ) ;

After creating this table we have to do reverse engineering.


Step2:   Changing OLAP Table Type as :   Slowly Changing Dimention
and enabling SCD behaviour at each column in Model=>SCD Table=>Column=>Description tab.

enable all scd behaviour for all columns it should not leave empty.
EMPNO:  Natural Key
ENAME:  Overwrite on change
JOB:  Overwrite on change
MGR : overwrite on change
hiredate:  overwrite on change
sal: add row on change
comm:  overwrite on change
deptno :  overwrite on change
start_date: starting timestamp
end-date:  ending timestamp
active_falg:  current record flag

this all properties we need to enable at Model=>Target Model=>SCD2 table=>attributes(columns)=>Open column=>
goto Description tab=>Select Slowly changing dimention behaviour.

step3:  Imporing IKM Slowloy changing dimention.
step4: creating interface.



No comments:

Post a Comment