Tuesday, March 20, 2012

Oracle Data Integrator (ODI) - Frequently Asked Questions (FAQ)

1) What is Oracle Data Integrator (ODI)?
Oracle acquired Sunopsis in 2006 and with it "Sunopsis Data Integrator".

Oracle Data Integrator (ODI) is an E-LT (Extract, Load and Transform) tool used for high-speed data movement between disparate systems.

The latest version, Oracle Data Integrator Enterprise Edition (ODI-EE) brings together "Oracle Data Integrator" and "Oracle Warehouse Builder" as separate components of a single product with a single licence.


2) What is E-LT?
E-LT is an innovative approach to extracting, loading and Transforming data. Typically ETL application vendors have relied on costly heavyweight , mid-tier server to perform the transformations required when moving large volumes of data around the enterprise.

ODI delivers unique next-generation, Extract Load  and Transform (E-LT) technology that improves performance and reduces data integration costs, even across heterogeneous systems by pushing the processing required down to the typically large and powerful database servers already in place within the enterprise.


3) What components make up Oracle Data Integrator?
"Oracle Data Integrator" comprises of:

        - Oracle Data Integrator + Topology Manager + Designer + Operator + Agent
        - Oracle Data Quality for Data Integrator
        - Oracle Data Profiling


4) What is Oracle Data Integration Suite?


Oracle data integration suite is a set of data management applications for building, deploying, and managing enterprise data integration solutions:
  • Oracle Data Integrator Enterprise Edition
  • Oracle Data Relationship Management
  • Oracle Service Bus (limited use)
  • Oracle BPEL (limited use)
  • Oracle WebLogic Server (limited use)
Additional product options are:

  • Oracle Goldengate
  • Oracle Data Quality for Oracle Data Integrator (Trillium-based DQ)
  • Oracle Data Profiling (Trillium based Data Profiling)
  • ODSI (the former Aqualogic Data Services Platform)

5) What systems can ODI extract and load data into?

ODI brings true heterogeneous connectivity out-of-the-box, it can connect natively to Oracle, Sybase, MS SQL Server, MySQL, LDAP, DB2, PostgreSQL, Netezza.

It can also connect to any data source supporting JDBC, its possible even to use the Oracle BI Server as a data source using the jdbc driver that ships with BI Publisher


6) What are Knowledge Modules?

Knowledge Modules form the basis of 'plug-ins' that allow ODI to generate the relevant execution code , across technologies , to perform tasks in one of six areas, the six types of knowledge module consist of:
  • Reverse-engineering knowledge modules are used for reading the table and other object metadata from source databases
  • Journalizing knowledge modules record the new and changed data within either a single table or view or a consistent set of tables or views
  • Loading knowledge modules are used for efficient extraction of data from source databases for loading into a staging area (database-specific bulk unload utilities can be used where available)
  • Check knowledge modules are used for detecting errors in source data
  • Integration knowledge modules are used for efficiently transforming data from staging area to the target tables, generating the optimized native SQL for the given database
  • Service knowledge modules provide the ability to expose data as Web services
ODI ships with many knowledge modules out of the box, these are also extendable, they can modified within the ODI Designer module.


7) How do 'Contexts' work in ODI?

ODI offers a unique design approach through use of Contexts and Logical schemas. Imagine a development team,  within the ODI Topology manager a senior developer can define the system architecture, connections, databases, data servers (tables etc) and so forth.

These objects are linked through contexts to 'logical' architecture objects that are then used by other developers to simply create interfaces using these logical objects, at run-time, on specification of a context within which to execute the interfaces, ODI will use the correct physical connections, databases + tables (source + target) linked the logical objects being used in those interfaces as defined within the environment Topology.


8) Does my ODI infrastructure require an Oracle database?
No, the ODI modular repositories (Master + and one of multiple Work repositories) can be installed on any database engine that supports ANSI ISO 89 syntax such as Oracle, Microsoft SQL Server, Sybase AS Enterprise, IBM DB2 UDB, IBM DB2/40.


9) Where can I get more information on ODI?
The OTN Data integration home page : http://www.oracle.com/us/products/middleware/data-integration/index.html


10) Does ODI support web services?

Yes, ODI is 'SOA' enabled and its web services can be used in 3 ways:
  • The Oracle Data Integrator Public Web Service, that lets you execute a scenario (a published package) from a web service call
  • Data Services, which provide a web service over an ODI data store (i.e. a table, view or other data source registered in ODI)
  • The ODIInvokeWebService tool that you can add to a package to request a response from a web service 

11) Where does ODI sit with my existing OWB implementation(s)?
As mentioned previously, the ODI-EE licence includes both ODI and OWB as separate products, both tools will converge in time into "Oracle’s Unified Data Integration Product".

Oracle have released a statement of direction for both products, published January 2010:

http://www.oracle.com/technology/products/oracle-data-integrator/sod.pdf

OWB 11G R2 is the first step from Oracle to bring these two applications together, its now possible to use ODI Knowledge modules within your OWB 11G R2 environment as 'Code Templates', an Oracle white paper published February 2010 describes this in more detail:

http://www.oracle.com/technology/products/warehouse/pdf/owb-11gr2-code-template-mappings.pdf


12) Is ODI Used by Oracle in their products?

Yes there are many Oracle products that utilise ODI, but here are just a few:
  • Oracle Application Integration Architecture (AIA)
  • Oracle Agile products
  • Oracle Hyperion Financial Management
  • Oracle Hyperion Planning
  • Oracle Fusion Governance, Risk & Compliance
  • Oracle Business Activity Monitoring
Oracle BI Applications also uses ODI as its core ETL tool in place of Informatica , but only for one release of OBIA and when using a certain source system.

Future plans are to have ODI fully available through the OBIA offering.

ODI Interview questions and answers

  Oracle Data Integrator (ODI) is  - product from Sunopsis acquired by Oracle in 2006 and now part of Oracle Fusion Middleware Family.
1. ODI is built on E-LT(Extract , Load and Transform) Architecture.
2. Oracle Data Integrator 10g (10.1.3.5.0) suite includes three products
a) Oracle Data Integrator
b) Oracle Data Quality and
c)  Oracle Data Profiling
3. ODI uses Database as ETL (Extract, Transform & Load) engine thus eliminates requirement of proprietary ELT engine
4. Oracle Data Integrator Enterprise Edition (ODIEE) is combination of ODI (Oracle Data Integrator) and OWB (Oracle Warehouse Builder)


 Q.Explain what is ODI?why is it different from the other ETL tools.

ODI stands for Oracle Data Integrator. It is different from another ETL tool in a way that it uses E-LT approach as opposed to ETL approach. This approach eliminates the need of the exclusive Transformation Server between the Source and Target Data server. The power of the target data server can be used to transform the data. i.e. The target data server acts as staging area in addition to its role of target databasel. While loading the data in the target database (from staging area) the transformation logic is implemented. Also, the use of appropriate CKM (Check Knowldege Module) can be made while doing this to implement data quality requirement.


Q.How will you bring in the different source data into ODI?

you will have to create dataservers in the topology manager for the different sources that you want.

Q.How will you bulk load data?


In Odi there are IKM that are designed for bulk loading of data.

Q.How will you bring in files from remote locations?


We will invoke the Service knowledge module in ODI,this will help us to accesses data thought a web service.

Q.How will you handle dataquality in ODI?

There are two ways of handling dataquality in Odi....the first method deals with handling the incorrect data using the CKM...the second method uses Oracle data quality tool(this is for advanced quality options).



Oracle Data Integrator (ODI) consists of following products
1. Repository - is relational Databases to store objects used/configured or developed by ODI. There are two type of Repository Master Repository (one and only one) and Work Repository (one or more)
a) Master Repository - There is only one Master repository and used to store security information, topology information (servers..) and versions of the Objects. All Modules (designer, operator, topology & security) have access to master repository.
b) Work Repository- Work related objects (project objects) are stored in Work Repository like Models, Projects and run-time information. There could be multiple work repository per installation and all linked to single master repository. Work Repository is accessed by Designer/Operator Module and run time agent.

2. Graphical Modules -
a) Designer (designer.sh|bat)- All project development takes place in this module and this is the place where database and application metadata are imported and defined.
b) Operator (operator.sh|bat) - usually to monitor production ODI instance and shows execution logs, rows processed and execution statistics
c) Topology Manager (topology.sh|bat)- To register servers, schema and agents in master repository.
d) Security Manager (security.sh|bat) - To manage user profiles and their access privileges.

3. Runtime Component / Scheduler Agent - scheduler agent coordinates execution of scenarios. Scheduler Agent retrieves code from execution repository and then requests database server, scripting engine or operating system server to execute that code.

4. Metadata Navigator (MN)- is Web (JSP/Servlet) application (available as oracledimn.war) that enables access to repository through Web Interface (Web Browser). MetaData Navigator (MN) requires Application Server and you deploy MN application oracledimn.war on pre-installed application server (Tom Cat, OAS, WebLogic). This is optional component.





ODI Architecture

ODI Architecture

  1. Understand ODI Architecture.
  2. Understand Components that make up ODI.
  3. Understand what are ODI repositories?
Architecture Overview:
What is Oracle Data Integrator?
  • Data integration product.
  • ODI is a development platform. (Business Rule Driven , E-LT approach)
  • Simple and faster.
  • Based on Metadata – Centralized Repository.
Oracle Data Integrator is an integration platform. Simply put, it is used to move and transform information across the information system. Oracle Data Integrator is also a development platform for integration processes. It is unique in two respects:
  • It uses an approach driven by business rules. In this approach, you focus your effort on the business side of integration, and not on the technical aspects.
  • It uses the E-LT approach. Oracle Data Integrator does not execute the integration processes itself at run time, but orchestrates a process which leverages existing systems.
Oracle Data Integrator is based on metadata. That is, descriptive information about the information system and its contents. This metadata is stored in a centralized metadata repository. These elements combined mean that, Oracle Data Integrator AIP enables “Simply Faster Integration.

ODI Architecture


The central component of the architecture is the repository. This stores configuration information about the IT infrastructure, the metadata for all applications, projects, scenarios, and execution logs. Repositories can be installed on an OLTP relational database. The repository also contains information about the Oracle Data Integrator infrastructure, defined by the administrators.
Administrators, developers, and operators use different Oracle Data Integrator Graphical User Interfaces to access the repositories.
Security and Topology are used for administering the infrastructure, Designer is used for reverse engineering metadata and developing projects, and Operator is used for scheduling and operating run-time operations.
At design time, developers work in a repository to define metadata and business rules. The resulting processing jobs are executed by the Agent, which orchestrates the execution by leveraging existing systems. It connects to available servers and requests them to execute the code. It then stores all return codes and messages into the repository.
It also stores statistics such as the number of records processed, the elapsed time, and so on.
Several different repositories can coexist in a single IT infrastructure. In the graphic in the previous page, two repositories are represented: one for the development environment, and another one for the production environment. The developers release their projects in the form of scenarios that are sent to production.
In production, these scenarios are scheduled and executed on a Scheduler Agent which also stores all its information in the repository. Operators have access to this information and are able to monitor the integration processes in real time.
Business users, as well as developers, administrators and operators, can get Web-based read access to the repository. The Metadata Navigator application server links the Oracle Data Integrator Repository to any Web browser, such as Firefox or Internet Explorer
ODI Components


The four Oracle Data Integrator GUIs—Designer, Operator, Topology Manager, and Security Manager, are based on Java. They can be installed on any platform that supports Java Virtual Machine 1.4, including Windows, Linux, HP-UX, Solaris, pSeries, and so on.
Designer is the GUI for defining metadata, and rules for transformation and data quality. It uses these to generate scenarios for production, and is where all project development takes place. It is the core module for developers and metadata administrators. Operator is used to manage and monitor Oracle Data Integrator in production. It is designed for production operators and shows the execution logs with errors counts, the number of rows processed, execution statistics, and so on. At design time, developers use Operator for debugging purposes.
Topology Manager manages the physical and logical architecture of the infrastructure. Servers, schemas, and agents are registered here in the Oracle Data Integrator Master Repository. This module is usually used by the administrators of the infrastructure.
Security Manager manages users and their privileges in Oracle Data Integrator. It can be used to give profiles and users access rights to Oracle Data Integrator objects and features. This module is usually used by security administrators. All Oracle Data Integrator modules store their information in the centralized Oracle Data Integrator repository.
ODI Run Time Components

At run time, the Scheduler Agent orchestrates the execution of the developed scenarios. It can be installed on any platform provided that it supports a Java Virtual Machine 1.4 (Windows, Linux, HP-UX, Solaris, pSeries, iSeries, zSeries, and so on).
Execution may be launched from one of the graphical modules, or by using the built-in scheduler. Thanks to Oracle Data Integrator’ E-LT architecture, the Scheduler Agent rarely performs any transformation itself. Normally, it simply retrieves code from the execution repository, and requests database servers, operating systems or scripting engines to execute it. When the execution is completed, the scheduler agent updates logs in the repository, reporting error messages and execution statistics.
The execution log can be viewed from the Operator graphical module. It is important to understand that although it can act as a transformation engine, the agent is rarely used this way in practice. Agents are installed at tactical locations in the information system to orchestrate the integration processes and leverage existing systems. Agents are lightweight components in this distributed integration architecture
Metadata Navigator
Metadata Navigator is a J2EE application that provides Web access to Oracle Data Integrator repositories. It allows the users to navigate projects, models, logs, and so on. By default, it is installed on Jakarta Tomcat Application Server.
Business users, developers, operators and administrators use their Web browser to access Metadata Navigator. Via its comprehensive Web interface, they can see flow maps, trace the source of all data and even drill down to the field level to understand the transformations that affected the data.
It is also possible to trigger and monitor processing jobs from a Web browser through Metadata Navigator

Components – A global view


By putting these pieces together, you now have a global view of the components that make up Oracle Data Integrator: the graphical components, the repository, the Scheduler Agent, and finally Metadata Navigator.
ODI Repository


The Oracle Data Integrator Repository is composed of a master repository and several work repositories. These repositories are databases stored in relational database management systems. All objects configured, developed, or used by the Oracle Data Integrator modules are stored in one of these two types of repository. The repositories are accessed in client/server mode by the various components of the Oracle Data Integrator architecture.
There is usually only one master repository, which contains the following information:
· Security information including users, profiles, and access privileges for the Oracle Data Integrator platform.
· Topology information including technologies, definitions of servers and schemas, contexts and languages.
Old versions of objects. The information contained in the master repository is maintained with Topology Manager and Security Manager. All modules access the master repository, as they all need the topology and security information stored there.
The work repository is where projects are worked on. Several work repositories may coexist in the same Oracle Data Integrator installation. This is useful, for example, to maintain separate environments or to reflect a particular versioning life cycle.
A work repository stores information for:
· Data models, which include the descriptions of schemas, data store structures and metadata, fields and columns, data quality constraints, cross references, data lineage, and so on
· Projects, which include business rules, packages, procedures, folders, knowledge modules, variables and so on
Execution, which means scenarios, scheduling information and logs
The contents of a work repository are managed with Designer and Operator. It is also accessed by the agent at run time.
When a work repository is only used to store execution information (typically for production purposes), it is called an execution repository. Execution repositories are accessed at run time with Operator and also by agents. An important rule to remember is that all work repositories are always attached to exactly one master repository
Example of Repository Setup


This diagram gives an overview of a typical repository architecture where development, testing and production are carried out in separate work repositories. When the development team finishes working on certain projects, it releases them into the master repository. The testing team imports these released versions for testing in a separate work repository, thus allowing the development team to continue working on the next versions. When the test team successfully validates the developed items, the production team then exports executable versions (called scenarios) into the final production work repository. This repository structure corresponds to a simple development-test-production cycle

ODI Architecture and approach

Oracle Data Integrator provides a new declarative design approach to defining data transformation and integration processes, resulting in faster and simpler development and maintenance. Based on a unique “E-LT” architecture

Oracle Data Integrator Course Content ODI Training in bangalore

Prerequisites

·  40 GB HD
·  2  GB RAM
·   ORACLE 11G R2
·   Web logic Server 11G
·   ODI 11G
·   RCU 11G

Datawarehousing Concepts:
  • Physical and Logical Data modeling
  • Creating Physical tables and schemas
  • Creating Oracle Queries
  • Data base installations
  • Dataware house concepts
  • Star Schema , Snowflake Schema and Galaxy Schema
  • SCD Types  ( slowly changing dimentions)
  • Dimentional modeling
  • Data modeling
  • ODI Installations
  • Weblogic Installations and configurations
  • Oracle Data Integrator 11G New Features Overview

Course Objectives:
  • Overview of the architecture of the complete Oracle Data Integrator system
  • Define the Information System architecture
  • Get started with projects in Oracle Data Integrator and release your work for deployment
  • An overview of models, reverse-engineering model metadata from your database, and how and why to flesh out models with missing metadata
  • Manage metadata in Oracle Data Integrator: models and data quality
  • Create simple interfaces and add multiple sources to an interface
  • Launch a session to run an interface
  • Enforce data quality constraints in an interface
  • Manage packages
  • Add more advanced integration components
  • Customize data flow and get the best performance out of your system
  • An overview of sessions, use Operator to monitor, stop and start sessions, and basic troubleshooting.




Introduction :  
·  Identifying the Course Content
·  What is Oracle Data Integrator?
·  Why Oracle Data Integrator?
·  Overview of ODI 11g Architecture
·  Overview of ODI 11g Components
·  About Graphical Modules
·  Types of ODI Agents
·  Overview of Oracle Data Integrator Repositories

 Installations & Configurations : 
·  Oracle 11G Database.
·  ODI 11G
·  Weblogic Server 11G
·  Configurations

Administering ODI Repositories and Agents : 
·  Administrating the ODI Repositories
·  Creating Repository Storage Spaces
·  Creating and Connecting to the Master Repository
·  Creating and Connecting to the Work Repository
·  Managing ODI Agents
·  Creating a Physical Agent
·  Launching a Listener, Scheduler and Web Agent
·  Example of Load Balancing
ODI Topology Concepts :  
·  Overview of ODI Topology
·  About Data Servers and Physical Schemas
·  Defining the Physical Architecture
·  Defining the Logical Architecture
·  Mapping Logical and Physical Resources
·  Defining Agents
·  Defining a Topology
·  Planning the Topology
Describing the Physical and Logical Architecture : 
·  Overview of Topology Navigator
·  Creating Physical Architecture
·  Creating a Data Server
·  Testing a Data Server Connection
·  Creating a Physical Schema
·  Creating Logical Architecture
·  Overview of Logical Architecture and Context Views
·  Linking the Logical and Physical Architecture
Setting Up a New ODI Project : 
·  Overview of ODI Projects
·  Creating a New Project
·  Using Folders
·  Organizing Projects and Folders
·  Understanding Knowledge Modules
·  Exchanging ODI Objects
·  Exporting and Importing Objects
·  Using Markers
Oracle Data Integrator Model Concepts :  
·  What is a Model?
·  Understanding Metadata in ODI
·  Understanding Reverse Engineering
·  Creating Models
·  Organizing Models
·  Creating Data stores
·  Using Constraints in ODI
·  Creating Keys and References
Organizing ODI Models and Creating Data stores : 
·  What is an Interface?
·  Business Rules for Interfaces
·  What is a Mapping?
·  What is a Join?
·  What is a Filter?
·  What is a Constraint?
·  What is a Staging Area?
·  Creating a Basic Interface
ODI Interface Concepts 
·  What is an Interface?
·  Business Rules for Interfaces
·  What is a Mapping, Filter, Join?
·  Overview of Integration Process
·  What is a Staging Area?
·  About Execution Location
·  Using Knowledge Modules (KM) with ODI Interface
·  Creating a Basic Interface
Designing Interfaces : 
·  Designing an Interface
·  Multiple Source Data stores
·  Creating Joins
·  Filtering data
·  Disabling Transformations
·  Overview of the Flow
·  Specifying the Staging Area
·  Selecting Knowledge Modules
Interfaces: Monitoring and Debugging : 
·  Monitoring Interfaces
·  Using Operator
·  Viewing Sessions and Tasks
·  How to Monitor Execution of an Interface
·  How to Troubleshoot a Session
·  Keys to Reviewing the Generated Code
·  Working with Errors
·  Tips for Preventing Errors
Designing Interfaces: Advanced Topics : 
·  Using Business Rules in Interfaces
·  Overview of Business Rule Elements
·  Using variables
·  Using User Functions
·  Using Substitution Methods
·  Modifying a KM
·  Developing Your Own KM
·  Using RKM for Customized Reverse Engineering
Using ODI procedures : 
·  What is a Procedure?
·  Examples of Procedures
·  Creating Procedures
·  Adding Commands
·  Adding Options
·  Running a Procedure
·  Using Operator to View Results
Using ODI Packages : 
·  What is a package?
·  Creating a package
·  Executing a package
·  Creating Advanced Packages
·  Error handling
·  Controlling an Execution Path
·  Creating a Loop
·  Using the Advanced tab
Managing ODI Scenarios and Versions :  
·  What is a Scenario?
·  Managing Scenarios
·  Preparing Scenarios for Deployment
·  Automating Scenario Management
·  Scheduling the ODI Scenario
·  Overview of ODI version management
·  Using Version Browser and Version Comparison Tool
·  Handling concurrent changes
Enforcing Data Quality and Auditing Data with ODI : 
·  Why Data Quality?
·  When to Enforce Data Quality?
·  Data Quality in Source Applications
·  Data Quality Control in the Integration Process
·  Data Quality in the Target Applications
·  Enforcing Data Quality
·  Exploring Your Data
·  Auditing Data Quality
Working with Changed Data Capture :  
·  Overview of ODI version management
·  Techniques of Changed Data Capture
·  Changed Data Capture in ODI
·  CDC Strategies and Infrastructure
·  CDC Consistency
·  Using CDC
·  Viewing Data/Changed data
·  Using Journalizing
Administering ODI Resources: Advanced Topics : 
·  Using Open Tools
·  Installing Open Tools
·  Using Open Tools in a Package
·  Using Open Tools in a Procedure or in a KM
·  Developing Your Own Open Tools
·  Setting Up ODI Security
·  Defining Security Policies
·  Defining Password Policies