Tuesday, June 26, 2012

Error Table Structure in oracle data integrator

Error Table Structure

The E$ error table has the list of columns described in the following table:
Columns Description
[Columns of the checked table] The error table contains all the columns of the checked datastore.
ERR_TYPE Type of error:
  • 'F' when the datastore is checked during flow control
  • 'S' when the datastore is checked using static control
ERR_MESS Error message related to the violated constraint
CHECK_DATE Date and time when the datastore was checked
ORIGIN Origin of the check operation. This column is set either to the datastore name or to an interface name and ID depending on how the check was performed.
CONS_NAME Name of the violated constraint.
CONS_TYPE Type of the constraint:
  • 'PK': Primary Key
  • 'AK': Alternate Key
  • 'FK': Foreign Key
  • 'CK': Check condition
  • 'NN': Mandatory column

How to store Multiple values in ODI Variable

Create one procedure

1) First write one select statement in SOURCE TAB

ex :  select deptno as DNO from dept

2) go to TARGET TAB calll below query

ex:  delete from emp where deptno = :DNO

LOG Levels in Oracle Data Integrator

     List of LOG Levels in Oracle Data Integrator



    0=>  No Log  
   
    1=>  Displays the start and end of each session

    2=>Displays level 1 and the start and end of each step

    3=>Displays level 2 and each task executed

    4=>Displays the SQL queries executed, as well as level 3

    5=>A complete trace, inclueds level 4
   
    6=> Log level 6 has the same behavior as log level 5, but with the addition of variable tracking while
           execution or restarting  jobs.

Saturday, June 23, 2012

Different Types of Dimensions and Facts in Data Warehouse

Dimension -
A dimension table typically has two types of columns, primary keys to fact tables and textual\descreptive data.
Fact -
A fact table typically has two types of columns, foreign keys to dimension tables and measures those that contain numeric facts. A fact table can contain fact’s data on detail or aggregated level.

Types of Dimensions -
Slowly Changing Dimensions:
 Attributes of a dimension that would undergo changes over time. It depends on the business requirement whether particular attribute history of changes should be preserved in the data warehouse. This is called a Slowly Changing Attribute and a dimension containing such an attribute is called a Slowly Changing Dimension.

Rapidly Changing Dimensions:
A dimension attribute that changes frequently is a Rapidly Changing Attribute. If you don’t need to track the changes, the Rapidly Changing Attribute is no problem, but if you do need to track the changes, using a standard Slowly Changing Dimension technique can result in a huge inflation of the size of the dimension. One solution is to move the attribute to its own dimension, with a separate foreign key in the fact table. This new dimension is called a Rapidly Changing Dimension.

Junk Dimensions:
A junk dimension is a single table with a combination of different and unrelated attributes to avoid having a large number of foreign keys in the fact table. Junk dimensions are often created to manage the foreign keys created by Rapidly Changing Dimensions.

Inferred Dimensions:
While loading fact records, a dimension record may not yet be ready. One solution is to generate an surrogate key with Null for all the other attributes. This should technically be called an inferred member, but is often called an inferred dimension.

Conformed Dimensions:
A Dimension that is used in multiple locations is called a conformed dimension. A conformed dimension may be used with multiple fact tables in a single database, or across multiple data marts or data warehouses.

Degenerate Dimensions:
 A degenerate dimension is when the dimension attribute is stored as part of fact table, and not in a separate dimension table. These are essentially dimension keys for which there are no other attributes. In a data warehouse, these are often used as the result of a drill through query to analyze the source of an aggregated number in a report. You can use these values to trace back to transactions in the OLTP system.

Role Playing Dimensions:
A role-playing dimension is one where the same dimension key — along with its associated attributes — can be joined to more than one foreign key in the fact table. For example, a fact table may include foreign keys for both Ship Date and Delivery Date. But the same date dimension attributes apply to each foreign key, so you can join the same dimension table to both foreign keys. Here the date dimension is taking multiple roles to map ship date as well as delivery date, and hence the name of Role Playing dimension.

Shrunken Dimensions:
A shrunken dimension is a subset of another dimension. For example, the Orders fact table may include a foreign key for Product, but the Target fact table may include a foreign key only for ProductCategory, which is in the Product table, but much less granular. Creating a smaller dimension table, with ProductCategory as its primary key, is one way of dealing with this situation of heterogeneous grain. If the Product dimension is snowflaked, there is probably already a separate table for ProductCategory, which can serve as the Shrunken Dimension.

Static Dimensions:
Static dimensions are not extracted from the original data source, but are created within the context of the data warehouse. A static dimension can be loaded manually — for example with Status codes — or it can be generated by a procedure, such as a Date or Time dimension.

Types of Facts -

Additive:
Additive facts are facts that can be summed up through all of the dimensions in the fact table. A sales fact is a good example for additive fact.
Semi-Additive:
Semi-additive facts are facts that can be summed up for some of the dimensions in the fact table, but not the others.
Eg: Daily balances fact can be summed up through the customers dimension but not through the time dimension.
Non-Additive:
Non-additive facts are facts that cannot be summed up for any of the dimensions present in the fact table.
Eg: Facts which have percentages, ratios calculated.

Factless Fact Table:
 In the real world, it is possible to have a fact table that contains no measures or facts. These tables are called “Factless Fact tables”.
Eg: A fact table which has only product key and date key is a factless fact. There are no measures in this table. But still you can get the number products sold over a period of time.
Based on the above classifications, fact tables are categorized into two:
Cumulative:
This type of fact table describes what has happened over a period of time. For example, this fact table may describe the total sales by product by store by day. The facts for this type of fact tables are mostly additive facts. The first example presented here is a cumulative fact table.
Snapshot:
This type of fact table describes the state of things in a particular instance of time, and usually includes more semi-additive and non-additive facts. The second example presented here is a snapshot fact table.

Friday, June 22, 2012

Using the Operating System Environment Variables in jython


Using the Operating System Environment Variables

It can be usefull to retrieve the Operating System environment variables. The following examples show how to retrieve this list:
import os
ftrg = open('c:/temp/listenv.txt', 'w')
try:
  envDict = os.environ
  osCurrentDirectory = os.getcwd()
  print >> ftrg, 'Current Directory: %s'  % osCurrentDirectory
  print >> ftrg, '=============================='
  print >> ftrg, 'List of environment variables:'
  print >> ftrg, '=============================='
  for aKey in envDict.keys():
    print >> ftrg, '%s\t= %s' % (aKey, envDict[aKey])
  print >> ftrg, '=============================='
  print >> ftrg, 'Oracle Data Integrator specific environment variables:'
  print >> ftrg, '=============================='
  for aKey in envDict.keys():
    if aKey.startswith('SNP_'):
      print >> ftrg, '%s\t= %s' % (aKey, envDict[aKey])
finally:
  ftrg.close()

To retrieve the value of the USERNAME environment variable, just write:
import os
currentUser = os.environ['USERNAME']

Using JDBC in jython


Using JDBC

It can be convenient to use JDBC (Java DataBase Connectivity) to connect to a database from Jython. All Java classes in the CLASSPATH can be directly used in Jython. The following example shows how to use the JDBC API to connect to a database, to run a SQL query and write the result into a file.
The reference documentation for Java is available at http://java.sun.com
import java.sql as sql
import java.lang as lang
def main():
  driver, url, user, passwd = (
    'oracle.jdbc.driver.OracleDriver',
    'jdbc:oracle:thin:@myserver:1521:mysid',
    'myuser',
    'mypasswd')
  ##### Register Driver
  lang.Class.forName(driver)
  
  ##### Create a Connection Object
  myCon = sql.DriverManager.getConnection(url, user, passwd)
  f = open('c:/temp/jdbc_res.txt', 'w')
  try:
    ##### Create a Statement
    myStmt = myCon.createStatement()
    ##### Run a Select Query and get a Result Set
    myRs = myStmt.executeQuery("select TABLE_NAME, OWNER from ALL_TABLES where TABLE_NAME like 'SNP%'")
  
    ##### Loop over the Result Set and print the result in a file
    while (myRs.next()):
      print >> f , "%s\t%s" %(myRs.getString("TABLE_NAME"), myRs.getString("OWNER") )
  finally:
    myCon.close()
    f.close()

### Entry Point of the program      
if __name__ == '__main__':
  main()

It is possible to combine Jython with odiRef API in the Oracle Data Integrator Procedures, for even more flexibility. Instead of hard-coding the parameters to connect to a database in the program, the getInfo method can be used:
import java.sql as sql
import java.lang as lang
def main():
  driver, url, user, passwd = (
    '<%=odiRef.getInfo("DEST_JAVA_DRIVER")%>',
    '<%=odiRef.getInfo("DEST_JAVA_URL")%>',
    '<%=odiRef.getInfo("DEST_USER_NAME")%>',
    '<%=odiRef.getInfo("DEST_PASS")%>')
  ##### Register Driver
  lang.Class.forName(driver)
[...]

Using IP sockets in jython


Using IP sockets

IP sockets are used to initiate an IP communication between two processes on the network. Jython greatly simplifies the creation of IP servers (waiting for IP packets) or IP clients (sending IP packets).
The following example shows the implementation of a very basic IP server. It waits for data coming from client software, and writes each received packet into the file c:/temp/socketserver.log. If a server receives the packet STOPSERVER, the server stops:

Server

import socket
import time
HOST = ''
PORT = 9191 # Arbitrary port (not recommended)
LOG_FILE = 'c:/temp/sockserver.log'
mySock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
mySock.bind((HOST, PORT))
logfile = open(LOG_FILE, 'w')
try:
  print >> logfile, '*** Server started : %s' % time.strftime('%Y-%m-%d %H:%M:%S')
  while 1:
    data, addr = mySock.recvfrom(1024)
    print >> logfile, '%s (%s): %s' % (time.strftime('%Y-%m-%d %H:%M:%S'), addr, data)
    if data == 'STOPSERVER':
      print >> logfile, '*** Server shutdown at %s by %s' % (time.strftime('%Y-%m-%d %H:%M:%S'), addr)
      break
finally:
  logfile.close()

Client

The following example can be used ot test the above server. It sends two packets before asking the server to stop.
import socket
import sys
PORT = 9191 # Same port as the server
HOST = 'SERVER_IP_ADDRESS'
mySock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
mySock.sendto('Hello World !', (HOST, PORT))
mySock.sendto('Do U hear me?', (HOST, PORT))
mySock.sendto('STOPSERVER', (HOST, PORT))


Using FTP in jython


Using FTP

In some environments, it can be useful to use FTP (File Transfer Protocol) to transfer files between heterogeneous systems. Oracle Data Integrator provides an additional Jython module to further integrate FTP.
The following examples show how to use this module:
Pull the *.txt files from /home/odi of the server ftp.myserver.com into the local directory c:\temp

import snpsftp
ftp = snpsftp.SnpsFTP('ftp.myserver.com', 'mylogin', 'mypasswd')
try:
  ftp.setmode('ASCII')
  ftp.mget('/home/odi', '*.txt', 'c:/temp')
finally:
  ftp.close()

Push the files *.zip from C:\odi\lib onto ftp.myserver.com in the remote directory /home/odi/lib

import snpsftp
ftp = snpsftp.SnpsFTP('ftp.myserver.com', 'mylogin', 'mypasswd')
try:
  ftp.setmode('BINARY')
  ftp.mput('C:/odi/lib', '*.zip', '/home/odi/lib')
finally:
  ftp.close()