                                  Informatica

•   Understanding Commit Points Overview
•   A commit interval is the interval at which the PowerCenter Server commits data to
    targets during a session. The commit point can be a factor of the commit interval, the
    commit interval type, and the size of the buffer blocks. The commit interval is the
    number of rows you want to use as a basis for the commit point. The commit interval
    type is the type of rows that you want to use as a basis for the commit point. You can
    choose between the following commit types:
•   Target-based commit. The PowerCenter Server commits data based on the number of
    target rows and the key constraints on the target table. The commit point also depends on
    the buffer block size, the commit interval, and the PowerCenter Server configuration for
    writer timeout (see the worked example after this list).
•   Source-based commit. The PowerCenter Server commits data based on the number of
    source rows. The commit point is the commit interval you configure in the session
    properties.
•   User-defined commit. The PowerCenter Server commits data based on transactions
    defined in the mapping properties. You can also configure some commit and rollback
    options in the session properties.
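•   As a worked example of target-based commit (the figures are illustrative): with a commit interval of
    10,000 and writer buffer blocks that hold 7,500 rows, the PowerCenter Server reaches the interval at
    row 10,000 but continues filling the current buffer block. The block fills at row 15,000, so the commit
    is issued there; the committed row count therefore generally meets or exceeds the commit interval.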
                      Informatica

• Source-based and user-defined commit sessions have
  partitioning restrictions. If you configure a session with
  multiple partitions to use source-based or user-defined
  commit, you can only choose pass-through partitioning at
  certain partition points in a pipeline.
• Source-based commit may slow session performance if the
  session uses a one-to-one mapping. A one-to-one mapping
  is a mapping that moves data from a Source Qualifier,
  XML Source Qualifier, or Application Source Qualifier
  transformation directly to a target.
• You can set the commit type and commit interval in the
  Workflow Manager under Session -- Properties.
                            Informatica

•   Understanding Transaction Control
•   PowerCenter allows you to define transactions that the PowerCenter
    Server uses when it processes transformations, and when it commits
    and rolls back data at a target
•   The PowerCenter Server can process a transformation for each row at
    a time, for all rows in a transaction, or for all source rows together
•   Transformation Scope
•   You can configure how the PowerCenter Server applies the
    transformation logic to incoming data with the Transformation Scope
    transformation property. When the PowerCenter Server processes a
    transformation, it either drops transaction boundaries or preserves
    transaction boundaries, depending on the transformation scope and the
    mapping configuration
                                  Informatica

•   You can choose one of the following values for the transformation scope:
•   Row. Applies the transformation logic to one row of data at a time. Choose Row when a
    row of data does not depend on any other row. When you choose Row for a
    transformation connected to multiple upstream transaction control points, the
    PowerCenter Server drops transaction boundaries and outputs all rows from the
    transformation as an open transaction. When you choose Row for a transformation
    connected to a single upstream transaction control point, the PowerCenter Server
    preserves transaction boundaries.
•   Transaction. Applies the transformation logic to all rows in a transaction. Choose
    Transaction when a row of data depends on all rows in the same transaction, but does
    not depend on rows in other transactions. When you choose Transaction, the
    PowerCenter Server preserves incoming transaction boundaries. It resets any cache, such
    as an aggregator or lookup cache, when it receives a new transaction.
•   When you choose Transaction for a multiple input group transformation, you must
    connect all input groups to the same upstream transaction control point.
•   All Input. Applies the transformation logic on all incoming data. When you choose All
    Input, the PowerCenter Server drops incoming transaction boundaries and outputs all
    rows from the transformation as an open transaction. Choose All Input when a row of
    data depends on all rows in the source
                                   Informatica

•   Transformation Scope is defined in the Designer as a transformation property.
•   The Transformation Scope setting is configurable for the Aggregator, Sorter, Joiner, Rank, and
    XML transformations; for these, the default value is All Input.
•   For all other transformations, the default value is Row.
•   When the PowerCenter Server performs a database transaction, such as a commit, it
    performs the transaction to all targets in a target connection group
•   Dropping and Recreating Indexes
•   After you insert significant amounts of data into a target, you normally need to drop and
    recreate indexes on that table to optimize query speed. You can drop and recreate
    indexes by:
•   Using pre- and post-session SQL. The preferred method for dropping and re-creating
    indexes is to define a SQL statement in the Pre SQL property that drops indexes before
    loading data to the target. You can use the Post SQL property to recreate the indexes
    after loading data to the target. Define the Pre SQL and Post SQL properties for
    relational targets in the Transformations view on the Mapping tab in the session
    properties.
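•   For example, a minimal sketch of these properties (the table and index names are hypothetical, and
    some databases require a different DROP INDEX syntax):
•   Pre SQL:  DROP INDEX idx_sales_cust_id
•   Post SQL: CREATE INDEX idx_sales_cust_id ON sales (customer_id)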
                                       Informatica

•   Constraint-Based Loading
•   In the Workflow Manager, you can specify constraint-based loading for a session. When you select
    this option, the PowerCenter Server orders the target load on a row-by-row basis. For every row
    generated by an active source, the PowerCenter Server loads the corresponding transformed row first
    to the primary key table, then to any foreign key tables. Constraint-based loading depends on the
    following requirements:
•   Active source. Related target tables must have the same active source.
•   Key relationships. Target tables must have key relationships.
•   Target connection groups. Targets must be in one target connection group.
•   Treat rows as insert. Use this option when you insert into the target. You cannot use updates with
    constraint-based loading.
•   When target tables receive rows from different active sources, the PowerCenter Server reverts to
    normal loading for those tables, but loads all other targets in the session using constraint-based
    loading when possible
•   When target tables have no key relationships, the PowerCenter Server does not perform constraint-
    based loading. Similarly, when target tables have circular key relationships, the PowerCenter Server
    reverts to a normal load. For example, you have one target containing a primary key and a foreign
    key related to the primary key in a second target. The second target also contains a foreign key that
    references the primary key in the first target. The PowerCenter Server cannot enforce constraint-
    based loading for these tables. It reverts to a normal load
                                      Informatica

•   Bulk Loading
•   You can enable bulk loading when you load to DB2, Sybase, Oracle, or Microsoft SQL Server.
•   If you enable bulk loading for other database types, the PowerCenter Server reverts to a normal load.
    Bulk loading improves the performance of a session that inserts a large amount of data to the target
    database. Configure bulk loading on the Mapping tab.
•   When bulk loading, the PowerCenter Server invokes the database bulk utility and bypasses the
    database log, which speeds performance. Without writing to the database log, however, the target
    database cannot perform rollback. As a result, you may not be able to perform recovery. Therefore,
    you must weigh the importance of improved session performance against the ability to recover an
    incomplete session
•   When loading to DB2, Microsoft SQL Server, and Oracle targets, you must specify a normal load for
    data driven sessions. When you specify bulk mode and data driven, the PowerCenter Server reverts to
    normal load
•   When bulk loading to Sybase and DB2 targets, the PowerCenter Server ignores the commit interval
    you define in the session properties and commits data when the writer block is full.
•   When bulk loading to Microsoft SQL Server and Oracle targets, the PowerCenter Server commits
    data at each commit interval. Also, Microsoft SQL Server and Oracle start a new bulk load
    transaction after each commit.
•   To bulk load into indexed tables, choose non-parallel mode. To do this, you must disable the Enable
    Parallel Mode option.
                                Informatica

•   Note that when you disable parallel mode, you cannot load multiple target
    instances, partitions, or sessions into the same table.
•   To bulk load in parallel mode, you must drop indexes and constraints in the
    target tables before running a bulk load session. After the session completes,
    you can rebuild them. If you use bulk loading with the session on a regular
    basis, you can use pre- and post-session SQL to drop and rebuild indexes and
    key constraints.
•   You can enter any SQL command that is valid in the database associated with
    the connection object. The PowerCenter Server does not allow nested
    comments, even though the database might. When you enter SQL in the SQL
    Editor, you manually type in the SQL statements. Use a semicolon (;) to
    separate multiple statements. The PowerCenter Server ignores semicolons
    within single quotes, double quotes, or within /* ...*/. If you need to use a
    semicolon outside of quotes or comments, you can escape it with a backslash (\).
    You cannot use session or mapping variables in the environment SQL.
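•   For example, a hedged sketch of statements entered in the SQL Editor (the object names are
    hypothetical); the first semicolon separates the two statements, while the semicolon inside the quoted
    string is ignored as a separator:
•   UPDATE run_audit SET note = 'pre-load; indexes dropped' WHERE run_id = 1;
    DROP INDEX idx_sales_date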
•   The following are some of the options you can specify when creating a connection in the
    Workflow Manager under Connections -- Relational.
                                   Informatica

•   Configuring Environment SQL
•   For relational databases, you may need to execute some SQL commands in the database
    environment when you connect to the database. For example, you might want to set
    isolation levels on the source and target systems to avoid deadlocks.
•   You configure environment SQL in the database connection. You can use environment
    SQL for source, target, lookup, and stored procedure connections. If the SQL syntax is
    not valid, the PowerCenter Server does not connect to the database, and the session fails.
•   The PowerCenter Server executes the SQL each time it connects to the database. For
    example, if you configure environment SQL in a target connection, and you configure
    three partitions for the pipeline, the PowerCenter Server executes the SQL three times,
    once for each connection to the target database
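•   For example, a target connection for Oracle might use the following environment SQL to set the
    isolation level (an illustrative Oracle statement; adjust it to your database and requirements):
•   ALTER SESSION SET ISOLATION_LEVEL = SERIALIZABLE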
•   Rollback Segment -- only for Oracle
•   Enable Parallel Mode --Oracle --Enables parallel processing when loading data into a
    table in bulk mode.
•   To replace a relational database connection:
•   Close all folders in the repository.
•   Choose Connections-Replace.
•   The Replace Connections dialog box appears
                                        Informatica
•   Configuring the Mapping
•   When you design a mapping, consider requirements for session recovery. Configure the mapping so
    that the PowerCenter Server can extract, transform, and load data with the same results each time it
    runs the session.
•   Use the following guidelines when you configure the mapping:
•   Sort the data from the source. This guarantees that the PowerCenter Server always receives source
    rows in the same order. You can do this by configuring the Sorted Ports option in the Source
    Qualifier or Application Source Qualifier transformation or by adding a Sorter transformation
    configured for distinct output rows to the mapping after the source qualifier.
•   Verify all targets receive data from transformations that produce repeatable data. Some
    transformations produce repeatable data. You can enable a session for recovery in the Workflow
    Manager when all targets in the mapping receive data from transformations that produce repeatable
    data. For more information on repeatable data, see Working with Repeatable Data.
•   All transformations have a property that determines when the transformation produces repeatable
    data. For most transformations, this property is hidden. However, you can write the Custom
    transformation procedure to output repeatable data, and then configure the Custom transformation
    Output Is Repeatable property to match the procedure behavior.
•   Transformations can produce repeatable data under the following circumstances:
•   Never. The order of the output data is inconsistent between session runs. This is the default for active
    Custom transformations.
•   Based on input order. The output order is consistent between session runs when the input data order
    for all input groups is consistent between session runs. This is the default for passive Custom
    transformations.
•   Always. The order of the output data is consistent between session runs even if the order of the input
    data is inconsistent between session runs.
                                   Informatica

•   Some transformations always return repeatable data, while for others repeatability is based on the
    input order.
•   The Union transformation can never produce repeatable data.
•   Recovery can be enabled from the Workflow Manager under Edit Session -- Config Object.
•   To restart the workflow, go to the Workflow Monitor and choose Task -- Recover/Resume from the
    menu.
•   You can recover a workflow from a failed sequential or concurrent session. You might
    want to fail a workflow as a result of session failure if successive tasks in the workflow
    depend on the success of the previous sessions.
•   To configure a session to fail the workflow if the session fails, enable the Fail Parent If
    This Task Fails option on the General tab of the session properties.
•   The PowerCenter Server writes recovery data to relational target databases when you run
    a session enabled for recovery. If the session fails, the PowerCenter Server uses the
    recovery data to determine the point at which it continues to commit data during the
    recovery session.
                                    Informatica

•   Verifying Recovery Tables
•   The PowerCenter Server creates recovery information in cache files for all sessions
    enabled for recovery. It also creates recovery tables on the target database for relational
    targets during the initial session run.
•   If the session is enabled for recovery, the PowerCenter Server creates recovery
    information in cache files during the normal session run. The PowerCenter Server stores
    the cache files in the directory specified for $PMCacheDir. The PowerCenter Server
    generates file names in the format PMGMD_METADATA_*.dat. Do not alter these
    files or remove them from the PowerCenter Server cache directory. The PowerCenter
    Server cannot run the recovery session if you delete the recovery cache files.
•   If the session writes to a relational database and is enabled for recovery, the
    PowerCenter Server also verifies the recovery tables on the target database for all
    relational targets at the beginning of a normal session run. If the tables do not exist, the
    PowerCenter Server creates them. If the database user name the PowerCenter Server
    uses to connect to the target database does not have permission to create the recovery
    tables, you must manually create them
•   If the session completes successfully, the PowerCenter Server deletes all recovery cache
    files and removes recovery table entries that are related to the session. The PowerCenter
    Server initializes the information in the recovery tables at the beginning of the next
    session run.
                                  Informatica

•   If a session enabled for recovery fails, you can run the session in recovery mode. The
    PowerCenter Server moves a recovery session through the states of a normal session:
    scheduled, waiting, running, succeeded, and failed. When the PowerCenter Server starts
    the recovery session, it runs all pre-session tasks.
•   For relational normal load targets, the PowerCenter Server performs incremental load
    recovery. It uses the recovery information created during the normal session run to
    determine the point at which the session stopped committing data to the target. It then
    continues writing data to the target. On successful recovery, the PowerCenter Server
    removes the recovery information from the tables.
•   For example, if the PowerCenter Server commits 10,000 rows before the session fails,
    when you run the session in recovery mode, the PowerCenter Server bypasses the rows
    up to 10,000 and starts loading with row 10,001.
•   If the session writes to a relational target in bulk mode, the PowerCenter Server
    performs the entire writer run. If the Truncate Target Table option is enabled in the
    session properties, the PowerCenter Server truncates the target before loading data
•   In some cases, you cannot perform recovery for a session. There may also be
    circumstances that cause a recovery session to fail or produce inconsistent data. If you
    cannot recover a session, you can run the session again.
                                       Informatica

•   You cannot run sessions in recovery mode under the following circumstances:
•   You change the number of partitions. If you change the number of partitions after the session fails,
    the recovery session fails.
•   Recovery table is empty or missing from the target database. The PowerCenter Server fails the
    recovery session under the following circumstances:
•   You deleted the table after the PowerCenter Server created it.
•   The session enabled for recovery succeeded, and the PowerCenter Server removed the recovery
    information from the table.
•   Recovery cache file is missing. The PowerCenter Server fails the recovery session if the recovery
    cache file is missing from the PowerCenter Server cache directory
•   You might get inconsistent data if you perform recovery under the following circumstances:
•   You change the partitioning configuration. If you change any partitioning options after the session
    fails, you may get inconsistent data.
•   Source data is not sorted. To perform a successful recovery, the PowerCenter Server must process
    source rows during recovery in the same order it processes them during the initial session. Use the
    Sorted Ports option in the Source Qualifier transformation or add a Sorter transformation directly
    after the Source Qualifier transformation.
•   The sources or targets change after the initial session failure. If you drop or create indexes, or edit
    data in the source or target tables before recovering a session, the PowerCenter Server may return
    missing or repeat rows.
                                      Informatica

•   The session writes to a relational target in bulk mode, but the session is not configured to
    truncate the target table. The PowerCenter Server may load duplicate rows to the target during the
    recovery session.
•   The mapping uses a Normalizer transformation. The Normalizer transformation generates source
    data in the form of primary keys. Recovering a session might generate different values than if the
    session completed successfully. However, the PowerCenter Server will continue to produce unique
    key values.
•   The mapping uses a Sequence Generator transformation. The Sequence Generator transformation
    generates source data in the form of sequence values. Recovering a session might generate different
    values than if the session completed successfully.
•   If you want to ensure the same sequence data is generated during the recovery session, you can reset
    the value specified as the Current Value in the Sequence Generator transformation properties to the
    same value used when you ran the failed session. If you do not reset the Current Value, the
    PowerCenter Server will continue to generate unique Sequence values.
•   The session performs incremental aggregation and the PowerCenter Server stops unexpectedly.
    If the PowerCenter Server stops unexpectedly while running an incremental aggregation session, the
    recovery session cannot use the incremental aggregation cache files. Rename the backup cache files
    for the session from PMAGG*.idx.bak and PMAGG*.dat.bak to PMAGG*.idx and PMAGG*.dat
    before you perform recovery.
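•   A minimal sketch of the rename, assuming a UNIX shell and that the incremental aggregation cache
    files reside in $PMCacheDir:
•   for f in $PMCacheDir/PMAGG*.bak; do mv "$f" "${f%.bak}"; done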
•   The PowerCenter Server data movement mode changes after the initial session failure. If you
    change the data movement mode before recovering the session, the PowerCenter Server might return
    incorrect data
                      Informatica

• The PowerCenter Server code page or source and
  target code pages change after the initial session
  failure. If you change the source, target, or PowerCenter
  Server code pages, the PowerCenter Server might return
  incorrect data. You can perform recovery if the new code
  pages are two-way compatible with the original code
  pages.
• The PowerCenter Server runs in Unicode mode and
  you change the session sort order. When the
  PowerCenter Server runs in Unicode mode, it sorts
  character data based on the sort order selected for the
  session. Do not perform recovery if you change the session
  sort order after the session fails.
                               Informatica

•   CACHES
•   The PowerCenter Server creates index and data caches in memory for
    Aggregator, Rank, Joiner, and Lookup transformations in a mapping
•   When you create a session with multiple partitions, the PowerCenter Server
    can partition caches for the Aggregator, Joiner, Lookup, and Rank
    transformations. It creates a separate cache for each partition, and each
    partition works with only the rows needed by that partition. As a result, the
    PowerCenter Server requires only a portion of total cache memory for each
    partition. When you run a session, the PowerCenter Server accesses the cache
    in parallel for each partition. If you do not use cache partitioning, the
    PowerCenter Server accesses the cache serially for each partition.
•   After you configure the session for partitioning, you can configure memory
    requirements and cache directories for each transformation in the
    Transformations view on the Mapping tab of the session properties. To
    configure the memory requirements, calculate the total requirements for a
    transformation, and divide by the number of partitions
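•   For example (illustrative figures): if an Aggregator transformation requires about 24,000,000 bytes of
    data cache and 2,000,000 bytes of index cache in total, and the session uses four partitions, configure
    roughly 6,000,000 bytes of data cache and 500,000 bytes of index cache for each partition.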
                                       Informatica

•   PARAMETERS :
•   You can use a parameter file to define the values for parameters and variables used in a workflow,
    worklet, or session. You can create a parameter file using a text editor such as WordPad or Notepad.
    You list the parameters or variables and their values in the parameter file. Parameter files can contain
    the following types of parameters and variables:
•   Workflow variables
•   Worklet variables
•   Session parameters
•   Mapping parameters and variables
•   You can place parameter files on the PowerCenter Server machine or on a local machine. Use a local
    parameter file if you do not have access to parameter files on the PowerCenter Server machine. When
    you use a local parameter file, pmcmd passes variables and values in the file to the PowerCenter
    Server. Local parameter files are used with the startworkflow pmcmd command. For more
    information, see pmcmd Reference.
•   You must define session parameters in a parameter file. Since session parameters do not have default
    values, when the PowerCenter Server cannot locate the value of a session parameter in the parameter
    file, it fails to initialize the session.
•   You can include parameter or variable information for more than one workflow, worklet, or session
    in a single parameter file by creating separate sections for each object within the parameter file.
                                       Informatica
•   You can also create multiple parameter files for a single workflow, worklet, or session and change
    the file that these tasks use as needed. To specify the parameter file the PowerCenter Server uses with
    a workflow, worklet, or session, you can do either of the following:
•   Enter the parameter file name and directory in the workflow, worklet, or session properties.
•   Start the workflow, worklet, or session using pmcmd and enter the parameter filename and directory
    in the command line
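•   For example, a hedged command line sketch (the folder, path, and workflow names are hypothetical,
    and the -f and -paramfile options should be verified against your pmcmd version):
•   pmcmd startworkflow -s serveraddress:portno -u YourUsername -p YourPassword -f FolderName
    -paramfile /opt/infa/params/wf_params.txt wFlow4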
•   You can define the following heading formats:
•   Workflow variables:
•   [folder name.WF:workflow name]
•   Worklet variables:
•   [folder name.WF:workflow name.WT:worklet name]
•   Worklet variables in nested worklets:
•   [folder name.WF:workflow name.WT:worklet name.WT:worklet name...]
•   Session parameters, plus mapping parameters and variables:
•   [folder name.WF:workflow name.ST:session name]
•   or
•   [folder name.session name]
•   Examples of parameters and variables (type, name, sample value):
•   String mapping parameter: $$State = MA
•   Datetime mapping variable: $$Time = 10/1/2000 00:00:00
•   Source file (session parameter): $InputFile1 = Sales.txt
•   Database connection (session parameter): $DBConnection_Target = Sales (database connection)
•   Session log file (session parameter): $PMSessionLogFile = d:/session logs/firstrun.txt
                                      Informatica
•   sample file :
•   The following text is an excerpt from a parameter file:
•   [HET_TGTS.WF:wf_TCOMMIT_INST_ALIAS]
•   $$platform=unix
•   [HET_TGTS.WF:wf_TGTS_ASC_ORDR.ST:s_TGTS_ASC_ORDR]
•   $$platform=unix
•   $DBConnection_ora=qasrvrk2_hp817
•   [ORDERS.WF:wf_PARAM_FILE.WT:WL_PARAM_Lvl_1]
•   $$DT_WL_lvl_1=02/01/2000 00:00:00
•   $$Double_WL_lvl_1=2.2
•   [ORDERS.WF:wf_PARAM_FILE.WT:WL_PARAM_Lvl_1.WT:NWL_PARAM_Lvl_2]
•   $$DT_WL_lvl_2=03/01/2000 00:00:00
•   $$Int_WL_lvl_2=3
•   $$String_WL_lvl_2=ccccc
•   You can specify the parameter filename and directory in the workflow or session properties.
•   To enter a parameter file in the workflow properties:
•   Select Workflows-Edit.
•   Click the Properties tab.
•   Enter the parameter directory and name in the Parameter Filename field
•   To enter a parameter file in the session properties:
•   Click the Properties tab and open the General Options settings.
•   Enter the parameter directory and name in the Parameter Filename field
•   In the parameter file, folder and session names are case-sensitive. Make sure to enter folder and
    session names exactly as they appear in the Workflow Manager. Also, use the appropriate prefix for
    all user-defined session parameters
                                         Informatica
•   I am trying to use a source file parameter to specify a source file and location, but the
    PowerCenter Server cannot find the source file.
•   Make sure to clear the source file directory in the session properties. The PowerCenter Server
    concatenates the source file directory with the source file name to locate the source file.
•   Also, make sure to enter a directory local to the PowerCenter Server and to use the appropriate
    delimiter for the operating system.
•   I am trying to run a workflow with a parameter file and one of the sessions keeps failing.
•   The session might contain a parameter that is not listed in the parameter file. The PowerCenter Server
    uses the parameter file to start all sessions in the workflow. Check the session properties, then verify
    that all session parameters are defined correctly in the parameter file.
•   You can configure a session to use DB2, Oracle, Sybase IQ, and Teradata external loaders to load
    session target files into the respective databases. External Loaders can increase session performance
    since these databases can load information directly from files faster than they can run the SQL
    commands to insert the same data into the database.
•   To use an external loader for a session, you must perform the following tasks:
•   Create an external loader connection in the Workflow Manager and configure the external loader
    attributes.
•   Configure the session to write to a flat file instead of to a relational database.
•   Choose an external loader connection for each target file in the session properties.
•   When you run a session that uses an external loader, the PowerCenter Server creates a control file
    and a target flat file. The control file contains information about the target flat file such as data format
    and loading instructions for the external loader. The control file has an extension of .ctl. You can
    view the control file and the target flat file in the target file directory (default: $PMTargetFileDir).
                                        Informatica
•   Before you run external loaders, consider the following issues:
•   Disable constraints. Normally, you disable constraints built into the tables receiving the data before
    performing the load. Consult your database documentation for instructions on how to disable
    constraints.
•   Performance issues. To preserve high performance, you can increase commit intervals and turn off
    database logging. However, to perform database recovery on failed sessions, you must have database
    logging turned on.
•   Code page requirements. DB2, Oracle, Sybase IQ, and Teradata database servers must run in the
    same code page as the target flat file code page. The external loaders start in the target flat file code
    page. The PowerCenter Server creates the control and target flat files using the target flat file code
    page. If you are using a code page other than 7-bit ASCII for the target flat file, run the PowerCenter
    Server in Unicode data movement mode.
•   The PowerCenter Server can use multiple external loaders within one session. For example, if the
    mapping contains two targets, you can create a session that uses different connection types: one uses
    an Oracle external loader connection and the other uses a Sybase IQ external loader connection.
•   The behavior of the external loader depends on how you choose to load the data. You can load data in
    the following ways:
•   Loading to named pipes. When you load data to named pipes, the external loader starts to load data
    to the target database as soon as the data appears in the named pipe.
•   Staging data using flat files. When you stage data in flat files, the external loader starts to load data
    to the target databases only after the PowerCenter Server completes writing to the target flat files.
•   The Oracle SQL loader can perform insert, update, and delete operations on targets. The target flat
    file for an Oracle external loader can be fixed-width or delimited.
                                     Informatica
•   pmcmd is a program that you can use to communicate with the PowerCenter Server. You can perform
    some of the tasks that you can also perform in the Workflow Manager such as starting and stopping
    workflows and tasks.
•   You can use pmcmd in the following modes:
•   Command line mode. The command line syntax allows you to write scripts for scheduling
    workflows. Each command you write in the command line mode must include connection
    information to the PowerCenter Server.
•   Interactive mode. You establish and maintain an active connection to the PowerCenter Server. This
    allows you to issue a series of commands.
•   PMCMD
•   Configuring PM_HOME
•   Use the PM_HOME variable to start pmcmd from a directory other than the install directory. On
    UNIX, point the PM_HOME and PATH environment variables to the PowerCenter Server
    installation directory. On Windows, include the PowerCenter Server install directory in the
    environment path.
•   Each command must include the connection information to the PowerCenter Server and the
    PowerCenter repository. For example, to start a workflow named wFlow4 in the command line mode,
    use the following syntax:
•   pmcmd startworkflow -s serveraddress:portno -u YourUsername -p YourPassword wFlow4
•   When you work in the command line mode, pmcmd indicates the success or failure of a command
    with a return code. Return code (0) indicates that the command succeeded. Any other return code
    indicates that the command failed.
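•   For example, a minimal UNIX shell sketch that reuses the command above and checks the return
    code:
•   pmcmd startworkflow -s serveraddress:portno -u YourUsername -p YourPassword wFlow4
•   if [ $? -ne 0 ]; then echo "wFlow4 failed to start"; exit 1; fi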
•   Use pmcmd in the interactive mode to start and stop workflows and tasks without writing a script.
    Once you establish a dedicated connection to the PowerCenter Server, you can issue commands
    without specifying the connection information. For example, to start the workflow wFlow4 in the
    interactive mode, type the following at the pmcmd prompt:
                                       Informatica
•   pmcmd> startworkflow wFlow4
•   Some of the pmcmd commands:
•   aborttask, abortworkflow, connect, disconnect, getrunningsessionsdetails, getserverdetails,
    getserverproperties, getsessionstatistics, gettaskdetails, getworkflowdetails,
    pingserver, quit, resumeworkflow,
    resumeworklet, scheduleworkflow, setfolder, setnowait,
    setwait, showsettings, shutdownserver, starttask,
    startworkflow, stoptask, stopworkflow, unscheduleworkflow,
    unsetfolder, version, waittask, waitworkflow
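•   For example, a sketch of an interactive session using some of the commands above (the folder name
    is hypothetical, and option spellings may vary slightly by pmcmd version):
•   pmcmd> connect -s serveraddress:portno -u YourUsername -p YourPassword
•   pmcmd> setfolder FolderName
•   pmcmd> startworkflow wFlow4
•   pmcmd> waitworkflow wFlow4
•   pmcmd> disconnect
•   pmcmd> quit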
•   To start pmcmd from any directory on Windows:
•   In the system properties, add the installation directory to the path variable. For example, on Windows
    2000, configure the path variable in System settings. Click the Environment tab to select the path
    variable and add the installation directory to the variable value.
                                        Informatica
•   Repository Security
•   Each repository user needs a user name and password to access the repository. PowerCenter uses a
    security module to verify user log in. The security module supports the following types of user
    authentication:
•   PowerCenter default authentication
•   Lightweight Directory Access Protocol (LDAP) authentication
•   PowerCenter default authentication enables you to create users and maintain passwords in the
    repository. The security module verifies users against these user names and passwords.
•   LDAP defines a network protocol for accessing a directory service. If you use LDAP to authenticate
    users, the repository security module passes a user login to the external directory for authentication.
    To accomplish this, the repository maintains an association between repository user names and
    external login names. The repository administrator maintains the user name-login associations, but
    does not maintain user passwords in the repository.
•   When you create a repository, the repository creates two default users:
•   Administrator. The default password for Administrator is Administrator.
•   Database user. The database user name and password used when you created the repository.
•   You can grant various security privileges to a user from the Repository Manager, such as the
    privileges to use the Designer, the Workflow Manager, and the Workflow Monitor.
•   Handling User Connections
•   Sometimes, the Repository Server does not disconnect the user from the repository. The repository
    has a residual connection if the connection remains in the User Connections dialog box when the
    repository client application or machine is actually shut down. This can happen in the following
    situations:
•   Network problems occur.
•   A PowerCenter Client or PowerCenter Server shuts down improperly.
•   A residual repository connection also retains all repository locks associated with the connection. Use
    the Repository Manager to terminate residual user connections and release residual locks in the
    repository.
                                      Informatica
•   The PowerCenter Server opens additional connections to the repository for each Load Manager and
    DTM process. If you terminate a repository connection to a PowerCenter Server, the Repository
    Server also terminates all other connections to the same PowerCenter Server
•   Warning: Terminating an active connection may cause repository inconsistencies. Terminate
    residual connections only.
•   To terminate a residual connection:
•   Launch the Repository Manager and connect to the repository.
•   Choose Edit-Show User Connections.
•   In the Repository Connections dialog box, select the connection you want to terminate.
•   Verify the user is no longer connected to the repository.
•   Click End Connection.
•   The Repository Server closes the user connection.
•   Click Close to return to the Repository Manager
•   The repository uses locks to prevent users from duplicating or overriding work.
•   The Repository Server creates the following types of locks on repository objects when you view, edit,
    or execute them in a workflow:
•   In-use lock. Placed on objects you want to view.
•   Write-intent lock. Placed on objects you want to modify.
•   Execute lock. Locks objects you want to run or execute, such as workflows and sessions
                                       Informatica
•   Releasing Locks
•   If a system or network problem causes the repository to retain residual locks, you may need to unlock
    an object before using it.
•   Warning: Before unlocking any object, note which user owns the lock, verify that the object is not
    being used and verify that the user is not connected to the repository.
•   You can unlock objects and folders by identifying the user connection associated with the lock and
    terminating the user connection.
•   You do not need special privileges to terminate connections associated with your repository user
    name. For information on the privileges and permissions to terminate other user connections and
    unlock objects see Repository Manager Tasks.
•   To unlock a folder or repository object:
•   Open the Repository Manager and connect to the repository.
•   View object locks and identify the connection identification number associated with the locks you
    want to release.
•   After you view the object locks, choose Edit-Show User Connections.
•   In the User Connections dialog box, match the connection identification number from the Object
    Locks dialog box with the connection identification number in the User Connections dialog box.
•   Verify the user is not connected to the repository.
•   Select the user connection and click End Connection.
•   The Repository Server closes the user connection and releases all locks associated with the
    connection.
•   Click Close to return to the Repository Manager.
•   Informatica Metadata Exchange (MX) provides a set of relational views that allow easy SQL access
    to the Informatica metadata repository. The Repository Manager generates these views when you
    create or upgrade a repository.
•   Warning: The PowerCenter repository tables have an open architecture. Although you can view the
    repository tables, Informatica strongly advises against altering the tables or data within the tables.
    Informatica is not responsible for corrupted data that is caused by customer alteration of the
    repository tables or data within those tables. Therefore, do not directly access the actual repository
    tables. Instead, use MX to access the repository
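•   For example, a minimal sketch of querying metadata through MX rather than the repository tables
    (this assumes the REP_SESS_LOG view is among the MX views generated for your repository
    version):
•   SELECT * FROM REP_SESS_LOG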
                                       Informatica
•   PARTITION
•   A partition is a pipeline stage that executes in a single reader, transformation, or writer thread. By
    default, the PowerCenter Server defines a single partition in the source pipeline. If you purchase the
    Partitioning option, you can increase the number of partitions. This increases the number of
    processing threads, which can improve session performance
•   You can generally define up to 64 partitions at any partition point. However, there are situations in
    which you can define only one partition in the pipeline
•   Increasing the number of partitions or partition points increases the number of threads. Therefore,
    increasing the number of partitions or partition points also increases the load on the server machine.
    If the server machine contains ample CPU bandwidth, processing rows of data in a session
    concurrently can increase session performance. However, if you create a large number of partitions or
    partition points in a session that processes large amounts of data, you can overload the system
•   Partition Types
•   When you configure the partitioning information for a pipeline, you must specify a partition type at
    each partition point in the pipeline. The partition type determines how the PowerCenter Server
    redistributes data across partition points.
•   The Workflow Manager allows you to specify the following partition types:
•   Round-robin. The PowerCenter Server distributes data evenly among all partitions. Use round-robin
    partitioning where you want each partition to process approximately the same number of rows.
•   Hash. The PowerCenter Server applies a hash function to a partition key to group data among
    partitions. If you select hash auto-keys, the PowerCenter Server uses all grouped or sorted ports as
    the partition key. If you select hash user keys, you specify a number of ports to form the partition
    key. Use hash partitioning where you want to ensure that the PowerCenter Server processes groups of
    rows with the same partition key in the same partition. For more information, see Hash Keys
                                       Informatica
•   Key range. You specify one or more ports to form a compound partition key. The PowerCenter
    Server passes data to each partition depending on the ranges you specify for each port. Use key range
    partitioning where the sources or targets in the pipeline are partitioned by key range. For more
    information, see Key Range Partition Type.
•   Pass-through. The PowerCenter Server passes all rows at one partition point to the next partition
    point without redistributing them. Choose pass-through partitioning where you want to create an
    additional pipeline stage to improve performance, but do not want to change the distribution of data
    across partitions. For more information, see Pass-Through Partition Type.
•   Database partitioning. The PowerCenter Server queries the IBM DB2 system for table partition
    information and loads partitioned data to the corresponding nodes in the target database.
•   When you create or edit a session, you can change the partitioning information for each pipeline in a
    mapping. If the mapping contains multiple pipelines, you can specify multiple partitions in some
    pipelines and single partitions in others. You update partitioning information using the Partitions
    view on the Mapping tab in the session properties.
•   You can configure the following information in the Partitions view on the Mapping tab:
•   Add and delete partition points.
•   Enter a description for each partition.
•   Specify the partition type at each partition point.
•   Add a partition key and key ranges for certain partition types.
•   Rules for Adding and Deleting Partition Points
•   You can add and delete partition points at other transformations in the pipeline according to the
    following rules:
•   You cannot create partition points at source instances.
•   You cannot create partition points at Sequence Generator transformations or unconnected
    transformations.
•   You can add partition points at any other transformation provided that no partition point receives
    input from more than one pipeline stage.
•   Note: We do not see the Partitions view here because we do not have the Partitioning option licensed.
                                       Informatica
•   SERVER GRID OR CLUSTER
•   You can register and run multiple PowerCenter Servers against a local or global repository. When
    you register multiple PowerCenter Servers to the same repository, you can distribute the workload
    across the servers to increase performance.
•   You have the following options to run workflows and sessions using multiple servers:
•   Use a server grid to run workflows. You can use a server grid to automate the distribution of
    sessions. A server grid is a server object that distributes sessions in a workflow to servers based on
    server availability. The grid maintains connections to multiple servers in the grid. For more
    information about using server grids, see Working with Server Grids.
•   Change the assigned server for a workflow. When you configure a workflow, you assign a server
    to run that workflow. Each time the scheduled workflow runs, it runs on the assigned server. You can
    change the assigned server for a workflow in the workflow properties.
•   Change the assigned server for a session. When you configure a session, by default it runs on the
    server assigned to the workflow. You can change the assigned server for a session in the session
    properties
•   By default, each PowerCenter Server you add to the server grid can be both a master server and a
    worker server. Each server accepts tasks from the grid. You can configure a server to be only a
    master server by clearing Accept task from Server Grid. A PowerCenter Server that is only a master
    server does not run sessions from other servers in the grid, but it can distribute sessions to other
    servers in the grid.
•   Server Grid Guidelines and Requirements
•   Informatica recommends that each PowerCenter Server in a server grid uses the same operating
    system. While you can specify different session log directories, workflow log directories, and temp
    directories for the PowerCenter Servers, each PowerCenter Server in a server grid must meet the
    following requirements:
•   Register each PowerCenter Server to the same repository.
•   Use the same database connectivity for each PowerCenter Server.
•   Use the same server variables for each server in a grid, except for the $PMTempDir,
    $PMSessionLogDir, and $PMWorkflowLogDir variables.
                                   Informatica
•   Use the same cache directory.
•   Configure the following PowerCenter Server parameters the same:
•   Fail session if maximum number of concurrent sessions is reached
•   PMServer 4.0 date handling compatibility
•   Aggregate treat null as zero
•   Aggregate treat rows as insert
•   Treat CHAR as CHAR on read
•   Data Movement Mode
•   Validate Data Code Pages
•   Output Session Log In UTF8
•   Export Session Log Lib Name
•   Treat Null in comparison operator as
•   Data Display Format
•   PowerCenter Servers must be the same product version.
•   DB2 EEE loader must be on the same machine as PowerCenter Server
                                       Informatica
•   ERROR LOG
•   When you configure a session, you can choose to log row errors in a central location. When a row
    error occurs, the PowerCenter Server logs error information that allows you to determine the cause
    and source of the error. The PowerCenter Server logs information such as source name, row ID,
    current row data, transformation, timestamp, error code, error message, repository name, folder
    name, session name, and mapping information
•   The PowerCenter Server generates the following tables to help you track row errors:
•   PMERR_DATA. Stores data and metadata about a transformation row error and its corresponding
    source row.
•   PMERR_MSG. Stores metadata about an error and the error message.
•   PMERR_SESS. Stores metadata about the session.
•   PMERR_TRANS. Stores metadata about the source and transformation ports, such as name and
    datatype, when a transformation error occurs.
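•   After a session run with error logging enabled, you can query these tables directly; a minimal sketch
    (column names vary by PowerCenter version, so adjust the queries to your schema):
•   SELECT * FROM PMERR_SESS
•   SELECT * FROM PMERR_MSG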
•   You configure error logging for each session in a workflow. You can find error handling options in
    the Config Object tab of the sessions properties.
•   To configure error logging options:
•   Double-click the Session task to open the session properties.
•   Select the Config Object tab.
•   Choose error handling options.
•   Error Log Type : Specifies the type of error log to create. You can specify relational database, flat
    file, or no log. By default, the PowerCenter Server does not create an error log
•   Log Row Data : Specifies whether or not to log transformation row data. By default, the
    PowerCenter Server logs transformation row data. If you disable this property, N/A or -1 appears in
    transformation row data fields.
•   Log Source Row Data : If you choose not to log source row data, or if source row data is
    unavailable, the PowerCenter Server writes an indicator such as N/A or -1, depending on the column
    datatype.
•   If you do not need to capture source row data, consider disabling this option to increase PowerCenter
    Server performance.
                                        Informatica
•   PowerCenter allows you to control commit and rollback transactions based on a set of rows that pass
    through a Transaction Control transformation. A transaction is the set of rows bound by commit or
    rollback rows. You can define a transaction based on a varying number of input rows. You might
    want to define transactions based on a group of rows ordered on a common key, such as employee ID
    or order entry date.
•   TRANSACTION CONTROL
•   In PowerCenter, you define transaction control at two levels:
•   Within a mapping. Within a mapping, you use the Transaction Control transformation to define a
    transaction. You define transactions using an expression in a Transaction Control transformation.
    Based on the return value of the expression, you can choose to commit, roll back, or continue without
    any transaction changes.
•   Within a session. When you configure a session, you configure it for user-defined commit. You can
    choose to commit or roll back a transaction if the PowerCenter Server fails to transform or write any
    row to the target
•   Enter the transaction control expression in the Transaction Control Condition field. The transaction
    control expression uses the IIF function to test each row against the condition. Use the following
    syntax for the expression:
•   IIF (condition, value1, value2)
•   The expression contains values that represent actions the PowerCenter Server performs based on the
    return value of the condition. The PowerCenter Server evaluates the condition on a row-by-row basis.
    The return value determines whether the PowerCenter Server commits, rolls back, or makes no
    transaction changes to the row. When the PowerCenter Server issues a commit or rollback based on
    the return value of the expression, it begins a new transaction. Use the following built-in variables in
    the Expression Editor when you create a transaction control expression:
•   TC_CONTINUE_TRANSACTION. The PowerCenter Server does not perform any transaction
    change for this row. This is the default value of the expression.
•   TC_COMMIT_BEFORE. The PowerCenter Server commits the transaction, begins a new
    transaction, and writes the current row to the target. The current row is in the new transaction.
                                        Informatica
•   TC_ROLLBACK_BEFORE. The PowerCenter Server rolls back the current transaction, begins a
    new transaction, and writes the current row to the target. The current row is in the new transaction.
•   TC_ROLLBACK_AFTER. The PowerCenter Server writes the current row to the target, rolls back
    the transaction, and begins a new transaction. The current row is in the rolled back transaction.
•   If the transaction control expression evaluates to a value other than commit, rollback, or continue, the
    PowerCenter Server fails the session.
•   Example
•   Suppose you want to use transaction control to write order information based on the order entry date.
    You want to ensure that all orders entered on any given date are committed to the target in the same
    transaction. To accomplish this, you can create a mapping with the following transformations:
•   Sorter transformation. Sort the source data by order entry date.
•   Expression transformation. Use local variables to determine whether the date entered is a new date.
•   The ports of the Expression transformation are sketched after this list.
•   Transaction Control transformation. Create the following transaction control expression to
    commit data when the PowerCenter Server encounters a new order entry date:
•   IIF(NEW_DATE = 1, TC_COMMIT_BEFORE, TC_CONTINUE_TRANSACTION)
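•   A minimal sketch of the Expression transformation logic referenced above (the port names are
    assumptions; variable ports are evaluated top to bottom, so v_DATE_OLD still holds the previous
    row's date when NEW_DATE is computed):
•   DATE_ENTERED (input/output port): the order entry date from the Sorter transformation
•   v_DATE_OLD (variable port)     = v_DATE_CURRENT
•   v_DATE_CURRENT (variable port) = DATE_ENTERED
•   NEW_DATE (output port)         = IIF(v_DATE_OLD = v_DATE_CURRENT, 0, 1)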
•   A Transaction Control transformation may be effective for one target and ineffective for another
    target.
•   If each target is connected to an effective Transaction Control transformation, the mapping is valid. If
    one target in the mapping is not connected to an effective Transaction Control transformation, the
    mapping is invalid.
•   You must connect each target instance to a Transaction Control transformation. You can connect
    multiple targets to a single Transaction Control transformation. You can connect only one effective
    Transaction Control transformation to a target. You cannot place a Transaction Control
    transformation in a pipeline branch that starts with a Sequence Generator transformation. If you use a
    dynamic Lookup transformation and a Transaction Control transformation in the same mapping, a
    rolled-back transaction might result in unsynchronized target data
                                      Informatica
•   A mapplet is a group of reusable transformations. The main purpose of using a mapplet is to hide the
    logic from the end user's point of view. It works like a function in C: you can reuse it any number of
    times. It is a reusable object.
•   A reusable transformation is a single transformation.
•   You can create one using two methods:
•     1. Using the Transformation Developer.
•     2. Creating a normal transformation and promoting it to reusable.
•   API for custom transformations
•   PowerCenter provides a set of API functions that you can use to develop your transformation logic.
    When the Designer generates the source code files, it includes the generated functions in the source
    code. Add API functions to your code to implement the transformation logic. The procedure uses the
    API functions to interface with the PowerCenter Server. You must code API functions in the
    procedure C file. Optionally, you can also code the module C file
                                       Informatica
•   EXTERNAL PROCEDURE TRANSFORMATIONS
•   External Procedure transformations operate in conjunction with procedures you create outside of the
    Designer interface to extend PowerCenter functionality
•   If you are an experienced programmer, you may want to develop complex functions within a
    dynamic link library (DLL) or UNIX shared library, instead of creating the necessary Expression
    transformations in a mapping.
•   To obtain this kind of extensibility, you can use the Transformation Exchange (TX) dynamic
    invocation interface built into PowerCenter. Using TX, you can create an Informatica External
    Procedure transformation and bind it to an external procedure that you have developed. You can bind
    External Procedure transformations to two kinds of external procedures:
•   COM external procedures (available on Windows only)
•   Informatica external procedures (available on Windows, AIX, HP-UX, Linux, and Solaris)
•   To use TX, you must be an experienced C, C++, or Visual Basic programmer.
•   You can use multi-threaded code in external procedures.
•   There are two components to TX: external procedures and External Procedure transformations.
•   As its name implies, an external procedure exists separately from the PowerCenter Server. It consists
    of C, C++, or Visual Basic code written by a user to define a transformation. This code is compiled
    and linked into a DLL or shared library, which is loaded by the PowerCenter Server at runtime. An
    external procedure is “bound” to an External Procedure transformation.
•   An External Procedure transformation is created in the Designer. It is an object that resides in the
    Informatica repository and serves several purposes:
                                        Informatica
•   It contains the metadata describing the external procedure. It is through this metadata that
    the PowerCenter Server knows the “signature” (number and types of parameters, type of return value,
    if any) of the external procedure.
•   It allows an external procedure to be referenced in a mapping. By adding an instance of an External
    Procedure transformation to a mapping, you call the external procedure bound to that transformation.
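•   The binding idea, compiled code in a shared library that a host process loads and calls through a
    declared signature, can be illustrated outside PowerCenter. The Python sketch below (illustrative
    only, not the TX interface; it assumes a POSIX system with a standard C math library) loads a shared
    library at runtime and calls a function from it, which is conceptually what the PowerCenter Server
    does with the DLL or shared library bound to an External Procedure transformation:

      import ctypes
      import ctypes.util

      # Locate and load a shared library at runtime (libm.so on Linux, libm.dylib on macOS).
      libm_path = ctypes.util.find_library("m")
      libm = ctypes.CDLL(libm_path)

      # Declare the procedure's "signature" (argument and return types), the same kind of
      # metadata an External Procedure transformation stores for its bound procedure.
      libm.cos.argtypes = [ctypes.c_double]
      libm.cos.restype = ctypes.c_double

      print(libm.cos(0.0))   # 1.0: the host process calling into external compiled code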

•   PMLIC
•   Informatica provides the pmlic command line program to manage all license file types. You can
    create, update, or view the file contents. You can create or update license files using a license key or
    another license file.
•   Run pmlic from the PowerCenter Server or Repository Server installation directory. For example,
    when the PowerCenter Server runs on a Windows machine, run pmlic from the PowerCenter Server
    installation directory.
•   pmlic uses the following commands:
•   Update. Updates or creates a license file with the license keys you specify.
•   Display. Displays the license keys in a license file, including their properties, such as type and
    expiration date.
•   The command line mode invokes and exits pmrep or pmrepagent each time you issue a command.
    The command line mode is useful if you want to run pmrep or pmrepagent commands through batch
    files, scripts, or other programs. For more information on scripting pmrep or pmrepagent commands,
    see the PowerCenter documentation.
                                  Informatica
•   PMREP
•   To run pmrep commands in command line mode:
•   At the command prompt, switch to the directory where the pmrep executable is located.
•   Enter pmrep followed by the command name and its required options and arguments.
•   At the command prompt, type the following command to run pmrep:
•   pmrep command_name [-option1] argument_1 [-option2] argument_2...
•   The first time you use pmrep in either command line or interactive mode, you must call
    the Connect command. All commands require a connection to the repository.
•   Connect uses the following command syntax:
•   connect -r repository_name -n repository_username -x repository_password -X
    repository_password_environment_variable -h repserver_host_name -o
    repserver_port_number
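•   Because command line mode starts and exits pmrep on every call, it is easy to drive from a script.
    The following Python sketch issues the Connect command shown above; the repository name, user,
    environment variable, host, and port are placeholders, and it assumes pmrep is on the PATH:

      import subprocess

      def run_pmrep(*args):
          """Invoke pmrep once in command line mode and raise an error if the command fails."""
          result = subprocess.run(["pmrep", *args], capture_output=True, text=True)
          print(result.stdout)
          if result.returncode != 0:
              raise RuntimeError(result.stderr or "pmrep command failed")

      # Connect must be called before any other pmrep command. All values are placeholders.
      run_pmrep("connect",
                "-r", "DEV_REPO",        # repository_name
                "-n", "repo_user",       # repository_username
                "-X", "REPO_PASSWORD",   # environment variable holding the repository password
                "-h", "repserver_host",  # Repository Server host name
                "-o", "5001")            # Repository Server port number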
                                         Informatica
•   The first step in performance tuning is to identify the performance bottleneck. Performance
    bottlenecks can occur in the source and target databases, the mapping, the session, and the system.
    Generally, you should look for performance bottlenecks in the following order:
•   Target
•   Source
•   Mapping
•   Session
•   System

•   If your session writes to a flat file target, you can optimize session performance by writing to a flat file
    target that is local to the PowerCenter Server. If your session writes to a relational target, consider
    performing the following tasks to increase performance:
•   Drop indexes and key constraints.
•   Increase checkpoint intervals.
•   Use bulk loading.
•   Use external loading.
•   Increase database network packet size.
•   Optimize Oracle target databases.
                                        Informatica
•   Increasing Checkpoint Intervals
•   PowerCenter Server performance slows each time the server waits for the database to perform a
    checkpoint. To increase performance, consider increasing the database checkpoint interval. When
    you increase the database checkpoint interval, you increase the likelihood that the database performs
    checkpoints as necessary, when the size of the database log file reaches its limit.
•   For details on specific database checkpoints, checkpoint intervals, and log files, consult your database
    documentation.
•   Bulk Loading
•   You can use bulk loading to improve the performance of a session that inserts a large amount of data
    to a DB2, Sybase, Oracle, or Microsoft SQL Server database. Configure bulk loading on the Mapping
    tab.
•   When bulk loading, the PowerCenter Server bypasses the database log, which speeds performance.
    Without writing to the database log, however, the target database cannot perform rollback. As a
    result, you may not be able to perform recovery. Therefore, you must weigh the importance of
    improved session performance against the ability to recover an incomplete session.
•   For more information on configuring bulk loading, see Bulk Loading.
•   External Loading
•   You can use the External Loader session option to integrate external loading with a session.
•   If you have a DB2 EE or DB2 EEE target database, you can use the DB2 EE or DB2 EEE external
    loaders to bulk load target files. The DB2 EE external loader uses the PowerCenter Server db2load
    utility to load data. The DB2 EEE external loader uses the DB2 Autoloader utility.
•   If you have a Teradata target database, you can use the Teradata external loader utility to bulk load
    target files.
•   If your target database runs on Oracle, you can use the Oracle SQL*Loader utility to bulk load target
    files. When you load data to an Oracle database using a pipeline with multiple partitions, you can
    increase performance if you create the Oracle target table with the same number of partitions you use
    for the pipeline.
•   If your target database runs on Sybase IQ, you can use the Sybase IQ external loader utility to bulk
    load target files; performance improves further when the Sybase IQ database is local to the
    PowerCenter Server.
                                       Informatica
•   Increasing Database Network Packet Size
•   You can increase the network packet size in the Informatica Workflow Manager to reduce a target
    bottleneck. For Sybase and Microsoft SQL Server, increase the network packet size to 8K - 16K. For
    Oracle, increase the network packet size in tnsnames.ora and listener.ora. If you increase the network
    packet size in the PowerCenter Server configuration, you also need to configure the database server
    network memory to accept larger packet sizes.
•   See your database documentation about optimizing database network packet size.
•   Optimizing Oracle Target Databases
•   If your target database is Oracle, you can optimize the target database by checking the storage clause,
    space allocation, and rollback segments.
•   When you write to an Oracle database, check the storage clause for database objects. Make sure that
    tables are using large initial and next values. The database should also store table and index data in
    separate tablespaces, preferably on different disks.
•   When you write to Oracle target databases, the database uses rollback segments during loads. Make
    sure that the database stores rollback segments in appropriate tablespaces, preferably on different
    disks. The rollback segments should also have appropriate storage clauses.
•   You can optimize the Oracle target database by tuning the Oracle redo log. The Oracle database uses
    the redo log to log loading operations. Make sure that redo log size and buffer size are optimal. You
    can view redo log properties in the init.ora file.
•   If your Oracle instance is local to the PowerCenter Server, you can optimize performance by using
    IPC protocol to connect to the Oracle database. You can set up Oracle database connection in
    listener.ora and tnsnames.ora.
•   See your Oracle documentation for details on optimizing Oracle databases.
                                       Informatica
•   If a mapping contains Lookup transformations, you might want to enable lookup caching. In general,
    you want to cache lookup tables that need less than 300MB.

•   Once you optimize your source database, target database, and mapping, you can focus on optimizing
    the session. You can perform the following tasks to improve overall performance:
•   Increase the number of partitions.
•   Reduce error tracing.
•   Remove staging areas.
•   Tune session parameters.
•   Typical session parameter values (default, suggested minimum, suggested maximum):

      Session parameter    Default             Suggested minimum    Suggested maximum
      DTM buffer size      12,000,000 bytes    6,000,000 bytes      128,000,000 bytes
      Buffer block size    64,000 bytes        4,000 bytes          128,000 bytes
      Index cache size     1,000,000 bytes     1,000,000 bytes      12,000,000 bytes
      Data cache size      2,000,000 bytes     2,000,000 bytes      24,000,000 bytes
      Commit interval      10,000 rows         N/A                  N/A
      High precision       Disabled            N/A                  N/A
      Tracing level        Normal              Terse                N/A
•   Allocating Buffer Memory
•   When the PowerCenter Server initializes a session, it allocates blocks of memory to hold source and
    target data. The PowerCenter Server allocates at least two blocks for each source and target partition.
    Sessions that use a large number of sources and targets might require additional memory blocks. If
    the PowerCenter Server cannot allocate enough memory blocks to hold the data, it fails the session.
•   By default, a session has enough buffer blocks for 83 sources and targets. If you run a session that
    has more than 83 sources and targets, you can increase the number of available memory blocks by
    adjusting the following session parameters:
•   DTM Buffer Size. Increase the DTM buffer size found in the Performance settings of the Properties
    tab. The default setting is 12,000,000 bytes.
                                         Informatica
•   Default Buffer Block Size. Decrease the buffer block size found in the Advanced settings of the
    Config Object tab. The default setting is 64,000 bytes.
•   To configure these settings, first determine the number of memory blocks the PowerCenter Server
    requires to initialize the session. Then, based on default settings, you can calculate the buffer size
    and/or the buffer block size to create the required number of session blocks.
•   If you have XML sources or targets in your mapping, use the number of groups in the XML source or
    target in your calculation for the total number of sources and targets
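•   As a rough worked example (a sketch; the 0.9 factor is the approximation commonly quoted in
    PowerCenter tuning material, so treat it as an assumption), you can estimate whether a given DTM
    buffer size and buffer block size provide enough memory blocks:

      # Estimate session buffer blocks from the DTM buffer size and buffer block size.
      # Assumptions: roughly 90% of the DTM buffer is usable for buffer blocks, and each
      # source or target (or XML group) needs at least 2 blocks.

      def available_blocks(dtm_buffer_size, buffer_block_size):
          return int(0.9 * dtm_buffer_size / buffer_block_size)

      def required_blocks(sources_and_targets, blocks_each=2):
          return sources_and_targets * blocks_each

      print(available_blocks(12_000_000, 64_000))   # 168 blocks with the default settings
      print(required_blocks(83))                    # 166 blocks, so the defaults cover about 83 sources and targets

      # A session with 120 sources and targets needs more blocks, so either raise the DTM
      # buffer size or lower the buffer block size:
      print(required_blocks(120) * 64_000 / 0.9)    # roughly 17,066,667 bytes of DTM buffer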

•   If your session reads from a flat file source, you can improve session performance by setting the
    number of bytes the PowerCenter Server reads per line. By default, the PowerCenter Server reads
    1024 bytes per line. If each line in the source file is less than the default setting, you can decrease the
    Line Sequential Buffer Length setting in the session properties.
•   If your session reads from a relational source, review the following suggestions for improving
    performance:
•   Optimize the query.
•   Create tempdb as an in-memory database (SQL Server and Sybase only).
•   Use conditional filters (filter in the Source Qualifier).
•   Increase database network packet size.
•   Connect to Oracle databases using IPC protocol.
                                       Informatica
•   Generally, you reduce the number of transformations in the mapping and delete unnecessary links
    between transformations to optimize the mapping. You should configure the mapping with the least
    number of transformations and expressions to do the most amount of work possible. You should
    minimize the amount of data moved by deleting unnecessary links between transformations.
•   For transformations that use data cache (such as Aggregator, Joiner, Rank, and Lookup
    transformations), limit connected input/output or output ports. Limiting the number of connected
    input/output or output ports reduces the amount of data the transformations store in the data cache.
•   You can also perform the following tasks to optimize the mapping:
•   Configure single-pass reading.
•   Consider using single-pass reading if you have several sessions that use the same sources. If you
    combine the separate mappings and use only one source qualifier for each source, the PowerCenter
    Server reads the source once and passes the data to each downstream pipeline.
•   Optimize datatype conversions.
•   Eliminate transformation errors.
•   Optimize transformations.
•   Optimize expressions.
•   Forcing the PowerCenter Server to make unnecessary datatype conversions slows performance. For
    example, if your mapping moves data from an Integer column to a Decimal column, then back to an
    Integer column, the unnecessary datatype conversion slows performance. Where possible, eliminate
    unnecessary datatype conversions from mappings.
•   Some datatype conversions can improve system performance. Use integer values in place of other
    datatypes when performing comparisons using Lookup and Filter transformations.
•   If a mapping contains a Lookup transformation, you can optimize the lookup. Some of the things you
    can do to increase performance include caching the lookup table, optimizing the lookup condition, or
    indexing the lookup table.
                                       Informatica
•   If you filter rows from the mapping, you can improve efficiency by filtering early in the data flow.
    Instead of using a Filter transformation halfway through the mapping to remove a sizable amount of
    data, use a source qualifier filter to remove those same rows at the source.
•   Aggregator transformations often slow performance because they must group data before processing
    it. Aggregator transformations need additional memory to hold intermediate group results. You can
    optimize Aggregator transformations by performing the following tasks:
•   Group by simple columns.
•   Use sorted input.
•   Use incremental aggregation.
•   Joiner transformations need a data cache to hold the master table rows and an index cache to hold
    the join columns from the master table. You need to make sure that you have enough memory to hold
    the data and the index cache so the system does not page to disk. To minimize memory requirements,
    you can also use the smaller table as the master table or join on as few columns as possible.
•   The type of join you use can affect performance. Normal joins are faster than outer joins and result in
    fewer rows. When possible, use database joins for homogeneous sources.
•   Often performance slows because your session relies on inefficient connections or an overloaded
    PowerCenter Server system. System delays can also be caused by routers, switches, network
    protocols, and usage by many users. After you determine from the system monitoring tools that you
    have a system bottleneck, you can make the following global changes to improve the performance of
    all your sessions:
•   Improve network speed. Slow network connections can slow session performance. Have your
    system administrator determine if your network runs at an optimal speed. Decrease the number of
    network hops between the PowerCenter Server and databases.
•   Use multiple PowerCenter Servers. Using multiple PowerCenter Servers on separate systems might
    double or triple session performance.
•   Use a server grid. Use a collection of PowerCenter Servers to distribute and process the workload of
    a workflow. For information on server grids, see Working with Server Grids.
                                     Informatica
•   Improve CPU performance. Run the PowerCenter Server and related machines on high
    performance CPUs, or configure your system to use additional CPUs.
•   Configure the PowerCenter Server for ASCII data movement mode. When all character data
    processed by the PowerCenter Server is 7-bit ASCII or EBCDIC, configure the PowerCenter Server
    for ASCII data movement mode.
•   Check hard disks on related machines. Slow disk access on source and target databases, source and
    target file systems, as well as the PowerCenter Server and repository machines can slow session
    performance. Have your system administrator evaluate the hard disks on your machines.
•   Reduce paging. When an operating system runs out of physical memory, it starts paging to disk to
    free physical memory. Configure the physical memory for the PowerCenter Server machine to
    minimize paging to disk.
•   Use processor binding. In a multi-processor UNIX environment, the PowerCenter Server may use a
    large amount of system resources. Use processor binding to control processor usage by the
    PowerCenter Server.

								