Thursday, July 1, 2010

DataStage Job Compile - "Failed to invoke GenRuntime using phantom process helper" error

Problem(Abstract)

When attempting to compile a job, the user receives: Failed to invoke GenRuntime using phantom process helper.

Cause

Possible causes for this error include:
Server's /tmp space is full
Job's status is incorrect
Format problem with the project's uvodbc.config file
Corrupted DS_STAGETYPES file
Internal locks


Diagnosing the problem

If the steps under Resolving the problem below do not resolve the issue, proceed with the following steps.


Before opening a PMR with support, turn on server-side tracing, attempt to compile the problem job, turn off server-side tracing, and gather the tracing information.

  1. Turn on server-side tracing by connecting to the server with the DataStage Administrator client.
  2. Highlight the project that contains the problem job.
  3. Click the Properties button.
  4. In the Properties window, click the Tracing tab.
  5. Check the Enabled check box.
  6. Click the OK button.
  7. With a new DataStage Designer connection, attempt to compile the job.
  8. With the DataStage Administrator client, go back into the project's properties.
  9. Select the Tracing tab.
  10. Uncheck the Enabled check box.
  11. For each DSRTRACE entry, do the following:
    a) Highlight the entry and click View.
    b) Highlight the contents of the display and click Copy.
    c) Paste the copied information into Notepad.
  12. Open a PMR with support and supply the Notepad information.

Resolving the problem


Server's /tmp space is full:

Clean up space in /tmp
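
To confirm whether /tmp is the problem, a quick check can help (a minimal sketch; the commands are generic POSIX and the cutoff of 20 entries is arbitrary):

  # how full is /tmp?
  df -k /tmp
  # largest items in /tmp, biggest first, to decide what can be removed
  du -sk /tmp/* 2>/dev/null | sort -nr | head -20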


Job's status is incorrect:

DataStage Director->Job->Clear Status File


Format problem with the project's uvodbc.config file:

Confirm uvodbc.config has the following entry/format:
[ODBC DATA SOURCES]
<localuv>
DBMSTYPE = UNIVERSE
network = TCP/IP
service = uvserver
host = 127.0.0.1
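
To inspect the file quickly from the server (a minimal check; the project path below is an example, substitute your own project directory):

  cd /opt/IBM/InformationServer/Server/Projects/myproject
  cat uvodbc.config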

Corrupted DS_STAGETYPES:

  1. Connect to the DataStage server, change directory to DSEngine, and source dsenv ( . ./dsenv)
    $ bin/uvsh
    >LOGTO projectname (case sensitive)
  2. Set a file pointer RTEMP to the template DS_STAGETYPES file
    >SETFILE /Template/DS_STAGETYPES RTEMP
  3. Check that all of the entries in the template DS_STAGETYPES file are present in the project's DS_STAGETYPES file
    >SELECT RTEMP
    * this will return a count of records found in the template DS_STAGETYPES file
    >COUNT DS_STAGETYPES
    * this will return a count of records found in the project's DS_STAGETYPES file
    * These numbers should be the same
  4. If the numbers differ and some records are missing from the project's DS_STAGETYPES file
    >COPY FROM RTEMP TO DS_STAGETYPES ALL OVERWRITING
  5. Exit the UniVerse shell
    >Q
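
The same check can be scripted non-interactively; a sketch that simply mirrors the interactive steps above, assuming uvsh accepts commands on stdin, the standard /.dshome pointer file exists, and the project is named myproject:

  cd `cat /.dshome`            # the DSEngine directory
  . ./dsenv
  printf '%s\n' \
      "LOGTO myproject" \
      "SETFILE /Template/DS_STAGETYPES RTEMP" \
      "SELECT RTEMP" \
      "COUNT DS_STAGETYPES" \
      "Q" | bin/uvsh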

Internal locks:
  1. Connect to the DataStage server, change directory to DSEngine, and source dsenv ( . ./dsenv)
  2. Change directory to the project directory that contains the job generating the error.
  3. Execute the following, replacing <jobname> with the actual job name.
    $ $DSHOME/bin/uvsh "DS.PLADMIN.CMD NOPROMPT CLEAR LOCKS <jobname>"
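
For example, for a job named MyJob (the job name here is hypothetical):

    $ $DSHOME/bin/uvsh "DS.PLADMIN.CMD NOPROMPT CLEAR LOCKS MyJob"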

Monday, April 26, 2010

Teradata Sync Tables From DataStage

Question>
I need to find out what the terasync tables in DataStage are when using Teradata.

The Teradata DBA says these are a DataStage thing.

What is the purpose of these tables? Can they be shut off?
Are these tables used with the TD Connector as well?

Answer>
The Teradata sync tables are indeed created by certain DataStage stages. Teradata Enterprise creates one named terasync that is shared by all Teradata Enterprise jobs that are loading into the same database. The name of the sync table created by the Teradata Connector is supplied by the user, and that table can either be shared by other Teradata Connector jobs (with each job using a unique Sync ID key into that table) or each Teradata Connector job can have its own sync table.

These sync tables are a necessity due to requirements imposed by Teradata's parallel bulk load and export interfaces. These interfaces require a certain amount of synchronization at the start and end of a load or export and at every checkpoint in between. The interface requires a sequence of method calls to be done in lock step. After each player process has called the first method in the sequence, they cannot proceed to call the next method until all player processes have finished calling the first method. So the sync table is used as a means of communication between the player processes. After calling a method, a player updates the sync table and repeatedly checks the sync table until all the other players have also updated the sync table.

In Teradata Enterprise, you cannot avoid using the terasync table. In the Teradata Connector, you can avoid using the sync table by setting the Parallel synchronization property to No; however, the stage will then be forced to run sequentially.
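
As a rough illustration of that lock-step pattern (this is not the connector's actual implementation; the sync table name, columns, logon details, and all values below are invented for the sketch):

  # this player marks step 1 complete for sync ID 42...
  printf '%s\n' ".LOGON tdhost/dsuser,password" \
      "UPDATE ds_sync SET step = 1 WHERE sync_id = 42 AND player = 3;" \
      ".QUIT" | bteq > /dev/null

  # ...then polls until all four players have reached step 1
  while :; do
      n=$(printf '%s\n' ".LOGON tdhost/dsuser,password" \
          "SELECT COUNT(*) FROM ds_sync WHERE sync_id = 42 AND step >= 1;" \
          ".QUIT" | bteq | awk '/^ *[0-9]+ *$/ {v=$1} END {print v}')
      [ "${n:-0}" -ge 4 ] && break
      sleep 1
  done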

DataStage - setting ulimit in Clustered or GRID (MPP) Environment

Problem(Abstract)
Ulimits for all processes spawned by the Conductor are determined by the dsrpc daemon process in an SMP environment; however, when you introduce additional processing nodes (MPP), ulimits need to be addressed on those nodes separately.

Symptom
Changing the ulimit on the conductor node does not resolve the error. One symptom of this problem: Fatal Error: Need to be able to open at least 16 files; please check your ulimit setting for number of file descriptors [sort/merger.C:1087]

Cause
In a UNIX/Linux MPP configuration, the ulimits are set for the user running the processes on the processing nodes; they are not inherited from the Conductor like the rest of the environment.

Environment
MPP on Unix/Linux

Resolving the problem
Each user that "owns" processes on the remote processing nodes (that is, where the Conductor spawns processes on remote processing nodes as that user) needs to have their ulimits set correctly on EACH processing node. This can be done either by setting the default ulimit for everyone or by setting each user's ulimit specifically.

An example for AIX:

Modify the /etc/security/limits file's default settings as the root user:


default:
fsize = -1
core = 2097151
cpu = -1
data = -1
rss = -1
stack = 65536
nofiles = 102400
nofiles_hard = 102400


The default section is used to set the defaults for all users; you can override individual users in the same manner.


A note about the ulimit nofiles (number of open files) on 64-bit AIX: do not set nofiles to -1 (unlimited) or nofiles_hard to unlimited on AIX running a 64-bit kernel. This causes an integer overflow in Information Server, and you will get a variation of the error listed above.
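
To confirm that the settings took effect, check the effective limits on every processing node as the user that will own the player processes; a minimal sketch, assuming ssh access, with hypothetical node and user names (take the real node names from your APT configuration file):

  for node in node1 node2 node3; do
      echo "== $node =="
      ssh dsadm@$node 'ulimit -n; ulimit -c; ulimit -d'   # nofiles, core, data
  done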

A bootstrap address with no port specification defaults to port 2809

Problem(Abstract)
The IBM InfoSphere DataStage client applications cannot connect to the InfoSphere DataStage server when the TCP/IP ports are not available through firewalls for the application server.

Symptom
An error message similar to the following message is displayed:
  • Failed to authenticate the current user against the selected Domain
  • A communication failure occurred while attempting to obtain an initial context with the provider URL: "iiop://s6lpar50:2825". Make sure that any bootstrap address information in the URL is correct and that the target name server is running. A bootstrap address with no port specification defaults to port 2809. Possible causes other than an incorrect bootstrap address or unavailable name server include the network environment and workstation network configuration. javax.security.auth.login.LoginException: A communication failure occurred while attempting to obtain an initial context with the provider URL: "iiop://s6lpar50:2825".

Resolving the problem
Before you install IBM InfoSphere Information Server, assess and configure your network to ensure that the IBM InfoSphere DataStage client applications can connect to the InfoSphere DataStage server when the WebSphere Application Server is behind a firewall.
To configure the application server ports:
  • In the WebSphere Application Server administrative console, expand the application server list under Servers > Application servers, and then select server1.
  • In the Communications section, select Ports.
  • Open the ports for the firewall configuration.
  • Use the following list to identify the ports that are required for your network configuration.

To access the application server behind a firewall, open the following ports:
  • WC_defaulthost
  • BOOTSTRAP_ADDRESS
  • ORB_LISTENER_ADDRESS
  • SAS_SSL_SERVERAUTH_LISTENER_ADDRESS
  • CSIV2_SSL_MUTUALAUTH_LISTENER_ADDRESS
  • CSIV2_SSL_SERVERAUTH_LISTENER_ADDRESS

To use HTTPS to access Web clients, open the following port:
  • WC_defaulthost_secure

To access the WebSphere Application Server administrative console, open the following port:
  • WC_adminhost

To use HTTPS to access the WebSphere Application Server administrative console, open the following port:
  • WC_adminhost_secure

To publish services using a JMS binding, open the following ports:
  • SIB_ENDPOINT_ADDRESS
  • SIB_ENDPOINT_SECURE_ADDRESS
  • SIB_MQ_ENDPOINT_ADDRESS
  • SIB_MQ_ENDPOINT_SECURE_ADDRESS
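
To confirm from a client workstation that a given port is actually reachable through the firewall, a quick probe helps; a minimal sketch, assuming the nc (netcat) utility is available, using the host and bootstrap port from the error message above:

  # does the bootstrap port answer through the firewall?
  nc -vz s6lpar50 2809
  # repeat for each of the ports opened above

If nc is not available, telnet s6lpar50 2809 gives a similar quick check.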

Unable to trigger DataStage jobs from SAP

Question>
We are unable to trigger any DataStage jobs from SAP. When I try to invoke a DataStage job from an SAP InfoPackage in a BW Load PULL operation, I see this message in the log:

> cat RFCServer.2010_02_26.log
2 I: BW Open Hub Extract: Data ready for BW initiated job. InfoSpoke Request ID 306696. Fri Feb 26 09:45:00 2010
4 I: BW Load: Notifying push job request REQU_4GZ0F9G5SIAEE283M04XSUM9G is ready at Fri Feb 26 11:08:55 2010

But the DataStage job is not triggered; the InfoPackage keeps waiting forever for data from a job that never starts.

Answer>
Run jobReset to clean up the temporary files:
Open a command prompt window to the DS Server and:
1. source dsenv:
cd /Server/DSEngine
. ./dsenv
2. run the job reset utility:
cd $DSSAPHOME/DSBWbin
./jobReset
where the job-type argument is one of MASTER | MASTERTEXT | TRANSACTION | MASTERHIERARCHY,
for example:
cd /hpithome/IBM/InformationServer/Server/DSBWbin
./jobReset SAPBW8 PACK09 IS1 MASTER bwload
You should get a confirmation message: "jobReset - Successful"

Tips - IS Project Implementation problem

If there are any connectivity issues, logs are created under the directory listed below.

Check the Designer logs for more details on the error you are seeing.

Client machine logs
C:\Documents and Settings\_Windows_Username_\ds_logs\dstage_wrapper_trace_N.log (where N cycles from 1 to 20, to give you the log files for the last 20 DataStage sessions)

tsortxxxxxx file size in Scratch directory

Question>
Is there any way to influence the size of the tsortxxxxx files that are created in the Scratch directory? We are running a PoC at a customer site and have more than 70 million records that have to be sorted. This leads to more than 5000 tsortxxxx files in the Scratch directory. When these files are read in the next Join stage, performance is extremely slow because of the need to access this very large directory. Each file is approximately 10 MB in size. Is there any way to increase the file size, and what are the advantages and disadvantages of doing so?

Answer>

The size of the files is controlled through the "Limit Memory" option in the Sort stage. The default size is 20 MB. What you are actually controlling is the amount of memory allocated to that instance of the Sort stage on the node. The memory is used to buffer incoming rows; once it is full, the buffer is written to the sort work disk storage (in your case, the Scratch directory). Increasing the size of the memory buffer can improve sort performance. 100-200 MB is generally a decent compromise in many situations.
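
To put rough numbers on your case: 5000 files of about 10 MB each is on the order of 50 GB of sort data, and the number of spill files scales inversely with the buffer size, so raising the buffer from the 20 MB default to 200 MB should cut the file count by roughly a factor of ten, to around 500.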

Advantages:
-- A larger memory buffer allows more records to be read into memory before being written to disk temporarily. For smaller data sets, this can keep sorts completely in memory.
-- The larger the memory buffer, the fewer I/O operations to the sort work disk, and therefore the better your sort performance. Also, a smaller number of files leads to fewer directory entries for the OS to deal with.

Disadvantages:
-- Larger memory buffer means more memory usage on your processing node(s). With multiple jobs and/or multiple sorts within jobs, this can lead to low memory and paging if not carefully watched.
-- This option is only available within the Sort stage...it is not available when using "link sorts" (sorting on the input link of a stage). If your job uses link sorts, you will have to replace them with Sort stages in order to use this option.

Be careful when tuning with this option so that you don't inadvertently overtax system memory. Adjust, test, and work toward a compromise that gives you acceptable performance without overloading the system.