Monday, April 26, 2010

DataStage - setting ulimit in Clustered or GRID (MPP) Environment

Problem (Abstract)
In an SMP environment, ulimits are determined by the dsrpc daemon process for all processes spawned by the Conductor. However, when you introduce additional processing nodes (MPP), ulimits need to be addressed on those nodes separately.

Symptom
Changing the ulimit on the Conductor node does not resolve the error. One symptom of this error is: Fatal Error: Need to be able to open at least 16 files; please check your ulimit setting for number of file descriptors [sort/merger.C:1087]

Cause
In a UNIX/Linux MPP configuration, the ulimits are set for the user running the processes on the processing nodes. They are not inherited from the Conductor like the rest of the environment.
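To see which limits a user actually gets on a given node, it can help to run a small check as that user on that node. The following is a minimal sketch, not part of the original technote: the function name classify_nofiles is made up for illustration, and the default threshold of 16 is simply the number quoted in the error message above.

```shell
#!/bin/sh
# classify_nofiles VALUE [MIN]: label an open-files limit value.
# Hypothetical helper; MIN defaults to 16, the figure named in the error message.
classify_nofiles() {
    value=$1
    min=${2:-16}
    case "$value" in
        unlimited|-1) echo "unlimited" ;;   # no numeric cap reported
        *[!0-9]*|"")  echo "unknown" ;;     # not a plain number
        *)
            if [ "$value" -ge "$min" ]; then
                echo "ok"
            else
                echo "too-low"
            fi
            ;;
    esac
}

# Report the soft and hard open-file limits of the user running this script.
soft=$(ulimit -S -n)
hard=$(ulimit -H -n)
echo "soft nofiles: $soft ($(classify_nofiles "$soft"))"
echo "hard nofiles: $hard ($(classify_nofiles "$hard"))"
```

Run it on a processing node while logged in as the user that owns the job processes there; running it on the Conductor only tells you about the Conductor.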

Environment
MPP on Unix/Linux

Resolving the problem
Each user that "owns" processes on the remote processing nodes (that is, each user as whom the Conductor spawns processes on those nodes) needs to have their ulimit set correctly on EACH processing node. This can be done either by setting the default ulimit for everyone or by setting each user's ulimit specifically.
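Because the setting has to be right on every node, a quick survey across nodes can show which ones still need attention. The sketch below rests on several assumptions: the node names are placeholders, ssh is assumed as the remote shell (some sites use rsh instead), and the function name report_node_limits is invented for this example. Run it as the user that owns the job processes, so that the remote login picks up that user's limits.

```shell
#!/bin/sh
# Hypothetical node names; replace with the processing nodes from your own
# parallel configuration file.
NODES=${NODES:-"node1 node2"}

# Remote shell command. ssh is an assumption; substitute rsh if that is what
# your installation uses.
REMOTE_SH=${REMOTE_SH:-ssh}

# Print the soft open-files limit reported by each node.
report_node_limits() {
    for node in $NODES; do
        # Single quotes make ulimit run on the remote node, not locally.
        limit=$($REMOTE_SH "$node" 'ulimit -n' 2>/dev/null) || limit="unreachable"
        echo "$node: $limit"
    done
}
```

Calling report_node_limits prints one line per node, for example a node name followed by its limit, or "unreachable" if the remote command fails. A limit obtained this way is only an approximation of what a job process sees, since it depends on how the remote shell session is started.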

An example for AIX:

As the root user, modify the default settings in the /etc/security/limits file:


default:
        fsize = -1
        core = 2097151
        cpu = -1
        data = -1
        rss = -1
        stack = 65536
        nofiles = 102400
        nofiles_hard = 102400


The default stanza sets the defaults for all users. You can override the settings for an individual user in the same manner.
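As a sketch of such a per-user override, a stanza for a single user in the same file might look like the following. The user name dsadm is a hypothetical placeholder for whichever account owns the job processes on that node, and the values simply repeat the ones from the default stanza.

```
dsadm:
        nofiles = 102400
        nofiles_hard = 102400
```

Attributes that are not listed in a user's stanza fall back to the default stanza. New limits normally apply from the user's next login, so sessions that are already running keep their old values.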


A note about the nofiles ulimit (number of open files) on 64-bit AIX: do not set nofiles or nofiles_hard to -1 (unlimited) on AIX running a 64-bit kernel. Doing so causes an integer overflow in Information Server, and you will get a variation of the error listed above.
