[torqueusers] Wrong cput value
murphy at genome.chop.edu
Wed Jul 23 08:03:46 MDT 2008
Brock Palen wrote:
> Were these jobs running different code?
> Some code (hfss comes to mind)
> forks the real process and somehow torque loses track of it. So cput
> will be almost zero.
> Another possibility, if you're running parallel code, is that the user is
> not using a TM-enabled mpirun.
The jobs use identical code, which happens to be a Perl wrapper around a
command-line Java program, invoked via system(). So you're suggesting
that Torque might under rare circumstances (because of some bug?) fail
to account for the CPU time of child processes such as the
Perl-forked shell and the shell-forked Java process ... Hmmm. So in
general, if a job invokes anything (?) which might fork, the cput value
should be treated with suspicion. Too bad.
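
One way to get an independent number would be to have the wrapper itself
record the CPU time of whatever it launches, so that it can be compared
with what Torque reports. The following is only a rough sketch in Python
rather than Perl; the function names run_and_measure and report are made
up for illustration, and it assumes a Unix-like node where the child is
waited for by the wrapper (CPU time of descendants that are never waited
for would not show up here either).

```python
import resource
import subprocess
import sys


def run_and_measure(cmd):
    """Run cmd (a list of arguments), wait for it, and return
    (exit_code, cpu_seconds) for the waited-for child processes."""
    before = resource.getrusage(resource.RUSAGE_CHILDREN)
    exit_code = subprocess.call(cmd)
    after = resource.getrusage(resource.RUSAGE_CHILDREN)
    # User plus system CPU time accumulated by children during the call.
    cpu = (after.ru_utime - before.ru_utime) + (after.ru_stime - before.ru_stime)
    return exit_code, cpu


def report(cmd):
    """Run cmd and log its child CPU time to stderr; return its exit code."""
    exit_code, cpu = run_and_measure(cmd)
    sys.stderr.write("child cpu seconds: %.2f\n" % cpu)
    return exit_code
```

Calling report() with the Java command line inside the job would leave a
line in the job's stderr file that can be checked against cput afterwards.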
> On Jul 22, 2008, at 2:39 PM, Kevin Murphy wrote:
>> I recently ran tracejob to compare runtime versus data-size
>> statistics on 563 jobs, and three of them had impossibly low
>> resources_used.cput values.
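
A check like the one described in the quoted message could be sketched as
below. This is only an illustration: it assumes the accounting text
contains resources_used.cput and resources_used.walltime as HH:MM:SS
values, and the function names and thresholds (min_ratio, min_walltime)
are arbitrary choices, not anything Torque defines.

```python
import re

# Matches e.g. "resources_used.cput=00:00:02" (HH:MM:SS format is assumed).
FIELD = re.compile(r"resources_used\.(cput|walltime)=(\d+):(\d{2}):(\d{2})")


def parse_usage(text):
    """Return a dict of seconds for the cput/walltime fields found in text."""
    usage = {}
    for name, hours, minutes, seconds in FIELD.findall(text):
        usage[name] = int(hours) * 3600 + int(minutes) * 60 + int(seconds)
    return usage


def is_suspicious(text, min_ratio=0.05, min_walltime=60):
    """Flag a job whose cput is implausibly small relative to its walltime.

    min_ratio and min_walltime are illustrative thresholds only.
    """
    usage = parse_usage(text)
    if "cput" not in usage or "walltime" not in usage:
        return False  # not enough information to judge
    if usage["walltime"] < min_walltime:
        return False  # very short jobs are too noisy to flag
    return usage["cput"] < min_ratio * usage["walltime"]
```

Running is_suspicious() over the saved accounting text for each job would
pick out the ones worth a closer look; a CPU-bound single-process job
should normally have cput close to its walltime.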