[torqueusers] "cp command failed"

Garrick Staples garrick at usc.edu
Thu Dec 13 12:27:37 MST 2007


On Thu, Dec 13, 2007 at 05:49:00AM -0600, Zhiliang Hu alleged:
> At 10:56 PM 12/12/2007, you wrote:
> 
> >On Wed, Dec 12, 2007 at 10:57:59PM -0600, Zhiliang Hu alleged:
> >> Let me re-phrase this problem --
> >> 
> >> 1- I can "qsub" to run a "hello" program,
> >> 2- I can run "mpiblast" with a script,
> >> 3- but when I combine the two I encounter a weird problem:
> >> 
> >> > qsub -l nodes=6:ppn=2 mpiblast.sh
> >> -- Where "mpiblast.sh" contains:
> >> ----------------
> >> #!/bin/bash
> >> /opt/openmpi.gcc/bin/mpirun /usr/local/bin/mpiblast -p blastn -i /home/zhu/tests/mpiblast/datain4 -d bta.genome.chr
> >> ----------------
> >> 
> >> Now it complains (in the torque output file xxxx.e96):
> >> 
> >> ----------------------
> >> cp command failed!
> >> command: cp /raid/pub/ncbi/blast/mpidb/bta.genome.chr.007.nhr /scratch/tmp/bta.genome.chr.007.nhr
> >> source = /raid/pub/ncbi/blast/mpidb/bta.genome.chr.007.nhr
> >> dest = /scratch/tmp/bta.genome.chr.007.nhr
> >> ret_value = 32512
> >> ----------------------
> >> 
> >> Any idea what this could be?
> >
> >It doesn't look like a Torque error.  Is that coming from mpiblast, or your script?
> >Do both of those directories exist on the compute node?
> 
> and 
> 
> At 11:06 PM 12/12/2007, Chris Samuel <csamuel at vpac.org> wrote:
> 
> >Looks like an application error message rather than a Torque error.
> >
> >It appears that blast is trying (and failing) to stage some files from 
> >your RAID system to local scratch - but it's not saying why..
> 
> 
> Yes, indeed that's what it appears to be.  Thanks for the hints on
> mpiblast -- but the mpiblast.sh script works fine on its own.  I manually
> checked folders, permissions, ssh, etc. on all the suspected directories
> and everything appears fine, as before.  As a matter of fact, there are
> more than 10 similar files in the same location that got copied over
> with no problem, so it appears "weird".

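It's too bad it doesn't print the actual error message, because a return value
of 32512 says little on its own.  If it is a raw system() wait status, it
decodes to exit code 127 (32512 = 127 * 256), which a shell returns when it
cannot find the command to run.  You may want to bring this up with an mpiblast
list.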
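
One way to read that number, as a rough sketch: assuming ret_value is the raw
status handed back by system() (an assumption, since the mpiblast source is not
shown here), the exit code sits in the high byte and any terminating signal in
the low 7 bits.  The command name no_such_command_xyz below is made up for the
comparison.

```shell
#!/bin/bash
# Assumption: ret_value is a raw system()/wait status, not a plain exit code.
ret_value=32512

exit_code=$(( ret_value >> 8 ))   # high byte: exit code of the child
signal=$(( ret_value & 127 ))     # low 7 bits: terminating signal, 0 if none
echo "exit_code=$exit_code signal=$signal"

# For comparison: a shell asked to run a command that does not exist
# exits with status 127 ("command not found").
sh -c 'no_such_command_xyz' 2>/dev/null
missing_status=$?
echo "missing command exit=$missing_status"
```

If that reading is right, the failure would be the shell not finding "cp" at
all, rather than the copy itself failing, which points at PATH inside the job.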

 
> That's why I ask here -- (my first 3 lines in the post) -- is there any
> known conflict when a working script and a working qsub are put together?

Well, obviously the environment can be different.  Put a 'set' (or 'setenv'
under csh) at the top of your batch script and compare the output with what
your interactive shell prints.
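
A minimal sketch of that comparison for a bash batch script such as
mpiblast.sh.  It uses 'env' rather than 'set' so that only exported variables
are listed; the output file name and location are assumptions chosen for
illustration.

```shell
#!/bin/bash
# Lines to place at the top of the batch script.
# Assumption: $HOME is writable from the compute node; adjust the path if not.
envfile="${HOME:-/tmp}/env.batch.$$"

env | sort > "$envfile"     # the environment as the job sees it
echo "PATH=$PATH"           # where the job will look for commands
command -v cp || echo "cp not found in PATH"

# Afterwards, in the interactive shell where the script works:
#   env | sort > ~/env.login
#   diff ~/env.login ~/env.batch.<pid>
```

Differences in PATH, or in any variables that mpirun and mpiblast read, are
the first things to look at in the diff.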
