[Mauiusers] Error in generating checkpoint file on AIX loadleveler

rama krishna joys_623 at yahoo.com
Tue Mar 13 03:38:54 MDT 2007


Hi everybody,

I am using AIX LoadLeveler 3.1 to checkpoint a simple serial application. The problem occurs when the checkpoint file is generated: the file is created with a .err extension (here, stp.ckpt.err). When restart_from_ckpt is set to yes in the job command file and I run the job, the node simply removes the job from the queue, and I do not even get an output file.

I am posting my job command file and application below. Please reply if anybody knows why the correct checkpoint file is not generated, or how to debug the problem. Thanks in advance.
    
    
My job command file:

# For First.c
# @ job_type = serial 
# @ executable = first 
# @ output = stp.out 
# @ error = stp.err 
# @ class = general 
# @ checkpoint = yes 
# @ restart_from_ckpt = yes 
# @ ckpt_dir = /home/rtsg/crypt/ramakrishna/trial/ex/ 
# @ ckpt_file = stp.ckpt 
# @ restart_on_same_nodes = yes 
# @ requirements = Machine == "tf04" 
# @ wall_clock_limit = 5:00:00,4:30:00 
# @ queue 
      
My application:

#include <stdio.h>
#include "llapi.h"

int main()
{
    int i;
    LL_ckpt_info ckpt_info;
    cr_error_t cp_error1;

    /* Fill in the checkpoint request structure. */
    ckpt_info.version = LL_API_VERSION;
    ckpt_info.step_id = NULL;
    ckpt_info.ckptType = NULL;
    ckpt_info.waitType = NULL;
    ckpt_info.abort_sig = NULL;
    ckpt_info.cp_error_data = &cp_error1;
    ckpt_info.ckpt_rc = 0;
    ckpt_info.soft_limit = 0;
    ckpt_info.hard_limit = 0;

    for (i = 1; i < 4000; i++)
    {
        printf("%d\n", i);
        /* Request a checkpoint halfway through the loop. */
        if (i == 2000)
            ll_init_ckpt(&ckpt_info);
    }
    return 0;
}



RAM....!
 