[torqueusers] NVIDIA GPUs version error

Steve Crusan scrusan at ur.rochester.edu
Mon Aug 22 14:06:00 MDT 2011


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Hi all,

	I'm getting errors in my syslog from the pbs_mom daemons on our GPU nodes:

	Aug 22 15:55:09 blugpu07 pbs_mom: LOG_ERROR::a system error occured (15205) in generate_server_gpustatus_smi, Unknown Nvidia driver version

	Here is the snipped output of pbsnodes blugpu07:
	<SNIPPED>
	gpu_status = gpu[1]=gpu_id=0:15:0;,gpu[0]=gpu_id=0:14:0;,driver_ver=275.09.07,timestamp=Mon Aug 22 15:56:41 2011
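For anyone comparing nodes, the gpu_status value above can be split into per-GPU and node-level fields. This is only a minimal sketch: the function name parse_gpu_status is hypothetical, and it assumes the layout seen in this one sample (comma-separated items, with ";" ending each field inside a gpu[N]= item). Fuller status strings may need more careful handling.

```python
def parse_gpu_status(value):
    """Split a pbsnodes gpu_status value into (per-GPU fields, node-level fields).

    Assumes the layout of the sample above; parse_gpu_status is a
    hypothetical helper, not part of TORQUE.
    """
    gpus = {}
    node = {}
    for item in value.split(","):
        item = item.strip()
        if not item:
            continue
        if item.startswith("gpu["):
            # "gpu[1]=gpu_id=0:15:0;" -> index 1, fields {"gpu_id": "0:15:0"}
            head, _, rest = item.partition("]=")
            index = int(head[len("gpu["):])
            fields = {}
            for pair in rest.split(";"):
                if pair:
                    key, _, val = pair.partition("=")
                    fields[key] = val
            gpus[index] = fields
        else:
            # Node-level items such as driver_ver=... and timestamp=...
            key, _, val = item.partition("=")
            node[key] = val
    return gpus, node
```

Calling it on the value shown above gives one entry per GPU plus the driver_ver and timestamp fields, which makes it easy to check that the driver version is still reported despite the error.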


	If I log in to the node and check the pbs_mom logfiles, I see the following:

	08/22/2011 15:57:24;0002; pbs_mom;n/a;mom_server_all_update_gpustat;composing gpu status update for server 
	08/22/2011 15:57:24;0001; pbs_mom;Svr;pbs_mom;LOG_DEBUG::gpus, gpus: GPU cmd issued: nvidia-smi -a -x 2>&1
	08/22/2011 15:57:26;0001; pbs_mom;Svr;pbs_mom;LOG_ERROR::a system error occured (15205) in generate_server_gpustatus_smi, Unknown Nvidia driver version
	08/22/2011 15:57:26;0001; pbs_mom;Svr;pbs_mom;LOG_ERROR::a system error occured (15205) in generate_server_gpustatus_smi, Unknown Nvidia driver version
	08/22/2011 15:57:26;0002; pbs_mom;n/a;mom_server_update_gpustat;mom_server_update_gpustat: sending to server "timestamp=Mon Aug 22 15:57:26 2011" 
	08/22/2011 15:57:26;0002; pbs_mom;n/a;mom_server_update_gpustat;mom_server_update_gpustat: sending to server "driver_ver=275.09.07" 
	08/22/2011 15:57:26;0002; pbs_mom;n/a;mom_server_update_gpustat;mom_server_update_gpustat: sending to server "gpuid=0:14:0" 
	08/22/2011 15:57:26;0002; pbs_mom;n/a;mom_server_update_gpustat;mom_server_update_gpustat: sending to server "gpuid=0:15:0" 
	08/22/2011 15:57:26;0002; pbs_mom;n/a;mom_server_update_gpustat;status update successfully sent to bhsn-int 
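To pull just the error entries out of a longer mom logfile, the semicolon-delimited lines above can be split into fields. This is a rough sketch that assumes every line has the six-field shape seen in this excerpt. The field names (timestamp, code, daemon, obj_type, obj_name, message) are labels chosen for illustration, not names taken from TORQUE.

```python
from collections import namedtuple

# Field names are illustrative labels for the six columns seen in the excerpt.
MomLogEntry = namedtuple(
    "MomLogEntry", "timestamp code daemon obj_type obj_name message"
)


def parse_mom_log_line(line):
    """Return a MomLogEntry, or None if the line does not have six fields."""
    # Split at most five times so semicolons inside the message are kept.
    parts = line.strip().split(";", 5)
    if len(parts) != 6:
        return None
    return MomLogEntry(*(p.strip() for p in parts))


def error_entries(lines):
    """Keep only the entries whose message starts with LOG_ERROR."""
    entries = (parse_mom_log_line(line) for line in lines)
    return [e for e in entries if e and e.message.startswith("LOG_ERROR")]
```

Passing an open logfile to error_entries() returns only the LOG_ERROR records, so repeated "Unknown Nvidia driver version" messages can be counted per status update.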


	Is the driver version we have not supported by TORQUE?
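One way to see the version string that the logged command produces is to run nvidia-smi by hand and read the driver version from its XML report. The sketch below assumes the report contains a driver_version element. That element name and both function names are assumptions, and nothing here reflects how TORQUE itself parses the output or which versions it accepts. Unlike the logged command, stderr is discarded rather than merged, so that stray messages cannot corrupt the XML.

```python
import subprocess
import xml.etree.ElementTree as ET


def driver_version_from_xml(xml_text):
    """Return the text of the first driver_version element, or None.

    The element name is an assumption about the nvidia-smi XML report.
    """
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError:
        return None
    node = next(root.iter("driver_version"), None)
    if node is None or node.text is None:
        return None
    return node.text.strip()


def query_driver_version():
    """Run the command seen in the mom log and extract the driver version."""
    result = subprocess.run(
        ["nvidia-smi", "-a", "-x"],
        stdout=subprocess.PIPE,
        stderr=subprocess.DEVNULL,  # keep stderr out of the XML
        universal_newlines=True,
    )
    return driver_version_from_xml(result.stdout)
```

If query_driver_version() returns the same 275.09.07 string that pbsnodes reports, the version is being read correctly and the error is more likely about which versions the mom recognizes than about a parsing failure.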



	Environment:
	- TORQUE-2.5.6
	- NVIDIA Driver Version : 275.09.07
	- kernel:	2.6.18-238.12.1.el5 

	- TORQUE client was built via:
	This build was configured with: '--prefix=/opt/torque/2.5.6' '--exec-prefix=/opt/torque/2.5.6/x86_64' '--with-server-home=/var/spool/pbs' '--enable-syslog' '--with-scp' '--disable-rpp' '--disable-spool' '--with-pam' '--with-cpusets' '--with-geometry-requests' '--disable-gui' '--enable-nvidia-gpus' '--enable-docs'



 ----------------------
 Steve Crusan
 System Administrator
 Center for Research Computing
 University of Rochester
 https://www.crc.rochester.edu/


-----BEGIN PGP SIGNATURE-----
Version: GnuPG/MacGPG2 v2.0.17 (Darwin)
Comment: GPGTools - http://gpgtools.org

iQEcBAEBAgAGBQJOUrawAAoJENS19LGOpgqKwkoIAIQrY8rZn+J+vaSgnTElGxvu
KcMYlqkiBBZtix7YBCVMsHTv5PcOPT/4l1qHX4/7/P9ZW6Xc542LNKLJrd46FcLa
cmbkixUaGRJ5SDCVSyA6YzZZIBDHBjP3AMrIouDwjyOEhR3A9agI5yYPdFTRdcNQ
NoagT372lZnhVfPUYrVLM8oVIbS+KsZZGiYA4HShsbPUB/qqU/YqNroLlg7o8lVX
gHBY7C231TpC/YAJx1xZ5qjSSl1/mtzK8PuzqZ5mWBFtoXFvlzXFe+C0uqcCHLh2
jjkGeRU09YCkHEuqJy+iQ/KDGgvAFSmyuDgWq3RPJX8c7xw+y7saDLjhH9vPdVg=
=zdfO
-----END PGP SIGNATURE-----

