[torqueusers] NVIDIA GPUs version error
Steve Crusan
scrusan at ur.rochester.edu
Mon Aug 22 14:06:00 MDT 2011
Hi all,
I'm getting errors in syslog from the pbs_mom daemons on our GPU nodes:
Aug 22 15:55:09 blugpu07 pbs_mom: LOG_ERROR::a system error occured (15205) in generate_server_gpustatus_smi, Unknown Nvidia driver version
Here is the snipped output of pbsnodes blugpu07:
<SNIPPED>
gpu_status = gpu[1]=gpu_id=0:15:0;,gpu[0]=gpu_id=0:14:0;,driver_ver=275.09.07,timestamp=Mon Aug 22 15:56:41 2011
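For anyone who wants to check the gpu_status value across many nodes, here is a minimal Python sketch that splits it into per-GPU and node-level fields. It assumes the layout is exactly what the line above shows (comma-separated entries, with per-GPU entries of the form gpu[N]=key=value;). The function name parse_gpu_status is made up for this example and is not part of TORQUE.

```python
def parse_gpu_status(value):
    """Split a gpu_status attribute value into (per-GPU fields, node-level fields).

    Assumes the layout seen in the pbsnodes output above; other TORQUE
    versions may format this attribute differently.
    """
    gpus = {}
    node = {}
    for entry in value.split(","):
        entry = entry.strip()
        if not entry:
            continue
        key, _, rest = entry.partition("=")
        if key.startswith("gpu[") and key.endswith("]"):
            # Per-GPU entry: "gpu[1]=gpu_id=0:15:0;" -> index 1, {"gpu_id": "0:15:0"}
            index = int(key[4:-1])
            fields = {}
            for pair in rest.split(";"):
                if pair:
                    name, _, val = pair.partition("=")
                    fields[name] = val
            gpus[index] = fields
        else:
            # Node-level entry such as driver_ver or timestamp
            node[key] = rest
    return gpus, node


line = ("gpu[1]=gpu_id=0:15:0;,gpu[0]=gpu_id=0:14:0;,"
        "driver_ver=275.09.07,timestamp=Mon Aug 22 15:56:41 2011")
gpus, node = parse_gpu_status(line)
print(node["driver_ver"], sorted(gpus))
```

With the line above, this shows that the driver version and both GPU ids do reach the server even though pbs_mom logs the error.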
If I log in to the node and check the pbs_mom log files, I see the following:
08/22/2011 15:57:24;0002; pbs_mom;n/a;mom_server_all_update_gpustat;composing gpu status update for server
08/22/2011 15:57:24;0001; pbs_mom;Svr;pbs_mom;LOG_DEBUG::gpus, gpus: GPU cmd issued: nvidia-smi -a -x 2>&1
08/22/2011 15:57:26;0001; pbs_mom;Svr;pbs_mom;LOG_ERROR::a system error occured (15205) in generate_server_gpustatus_smi, Unknown Nvidia driver version
08/22/2011 15:57:26;0001; pbs_mom;Svr;pbs_mom;LOG_ERROR::a system error occured (15205) in generate_server_gpustatus_smi, Unknown Nvidia driver version
08/22/2011 15:57:26;0002; pbs_mom;n/a;mom_server_update_gpustat;mom_server_update_gpustat: sending to server "timestamp=Mon Aug 22 15:57:26 2011"
08/22/2011 15:57:26;0002; pbs_mom;n/a;mom_server_update_gpustat;mom_server_update_gpustat: sending to server "driver_ver=275.09.07"
08/22/2011 15:57:26;0002; pbs_mom;n/a;mom_server_update_gpustat;mom_server_update_gpustat: sending to server "gpuid=0:14:0"
08/22/2011 15:57:26;0002; pbs_mom;n/a;mom_server_update_gpustat;mom_server_update_gpustat: sending to server "gpuid=0:15:0"
08/22/2011 15:57:26;0002; pbs_mom;n/a;mom_server_update_gpustat;status update successfully sent to bhsn-int
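To pull just the error entries out of a mom log like the one above, a rough Python sketch follows. It assumes each line has six semicolon-separated fields, as in the excerpt. The field names (timestamp, code, daemon, obj_class, obj_name, message) are my own labels for this example, not TORQUE terminology, and the helper names are hypothetical.

```python
from collections import namedtuple

# Field names are labels chosen for this sketch, not TORQUE's own terms.
LogEntry = namedtuple(
    "LogEntry", "timestamp code daemon obj_class obj_name message"
)


def parse_mom_log_line(line):
    """Split one pbs_mom log line into its six fields, or return None."""
    # maxsplit=5 keeps any semicolons inside the message text intact.
    parts = line.rstrip("\n").split(";", 5)
    if len(parts) != 6:
        return None
    return LogEntry(*(p.strip() for p in parts))


def find_errors(lines):
    """Return the entries whose message starts with LOG_ERROR."""
    entries = (parse_mom_log_line(raw) for raw in lines)
    return [e for e in entries
            if e is not None and e.message.startswith("LOG_ERROR")]


SAMPLE = [
    "08/22/2011 15:57:24;0001; pbs_mom;Svr;pbs_mom;LOG_DEBUG::gpus, gpus: "
    "GPU cmd issued: nvidia-smi -a -x 2>&1",
    "08/22/2011 15:57:26;0001; pbs_mom;Svr;pbs_mom;LOG_ERROR::a system error "
    "occured (15205) in generate_server_gpustatus_smi, Unknown Nvidia driver version",
    "08/22/2011 15:57:26;0002; pbs_mom;n/a;mom_server_update_gpustat;"
    "status update successfully sent to bhsn-int",
]

errors = find_errors(SAMPLE)
for entry in errors:
    print(entry.timestamp, entry.message)
```

In practice the lines would come from the mom log file on the node rather than the SAMPLE list.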
Is our driver version not supported by TORQUE?
Environment:
- TORQUE-2.5.6
- NVIDIA Driver Version : 275.09.07
- kernel: 2.6.18-238.12.1.el5
- TORQUE client was built via:
This build was configured with: '--prefix=/opt/torque/2.5.6' '--exec-prefix=/opt/torque/2.5.6/x86_64' '--with-server-home=/var/spool/pbs' '--enable-syslog' '--with-scp' '--disable-rpp' '--disable-spool' '--with-pam' '--with-cpusets' '--with-geometry-requests' '--disable-gui' '--enable-nvidia-gpus' '--enable-docs'
----------------------
Steve Crusan
System Administrator
Center for Research Computing
University of Rochester
https://www.crc.rochester.edu/