The setup is a High Performance Computing (HPC) cluster running Scientific Linux. The head nodes and compute nodes are members of a Windows Server 2012 R2 Active Directory domain. Users log in to client nodes with their AD login and then SSH (Kerberos-enabled passwordless login) in to the head node to submit PBS job scripts to the cluster.
The kinit and id commands work on the head node and display the relevant information for the domain user, which means that the head node can connect to Active Directory and resolve domain usernames.
Issue:
When submitting PBS jobs, users see the following error:
[domainuser@myorg.org.au@HPC torque]$ qsub FirstJob.pbs
qsub: submit error (Bad UID for job execution MSG=User domainuser does not exist in server password file
When a job is submitted, Torque calls the system function getpwnam_r to look up information about the submitting user. The error message is misleading, because it sounds as if getpwnam_r only looks for the user in the "server password file". According to the man page, however, it also searches NIS and LDAP for the given user when those sources are configured:
“ The getpwnam() function returns a pointer to a structure containing the broken-out fields of the record in the password database (e.g., the local password file /etc/passwd, NIS, and LDAP) that matches the username name.
The getpwuid() function returns a pointer to a structure containing the broken-out fields of the record in the password database that matches the user ID uid. ”
Reason:
The error occurs because getpwnam_r looks for the user domainuser@myorg.org.au instead of domainuser in the configured user databases, in this case the local passwd file as well as Active Directory.
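To see which form of the name actually resolves on the head node, the lookup can be reproduced outside Torque. The following is a minimal sketch, not Torque's own code: Python's pwd.getpwnam goes through the same system password database lookup as getpwnam, and the helper names lookup, name_forms and report are made up for this illustration. The login domainuser@myorg.org.au is the example name from this post.

```python
import pwd


def lookup(name):
    """Return the passwd entry for name, or None if it does not resolve."""
    try:
        return pwd.getpwnam(name)
    except KeyError:
        # pwd.getpwnam raises KeyError when no database knows the name.
        return None


def name_forms(login):
    """Return the login as given plus its short form (text before the first '@')."""
    short = login.split("@", 1)[0]
    return [login] if short == login else [login, short]


def report(login):
    """Print, for each form of the login, whether the system can resolve it."""
    for name in name_forms(login):
        entry = lookup(name)
        status = f"uid={entry.pw_uid}" if entry else "not found"
        print(f"{name}: {status}")


if __name__ == "__main__":
    # Example login from this post; replace with a real domain user.
    report("domainuser@myorg.org.au")
```

If only one of the two forms is reported as found, any program that asks for the other form will be told that the user does not exist, even though AD authentication itself works.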
Fix:
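Edit the /etc/sssd/sssd.conf file and set the option use_fully_qualified_names to False, so that SSSD looks up the plain username (domainuser) in Active Directory rather than the fully qualified name. Restart the sssd service afterwards so that the change takes effect.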
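As a quick check of the setting, the option can be read back from the configuration. This is only a sketch under assumptions: the SAMPLE text is a made-up, stripped-down sssd.conf (the section name domain/myorg.org.au and the other keys are illustrative, not copied from the real cluster), and fqn_settings is a hypothetical helper that relies on sssd.conf being an INI-style file that Python's configparser can read.

```python
import configparser

# Illustrative configuration only; a real sssd.conf has more options.
SAMPLE = """\
[sssd]
domains = myorg.org.au
services = nss, pam

[domain/myorg.org.au]
id_provider = ad
use_fully_qualified_names = False
"""


def fqn_settings(conf_text):
    """Map each [domain/...] section to its use_fully_qualified_names value."""
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.read_string(conf_text)
    result = {}
    for section in parser.sections():
        if section.startswith("domain/"):
            # None means the option is not set in this section.
            result[section] = parser.getboolean(
                section, "use_fully_qualified_names", fallback=None
            )
    return result


if __name__ == "__main__":
    for section, value in fqn_settings(SAMPLE).items():
        print(f"{section}: use_fully_qualified_names = {value}")
```

To check the real file, pass the contents of /etc/sssd/sssd.conf to fqn_settings instead of SAMPLE. The file is normally readable only by root, so the script has to run with sufficient privileges.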