Friday, July 22, 2016

Configuring Bacula to back up an Apple Mac to a remote backup server

Scenario:
A set of Apple desktop machines running OS X El Capitan needs to be backed up to a remote Bacula server. This post describes the steps I took to get the setup working.

Environment:
Backup Server: Scientific Linux 7
Bacula director: Bacula 7.0.5, upgraded to 7.4.3
Client Machines: OS X El Capitan
Bacula Client: bacula-fd 7.4.3

Step 1.
Install Homebrew package manager for Mac.

Step 2.
Install bacula-fd, Bacula's File Daemon.
Only the Bacula File Daemon needs to be installed on the client machines.
brew install bacula-fd

Make the bacula-fd daemon start at startup.
sudo brew services start bacula-fd

Step 3.
Upgrade Bacula on the server.
The Red Hat repositories only have Bacula 7.0.5 binaries at the time of this post. Since the client installed by Homebrew is version 7.4.3, I believe it's a good idea to upgrade the director and the storage daemon to a version equal to or higher than the client's.

a. Back up your Bacula configuration files in the /etc/bacula directory on the server.
b. Back up the Bacula PostgreSQL or MySQL database.
c. Download the latest Bacula source.
d. Run ./configure --with-postgresql . Other configuration options are listed here.
e. Run make and then make install.
f. Run the database update script.
g. Copy the existing configuration (conf) files over the new ones.
h. Run chkconfig bacula-dir on, chkconfig bacula-sd on and chkconfig bacula-fd on to make them start at start-up.
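
Steps a and b can be scripted before touching the installed binaries. The following is a minimal sketch, not the exact commands I ran: the helper name backup_conf_dir, the /root destination and the database name bacula are assumptions, so adjust them to your server.

```shell
#!/bin/sh
# Archive a Bacula configuration directory before an upgrade.
# Usage: backup_conf_dir <conf_dir> <dest_dir>
# Prints the path of the archive it created.
backup_conf_dir() {
    conf_dir=$1
    dest_dir=$2
    archive="$dest_dir/bacula-conf-$(date +%Y%m%d-%H%M%S).tar.gz"
    # -C keeps the paths inside the archive relative (bacula/...).
    tar -czf "$archive" -C "$(dirname "$conf_dir")" "$(basename "$conf_dir")" || return 1
    printf '%s\n' "$archive"
}

# Steps a and b on the server (run as root; uncomment to use):
# backup_conf_dir /etc/bacula /root
# sudo -u postgres pg_dump bacula > /root/bacula-db.sql
```

Keeping the archive outside /etc/bacula makes it easy to copy the old conf files back in step g.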

Step 4.
Configuring bacula-fd on clients.
Machines running bacula-dir, bacula-sd and bacula-fd must be able to resolve each other through DNS or host records.

The log file for bacula-fd on client machines is located in:
/usr/local/var/log/bacula
This log file is not used until you specifically include it in the conf file.
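
As a rough illustration of including the log file, a Messages resource in bacula-fd.conf along these lines appends the daemon's messages to a file in that directory. The Director name bacula-dir and the file name bacula-fd.log are assumptions, not values from my setup.

```
# bacula-fd.conf (client side) - sketch only
Messages {
  Name = Standard
  # Send job messages back to the Director (name is an assumption).
  director = bacula-dir = all, !skipped, !restored
  # Also append them to a local log file (file name is an assumption).
  append = "/usr/local/var/log/bacula/bacula-fd.log" = all, !skipped
}
```

The file daemon needs a restart to pick up the change.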

A sample config file for bacula-fd is shown below. Note the storage {} clause included in it.


Step 5.
Configure clients, file lists, backup jobs and schedules on the Bacula Director.
You can manually edit the conf files or use Webmin to configure this part.
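
If you edit the conf files by hand, the resources involved look roughly like the sketch below. Every name, host and password in it (mac01-fd, mac01.example.org, backup.example.org, MacUsers, NightlyMac, the File storage and Default pool) is hypothetical and only there to show how the pieces refer to each other; it is not my production configuration.

```
# bacula-dir.conf (server side) - sketch only, all names are hypothetical
Client {
  Name = mac01-fd
  Address = mac01.example.org     # must resolve from the server
  FDPort = 9102
  Catalog = MyCatalog
  Password = "changeme-fd"        # must match the Director {} password in bacula-fd.conf
}

Storage {
  Name = File
  Address = backup.example.org    # a name the client can resolve, not localhost
  SDPort = 9103
  Password = "changeme-sd"
  Device = FileStorage
  Media Type = File
}

FileSet {
  Name = "MacUsers"
  Include {
    Options {
      signature = MD5
    }
    File = /Users
  }
}

Schedule {
  Name = "NightlyMac"
  Run = Full 1st sun at 23:05
  Run = Incremental mon-sat at 23:05
}

Job {
  Name = "BackupMac01"
  Type = Backup
  Level = Incremental
  Client = mac01-fd
  FileSet = "MacUsers"
  Schedule = "NightlyMac"
  Storage = File
  Pool = Default
  Messages = Standard
}
```

The Director passes the Storage address to the client at job time, which is why it has to be a name the Mac can resolve.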

Some error messages you may come across:
1. Fatal error: Unable to authenticate with File daemon at "myserver.ip.address:9102". Possible causes: Passwords or names not the same or Maximum Concurrent Jobs exceeded on the FD or FD networking messed up (restart daemon)

This means that your Bacula director cannot talk to the Bacula file daemon. Follow these steps to resolve it.
a. Disable the client and server firewalls, or add exceptions to them for the bacula-fd port (TCP 9102).
b. Make sure that the clients and the server can resolve each other.
c. Make sure that the passwords are right.
d. Restart the client Mac. This clears any port binding issues.
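
For step c, the password in the Client resource on the director has to match the one in the Director resource of the client's bacula-fd.conf. A small helper like the one below can list the password lines in each file so they can be compared by eye. This is only a sketch: show_passwords is a hypothetical name, and the file paths in the comments are assumptions about where your conf files live.

```shell
#!/bin/sh
# Print every Password line of a Bacula configuration file, with line numbers.
show_passwords() {
    grep -n -i '^[[:space:]]*Password[[:space:]]*=' "$1"
}

# Example use (paths are assumptions; adjust to your installation):
#   show_passwords /etc/bacula/bacula-dir.conf      # on the server
#   show_passwords /path/to/bacula-fd.conf          # on the Mac client
```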

2. Warning: Cannot bind port 9102: ERR=Address already in use: Retrying ...
You may see this error in the bacula-fd client log. Restarting the client machine will resolve it.

3. Warning: bsock.c:107 Could not connect to Storage daemon on localhost:9103. ERR=Connection refused
a. You need to include the storage {} clause in your bacula-fd.conf file to redirect it to the remote storage server. Otherwise it will look for a Bacula storage daemon on the local host.
b. Open up TCP port 9103 on client and server firewalls.

Saturday, May 7, 2016

qsub-Bad UID for job execution MSG=User does not exist in server password file

Scenario:
A High Performance Compute cluster running Scientific Linux. The head nodes and compute nodes are members of a Windows 2012 R2 Active Directory domain. Users log in to client nodes using their AD login and then SSH (Kerberos-enabled passwordless login) in to the head node to submit PBS job scripts to the cluster.

The kinit and id commands work on the head node and display the relevant information for domainuser, which means that the head node can connect to Active Directory and resolve domain usernames.

Issue:
When submitting PBS jobs, users see the following error:
[domainuser@myorg.org.au@HPC torque]$ qsub FirstJob.pbs 
qsub: submit error (Bad UID for job execution MSG=User domainuser does not exist in server password file


When a job is submitted, Torque calls the system function getpwnam_r to fetch information about the submitting user. The error message is misleading, as it suggests that getpwnam_r looks for the user only in the "server password file". According to the man page, however, it also searches NIS and LDAP when those are configured.

“The getpwnam() function returns a pointer to a structure containing the broken-out fields of the record in the password database (e.g., the local password file /etc/passwd, NIS, and LDAP) that matches the username name.
The getpwuid() function returns a pointer to a structure containing the broken-out fields of the record in the password database that matches the user ID uid.”

Reason:
The error occurs because getpwnam_r looks up the user as domainuser@myorg.org.au instead of domainuser in the authentication databases, in this case both the passwd file and Active Directory.

Fix:
Open the /etc/sssd/sssd.conf file and set the option use_fully_qualified_names to False, so that SSSD looks up only the short username (domainuser) in Active Directory.
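
In sssd.conf the option lives in the domain section. A sketch of the relevant fragment is below; the section name [domain/myorg.org.au] is an assumption based on the domain shown in the prompt above, and the rest of your existing section stays as it is.

```
# /etc/sssd/sssd.conf - only the relevant line is shown
[domain/myorg.org.au]
use_fully_qualified_names = False
```

SSSD has to be restarted (for example with systemctl restart sssd) before the change takes effect. Afterwards, getent passwd domainuser should return the AD account on the head node.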

Torque Make error: mom_mach.h: No such file or directory

Issue: 
While running make to compile the TORQUE Resource Manager, you may come across this error:
site_mom_chu.c:25:22: fatal error: mom_mach.h: No such file or directory
#include "mom_mach.h"


Environment: 
OS: Scientific Linux 7.2
TORQUE Resource Manager :  6.0.1

Fix: 
To fix this issue, simply give full file access to all the files in the torque folder.

chmod 777 -R *

Then attempt the MAKE process again.

The only other available web resource regarding this issue is here:

Monday, April 11, 2016

Configuring XNAT to use Active Directory LDAP Authentication

Intro:
XNAT is an open source imaging informatics platform developed by the Neuroinformatics Research Group at Washington University. XNAT was originally developed in the Buckner Lab at Washington University, now at Harvard University. It facilitates common management, productivity, and quality assurance tasks for imaging and associated data. Thanks to its extensibility, XNAT can be used to support a wide range of imaging-based projects.

Tested with the following versions:
OS : Scientific Linux 7.2
XNAT version: 1.6.5
Java Version: 1.7.0_79
AD : Windows Server 2012 R2

Let's assume that, 

  • Your organisation's Active Directory domain is: myorg.com.au
  • All your users are located in the People OU in the root of the domain
  • The directory server DNS name is: dc01.myorg.com.au
  • The LDAP service account used to access and read domain information is located in: myorg.com.au/People/Service Accounts
  • The service account used to access the directory is: srvldap and its password is: password

Official documentation on how to configure XNAT for LDAP authentication is located here.
Services.Properties Configuration - XNAT 1.6.x Documentation - XNAT Documentation Wiki

The purpose of this post is to provide you with accurate configuration options to make XNAT work with Active Directory. 

This is what a working configuration should look like in the XNAT /apache-tomcat-7.0.68/webapps/xnat/WEB-INF/conf/services.properties file. (Note that the path will be different in your implementation.)

############# services.properties  ############# 
# Comma-separated list of the providers that users will be able to use to authenticate.
provider.providers.enabled=db,ldap1

provider.db.name=LOCAL
provider.db.id=localdb
provider.db.type=db

# Add "ldap1" to the enabled provider list above and fill in the missing fields to enable LDAP authentication.
provider.ldap1.name=MYORG
provider.ldap1.id=ldap1
provider.ldap1.type=ldap
provider.ldap1.address=ldap://dc01.myorg.com.au:389/dc=myorg,dc=com,dc=au
provider.ldap1.userdn=myorg.com.au/People/Service Accounts/srvldap
provider.ldap1.password=password
provider.ldap1.search.base=ou=People
provider.ldap1.search.filter=(sAMAccountName={0})

############ END services.properties  ###########

Note that:
1. In the provider.ldap1.address field, I have used dc=myorg,dc=com,dc=au instead of the recommended dc=au,dc=com,dc=myorg. This order is important. Otherwise you will get the following error in your XNAT security.log file.

"Authentication request failed: org.springframework.security.authentication.BadCredentialsException: Bad credentials"

2. I have used the canonical name for the provider.ldap1.userdn field instead of the distinguished name (DN).
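
To illustrate note 1: the base DN follows the same left-to-right order as the DNS name of the domain. The helper below is a minimal sketch that derives it from the domain name; domain_to_base_dn is a hypothetical name. The commented ldapsearch call assumes the OpenLDAP client tools are installed, and the UPN-style bind name srvldap@myorg.com.au and the test user someuser are assumptions, not values XNAT itself uses.

```shell
#!/bin/sh
# Turn a DNS domain name into an LDAP base DN, keeping the component order.
domain_to_base_dn() {
    printf '%s\n' "$1" | sed -e 's/\./,dc=/g' -e 's/^/dc=/'
}

domain_to_base_dn myorg.com.au

# Optional check of the same search from the command line
# (prompts for the service account password):
# ldapsearch -x -H ldap://dc01.myorg.com.au:389 \
#     -D "srvldap@myorg.com.au" -W \
#     -b "ou=People,$(domain_to_base_dn myorg.com.au)" \
#     "(sAMAccountName=someuser)"
```

If the search returns the user, the address, base and filter values in services.properties are at least consistent with what the directory expects.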

Some helpful tips:
1. To enable debugging in the XNAT security log, change the flags shown below in the log4j.properties file.
This file is located in the /apache-tomcat-7.0.68/webapps/xnat/WEB-INF/conf/ folder.

Change the flags from WARN to DEBUG:

# Security logs, both Spring Framework and XNAT
log4j.category.org.springframework.security=DEBUG, security
log4j.additivity.org.springframework.security=false
log4j.category.org.nrg.xnat.security=DEBUG, security
log4j.additivity.org.nrg.xnat.security=false

2. Use JXplorer to test connectivity to Active Directory. Because XNAT is also Java-based, a Java-based tool like JXplorer makes it easier to troubleshoot issues in this kind of scenario.

A helpful reference: XNAT 1.6.3 LDAP Error - Google Groups