Differences

This shows you the differences between two versions of the page.

condor:installation:network [2011/07/18 18:04] – reorganized sections garrettheath4
condor:installation:network [2012/08/09 19:18] (current) – [Configure Authentication] fixed sentence fragment garrettheath4
Line 8: Line 8:
  
 In order for daemons to run correctly and for permissions to be properly set, a local ''condor'' user must be present on all members of the Condor pool.  The following must be set for the ''condor'' users:\\ In order for daemons to run correctly and for permissions to be properly set, a local ''condor'' user must be present on all members of the Condor pool.  The following must be set for the ''condor'' users:\\
-  condor UID = 1344 +  condor UID = 64 
-  condor GID = 1610+  condor GID = 64
  
 First, check to see if the ''condor'' user exists on the machine.  Do this by running: First, check to see if the ''condor'' user exists on the machine.  Do this by running:
 <code bash>cat /etc/passwd | grep ^condor:</code> <code bash>cat /etc/passwd | grep ^condor:</code>
 **If you get a match**, first reset its settings in case the user wasn't created correctly. **If you get a match**, first reset its settings in case the user wasn't created correctly.
-<code bash>sudo groupmod -g 1610 condor +<code bash>sudo groupmod -g 64 condor 
-sudo usermod -c "Owner of Condor Daemons" -d "/var/lib/condor" -m -u 1344 -g condor -s "/sbin/nologin" -L condor</code>+sudo usermod -c "Owner of Condor Daemons" -d "/var/lib/condor" -m -u 64 -g condor -s "/sbin/nologin" -L condor</code>
 :!: If you get a message that says that the directory ''/var/lib/condor'' already exists, run this command next: :!: If you get a message that says that the directory ''/var/lib/condor'' already exists, run this command next:
 <code bash>sudo chown -R condor:condor /var/lib/condor</code> <code bash>sudo chown -R condor:condor /var/lib/condor</code>
  
 **If you do not get a match**, you need to manually add the user.  To do this, run: **If you do not get a match**, you need to manually add the user.  To do this, run:
-<code bash>sudo groupadd -g 1610 condor +<code bash>sudo groupadd -g 64 condor 
-sudo useradd -c "Owner of Condor Daemons" -d "/var/lib/condor" -m -u 1344 -g condor -s "/sbin/nologin" condor+sudo useradd -c "Owner of Condor Daemons" -d "/var/lib/condor" -m -u 64 -g condor -s "/sbin/nologin" condor
 sudo usermod -L condor</code> sudo usermod -L condor</code>
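Either branch above should leave the ''condor'' account with UID 64 and GID 64.  A minimal sketch for checking that, assuming a hypothetical helper named ''check_condor_entry'' (not a Condor tool) that inspects one ''/etc/passwd''-style line:

```shell
# check_condor_entry is a hypothetical helper, not part of Condor.
# Arguments: a passwd-format line, expected UID (default 64), expected GID (default 64).
check_condor_entry() {
    entry=$1
    want_uid=${2:-64}
    want_gid=${3:-64}
    # passwd fields are colon-separated: name:password:UID:GID:comment:home:shell
    name=$(printf '%s\n' "$entry" | cut -d: -f1)
    uid=$(printf '%s\n' "$entry" | cut -d: -f3)
    gid=$(printf '%s\n' "$entry" | cut -d: -f4)
    if [ "$name" = "condor" ] && [ "$uid" = "$want_uid" ] && [ "$gid" = "$want_gid" ]; then
        echo "ok"
    else
        echo "mismatch"
        return 1
    fi
}

# Usage on a pool member:
# check_condor_entry "$(grep '^condor:' /etc/passwd)"
```

Running the helper on every member of the pool is one way to confirm that the IDs agree everywhere.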
  
Line 28: Line 28:
 ===== Install Binaries ===== ===== Install Binaries =====
  
-In order to install the binaries onto the ''tesla.cs.wlu.edu'' NAS, run this command in the terminal: +The NAS stores all of the program files that Condor needs to run.  **Installing these onto the NAS only needs to happen once**, but if you need to reinstall the binaries onto the ''tesla.cs.wlu.edu'' NAS anyway, run this command in the terminal:
 <code bash>cd /mnt/config/src/fedora64 <code bash>cd /mnt/config/src/fedora64
 sudo ./condor_configure --type=manager,submit,execute --central-manager=john.cs.wlu.edu --local-dir=/mnt/config/hosts/_default --install-dir=/mnt/config/release/x86_64_rhap_5 --owner=condor --install --verbose</code> sudo ./condor_configure --type=manager,submit,execute --central-manager=john.cs.wlu.edu --local-dir=/mnt/config/hosts/_default --install-dir=/mnt/config/release/x86_64_rhap_5 --owner=condor --install --verbose</code>
  
 ===== Set Machine Variables ===== ===== Set Machine Variables =====
 +
 +Whenever the ''condor_master'' program starts, the first thing it does is look for the global configuration file.  FIXME
  
 The problem with putting as much of Condor as possible on the NAS is that this introduces a lot of NFS traffic onto the network, especially when Condor jobs are running.  Having the user executables stored centrally on the NAS will cause all of the computers to read from the NAS almost constantly whenever the executables are opened and run. The problem with putting as much of Condor as possible on the NAS is that this introduces a lot of NFS traffic onto the network, especially when Condor jobs are running.  Having the user executables stored centrally on the NAS will cause all of the computers to read from the NAS almost constantly whenever the executables are opened and run.
Line 45: Line 47:
 sudo chown -R condor:condor /var/lib/condor/execute sudo chown -R condor:condor /var/lib/condor/execute
 sudo chmod -R 755 /var/lib/condor/execute</code> sudo chmod -R 755 /var/lib/condor/execute</code>
 +
 +=====Configure Authentication=====
 +As specified in our Condor system's global configuration file, access to Condor is restricted to certain machines and usernames.  Whenever Condor receives a request, it first checks whether the requester is allowed to make that request.  Unfortunately, the requesting machine can lie about who it is and thereby "spoof" Condor into thinking the request comes from a valid source.  To help prevent this, Condor uses basic authentication to protect itself from computers disguised as valid members of its pool.  This authentication takes the form of an encrypted password.  When Condor starts, it reads the configuration files to find out where the password is stored.  The ''SEC_PASSWORD_FILE'' configuration variable in the global configuration file sets this location to ''/var/lib/condor/pool_password'', which has root-only access.  In order for a machine to be added to the Condor pool, this file __must be manually copied__ from an existing member of the pool to the new member.  Once copied, the file must be owned by ''root'' and be readable and writable by its owner only, with all other permissions disabled (mode ''0600'').
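The ownership and mode requirements above can be checked after the copy.  The following is a sketch only: ''check_pool_password'' is a hypothetical helper (not a Condor tool), and it assumes GNU ''stat'' as shipped with Fedora.

```shell
# check_pool_password is a hypothetical helper, not part of Condor.
# Arguments: path to the password file, expected owner (default: root).
check_pool_password() {
    file=$1
    want_owner=${2:-root}
    if [ ! -f "$file" ]; then
        echo "missing"
        return 1
    fi
    # GNU stat: %U prints the owner's user name, %a the octal permission bits.
    owner=$(stat -c '%U' "$file")
    mode=$(stat -c '%a' "$file")
    if [ "$owner" = "$want_owner" ] && [ "$mode" = "600" ]; then
        echo "ok"
    else
        echo "bad: owner=$owner mode=$mode"
        return 1
    fi
}

# On the new pool member, once the file has been copied over:
# sudo chown root:root /var/lib/condor/pool_password
# sudo chmod 0600 /var/lib/condor/pool_password
# check_pool_password /var/lib/condor/pool_password
```

The helper only reads file metadata, so it can be run before starting Condor without touching the password itself.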
 +
 +=====Configure Firewall=====
 +Condor primarily uses **port 9618** for communication between the ''condor_master'' daemons on each of the members of the Condor pool.  Because of this, the firewall on each member needs to have port 9618 open to accept incoming communication.  Condor also uses many other dynamically chosen ports for direct communication between daemons that want to bypass the ''condor_master'' daemon (in order to not bog down the busy ''condor_master'' daemon, of course).  If the daemons are configured to publish their port numbers publicly (in the filesystem), the daemons should be allowed to communicate directly with each other.
 +
 +To allow this, a large range of ports needs to be opened so that all of the Condor daemons can freely communicate with each other while still using dynamically allocated ports.  Specifically, the configuration variables ''HIGHPORT'' and ''LOWPORT'' in the global configuration file define what range of ports Condor is allowed to use.  By default and/or by convention, this range is 9600-9700.  To open this port range, run ''system-config-firewall'' as ''root'' and add the **9600-9700** user-defined tcp and udp port ranges to the "Other Ports" section.  Click "Apply" to finish the deed.  Now, Condor daemons can freely communicate while not being impeded by the firewall.
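On a machine without a graphical session, the same range could be opened from the command line.  This is a sketch under the assumption that the firewall is managed through ''iptables''; ''condor_firewall_rules'' is a hypothetical helper that only prints the commands so they can be reviewed before being run as ''root''.

```shell
# condor_firewall_rules is a hypothetical helper, not part of Condor.
# It prints (does not run) one iptables rule per protocol for the given port range.
# Arguments: low port (default 9600), high port (default 9700),
# matching LOWPORT and HIGHPORT in the global configuration file.
condor_firewall_rules() {
    low=${1:-9600}
    high=${2:-9700}
    for proto in tcp udp; do
        echo "iptables -I INPUT -p $proto --dport $low:$high -j ACCEPT"
    done
}

# Usage: review the output, then run each printed line with sudo.
# condor_firewall_rules 9600 9700
```

Rules inserted this way do not survive a reboot unless they are also saved, which is one reason the ''system-config-firewall'' route described above may still be preferable.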
 +
condor/installation/network.1311012279.txt.gz · Last modified: 2011/07/18 18:04 by garrettheath4
CC Attribution-Noncommercial-Share Alike 4.0 International