condor:installation:configuration, last revised 2011/08/19 18:52 by garrettheath4 (previous revision: 2011/07/22 15:57).
=====Global Configuration File=====
<file autoconf condor_config_global>
######################################################################
##
##  condor_config
##
##  This is the global configuration file for condor.  Any settings
##  made here may potentially be overridden in the local configuration
##  file.  KEEP THAT IN MIND!  To double-check that a variable is
##  getting set from the configuration file that you expect, use
##  condor_config_val -v <variable name>
##
##  The file is divided into four main parts:
##  Part 1:  Settings you likely want to customize
##  Part 2:  Settings you may want to customize
##  Part 3:  Settings that control the policy of when condor will
##           start and stop jobs on your machines
##  Part 4:  Settings you should probably leave alone (unless you
##           know what you're doing)
##
##  Please read the INSTALL file (or the Install chapter in the
##  Condor Administrator's Manual) for detailed explanations of the
##  various settings in here and possible ways to configure your
##  pool.
##
##  Unless otherwise specified, settings that are commented out show
##  the defaults that are used if you don't define a value.  Settings
##  that are defined here MUST BE DEFINED since they have no default
##  value.
##
##  Unless otherwise indicated, all settings which specify a time are
##  defined in seconds.
##
######################################################################
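## For example (an illustrative command, not one of this file's
## settings), to see where CONDOR_HOST gets its value:
##
##   condor_config_val -v CONDOR_HOST
##
## which prints the value along with the configuration file and line
## that defined it.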

######################################################################
######################################################################
##
##  Part 1:  Settings you likely want to customize:
##
######################################################################
######################################################################

##
##  Pathnames:
##
## Where are all of the Condor-related files stored for the entire
## Condor system?
CondorDir =

## Where have you installed the bin, sbin and lib condor directories?
RELEASE_DIR = $(CondorDir)/

## Where is the local condor directory for each host?
## This is where the local config file(s), logs and
## spool/execute directories are located.
LOCAL_DIR = $(CondorDir)/

## Where is the machine-specific local config file for each host?
#
# If this computer is the Condor Central Manager, load the central
# manager/negotiator configuration as well.
#

## Where are optional machine-specific local config files located?
## Config files are included in lexicographic order.
LOCAL_CONFIG_DIR = $(LOCAL_DIR)/

## Blacklist for file processing in the LOCAL_CONFIG_DIR
## LOCAL_CONFIG_DIR_EXCLUDE_REGEXP = ^((\..*)|(.*~)|(#.*)|(.*\.rpmsave)|(.*\.rpmnew))$
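## For example (hypothetical file names, for illustration only): with
## files 00-network.config, 50-policy.config, and 99-overrides.config
## in $(LOCAL_CONFIG_DIR), Condor reads 00-network.config first and
## 99-overrides.config last, so settings in later files override
## those in earlier ones.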

## If the local config file is not present, is it an error?
## WARNING: This is a potential security issue.
## If not specified, the default is True
REQUIRE_LOCAL_CONFIG_FILE = TRUE

##
##  Mail parameters:
##
## When something goes wrong with condor at your site, who should get
## the email?
CONDOR_ADMIN = kollerg14@mail.wlu.edu

## Full path to a mail delivery program that understands that "-s"
## means you want to specify a subject:
MAIL = /bin/mail

##
##  Network domain parameters:
##
## Internet domain of machines sharing a common UID space.  If your
## machines don't share a common UID space, set it to
## UID_DOMAIN = $(FULL_HOSTNAME)
## to specify that each machine has its own UID space.
UID_DOMAIN =

## Internet domain of machines sharing a common file system.
## If your machines don't use a network file system, set it to
## FILESYSTEM_DOMAIN = $(FULL_HOSTNAME)
## to specify that each machine has its own file system.
FILESYSTEM_DOMAIN = cs.wlu.edu

## What machine is your central manager?
CONDOR_HOST = john.cs.wlu.edu
# "CondorHost_RealName": What name does the central
# manager call itself?
CondorHost_RealName = $(CONDOR_HOST)

## This macro is used to specify a short description of your pool.
## It should be about 20 characters long.  For example, the name of
## the UW-Madison Computer Science Condor Pool is ``UW-Madison CS''
COLLECTOR_NAME = Orion

######################################################################
######################################################################
##
##  Part 2:  Settings you may want to customize:
##  (it is generally safe to leave these untouched)
##
######################################################################
######################################################################

##
## The user/group ID <uid>.<gid> of the "Condor" user.
## (this can also be specified in the environment)
## Note: the CONDOR_IDS setting is ignored on Win32 platforms
# NOTE: CONDOR_IDS is defined in the machine-specific configuration files
#

##
##  Flocking: Submitting jobs to more than one pool
##
## Flocking allows you to run your jobs in other pools, or lets
## others run jobs in your pool.
##
## To let others flock to you, define FLOCK_FROM.
##
## To flock to others, define FLOCK_TO.

## Join the W&L CS Pool (Orion)
# Designate which machines are members of this pool.
PoolMembers = john.cs.wlu.edu, carl.cs.wlu.edu, fred.cs.wlu.edu
CondorUsers = condor@john.cs.wlu.edu/john.cs.wlu.edu, \
              condor@carl.cs.wlu.edu/carl.cs.wlu.edu, \
              condor@fred.cs.wlu.edu/fred.cs.wlu.edu
AdminUsers = koller@$(CONDOR_HOST)/$(CONDOR_HOST)
RootUsers = root@john.cs.wlu.edu/john.cs.wlu.edu, \
            root@carl.cs.wlu.edu/carl.cs.wlu.edu, \
            root@fred.cs.wlu.edu/fred.cs.wlu.edu

## FLOCK_FROM defines the machines where you would like to grant
## people access to your pool via flocking.  (i.e. you are granting
## access to these machines to join your pool).
FLOCK_FROM = $(PoolMembers)
## An example of this is:
#FLOCK_FROM = somehost.friendly.domain, anotherhost.friendly.domain

## FLOCK_TO defines the central managers of the pools that you want
## to flock to.  (i.e. you are specifying the machines that you
## want your jobs to be negotiated at -- thereby specifying the
## pools they will run in.)
FLOCK_TO = $(CONDOR_HOST)
## An example of this is:
#FLOCK_TO = central_manager.friendly.domain, condor.cs.wisc.edu

## FLOCK_COLLECTOR_HOSTS should almost always be the same as
## FLOCK_NEGOTIATOR_HOSTS (as shown below).  The only reason it would
## be different is if the collector and negotiator in the pool that
## you are flocking to are running on different machines (not
## recommended).  The collectors must be specified in the same
## corresponding order as the FLOCK_NEGOTIATOR_HOSTS list.
FLOCK_NEGOTIATOR_HOSTS = $(FLOCK_TO)
FLOCK_COLLECTOR_HOSTS = $(FLOCK_TO)
## An example of having the negotiator and the collector on different
## machines is:
#FLOCK_NEGOTIATOR_HOSTS = negotiator.friendly.domain
#FLOCK_COLLECTOR_HOSTS = collector.friendly.domain

##
##  Host/IP access levels
##
## Please see the administrator's manual for details on these
## settings, what they're for, and how to use them.

## What machines have administrative rights for your pool?  This
## defaults to your central manager.  You should set it to the
## machine(s) where whoever is the condor administrator(s) works
## (assuming you trust all the users who log into that/those
## machine(s), since this is machine-wide access you're granting).
ALLOW_ADMINISTRATOR = $(AdminUsers)

## If there are no machines that should have administrative access
## to your pool (for example, there's no machine where only trusted
## users have accounts), you can uncomment this setting.
## Unfortunately, this will mean that administering your pool will
## be more difficult.
#ALLOW_ADMINISTRATOR =

## What machines should have "owner" access to your machines, meaning
## they can issue commands that a machine owner should be able to
## issue to their own machine (like condor_vacate).  This defaults to
## machines with administrator access, and the local machine.  This
## is probably what you want.
ALLOW_OWNER = $(FULL_HOSTNAME), $(ALLOW_ADMINISTRATOR)

## Read access.  Machines listed as allow (and not listed as deny)
## can view the status of your pool, but cannot join your pool
## or run jobs.
## NOTE: By default, without these entries customized, you
## are granting read access to the whole world.  You may want to
## restrict that to hosts in your domain.  If possible, please also
## grant read access to "*.cs.wisc.edu", so the Condor developers
## will be able to view the status of your pool and more easily help
## you install, configure or debug your Condor installation.
## It is important to have this defined.
ALLOW_READ = $(AdminUsers),
#ALLOW_READ = *.your.domain, *.cs.wisc.edu
#DENY_READ = *.bad.subnet, bad-machine.your.domain

## Write access.  Machines listed here can join your pool, submit
## jobs, etc.  Note: Any machine which has WRITE access must
## also be granted READ access.  Granting WRITE access below does
## not also automatically grant READ access; you must change
## ALLOW_READ above as well.
##
## You must set this to something else before Condor will run.
## The simplest option is:
##   ALLOW_WRITE = *
## but note that this will allow anyone to submit jobs or add
## machines to your pool and is a serious security risk.

ALLOW_WRITE = $(AdminUsers),
#ALLOW_WRITE = *.your.domain
#DENY_WRITE = bad-machine.your.domain

## Are you upgrading to a new version of Condor and confused about
## why the above ALLOW_WRITE setting is causing Condor to refuse to
## start up?  If you are upgrading from a configuration that uses
## HOSTALLOW/HOSTDENY instead of ALLOW/DENY, we recommend that you
## convert all uses of the former to the latter.  The syntax of the
## authorization settings is identical.  They both support
## unauthenticated IP-based authorization as well as authenticated
## user-based authorization.  To avoid confusion, the use of
## HOSTALLOW/HOSTDENY is discouraged.  Support for it may be removed
## in the future.

## Negotiator access.  Machines listed here are trusted central
## managers.  You should normally not have to change this.
ALLOW_NEGOTIATOR = condor@$(CONDOR_HOST)/$(CONDOR_HOST)
## Now, with flocking we need to let the SCHEDD trust the other
## negotiators we are flocking with as well.  You should normally
## not have to change this either.
ALLOW_NEGOTIATOR_SCHEDD = $(CONDOR_HOST), $(FLOCK_NEGOTIATOR_HOSTS)

## Config access.  Machines listed here can use the condor_config_val
## tool to modify all daemon configurations.  This level of
## access should only be granted with extreme caution.  By default,
## config access is denied from all hosts.
ALLOW_CONFIG = $(AdminUsers)

## Daemon Access added by Garrett Koller (not in default config file)
## Daemon access.  Machines listed here are allowed to communicate
## with the daemons of this Condor pool; requests from these
## machines will be acknowledged and appropriate responses will be sent.
#
ALLOW_DAEMON = $(PoolMembers)

## Client Access added by Garrett Koller (not in default config file)
## Client access.  "Which machines, acting as clients, do
## I allow or deny."
ALLOW_CLIENT = $(PoolMembers)

## Flocking Configs.  These are the real things that Condor looks at,
## but we set them from the FLOCK_FROM/TO macros above.  It is safe
## to leave these unchanged.
ALLOW_WRITE_COLLECTOR = $(ALLOW_WRITE), $(FLOCK_FROM)
ALLOW_WRITE_STARTD    = $(ALLOW_WRITE), $(FLOCK_FROM)
ALLOW_READ_COLLECTOR  = $(ALLOW_READ), $(FLOCK_FROM)
ALLOW_READ_STARTD     = $(ALLOW_READ), $(FLOCK_FROM)

# Clear out any old-style HOSTALLOW settings:
HOSTALLOW_READ =
HOSTALLOW_WRITE =
HOSTALLOW_DAEMON =
HOSTALLOW_NEGOTIATOR =
HOSTALLOW_ADMINISTRATOR =
HOSTALLOW_OWNER =

##
##  Authentication
##
## Authentication added by Garrett Koller (not in default config file)
## These parameters define how Condor will know whether or not a
## machine that attempts to communicate with it is who it says it is.
## Refer to Section 3.6.3 ("Authentication") of the Condor
## documentation for more information.

# A client process (run by a normal user on a machine that may or
# may not have Condor installed, such as condor_submit) or another
# Condor daemon (either running locally or remotely) will offer these
# authentication methods when trying to communicate with the Condor
# system daemons.
SEC_CLIENT_AUTHENTICATION_METHODS = FS, PASSWORD

# A daemon will accept these forms of authentication when
# communicating with other daemons and clients.
SEC_DEFAULT_AUTHENTICATION_METHODS = FS, PASSWORD

# Password authentication
SEC_PASSWORD_FILE = /
SEC_DAEMON_AUTHENTICATION = REQUIRED
SEC_DAEMON_INTEGRITY = REQUIRED
SEC_DAEMON_AUTHENTICATION_METHODS = PASSWORD
SEC_NEGOTIATOR_AUTHENTICATION = REQUIRED
SEC_NEGOTIATOR_INTEGRITY = REQUIRED
SEC_NEGOTIATOR_AUTHENTICATION_METHODS = PASSWORD
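
## To use PASSWORD authentication, every machine in the pool must
## hold the same pool password in the file named by
## SEC_PASSWORD_FILE.  A sketch of storing it (run as root on each
## machine; the path shown is illustrative, not this pool's actual
## path):
##
##   condor_store_cred -f /path/to/pool_password
##
## condor_store_cred prompts for the password and writes it to the
## given file, which should be readable only by root.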


##
##  Security parameters for setting configuration values remotely:
##
## These parameters define the list of attributes that can be set
## remotely with condor_config_val for the security access levels
## defined above (for example, WRITE, ADMINISTRATOR, CONFIG, etc).
## Please see the administrator's manual for details on these
## settings, what they're for, and how to use them.  There are no
## default values for any of these settings.  If they are not
## defined, no attributes can be set with condor_config_val.

## Do you want to allow condor_config_val -rset to work at all?
## This feature is disabled by default, so to enable, you must
## uncomment the following setting and change the value to "True".
## Note: changing this requires a restart, not just a reconfig.
ENABLE_RUNTIME_CONFIG = False

## Do you want to allow condor_config_val -set to work at all?
## This feature is disabled by default, so to enable, you must
## uncomment the following setting and change the value to "True".
## Note: changing this requires a restart, not just a reconfig.
ENABLE_PERSISTENT_CONFIG = False

## Directory where daemons should write persistent config files (used
## to support condor_config_val -set).  This directory should *only*
## be writable by root (or the user the Condor daemons are running as
## if non-root).  There is no default.
## Note: changing this requires a restart, not just a reconfig.
#PERSISTENT_CONFIG_DIR = /full/path/to/directory

## Attributes that can be set by hosts with "CONFIG" privilege (as
## defined with ALLOW_CONFIG and DENY_CONFIG above).
## The commented-out value here was the default behavior of Condor
## prior to version 6.3.3.  If you don't need this behavior, you
## should leave this commented out.
#SETTABLE_ATTRS_CONFIG = *

## Attributes that can be set by hosts with "ADMINISTRATOR"
## permission (as defined above)
#SETTABLE_ATTRS_ADMINISTRATOR = *_DEBUG, MAX_*_LOG

## Attributes that can be set by hosts with "OWNER" permission (as
## defined above) NOTE: any Condor job running on a given host will
## have OWNER permission on that host by default.  If you grant this
## kind of access, Condor jobs will be able to modify any attributes
## you list below on the machine where they are running.  This has
## obvious security implications, so only grant this kind of
## permission for custom attributes that you define for your own use
## at your pool (custom attributes about your machines that are
## published with the STARTD_ATTRS setting, for example).
#SETTABLE_ATTRS_OWNER = your_custom_attribute, another_custom_attr

## You can also define daemon-specific versions of each of these
## settings.  For example, to define settings that can only be
## changed in the condor_startd's configuration by hosts with OWNER
## permission, you would use:
#STARTD_SETTABLE_ATTRS_OWNER = your_custom_attribute_name


##
##  Network filesystem parameters:
##
## Do you want to use NFS for file access instead of remote system
## calls?
USE_NFS = True

## Do you want to use AFS for file access instead of remote system
## calls?
#USE_AFS = False

##
##  Checkpoint server:
##
## Do you want to use a checkpoint server if one is available?  If a
## checkpoint server isn't available or USE_CKPT_SERVER is set to
## False, checkpoints will be written to the local SPOOL directory on
## the submission machine.
USE_CKPT_SERVER = False

## What's the hostname of this machine's nearest checkpoint server?
#CKPT_SERVER_HOST = checkpoint-server-hostname.your.domain

## Do you want the starter on the execute machine to choose the
## checkpoint server?  If False, the CKPT_SERVER_HOST set on
## the submit machine is used.  Otherwise, the CKPT_SERVER_HOST set
## on the execute machine is used.  The default is true.
#STARTER_CHOOSES_CKPT_SERVER = True

##
##  Miscellaneous:
##
## Try to save this much swap space by not starting new shadows.
## Specified in megabytes.
#RESERVED_SWAP = 0

## What's the maximum number of jobs you want a single submit machine
## to spawn shadows for?  The default is a function of $(DETECTED_MEMORY)
## and a guess at the number of ephemeral ports available.

## Example 1:
#MAX_JOBS_RUNNING = 10000

## Example 2:
## This is more complicated, but scales the limit with the machine's
## detected resources.
## First define some expressions to use in our calculation.
## Assume we can use up to 80% of memory and estimate shadow private data
## size of 800k.
MAX_SHADOWS_MEM = ceiling($(DETECTED_MEMORY)*0.8*1024/800)
## Assume we can use ~21,000 ephemeral ports (avg ~2.1 per shadow).
## Under Linux, the range is set in /proc/sys/net/ipv4/ip_local_port_range.
MAX_SHADOWS_PORTS = 10000
## Under windows, things are much less scalable, currently.
## Note that this can probably be safely increased a bit under 64-bit windows.
MAX_SHADOWS_OPSYS = ifThenElse(regexp("WIN.*","$(OPSYS)"),200,100000)
## Now build up the expression for MAX_JOBS_RUNNING.  This is
## complicated due to lack of a min() function.
MAX_JOBS_RUNNING = $(MAX_SHADOWS_MEM)
MAX_JOBS_RUNNING = \
  ifThenElse( $(MAX_SHADOWS_PORTS) < $(MAX_JOBS_RUNNING), \
              $(MAX_SHADOWS_PORTS), \
              $(MAX_JOBS_RUNNING) )
MAX_JOBS_RUNNING = \
  ifThenElse( $(MAX_SHADOWS_OPSYS) < $(MAX_JOBS_RUNNING), \
              $(MAX_SHADOWS_OPSYS), \
              $(MAX_JOBS_RUNNING) )
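
## In effect (a restatement for clarity, not an additional setting),
## the redefinitions above compute
##   MAX_JOBS_RUNNING = min(MAX_SHADOWS_MEM, MAX_SHADOWS_PORTS, MAX_SHADOWS_OPSYS)
## using nested ifThenElse() calls, since the configuration language
## has no min() function.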


## Maximum number of simultaneous downloads of output files from
## execute machines to the submit machine (limit applied per schedd).
## The value 0 means unlimited.
#MAX_CONCURRENT_DOWNLOADS = 10

## Maximum number of simultaneous uploads of input files from the
## submit machine to execute machines (limit applied per schedd).
## The value 0 means unlimited.
#MAX_CONCURRENT_UPLOADS = 10

## Condor needs to create a few lock files to synchronize access to
## various log files.  Because of problems we've had with network
## filesystems and file locking over the years, we HIGHLY recommend
## that you put these lock files on a local partition on each
## machine.  If you don't have your LOCAL_DIR on a local partition,
## be sure to change this entry.  Whatever user (or group) condor is
## running as needs to have write access to this directory.  If
## you're not running as root, this is whatever user you started up
## the condor_master as.  If you are running as root, and there's a
## condor account, it's probably condor.  Otherwise, it's whatever
## you've set in the CONDOR_IDS environment variable.  See the
## Administrator's manual for details on this.
LOCK = /

## If you don't use a fully qualified name in your /etc/hosts file
## (or NIS, etc.) for either your official hostname or as an alias,
## Condor wouldn't normally be able to use fully qualified names in
## places that it'd like to.  You can set this parameter to the
## domain you'd like appended to your hostname, if changing your host
## information isn't a good option.  This parameter must be set in
## the global config file (not the LOCAL_CONFIG_FILE from above).
#DEFAULT_DOMAIN_NAME = your.domain.name

## If you don't have DNS set up, Condor will normally fail in many
## places because it can't resolve hostnames to IP addresses and
## vice-versa.  If you enable this option, Condor will use
## pseudo-hostnames constructed from a machine's IP address and the
## DEFAULT_DOMAIN_NAME.  Both NO_DNS and DEFAULT_DOMAIN_NAME must be
## set in your top-level config file for this mode of operation to
## work properly.
NO_DNS = False

## Condor can be told whether or not you want the Condor daemons to
## create a core file if something really bad happens.  This just
## sets the resource limit for the size of a core file.  By default,
## we don't do anything, and leave in place whatever limit was in
## effect when you started the Condor daemons.  If this parameter is
## set and "True", the limit is increased to the maximum.  If
## it's set to "False", the limit is set at 0 (which means that no
## core files are even created).  Core files greatly help the Condor
## developers debug any problems you might be having.
#CREATE_CORE_FILES = True

## When Condor daemons detect a fatal internal exception, they
## normally log an error message and exit.  If you have turned on
## CREATE_CORE_FILES, in some cases you may also want to turn on
## ABORT_ON_EXCEPTION so that core files are generated when an
## exception occurs.  Set the following to True if that is what you
## want.
#ABORT_ON_EXCEPTION = False

## Condor Glidein downloads binaries from a remote server for the
## machines into which you're gliding.  This saves you from manually
## downloading and installing binaries for every architecture you
## might want to glidein to.  The default server is one maintained at
## The University of Wisconsin.  If you don't want to use the UW
## server, you can set up your own and change the following to
## point to it, instead.
GLIDEIN_SERVER_URLS = \
  http://www.cs.wisc.edu/condor/glidein/binaries

## List the sites you want to GlideIn to on the GLIDEIN_SITES.  For example,
## if you'd like to GlideIn to some Alliance GiB resources,
## uncomment the line below.
## Make sure that $(GLIDEIN_SITES) is included in ALLOW_READ and
## ALLOW_WRITE, or else your GlideIns won't be able to join your pool.
## This is _NOT_ done for you by default, because it is an even better
## idea to use a strong security method (such as GSI) rather than
## host-based security for authorizing glideins.
#
#

## If your site needs to use UID_DOMAIN settings (defined above) that
## are not real Internet domains that match the hostnames, you can
## tell Condor to trust whatever UID_DOMAIN a submit machine gives to
## the execute machine and just make sure the two strings match.  The
## default for this setting is False, since it is more secure this
## way.
TRUST_UID_DOMAIN = True

## If you would like to be informed in near real-time via condor_q when
## a vanilla/standard/java job is in a suspension state, set this to
## TRUE.  However, this real-time update of the condor_schedd by the shadows
## could cause performance issues if there are thousands of concurrently
## running vanilla/standard/java jobs under a single schedd and if those jobs
## are allowed to suspend and resume.
#
- | + | ||
- | ## A standard universe job can perform arbitrary shell calls via the | + | |
- | ## libc ' | + | |
- | ## which performs the actual system() invocation in the initialdir of the | + | |
- | ## running program and as the user who submitted the job. However, since the | + | |
- | ## user job can request ARBITRARY shell commands to be run by the shadow, this | + | |
- | ## is a generally unsafe practice. This should only be made available if it is | + | |
- | ## actually needed. If this attribute is not defined, then it is the same as | + | |
- | ## it being defined to False. Set it to True to allow the shadow to execute | + | |
- | ## arbitrary shell code from the user job. | + | |
- | SHADOW_ALLOW_UNSAFE_REMOTE_EXEC = False | + | |
- | + | ||
- | ## KEEP_OUTPUT_SANDBOX is an optional feature to tell Condor-G to not | + | |
- | ## remove the job spool when the job leaves the queue. | + | |
- | ## set to TRUE. Since you will be operating Condor-G in this manner, | + | |
- | ## you may want to put leave_in_queue = false in your job submit | + | |
- | ## description files, to tell Condor-G to simply remove the job from | + | |
- | ## the queue immediately when the job completes (since the output files | + | |
- | ## will stick around no matter what). | + | |
- | # | + | |
- | + | ||
- | ## This setting tells the negotiator to ignore user priorities. | + | |
- | ## avoids problems where jobs from different users won't run when using | + | |
- | ## condor_advertise instead of a full-blown startd (some of the user | + | |
- | ## priority system in Condor relies on information from the startd -- | + | |
- | ## we will remove this reliance when we support the user priority | + | |
- | ## system for grid sites in the negotiator; for now, this setting will | + | |
- | ## just disable it). | + | |
- | # | + | |
- | + | ||
- | ## This is a list of libraries containing ClassAd plug-in functions. | + | |
- | # | + | |
- | + | ||
- | ## This setting tells Condor whether to delegate or copy GSI X509 | + | |
- | ## credentials when sending them over the wire between daemons. | + | |
- | ## Delegation can take up to a second, which is very slow when | + | |
- | ## submitting a large number of jobs. Copying exposes the credential | + | |
- | ## to third parties if Condor isn't set to encrypt communications. | + | |
- | ## By default, Condor will delegate rather than copy. | + | |
- | # | + | |
- | + | ||
- | ## This setting controls whether Condor delegates a full or limited | + | |
- | ## X509 credential for jobs. Currently, this only affects grid-type | + | |
- | ## gt2 grid universe jobs. The default is False. | + | |
- | # | + | |
- | + | ||
- | ## This setting controls the default behaviour for the spooling of files | + | |
- | ## into, or out of, the Condor system by such tools as condor_submit | + | |
- | ## and condor_transfer_data. Here is the list of valid settings for this | + | |
- | ## parameter and what they mean: | + | |
- | ## | + | |
- | ## | + | |
- | ## Ask the condor_schedd to solely store/ | + | |
- | ## | + | |
- | ## | + | |
- | ## Ask the condor_schedd for a location of a condor_transferd, | + | |
- | ## store/ | + | |
- | ## | + | |
- | ## The allowed values are case insensitive. | + | |
- | ## The default of this parameter if not specified is: stm_use_schedd_only | + | |
- | SANDBOX_TRANSFER_METHOD = stm_use_schedd_only | + | |
- | + | ||
- | ## This setting specifies an IP address that depends on the setting of | + | |
- | ## BIND_ALL_INTERFACES. If BIND_ALL_INTERFACES | + | |
- | ## this variable controls what IP address will be advertised as the public | + | |
- | ## address of the daemon. If BIND_ALL_INTERFACES is False, then this variable | + | |
- | ## specifies which IP address to bind network sockets to. If | + | |
- | ## BIND_ALL_INTERFACES is False and NETWORK_INTERFACE is not defined, Condor | + | |
- | ## chooses a network interface automatically. It tries to choose a public | + | |
- | ## interface if one is available. If it cannot decide which of two interfaces | + | |
- | ## to choose from, it will pick the first one. | + | |
- | # | + | |
- | + | ||
- | ## | + | |
- | ## Settings that control the daemon' | + | |
- | ## | + | |
- | + | ||
- | ## | + | |
- | ## The flags given in ALL_DEBUG are shared between all daemons. | + | |
- | ## | + | |
- | + | ||
- | ALL_DEBUG | + | |
- | + | ||
- | MAX_COLLECTOR_LOG = 1000000 | + | |
- | COLLECTOR_DEBUG = | + | |
- | + | ||
- | MAX_KBDD_LOG = 1000000 | + | |
- | KBDD_DEBUG = | + | |
- | + | ||
- | MAX_NEGOTIATOR_LOG = 1000000 | + | |
- | NEGOTIATOR_DEBUG = D_MATCH | + | |
- | MAX_NEGOTIATOR_MATCH_LOG = 1000000 | + | |
- | + | ||
- | MAX_SCHEDD_LOG = 1000000 | + | |
- | SCHEDD_DEBUG = D_PID | + | |
- | + | ||
- | MAX_SHADOW_LOG = 1000000 | + | |
- | SHADOW_DEBUG = | + | |
- | + | ||
- | MAX_STARTD_LOG = 1000000 | + | |
- | STARTD_DEBUG = | + | |
- | + | ||
- | MAX_STARTER_LOG = 1000000 | + | |
- | + | ||
- | MAX_MASTER_LOG = 1000000 | + | |
- | MASTER_DEBUG = | + | |
- | ## When the master starts up, should it truncate it's log file? | + | |
- | TRUNC_MASTER_LOG_ON_OPEN | + | |
- | + | ||
- | MAX_JOB_ROUTER_LOG | + | |
- | JOB_ROUTER_DEBUG | + | |
- | + | ||
- | MAX_ROOSTER_LOG | + | |
- | ROOSTER_DEBUG | + | |
- | + | ||
- | MAX_SHARED_PORT_LOG | + | |
- | SHARED_PORT_DEBUG | + | |
- | + | ||
- | MAX_HDFS_LOG | + | |
- | HDFS_DEBUG | + | |
- | + | ||
- | # High Availability Logs | + | |
- | MAX_HAD_LOG = 1000000 | + | |
- | HAD_DEBUG = | + | |
- | MAX_REPLICATION_LOG = 1000000 | + | |
- | REPLICATION_DEBUG = | + | |
- | MAX_TRANSFERER_LOG = 1000000 | + | |
- | TRANSFERER_DEBUG = | + | |
- | + | ||
- | + | ||
- | ## The daemons touch their log file periodically, | + | |
- | ## nothing to write. When a daemon starts up, it prints the last time | + | |
- | ## the log file was modified. This lets you estimate when a previous | + | |
- | ## instance of a daemon stopped running. This paramete controls how often | + | |
- | ## the daemons touch the file (in seconds). | + | |
- | TOUCH_LOG_INTERVAL = 300 | + | |
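Because each daemon touches its log at least every TOUCH_LOG_INTERVAL seconds, the log's modification time doubles as a rough liveness check. A minimal Python sketch of that heuristic (the function names and the slack value are illustrative, not Condor settings):

```python
import os
import time

TOUCH_LOG_INTERVAL = 300  # matches the setting above, in seconds

def seconds_since_touched(log_path):
    """How long ago the daemon last wrote to (or touched) its log."""
    return time.time() - os.path.getmtime(log_path)

def daemon_looks_stopped(log_path, slack=60):
    """If the log is noticeably staler than TOUCH_LOG_INTERVAL,
    the daemon that owned it has probably stopped running."""
    return seconds_since_touched(log_path) > TOUCH_LOG_INTERVAL + slack
```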
- | + | ||
- | ###################################################################### | + | |
- | ###################################################################### | + | |
- | ## | + | |
- | ## ###### | + | |
- | ## # # | + | |
- | ## # # | + | |
- | ## ###### | + | |
- | ## # ###### | + | |
- | ## # # # # # | + | |
- | ## # # # # # # | + | |
- | ## | + | |
- | ## Part 3: Settings that control the policy for running, stopping, and | + |
- | ## periodically checkpointing condor jobs: | + | |
- | ###################################################################### | + | |
- | ###################################################################### | + | |
- | + | ||
- | ## This section contains macros that are here to help write legible | + |
- | ## expressions: | + | |
- | MINUTE = 60 | + | |
- | HOUR = (60 * $(MINUTE)) | + | |
- | StateTimer = (time() - EnteredCurrentState) | + | |
- | ActivityTimer = (time() - EnteredCurrentActivity) | + | |
- | ActivationTimer = ifThenElse(JobStart =!= UNDEFINED, (time() - JobStart), 0) | + | |
- | LastCkpt = (time() - LastPeriodicCheckpoint) | + | |
- | + | ||
- | ## The JobUniverse attribute is just an int. These macros can be | + | |
- | ## used to specify the universe in a human-readable way: | + | |
- | STANDARD = 1 | + | |
- | VANILLA = 5 | + | |
- | MPI = 8 | + | |
- | VM = 13 | + | |
- | IsMPI = (TARGET.JobUniverse == $(MPI)) | + | |
- | IsVanilla | + | |
- | IsStandard | + | |
- | IsVM = (TARGET.JobUniverse == $(VM)) | + | |
- | + | ||
- | NonCondorLoadAvg = (LoadAvg - CondorLoadAvg) | + | |
- | BackgroundLoad = 0.3 | + | |
- | HighLoad = 0.5 | + | |
- | StartIdleTime = 15 * $(MINUTE) | + | |
- | ContinueIdleTime = | + | |
- | MaxSuspendTime = 10 * $(MINUTE) | + | |
- | MaxVacateTime = 10 * $(MINUTE) | + | |
- | + | ||
- | KeyboardBusy = (KeyboardIdle < $(MINUTE)) | + | |
- | ConsoleBusy = (ConsoleIdle | + | |
- | CPUIdle = ($(NonCondorLoadAvg) <= $(BackgroundLoad)) | + | |
- | CPUBusy = ($(NonCondorLoadAvg) >= $(HighLoad)) | + | |
- | KeyboardNotBusy = ($(KeyboardBusy) == False) | + | |
- | + | ||
- | BigJob = (TARGET.ImageSize >= (50 * 1024)) | + | |
- | MediumJob = (TARGET.ImageSize >= (15 * 1024) && TARGET.ImageSize < (50 * 1024)) | + | |
- | SmallJob = (TARGET.ImageSize < (15 * 1024)) | + | |
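ImageSize is reported in kilobytes, so the classes above put the cutoffs at roughly 15 MB and 50 MB. A hedged Python mirror of that classification (the function name is ours, not a Condor attribute):

```python
def job_size_class(image_size_kb):
    """Classify a job the way BigJob/MediumJob/SmallJob do above.
    ImageSize is in kilobytes, so 15 * 1024 KB is about 15 MB."""
    if image_size_kb >= 50 * 1024:
        return "big"
    if image_size_kb >= 15 * 1024:
        return "medium"
    return "small"
```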
- | + | ||
- | JustCPU = ($(CPUBusy) && ($(KeyboardBusy) == False)) | + | |
- | MachineBusy = ($(CPUBusy) | + | |
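The helper expressions combine into the busy/idle predicates used by the policy below (the stock file defines MachineBusy from CPU and keyboard activity). A simplified Python rendering with the threshold values from above; the console-activity term is omitted here:

```python
MINUTE = 60
HIGH_LOAD = 0.5  # matches HighLoad above

def machine_busy(keyboard_idle, load_avg, condor_load_avg):
    """Sketch of the MachineBusy idea: the machine counts as busy when
    the non-Condor load is high or the keyboard was touched recently."""
    non_condor_load = load_avg - condor_load_avg
    cpu_busy = non_condor_load >= HIGH_LOAD
    keyboard_busy = keyboard_idle < MINUTE
    return cpu_busy or keyboard_busy
```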
- | + | ||
- | ## If job submitter user is listed here, give the job a high priority. | + | |
- | GreedyUserRank = (Owner == " | + | |
- | + (Owner == " | + | |
- | + (Owner == " | + | |
- | + (Owner == " | + | |
- | + (Owner == " | + | |
- | + (Owner == " | + | |
- | + | ||
- | IsGreedyUser | + | |
- | + | ||
- | ## The RANK expression controls which jobs this machine prefers to | + | |
- | ## run over others. | + | |
- | ## RANK = TARGET.ImageSize | + | |
- | ## RANK = (Owner == " | + | |
- | ## + ((Owner == " | + | |
- | ## By default, RANK is always 0, meaning that all jobs have an equal | + | |
- | ## ranking. | + | |
- | RANK = $(GreedyUserRank) | + | |
- | + | ||
- | + | ||
- | ##################################################################### | + | |
- | ## This is where you choose the configuration that you would like to | + |
- | ## use. It has no defaults so it must be defined. | + | |
- | ## file off with the UWCS_* policy. | + | |
- | ###################################################################### | + | |
- | + | ||
- | ## Also here is what is referred to as the TESTINGMODE_*, | + | |
- | ## a quick hardwired way to test Condor with a simple no-preemption policy. | + | |
- | ## Replace UWCS_* with TESTINGMODE_* if you wish to do testing mode. | + | |
- | ## For example: | + | |
- | ## WANT_SUSPEND = $(UWCS_WANT_SUSPEND) | + | |
- | ## becomes | + | |
- | ## WANT_SUSPEND = $(TESTINGMODE_WANT_SUSPEND) | + | |
- | + | ||
- | # When should we only consider SUSPEND instead of PREEMPT? | + | |
- | WANT_SUSPEND = $(UWCS_WANT_SUSPEND) | + | |
- | + | ||
- | # When should we preempt gracefully instead of hard-killing? | + | |
- | WANT_VACATE = $(UWCS_WANT_VACATE) | + | |
- | + | ||
- | ## When is this machine willing to start a job? | + | |
- | START = ($(UWCS_START) || $(IsGreedyUser)) | + | |
- | + | ||
- | ## When should a local universe job be allowed to start? | + | |
- | # | + | |
- | + | ||
- | ## When should a scheduler universe job be allowed to start? | + | |
- | # | + | |
- | + | ||
- | ## When to suspend a job? | + | |
- | SUSPEND = ($(UWCS_SUSPEND) && ($(IsGreedyUser) == False)) | + | |
- | + | ||
- | ## When to resume a suspended job? | + | |
- | CONTINUE = ($(UWCS_CONTINUE) || $(IsGreedyUser)) | + | |
- | + | ||
- | ## When to nicely stop a job? | + | |
- | ## (as opposed to killing it instantaneously) | + | |
- | PREEMPT = ($(UWCS_PREEMPT) && ($(IsGreedyUser) == False)) | + | |
- | + | ||
- | ## When to instantaneously kill a preempting job | + | |
- | ## (e.g. if a job is in the pre-empting stage for too long) | + | |
- | KILL = ($(UWCS_KILL) && ($(IsGreedyUser) == False)) | + | |
- | + | ||
- | PERIODIC_CHECKPOINT = $(UWCS_PERIODIC_CHECKPOINT) | + | |
- | PREEMPTION_REQUIREMENTS = $(UWCS_PREEMPTION_REQUIREMENTS) | + | |
- | PREEMPTION_RANK = $(UWCS_PREEMPTION_RANK) | + | |
- | NEGOTIATOR_PRE_JOB_RANK = $(UWCS_NEGOTIATOR_PRE_JOB_RANK) | + | |
- | NEGOTIATOR_POST_JOB_RANK = $(UWCS_NEGOTIATOR_POST_JOB_RANK) | + | |
- | MaxJobRetirementTime | + | |
- | CLAIM_WORKLIFE | + | |
- | + | ||
- | ##################################################################### | + | |
- | ## This is the UWisc - CS Department Configuration. | + | |
- | ##################################################################### | + | |
- | + | ||
- | # When should we only consider SUSPEND instead of PREEMPT? | + | |
- | # Only when SUSPEND is True and one of the following is also true: | + | |
- | # - the job is small | + | |
- | # - the keyboard is idle | + | |
- | # - it is a vanilla universe job | + | |
- | UWCS_WANT_SUSPEND | + | |
- | ( $(SUSPEND) ) | + | |
- | + | ||
- | # When should we preempt gracefully instead of hard-killing? | + | |
- | UWCS_WANT_VACATE | + | |
- | + | ||
- | # Only start jobs if: | + | |
- | # 1) the keyboard has been idle long enough, AND | + | |
- | # 2) the load average is low enough OR the machine is currently | + | |
- | # running a Condor job | + | |
- | # (NOTE: Condor will only run 1 job at a time on a given resource. | + | |
- | # The reasons Condor might consider running a different job while | + | |
- | # already running one are machine Rank (defined above), and user | + | |
- | # priorities.) | + | |
- | UWCS_START = ( (KeyboardIdle > $(StartIdleTime)) \ | + | |
- | && ( $(CPUIdle) || \ | + | |
- | | + | |
- | + | ||
- | # Suspend jobs if: | + | |
- | # 1) the keyboard has been touched, OR | + | |
- | # 2a) The cpu has been busy for more than 2 minutes, AND | + | |
- | # 2b) the job has been running for more than 90 seconds | + | |
- | UWCS_SUSPEND = ( $(KeyboardBusy) || \ | + | |
- | ( (CpuBusyTime > 2 * $(MINUTE)) \ | + | |
- | && | + | |
- | + | ||
- | # Continue jobs if: | + | |
- | # 1) the cpu is idle, AND | + | |
- | # 2) we've been suspended more than 10 seconds, AND | + | |
- | # 3) the keyboard hasn't been touched in a while | + | |
- | UWCS_CONTINUE = ( $(CPUIdle) && ($(ActivityTimer) > 10) \ | + | |
- | && (KeyboardIdle > $(ContinueIdleTime)) ) | + | |
- | + | ||
- | # Preempt jobs if: | + | |
- | # 1) The job is suspended and has been suspended longer than we want | + | |
- | # 2) OR, we don't want to suspend this job, but the conditions to | + | |
- | # suspend jobs have been met (someone is using the machine) | + | |
- | UWCS_PREEMPT = ( ((Activity == " | + | |
- | ($(ActivityTimer) > $(MaxSuspendTime))) \ | + | |
- | || (SUSPEND && (WANT_SUSPEND == False)) ) | + | |
- | + | ||
- | # Maximum time (in seconds) to wait for a job to finish before kicking | + | |
- | # it off (due to PREEMPT, a higher priority claim, or the startd | + | |
- | # gracefully shutting down). | + | |
- | # was started, minus any suspension time. Once the retirement time runs | + | |
- | # out, the usual preemption process will take place. | + | |
- | # self-limit the retirement time to _less_ than what is given here. | + | |
- | # By default, nice user jobs and standard universe jobs set their | + | |
- | # MaxJobRetirementTime to 0, so they will not wait in retirement. | + | |
- | + | ||
- | UWCS_MaxJobRetirementTime = 0 | + | |
- | + | ||
- | ## If you completely disable preemption of claims to machines, you | + | |
- | ## should consider limiting the timespan over which new jobs will be | + | |
- | ## accepted on the same claim. | + | |
- | ## preemption for a comprehensive discussion. | + | |
- | ## configuration does not disable preemption of claims, we leave | + | |
- | ## CLAIM_WORKLIFE undefined (infinite). | + | |
- | # | + | |
- | + | ||
- | # Kill jobs if they have taken too long to vacate gracefully | + | |
- | UWCS_KILL = $(ActivityTimer) > $(MaxVacateTime) | + | |
- | + | ||
- | ## Only define vanilla versions of these if you want to make them | + | |
- | ## different from the above settings. | + | |
- | # | + | |
- | # | + | |
- | # | + | |
- | # && | + | |
- | # | + | |
- | # | + | |
- | # || (SUSPEND_VANILLA && (WANT_SUSPEND == False)) ) | + | |
- | # | + | |
- | + | ||
- | ## Checkpoint every 3 hours on average, with a +-30 minute random | + | |
- | ## factor to avoid having many jobs hit the checkpoint server at | + | |
- | ## the same time. | + | |
- | UWCS_PERIODIC_CHECKPOINT = $(LastCkpt) > (3 * $(HOUR) + \ | + | |
- | $RANDOM_INTEGER(-30, | + | |
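The expression above spreads checkpoints out with a random offset. The same jitter logic in Python (the helper name is illustrative; `rng` stands in for Condor's $RANDOM_INTEGER macro):

```python
import random

MINUTE = 60
HOUR = 60 * MINUTE

def checkpoint_due(seconds_since_last, rng=random):
    """Checkpoint roughly every 3 hours, +/- a random 30 minutes, so
    jobs do not all hit the checkpoint server at the same time."""
    threshold = 3 * HOUR + rng.randint(-30, 30) * MINUTE
    return seconds_since_last > threshold
```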
- | + | ||
- | ## You might want to checkpoint a little less often. | + | |
- | ## example of this is below. | + | |
- | ## periodic checkpoint every 6 hours. | + | |
- | ## checkpoint every 12 hours. | + | |
- | # | + | |
- | # ( (TARGET.ImageSize < 60000) && \ | + | |
- | # ($(LastCkpt) > (6 * $(HOUR) + $RANDOM_INTEGER(-30, | + | |
- | # ( $(LastCkpt) > (12 * $(HOUR) + $RANDOM_INTEGER(-30, | + | |
- | + | ||
- | ## The rank expressions used by the negotiator are configured below. | + | |
- | ## This is the order in which ranks are applied by the negotiator: | + | |
- | ## 1. NEGOTIATOR_PRE_JOB_RANK | + | |
- | ## 2. rank in job ClassAd | + | |
- | ## 3. NEGOTIATOR_POST_JOB_RANK | + | |
- | ## 4. cause of preemption (0=user priority, | + | |
- | ## 5. PREEMPTION_RANK | + | |
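That precedence can be pictured as a lexicographic sort key. The sketch below is only an illustration of the ordering: the dictionary keys and the boolean encoding of the preemption cause are made up for the example, not real negotiator internals.

```python
def match_preference_key(m):
    """Lexicographic key matching the precedence list above.  Higher
    rank values are better, so they are negated for an ascending sort;
    a match that requires no preemption beats one that does."""
    return (-m["pre_job_rank"],     # 1. NEGOTIATOR_PRE_JOB_RANK
            -m["job_rank"],         # 2. rank in the job ClassAd
            -m["post_job_rank"],    # 3. NEGOTIATOR_POST_JOB_RANK
            m["would_preempt"],     # 4. prefer matches needing no preemption
            -m["preemption_rank"])  # 5. PREEMPTION_RANK as tie-breaker

idle = {"pre_job_rank": 1, "job_rank": 0, "post_job_rank": 0,
        "would_preempt": False, "preemption_rank": 0}
claimed = {"pre_job_rank": 0, "job_rank": 5, "post_job_rank": 9,
           "would_preempt": True, "preemption_rank": 7}
best = min([claimed, idle], key=match_preference_key)
```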
- | + | ||
- | ## The NEGOTIATOR_PRE_JOB_RANK expression overrides all other ranks | + | |
- | ## that are used to pick a match from the set of possibilities. | + | |
- | ## The following expression matches jobs to unclaimed resources | + | |
- | ## whenever possible, regardless of the job-supplied rank. | + | |
- | UWCS_NEGOTIATOR_PRE_JOB_RANK = RemoteOwner =?= UNDEFINED | + | |
- | + | ||
- | ## The NEGOTIATOR_POST_JOB_RANK expression chooses between | + | |
- | ## resources that are equally preferred by the job. | + | |
- | ## The following example expression steers jobs toward | + | |
- | ## faster machines and tends to fill a cluster of multi-processors | + | |
- | ## breadth-first instead of depth-first. | + | |
- | ## machines over offline (hibernating) ones. In this example, | + | |
- | ## the expression is chosen to have no effect when preemption | + | |
- | ## would take place, allowing control to pass on to | + | |
- | ## PREEMPTION_RANK. | + | |
- | UWCS_NEGOTIATOR_POST_JOB_RANK = \ | + | |
- | | + | |
- | + | ||
- | ## The negotiator will not preempt a job running on a given machine | + | |
- | ## unless the PREEMPTION_REQUIREMENTS expression evaluates to true | + | |
- | ## and the owner of the idle job has a better priority than the owner | + | |
- | ## of the running job. This expression defaults to true. | + | |
- | UWCS_PREEMPTION_REQUIREMENTS = ( $(StateTimer) > (1 * $(HOUR)) && \ | + | |
- | RemoteUserPrio > TARGET.SubmitterUserPrio * 1.2 ) || (MY.NiceUser == True) | + | |
- | + | ||
- | ## The PREEMPTION_RANK expression is used in a case where preemption | + | |
- | ## is the only option and all other negotiation ranks are equal. | + | |
- | ## For example, if the job has no preference, it is usually preferable to | + |
- | ## preempt a job with a small ImageSize instead of a job with a large | + | |
- | ## ImageSize. | + | |
- | ## same. However, the negotiator will always prefer to match the job | + | |
- | ## with an idle machine over a preemptable machine, if all other | + | |
- | ## negotiation ranks are equal. | + | |
- | UWCS_PREEMPTION_RANK = (RemoteUserPrio * 1000000) - TARGET.ImageSize | + | |
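Both expressions can be paraphrased in Python (recall that in Condor a *larger* user priority number means a *worse* priority). The function names are ours, for illustration only:

```python
HOUR = 3600

def may_preempt(claim_age, remote_user_prio, submitter_user_prio,
                nice_user=False):
    """Sketch of UWCS_PREEMPTION_REQUIREMENTS: only preempt a claim
    that is at least an hour old and whose owner's priority number is
    more than 20% worse than the idle job's submitter -- unless the
    running job is a nice-user job."""
    return ((claim_age > 1 * HOUR
             and remote_user_prio > submitter_user_prio * 1.2)
            or nice_user)

def preemption_rank(remote_user_prio, image_size_kb):
    """Sketch of UWCS_PREEMPTION_RANK: when preemption is unavoidable,
    prefer to preempt worse-priority users and, among those, jobs with
    a smaller ImageSize."""
    return remote_user_prio * 1_000_000 - image_size_kb
```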
- | + | ||
- | + | ||
- | ##################################################################### | + | |
- | ## This is a Configuration that will cause your Condor jobs to | + | |
- | ## always run. This is intended for testing only. | + | |
- | ###################################################################### | + | |
- | + | ||
- | ## This mode will cause your jobs to start on a machine and will let | + |
- | ## them run to completion. | + | |
- | ## on in the machine (load average, keyboard activity, etc.) | + | |
- | + | ||
- | TESTINGMODE_WANT_SUSPEND = False | + | |
- | TESTINGMODE_WANT_VACATE = False | + | |
- | TESTINGMODE_START = True | + | |
- | TESTINGMODE_SUSPEND = False | + | |
- | TESTINGMODE_CONTINUE = True | + | |
- | TESTINGMODE_PREEMPT = False | + | |
- | TESTINGMODE_KILL = False | + | |
- | TESTINGMODE_PERIODIC_CHECKPOINT = False | + | |
- | TESTINGMODE_PREEMPTION_REQUIREMENTS = False | + | |
- | TESTINGMODE_PREEMPTION_RANK = 0 | + | |
- | + | ||
- | # Prevent machine claims from being reused indefinitely, | + | |
- | # preemption of claims is disabled in the TESTINGMODE configuration. | + | |
- | TESTINGMODE_CLAIM_WORKLIFE = 1200 | + | |
- | + | ||
- | + | ||
- | ###################################################################### | + | |
- | ###################################################################### | + | |
- | ## | + | |
- | ## ###### | + | |
- | ## # # | + | |
- | ## # # | + | |
- | ## ###### | + | |
- | ## # ###### | + | |
- | ## # # # # # | + | |
- | ## # # # # # # | + | |
- | ## | + | |
- | ## Part 4: Settings you should probably leave alone: | + | |
- | ## (unless you know what you're doing) | + | |
- | ###################################################################### | + | |
- | ###################################################################### | + | |
- | + | ||
- | ###################################################################### | + | |
- | ## Daemon-wide settings: | + | |
- | ###################################################################### | + | |
- | + | ||
- | ## Pathnames | + | |
- | LOG = $(LOCAL_DIR)/ | + | |
- | SPOOL = $(LOCAL_DIR)/ | + | |
- | #EXECUTE = $(LOCAL_DIR)/ | + | |
- | EXECUTE = / | + | |
- | BIN = $(RELEASE_DIR)/ | + | |
- | LIB = $(RELEASE_DIR)/ | + | |
- | INCLUDE = $(RELEASE_DIR)/ | + | |
- | SBIN = $(RELEASE_DIR)/ | + | |
- | LIBEXEC = $(RELEASE_DIR)/ | + | |
- | + | ||
- | ## If you leave HISTORY undefined (comment it out), no history file | + | |
- | ## will be created. | + | |
- | HISTORY = $(SPOOL)/ | + | |
- | + | ||
- | ## Log files | + | |
- | COLLECTOR_LOG = $(LOG)/ | + | |
- | KBDD_LOG = $(LOG)/ | + | |
- | MASTER_LOG = $(LOG)/ | + | |
- | NEGOTIATOR_LOG = $(LOG)/ | + | |
- | NEGOTIATOR_MATCH_LOG = $(LOG)/ | + | |
- | SCHEDD_LOG = $(LOG)/ | + | |
- | SHADOW_LOG = $(LOG)/ | + | |
- | STARTD_LOG = $(LOG)/ | + | |
- | STARTER_LOG = $(LOG)/ | + | |
- | JOB_ROUTER_LOG | + | |
- | ROOSTER_LOG | + | |
- | SHARED_PORT_LOG = $(LOG)/ | + | |
- | # High Availability Logs | + | |
- | HAD_LOG = $(LOG)/ | + | |
- | REPLICATION_LOG = $(LOG)/ | + | |
- | TRANSFERER_LOG = $(LOG)/ | + | |
- | HDFS_LOG = $(LOG)/ | + | |
- | + | ||
- | ## Lock files | + | |
- | SHADOW_LOCK = $(LOCK)/ | + | |
- | + | ||
- | ## This setting controls how often any lock files currently in use have their | + | |
- | ## timestamp updated. Updating the timestamp prevents administrative programs | + | |
- | ## like ' | + | |
- | ## an integer in seconds with a minimum of 60 seconds. The default if not | + | |
- | ## specified is 28800 seconds, or 8 hours. | + | |
- | ## This attribute only takes effect on restart of the daemons or at the next | + | |
- | ## update time. | + | |
- | | + | |
- | + | ||
- | ## This setting primarily allows you to change the port that the | + | |
- | ## collector is listening on. By default, the collector uses port | + | |
- | ## 9618, but you can set the port with a ": | + | |
- | ## COLLECTOR_HOST = $(CONDOR_HOST): | + | |
- | COLLECTOR_HOST | + | |
- | + | ||
- | ## The NEGOTIATOR_HOST parameter has been deprecated. | + | |
- | ## the negotiator is listening is now dynamically allocated and the IP | + | |
- | ## and port are now obtained from the collector, just like all the | + | |
- | ## other daemons. | + | |
- | ## are running version 6.7.3 or earlier, you can uncomment this | + | |
- | ## setting to go back to the old fixed-port (9614) for the negotiator. | + | |
- | # | + | |
- | + | ||
- | ## How long are you willing to let daemons try their graceful | + | |
- | ## shutdown methods before they do a hard shutdown? (30 minutes) | + | |
- | SHUTDOWN_GRACEFUL_TIMEOUT = 1800 | + | |
- | + | ||
- | ## How much disk space would you like reserved from Condor? | + | |
- | ## places where Condor is computing the free disk space on various | + | |
- | ## partitions, it subtracts the amount it really finds by this | + | |
- | ## many megabytes. | + | |
- | RESERVED_DISK = 50 | + | |
- | + | ||
- | ## If your machine is running AFS and the AFS cache lives on the same | + | |
- | ## partition as the other Condor directories, | + | |
- | ## reserve the space that your AFS cache is configured to use, set | + | |
- | ## this to true. | + | |
- | # | + | |
- | + | ||
- | ## By default, if a user does not specify " | + | |
- | ## description file, any email Condor sends about that job will go to | + | |
- | ## " | + | |
- | ## domain (so that you would set UID_DOMAIN to be the same across all | + | |
- | ## machines in your pool), *BUT* email to user@UID_DOMAIN is *NOT* | + | |
- | ## the right place for Condor to send email for your site, you can | + | |
- | ## define the default domain to use for email. | + | |
- | ## would be to set EMAIL_DOMAIN to the fully qualified hostname of | + | |
- | ## each machine in your pool, so users submitting jobs from a | + | |
- | ## specific machine would get email sent to user@machine.your.domain, | + | |
- | ## instead of user@your.domain. | + | |
- | ## setting commented out unless two things are true: 1) UID_DOMAIN is | + | |
- | ## set to your domain, not $(FULL_HOSTNAME), | + | |
- | ## user@UID_DOMAIN won't work. | + | |
- | EMAIL_DOMAIN = mail.wlu.edu | + | |
- | + | ||
- | ## Should Condor daemons create a UDP command socket (for incoming | + |
- | ## UDP-based commands) in addition to the TCP command socket? | + | |
- | ## default, classified ad updates sent to the collector use UDP, in | + | |
- | ## addition to some keep alive messages and other non-essential | + | |
- | ## communication. | + | |
- | ## desirable to disable the UDP command port (for example, to reduce | + | |
- | ## the number of ports represented by a GCB broker, etc). If not | + | |
- | ## defined, the UDP command socket is enabled by default, and to | + | |
- | ## modify this, you must restart your Condor daemons. Also, this | + | |
- | ## setting must be defined machine-wide. | + | |
- | ## " | + | |
- | ## is " | + | |
- | # | + | |
- | + | ||
- | ## If your site needs to use TCP updates to the collector, instead of | + | |
- | ## UDP, you can enable this feature. | + | |
- | ## THIS FOR MOST SITES! | + | |
- | ## this feature are pools made up of machines connected via a | + | |
- | ## wide-area network where UDP packets are frequently or always | + | |
- | ## dropped. | + | |
- | ## COLLECTOR_SOCKET_CACHE_SIZE setting at your collector, and each | + | |
- | ## entry in the socket cache uses another file descriptor. | + | |
- | ## defined, this feature is disabled by default. | + | |
- | # | + | |
- | + | ||
- | ## HIGHPORT and LOWPORT let you set the range of ports that Condor | + | |
- | ## will use. This may be useful if you are behind a firewall. By | + | |
- | ## default, Condor uses port 9618 for the collector, 9614 for the | + | |
- | ## negotiator, and system-assigned (apparently random) ports for | + | |
- | ## everything else. HIGHPORT and LOWPORT only affect these | + | |
- | ## system-assigned ports, but will restrict them to the range you | + | |
- | ## specify here. If you want to change the well-known ports for the | + | |
- | ## collector or negotiator, see COLLECTOR_HOST or NEGOTIATOR_HOST. | + | |
- | ## Note that both LOWPORT and HIGHPORT must be at least 1024 if you | + | |
- | ## are not starting your daemons as root. You may also specify | + | |
- | ## different port ranges for incoming and outgoing connections by | + | |
- | ## using IN_HIGHPORT/ | + | |
- | #HIGHPORT = 9700 | + | |
- | #LOWPORT = 9600 | + | |
- | + | ||
- | ## If a daemon doesn' | + |
- | ## a core file? This basically controls the type of the signal | + |
- | ## sent to the child process, and mostly affects the Condor Master | + | |
- | # | + | |
- | + | ||
- | + | ||
- | ###################################################################### | + | |
- | ## Daemon-specific settings: | + | |
- | ###################################################################### | + | |
- | + | ||
- | ## | + | |
- | ## condor_master | + | |
- | ## | + | |
- | ## Daemons you want the master to keep running for you: | + | |
- | # NOTE: DAEMON_LIST is defined in the local configuration files | + | |
- | # | + | |
- | + | ||
- | ## Which daemons use the Condor DaemonCore library (i.e., not the | + | |
- | ## checkpoint server or custom user daemons)? | + | |
- | # | + | |
- | #MASTER, STARTD, SCHEDD, KBDD, COLLECTOR, NEGOTIATOR, EVENTD, \ | + | |
- | # | + | |
- | #DBMSD, QUILL, JOB_ROUTER, ROOSTER, LEASEMANAGER, | + | |
- | + | ||
- | + | ||
- | ## Where are the binaries for these daemons? | + | |
- | MASTER = $(SBIN)/ | + | |
- | STARTD = $(SBIN)/ | + | |
- | SCHEDD = $(SBIN)/ | + | |
- | KBDD = $(SBIN)/ | + | |
- | NEGOTIATOR = $(SBIN)/ | + | |
- | COLLECTOR = $(SBIN)/ | + | |
- | STARTER_LOCAL = $(SBIN)/ | + | |
- | JOB_ROUTER | + | |
- | ROOSTER | + | |
- | HDFS = $(SBIN)/ | + | |
- | SHARED_PORT = $(LIBEXEC)/ | + | |
- | TRANSFERER = $(LIBEXEC)/ | + | |
- | + | ||
- | ## When the master starts up, it can place its address (IP and port) | + |
- | ## into a file. This way, tools running on the local machine don' | + | |
- | ## need to query the central manager to find the master. | + | |
- | ## feature can be turned off by commenting out this setting. | + | |
- | MASTER_ADDRESS_FILE = $(LOG)/ | + | |
- | + | ||
- | ## Where should the master find the condor_preen binary? If you don' | + | |
- | ## want preen to run at all, set it to nothing. | + | |
- | PREEN = $(SBIN)/ | + | |
- | + | ||
- | ## How do you want preen to behave? | + | |
- | ## about files preen finds that it thinks it should remove. | + | |
- | ## means you want preen to actually remove these files. | + | |
- | ## want either of those things to happen, just remove the appropriate | + | |
- | ## one from this setting. | + | |
- | PREEN_ARGS = -m -r | + | |
- | + | ||
- | ## How often should the master start up condor_preen? | + | |
- | # | + | |
- | + | ||
- | ## If a daemon dies an unnatural death, do you want email about it? | + | |
- | PUBLISH_OBITUARIES = True | + | |
- | + | ||
- | ## If you're getting obituaries, how many lines of the end of that | + | |
- | ## daemon' | + | |
- | OBITUARY_LOG_LENGTH = 30 | + | |
- | + | ||
- | ## Should the master run? | + | |
- | START_MASTER = True | + | |
- | + | ||
- | ## Should the master start up the daemons you want it to? | + | |
- | START_DAEMONS = True | + | |
- | + | ||
- | ## How often do you want the master to send an update to the central | + | |
- | ## manager? | + | |
- | MASTER_UPDATE_INTERVAL = 300 | + | |
- | + | ||
- | ## How often do you want the master to check the timestamps of the | + | |
- | ## daemons it's running? | + | |
- | ## master restarts them. | + | |
- | MASTER_CHECK_NEW_EXEC_INTERVAL = 1800 | + | |
- | + | ||
- | ## Once you notice new binaries, how long should you wait before you | + | |
- | ## try to execute them? | + | |
- | MASTER_NEW_BINARY_DELAY = 120 | + | |
- | + | ||
- | ## What's the maximum amount of time you're willing to give the | + | |
- | ## daemons to quickly shutdown before you just kill them outright? | + | |
- | SHUTDOWN_FAST_TIMEOUT = 120 | + | |
- | + | ||
- | ###### | + | |
- | ## Exponential backoff settings: | + | |
- | ###### | + | |
- | ## When a daemon keeps crashing, we use " | + | |
- | ## wait longer and longer before restarting it. This is the base of | + | |
- | ## the exponent used to determine how long to wait before starting | + | |
- | ## the daemon again: | + | |
- | MASTER_BACKOFF_FACTOR = 2.0 | + | |
- | + | ||
- | ## What's the maximum amount of time you want the master to wait | + | |
- | ## between attempts to start a given daemon? | + | |
- | ## MASTER_BACKOFF_FACTOR, | + | |
- | MASTER_BACKOFF_CEILING = 3600 | + | |
- | + | ||
- | ## How long should a daemon run without crashing before we consider | + | |
- | ## it " | + | |
- | ## of restarts so the exponential backoff stuff goes back to normal. | + | |
- | MASTER_RECOVER_FACTOR = 300 | + | |
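The schedule these three knobs produce can be sketched as follows. This is a simplified model of the backoff idea described above, not the master's exact internal arithmetic:

```python
def restart_delay(consecutive_failures, factor=2.0, ceiling=3600):
    """Delay before the master restarts a crashing daemon: roughly
    factor**n seconds after the n-th consecutive failure, capped at
    MASTER_BACKOFF_CEILING.  Running cleanly for MASTER_RECOVER_FACTOR
    seconds resets n to zero."""
    return min(factor ** consecutive_failures, ceiling)
```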
- | + | ||
- | + | ||
- | ## | + | |
- | ## condor_collector | + | |
- | ## | + | |
- | ## Address to which Condor will send a weekly e-mail with output of | + | |
- | ## condor_status. | + | |
- | # NOTE: CONDOR_DEVELOPERS is defined in the local configuration files | + | |
- | # | + | |
- | + | ||
- | ## Global Collector to periodically advertise basic information about | + | |
- | ## your pool. | + | |
- | # | + | |
- | + | ||
- | + | ||
- | ## | + | |
- | ## condor_negotiator | + | |
- | ## | + | |
- | ## Determine if the Negotiator will honor SlotWeight attributes, which | + | |
- | ## may be used to give a slot greater weight when calculating usage. | + | |
- | NEGOTIATOR_USE_SLOT_WEIGHTS = True | + | |
- | + | ||
- | + | ||
- | ## How often the Negotiator starts a negotiation cycle, defined in | + |
- | ## seconds. | + | |
- | NEGOTIATOR_INTERVAL = 60 | + | |
- | + | ||
- | ## Should the Negotiator publish an update to the Collector after | + | |
- | ## every negotiation cycle. It is useful to have this set to True | + | |
- | ## to get immediate updates on LastNegotiationCycle statistics. | + | |
- | NEGOTIATOR_UPDATE_AFTER_CYCLE = False | + | |
- | + | ||
- | + | ||
- | ## | + | |
- | ## condor_startd | + | |
- | ## | + | |
- | ## Where are the various condor_starter binaries installed? | + | |
- | STARTER_LIST = STARTER, STARTER_STANDARD | + | |
- | STARTER = $(SBIN)/ | + | |
- | STARTER_STANDARD = $(SBIN)/ | + | |
- | STARTER_LOCAL = $(SBIN)/ | + | |
- | + | ||
- | ## When the startd starts up, it can place its address (IP and port) | + |
- | ## into a file. This way, tools running on the local machine don' | + | |
- | ## need to query the central manager to find the startd. | + | |
- | ## feature can be turned off by commenting out this setting. | + | |
- | STARTD_ADDRESS_FILE = $(LOG)/ | + | |
- | + | ||
- | ## When a machine is claimed, how often should we poll the state of | + | |
- | ## the machine to see if we need to evict/ | + | |
- | POLLING_INTERVAL | + | |
- | + | ||
- | ## How often should the startd send updates to the central manager? | + | |
- | UPDATE_INTERVAL | + | |
- | + | ||
- | ## How long is the startd willing to stay in the " | + | |
- | MATCH_TIMEOUT = 600 | + | |
- | + | ||
- | ## How long is the startd willing to stay in the preempting/ | + | |
- | ## state before it just kills the starter directly? | + | |
- | KILLING_TIMEOUT = 60 | + | |
- | + | ||
- | ## When a machine is unclaimed, when should it run benchmarks? | + |
- | ## LastBenchmark is initialized to 0, so this expression says as soon | + | |
- | ## as we're unclaimed, run the benchmarks. | + | |
- | ## unclaimed and it's been at least 4 hours since we ran the last | + | |
- | ## benchmarks, run them again. | + | |
- | ## of the benchmark results to provide more accurate values. | + | |
- | ## Note, if you don't want any benchmarks run at all, either comment | + | |
- | ## RunBenchmarks out, or set it to " | + | |
- | BenchmarkTimer = (time() - LastBenchmark) | + | |
- | RunBenchmarks : (LastBenchmark == 0 ) || ($(BenchmarkTimer) >= (4 * $(HOUR))) | + | |
- | # | + | |
- | + | ||
- | ## When the startd does benchmarks, which set of benchmarks should we | + | |
- | ## run? The default is the same as pre-7.5.6: MIPS and KFLOPS. | + | |
- | benchmarks_joblist = mips kflops | + | |
- | + | ||
- | ## What's the max " | + | |
- | ## (1.01), the startd will run the benchmarks serially. | + | |
- | benchmarks_max_job_load = 1.01 | + | |
- | + | ||
- | # MIPS (Dhrystone 2.1) benchmark: load 1.0 | + | |
- | benchmarks_mips_executable = $(LIBEXEC)/ | + | |
- | benchmarks_mips_job_load = 1.0 | + | |
- | + | ||
- | # KFLOPS (clinpack) benchmark: load 1.0 | + | |
- | benchmarks_kflops_executable = $(LIBEXEC)/ | + | |
- | benchmarks_kflops_job_load = 1.0 | + | |
- | + | ||
- | + | ||
- | ## Normally, when the startd is computing the idle time of all the | + | |
- | ## users of the machine (both local and remote), it checks the utmp | + | |
- | ## file to find all the currently active ttys, and only checks access | + | |
- | ## time of the devices associated with active logins. | + | |
- | ## on some systems, utmp is unreliable, and the startd might miss | + | |
- | ## keyboard activity by doing this. So, if your utmp is unreliable, | + | |
- | ## set this setting to True and the startd will check the access time | + | |
- | ## on all tty and pty devices. | + | |
- | # | + | |
- | + | ||
- | ## This entry allows the startd to monitor console (keyboard and | + | |
- | ## mouse) activity by checking the access times on special files in | + | |
- | ## /dev. Activity on these files shows up as " | + | |
- | ## the startd' | + | |
- | ## names of devices you want considered the console, without the | + | |
- | ## "/ | + | |
- | CONSOLE_DEVICES = console | + | |


## The STARTD_ATTRS (and legacy STARTD_EXPRS) entry allows you to
## have the startd advertise arbitrary attributes from the config
## file in its ClassAd.  Give a comma-separated list of entries
## from the config file you want in the startd ClassAd.
## NOTE: because of the different syntax of the config file and
## ClassAds, you might have to do a little extra work to get a given
## entry into the ClassAd.  In particular, ClassAds require double
## quotes (") around your strings.  Numeric values can go in
## directly, as can boolean expressions.  For example, if you wanted
## the startd to advertise its list of console devices, when it's
## configured to run benchmarks, and how often it sends updates to
## the central manager, you'd have to define the following helper
## macro:
#CONSOLE_DEVICES_STRING = "$(CONSOLE_DEVICES)"
## Note: this must come before you define STARTD_ATTRS because macros
## must be defined before you use them in other macros or
## expressions.
## Then, you'd set the STARTD_ATTRS setting to this:
#STARTD_ATTRS = CONSOLE_DEVICES_STRING, RunBenchmarks, UPDATE_INTERVAL
##
## STARTD_ATTRS can also be defined on a per-slot basis.  The startd
## builds the list of attributes to advertise by combining the lists
## in this order: STARTD_ATTRS, STARTD_SLOT<N>_ATTRS.  In the below
## example, the startd ad for slot1 will have the value for
## favorite_color, favorite_season, and favorite_movie, and slot2
## will have favorite_color, favorite_season, and favorite_song.
##
#STARTD_ATTRS = favorite_color, favorite_season
#STARTD_SLOT1_ATTRS = favorite_movie
#STARTD_SLOT2_ATTRS = favorite_song
##
## Attributes in the STARTD_ATTRS list can also be on a per-slot basis.
## For example, the following configuration:
##
#favorite_color = "blue"
#favorite_season = "spring"
#SLOT2_favorite_color = "green"
#SLOT3_favorite_season = "summer"
#STARTD_ATTRS = favorite_color, favorite_season
##
## will result in the following attributes in the slot classified
## ads:
##
## slot1 - favorite_color = "blue"; favorite_season = "spring"
## slot2 - favorite_color = "green"; favorite_season = "spring"
## slot3 - favorite_color = "blue"; favorite_season = "summer"
##
## Finally, the recommended default value for this setting is to
## publish the COLLECTOR_HOST setting as a string.  This can be
## useful using the "$$(COLLECTOR_HOST)" syntax in the submit file
## for jobs to know (for example, via their environment) what pool
## they're running in.
COLLECTOR_HOST_STRING = "$(COLLECTOR_HOST)"
STARTD_ATTRS = COLLECTOR_HOST_STRING

## When the startd is claimed by a remote user, it can also advertise
## arbitrary attributes from the ClassAd of the job it's working on.
## Just list the attribute names you want advertised.
## Note: since this is already a ClassAd, you don't have to do
## anything funny with strings, etc.  This feature can be turned off
## by commenting out this setting (there is no default).
STARTD_JOB_EXPRS = ImageSize, ExecutableSize, JobUniverse, NiceUser
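## For example, a site that also wants each job's owner advertised in
## the claimed slot's ad could extend the list like this (an
## illustrative, commented-out sketch; adjust the attribute names to
## what your site actually needs):
#STARTD_JOB_EXPRS = ImageSize, ExecutableSize, JobUniverse, NiceUser, Owner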

## If you want to "lie" to Condor about how many CPUs your machine
## has, you can use this setting to override Condor's automatic
## computation.  You must restart the condor_startd daemon for
## the change to take effect (a simple condor_reconfig will not do).
## Please read the section on "condor_startd Configuration File
## Macros" in the Condor Administrator's Manual for a further
## discussion of this setting.  This setting
## must be an integer ("N" isn't a special value, it's just used to
## represent the default).
#NUM_CPUS = N

## If you never want Condor to detect more than "N" CPUs, uncomment
## this line.  You must restart the startd for this setting to take
## effect.  If set to 0 or a negative number, it is ignored.
## By default, it is ignored.  Otherwise, it must be a positive
## integer ("N" isn't a special value, it's just used to
## represent the default).
#MAX_NUM_CPUS = N

## Normally, Condor will automatically detect the amount of physical
## memory available on your machine.  Define MEMORY to tell Condor
## how much physical memory (in MB) your machine has, overriding the
## value Condor computes automatically.
#MEMORY = 128

## How much memory would you like reserved from Condor?  By default,
## Condor considers all the physical memory of your machine as
## available to be used by Condor jobs.  If RESERVED_MEMORY is
## defined, Condor subtracts it from the amount of memory it
## advertises as available.
#RESERVED_MEMORY = 0
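## For example, on a machine with 4 GB of physical memory, the
## following (illustrative, commented-out) setting would hold back
## 512 MB for the operating system and advertise the rest to Condor:
#RESERVED_MEMORY = 512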

######
## SMP startd settings
##
## By default, Condor will evenly divide the resources in an SMP
## machine (such as RAM, swap space and disk space) among all the
## CPUs, and advertise each CPU as its own slot with an even share of
## the system resources.  If you want something other than this,
## there are a few options available to you.  Please read the section
## on "Configuring The Startd for SMP Machines" in the Condor
## Administrator's Manual for full details.  The various settings are
## only briefly listed and described here.
######

## The maximum number of different slot types.
#MAX_SLOT_TYPES = 10

## Use this setting to define your own slot types.  This
## allows you to divide system resources unevenly among your CPUs.
## You must use a different setting for each different type you
## define.  The "<N>" in the name of the macro must be
## an integer from 1 to MAX_SLOT_TYPES (defined above),
## and you use this number to refer to your type.  There are many
## different formats these settings can take, so be sure to refer to
## the section on "Configuring The Startd for SMP Machines" in the
## Condor Administrator's Manual for full details.  In particular,
## read the section titled "Dividing System Resources in Multi-type
## SMP Machines" to understand this setting.  Note: you
## must restart the condor_startd for the change to take effect.
#SLOT_TYPE_<N> = X
#
# For example:
#SLOT_TYPE_1 = 1/4
#SLOT_TYPE_2 = 1/4

## If you define your own slot types, you must specify how
## many slots of each type you wish to advertise.  You do
## this with the setting below, replacing the "<N>" with the
## corresponding integer you used to define the type above.  You can
## change the number of a given type being advertised at run-time,
## with a simple condor_reconfig.
#NUM_SLOTS_TYPE_<N> = M
# For example:
#NUM_SLOTS_TYPE_1 = 2
#NUM_SLOTS_TYPE_2 = 1

## The number of evenly-divided slots you want Condor to
## report to your pool (if less than the total number of CPUs).  This
## setting is only considered if the "type" settings described above
## are not in use.  By default, all CPUs are reported.  This setting
## must be an integer ("N" isn't a special value, it's just used to
## represent the default).
#NUM_SLOTS = N

## How many of the slots the startd is representing should
## be "connected" to the console (in other words, notice when there's
## console activity)?  This defaults to all slots (N in a
## machine with N CPUs).  ("N" isn't a special value in this
## setting, that's just used to represent the default).
#SLOTS_CONNECTED_TO_CONSOLE = N

## How many of the slots the startd is representing should
## be "connected" to the keyboard (for remote tty activity, as well
## as console activity).  Defaults to 1.
SLOTS_CONNECTED_TO_KEYBOARD = 1

## If there are slots that aren't connected to the
## keyboard or the console (see the above two settings), the
## corresponding idle time reported will be the time since the startd
## was spawned, plus the value of this parameter.  It defaults to 20
## minutes.  We do this because, if the slot is configured
## not to care about keyboard activity, we want it to be available to
## Condor jobs as soon as the startd starts up, instead of having to
## wait for 15 minutes or more (which is the default time a machine
## must be idle before Condor will start a job).  If you don't want
## this boost, just set the value to 0.  If you change your START
## expression to require more than 15 minutes before a job starts,
## but you still want jobs to start right away on some of your SMP
## nodes, just increase this parameter.
#DISCONNECTED_KEYBOARD_IDLE_BOOST = 1200
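## For example, a site whose START expression requires 30 minutes of
## idle time could raise the boost past 30 minutes so disconnected
## slots still match immediately (an illustrative, commented-out
## sketch):
#DISCONNECTED_KEYBOARD_IDLE_BOOST = 2400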

######
## Settings for computing optional resource availability statistics:
######
## If STARTD_COMPUTE_AVAIL_STATS = True, the startd will compute
## statistics about resource availability to be included in the
## classad(s) sent to the collector describing the resource(s) the
## startd manages.  The following attributes will always be included
## in the resource classad(s) if STARTD_COMPUTE_AVAIL_STATS = True:
##   AvailTime = What proportion of the time (between 0.0 and 1.0)
##     has this resource been in a state other than "Owner"?
##   LastAvailInterval = What was the duration (in seconds) of the
##     last period between "Owner" states?
## The following attributes will also be included if the resource is
## not in the "Owner" state:
##   AvailSince = At what time did the resource last leave the
##     "Owner" state?  Measured in the number of seconds since the
##     epoch (00:00:00 UTC, Jan 1, 1970).
##   AvailTimeEstimate = Based on past history, this is an estimate
##     of how long the current period between "Owner" states will
##     last.
#STARTD_COMPUTE_AVAIL_STATS = False

## If STARTD_COMPUTE_AVAIL_STATS = True, STARTD_AVAIL_CONFIDENCE sets
## the confidence level of the AvailTimeEstimate.  By default, the
## estimate is based on the 80th percentile of past values.
#STARTD_AVAIL_CONFIDENCE = 0.8

## STARTD_MAX_AVAIL_PERIOD_SAMPLES limits the number of samples of
## past available intervals stored by the startd to limit memory and
## disk consumption.  Each sample requires 4 bytes of memory and
## approximately 10 bytes of disk space.
#STARTD_MAX_AVAIL_PERIOD_SAMPLES = 100

##
##
##
CKPT_PROBE = $(LIBEXEC)/condor_ckpt_probe

##
##  condor_schedd
##
## Where are the various shadow binaries installed?
SHADOW_LIST = SHADOW, SHADOW_STANDARD
SHADOW = $(SBIN)/condor_shadow
SHADOW_STANDARD = $(SBIN)/condor_shadow.std

## When the schedd starts up, it can place its address (IP and port)
## into a file.  This way, tools running on the local machine don't
## need to query the central manager to find the schedd.  This
## feature can be turned off by commenting out this setting.
SCHEDD_ADDRESS_FILE = $(SPOOL)/.schedd_address

## Additionally, a daemon may store its ClassAd on the local filesystem
## as well as sending it to the collector.  This way, tools that need
## information about a daemon do not have to contact the central manager
## to get information about a daemon on the same machine.
## This feature is necessary for Quill to work.
SCHEDD_DAEMON_AD_FILE = $(SPOOL)/.schedd_classad

## How often should the schedd send an update to the central manager?
SCHEDD_INTERVAL = 600

## How long should the schedd wait between spawning each shadow?
JOB_START_DELAY = 2

## How many concurrent sub-processes should the schedd spawn to handle
## queries?
SCHEDD_QUERY_WORKERS = 3

## How often should the schedd send a keep alive message to any
## startds it has claimed?  (5 minutes)
ALIVE_INTERVAL = 300

## This setting controls the maximum number of times that a
## condor_shadow process can have a fatal error (exception) before
## the condor_schedd will simply relinquish the match associated with
## the dying shadow.
MAX_SHADOW_EXCEPTIONS = 5

## Estimated virtual memory size of each condor_shadow process.
## Specified in kilobytes.
# SHADOW_SIZE_ESTIMATE = 800

## The condor_schedd can renice the condor_shadow processes on your
## submit machines.  How "nice" do you want the shadows? (1-19).
## The higher the number, the lower priority the shadows have.
SHADOW_RENICE_INCREMENT = 1

## The condor_schedd can renice scheduler universe processes
## (e.g. DAGMan) on your submit machines.  How "nice" do you want the
## scheduler universe processes? (1-19).  The higher the number, the
## lower priority the processes have.
# SCHED_UNIV_RENICE_INCREMENT = 0

## By default, when the schedd fails to start an idle job, it will
## not try to start any other idle jobs in the same cluster during
## that negotiation cycle.  This makes negotiation much more
## efficient for large job clusters.  However, in some cases other
## jobs in the cluster can be started even though an earlier job
## can't.  For example, the jobs' requirements may differ, due to
## different disk space, memory, or operating system requirements.
## Or, machines may be willing to run only some jobs in the cluster,
## because their requirements reference the jobs' virtual memory size
## or other attribute.  Setting NEGOTIATE_ALL_JOBS_IN_CLUSTER to True
## will force the schedd to try to start all idle jobs in each
## negotiation cycle.  This will make negotiation cycles longer,
## but it will ensure that all jobs that can be started will be
## started.
NEGOTIATE_ALL_JOBS_IN_CLUSTER = True

## This setting controls how often, in seconds, the schedd considers
## periodic job actions given by the user in the submit file.
## (Currently, these are periodic_hold, periodic_release, and
## periodic_remove.)
#PERIODIC_EXPR_INTERVAL = 60

######
## Queue management settings:
######
## How often should the schedd truncate its job queue transaction
## log? (Specified in seconds, once a day is the default.)
#QUEUE_CLEAN_INTERVAL = 86400

## How often should the schedd commit "wall clock" run time for jobs
## to the queue, so run time statistics remain accurate when the
## schedd crashes?  (Specified in seconds, once per hour is the
## default.)
#WALL_CLOCK_CKPT_INTERVAL = 3600

## What users do you want to grant super user access to this job
## queue?  (These users will be able to remove other users' jobs).
## By default, this only includes root.
#QUEUE_SUPER_USERS = root, condor
# NOTE: QUEUE_SUPER_USERS is defined in the local configuration files
#
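## For example, to let both root and a pool administrator account
## remove any user's jobs on this schedd (an illustrative,
## commented-out sketch; "condoradmin" is a hypothetical account,
## substitute a real one at your site):
#QUEUE_SUPER_USERS = root, condoradmin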


##
##  condor_shadow
##
## If the shadow is unable to read a checkpoint file from the
## checkpoint server, it keeps trying only if the job has accumulated
## more than MAX_DISCARDED_RUN_TIME seconds of CPU usage.  Otherwise,
## the job is started from scratch.  Defaults to 1 hour.  This
## setting is only used if USE_CKPT_SERVER (from above) is True.
MAX_DISCARDED_RUN_TIME = 3600

## Should periodic checkpoints be compressed?
COMPRESS_PERIODIC_CKPT = False

## Should vacate checkpoints be compressed?
COMPRESS_VACATE_CKPT = False

## Should we commit the application's dirty memory pages to swap
## space during a periodic checkpoint?
#PERIODIC_MEMORY_SYNC = False

## Should we write vacate checkpoints slowly?  If specified, this
## parameter specifies the speed at which vacate checkpoints should
## be written, in kilobytes per second.
#SLOW_CKPT_SPEED = 0

## How often should the shadow update the job queue with job
## attributes that periodically change?  Specified in seconds.
SHADOW_QUEUE_UPDATE_INTERVAL = 15 * 60

## Should the shadow wait to update certain job attributes for the
## next periodic update, or should it immediately update these
## attributes as they change?  Due to the performance impact of
## aggressive updates to a busy condor_schedd, the default is True.
SHADOW_LAZY_QUEUE_UPDATE = TRUE


##
##  condor_starter
##
## The condor_starter can renice the processes of Condor
## jobs on your execute machines.  If you want this, uncomment the
## following entry and set it to how "nice" you want the user
## jobs. (1-19)  The larger the number, the lower priority the
## process gets on your machines.
## Note on Win32 platforms, this number needs to be greater than
## zero (i.e. the job must be reniced) or the mechanism that
## monitors CPU load on Win32 systems will give erratic results.
JOB_RENICE_INCREMENT = 4

## Should the starter do local logging to its own log file, or send
## debug information back to the condor_shadow where it will end up
## in the ShadowLog?
STARTER_LOCAL_LOGGING = FALSE

## If the UID_DOMAIN settings match on both the execute and submit
## machines, but the UID of the user who submitted the job isn't in
## the passwd file of the execute machine, the starter will normally
## exit with an error.  Do you want the starter to just run the
## job with the specified UID, even if it's not in the passwd file?
#SOFT_UID_DOMAIN = FALSE

## Honor the run_as_owner option from the condor submit file.
##
STARTER_ALLOW_RUNAS_OWNER = TRUE

## Tell the Starter/Startd which program to use to remove directories.
## condor_rmdir.exe is a Windows-only command that does a better job
## than the built-in rmdir command when it is run with elevated
## privileges, such as when Condor is running as a service.
## /s means delete subdirectories
## /c means continue on error
WINDOWS_RMDIR = $(SBIN)\condor_rmdir.exe
#WINDOWS_RMDIR_OPTIONS = /S /C

##
##  condor_procd
##
##
# the path to the procd binary
#
PROCD = $(SBIN)/condor_procd

# the path to the procd "address"
#   - on UNIX this will be a named pipe; we'll put it in the
#     $(LOCK) directory by default (note that multiple named pipes
#     will be created in this directory for when the procd responds
#     to its clients)
#   - on Windows, this will be a named pipe as well (but named pipes
#     on Windows are not even close to the same thing as named pipes
#     on UNIX); the name will be something like:
#         \\.\pipe\condor_procd
#
PROCD_ADDRESS = $(LOCK)/procd_pipe

# The procd currently uses a very simplistic logging system.  Since
# this log will not be rotated like other Condor logs, it is only
# recommended to set PROCD_LOG when attempting to debug a problem.
# In other Condor daemons, turning on D_PROCFAMILY will result in
# that daemon logging all of its interactions with the ProcD.
#
#PROCD_LOG = $(LOG)/ProcLog

# This is the maximum period that the procd will use for taking
# snapshots (the actual period may be lower if a condor daemon
# registers a family for which it wants more frequent snapshots)
#
PROCD_MAX_SNAPSHOT_INTERVAL = 60

# On Windows, we send a process a "soft kill" via a WM_CLOSE message.
# This binary is used by the ProcD (and other Condor daemons if
# PRIVSEP is not enabled) to help when sending soft kills.
WINDOWS_SOFTKILL = $(SBIN)/condor_softkill

##
##  condor_submit
##
## If you want condor_submit to automatically append an expression to
## the Requirements expression or Rank expression of jobs at your
## site, uncomment these entries.
#APPEND_REQUIREMENTS = (expression to append to job requirements)
#APPEND_RANK = (expression to append to job rank)

## If you want expressions only appended for either standard or
## vanilla universe jobs, you can uncomment these entries.  If any of
## them are defined, they are used for the given universe, instead of
## the generic entries above.
#APPEND_REQ_STANDARD = (expression to append for standard universe jobs)
#APPEND_REQ_VANILLA = (expression to append for vanilla universe jobs)
#APPEND_RANK_STANDARD = (expression to append for standard universe jobs)
#APPEND_RANK_VANILLA = (expression to append for vanilla universe jobs)
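## As an illustrative (commented-out) sketch, a site could require at
## least 1 GB of memory for vanilla universe jobs only; machine ads
## advertise Memory in megabytes:
#APPEND_REQ_VANILLA = (Memory >= 1024)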

## This can be used to define a default value for the rank expression
## if one is not specified in the submit file.
DEFAULT_RANK =

## If you want universe-specific defaults, you can use the following
## entries:
#DEFAULT_RANK_VANILLA =
#DEFAULT_RANK_STANDARD =
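## For example, to prefer faster machines for vanilla jobs whose
## submit files give no rank of their own (an illustrative,
## commented-out sketch; KFlops is the floating-point benchmark
## advertised in machine ads):
#DEFAULT_RANK_VANILLA = KFlops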

## If you want condor_submit to automatically append expressions to
## the job ClassAds it creates, you can uncomment and define the
## SUBMIT_EXPRS setting.  It works just like the STARTD_ATTRS setting
## described above with respect to ClassAd vs. config file syntax,
## strings, etc.  One common use would be to have the full hostname
## of the machine where a job was submitted placed in the job
## ClassAd.  You would do that with the following settings:
#MACHINE = "$(FULL_HOSTNAME)"
#SUBMIT_EXPRS = MACHINE

## Condor keeps a buffer of recently-used data for each file an
## application opens.  This macro specifies the default maximum number
## of bytes to be buffered for each open file at the executing
## machine.
#DEFAULT_IO_BUFFER_SIZE = 524288

## Condor will attempt to consolidate small read and write operations
## into large blocks.  This macro specifies the default block size
## Condor will use.
#DEFAULT_IO_BUFFER_BLOCK_SIZE = 32768
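## For example, to buffer up to 1 MB per open file in 64 KB blocks
## (an illustrative, commented-out sketch; both values are in bytes):
#DEFAULT_IO_BUFFER_SIZE = 1048576
#DEFAULT_IO_BUFFER_BLOCK_SIZE = 65536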

##
##  condor_preen
##
## Who should condor_preen send email to?
PREEN_ADMIN = $(CONDOR_ADMIN)

## What files should condor_preen leave in the spool directory?
VALID_SPOOL_FILES = job_queue.log, job_queue.log.tmp, history, \
                    Accountant.log, Accountantnew.log, \
                    local_univ_execute, .quillwritepassword, \
                    .pgpass, \
                    .schedd_address, .schedd_classad

## What files should condor_preen remove from the log directory?
INVALID_LOG_FILES = core

##
##  Java parameters:
##
## If you would like this machine to be able to run Java jobs,
## then set JAVA to the path of your JVM binary.  If you are not
## interested in Java, there is no harm in leaving this entry
## empty or incorrect.

JAVA = /
JAVA_MAXHEAP_ARGUMENT = -Xmx1024m

## JAVA_CLASSPATH_DEFAULT gives the default set of paths in which
## Java classes are to be found.  Each path is separated by a space.
## If your JVM needs to be informed of additional directories, add
## them here.  However, do not remove the existing entries, as Condor
## needs them.

JAVA_CLASSPATH_DEFAULT = $(LIB) $(LIB)/scimark2lib.jar .

## JAVA_CLASSPATH_ARGUMENT describes the command-line parameter
## used to introduce a new classpath:

JAVA_CLASSPATH_ARGUMENT = -classpath

## JAVA_CLASSPATH_SEPARATOR describes the character used to mark
## one path element from another:

JAVA_CLASSPATH_SEPARATOR = :

## JAVA_BENCHMARK_TIME describes the number of seconds for which
## to run Java benchmarks.  A longer time yields a more accurate
## benchmark, but consumes more otherwise useful CPU time.
## If this time is zero or undefined, no Java benchmarks will be run.

JAVA_BENCHMARK_TIME = 2

## If your JVM requires any special arguments not mentioned in
## the options above, then give them here.

JAVA_EXTRA_ARGUMENTS =
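## For example, to pass an extra system property to the JVM for every
## Java universe job (an illustrative sketch; the property name is
## hypothetical):
#JAVA_EXTRA_ARGUMENTS = -Dpool.name=example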

##
##
##  Condor-G settings
##
## Where is the GridManager binary installed?

GRIDMANAGER = $(SBIN)/condor_gridmanager
GT2_GAHP = $(SBIN)/gahp_server
GRID_MONITOR = $(SBIN)/grid_monitor.sh

##
##  Settings that control the daemon's debugging output:
##
##
## Note that the Gridmanager runs as the User, not a Condor daemon, so
## all users must have write permission to the directory that the
## Gridmanager will use for its logfile.  Our suggestion is to create a
## directory called GridLogs in $(LOG) with UNIX permissions 1777
## (just like /tmp )
## Another option is to use /tmp as the location of the GridManager log.
##

MAX_GRIDMANAGER_LOG = 1000000
GRIDMANAGER_DEBUG =

GRIDMANAGER_LOG = $(LOG)/GridmanagerLog.$(USERNAME)
GRIDMANAGER_LOCK = $(LOCK)/GridmanagerLock.$(USERNAME)

##
## Various other settings that Condor-G can use.
##

## For grid-type gt2 jobs (pre-WS GRAM), limit the number of jobmanager
## processes the gridmanager will let run on the headnode.  Letting too
## many jobmanagers run causes severe load on the headnode.
GRIDMANAGER_MAX_JOBMANAGERS_PER_RESOURCE = 10

## If we're talking to a Globus 2.0 resource, Condor-G will use the new
## version of the GRAM protocol.  The first option is how often to check
## the proxy on the submit side of things.  If the GridManager discovers
## a new proxy, it will restart itself and use the new proxy for all
## future jobs launched.  In seconds, and defaults to 10 minutes (600).
#GRIDMANAGER_CHECKPROXY_INTERVAL = 600

## The GridManager will shut things down 3 minutes before losing contact
## because of an expired proxy.
## In seconds, and defaults to 3 minutes (180).
#GRIDMANAGER_MINIMUM_PROXY_TIME = 180

## Condor requires that each submitted job be designated to run under a
## particular "universe".
##
## If no universe is specified in the submit file, Condor must pick one
## for the job to use.  By default, it chooses the "vanilla" universe.
## The default can be overridden in the config file with the
## DEFAULT_UNIVERSE setting, which is a string to insert into a job
## submit description if the job does not try to define its own
## universe.
##
#DEFAULT_UNIVERSE = vanilla

#
# The Cred_min_time_left is the first pass at making sure that Condor-G
# does not submit your job without it having enough time left for the
# job to finish.  For example, if you have a job that runs for 20
# minutes, and you might spend 40 minutes in the queue, it's a bad idea
# to submit with less than an hour left before your proxy expires.
# 2 hours seemed like a reasonable default.
#
CRED_MIN_TIME_LEFT = 120

##
## The GridMonitor allows you to submit many more jobs to a GT2 GRAM
## server than is normally possible.
#ENABLE_GRID_MONITOR = TRUE

##
## When an error occurs with the GridMonitor, how long should the
## gridmanager wait before trying to submit a new GridMonitor job?
## The default is 1 hour (3600 seconds).
#GRID_MONITOR_DISABLE_TIME = 3600

##
## The location of the wrapper for invoking the
## Condor GAHP server
##
CONDOR_GAHP = $(SBIN)/condor_c-gahp
CONDOR_GAHP_WORKER = $(SBIN)/condor_c-gahp_worker_thread

##
## The Condor GAHP server has its own log.  Like the Gridmanager, the
## GAHP server is run as the User, not a Condor daemon, so all users
## must have write permission to the directory used for the logfile.
## Our suggestion is to create a directory called GridLogs in $(LOG)
## with UNIX permissions 1777 (just like /tmp )
## Another option is to use /tmp as the location of the CGAHP log.
##
MAX_C_GAHP_LOG = 1000000

#C_GAHP_LOG = $(LOG)/GridLogs/CGAHPLog.$(USERNAME)
C_GAHP_LOG = /tmp/CGAHPLog.$(USERNAME)
C_GAHP_LOCK = /tmp/CGAHPLock.$(USERNAME)
C_GAHP_WORKER_THREAD_LOG = /tmp/CGAHPWorkerLog.$(USERNAME)
C_GAHP_WORKER_THREAD_LOCK = /tmp/CGAHPWorkerLock.$(USERNAME)

##
## The location of the wrapper for invoking the
## GT4 GAHP server
##
GT4_GAHP = $(SBIN)/gt4_gahp

##
## The location of GT4 files.  This should normally be lib/gt4
##
GT4_LOCATION = $(LIB)/gt4

##
## The location of the wrapper for invoking the
## GT4.2 GAHP server
##
GT42_GAHP = $(SBIN)/gt42_gahp

##
## The location of GT4.2 files.  This should normally be lib/gt42
##
GT42_LOCATION = $(LIB)/gt42

##
## gt4 gram requires a gridftp server to perform file transfers.
## If GRIDFTP_URL_BASE is set, then Condor assumes there is a gridftp
## server set up at that URL suitable for its use.  Otherwise, Condor
## will start its own gridftp servers as needed, using the binary
## pointed at by GRIDFTP_SERVER.  GRIDFTP_SERVER_WRAPPER points to a
## wrapper script needed to properly set the path to the gridmap file.
##
#GRIDFTP_URL_BASE = gsiftp://$(FULL_HOSTNAME)
GRIDFTP_SERVER = $(LIBEXEC)/globus-gridftp-server
GRIDFTP_SERVER_WRAPPER = $(LIBEXEC)/gridftp_wrapper.sh

##
## Location of the PBS/LSF gahp and its associated binaries
##
GLITE_LOCATION = $(LIBEXEC)/glite
PBS_GAHP = $(GLITE_LOCATION)/bin/batch_gahp
LSF_GAHP = $(GLITE_LOCATION)/bin/batch_gahp

##
## The location of the wrapper for invoking the Unicore GAHP server
##
UNICORE_GAHP = $(SBIN)/unicore_gahp

##
## The location of the wrapper for invoking the NorduGrid GAHP server
##
NORDUGRID_GAHP = $(SBIN)/nordugrid_gahp

## The location of the CREAM GAHP server
CREAM_GAHP = $(SBIN)/cream_gahp

## Condor-G and CredD can use MyProxy to refresh GSI proxies which are
## about to expire.
#MYPROXY_GET_DELEGATION = /path/to/myproxy-get-delegation

## The location of the Deltacloud GAHP server
DELTACLOUD_GAHP = $(SBIN)/deltacloud_gahp

##
## EC2: Universe = Grid, Grid_Resource = Amazon
##

## The location of the amazon_gahp program, required
AMAZON_GAHP = $(SBIN)/amazon_gahp

## Location of log files, useful for debugging, must be in
## a directory writable by any user, such as /tmp
#
AMAZON_GAHP_LOG = /tmp/AmazonGahpLog.$(USERNAME)

## The number of seconds between status update requests to EC2.  You
## can make this short (5 seconds) if you want Condor to respond
## quickly to instances as they terminate, or you can make it long
## (300 seconds = 5 minutes) if you know your instances will run for
## a while and don't mind a delay between when they stop and when
## Condor responds to them stopping.
GRIDMANAGER_JOB_PROBE_INTERVAL = 300

## As of this writing Amazon EC2 has a hard limit of 20 concurrently
## running instances, so a limit of 20 is imposed so the GridManager
## does not waste its time sending requests that will be rejected.
GRIDMANAGER_MAX_SUBMITTED_JOBS_PER_RESOURCE_AMAZON = 20
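## For example, a pool running many short-lived instances could poll
## EC2 more aggressively (an illustrative, commented-out sketch; more
## frequent probes mean faster cleanup but more EC2 queries):
#GRIDMANAGER_JOB_PROBE_INTERVAL = 30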

##
##
##  condor_credd credential management daemon
##
## Where is the CredD binary installed?
CREDD = $(SBIN)/condor_credd

## When the credd starts up, it can place its address (IP and port)
## into a file.  This way, tools running on the local machine don't
## need an additional "-n host:port" option to locate the credd.  This
## feature can be turned off by commenting out this setting.
CREDD_ADDRESS_FILE = $(LOG)/.credd_address

## Specify a remote credd server here,
#CREDD_HOST = $(CONDOR_HOST):$(CREDD_PORT)

## CredD startup arguments
## Start the CredD on a well-known port.  Uncomment to simplify
## connecting to a remote CredD.  Note that this interface may change
## in a future release.
CREDD_PORT = 9620
CREDD_ARGS = -p $(CREDD_PORT) -f

## CredD daemon debugging log
CREDD_LOG = $(LOG)/CredLog
CREDD_DEBUG = D_FULLDEBUG
MAX_CREDD_LOG = 4000000

## The credential owner submits the credential.  This list specifies
## other users who are also permitted to see all credentials.  Defaults
## to root on Unix systems, and Administrator on Windows systems.
#CRED_SUPER_USERS =

## Credential storage location.  This directory must exist
## prior to starting condor_credd.  It is highly recommended to
## restrict access permissions to _only_ the directory owner.
CRED_STORE_DIR = $(LOCAL_DIR)/cred_dir

## Index file path of saved credentials.
## This file will be automatically created if it does not exist.
CRED_INDEX_FILE = $(CRED_STORE_DIR)/cred-index

## condor_credd will attempt to refresh credentials when their
## remaining lifespan is less than this value.  Units = seconds.
#DEFAULT_CRED_EXPIRE_THRESHOLD = 3600

## condor_credd periodically checks the remaining lifespan of stored
## credentials, at this interval.  Units = seconds.
#CRED_CHECK_INTERVAL = 60
- | + | ||
- | ## | + | |
- | ## | + | |
- | ## Stork data placment server | + | |
- | ## | + | |
- | ## Where is the Stork binary installed? | + | |
- | STORK = $(SBIN)/ | + | |
- | + | ||
- | ## When Stork starts up, it can place it's address (IP and port) | + | |
- | ## into a file. This way, tools running on the local machine don' | + | |
- | ## need an additional "-n host: | + | |
- | ## feature can be turned off by commenting out this setting. | + | |
- | STORK_ADDRESS_FILE = $(LOG)/ | + | |
- | + | ||
- | ## Specify a remote Stork server here, | + | |
- | # | + | |
- | + | ||
- | ## STORK_LOG_BASE specifies the basename for heritage Stork log files. | + | |
- | ## Stork uses this macro to create the following output log files: | + | |
- | ## $(STORK_LOG_BASE): | + | |
- | ## journal file. | + | |
- | ## $(STORK_LOG_BASE).history: | + | |
- | ## $(STORK_LOG_BASE).user_log: | + | |
- | STORK_LOG_BASE = $(LOG)/ | + | |
- | + | ||
- | ## Modern Condor DaemonCore logging feature. | + | |
- | STORK_LOG = $(LOG)/ | + | |
- | STORK_DEBUG = D_FULLDEBUG | + | |
- | MAX_STORK_LOG = 4000000 | + | |
- | + | ||
- | ## Stork startup arguments | + | |
- | ## Start Stork on a well-known port. Uncomment to simplify | + |
- | ## connecting to a remote Stork. | + | |
- | ## in a future release. | + | |
- | # | + | |
- | STORK_PORT = 9621 | + | |
- | STORK_ARGS = -p $(STORK_PORT) -f -Serverlog $(STORK_LOG_BASE) | + | |
- | + | ||
- | ## Stork environment. | + | |
- | ## shared object libraries. | + | |
- | ## LD_LIBRARY_PATH environments. | + | |
- | ## further specific environments. | + | |
- | ## environment when invoked from condor_master or the shell. | + | |
- | ## default environment is not adequate for all Stork modules, specify | + | |
- | ## a replacement environment here. This environment will be set by | + | |
- | ## condor_master before starting Stork, but does not apply if Stork is | + | |
- | ## started directly from the command line. | + | |
- | # | + | |
- | + | ||
- | ## Limits the number of concurrent data placements handled by Stork. | + | |
- | # | + | |
- | + | ||
- | ## Limits the number of retries for a failed data placement. | + | |
- | # | + | |
- | + | ||
- | ## Limits the run time for a data placement job, after which the | + | |
- | ## placement is considered failed. | + | |
- | # | + | |
- | + | ||
- | ## Temporary credential storage directory used by Stork. | + | |
- | # | + | |
- | + | ||
- | ## Directory containing Stork modules. | + | |
- | # | + | |
- | + | ||
- | ## | + | |
- | ## | + | |
- | ## Quill Job Queue Mirroring Server | + | |
- | ## | + | |
- | ## Where is the Quill binary installed and what arguments should be passed? | + | |
- | QUILL = $(SBIN)/ | + | |
- | #QUILL_ARGS = | + | |
- | + | ||
- | # Where is the log file for the quill daemon? | + | |
- | QUILL_LOG = $(LOG)/ | + | |
- | + | ||
- | # The identification and location of the quill daemon for local clients. | + | |
- | QUILL_ADDRESS_FILE = $(LOG)/ | + | |
- | + | ||
- | # If this is set to true, then the rest of the QUILL arguments must be defined | + |
- | # for quill to function. If it is False or left undefined, then quill will not | + |
- | # be consulted by either the scheduler or the tools. A remote quill query will | + |
- | # still work normally, however, when the local client has quill turned off but | + |
- | # the remote client has quill turned on. | + |
- | QUILL_ENABLED = FALSE | + | |
- | + | ||
- | # | + | |
- | # If Quill is enabled, by default it will only mirror the current job | + | |
- | # queue into the database. For historical jobs, and classads from other | + | |
- | # sources, the SQL Log must be enabled. | + | |
- | # | + | |
- | + | ||
- | # | + | |
- | # The SQL Log can be enabled on a per-daemon basis. For example, to collect | + | |
- | # historical job information, | + | |
- | # uncomment these two lines | + | |
- | # | + | |
- | # | + | |
- | + | ||
- | # This will be the name of a quill daemon using this config file. This name | + | |
- | # should not conflict with any other quill name--or schedd name. | + | |
- | #QUILL_NAME = quill@postgresql-server.machine.com | + | |
- | + | ||
- | # The PostgreSQL server requires usernames that can manipulate tables. This will | + |
- | # be the username associated with this instance of the quill daemon mirroring | + | |
- | # a schedd' | + | |
- | # associated with it otherwise multiple quill daemons will corrupt the data | + | |
- | # held under an identical user name. | + |
- | # | + | |
- | + | ||
- | # The required password for the DB user which quill will use to read | + | |
- | # information from the database about the queue. | + | |
- | # | + | |
- | + | ||
- | # What kind of database server is this? | + | |
- | # For now, only PGSQL is supported | + | |
- | # | + | |
- | + | ||
- | # The machine and port of the postgres server. | + | |
- | # Although this says IP Addr, it can be a DNS name. | + | |
- | # It must match whatever format you used for the .pgpass file, however | + | |
- | # | + | |
- | + | ||
- | # The login to use to attach to the database for updating information. | + | |
- | # There should be an entry in file $SPOOL/ | + | |
- | # for this login id. | + | |
- | # | + | |
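- | # A .pgpass entry is one line per server in the standard PostgreSQL | + |
- | # format (the second line below is a placeholder example): | + |
- | #   hostname:port:database:username:password | + |
- | #   dbmachine.example.com:5432:quill:quillwriter:secretpass | + |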
- | + | ||
- | # Polling period, in seconds, for when quill reads transactions out of the | + | |
- | # schedd' | + | |
- | # | + | |
- | + | ||
- | # Allows or disallows a remote query to the quill daemon and database | + | |
- | # which is reading this log file. Defaults to true. | + | |
- | # | + | |
- | + | ||
- | # Add debugging flags to here if you need to debug quill for some reason. | + | |
- | # | + | |
- | + | ||
- | # Number of seconds the master should wait for the Quill daemon to respond | + | |
- | # before killing it. This number might need to be increased for very | + | |
- | # large logfiles. | + | |
- | # The default is 3600 (one hour), but kicking it up to a few hours won't hurt | + | |
- | # | + | |
- | + | ||
- | # Should Quill hold open a database connection to the DBMSD? | + | |
- | # Each open connection consumes resources at the server, so large pools | + | |
- | # (100 or more machines) should set this variable to FALSE. Note the | + | |
- | # default is TRUE. | + | |
- | # | + | |
- | + | ||
- | ## | + | |
- | ## | + | |
- | ## Database Management Daemon settings | + | |
- | ## | + | |
- | ## Where is the DBMSd binary installed and what arguments should be passed? | + | |
- | DBMSD = $(SBIN)/ | + | |
- | DBMSD_ARGS = -f | + | |
- | + | ||
- | # Where is the log file for the quill daemon? | + | |
- | DBMSD_LOG = $(LOG)/ | + | |
- | + | ||
- | # Interval between consecutive purging calls (in seconds) | + | |
- | # | + | |
- | + | ||
- | # Interval between consecutive database reindexing operations | + | |
- | # This is only used when dbtype = PGSQL | + | |
- | # | + | |
- | + | ||
- | # Number of days before purging resource classad history | + | |
- | # This includes things like machine ads, daemon ads, submitters | + | |
- | # | + | |
- | + | ||
- | # Number of days before purging job run information | + | |
- | # This includes job events, file transfers, matchmaker matches, etc | + | |
- | # This does NOT include the final job ad. condor_history does not need | + | |
- | # any of this information to work. | + | |
- | # | + | |
- | + | ||
- | # Number of days before purging job classad history | + | |
- | # This is the information needed to run condor_history | + | |
- | # | + | |
- | + | ||
- | # DB size threshold for warning the condor administrator. This is checked | + | |
- | # after every purge. The size is given in gigabytes. | + | |
- | # | + | |
- | + | ||
- | # Number of seconds the master should wait for the DBMSD to respond before | + | |
- | # killing it. This number might need to be increased for very large databases | + | |
- | # The default is 3600 (one hour). | + | |
- | # | + | |
- | + | ||
- | ## | + | |
- | ## | + | |
- | ## VM Universe Parameters | + | |
- | ## | + | |
- | ## Where is the Condor VM-GAHP installed? (Required) | + | |
- | VM_GAHP_SERVER = $(SBIN)/ | + | |
- | + | ||
- | ## If the VM-GAHP is to have its own log, define | + | |
- | ## the location of log file. | + | |
- | ## | + | |
- | ## Optionally, if you do NOT define VM_GAHP_LOG, | + | |
- | ## be stored in the starter' | + | |
- | ## However, on Windows machine you must always define VM_GAHP_LOG. | + | |
- | # | + | |
- | VM_GAHP_LOG = $(LOG)/ | + | |
- | MAX_VM_GAHP_LOG = 1000000 | + | |
- | # | + | |
- | + | ||
- | ## What kind of virtual machine program will be used for | + | |
- | ## the VM universe? | + | |
- | ## The two options are vmware and xen. (Required) | + | |
- | #VM_TYPE = vmware | + | |
- | + | ||
- | ## How much memory can be used for the VM universe? (Required) | + | |
- | ## This value is the maximum amount of memory that can be used by the | + | |
- | ## virtual machine program. | + | |
- | #VM_MEMORY = 128 | + | |
- | + | ||
- | ## Want to support networking for VM universe? | + | |
- | ## Default value is FALSE | + | |
- | # | + | |
- | + | ||
- | ## What kind of networking types are supported? | + | |
- | ## | + | |
- | ## If you set VM_NETWORKING to TRUE, you must define this parameter. | + | |
- | ## VM_NETWORKING_TYPE = nat | + | |
- | ## VM_NETWORKING_TYPE = bridge | + | |
- | ## VM_NETWORKING_TYPE = nat, bridge | + | |
- | ## | + | |
- | ## If multiple networking types are defined, you may define | + | |
- | ## VM_NETWORKING_DEFAULT_TYPE for default networking type. | + | |
- | ## Otherwise, nat is used for default networking type. | + | |
- | ## VM_NETWORKING_DEFAULT_TYPE = nat | + | |
- | # | + | |
- | # | + | |
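- | ## Example (illustrative): to enable both networking types with nat | + |
- | ## as the default, a configuration might set: | + |
- | ##   VM_NETWORKING = TRUE | + |
- | ##   VM_NETWORKING_TYPE = nat, bridge | + |
- | ##   VM_NETWORKING_DEFAULT_TYPE = nat | + |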
- | + | ||
- | ## By default, the number of possible virtual machines is the same as | + |
- | ## NUM_CPUS. | + | |
- | ## Since too many virtual machines can cause the system to be too slow | + | |
- | ## and lead to unexpected problems, limit the number of running | + | |
- | ## virtual machines on this machine with | + | |
- | # | + | |
- | + | ||
- | ## When a VM universe job is started, a status command is sent | + | |
- | ## to the VM-GAHP to see if the job is finished. | + | |
- | ## If the interval between checks is too short, it will consume | + | |
- | ## too much of the CPU. If the VM-GAHP fails to get status 5 times in a row, | + | |
- | ## an error will be reported to startd, and then startd will check | + | |
- | ## the availability of VM universe. | + | |
- | ## Default value is 60 seconds and minimum value is 30 seconds | + | |
- | # | + | |
- | + | ||
- | ## How long will we wait for a request sent to the VM-GAHP to be completed? | + | |
- | ## If a request is not completed within the timeout, an error will be reported | + | |
- | ## to the startd, and then the startd will check | + | |
- | ## the availability of vm universe. | + | |
- | # | + | |
- | + | ||
- | ## When VMware or Xen causes an error, the startd will disable the | + | |
- | ## VM universe. | + | |
- | ## we will test one more | + | |
- | ## whether vm universe is still unavailable after some time. | + | |
- | ## By default, the startd will recheck the vm universe after 10 minutes. | + |
- | ## If the test also fails, vm universe will be disabled. | + | |
- | # | + | |
- | + | ||
- | ## Usually, when we suspend a VM, the memory being used by the VM | + | |
- | ## will be saved into a file and then freed. | + | |
- | ## However, when we use soft suspend, neither saving nor memory freeing | + | |
- | ## will occur. | + | |
- | ## For VMware, we send SIGSTOP to a process for VM in order to | + | |
- | ## stop the VM temporarily and send SIGCONT to resume the VM. | + | |
- | ## For Xen, we pause CPU. Pausing CPU doesn' | + | |
- | ## into a file. It only stops the execution of a VM temporarily. | + | |
- | # | + | |
- | + | ||
- | ## If Condor runs as root and a job comes from a different UID domain, | + | |
- | ## Condor generally uses " | + | |
- | ## If " | + | |
- | ## as the user defined in " | + | |
- | ## | + | |
- | ## Notice: In VMware VM universe, " | + | |
- | ## So we need to define " | + | |
- | ## For VMware, the user defined in " | + | |
- | ## home directory. | + | |
- | ## If neither " | + | |
- | ## VMware VM universe job will run as " | + | |
- | ## As a result, the preference of local users for a VMware VM universe job | + | |
- | ## which comes from a different UID domain is | + |
- | ## " | + | |
- | # | + | |
- | + | ||
- | ## If Condor runs as root and " | + | |
- | ## all VM universe jobs will run as a user defined in " | + | |
- | # | + | |
- | + | ||
- | ## | + | |
- | ## VM Universe Parameters Specific to VMware | + | |
- | ## | + | |
- | + | ||
- | ## Where is perl program? (Required) | + | |
- | VMWARE_PERL = perl | + | |
- | + | ||
- | ## Where is the Condor script program to control VMware? (Required) | + | |
- | VMWARE_SCRIPT = $(SBIN)/ | + | |
- | + | ||
- | ## Networking parameters for VMware | + | |
- | ## | + | |
- | ## What kind of VMware networking is used? | + | |
- | ## | + | |
- | ## If multiple networking types are defined, you may specify different | + | |
- | ## parameters for each networking type. | + | |
- | ## | + | |
- | ## Examples | + | |
- | ## (e.g.) VMWARE_NAT_NETWORKING_TYPE = nat | + | |
- | ## (e.g.) VMWARE_BRIDGE_NETWORKING_TYPE = bridged | + | |
- | ## | + | |
- | ## If there is no parameter for specific networking type, VMWARE_NETWORKING_TYPE is used. | + | |
- | ## | + | |
- | # | + | |
- | # | + | |
- | VMWARE_NETWORKING_TYPE = nat | + | |
- | + | ||
- | ## The contents of this file will be inserted into the .vmx file of | + | |
- | ## the VMware virtual machine before Condor starts it. | + | |
- | # | + | |
- | + | ||
- | ## | + | |
- | ## VM Universe Parameters common to libvirt controlled vm's (xen & kvm) | + | |
- | ## | + | |
- | + | ||
- | ## Networking parameters for Xen & KVM | + | |
- | ## | + | |
- | ## This is the path to the XML helper command; the libvirt_simple_script.awk | + | |
- | ## script just reproduces what Condor already does for the kvm/xen VM | + | |
- | ## universe | + | |
- | LIBVIRT_XML_SCRIPT = $(LIBEXEC)/ | + | |
- | + | ||
- | ## This is the optional debugging output file for the xml helper | + | |
- | ## script. | + | |
- | ## write them to the file specified by this argument, which will be | + | |
- | ## passed as the second command line argument when the script is | + | |
- | ## executed | + | |
- | + | ||
- | # | + | |
- | + | ||
- | ## | + | |
- | ## VM Universe Parameters Specific to Xen | + | |
- | ## | + | |
- | + | ||
- | ## Where is the bootloader for Xen domainU? (Required) | + |
- | ## | + | |
- | ## The bootloader will be used in the case that a kernel image includes | + | |
- | ## a disk image | + | |
- | # | + | |
- | + | ||
- | ## The contents of this file will be added to the Xen virtual machine | + | |
- | ## description that Condor writes. | + | |
- | # | + | |
- | + | ||
- | ## | + | |
- | ## | + | |
- | ## condor_lease_manager lease manager daemon | + | |
- | ## | + | |
- | ## Where is the LeaseManager binary installed? | + | |
- | LeaseManager = $(SBIN)/ | + | |
- | + | ||
- | # Turn on the lease manager | + | |
- | # | + | |
- | + | ||
- | # The identification and location of the lease manager for local clients. | + | |
- | LeaseManager_ADDRESS_FILE = $(LOG)/ | + |
- | + | ||
- | ## LeaseManager startup arguments | + | |
- | # | + | |
- | + | ||
- | ## LeaseManager daemon debugging log | + | |
- | LeaseManager_LOG = $(LOG)/ | + | |
- | LeaseManager_DEBUG = D_FULLDEBUG | + | |
- | MAX_LeaseManager_LOG = 1000000 | + | |
- | + | ||
- | # Basic parameters | + | |
- | LeaseManager.GETADS_INTERVAL = 60 | + | |
- | LeaseManager.UPDATE_INTERVAL = 300 | + | |
- | LeaseManager.PRUNE_INTERVAL = 60 | + | |
- | LeaseManager.DEBUG_ADS = False | + | |
- | + | ||
- | LeaseManager.CLASSAD_LOG = $(SPOOL)/ | + | |
- | # | + | |
- | # | + | |
- | # | + | |
- | + | ||
- | ## | + | |
- | ## | + | |
- | ## KBDD - keyboard activity detection daemon | + | |
- | ## | + | |
- | ## When the KBDD starts up, it can place its address (IP and port) | + |
- | ## into a file. This way, tools running on the local machine don' | + | |
- | ## need an additional "-n host: | + | |
- | ## feature can be turned off by commenting out this setting. | + | |
- | KBDD_ADDRESS_FILE = $(LOG)/ | + | |
- | + | ||
- | ## | + | |
- | ## | + | |
- | ## condor_ssh_to_job | + | |
- | ## | + | |
- | # NOTE: condor_ssh_to_job is not supported under Windows. | + | |
- | + | ||
- | # Tell the starter (execute side) whether to allow the job owner or | + | |
- | # queue super user on the schedd from which the job was submitted to | + | |
- | # use condor_ssh_to_job to access the job interactively (e.g. for | + | |
- | # debugging). | + | |
- | # | + | |
- | + | ||
- | # Tell the schedd (submit side) whether to allow the job owner or | + | |
- | # queue super user to use condor_ssh_to_job to access the job | + | |
- | # interactively (e.g. for debugging). | + | |
- | # defined. | + | |
- | # | + | |
- | + | ||
- | # Command condor_ssh_to_job should use to invoke the ssh client. | + | |
- | # %h --> remote host | + | |
- | # %i --> ssh key file | + | |
- | # %k --> known hosts file | + | |
- | # %u --> remote user | + | |
- | # %x --> proxy command | + | |
- | # %% --> % | + | |
- | # | + | |
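- | # Example override (illustrative; assumes the placeholder expansions | + |
- | # listed above): invoke ssh verbosely while debugging a connection: | + |
- | #   SSH_TO_JOB_SSH_CMD = "ssh -v -oUser=%u -oIdentityFile=%i \ | + |
- | #       -oUserKnownHostsFile=%k -oProxyCommand=%x %h" | + |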
- | + | ||
- | # Additional ssh clients may be configured. | + | |
- | # default as ssh, except for scp, which omits the %h: | + | |
- | # | + | |
- | + | ||
- | # Path to sshd | + | |
- | # | + | |
- | + | ||
- | # Arguments the starter should use to invoke sshd in inetd mode. | + | |
- | # %f --> sshd config file | + | |
- | # %% --> % | + | |
- | # | + | |
- | + | ||
- | # sshd configuration template used by condor_ssh_to_job_sshd_setup. | + | |
- | # | + | |
- | + | ||
- | # Path to ssh-keygen | + | |
- | # | + | |
- | + | ||
- | # Arguments to ssh-keygen | + | |
- | # %f --> key file to generate | + | |
- | # %% --> % | + | |
- | # | + | |
- | + | ||
- | ###################################################################### | + | |
- | ## | + | |
- | ## Condor HDFS | + | |
- | ## | + | |
- | ## This is the default local configuration file for configuring Condor | + | |
- | ## daemon responsible for running services related to hadoop | + | |
- | ## distributed storage system. You should copy this file to the | + |
- | ## appropriate location and customize it for your needs. | + | |
- | ## | + | |
- | ## Unless otherwise specified, settings that are commented out show | + | |
- | ## the defaults that are used if you don't define a value. | + | |
- | ## that are defined here MUST BE DEFINED since they have no default | + | |
- | ## value. | + | |
- | ## | + | |
- | ###################################################################### | + | |
- | + | ||
- | ###################################################################### | + | |
- | ## FOLLOWING MUST BE CHANGED | + | |
- | ###################################################################### | + | |
- | + | ||
- | ## The location of the hadoop installation directory. The default location | + |
- | ## is under ' | + | |
- | ## should contain a lib folder that contains all the required Jars necessary | + | |
- | ## to run HDFS name and data nodes. | + | |
- | #HDFS_HOME = $(RELEASE_DIR)/ | + | |
- | + | ||
- | ## The host and port for hadoop' | + | |
- | ## name node (see HDFS_SERVICES) then the specified port will be used | + | |
- | ## to run name node. | + | |
- | HDFS_NAMENODE = hdfs:// | + | |
- | HDFS_NAMENODE_WEB = example.com: | + | |
- | + | ||
- | HDFS_BACKUPNODE = hdfs:// | + | |
- | HDFS_BACKUPNODE_WEB = example.com: | + | |
- | + | ||
- | ## You need to pick one machine as name node by setting this parameter | + | |
- | ## to HDFS_NAMENODE. The remaining machines in a storage cluster will | + | |
- | ## act as data nodes (HDFS_DATANODE). | + | |
- | HDFS_NODETYPE = HDFS_DATANODE | + | |
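- | ## Example (illustrative): on the single machine chosen as the name | + |
- | ## node, set instead: | + |
- | ##   HDFS_NODETYPE = HDFS_NAMENODE | + |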
- | + | ||
- | ## If the machine is selected to be the NameNode then a role should be | + |
- | ## defined. If it is selected to be a DataNode then this parameter is ignored. | + |
- | ## Available options: | + | |
- | ## ACTIVE: Active NameNode role (default value) | + | |
- | ## BACKUP: Always synchronized with the active NameNode state, thus | + | |
- | ## | + | |
- | ## | + | |
- | ## CHECKPOINT: Periodically creates checkpoints of the namespace. | + | |
- | HDFS_NAMENODE_ROLE = ACTIVE | + | |
- | + | ||
- | ## The two set of directories that are required by HDFS are for name | + | |
- | ## node (HDFS_NAMENODE_DIR) and data node (HDFS_DATANODE_DIR). The | + | |
- | ## directory for name node is only required for a machine running | + | |
- | ## name node service and is used to store critical meta data for | + | |
- | ## files. The data node needs its directory to store file blocks and | + | |
- | ## their replicas. | + | |
- | HDFS_NAMENODE_DIR = / | + | |
- | HDFS_DATANODE_DIR = / | + | |
- | + | ||
- | ## Unlike name node address settings (HDFS_NAMENODE), | + | |
- | ## well known across the storage cluster, a data node can run on any | + |
- | ## arbitrary port of a given host. | + |
- | # | + | |
- | + | ||
- | #################################################################### | + | |
- | ## OPTIONAL | + | |
- | ##################################################################### | + | |
- | + | ||
- | ## Sets the log4j debug level. All the emitted debug output from HDFS | + | |
- | ## will go in ' | + | |
- | # | + | |
- | + | ||
- | ## The access to HDFS services both name node and data node can be | + | |
- | ## restricted by specifying IP/host based filters. By default settings | + | |
- | ## from ALLOW_READ/ | + | |
- | ## are used to specify allow and deny list. The below two parameters can | + | |
- | ## be used to override these settings. Read the Condor manual for | + | |
- | ## specification of these filters. | + | |
- | ## WARN: HDFS doesn' | + | |
- | # | + | |
- | # | + | |
- | + | ||
- | # Fully qualified names for the name node and data node classes. | + |
- | # | + | |
- | # | + | |
- | # | + | |
- | + | ||
- | ## In case an old name for hdfs configuration files is required. | + | |
- | # | + | |
=====Central Manager Shared Configuration File===== | =====Central Manager Shared Configuration File===== | ||
- | <file autoconf condor_config_manager.shared># | + | [[http://condor.cs.wlu.edu/condor/ |
- | # the machine' | + | |
- | DAEMON_LIST = COLLECTOR, MASTER, NEGOTIATOR, SCHEDD, STARTD, KBDD | + | |
- | + | ||
- | ## | + | |
- | ## condor_collector | + | |
- | ## | + | |
- | ## Address to which Condor will send a weekly e-mail with output of | + | |
- | ## condor_status. | + | |
- | CONDOR_DEVELOPERS = condor-admin@cs.wisc.edu</file> | + | |
=====Worker Shared Configuration File===== | =====Worker Shared Configuration File===== | ||
- | <file autoconf | + | [[http:// |
- | # the machine' | + | |
- | DAEMON_LIST = MASTER, STARTD, KBDD | + | |
- | ## | ||
- | ## condor_collector | ||
- | ## | ||
- | ## Address to which Condor will send a weekly e-mail with output of | ||
- | ## condor_status. | ||
- | # Don't send weekly statistics emails.
- | # The central manager will do that. | ||
- | CONDOR_DEVELOPERS = NONE</ |