====== Differences ======
This shows you the differences between two versions of the page ''condor:installation:configuration''.

  * Previous revision: 2011/07/14 13:53 (created, configuration files added) by garrettheath4
  * Current revision: 2011/08/19 18:52 by garrettheath4
====== Condor Configuration ======
This page contains copies of the configuration files we use on our system.

===== Global Configuration File =====
[[http://condor.cs.wlu.edu/…]]

===== Central Manager Shared Configuration File =====
[[http://…]]

===== Worker Shared Configuration File =====
[[http://…]]

The previous revision embedded the global configuration file (''condor_config'') inline:

<file autoconf condor_config>
######################################################################
##
##  condor_config
##
##  This is the global configuration file for condor.  Any settings
##  made here may potentially be overridden in the local configuration
##  file.  KEEP THAT IN MIND!  To double-check that a variable is
##  getting set from the configuration file that you expect, use
##  condor_config_val -v <variable name>
##
##  The file is divided into four main parts:
##  Part 1:  Settings you likely want to customize
##  Part 2:  Settings you may want to customize
##  Part 3:  Settings that control the policy of when condor will
##           start and stop jobs on your machines
##  Part 4:  Settings you should probably leave alone (unless you
##           know what you're doing)
##
##  Please read the INSTALL file (or the Install chapter in the
##  Condor Administrator's Manual) for detailed explanations of the
##  various settings in here and possible ways to configure your
##  pool.
##
##  Unless otherwise specified, settings that are commented out show
##  the defaults that are used if you don't define a value.  Settings
##  that are defined here MUST BE DEFINED since they have no default
##  value.
##
##  Unless otherwise indicated, all settings which specify a time are
##  defined in seconds.
##
######################################################################

######################################################################
######################################################################
##
##  Part 1:  Settings you likely want to customize:
######################################################################
######################################################################

##  What machine is your central manager?

##--------------------------------------------------------------------
##  Pathnames:
##--------------------------------------------------------------------

##  Where have you installed the bin, sbin and lib condor directories?
RELEASE_DIR = /…

##  Where is the local condor directory for each host?
##  This is where the local config file(s), logs and
##  spool/execute directories are located.
LOCAL_DIR = $(TILDE)
#LOCAL_DIR = $(RELEASE_DIR)/hosts/$(HOSTNAME)

##  Where is the machine-specific local config file for each host?
LOCAL_CONFIG_FILE = /…

##  Where are optional machine-specific local config files located?
##  Config files are included in lexicographic order.
LOCAL_CONFIG_DIR = $(LOCAL_DIR)/…
#LOCAL_CONFIG_DIR = $(LOCAL_DIR)/config

##  Blacklist for file processing in the LOCAL_CONFIG_DIR
#LOCAL_CONFIG_DIR_EXCLUDE_REGEXP = ^((\..*)|(.*~)|(#.*)|(.*\.rpmsave)|(.*\.rpmnew))$

##  If the local config file is not present, is it an error?
##  WARNING: This is a potential security issue.
##  If not specified, the default is True.
#REQUIRE_LOCAL_CONFIG_FILE = True

##--------------------------------------------------------------------
##  Mail parameters:
##--------------------------------------------------------------------

##  When something goes wrong with condor at your site, who should get
##  the email?
CONDOR_ADMIN = kollerg14@mail.wlu.edu

##  Full path to a mail delivery program that understands that "-s"
##  means you want to specify a subject:
MAIL = /bin/mail

##--------------------------------------------------------------------
##  Network domain parameters:
##--------------------------------------------------------------------

##  Internet domain of machines sharing a common UID space.  If your
##  machines don't share a common UID space, set it to
##  UID_DOMAIN = $(FULL_HOSTNAME)
##  to specify that each machine has its own UID space.
UID_DOMAIN = cs.wlu.edu

##  Internet domain of machines sharing a common file system.
##  If your machines don't use a network file system, set it to
##  FILESYSTEM_DOMAIN = $(FULL_HOSTNAME)
##  to specify that each machine has its own file system.
FILESYSTEM_DOMAIN = cs.wlu.edu

##  This macro is used to specify a short description of your pool.
##  It should be about 20 characters long.  For example, the name of
##  the UW-Madison Computer Science Condor Pool is ``UW-Madison CS''.
COLLECTOR_NAME = Orion
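##  (Illustrative note, not part of the original file: any value set
##  above can be checked on a host with the condor_config_val tool
##  mentioned in the file header.  For example, the commented-out
##  commands below would print UID_DOMAIN, and COLLECTOR_NAME along
##  with which config file defined it.)
#  condor_config_val UID_DOMAIN
#  condor_config_val -v COLLECTOR_NAME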

######################################################################
######################################################################
##
##  Part 2:  Settings you may want to customize:
##  (it is generally safe to leave these untouched)
######################################################################
######################################################################

##  The user/group ID <uid>.<gid> of the "Condor" user.
##  (this can also be specified in the environment)
##  Note: the CONDOR_IDS setting is ignored on Win32 platforms
#CONDOR_IDS = x.x

##--------------------------------------------------------------------
##  Flocking: Submitting jobs to more than one pool
##--------------------------------------------------------------------
##  Flocking allows you to run your jobs in other pools, or lets
##  others run jobs in your pool.
##
##  To let others flock to you, define FLOCK_FROM.
##
##  To flock to others, define FLOCK_TO.

##  FLOCK_FROM defines the machines where you would like to grant
##  people access to your pool via flocking.  (i.e. you are granting
##  access to these machines to join your pool).
FLOCK_FROM = *.cs.wlu.edu
##  An example of this is:
#FLOCK_FROM = somehost.friendly.domain, anotherhost.friendly.domain

##  FLOCK_TO defines the central managers of the pools that you want
##  to flock to.  (i.e. you are specifying the machines that you
##  want your jobs to be negotiated at -- thereby specifying the
##  pools they will run in.)
FLOCK_TO =
##  An example of this is:
#FLOCK_TO = central_manager.friendly.domain, condor.cs.wisc.edu

##  FLOCK_COLLECTOR_HOSTS should almost always be the same as
##  FLOCK_NEGOTIATOR_HOSTS (as shown below).  The only reason it would be
##  different is if the collector and negotiator in the pool that you are
##  flocking to are running on different machines (not recommended).
##  The collectors must be specified in the same corresponding order as
##  the FLOCK_NEGOTIATOR_HOSTS list.
FLOCK_NEGOTIATOR_HOSTS = $(FLOCK_TO)
FLOCK_COLLECTOR_HOSTS = $(FLOCK_TO)
##  An example of having the negotiator and the collector on different
##  machines is:
#FLOCK_NEGOTIATOR_HOSTS = condor.cs.wisc.edu, condor-negotiator.friendly.domain
#FLOCK_COLLECTOR_HOSTS = condor.cs.wisc.edu, condor-collector.friendly.domain
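##  (Illustrative example, not part of the original file: to flock
##  jobs from this pool to a hypothetical pool whose central manager
##  is cm.other.example.edu, the matched pair of commented settings
##  below would be used; the remote pool would also need to list our
##  machines in its own FLOCK_FROM and ALLOW_WRITE settings before
##  any jobs could actually run there.)
#FLOCK_TO = cm.other.example.edu
#FLOCK_NEGOTIATOR_HOSTS = $(FLOCK_TO)
#FLOCK_COLLECTOR_HOSTS = $(FLOCK_TO)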

##--------------------------------------------------------------------
##  Host/IP access levels
##--------------------------------------------------------------------
##  Please see the administrator's manual for details on these
##  settings, what they're for, and how to use them.

##  What machines have administrative rights for your pool?  This
##  defaults to your central manager.  You should set it to the
##  machine(s) where whoever is the condor administrator(s) works
##  (assuming you trust all the users who log into that/those
##  machine(s), since this is machine-wide access you're granting).
ALLOW_ADMINISTRATOR = $(CONDOR_HOST)

##  If there are no machines that should have administrative access
##  to your pool (for example, there's no machine where only trusted
##  users have accounts), you can uncomment this setting.
##  Unfortunately, this will mean that administering your pool will
##  be more difficult.
#DENY_ADMINISTRATOR = *

##  What machines should have "owner" access to your machines, meaning
##  they can issue commands that a machine owner should be able to
##  issue to their own machine (like condor_vacate).  This defaults to
##  machines with administrator access, and the local machine.  This
##  is probably what you want.
ALLOW_OWNER = $(FULL_HOSTNAME), $(ALLOW_ADMINISTRATOR)

##  Read access.  Machines listed as allow (and not deny)
##  can view the status of your pool, but cannot join your pool
##  or run jobs.
##  NOTE: By default, without these entries customized, you
##  are granting read access to the whole world.  You may want to
##  restrict that to hosts in your domain.  If possible, please also
##  grant read access to "*.cs.wisc.edu", so the Condor developers
##  will be able to view the status of your pool and more easily help
##  you install, configure or debug your Condor installation.
##  It is important to have this defined.
ALLOW_READ = *.cs.wlu.edu
#ALLOW_READ = *.your.domain, *.cs.wisc.edu
#DENY_READ = *.bad.subnet, bad-machine.your.domain, 144.77.88.*

##  Write access.  Machines listed here can join your pool, submit
##  jobs, etc.  Note: Any machine which has WRITE access must
##  also be granted READ access.  Granting WRITE access below does
##  not also automatically grant READ access; you must change
##  ALLOW_READ above as well.
##
##  You must set this to something else before Condor will run.
##  The most simple option is:
##    ALLOW_WRITE = *
##  but note that this will allow anyone to submit jobs or add
##  machines to your pool and is a serious security risk.

ALLOW_WRITE = $(FULL_HOSTNAME), …
#ALLOW_WRITE = *.your.domain, your-friend's-machine.other.domain
#DENY_WRITE = bad-machine.your.domain

##  Are you upgrading to a new version of Condor and confused about
##  why the above ALLOW_WRITE setting is causing Condor to refuse to
##  start up?  If you are upgrading from a configuration that uses
##  HOSTALLOW/HOSTDENY settings, you should convert all uses of the
##  former to the latter.  The syntax of the authorization settings
##  is identical.  The ALLOW/DENY settings support unauthenticated
##  IP-based authorization as well as authenticated user-based
##  authorization.  The old HOSTALLOW/HOSTDENY settings may be removed
##  in the future.

##  Negotiator access.  Machines listed here are trusted central
##  managers.  You should normally not have to change this.
ALLOW_NEGOTIATOR = $(CONDOR_HOST)
##  Now, with flocking we need to let the SCHEDD trust the other
##  negotiators we are flocking with as well.  You should normally
##  not have to change this either.
ALLOW_NEGOTIATOR_SCHEDD = $(CONDOR_HOST), $(FLOCK_NEGOTIATOR_HOSTS)

##  Config access.  Machines listed here can use the condor_config_val
##  tool to modify all daemon configurations.  This level of host-wide
##  access should only be granted with extreme caution.  By default,
##  config access is denied from all hosts.
#ALLOW_CONFIG = trusted-host.your.domain

##  Flocking Configs.  These are the real things that Condor looks at,
##  but we set them from the FLOCK_FROM/TO macros above.  It is safe
##  to leave these unchanged.
ALLOW_WRITE_COLLECTOR = $(ALLOW_WRITE), $(FLOCK_FROM)
ALLOW_WRITE_STARTD    = $(ALLOW_WRITE), $(FLOCK_FROM)
ALLOW_READ_COLLECTOR  = $(ALLOW_READ), $(FLOCK_FROM)
ALLOW_READ_STARTD     = $(ALLOW_READ), $(FLOCK_FROM)

##--------------------------------------------------------------------
##  Security parameters for setting configuration values remotely:
##--------------------------------------------------------------------
##  These parameters define the list of attributes that can be set
##  remotely with condor_config_val for the security access levels
##  defined above (for example, WRITE, ADMINISTRATOR, CONFIG, etc).
##  Please see the administrator's manual for further details on these
##  settings, what they're for, and how to use them.  There are no
##  default values for any of these settings.  If they are not
##  defined, no attributes can be set with condor_config_val.

##  Do you want to allow condor_config_val -rset to work at all?
##  This feature is disabled by default, so to enable, you must
##  uncomment the following setting and change the value to "True".
##  Note: changing this requires a restart not just a reconfig.
#ENABLE_RUNTIME_CONFIG = False

##  Do you want to allow condor_config_val -set to work at all?
##  This feature is disabled by default, so to enable, you must
##  uncomment the following setting and change the value to "True".
##  Note: changing this requires a restart not just a reconfig.
#ENABLE_PERSISTENT_CONFIG = False

##  Directory where daemons should write persistent config files (used
##  to support condor_config_val -set).  This directory should *only*
##  be writable by root (or the user the Condor daemons are running as
##  if non-root).
##  Note: changing this requires a restart not just a reconfig.
#PERSISTENT_CONFIG_DIR = /full/path/to/root-only/local/directory

##  Attributes that can be set by hosts with "CONFIG" permission (as
##  defined with ALLOW_CONFIG and DENY_CONFIG above).
##  The commented-out value here was the default behavior of Condor
##  prior to version 6.3.3.  If you don't need this behavior, you
##  should leave this commented out.
#SETTABLE_ATTRS_CONFIG = *

##  Attributes that can be set by hosts with "ADMINISTRATOR"
##  permission (as defined above)
#SETTABLE_ATTRS_ADMINISTRATOR = *_DEBUG, MAX_*_LOG

##  Attributes that can be set by hosts with "OWNER" permission (as
##  defined above) NOTE: any Condor job running on a given host will
##  have OWNER permission on that host by default.  If you grant this
##  kind of access, Condor jobs will be able to modify any attributes
##  you list below on the machine where they are running.  This has
##  obvious security implications, so only grant this kind of
##  permission for custom attributes that you define for your own use
##  at your pool (custom attributes about your machines that are
##  published with the STARTD_ATTRS setting, for example).
#SETTABLE_ATTRS_OWNER = your_custom_attribute, another_custom_attr

##  You can also define daemon-specific versions of each of these
##  settings.  For example, to define settings that can only be
##  changed in the condor_startd's configuration by hosts with OWNER
##  permission, you would use:
#STARTD_SETTABLE_ATTRS_OWNER = your_custom_attribute_name

##--------------------------------------------------------------------
##  Network filesystem parameters:
##--------------------------------------------------------------------
##  Do you want to use NFS for file access instead of remote system
##  calls?
#USE_NFS = False

##  Do you want to use AFS for file access instead of remote system
##  calls?
#USE_AFS = False

##--------------------------------------------------------------------
##  Checkpoint server:
##--------------------------------------------------------------------
##  Do you want to use a checkpoint server if one is available?  If a
##  checkpoint server isn't available or USE_CKPT_SERVER is set to
##  False, checkpoints will be written to the local SPOOL directory on
##  the submission machine.
#USE_CKPT_SERVER = True

##  What's the hostname of this machine's nearest checkpoint server?
#CKPT_SERVER_HOST = checkpoint-server-hostname.your.domain

##  Do you want the starter on the execute machine to choose the
##  checkpoint server?  If False, the CKPT_SERVER_HOST set on
##  the submit machine is used.  Otherwise, the CKPT_SERVER_HOST set
##  on the execute machine is used.  The default is true.
#STARTER_CHOOSES_CKPT_SERVER = True

##--------------------------------------------------------------------
##  Miscellaneous:
##--------------------------------------------------------------------
##  Try to save this much swap space by not starting new shadows.
##  Specified in megabytes.
#RESERVED_SWAP = 0

##  What's the maximum number of jobs you want a single submit machine
##  to spawn shadows for?  The default is a function of $(DETECTED_MEMORY)
##  and a guess at the number of ephemeral ports available.

##  Example 1:
#MAX_JOBS_RUNNING = 10000

##  Example 2:
##  This is more complicated, but it produces the same limit as the default.
##  First define some expressions to use in our calculation.
##  Assume we can use up to 80% of memory and estimate shadow private data
##  size of 800k.
#MAX_SHADOWS_MEM = ceiling($(DETECTED_MEMORY)*0.8*1024/800)
##  Assume we can use ~21,000 ephemeral ports (avg ~2.1 per shadow).
##  Under Linux, the range is set in /proc/sys/net/ipv4/ip_local_port_range.
#MAX_SHADOWS_PORTS = 10000
##  Under windows, things are much less scalable, currently.
##  Note that this can probably be safely increased a bit under 64-bit windows.
#MAX_SHADOWS_OPSYS = ifThenElse(regexps("WIN.*","$(OPSYS)"),200,100000)
##  Now build up the expression for MAX_JOBS_RUNNING.  This is complicated
##  due to lack of a min() function.
#MAX_JOBS_RUNNING = $(MAX_SHADOWS_MEM)
#MAX_JOBS_RUNNING = \
#  ifThenElse( $(MAX_SHADOWS_PORTS) < $(MAX_JOBS_RUNNING), \
#    $(MAX_SHADOWS_PORTS), \
#    $(MAX_JOBS_RUNNING) )
#MAX_JOBS_RUNNING = \
#  ifThenElse( $(MAX_SHADOWS_OPSYS) < $(MAX_JOBS_RUNNING), \
#    $(MAX_SHADOWS_OPSYS), \
#    $(MAX_JOBS_RUNNING) )

##  Maximum number of simultaneous downloads of output files from
##  execute machines to the submit machine (limit applied per schedd).
##  The value 0 means unlimited.
#MAX_CONCURRENT_DOWNLOADS = 10

##  Maximum number of simultaneous uploads of input files from the
##  submit machine to execute machines (limit applied per schedd).
##  The value 0 means unlimited.
#MAX_CONCURRENT_UPLOADS = 10
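##  (Illustrative arithmetic, not part of the original file: on a
##  hypothetical submit machine with 8 GB (8192 MB) of detected
##  memory, the memory bound above works out to
##  ceiling(8192 * 0.8 * 1024 / 800) = 8389 shadows, while the port
##  bound stays at 10000, so the memory bound would be the effective
##  MAX_JOBS_RUNNING limit on that host.)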

##  Condor needs to create a few lock files to synchronize access to
##  various log files.  Because of problems we've had with network
##  filesystems and file locking over the years, we HIGHLY recommend
##  that you put these lock files on a local partition on each
##  machine.  If you don't have your LOCAL_DIR on a local partition,
##  be sure to change this entry.  Whatever user (or group) condor is
##  running as needs to have write access to this directory.  If
##  you're not running as root, this is whatever user you started up
##  the condor_master as.  If you are running as root, and there's a
##  condor account, it's probably condor.  Otherwise, it's whatever
##  you've set in the CONDOR_IDS environment variable.  See the Admin
##  manual for details on this.
LOCK = $(LOG)

##  If you don't use a fully qualified name in your /etc/hosts file
##  (or NIS, etc.) for either your official hostname or as an alias,
##  Condor wouldn't normally be able to use fully qualified names in
##  places that it'd like to.  You can set this parameter to the
##  domain you'd like appended to your hostname, if changing your host
##  information isn't a good option.  This parameter must be set in
##  the global config file (not the LOCAL_CONFIG_FILE from above).
#DEFAULT_DOMAIN_NAME = your.domain.name

##  If you don't have DNS set up, Condor will normally fail in many
##  places because it can't resolve hostnames to IP addresses and
##  vice-versa.  If you enable this option, Condor will use
##  pseudo-hostnames constructed from a machine's IP address and the
##  DEFAULT_DOMAIN_NAME.  Both NO_DNS and DEFAULT_DOMAIN must be set in
##  your top-level config file for this mode of operation to work
##  properly.
#NO_DNS = True

##  Condor can be told whether or not you want the Condor daemons to
##  create a core file if something really bad happens.  This just
##  sets the resource limit for the size of a core file.  By default,
##  we don't do anything, and leave in place whatever limit was in
##  effect when you started the Condor daemons.  If this parameter is
##  set and "True", we increase the limit to as large as it gets.  If
##  it's set to "False", we set the limit at 0 (which means that no
##  core files are even created).  Core files greatly help the Condor
##  developers debug any problems you might be having.
#CREATE_CORE_FILES = True

##  When Condor daemons detect a fatal internal exception, they
##  normally log an error message and exit.  If you have turned on
##  CREATE_CORE_FILES, in some cases you may also want to turn on
##  ABORT_ON_EXCEPTION so that core files are generated when an
##  exception occurs.  Set the following to True if that is what you
##  want.
#ABORT_ON_EXCEPTION = False

##  Condor Glidein downloads binaries from a remote server for the
##  machines into which you're gliding.  This saves you from manually
##  downloading and installing binaries for every architecture you
##  might want to glidein to.  The default server is one maintained at
##  The University of Wisconsin.  If you don't want to use the UW
##  server, you can set up your own and change the following to
##  point to it, instead.
GLIDEIN_SERVER_URLS = \
  http://www.cs.wisc.edu/condor/glidein/binaries

##  List the sites you want to GlideIn to on the GLIDEIN_SITES.  For example,
##  if you'd like to GlideIn to some Alliance GiB resources,
##  uncomment the line below.
##  Make sure that $(GLIDEIN_SITES) is included in ALLOW_READ and
##  ALLOW_WRITE, or else your GlideIns won't be able to join your pool.
##  This is _NOT_ done for you by default, because it is an even better
##  idea to use a strong security method (such as GSI) rather than
##  host-based security for authorizing glideins.
#GLIDEIN_SITES = *.ncsa.uiuc.edu, *.cs.wisc.edu
#GLIDEIN_SITES =

##  If your site needs to use UID_DOMAIN settings (defined above) that
##  are not real Internet domains that match the hostnames, you can
##  tell Condor to trust whatever UID_DOMAIN a submit machine gives to
##  the execute machine and just make sure the two strings match.  The
##  default for this setting is False, since it is more secure this
##  way.
#TRUST_UID_DOMAIN = False

##  If you would like to be informed in near real-time via condor_q when
##  a vanilla/standard/java job is in a suspension state, set this attribute to
##  TRUE.  However, this real-time update of the condor_schedd by the shadows
##  could cause performance issues if there are thousands of concurrently
##  running vanilla/standard/java jobs under a single condor_schedd and they
##  are allowed to suspend and resume.
#REAL_TIME_JOB_SUSPEND_UPDATES = False

##  A standard universe job can perform arbitrary shell calls via the
##  libc 'system()' function.  This function call is routed back to the shadow
##  which performs the actual system() invocation in the initialdir of the
##  running program and as the user who submitted the job.  However, since the
##  user job can request ARBITRARY shell commands to be run by the shadow, this
##  is a generally unsafe practice.  This should only be made available if it is
##  actually needed.  If this attribute is not defined, then it is the same as
##  it being defined to False.  Set it to True to allow the shadow to execute
##  arbitrary shell code from the user job.
#SHADOW_ALLOW_UNSAFE_REMOTE_EXEC = False

##  KEEP_OUTPUT_SANDBOX is an optional feature to tell Condor-G to not
##  remove the job spool when the job leaves the queue.  To use, it must be
##  set to TRUE.  Since you will be operating Condor-G in this manner,
##  you may want to put leave_in_queue = false in your job submit
##  description files, to tell Condor-G to simply remove the job from
##  the queue immediately when the job completes (since the output files
##  will stick around no matter what).
#KEEP_OUTPUT_SANDBOX = False

##  This setting tells the negotiator to ignore user priorities.  This
##  avoids problems where jobs from different users won't run when using
##  condor_advertise instead of a full-blown startd (some of the user
##  priority system in Condor relies on information from the startd --
##  we will remove this reliance when we support the user priority
##  system for grid sites in the negotiator; for now, this setting will
##  just disable it).
#NEGOTIATOR_IGNORE_USER_PRIORITIES = False

##  This is a list of libraries containing ClassAd plug-in functions.
#CLASSAD_USER_LIBS =

##  This setting tells Condor whether to delegate or copy GSI X509
##  credentials when sending them over the wire between daemons.
##  Delegation can take up to a second, which is very slow when
##  submitting a large number of jobs.  Copying exposes the credential
##  to third parties if Condor isn't set to encrypt communications.
##  By default, Condor will delegate rather than copy.
#DELEGATE_JOB_GSI_CREDENTIALS = True

##  This setting controls whether Condor delegates a full or limited
##  X509 credential for jobs.  Currently, this only affects grid-type
##  gt2 grid universe jobs.  The default is False.
#DELEGATE_FULL_JOB_GSI_CREDENTIALS = False

##  This setting controls the default behaviour for the spooling of files
##  into, or out of, the Condor system by such tools as condor_submit
##  and condor_transfer_data.  Here is the list of valid settings for this
##  parameter and what they mean:
##
##    stm_use_schedd_only
##      Ask the condor_schedd to solely store/retrieve the sandbox.
##
##    stm_use_transferd
##      Ask the condor_schedd for a location of a condor_transferd, then
##      store/retrieve the sandbox from the transferd itself.
##
##  The allowed values are case insensitive.
##  The default of this parameter if not specified is: stm_use_schedd_only
#SANDBOX_TRANSFER_METHOD = stm_use_schedd_only

##  This setting specifies an IP address that depends on the setting of
##  BIND_ALL_INTERFACES.  If BIND_ALL_INTERFACES is True (the default),
##  this variable controls what IP address will be advertised as the public
##  address of the daemon.  If BIND_ALL_INTERFACES is False, then this variable
##  specifies which IP address to bind network sockets to.  If
##  BIND_ALL_INTERFACES is False and NETWORK_INTERFACE is not defined, Condor
##  chooses a network interface automatically.  It tries to choose a public
##  interface if one is available.  If it cannot decide which of two interfaces
##  to choose from, it will pick the first one.
#NETWORK_INTERFACE =

##--------------------------------------------------------------------
##  Settings that control the daemon's debugging output:
##--------------------------------------------------------------------

##
##  The flags given in ALL_DEBUG are shared between all daemons.
##

ALL_DEBUG =

MAX_COLLECTOR_LOG = 1000000
COLLECTOR_DEBUG =

MAX_KBDD_LOG = 1000000
KBDD_DEBUG =

MAX_NEGOTIATOR_LOG = 1000000
NEGOTIATOR_DEBUG = D_MATCH
MAX_NEGOTIATOR_MATCH_LOG = 1000000

MAX_SCHEDD_LOG = 1000000
SCHEDD_DEBUG = D_PID

MAX_SHADOW_LOG = 1000000
SHADOW_DEBUG =

MAX_STARTD_LOG = 1000000
STARTD_DEBUG =

MAX_STARTER_LOG = 1000000

MAX_MASTER_LOG = 1000000
MASTER_DEBUG =
##  When the master starts up, should it truncate its log file?
#TRUNC_MASTER_LOG_ON_OPEN = False

MAX_JOB_ROUTER_LOG = 1000000
JOB_ROUTER_DEBUG =

MAX_ROOSTER_LOG = 1000000
ROOSTER_DEBUG =

MAX_SHARED_PORT_LOG = 1000000
SHARED_PORT_DEBUG =

MAX_HDFS_LOG = 1000000
HDFS_DEBUG =

# High Availability Logs
MAX_HAD_LOG = 1000000
HAD_DEBUG =
MAX_REPLICATION_LOG = 1000000
REPLICATION_DEBUG =
MAX_TRANSFERER_LOG = 1000000
TRANSFERER_DEBUG =

##  The daemons touch their log file periodically, even when they have
##  nothing to write.  When a daemon starts up, it prints the last time
##  the log file was modified.  This lets you estimate when a previous
##  instance of a daemon stopped running.  This parameter controls how often
##  the daemons touch the file (in seconds).
#TOUCH_LOG_INTERVAL = 60

######################################################################
######################################################################
##
##  Part 3:  Settings that control the policy for running, stopping,
##  and periodically checkpointing condor jobs:
######################################################################
######################################################################

##  This section contains macros that are here to help write legible
##  expressions:
MINUTE = 60
HOUR = (60 * $(MINUTE))
StateTimer = (time() - EnteredCurrentState)
ActivityTimer = (time() - EnteredCurrentActivity)
ActivationTimer = ifThenElse(JobStart =!= UNDEFINED, (time() - JobStart), 0)
LastCkpt = (time() - LastPeriodicCheckpoint)
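##  (Illustrative example, not part of the original file: these macros
##  expand textually wherever they are referenced, so a policy
##  expression such as the commented line below would mark for
##  preemption any job whose current activation has been running for
##  more than four hours.)
#PREEMPT = ($(ActivationTimer) > (4 * $(HOUR)))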

##  The JobUniverse attribute is just an int.  These macros can be
##  used to specify the universe in a human-readable way:
STANDARD = 1
VANILLA = 5
MPI = 8
VM = 13
IsMPI = (TARGET.JobUniverse == $(MPI))
IsVanilla = (TARGET.JobUniverse == $(VANILLA))
IsStandard = (TARGET.JobUniverse == $(STANDARD))
IsVM = (TARGET.JobUniverse == $(VM))

NonCondorLoadAvg = (LoadAvg - CondorLoadAvg)
BackgroundLoad = 0.3
HighLoad = 0.5
StartIdleTime = 15 * $(MINUTE)
ContinueIdleTime = 5 * $(MINUTE)
MaxSuspendTime = 10 * $(MINUTE)
MaxVacateTime = 10 * $(MINUTE)

KeyboardBusy = (KeyboardIdle < $(MINUTE))
ConsoleBusy = (ConsoleIdle < $(MINUTE))
CPUIdle = ($(NonCondorLoadAvg) <= $(BackgroundLoad))
CPUBusy = ($(NonCondorLoadAvg) >= $(HighLoad))
KeyboardNotBusy = ($(KeyboardBusy) == False)

BigJob = (TARGET.ImageSize >= (50 * 1024))
MediumJob = (TARGET.ImageSize >= (15 * 1024) && TARGET.ImageSize < (50 * 1024))
SmallJob = (TARGET.ImageSize < (15 * 1024))

JustCPU = ($(CPUBusy) && ($(KeyboardBusy) == False))
MachineBusy = ($(CPUBusy) || $(KeyboardBusy))

##  The RANK expression controls which jobs this machine prefers to
##  run over others.  Some examples:
##    RANK = TARGET.ImageSize
##    RANK = (Owner == "coltrane") + (Owner == "tyner") \
##           + ((Owner == "garrison") * 10) + (Owner == "jones")
##  By default, RANK is always 0, meaning that all jobs have an equal
##  ranking.
#RANK = 0

#####################################################################
##  This is where you choose the configuration that you would like to
##  use.  It has no defaults so it must be defined.  We start this
##  file off with the UWCS_* policy.
######################################################################

##  Also here is what is referred to as the TESTINGMODE_* settings,
##  a quick hardwired way to test Condor with a simple no-preemption policy.
##  Replace UWCS_* with TESTINGMODE_* if you wish to do testing mode.
##  For example:
##  WANT_SUSPEND = $(UWCS_WANT_SUSPEND)
##  becomes
##  WANT_SUSPEND = $(TESTINGMODE_WANT_SUSPEND)

#  When should we only consider SUSPEND instead of PREEMPT?
WANT_SUSPEND = $(UWCS_WANT_SUSPEND)

#  When should we preempt gracefully instead of hard-killing?
WANT_VACATE = $(UWCS_WANT_VACATE)

##  When is this machine willing to start a job?
START = $(UWCS_START)

##  When should a local universe job be allowed to start?
#START_LOCAL_UNIVERSE = TotalLocalJobsRunning < 200

##  When should a scheduler universe job be allowed to start?
#START_SCHEDULER_UNIVERSE = TotalSchedulerJobsRunning < 200

##  When to suspend a job?
SUSPEND = $(UWCS_SUSPEND)

##  When to resume a suspended job?
CONTINUE = $(UWCS_CONTINUE)

##  When to nicely stop a job?
##  (as opposed to killing it instantaneously)
PREEMPT = $(UWCS_PREEMPT)

##  When to instantaneously kill a preempting job
##  (e.g. if a job is in the pre-empting stage for too long)
KILL = $(UWCS_KILL)

PERIODIC_CHECKPOINT = $(UWCS_PERIODIC_CHECKPOINT)
PREEMPTION_REQUIREMENTS = $(UWCS_PREEMPTION_REQUIREMENTS)
PREEMPTION_RANK = $(UWCS_PREEMPTION_RANK)
NEGOTIATOR_PRE_JOB_RANK = $(UWCS_NEGOTIATOR_PRE_JOB_RANK)
NEGOTIATOR_POST_JOB_RANK = $(UWCS_NEGOTIATOR_POST_JOB_RANK)
MaxJobRetirementTime = $(UWCS_MaxJobRetirementTime)
CLAIM_WORKLIFE = $(UWCS_CLAIM_WORKLIFE)
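##  (Illustrative example, not part of the original file: to try the
##  TESTINGMODE_* no-preemption policy for just the start and suspend
##  knobs, the corresponding assignments above would be swapped as in
##  the commented lines below, leaving the remaining knobs on the
##  UWCS_* policy.)
#START = $(TESTINGMODE_START)
#SUSPEND = $(TESTINGMODE_SUSPEND)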
- | |||
- | ##################################################################### | ||
- | ## This is the UWisc - CS Department Configuration. | ||
- | ##################################################################### | ||
- | |||
# When should we only consider SUSPEND instead of PREEMPT?
# Only when SUSPEND is True and one of the following is also true:
#   - the job is small
#   - the keyboard is idle
#   - it is a vanilla universe job
UWCS_WANT_SUSPEND = ( $(SmallJob) || $(KeyboardNotBusy) || $(IsVanilla) ) \
                    && ( $(SUSPEND) )
- | |||
- | # When should we preempt gracefully instead of hard-killing? | ||
- | UWCS_WANT_VACATE | ||
- | |||
# Only start jobs if:
# 1) the keyboard has been idle long enough, AND
# 2) the load average is low enough OR the machine is currently
#    running a Condor job
# (NOTE: Condor will only run 1 job at a time on a given resource.
# The reasons Condor might consider running a different job while
# already running one are machine Rank (defined above), and user
# priorities.)
UWCS_START = ( (KeyboardIdle > $(StartIdleTime)) \
               && ( $(CPUIdle) || \
                    (State != "Unclaimed" && State != "Owner") ) )
- | |||
# Suspend jobs if:
# 1) the keyboard has been touched, OR
# 2a) the cpu has been busy for more than 2 minutes, AND
# 2b) the job has been running for more than 90 seconds
UWCS_SUSPEND = ( $(KeyboardBusy) || \
                 ( (CpuBusyTime > 2 * $(MINUTE)) \
                   && $(ActivationTimer) > 90 ) )
- | |||
- | # Continue jobs if: | ||
- | # 1) the cpu is idle, AND | ||
- | # 2) we've been suspended more than 10 seconds, AND | ||
- | # 3) the keyboard hasn't been touched in a while | ||
- | UWCS_CONTINUE = ( $(CPUIdle) && ($(ActivityTimer) > 10) \ | ||
- | && (KeyboardIdle > $(ContinueIdleTime)) ) | ||
- | |||
# Preempt jobs if:
# 1) the job is suspended and has been suspended longer than we want, OR
# 2) we don't want to suspend this job, but the conditions to
#    suspend jobs have been met (someone is using the machine)
UWCS_PREEMPT = ( ((Activity == "Suspended") && \
                  ($(ActivityTimer) > $(MaxSuspendTime))) \
                 || (SUSPEND && (WANT_SUSPEND == False)) )
- | |||
# Maximum time (in seconds) to wait for a job to finish before kicking
# it off (due to PREEMPT, a higher priority claim, or the startd
# gracefully shutting down).  This is computed from the time the job
# was started, minus any suspension time.  Once the retirement time runs
# out, the usual preemption process will take place.  The job may
# self-limit the retirement time to _less_ than what is given here.
# By default, nice user jobs and standard universe jobs set their
# MaxJobRetirementTime to 0, so they will not wait in retirement.
- | |||
- | UWCS_MaxJobRetirementTime = 0 | ||
- | |||
## If you completely disable preemption of claims to machines, you
## should consider limiting the timespan over which new jobs will be
## accepted on the same claim.  See the manual section on disabling
## preemption for a comprehensive discussion.  Since this example
## configuration does not disable preemption of claims, we leave
## CLAIM_WORKLIFE undefined (infinite).
#CLAIM_WORKLIFE = 1200
- | |||
- | # Kill jobs if they have taken too long to vacate gracefully | ||
- | UWCS_KILL = $(ActivityTimer) > $(MaxVacateTime) | ||
- | |||
## Only define vanilla versions of these if you want to make them
## different from the above settings.
#SUSPEND_VANILLA = ( $(KeyboardBusy) || \
#       ( (CpuBusyTime > 2 * $(MINUTE)) \
#         && $(ActivationTimer) > 90 ) )
#WANT_SUSPEND_VANILLA = True
#PREEMPT_VANILLA = ( ((Activity == "Suspended") && \
#       ($(ActivityTimer) > $(MaxSuspendTime))) \
#       || (SUSPEND_VANILLA && (WANT_SUSPEND == False)) )
#WANT_VACATE_VANILLA = True
- | |||
## Checkpoint every 3 hours on average, with a +-30 minute random
## factor to avoid having many jobs hit the checkpoint server at
## the same time.
UWCS_PERIODIC_CHECKPOINT = $(LastCkpt) > (3 * $(HOUR) + \
                           $RANDOM_INTEGER(-30,30) * $(MINUTE) )
- | |||
## You might want to checkpoint a little less often.  A good
## example of this is below.  For jobs smaller than 60 megabytes, we
## periodic checkpoint every 6 hours.  For larger jobs, we only
## checkpoint every 12 hours.
#UWCS_PERIODIC_CHECKPOINT = \
#  ( (TARGET.ImageSize < 60000) && \
#    ($(LastCkpt) > (6 * $(HOUR) + $RANDOM_INTEGER(-30,30) * $(MINUTE))) ) || \
#  ( $(LastCkpt) > (12 * $(HOUR) + $RANDOM_INTEGER(-30,30) * $(MINUTE)) )
- | |||
## The rank expressions used by the negotiator are configured below.
## This is the order in which ranks are applied by the negotiator:
##   1. NEGOTIATOR_PRE_JOB_RANK
##   2. rank in job ClassAd
##   3. NEGOTIATOR_POST_JOB_RANK
##   4. cause of preemption (0=user priority, 1=startd rank, 2=no preemption)
##   5. PREEMPTION_RANK
- | |||
- | ## The NEGOTIATOR_PRE_JOB_RANK expression overrides all other ranks | ||
- | ## that are used to pick a match from the set of possibilities. | ||
- | ## The following expression matches jobs to unclaimed resources | ||
- | ## whenever possible, regardless of the job-supplied rank. | ||
- | UWCS_NEGOTIATOR_PRE_JOB_RANK = RemoteOwner =?= UNDEFINED | ||
- | |||
## The NEGOTIATOR_POST_JOB_RANK expression chooses between
## resources that are equally preferred by the job.
## The following example expression steers jobs toward
## faster machines and tends to fill a cluster of multi-processors
## breadth-first instead of depth-first.  It also prefers online
## machines over offline (hibernating) ones.  In this example,
## the expression is chosen to have no effect when preemption
## would take place, allowing control to pass on to
## PREEMPTION_RANK.
UWCS_NEGOTIATOR_POST_JOB_RANK = \
 (RemoteOwner =?= UNDEFINED) * (KFlops - SlotID - 1.0e10*(Offline=?=True))
- | |||
- | ## The negotiator will not preempt a job running on a given machine | ||
- | ## unless the PREEMPTION_REQUIREMENTS expression evaluates to true | ||
- | ## and the owner of the idle job has a better priority than the owner | ||
- | ## of the running job. This expression defaults to true. | ||
- | UWCS_PREEMPTION_REQUIREMENTS = ( $(StateTimer) > (1 * $(HOUR)) && \ | ||
- | RemoteUserPrio > TARGET.SubmitterUserPrio * 1.2 ) || (MY.NiceUser == True) | ||
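## As a sketch (our addition, not part of the stock defaults): to turn
## off user-priority preemption entirely, you could instead set
##   PREEMPTION_REQUIREMENTS = False
## and give claims a finite lifetime (e.g. CLAIM_WORKLIFE = 3600) so
## machines still return to the negotiator periodically.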
- | |||
## The PREEMPTION_RANK expression is used in a case where preemption
## is the only option and all other negotiation ranks are equal.  For
## example, if the job has no preference, it is usually preferable to
## preempt a job with a small ImageSize instead of a job with a large
## ImageSize, when the user priorities are otherwise the same.
## However, the negotiator will always prefer to match the job
## with an idle machine over a preemptable machine, if all other
## negotiation ranks are equal.
UWCS_PREEMPTION_RANK = (RemoteUserPrio * 1000000) - TARGET.ImageSize
- | |||
- | |||
#####################################################################
## This is a Configuration that will cause your Condor jobs to
## always run.  This is intended for testing only.
######################################################################

## This mode will cause your jobs to start on a machine and will let
## them run to completion.  Jobs will never be suspended, preempted,
## or killed, regardless of what is going on in the machine (load
## average, keyboard activity, etc.)
- | |||
- | TESTINGMODE_WANT_SUSPEND = False | ||
- | TESTINGMODE_WANT_VACATE = False | ||
- | TESTINGMODE_START = True | ||
- | TESTINGMODE_SUSPEND = False | ||
- | TESTINGMODE_CONTINUE = True | ||
- | TESTINGMODE_PREEMPT = False | ||
- | TESTINGMODE_KILL = False | ||
- | TESTINGMODE_PERIODIC_CHECKPOINT = False | ||
- | TESTINGMODE_PREEMPTION_REQUIREMENTS = False | ||
- | TESTINGMODE_PREEMPTION_RANK = 0 | ||
- | |||
# Prevent machine claims from being reused indefinitely, since
# preemption of claims is disabled in the TESTINGMODE configuration.
- | TESTINGMODE_CLAIM_WORKLIFE = 1200 | ||
- | |||
- | |||
######################################################################
######################################################################
##
##  Part 4:  Settings you should probably leave alone:
##  (unless you know what you're doing)
##
######################################################################
######################################################################
- | |||
- | ###################################################################### | ||
- | ## Daemon-wide settings: | ||
- | ###################################################################### | ||
- | |||
## Pathnames
LOG     = $(LOCAL_DIR)/log
SPOOL   = $(LOCAL_DIR)/spool
EXECUTE = $(LOCAL_DIR)/execute
BIN     = $(RELEASE_DIR)/bin
LIB     = $(RELEASE_DIR)/lib
INCLUDE = $(RELEASE_DIR)/include
SBIN    = $(RELEASE_DIR)/sbin
LIBEXEC = $(RELEASE_DIR)/libexec
- | |||
## If you leave HISTORY undefined (comment it out), no history file
## will be created.
HISTORY = $(SPOOL)/history
- | |||
## Log files
COLLECTOR_LOG        = $(LOG)/CollectorLog
KBDD_LOG             = $(LOG)/KbdLog
MASTER_LOG           = $(LOG)/MasterLog
NEGOTIATOR_LOG       = $(LOG)/NegotiatorLog
NEGOTIATOR_MATCH_LOG = $(LOG)/MatchLog
SCHEDD_LOG           = $(LOG)/SchedLog
SHADOW_LOG           = $(LOG)/ShadowLog
STARTD_LOG           = $(LOG)/StartLog
STARTER_LOG          = $(LOG)/StarterLog
JOB_ROUTER_LOG       = $(LOG)/JobRouterLog
ROOSTER_LOG          = $(LOG)/RoosterLog
SHARED_PORT_LOG      = $(LOG)/SharedPortLog
# High Availability Logs
HAD_LOG              = $(LOG)/HADLog
REPLICATION_LOG      = $(LOG)/ReplicationLog
TRANSFERER_LOG       = $(LOG)/TransfererLog
HDFS_LOG             = $(LOG)/HDFSLog

## Lock files
SHADOW_LOCK = $(LOCK)/ShadowLock
- | |||
## This setting controls how often any lock files currently in use have their
## timestamp updated.  Updating the timestamp prevents administrative programs
## like 'tmpwatch' from deleting long-lived lock files.  The value is
## an integer in seconds with a minimum of 60 seconds.  The default if not
## specified is 28800 seconds, or 8 hours.
## This attribute only takes effect on restart of the daemons or at the next
## update time.
# LOCK_FILE_UPDATE_INTERVAL = 28800
- | |||
## This setting primarily allows you to change the port that the
## collector is listening on.  By default, the collector uses port
## 9618, but you can set the port with a ":port" suffix, such as:
## COLLECTOR_HOST = $(CONDOR_HOST):9650
COLLECTOR_HOST  = $(CONDOR_HOST)
- | |||
## The NEGOTIATOR_HOST parameter has been deprecated.  The port where
## the negotiator is listening is now dynamically allocated and the IP
## and port are now obtained from the collector, just like all the
## other daemons.  However, if your pool contains machines that
## are running version 6.7.3 or earlier, you can uncomment this
## setting to go back to the old fixed-port (9614) for the negotiator.
#NEGOTIATOR_HOST = $(CONDOR_HOST)
- | |||
## How long are you willing to let daemons try their graceful
## shutdown methods before they do a hard shutdown? (30 minutes)
#SHUTDOWN_GRACEFUL_TIMEOUT = 1800
- | |||
## How much disk space would you like reserved from Condor?  In
## places where Condor is computing the free disk space on various
## partitions, it subtracts this many megabytes from the amount it
## really finds.
RESERVED_DISK = 5
- | |||
## If your machine is running AFS and the AFS cache lives on the same
## partition as the other Condor directories, and you want Condor to
## reserve the space that your AFS cache is configured to use, set
## this to true.
#RESERVE_AFS_CACHE = False
- | |||
## By default, if a user does not specify "notify_user" in the job
## description file, any email Condor sends about that job will go to
## "owner@UID_DOMAIN".  If your machines all share a common UID
## domain (so that you would set UID_DOMAIN to be the same across all
## machines in your pool), *BUT* email to user@UID_DOMAIN is *NOT*
## the right place for Condor to send email for your site, you can
## define the default domain to use for email.  A common example
## would be to set EMAIL_DOMAIN to the fully qualified hostname of
## each machine in your pool, so users submitting jobs from a
## specific machine would get email sent to user@machine.your.domain,
## instead of user@your.domain.  In general, you should leave this
## setting commented out unless two things are true: 1) UID_DOMAIN is
## set to your domain, not $(FULL_HOSTNAME), and 2) email to
## user@UID_DOMAIN won't work.
#EMAIL_DOMAIN = $(FULL_HOSTNAME)
- | |||
## Should Condor daemons create a UDP command socket (for incoming
## UDP-based commands) in addition to the TCP command socket?  By
## default, classified ad updates sent to the collector use UDP, in
## addition to some keep alive messages and other non-essential
## communication.  On some networks, however, it may be
## desirable to disable the UDP command port (for example, to reduce
## the number of ports represented by a GCB broker, etc).  If not
## defined, the UDP command socket is enabled by default, and to
## modify this, you must restart your Condor daemons.  Also, this
## setting must be defined machine-wide.  For example, setting
## "STARTD.WANT_UDP_COMMAND_SOCKET = False" while the global setting
## is "True" will still result in the startd creating a UDP socket.
#WANT_UDP_COMMAND_SOCKET = True
- | |||
## If your site needs to use TCP updates to the collector, instead of
## UDP, you can enable this feature.  HOWEVER, WE DO NOT RECOMMEND
## THIS FOR MOST SITES!  In general, the only sites that might want
## this feature are pools made up of machines connected via a
## wide-area network where UDP packets are frequently or always
## dropped.  If you enable this feature, you *MUST* also enable the
## COLLECTOR_SOCKET_CACHE_SIZE setting at your collector, and each
## entry in the socket cache uses another file descriptor.  If not
## defined, this feature is disabled by default.
#UPDATE_COLLECTOR_WITH_TCP = False
- | |||
- | ## HIGHPORT and LOWPORT let you set the range of ports that Condor | ||
- | ## will use. This may be useful if you are behind a firewall. By | ||
- | ## default, Condor uses port 9618 for the collector, 9614 for the | ||
- | ## negotiator, and system-assigned (apparently random) ports for | ||
- | ## everything else. HIGHPORT and LOWPORT only affect these | ||
- | ## system-assigned ports, but will restrict them to the range you | ||
- | ## specify here. If you want to change the well-known ports for the | ||
- | ## collector or negotiator, see COLLECTOR_HOST or NEGOTIATOR_HOST. | ||
- | ## Note that both LOWPORT and HIGHPORT must be at least 1024 if you | ||
- | ## are not starting your daemons as root. You may also specify | ||
- | ## different port ranges for incoming and outgoing connections by | ||
- | ## using IN_HIGHPORT/ | ||
- | #HIGHPORT = 9700 | ||
- | #LOWPORT = 9600 | ||
- | |||
## If a daemon doesn't respond for too long, do you want a core file?
## This basically controls the type of the signal sent to the child
## process, and mostly affects the Condor Master.
#NOT_RESPONDING_WANT_CORE = False
- | |||
- | |||
- | ###################################################################### | ||
- | ## Daemon-specific settings: | ||
- | ###################################################################### | ||
- | |||
- | ## | ||
- | ## condor_master | ||
- | ## | ||
- | ## Daemons you want the master to keep running for you: | ||
- | DAEMON_LIST = MASTER, STARTD, SCHEDD | ||
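## For example (a sketch; adjust to your pool's roles): a central
## manager would typically also run the matchmaking daemons,
##   DAEMON_LIST = MASTER, COLLECTOR, NEGOTIATOR, STARTD, SCHEDD
## while a dedicated execute node only needs
##   DAEMON_LIST = MASTER, STARTD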
- | |||
## Which daemons use the Condor DaemonCore library (i.e., not the
## checkpoint server or custom user daemons)?
#DC_DAEMON_LIST = \
#MASTER, STARTD, SCHEDD, KBDD, COLLECTOR, NEGOTIATOR, EVENTD, \
#VIEW_SERVER, CONDOR_VIEW, VIEW_COLLECTOR, CREDD, HAD, \
#DBMSD, QUILL, JOB_ROUTER, ROOSTER, LEASEMANAGER, HDFS, SHARED_PORT
- | |||
- | |||
## Where are the binaries for these daemons?
MASTER          = $(SBIN)/condor_master
STARTD          = $(SBIN)/condor_startd
SCHEDD          = $(SBIN)/condor_schedd
KBDD            = $(SBIN)/condor_kbdd
NEGOTIATOR      = $(SBIN)/condor_negotiator
COLLECTOR       = $(SBIN)/condor_collector
STARTER_LOCAL   = $(SBIN)/condor_starter
JOB_ROUTER      = $(LIBEXEC)/condor_job_router
ROOSTER         = $(LIBEXEC)/condor_rooster
HDFS            = $(SBIN)/condor_hdfs
SHARED_PORT     = $(LIBEXEC)/condor_shared_port
TRANSFERER      = $(LIBEXEC)/condor_transferer
- | |||
## When the master starts up, it can place its address (IP and port)
## into a file.  This way, tools running on the local machine don't
## need to query the central manager to find the master.  This
## feature can be turned off by commenting out this setting.
MASTER_ADDRESS_FILE = $(LOG)/.master_address
- | |||
## Where should the master find the condor_preen binary?  If you don't
## want preen to run at all, set it to nothing.
PREEN = $(SBIN)/condor_preen

## How do you want preen to behave?  The "-m" means you want email
## about files preen finds that it thinks it should remove.  The "-r"
## means you want preen to actually remove these files.  If you don't
## want either of those things to happen, just remove the appropriate
## one from this setting.
PREEN_ARGS = -m -r
- | |||
## How often should the master start up condor_preen? (once a day)
#PREEN_INTERVAL = 86400

## If a daemon dies an unnatural death, do you want email about it?
#PUBLISH_OBITUARIES = True

## If you're getting obituaries, how many lines of the end of that
## daemon's log file do you want included in the obituary?
#OBITUARY_LOG_LENGTH = 20

## Should the master run?
#START_MASTER = True

## Should the master start up the daemons you want it to?
#START_DAEMONS = True

## How often do you want the master to send an update to the central
## manager?
#MASTER_UPDATE_INTERVAL = 300

## How often do you want the master to check the timestamps of the
## daemons it's running?  If any daemons have been modified, the
## master restarts them.
#MASTER_CHECK_NEW_EXEC_INTERVAL = 300

## Once you notice new binaries, how long should you wait before you
## try to execute them?
#MASTER_NEW_BINARY_DELAY = 120

## What's the maximum amount of time you're willing to give the
## daemons to quickly shutdown before you just kill them outright?
#SHUTDOWN_FAST_TIMEOUT = 300
- | |||
- | ###### | ||
- | ## Exponential backoff settings: | ||
- | ###### | ||
## When a daemon keeps crashing, we use "exponential backoff" so we
## wait longer and longer before restarting it.  This is the base of
## the exponent used to determine how long to wait before starting
## the daemon again:
#MASTER_BACKOFF_FACTOR = 2.0

## What's the maximum amount of time you want the master to wait
## between attempts to start a given daemon?  (With 2.0 as the
## MASTER_BACKOFF_FACTOR, this caps the exponential growth.)
#MASTER_BACKOFF_CEILING = 3600

## How long should a daemon run without crashing before we consider
## it "recovered"?  Once a daemon has recovered, we reset the number
## of restarts so the exponential backoff stuff goes back to normal.
#MASTER_RECOVER_FACTOR = 300
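## Sketch of the resulting behavior (our reading of the manual, with
## assumed values MASTER_BACKOFF_FACTOR = 2.0 and
## MASTER_BACKOFF_CEILING = 3600): the delay before the n-th restart
## is roughly
##   delay = min( $(MASTER_BACKOFF_CEILING), $(MASTER_BACKOFF_FACTOR)^n )
## i.e. 2, 4, 8, 16, ... seconds, capped at one hour.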
- | |||
- | |||
- | ## | ||
- | ## condor_collector | ||
- | ## | ||
## Address to which Condor will send a weekly e-mail with output of
## condor_status.
#CONDOR_DEVELOPERS = condor-admin@cs.wisc.edu

## Global Collector to periodically advertise basic information about
## your pool.
#CONDOR_DEVELOPERS_COLLECTOR = condor.cs.wisc.edu
- | |||
- | |||
- | ## | ||
- | ## condor_negotiator | ||
- | ## | ||
## Determine if the Negotiator will honor SlotWeight attributes, which
## may be used to give a slot greater weight when calculating usage.
#NEGOTIATOR_USE_SLOT_WEIGHTS = True


## How often the Negotiator starts a negotiation cycle, defined in
## seconds.
#NEGOTIATOR_INTERVAL = 60

## Should the Negotiator publish an update to the Collector after
## every negotiation cycle?  It is useful to have this set to True
## to get immediate updates on LastNegotiationCycle statistics.
#NEGOTIATOR_UPDATE_AFTER_CYCLE = False
- | |||
- | |||
- | ## | ||
- | ## condor_startd | ||
- | ## | ||
## Where are the various condor_starter binaries installed?
STARTER_LIST = STARTER, STARTER_STANDARD
STARTER          = $(SBIN)/condor_starter
STARTER_STANDARD = $(SBIN)/condor_starter.std
STARTER_LOCAL    = $(SBIN)/condor_starter

## When the startd starts up, it can place its address (IP and port)
## into a file.  This way, tools running on the local machine don't
## need to query the central manager to find the startd.  This
## feature can be turned off by commenting out this setting.
STARTD_ADDRESS_FILE = $(LOG)/.startd_address
- | |||
## When a machine is claimed, how often should we poll the state of
## the machine to see if we need to evict/suspend the job, etc?
#POLLING_INTERVAL = 5

## How often should the startd send updates to the central manager?
#UPDATE_INTERVAL = 300

## How long is the startd willing to stay in the "matched" state?
#MATCH_TIMEOUT = 300

## How long is the startd willing to stay in the preempting/killing
## state before it just kills the starter directly?
#KILLING_TIMEOUT = 30
- | |||
## When a machine is unclaimed, when should it run benchmarks?
## LastBenchmark is initialized to 0, so this expression says as soon
## as we're unclaimed, run the benchmarks.  Thereafter, if we're
## unclaimed and it's been at least 4 hours since we ran the last
## benchmarks, run them again.  The startd keeps a weighted average
## of the benchmark results to provide more accurate values.
## Note, if you don't want any benchmarks run at all, either comment
## RunBenchmarks out, or set it to "False".
BenchmarkTimer = (time() - LastBenchmark)
RunBenchmarks : (LastBenchmark == 0 ) || ($(BenchmarkTimer) >= (4 * $(HOUR)))
#RunBenchmarks : False
- | |||
- | ## When the startd does benchmarks, which set of benchmarks should we | ||
- | ## run? The default is the same as pre-7.5.6: MIPS and KFLOPS. | ||
- | benchmarks_joblist = mips kflops | ||
- | |||
## What's the max "load" of all benchmarks running concurrently?
## With the default (1.0), the startd will run the benchmarks serially.
benchmarks_max_job_load = 1.0
- | |||
- | # MIPS (Dhrystone 2.1) benchmark: load 1.0 | ||
- | benchmarks_mips_executable = $(LIBEXEC)/ | ||
- | benchmarks_mips_job_load = 1.0 | ||
- | |||
- | # KFLOPS (clinpack) benchmark: load 1.0 | ||
- | benchmarks_kflops_executable = $(LIBEXEC)/ | ||
- | benchmarks_kflops_job_load = 1.0 | ||
- | |||
- | |||
## Normally, when the startd is computing the idle time of all the
## users of the machine (both local and remote), it checks the utmp
## file to find all the currently active ttys, and only checks access
## time of the devices associated with active logins.  Unfortunately,
## on some systems, utmp is unreliable, and the startd might miss
## keyboard activity by doing this.  So, if your utmp is unreliable,
## set this setting to True and the startd will check the access time
## on all tty and pty devices.
#STARTD_HAS_BAD_UTMP = False
- | |||
## This entry allows the startd to monitor console (keyboard and
## mouse) activity by checking the access times on special files in
## /dev.  Activity on these files shows up as "ConsoleIdle" time in
## the startd's ClassAd.  Just give a comma-separated list of the
## names of devices you want considered the console, without the
## "/dev/" portion of the pathname.
CONSOLE_DEVICES = mouse, console
- | |||
- | |||
## The STARTD_ATTRS (and legacy STARTD_EXPRS) entry allows you to
## have the startd advertise arbitrary attributes from the config
## file in its ClassAd.  Give the comma-separated list of entries
## from the config file you want in the startd ClassAd.
## NOTE: because of the different syntax of the config file and
## ClassAds, you might have to do a little extra work to get a given
## entry into the ClassAd.  In particular, ClassAds require double
## quotes (") around your strings.  Numeric values can go in
## directly, as can boolean expressions.  For example, if you wanted
## the startd to advertise its list of console devices, when it's
## configured to run benchmarks, and how often it sends updates to
## the central manager, you'd have to define the following helper
## macro:
#MY_CONSOLE_DEVICES = "$(CONSOLE_DEVICES)"
## Note: this must come before you define STARTD_ATTRS because macros
## must be defined before you use them in other macros or
## expressions.
## Then, you'd set the STARTD_ATTRS setting to this:
#STARTD_ATTRS = MY_CONSOLE_DEVICES, RunBenchmarks, UPDATE_INTERVAL
##
## STARTD_ATTRS can also be defined on a per-slot basis.  The startd
## builds the list of attributes to advertise by combining the lists
## in this order: STARTD_ATTRS, SLOTx_STARTD_ATTRS.  In the below
## example, the startd ad for slot1 will have the value for
## favorite_color, favorite_season, and favorite_movie, and slot2
## will have favorite_color, favorite_season, and favorite_song.
##
#STARTD_ATTRS = favorite_color, favorite_season
#SLOT1_STARTD_ATTRS = favorite_movie
#SLOT2_STARTD_ATTRS = favorite_song
##
## Attributes in the STARTD_ATTRS list can also be on a per-slot basis.
## For example, the following configuration:
##
#favorite_color = "blue"
#favorite_season = "spring"
#SLOT2_favorite_color = "green"
#SLOT3_favorite_season = "summer"
#STARTD_ATTRS = favorite_color, favorite_season
##
## will result in the following attributes in the slot classified
## ads:
##
## slot1 - favorite_color = "blue"; favorite_season = "spring"
## slot2 - favorite_color = "green"; favorite_season = "spring"
## slot3 - favorite_color = "blue"; favorite_season = "summer"
##
## Finally, the recommended default value for this setting, is to
## publish the COLLECTOR_HOST setting as a string.  This can be
## useful, with the "$$(COLLECTOR_HOST)" syntax in the submit file,
## for jobs to know (for example, via their environment) what pool
## they're running in.
COLLECTOR_HOST_STRING = "$(COLLECTOR_HOST)"
STARTD_ATTRS = COLLECTOR_HOST_STRING
- | |||
## When the startd is claimed by a remote user, it can also advertise
## arbitrary attributes from the ClassAd of the job it's working on.
## Just list the attribute names you want advertised.
## Note: since this is already a ClassAd, you don't have to do
## anything funny with strings, etc.  This feature can be turned off
## by commenting out this setting (there is no default).
STARTD_JOB_EXPRS = ImageSize, ExecutableSize, JobUniverse, NiceUser
- | |||
## If you want to "lie" to Condor about how many CPUs your machine
## has, you can use this setting to override Condor's automatic
## computation.  You must restart the startd for
## the change to take effect (a simple condor_reconfig will not do).
## Please read the section on "condor_startd Configuration File
## Macros" in the Condor Administrator's Manual for a further
## discussion of this setting.  This
## must be an integer ("N" isn't a special syntax, it's just used to
## represent the default).
#NUM_CPUS = N

## If you never want Condor to detect more than "MAX_NUM_CPUS" CPUs,
## uncomment this line.  You must restart the startd for this setting
## to take effect.  If set to 0 or a negative number, it is ignored.
## By default, it is ignored.  Otherwise, it must be a positive
## integer ("C" isn't a special syntax, it's just used to
## represent the default).
#MAX_NUM_CPUS = C
- | |||
## Normally, Condor will automatically detect the amount of physical
## memory available on your machine.  Define MEMORY to tell Condor
## how much physical memory (in MB) your machine has, overriding the
## value Condor computes automatically.
#MEMORY = 128
- | |||
## How much memory would you like reserved from Condor?  By default,
## Condor considers all the physical memory of your machine as
## available to be used by Condor jobs.  If RESERVED_MEMORY is
## defined, Condor subtracts it from the amount of memory it
## advertises as available.
#RESERVED_MEMORY = 0
- | |||
- | ###### | ||
- | ## SMP startd settings | ||
- | ## | ||
- | ## By default, Condor will evenly divide the resources in an SMP | ||
- | ## machine (such as RAM, swap space and disk space) among all the | ||
- | ## CPUs, and advertise each CPU as its own slot with an even share of | ||
- | ## the system resources. | ||
- | ## there are a few options available to you. Please read the section | ||
- | ## on " | ||
- | ## Administrator' | ||
- | ## only briefly listed and described here. | ||
- | ###### | ||
- | |||
## The maximum number of different slot types.
#MAX_SLOT_TYPES = 10
- | |||
## Use this setting to define your own slot types.  This
## allows you to divide system resources unevenly among your CPUs.
## You must use a different setting for each different type you
## define.  The "<N>" in the name of the macro listed below must be
## an integer from 1 to MAX_SLOT_TYPES (defined above),
## and you use this number to refer to your type.  There are many
## different formats these settings can take, so be sure to refer to
## the section on "condor_startd Configuration File Macros" in the
## Condor Administrator's Manual for full details.  In particular,
## read the section titled "Dividing System Resources in Multi-core
## Machines" to help understand this setting.  If you modify any of
## these settings, you must restart the condor_startd for the change
## to take effect.
#SLOT_TYPE_<N> = ...
# For example:
#SLOT_TYPE_1 = 1/4
#SLOT_TYPE_2 = cpus=1, ram=25%, swap=10%, disk=10%
- | |||
## If you define your own slot types, you must specify how
## many slots of each type you wish to advertise.  You do
## this with the setting below, replacing the "<N>" with the
## corresponding integer you used to define the type above.  You can
## change the number of a given type being advertised at run-time,
## with a simple condor_reconfig.
#NUM_SLOTS_TYPE_<N> = M
# For example:
#NUM_SLOTS_TYPE_1 = 2
#NUM_SLOTS_TYPE_2 = 1
- | |||
## The number of evenly-divided slots you want Condor to
## report to your pool (if less than the total number of CPUs).  This
## setting is only considered if the "type" settings described above
## are not in use.  By default, all CPUs are reported.  This setting
## must be an integer ("N" isn't a special syntax, it's just used to
## represent the default).
#NUM_SLOTS = N

## How many of the slots the startd is representing should
## be "connected" to the keyboard (in terms of KeyboardIdle and
## console activity)?  This defaults to all slots (N in a
## machine with N CPUs).  This must be an integer ("N" isn't a special
## setting, that's just used to represent the default).
#SLOTS_CONNECTED_TO_KEYBOARD = N
- | |||
## How many of the slots the startd is representing should
## be "connected" to the console (in terms of ConsoleIdle, as well
## as console activity)?
#SLOTS_CONNECTED_TO_CONSOLE = N
- | |||
## If there are slots that aren't connected to the
## keyboard or the console (see the above two settings), the
## corresponding idle time reported will be the time since the startd
## was spawned, plus the value of this parameter.  It defaults to 20
## minutes.  This is done so that if a slot is configured
## not to care about keyboard activity, we want it to be available to
## Condor jobs as soon as the startd starts up, instead of having to
## wait for 15 minutes or more (which is the default time a machine
## must be idle before Condor will start a job).  If you don't want
## this boost, just set the value to 0.  If you change your START
## expression to require more than 15 minutes before a job starts,
## but you still want jobs to start right away on some of your SMP
## nodes, just increase this parameter.
#DISCONNECTED_KEYBOARD_IDLE_BOOST = 1200
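## For example (our arithmetic, assuming the 1200-second default): a
## disconnected slot on a freshly started startd already reports 20
## minutes of keyboard idle time, which immediately satisfies the
## default 15-minute StartIdleTime in the UWCS START expression.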
- | |||
- | ###### | ||
- | ## Settings for computing optional resource availability statistics: | ||
- | ###### | ||
## If STARTD_COMPUTE_AVAIL_STATS = True, the startd will compute
## statistics about resource availability to be included in the
## classad(s) sent to the collector describing the resource(s) the
## startd manages.  The following attributes will always be included
## in the resource classad(s) if STARTD_COMPUTE_AVAIL_STATS = True:
##   AvailTime = What proportion of the time (between 0.0 and 1.0)
##     has this resource been in a state other than "Owner"?
##   LastAvailInterval = What was the duration (in seconds) of the
##     last period between "Owner" states?
## The following attributes will also be included if the resource is
## not in the "Owner" state:
##   AvailSince = At what time did the resource last leave the
##     "Owner" state?  Measured in the number of seconds since the
##     epoch (00:00:00 UTC, Jan 1, 1970).
##   AvailTimeEstimate = Based on past history, this is an estimate
##     of how long the current period between "Owner" states will
##     last.
#STARTD_COMPUTE_AVAIL_STATS = False

## If STARTD_COMPUTE_AVAIL_STATS = True, STARTD_AVAIL_CONFIDENCE sets
## the confidence level of the AvailTimeEstimate.  By default, the
## estimate is based on the 80th percentile of past values.
#STARTD_AVAIL_CONFIDENCE = 0.8

## STARTD_MAX_AVAIL_PERIOD_SAMPLES limits the number of samples of
## past available intervals stored by the startd to limit memory and
## disk consumption.  Each sample requires 4 bytes of memory and
## approximately 10 bytes of disk space.
#STARTD_MAX_AVAIL_PERIOD_SAMPLES = 100
- | |||
##
## condor_ckpt_probe
##
CKPT_PROBE = $(LIBEXEC)/condor_ckpt_probe
- | |||
- | ## | ||
- | ## condor_schedd | ||
- | ## | ||
## Where are the various shadow binaries installed?
SHADOW_LIST = SHADOW, SHADOW_STANDARD
SHADOW          = $(SBIN)/condor_shadow
SHADOW_STANDARD = $(SBIN)/condor_shadow.std

## When the schedd starts up, it can place its address (IP and port)
## into a file.  This way, tools running on the local machine don't
## need to query the central manager to find the schedd.  This
## feature can be turned off by commenting out this setting.
SCHEDD_ADDRESS_FILE = $(SPOOL)/.schedd_address

## Additionally, a daemon may store its ClassAd on the local filesystem
## as well as sending it to the collector.  This way, tools that need
## information about a daemon do not have to contact the central manager
## to get information about a daemon on the same machine.
## This feature is necessary for Quill to work.
SCHEDD_DAEMON_AD_FILE = $(SPOOL)/.schedd_classad
- | |||
- | ## How often should the schedd send an update to the central manager? | ||
- | # | ||
- | |||
- | ## How long should the schedd wait between spawning each shadow? | ||
- | # | ||
- | |||
- | ## How many concurrent sub-processes should the schedd spawn to handle | ||
- | ## queries? | ||
- | # | ||
- | |||
- | ## How often should the schedd send a keep alive message to any | ||
- | ## startds it has claimed? | ||
- | # | ||
- | |||
- | ## This setting controls the maximum number of times that a | ||
- | ## condor_shadow processes can have a fatal error (exception) before | ||
- | ## the condor_schedd will simply relinquish the match associated with | ||
- | ## the dying shadow. | ||
- | # | ||
- | |||
- | ## Estimated virtual memory size of each condor_shadow process. | ||
- | ## Specified in kilobytes. | ||
- | # SHADOW_SIZE_ESTIMATE = 800 | ||
- | |||
- | ## The condor_schedd can renice the condor_shadow processes on your | ||
- | ## submit machines. | ||
- | ## The higher the number, the lower priority the shadows have. | ||
- | # SHADOW_RENICE_INCREMENT = 0 | ||
- | |||
- | ## The condor_schedd can renice scheduler universe processes | ||
- | ## (e.g. DAGMan) on your submit machines. | ||
- | ## scheduler universe processes? (1-19). | ||
- | ## lower priority the processes have. | ||
- | # SCHED_UNIV_RENICE_INCREMENT = 0 | ||
- | |||
- | ## By default, when the schedd fails to start an idle job, it will | ||
- | ## not try to start any other idle jobs in the same cluster during | ||
- | ## that negotiation cycle. | ||
- | ## efficient for large job clusters. | ||
- | ## jobs in the cluster can be started even though an earlier job | ||
- | ## can' | ||
- | ## different disk space, memory, or operating system requirements. | ||
- | ## Or, machines may be willing to run only some jobs in the cluster, | ||
- | ## because their requirements reference the jobs' virtual memory size | ||
- | ## or other attribute. | ||
- | ## will force the schedd to try to start all idle jobs in each | ||
- | ## negotiation cycle. | ||
- | ## but it will ensure that all jobs that can be started will be | ||
- | ## started. | ||
- | # | ||
- | |||
- | ## This setting controls how often, in seconds, the schedd considers | ||
- | ## periodic job actions given by the user in the submit file. | ||
- | ## (Currently, these are periodic_hold, | ||
- | # | ||
- | |||
- | ###### | ||
- | ## Queue management settings: | ||
- | ###### | ||
- | ## How often should the schedd truncate it's job queue transaction | ||
- | ## log? (Specified in seconds, once a day is the default.) | ||
- | # | ||
- | |||
- | ## How often should the schedd commit "wall clock" run time for jobs | ||
- | ## to the queue, so run time statistics remain accurate when the | ||
- | ## schedd crashes? | ||
- | ## default. | ||
- | # | ||
- | |||
- | ## What users do you want to grant super user access to this job | ||
- | ## queue? | ||
- | ## By default, this only includes root. | ||
- | QUEUE_SUPER_USERS = root, condor | ||
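
##  Example (commented out): a hypothetical site administrator account
##  named "csadmin" could be granted queue super user rights by
##  extending the list above; the account name is illustrative only:
#QUEUE_SUPER_USERS = root, condor, csadmin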


##--------------------------------------------------------------------
##  condor_shadow
##--------------------------------------------------------------------
##  If the shadow is unable to read a checkpoint file from the
##  checkpoint server, it keeps trying only if the job has accumulated
##  more than MAX_DISCARDED_RUN_TIME seconds of CPU usage.  Otherwise,
##  the job is started from scratch.  Defaults to 1 hour.  This
##  setting is only used if USE_CKPT_SERVER (from above) is True.
#MAX_DISCARDED_RUN_TIME = 3600

##  Should periodic checkpoints be compressed?
#COMPRESS_PERIODIC_CKPT = False

##  Should vacate checkpoints be compressed?
#COMPRESS_VACATE_CKPT = False

##  Should we commit the application's dirty memory pages to swap
##  space during a periodic checkpoint?
#PERIODIC_MEMORY_SYNC = False

##  Should we write vacate checkpoints slowly?  If nonzero, this
##  parameter specifies the speed at which vacate checkpoints should
##  be written, in kilobytes per second.
#SLOW_CKPT_SPEED = 0

##  How often should the shadow update the job queue with job
##  attributes that periodically change?  Specified in seconds.
#SHADOW_QUEUE_UPDATE_INTERVAL = 15 * 60

##  Should the shadow wait to update certain job attributes for the
##  next periodic update, or should it immediately update these
##  attributes as they change?  Due to performance concerns of
##  aggressive updates to a busy condor_schedd, the default is True.
#SHADOW_LAZY_QUEUE_UPDATE = TRUE

##--------------------------------------------------------------------
##  condor_starter
##--------------------------------------------------------------------
##  The condor_starter can renice the processes of Condor
##  jobs on your execute machines.  If you want this, uncomment the
##  following entry and set it to how "nice" you want the user
##  jobs. (1-19)  The larger the number, the lower priority the
##  process gets on your machines.
##  Note on Win32 platforms, this number needs to be greater than
##  zero (i.e. the job must be reniced) or the mechanism that
##  monitors CPU load on Win32 systems will give erratic results.
#JOB_RENICE_INCREMENT = 10

##  Should the starter do local logging to its own log file, or send
##  debug information back to the condor_shadow where it will end up
##  in the ShadowLog?
#STARTER_LOCAL_LOGGING = TRUE

##  If the UID_DOMAIN settings match on both the execute and submit
##  machines, but the UID of the user who submitted the job isn't in
##  the passwd file of the execute machine, the starter will normally
##  exit with an error.  Do you want the starter to just run the
##  job with the specified UID, even if it's not in the passwd file?
#SOFT_UID_DOMAIN = FALSE

##  Honor the run_as_owner option from the condor submit file.
##
#STARTER_ALLOW_RUNAS_OWNER = TRUE

##  Tell the Starter/Startd what program to use to remove a directory.
##  condor_rmdir.exe is a windows-only command that does a better job
##  than the built-in rmdir command when it is run with elevated
##  privileges, such as when Condor is running as a service.
##  /s is delete subdirectories
##  /c is continue on error
WINDOWS_RMDIR = $(SBIN)\condor_rmdir.exe
#WINDOWS_RMDIR_OPTIONS = /s /c

##--------------------------------------------------------------------
##  condor_procd
##--------------------------------------------------------------------
#
# the path to the procd binary
#
PROCD = $(SBIN)/condor_procd

# the path to the procd "address"
#   - on UNIX this will be a named pipe; we'll put it in the
#     $(LOCK) directory by default (note that multiple named pipes
#     will be created in this directory for when the procd responds
#     to its clients)
#   - on Windows, this will be a named pipe as well (but named pipes on
#     Windows are not even close to the same thing as named pipes on
#     UNIX); the name will be something like:
#         \\.\pipe\condor_procd
#
PROCD_ADDRESS = $(LOCK)/procd_pipe

# The procd currently uses a very simplistic logging system.  Since this
# log will not be rotated like other Condor logs, it is only recommended
# to set PROCD_LOG when attempting to debug a problem.  In other Condor
# daemons, turning on D_PROCFAMILY will result in that daemon logging
# all of its interactions with the ProcD.
#
#PROCD_LOG = $(LOG)/ProcLog

# This is the maximum period that the procd will use for taking
# snapshots (the actual period may be lower if a condor daemon registers
# a family for which it wants more frequent snapshots)
#
PROCD_MAX_SNAPSHOT_INTERVAL = 60

# On Windows, we send a process a "soft kill" via a WM_CLOSE message.
# This binary is used by the ProcD (and other Condor daemons if PRIVSEP
# is not enabled) to help when sending soft kills.
WINDOWS_SOFTKILL = $(SBIN)/condor_softkill

##--------------------------------------------------------------------
##  condor_submit
##--------------------------------------------------------------------
##  If you want condor_submit to automatically append an expression to
##  the Requirements expression or Rank expression of jobs at your
##  site, uncomment these entries.
#APPEND_REQUIREMENTS = (expression to append to job requirements)
#APPEND_RANK = (expression to append to job rank)

##  If you want expressions only appended for either standard or
##  vanilla universe jobs, you can uncomment these entries.  If any of
##  them are defined, they are used for the given universe, instead of
##  the generic entries above.
#APPEND_REQ_VANILLA = (expression to append to vanilla job requirements)
#APPEND_REQ_STANDARD = (expression to append to standard job requirements)
#APPEND_RANK_STANDARD = (expression to append to standard job rank)
#APPEND_RANK_VANILLA = (expression to append to vanilla job rank)
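
##  Example (commented out): a site could append a requirement so that
##  vanilla universe jobs only match machines advertising at least
##  1 GB of memory; the expression below is illustrative, not part of
##  the stock defaults:
#APPEND_REQ_VANILLA = (Memory >= 1024)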

##  This can be used to define a default value for the rank expression
##  if one is not specified in the submit file.
#DEFAULT_RANK = (default rank expression for all jobs)

##  If you want universe-specific defaults, you can use the following
##  entries:
#DEFAULT_RANK_VANILLA = (default rank expression for vanilla jobs)
#DEFAULT_RANK_STANDARD = (default rank expression for standard jobs)

##  If you want condor_submit to automatically append expressions to
##  the job ClassAds it creates, you can uncomment and define the
##  SUBMIT_EXPRS setting.  It works just like the STARTD_EXPRS setting
##  described above with respect to ClassAd vs. config file syntax,
##  strings, etc.  One common use would be to have the full hostname
##  of the machine where a job was submitted placed in the job
##  ClassAd.  You would do this by uncommenting the following lines:
#MACHINE = "$(FULL_HOSTNAME)"
#SUBMIT_EXPRS = MACHINE

##  Condor keeps a buffer of recently-used data for each file an
##  application opens.  This macro specifies the default maximum number
##  of bytes to be buffered for each open file at the executing
##  machine.
#DEFAULT_IO_BUFFER_SIZE = 524288

##  Condor will attempt to consolidate small read and write operations
##  into large blocks.  This macro specifies the default block size
##  Condor will use.
#DEFAULT_IO_BUFFER_BLOCK_SIZE = 32768

##--------------------------------------------------------------------
##  condor_preen
##--------------------------------------------------------------------
##  Who should condor_preen send email to?
#PREEN_ADMIN = $(CONDOR_ADMIN)

##  What files should condor_preen leave in the spool directory?
VALID_SPOOL_FILES = job_queue.log, job_queue.log.tmp, history, \
                    Accountant.log, Accountantnew.log, \
                    local_univ_execute, .quillwritepassword, \
                    .pgpass, \
                    .schedd_address, .schedd_classad

##  What files should condor_preen remove from the log directory?
INVALID_LOG_FILES = core

##--------------------------------------------------------------------
##  Java parameters:
##--------------------------------------------------------------------
##  If you would like this machine to be able to run Java jobs,
##  then set JAVA to the path of your JVM binary.  If you are not
##  interested in Java, there is no harm in leaving this entry
##  empty or incorrect.

JAVA = /usr/bin/java

##  JAVA_CLASSPATH_DEFAULT gives the default set of paths in which
##  Java classes are to be found.  Any path in the list that does
##  not exist is omitted.  If your JVM needs to be informed of
##  additional directories, add them here.  However, do not remove
##  the existing entries, as Condor needs them.

JAVA_CLASSPATH_DEFAULT = $(LIB) $(LIB)/scimark2lib.jar .

##  JAVA_CLASSPATH_ARGUMENT describes the command-line parameter
##  used to introduce a new classpath:

JAVA_CLASSPATH_ARGUMENT = -classpath

##  JAVA_CLASSPATH_SEPARATOR describes the character used to mark
##  one path element from another:

JAVA_CLASSPATH_SEPARATOR = :

##  JAVA_BENCHMARK_TIME describes the number of seconds for which
##  to run Java benchmarks.  A longer time yields a more accurate
##  benchmark, but consumes more otherwise useful CPU time.
##  If this time is zero or undefined, no Java benchmarks will be run.

JAVA_BENCHMARK_TIME = 2

##  If your JVM requires any special arguments not mentioned in
##  the options above, then give them here.

JAVA_EXTRA_ARGUMENTS =
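
##  Example (commented out): a site with a hypothetical shared jar at
##  /usr/local/lib/site.jar could expose it to Java universe jobs and
##  raise the JVM heap limit; both values below are illustrative:
#JAVA_CLASSPATH_DEFAULT = $(LIB) $(LIB)/scimark2lib.jar /usr/local/lib/site.jar .
#JAVA_EXTRA_ARGUMENTS = -Xmx256m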

##--------------------------------------------------------------------
##  Condor-G settings
##--------------------------------------------------------------------
##  Where is the GridManager binary installed?

GRIDMANAGER = $(SBIN)/condor_gridmanager
GT2_GAHP = $(SBIN)/gahp_server
GRID_MONITOR = $(SBIN)/grid_monitor.sh

##--------------------------------------------------------------------
##  Settings that control the daemon's debugging output:
##--------------------------------------------------------------------
##
##  Note that the Gridmanager runs as the User, not a Condor daemon, so
##  all users must have write permission to the directory that the
##  Gridmanager will use for its logfile.  Our suggestion is to create a
##  directory called GridLogs in $(LOG) with UNIX permissions 1777
##  (just like /tmp ).
##  Another option is to use /tmp as the location of the GridManager log.
##

MAX_GRIDMANAGER_LOG = 1000000
GRIDMANAGER_DEBUG =

GRIDMANAGER_LOG = $(LOG)/GridmanagerLog.$(USERNAME)
GRIDMANAGER_LOCK = $(LOCK)/GridmanagerLock.$(USERNAME)

##--------------------------------------------------------------------
##  Various other settings that Condor-G can use.
##--------------------------------------------------------------------

##  For grid-type gt2 jobs (pre-WS GRAM), limit the number of jobmanager
##  processes the gridmanager will let run on the headnode.  Letting too
##  many jobmanagers run causes severe load on the headnode.
GRIDMANAGER_MAX_JOBMANAGERS_PER_RESOURCE = 10

##  If we're talking to a Globus 2.0 resource, Condor-G will use the new
##  version of the GRAM protocol.  The first option is how often to check
##  the proxy on the submit side of things.  If the GridManager discovers
##  a new proxy, it will restart itself and use the new proxy for all
##  future jobs launched.  In seconds.
#GRIDMANAGER_CHECKPROXY_INTERVAL = 600

##  The GridManager will shut things down 3 minutes before losing contact
##  because of an expired proxy.
##  In seconds, and defaults to 3 minutes.
#GRIDMANAGER_MINIMUM_PROXY_TIME = 180

##  Condor requires that each submitted job be designated to run under a
##  particular "universe".
##
##  If no universe is specified in the submit file, Condor must pick one
##  for the job to use.  By default, it chooses the "vanilla" universe.
##  The default can be overridden in the config file with the
##  DEFAULT_UNIVERSE setting, which is a string to insert into a job
##  submit description if the job does not try to define its own universe.
##
#DEFAULT_UNIVERSE = vanilla

#
# The Cred_min_time_left is the first-pass at making sure that Condor-G
# does not submit your job without it having enough time left for the
# job to finish.  For example, if you have a job that runs for 20 minutes,
# and you might spend 40 minutes in the queue, it's a bad idea to submit
# with less than an hour left before your proxy expires.
# 2 hours seemed like a reasonable default.
#
CRED_MIN_TIME_LEFT = 120


##
##  The GridMonitor allows you to submit many more jobs to a GT2 GRAM
##  server than is normally possible.
#ENABLE_GRID_MONITOR = TRUE

##
##  When an error occurs with the GridMonitor, how long should the
##  gridmanager wait before trying to submit a new GridMonitor job?
##  The default is 1 hour (3600 seconds).
#GRID_MONITOR_DISABLE_TIME = 3600

##
##  The location of the wrapper for invoking the
##  Condor GAHP server
##
CONDOR_GAHP = $(SBIN)/condor_c-gahp
CONDOR_GAHP_WORKER = $(SBIN)/condor_c-gahp_worker_thread

##
##  The Condor GAHP server has its own log.  Like the Gridmanager, the
##  GAHP server is run as the User, not a Condor daemon, so all users
##  must have write permission to the directory used for the logfile.
##  Our suggestion is to create a directory called GridLogs in $(LOG)
##  with UNIX permissions 1777 (just like /tmp ).
##  Another option is to use /tmp as the location of the CGAHP log.
##
MAX_C_GAHP_LOG = 1000000

#C_GAHP_LOG = $(LOG)/GridLogs/CGAHPLog.$(USERNAME)
C_GAHP_LOG = /tmp/CGAHPLog.$(USERNAME)
C_GAHP_LOCK = /tmp/CGAHPLock.$(USERNAME)
C_GAHP_WORKER_THREAD_LOG = /tmp/CGAHPWorkerLog.$(USERNAME)
C_GAHP_WORKER_THREAD_LOCK = /tmp/CGAHPWorkerLock.$(USERNAME)

##
##  The location of the wrapper for invoking the
##  GT4 GAHP server
##
GT4_GAHP = $(SBIN)/gt4_gahp

##
##  The location of GT4 files.  This should normally be lib/gt4
##
GT4_LOCATION = $(LIB)/gt4

##
##  The location of the wrapper for invoking the
##  GT4.2 GAHP server
##
GT42_GAHP = $(SBIN)/gt42_gahp

##
##  The location of GT4.2 files.  This should normally be lib/gt42
##
GT42_LOCATION = $(LIB)/gt42

##
##  gt4 gram requires a gridftp server to perform file transfers.
##  If GRIDFTP_URL_BASE is set, then Condor assumes there is a gridftp
##  server set up at that URL suitable for its use.  Otherwise, Condor
##  will start its own gridftp servers as needed, using the binary
##  pointed at by GRIDFTP_SERVER.  GRIDFTP_SERVER_WRAPPER points to a
##  wrapper script needed to properly set the path to the gridmap file.
##
#GRIDFTP_URL_BASE = gsiftp://$(FULL_HOSTNAME)
GRIDFTP_SERVER = $(LIBEXEC)/globus-gridftp-server
GRIDFTP_SERVER_WRAPPER = $(LIBEXEC)/gridftp_wrapper.sh

##
##  Location of the PBS/LSF gahp and its associated binaries
##
GLITE_LOCATION = $(LIBEXEC)/glite
PBS_GAHP = $(GLITE_LOCATION)/bin/batch_gahp
LSF_GAHP = $(GLITE_LOCATION)/bin/batch_gahp

##
##  The location of the wrapper for invoking the Unicore GAHP server
##
UNICORE_GAHP = $(SBIN)/unicore_gahp

##
##  The location of the wrapper for invoking the NorduGrid GAHP server
##
NORDUGRID_GAHP = $(SBIN)/nordugrid_gahp

##  The location of the CREAM GAHP server
CREAM_GAHP = $(SBIN)/cream_gahp

##  Condor-G and CredD can use MyProxy to refresh GSI proxies which are
##  about to expire.
#MYPROXY_GET_DELEGATION = /path/to/myproxy-get-delegation

##  The location of the Deltacloud GAHP server
DELTACLOUD_GAHP = $(SBIN)/deltacloud_gahp

##
##  EC2: Universe = Grid, Grid_Resource = Amazon
##

##  The location of the amazon_gahp program, required
AMAZON_GAHP = $(SBIN)/amazon_gahp

##  Location of log files, useful for debugging, must be in
##  a directory writable by any user, such as /tmp
#AMAZON_GAHP_DEBUG = D_FULLDEBUG
AMAZON_GAHP_LOG = /tmp/AmazonGahpLog.$(USERNAME)

##  The number of seconds between status update requests to EC2.  You
##  can make this short (5 seconds) if you want Condor to respond
##  quickly to instances as they terminate, or you can make it long
##  (300 seconds = 5 minutes) if you know your instances will run for
##  a while and don't mind a delay between when they stop and when
##  Condor responds to them stopping.
GRIDMANAGER_JOB_PROBE_INTERVAL = 300

##  As of this writing Amazon EC2 has a hard limit of 20 concurrently
##  running instances, so a limit of 20 is imposed so the GridManager
##  does not waste its time sending requests that will be rejected.
GRIDMANAGER_MAX_SUBMITTED_JOBS_PER_RESOURCE_AMAZON = 20
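
##  Example (commented out): a pool that wants fast turnaround on
##  short-lived EC2 instances could lower the probe interval instead;
##  the 5-second value below is illustrative, and more frequent probes
##  mean more requests to EC2:
#GRIDMANAGER_JOB_PROBE_INTERVAL = 5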

##--------------------------------------------------------------------
##  condor_credd credential management daemon
##--------------------------------------------------------------------
##  Where is the CredD binary installed?
CREDD = $(SBIN)/condor_credd

##  When the credd starts up, it can place its address (IP and port)
##  into a file.  This way, tools running on the local machine don't
##  need an additional "-n host:port" option to find the credd.  This
##  feature can be turned off by commenting out this setting.
CREDD_ADDRESS_FILE = $(LOG)/.credd_address

##  Specify a remote credd server here,
#CREDD_HOST = $(CONDOR_HOST):$(CREDD_PORT)

##  CredD startup arguments
##  Start the CredD on a well-known port.  Uncomment to simplify
##  connecting to a remote CredD.  Note: this interface may change
##  in a future release.
CREDD_PORT = 9620
CREDD_ARGS = -p $(CREDD_PORT) -f

##  CredD daemon debugging log
CREDD_LOG = $(LOG)/CredLog
CREDD_DEBUG = D_FULLDEBUG
MAX_CREDD_LOG = 4000000
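
##  Example (commented out): a submit machine that should use the
##  pool's central CredD rather than running its own could point at it
##  with CREDD_HOST; "central-manager.example.edu" is a placeholder:
#CREDD_HOST = central-manager.example.edu:$(CREDD_PORT)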

##  The credential owner submits the credential.  This list specifies
##  other users who are also permitted to see all credentials.  Defaults
##  to root on Unix systems, and Administrator on Windows systems.
#CRED_SUPER_USERS =

##  Credential storage location.  This directory must exist
##  prior to starting condor_credd.  It is highly recommended to
##  restrict access permissions to _only_ the directory owner.
CRED_STORE_DIR = $(LOCAL_DIR)/cred_dir

##  Index file path of saved credentials.
##  This file will be automatically created if it does not exist.
#CRED_INDEX_FILE = $(CRED_STORE_DIR)/cred-index

##  condor_credd will attempt to refresh credentials when their
##  remaining lifespan is less than this value.  Units = seconds.
#DEFAULT_CRED_EXPIRE_THRESHOLD = 3600

##  condor_credd periodically checks the remaining lifespan of stored
##  credentials, at this interval.  Units = seconds.
#CRED_CHECK_INTERVAL = 60

##--------------------------------------------------------------------
##  Stork data placement server
##--------------------------------------------------------------------
##  Where is the Stork binary installed?
STORK = $(SBIN)/stork_server

##  When Stork starts up, it can place its address (IP and port)
##  into a file.  This way, tools running on the local machine don't
##  need an additional "-n host:port" option to find Stork.  This
##  feature can be turned off by commenting out this setting.
STORK_ADDRESS_FILE = $(LOG)/.stork_address

##  Specify a remote Stork server here,
#STORK_HOST = $(CONDOR_HOST):$(STORK_PORT)

##  STORK_LOG_BASE specifies the basename for heritage Stork log files.
##  Stork uses this macro to create the following output log files:
##  $(STORK_LOG_BASE): Stork server job queue classad collection
##  journal file.
##  $(STORK_LOG_BASE).history: Used to track completed jobs.
##  $(STORK_LOG_BASE).user_log: User level log, also used by DAGMan.
STORK_LOG_BASE = $(LOG)/Stork

##  Modern Condor DaemonCore logging feature.
STORK_LOG = $(LOG)/StorkLog
STORK_DEBUG = D_FULLDEBUG
MAX_STORK_LOG = 4000000

##  Stork startup arguments
##  Start Stork on a well-known port.  Uncomment to simplify
##  connecting to a remote Stork.  Note: this interface may change
##  in a future release.
#
STORK_PORT = 9621
STORK_ARGS = -p $(STORK_PORT) -f -Serverlog $(STORK_LOG_BASE)

##  Stork environment.  Stork modules may require external programs and
##  shared object libraries.  These are located using the PATH and
##  LD_LIBRARY_PATH environments.  Further, some modules may require
##  further specific environments.  By default, Stork inherits a full
##  environment when invoked from condor_master or the shell.  If the
##  default environment is not adequate for all Stork modules, specify
##  a replacement environment here.  This environment will be set by
##  condor_master before starting Stork, but does not apply if Stork is
##  started directly from the command line.
#STORK_ENVIRONMENT =

##  Limits the number of concurrent data placements handled by Stork.
#STORK_MAX_NUM_JOBS = 5

##  Limits the number of retries for a failed data placement.
#STORK_MAX_RETRY = 5

##  Limits the run time for a data placement job, after which the
##  placement is considered failed.
#STORK_MAXDELAY_INJOB = 36000

##  Temporary credential storage directory used by Stork.
#STORK_TMP_CRED_DIR = /tmp

##  Directory containing Stork modules.
#STORK_MODULE_DIR = $(LIBEXEC)

##--------------------------------------------------------------------
##  Quill Job Queue Mirroring Server
##--------------------------------------------------------------------
##  Where is the Quill binary installed and what arguments should be
##  passed?
QUILL = $(SBIN)/condor_quill
#QUILL_ARGS =

# Where is the log file for the quill daemon?
QUILL_LOG = $(LOG)/QuillLog

# The identification and location of the quill daemon for local clients.
QUILL_ADDRESS_FILE = $(LOG)/.quill_address

# If this is set to true, then the rest of the QUILL arguments must be
# defined for quill to function.  If it is False or left undefined, then
# quill will not be consulted by either the scheduler or the tools.  In
# the case of a remote quill query where the local client has quill
# turned off, but the remote client has quill turned on, things will
# still function normally.
#QUILL_ENABLED = TRUE

#
# If Quill is enabled, by default it will only mirror the current job
# queue into the database.  For historical jobs, and classads from other
# sources, the SQL Log must be enabled.
#QUILL_USE_SQL_LOG = FALSE

#
# The SQL Log can be enabled on a per-daemon basis.  For example, to
# collect historical job information, but not collect classads from
# other daemons, uncomment these two lines:
#SCHEDD.QUILL_USE_SQL_LOG = TRUE
#QUILL_USE_SQL_LOG = FALSE

# This will be the name of a quill daemon using this config file.  This
# name should not conflict with any other quill name--or schedd name.
#QUILL_NAME = quill@postgresql-server.machine.com

# The PostgreSQL server requires usernames that can manipulate tables.
# This will be the username associated with this instance of the quill
# daemon mirroring a schedd's job queue.  Each quill daemon must have a
# unique database account associated with it, otherwise multiple quill
# daemons will corrupt the data held under an identical user name.
#QUILL_DB_NAME = name_of_db

# The required password for the DB user which quill will use to read
# information from the database about the queue.
#QUILL_DB_QUERY_PASSWORD = foobar

# What kind of database server is this?
# For now, only PGSQL is supported.
#QUILL_DB_TYPE = PGSQL

# The machine and port of the postgres server.
# Although this says IP Addr, it can be a DNS name.
# It must match whatever format you used for the .pgpass file, however.
#QUILL_DB_IP_ADDR = machine.domain.com:5432

# The login to use to attach to the database for updating information.
# There should be an entry in file $SPOOL/.pgpass that gives the password
# for this login id.
#QUILL_DB_USER = quillwriter

# Polling period, in seconds, for when quill reads transactions out of
# the schedd's job queue log file and puts them into the database.
#QUILL_POLLING_PERIOD = 10

# Allows or disallows a remote query to the quill daemon and database
# which is reading this log file.  Defaults to true.
#QUILL_IS_REMOTELY_QUERYABLE = TRUE

# Add debugging flags here if you need to debug quill for some reason.
#QUILL_DEBUG = D_FULLDEBUG

# Number of seconds the master should wait for the Quill daemon to
# respond before killing it.  This number might need to be increased for
# very large logfiles.
# The default is 3600 (one hour), but kicking it up to a few hours won't
# hurt.
#QUILL_NOT_RESPONDING_TIMEOUT = 3600

# Should Quill hold open a database connection to the DBMSD?
# Each open connection consumes resources at the server, so large pools
# (100 or more machines) should set this variable to FALSE.  Note the
# default is TRUE.
#QUILL_MAINTAIN_DB_CONN = TRUE

##--------------------------------------------------------------------
##  Database Management Daemon settings
##--------------------------------------------------------------------
##  Where is the DBMSd binary installed and what arguments should be
##  passed?
DBMSD = $(SBIN)/condor_dbmsd
DBMSD_ARGS = -f

# Where is the log file for the dbmsd daemon?
DBMSD_LOG = $(LOG)/DbmsdLog

# Interval between consecutive purging calls (in seconds)
#DATABASE_PURGE_INTERVAL = 86400

# Interval between consecutive database reindexing operations.
# This is only used when dbtype = PGSQL.
#DATABASE_REINDEX_INTERVAL = 86400

# Number of days before purging resource classad history.
# This includes things like machine ads, daemon ads, submitters.
#QUILL_RESOURCE_HISTORY_DURATION = 7

# Number of days before purging job run information.
# This includes job events, file transfers, matchmaker matches, etc.
# This does NOT include the final job ad.  condor_history does not need
# any of this information to work.
#QUILL_RUN_HISTORY_DURATION = 7

# Number of days before purging job classad history.
# This is the information needed to run condor_history.
#QUILL_JOB_HISTORY_DURATION = 3650

# DB size threshold for warning the condor administrator.  This is
# checked after every purge.  The size is given in gigabytes.
#QUILL_DBSIZE_LIMIT = 20

# Number of seconds the master should wait for the DBMSD to respond
# before killing it.  This number might need to be increased for very
# large databases.  The default is 3600 (one hour).
#DBMSD_NOT_RESPONDING_TIMEOUT = 3600

##--------------------------------------------------------------------
##  VM Universe Parameters
##--------------------------------------------------------------------
##  Where is the Condor VM-GAHP installed? (Required)
VM_GAHP_SERVER = $(SBIN)/condor_vm-gahp

##  If the VM-GAHP is to have its own log, define
##  the location of the log file.
##
##  Optionally, if you do NOT define VM_GAHP_LOG, logs of the VM-GAHP
##  will be stored in the starter's log file.
##  However, on Windows machines you must always define VM_GAHP_LOG.
#
VM_GAHP_LOG = $(LOG)/VMGahpLog
MAX_VM_GAHP_LOG = 1000000
#VM_GAHP_DEBUG = D_FULLDEBUG

##  What kind of virtual machine program will be used for
##  the VM universe?
##  The two options are vmware and xen.  (Required)
#VM_TYPE = vmware

##  How much memory can be used for the VM universe? (Required)
##  This value is the maximum amount of memory that can be used by the
##  virtual machine program.
#VM_MEMORY = 128

##  Want to support networking for the VM universe?
##  Default value is FALSE.
#VM_NETWORKING = FALSE

##  What kinds of networking types are supported?
##
##  If you set VM_NETWORKING to TRUE, you must define this parameter.
##  VM_NETWORKING_TYPE = nat
##  VM_NETWORKING_TYPE = bridge
##  VM_NETWORKING_TYPE = nat, bridge
##
##  If multiple networking types are defined, you may define
##  VM_NETWORKING_DEFAULT_TYPE as the default networking type.
##  Otherwise, nat is used as the default networking type.
##  VM_NETWORKING_DEFAULT_TYPE = nat
#VM_NETWORKING_DEFAULT_TYPE = nat
#VM_NETWORKING_TYPE = nat

##  By default, the number of possible virtual machines is the same as
##  NUM_CPUS.
##  Since too many virtual machines can cause the system to be too slow
##  and lead to unexpected problems, limit the number of running
##  virtual machines on this machine with
#VM_MAX_NUMBER = $(NUM_CPUS)
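
##  Example (commented out): a minimal sketch of enabling the VM
##  universe with VMware, NAT networking, and a 256 MB memory cap;
##  values are illustrative and assume VMware is installed locally:
#VM_TYPE = vmware
#VM_MEMORY = 256
#VM_NETWORKING = TRUE
#VM_NETWORKING_TYPE = nat
#VM_MAX_NUMBER = 2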
- | |||
- | ## When a VM universe job is started, a status command is sent | ||
- | ## to the VM-GAHP to see if the job is finished. | ||
- | ## If the interval between checks is too short, it will consume | ||
- | ## too much of the CPU. If the VM-GAHP fails to get status 5 times in a row, | ||
- | ## an error will be reported to startd, and then startd will check | ||
- | ## the availability of VM universe. | ||
- | ## Default value is 60 seconds and minimum value is 30 seconds | ||
- | # | ||
- | |||
- | ## How long will we wait for a request sent to the VM-GAHP to be completed? | ||
- | ## If a request is not completed within the timeout, an error will be reported | ||
- | ## to the startd, and then the startd will check | ||
- | ## the availability of vm universe. | ||
- | # | ||
- | |||
- | ## When VMware or Xen causes an error, the startd will disable the | ||
- | ## VM universe. | ||
- | ## we will test one more | ||
- | ## whether vm universe is still unavailable after some time. | ||
- | ## In default, startd will recheck vm universe after 10 minutes. | ||
- | ## If the test also fails, vm universe will be disabled. | ||
- | # | ||
- | |||
- | ## Usually, when we suspend a VM, the memory being used by the VM | ||
- | ## will be saved into a file and then freed. | ||
- | ## However, when we use soft suspend, neither saving nor memory freeing | ||
- | ## will occur. | ||
- | ## For VMware, we send SIGSTOP to a process for VM in order to | ||
- | ## stop the VM temporarily and send SIGCONT to resume the VM. | ||
- | ## For Xen, we pause CPU. Pausing CPU doesn' | ||
- | ## into a file. It only stops the execution of a VM temporarily. | ||
- | # | ||
- | |||
- | ## If Condor runs as root and a job comes from a different UID domain, | ||
- | ## Condor generally uses " | ||
- | ## If " | ||
- | ## as the user defined in " | ||
- | ## | ||
- | ## Notice: In VMware VM universe, " | ||
- | ## So we need to define " | ||
- | ## For VMware, the user defined in " | ||
- | ## home directory. | ||
- | ## If neither " | ||
- | ## VMware VM universe job will run as " | ||
- | ## As a result, the preference of local users for a VMware VM universe job | ||
- | ## which comes from the different UID domain is | ||
- | ## " | ||
- | # | ||
- | |||
- | ## If Condor runs as root and " | ||
- | ## all VM universe jobs will run as a user defined in " | ||
- | # | ||
- | |||
- | ## | ||
- | ## VM Universe Parameters Specific to VMware | ||
- | ## | ||
- | |||
- | ## Where is the perl program? (Required) | ||
- | VMWARE_PERL = perl | ||
- | |||
- | ## Where is the Condor script program to control VMware? (Required) | ||
- | VMWARE_SCRIPT = $(SBIN)/ | ||
- | |||
- | ## Networking parameters for VMware | ||
- | ## | ||
- | ## What kind of VMware networking is used? | ||
- | ## | ||
- | ## If multiple networking types are defined, you may specify different | ||
- | ## parameters for each networking type. | ||
- | ## | ||
- | ## Examples | ||
- | ## (e.g.) VMWARE_NAT_NETWORKING_TYPE = nat | ||
- | ## (e.g.) VMWARE_BRIDGE_NETWORKING_TYPE = bridged | ||
- | ## | ||
- | ## If there is no parameter for specific networking type, VMWARE_NETWORKING_TYPE is used. | ||
- | ## | ||
- | # | ||
- | # | ||
- | VMWARE_NETWORKING_TYPE = nat | ||
- | |||
- | ## The contents of this file will be inserted into the .vmx file of | ||
- | ## the VMware virtual machine before Condor starts it. | ||
- | # | ||
- | |||
- | ## | ||
- | ## VM Universe Parameters common to libvirt controlled vm's (xen & kvm) | ||
- | ## | ||
- | |||
- | ## Networking parameters for Xen & KVM | ||
- | ## | ||
- | ## This is the path to the XML helper command; the libvirt_simple_script.awk | ||
- | ## script just reproduces what Condor already does for the kvm/xen VM | ||
- | ## universe | ||
- | LIBVIRT_XML_SCRIPT = $(LIBEXEC)/ | ||
- | |||
- | ## This is the optional debugging output file for the xml helper | ||
- | ## script. | ||
- | ## write them to the file specified by this argument, which will be | ||
- | ## passed as the second command line argument when the script is | ||
- | ## executed | ||
- | |||
- | # | ||
- | |||
- | ## | ||
- | ## VM Universe Parameters Specific to Xen | ||
- | ## | ||
- | |||
- | ## Where is the bootloader for the Xen domainU? (Required) | ||
- | ## | ||
- | ## The bootloader will be used in the case that a kernel image includes | ||
- | ## a disk image | ||
- | # | ||
- | |||
- | ## The contents of this file will be added to the Xen virtual machine | ||
- | ## description that Condor writes. | ||
- | # | ||
- | |||
- | ## | ||
- | ## | ||
- | ## condor_lease_manager lease manager daemon | ||
- | ## | ||
- | ## Where is the LeaseManager binary installed? | ||
- | LeaseManager = $(SBIN)/ | ||
- | |||
- | # Turn on the lease manager | ||
- | # | ||
- | |||
- | # The identification and location of the lease manager for local clients. | ||
- | LeaseManager_ADDRESS_FILE = $(LOG)/ | ||
- | |||
- | ## LeaseManager startup arguments | ||
- | # | ||
- | |||
- | ## LeaseManager daemon debugging log | ||
- | LeaseManager_LOG = $(LOG)/ | ||
- | LeaseManager_DEBUG = D_FULLDEBUG | ||
- | MAX_LeaseManager_LOG = 1000000 | ||
- | |||
- | # Basic parameters | ||
- | LeaseManager.GETADS_INTERVAL = 60 | ||
- | LeaseManager.UPDATE_INTERVAL = 300 | ||
- | LeaseManager.PRUNE_INTERVAL = 60 | ||
- | LeaseManager.DEBUG_ADS = False | ||
- | |||
- | LeaseManager.CLASSAD_LOG = $(SPOOL)/ | ||
- | # | ||
- | # | ||
- | # | ||
- | |||
- | ## | ||
- | ## | ||
- | ## KBDD - keyboard activity detection daemon | ||
- | ## | ||
- | ## When the KBDD starts up, it can place its address (IP and port) | ||
- | ## into a file. This way, tools running on the local machine don't | ||
- | ## need an additional "-n host: | ||
- | ## feature can be turned off by commenting out this setting. | ||
- | KBDD_ADDRESS_FILE = $(LOG)/ | ||
- | |||
- | ## | ||
- | ## | ||
- | ## condor_ssh_to_job | ||
- | ## | ||
- | # NOTE: condor_ssh_to_job is not supported under Windows. | ||
- | |||
- | # Tell the starter (execute side) whether to allow the job owner or | ||
- | # queue super user on the schedd from which the job was submitted to | ||
- | # use condor_ssh_to_job to access the job interactively (e.g. for | ||
- | # debugging). | ||
- | # | ||
- | |||
- | # Tell the schedd (submit side) whether to allow the job owner or | ||
- | # queue super user to use condor_ssh_to_job to access the job | ||
- | # interactively (e.g. for debugging). | ||
- | # defined. | ||
- | # | ||
- | |||
- | # Command condor_ssh_to_job should use to invoke the ssh client. | ||
- | # %h --> remote host | ||
- | # %i --> ssh key file | ||
- | # %k --> known hosts file | ||
- | # %u --> remote user | ||
- | # %x --> proxy command | ||
- | # %% --> % | ||
- | # | ||
- | |||
- | # Additional ssh clients may be configured. They all have the same | ||
- | # default as ssh, except for scp, which omits the %h: | ||
- | # | ||
- | |||
- | # Path to sshd | ||
- | # | ||
- | |||
- | # Arguments the starter should use to invoke sshd in inetd mode. | ||
- | # %f --> sshd config file | ||
- | # %% --> % | ||
- | # | ||
- | |||
- | # sshd configuration template used by condor_ssh_to_job_sshd_setup. | ||
- | # | ||
- | |||
- | # Path to ssh-keygen | ||
- | # | ||
- | |||
- | # Arguments to ssh-keygen | ||
- | # %f --> key file to generate | ||
- | # %% --> % | ||
- | # | ||
- | |||
- | ###################################################################### | ||
- | ## | ||
- | ## Condor HDFS | ||
- | ## | ||
- | ## This is the default local configuration file for configuring the Condor | ||
- | ## daemon responsible for running services related to the Hadoop | ||
- | ## distributed storage system. You should copy this file to the | ||
- | ## appropriate location and customize it for your needs. | ||
- | ## | ||
- | ## Unless otherwise specified, settings that are commented out show | ||
- | ## the defaults that are used if you don't define a value. | ||
- | ## that are defined here MUST BE DEFINED since they have no default | ||
- | ## value. | ||
- | ## | ||
- | ###################################################################### | ||
- | |||
- | ###################################################################### | ||
- | ## FOLLOWING MUST BE CHANGED | ||
- | ###################################################################### | ||
- | |||
- | ## The location of the Hadoop installation directory. The default location | ||
- | ## is under ' | ||
- | ## should contain a lib folder that contains all the required Jars necessary | ||
- | ## to run HDFS name and data nodes. | ||
- | #HDFS_HOME = $(RELEASE_DIR)/ | ||
- | |||
- | ## The host and port for hadoop' | ||
- | ## name node (see HDFS_SERVICES) then the specified port will be used | ||
- | ## to run name node. | ||
- | HDFS_NAMENODE = hdfs:// | ||
- | HDFS_NAMENODE_WEB = example.com: | ||
- | |||
- | HDFS_BACKUPNODE = hdfs:// | ||
- | HDFS_BACKUPNODE_WEB = example.com: | ||
- | |||
- | ## You need to pick one machine as name node by setting this parameter | ||
- | ## to HDFS_NAMENODE. The remaining machines in a storage cluster will | ||
- | ## act as data nodes (HDFS_DATANODE). | ||
- | HDFS_NODETYPE = HDFS_DATANODE | ||
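- | ## Example (a sketch): on the one machine chosen as the name node, | ||
- | ## set this instead: | ||
- | # HDFS_NODETYPE = HDFS_NAMENODE | ||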
- | |||
- | ## If the machine is selected to be the NameNode, then a role should be | ||
- | ## defined. If it is selected to be a DataNode, this parameter is ignored. | ||
- | ## Available options: | ||
- | ## ACTIVE: Active NameNode role (default value) | ||
- | ## BACKUP: Always synchronized with the active NameNode state, thus | ||
- | ## | ||
- | ## | ||
- | ## CHECKPOINT: Periodically creates checkpoints of the namespace. | ||
- | HDFS_NAMENODE_ROLE = ACTIVE | ||
- | |||
- | ## The two set of directories that are required by HDFS are for name | ||
- | ## node (HDFS_NAMENODE_DIR) and data node (HDFS_DATANODE_DIR). The | ||
- | ## directory for name node is only required for a machine running | ||
- | ## name node service and is used to store critical meta data for | ||
- | ## files. The data node needs its directory to store file blocks and | ||
- | ## their replicas. | ||
- | HDFS_NAMENODE_DIR = / | ||
- | HDFS_DATANODE_DIR = / | ||
- | |||
- | ## Unlike name node address settings (HDFS_NAMENODE), | ||
- | ## well known across the storage cluster, a data node can run on any | ||
- | ## arbitrary port of a given host. | ||
- | # | ||
- | |||
- | #################################################################### | ||
- | ## OPTIONAL | ||
- | ##################################################################### | ||
- | |||
- | ## Sets the log4j debug level. All the emitted debug output from HDFS | ||
- | ## will go in ' | ||
- | # | ||
- | |||
- | ## Access to the HDFS services, both the name node and the data node, can | ||
- | ## be restricted by specifying IP/host based filters. By default, settings | ||
- | ## from ALLOW_READ/ | ||
- | ## are used to specify allow and deny list. The below two parameters can | ||
- | ## be used to override these settings. Read the Condor manual for | ||
- | ## specification of these filters. | ||
- | ## WARN: HDFS doesn' | ||
- | # | ||
- | # | ||
- | |||
- | # Fully qualified names for the name node and data node classes. | ||
- | # | ||
- | # | ||
- | # | ||
- | |||
- | ## In case an old name for hdfs configuration files is required. | ||
- | # | ||
- | |||
- | </file> | ||
- | ===== Condor Master Local Configuration File ===== | ||
- | <file autoconf condor_config.local>## | ||
- | CONDOR_HOST = john.cs.wlu.edu | ||
- | |||
- | ## Where is the local condor directory for each host? | ||
- | ## This is where the local config file(s), logs and | ||
- | ## spool/ | ||
- | LOCAL_DIR = / | ||
- | |||
- | ## Mail parameters: | ||
- | ## When something goes wrong with condor at your site, who should get | ||
- | ## the email? | ||
- | CONDOR_ADMIN = kollerg14@mail.wlu.edu | ||
- | |||
- | ## Full path to a mail delivery program that understands that " | ||
- | ## means you want to specify a subject: | ||
- | MAIL = /bin/mailx | ||
- | |||
- | ## Network domain parameters: | ||
- | ## Internet domain of machines sharing a common UID space. | ||
- | ## machines don't share a common UID space, set it to | ||
- | ## UID_DOMAIN = $(FULL_HOSTNAME) | ||
- | ## to specify that each machine has its own UID space. | ||
- | UID_DOMAIN = cs.wlu.edu | ||
- | |||
- | ## Internet domain of machines sharing a common file system. | ||
- | ## If your machines don't use a network file system, set it to | ||
- | ## FILESYSTEM_DOMAIN = $(FULL_HOSTNAME) | ||
- | ## to specify that each machine has its own file system. | ||
- | FILESYSTEM_DOMAIN = cs.wlu.edu | ||
- | |||
- | ## The user/group ID < | ||
- | ## (this can also be specified in the environment) | ||
- | ## Note: the CONDOR_IDS setting is ignored on Win32 platforms | ||
- | CONDOR_IDS = 201.481 | ||
- | |||
- | ## Condor needs to create a few lock files to synchronize access to | ||
- | ## various log files. | ||
- | ## filesystems and file locking over the years, we HIGHLY recommend | ||
- | ## that you put these lock files on a local partition on each | ||
- | ## machine. | ||
- | ## be sure to change this entry. | ||
- | ## running as needs to have write access to this directory. | ||
- | ## you're not running as root, this is whatever user you started up | ||
- | ## the condor_master as. If you are running as root, and there' | ||
- | ## condor account, it's probably condor. | ||
- | ## you've set in the CONDOR_IDS environment variable. | ||
- | ## manual for details on this. | ||
- | LOCK = / | ||
- | DAEMON_LIST = COLLECTOR, MASTER, NEGOTIATOR, SCHEDD, STARTD, KBDD | ||
- | |||
- | ## Java parameters: | ||
- | ## If you would like this machine to be able to run Java jobs, | ||
- | ## then set JAVA to the path of your JVM binary. | ||
- | ## interested in Java, there is no harm in leaving this entry | ||
- | ## empty or incorrect. | ||
- | JAVA = / | ||
- | JAVA_MAXHEAP_ARGUMENT = -Xmx1024m | ||
- | |||
- | # Designate which machines are members of this pool. | ||
- | PoolMembers = carl.cs.wlu.edu, | ||
- | # Allow machines to check the status of Condor | ||
- | ALLOW_READ = $(ALLOW_READ), | ||
- | # Allow machines to join this pool | ||
- | ALLOW_WRITE = $(ALLOW_WRITE), | ||
- | FLOCK_FROM = $(PoolMembers) | ||
- | |||
- | # Enable debugging of Class Ads | ||
- | LeaseManager.DEBUG_ADS = True</file> | ||
- | |||
- | ===== Worker Local Configuration File ===== | ||
- | <file autoconf condor_config.local> | ||
- | CONDOR_DEVELOPERS = NONE | ||
- | CONDOR_HOST = $(PoolMaster) | ||
- | COLLECTOR_NAME = Orion | ||
- | |||
- | # If the job-submitting user is listed here, start the job regardless of | ||
- | # who might be using the computer at the time. | ||
- | IsGreedyUser = (Owner == " | ||
- | || Owner == " | ||
- | || Owner == " | ||
- | || Owner == " | ||
- | || Owner == " | ||
- | || Owner == " | ||
- | START = ( ( (KeyboardIdle > $(StartIdleTime)) \ | ||
- | && ( $(CPUIdle) || \ | ||
- | | ||
- | || $(IsGreedyUser) ) | ||
- | SUSPEND = FALSE | ||
- | PREEMPT = FALSE | ||
- | KILL = FALSE | ||
- | |||
- | DAEMON_LIST = MASTER, STARTD, KBDD | ||
- | NEGOTIATOR_INTERVAL = 20 | ||
- | TRUST_UID_DOMAIN = TRUE | ||
- | |||
- | # Join the W&L CS Pool (Orion) | ||
- | FLOCK_TO = john.cs.wlu.edu | ||
- | ALLOW_WRITE = $(ALLOW_WRITE), | ||
- | |||
- | # Enable debugging of Class Ads | ||
- | LeaseManager.DEBUG_ADS = True</file>