scriptworker package

Submodules

scriptworker.client module

Scripts running in scriptworker will use functions in this file.

scriptworker.client.get_task(config)

Read the task.json from work_dir.

Parameters: config (dict) – the running config, to find work_dir.
Returns: the contents of task.json
Return type: dict
Raises: ScriptWorkerTaskException – on error.

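
As a rough illustration of what reading task.json from work_dir involves, here is a minimal standard-library sketch. The helper name read_task_json is hypothetical and is not part of scriptworker; unlike get_task, it does not wrap errors in ScriptWorkerTaskException.

```python
import json
import os
import tempfile


def read_task_json(config):
    """Return the parsed contents of work_dir/task.json (illustrative only)."""
    path = os.path.join(config["work_dir"], "task.json")
    with open(path, "r") as fh:
        return json.load(fh)


# Usage: write a small task.json into a temporary work_dir, then read it back.
work_dir = tempfile.mkdtemp()
with open(os.path.join(work_dir, "task.json"), "w") as fh:
    json.dump({"payload": {"example": True}}, fh)

task = read_task_json({"work_dir": work_dir})
```
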
scriptworker.client.validate_artifact_url(config, url)

Ensure a URL fits in given scheme, netloc, and path restrictions.

If valid_artifact_schemes, valid_artifact_netlocs, and/or valid_artifact_path_regexes are defined in config but are None, skip that check.

If any are missing from config, fall back to the values in DEFAULT_CONFIG.

If valid_artifact_path_regexes is not None, the url path should match one. Each regex should define a filepath, which is what we’ll return.

Otherwise, if we pass all checks, return the unmodified path.

If we fail any checks, raise a ScriptWorkerTaskException with malformed-payload.

Parameters:
  • config (dict) – the running config.
  • url (str) – the url of the artifact.
Returns:

the filepath of the path regex.

Return type:

str

Raises:

ScriptWorkerTaskException – on failure to validate.
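
The checks described for validate_artifact_url can be sketched with the standard library alone. This is an assumption-laden illustration, not the scriptworker implementation: the helper name check_artifact_url is hypothetical, it raises ValueError instead of ScriptWorkerTaskException, it treats a missing key like None rather than falling back to DEFAULT_CONFIG, and it assumes each path regex exposes the file path as a named group called filepath.

```python
import re
from urllib.parse import urlparse


def check_artifact_url(rules, url):
    """Return the file path for url, or raise ValueError if a check fails."""
    parts = urlparse(url)
    schemes = rules.get("valid_artifact_schemes")
    netlocs = rules.get("valid_artifact_netlocs")
    path_regexes = rules.get("valid_artifact_path_regexes")
    # A value of None means "skip this check".
    if schemes is not None and parts.scheme not in schemes:
        raise ValueError("invalid scheme: {}".format(parts.scheme))
    if netlocs is not None and parts.netloc not in netlocs:
        raise ValueError("invalid netloc: {}".format(parts.netloc))
    if path_regexes is None:
        # No path restrictions: return the path unmodified.
        return parts.path
    for regex in path_regexes:
        match = re.search(regex, parts.path)
        if match:
            # Assumption: the regex defines a named group called "filepath".
            return match.group("filepath")
    raise ValueError("invalid path: {}".format(parts.path))


rules = {
    "valid_artifact_schemes": ("https",),
    "valid_artifact_netlocs": ("queue.example.com",),
    "valid_artifact_path_regexes": (
        r"^/v1/task/[^/]+/artifacts/(?P<filepath>.*)$",
    ),
}
filepath = check_artifact_url(
    rules,
    "https://queue.example.com/v1/task/abc123/artifacts/public/build/target.zip",
)
```

With these example rules, filepath holds the portion of the URL path captured by the named group.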

scriptworker.client.validate_json_schema(data, schema, name='task')

Given data and a jsonschema, let’s validate it.

This happens for tasks and chain of trust artifacts.

Parameters:
  • data (dict) – the json to validate.
  • schema (dict) – the jsonschema to validate against.
  • name (str, optional) – the name of the json, for exception messages. Defaults to “task”.
Raises:

ScriptWorkerTaskException – on failure

scriptworker.config module

Config for scriptworker

scriptworker.config.log

logging.Logger – the log object for the module.

scriptworker.config.CREDS_FILES

tuple – an ordered list of files to look for taskcluster credentials, if they aren’t in the config file or environment.

scriptworker.config.check_config(config, path)

Validate the config against DEFAULT_CONFIG.

Any unknown keys or wrong types will add error messages.

Parameters:
  • config (dict) – the running config.
  • path (str) – the path to the config file, used in error messages.
Returns:

the error messages found when validating the config.

Return type:

list
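
To make "unknown keys or wrong types" concrete, the following sketch compares a config dict against a dict of defaults. The helper name find_config_errors, the message wording, and the example keys are all hypothetical; the real check_config may differ in both its checks and its messages.

```python
def find_config_errors(config, defaults, path):
    """Return a list of messages for unknown keys and mismatched types."""
    messages = []
    for key, value in config.items():
        if key not in defaults:
            messages.append("Unknown key {} in {}!".format(key, path))
            continue
        expected = type(defaults[key])
        if not isinstance(value, expected):
            messages.append(
                "{}: {} should be {}, not {}!".format(
                    path, key, expected.__name__, type(value).__name__
                )
            )
    return messages


defaults = {"work_dir": "/tmp/work", "poll_interval": 5}
errors = find_config_errors(
    {"work_dir": "/tmp/work", "poll_interval": "5", "typo_key": 1},
    defaults,
    "config.json",
)
```
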

scriptworker.config.create_config(path='config.json')

Create a config from DEFAULT_CONFIG, arguments, and config file.

Then validate it and freeze it.

Parameters: path (str, optional) – the path to the config file. Defaults to “config.json”.
Returns: (config dict, credentials dict)
Return type: tuple

scriptworker.config.freeze_values(dictionary)

Convert a dictionary’s list values into tuples, and dicts into frozendicts.

This won’t recurse; it’s best for relatively flat data structures.

Parameters: dictionary (dict) – the dictionary to modify in-place.
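
A non-recursive, in-place freeze of this kind might look like the sketch below. It is only an approximation: frozendict is a third-party type, so types.MappingProxyType is used here as a read-only stand-in, and the helper name freeze_flat_values is hypothetical.

```python
from types import MappingProxyType


def freeze_flat_values(dictionary):
    """Swap list values for tuples and dict values for read-only views, in place."""
    for key, value in list(dictionary.items()):
        if isinstance(value, list):
            dictionary[key] = tuple(value)
        elif isinstance(value, dict):
            # Stand-in for a frozendict: a read-only view over a copy of the dict.
            dictionary[key] = MappingProxyType(dict(value))


config = {
    "task_script": ["bash", "-c", "echo hi"],
    "env": {"A": "1"},
    "verbose": True,
}
freeze_flat_values(config)
```

Only the top level is converted, which matches the note above that the function is best for relatively flat data structures.
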
scriptworker.config.read_worker_creds(key='credentials')

Get credentials from CREDS_FILES or the environment.

This looks at the CREDS_FILES in order, and falls back to the environment.

Parameters: key (str, optional) – each CREDS_FILE is a json dict. This key’s value contains the credentials. Defaults to ‘credentials’.
Returns: the credentials found. None if no credentials found.
Return type: dict
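
The lookup order (files first, then the environment) can be sketched as follows. The helper name find_credentials is hypothetical, and the environment variable names and the shape of the returned dict are assumptions made for illustration; check the module source for the names it actually reads.

```python
import json
import os


def find_credentials(creds_files, environ, key="credentials"):
    """Return the first credentials dict found, or None."""
    for path in creds_files:
        if not os.path.exists(path):
            continue
        with open(path, "r") as fh:
            contents = json.load(fh)
        if key in contents:
            return contents[key]
    # Fallback: assumed environment variable names, for illustration only.
    client_id = environ.get("TASKCLUSTER_CLIENT_ID")
    access_token = environ.get("TASKCLUSTER_ACCESS_TOKEN")
    if client_id and access_token:
        return {"clientId": client_id, "accessToken": access_token}
    return None


creds = find_credentials(
    ("/nonexistent/secrets.json",),
    {"TASKCLUSTER_CLIENT_ID": "me", "TASKCLUSTER_ACCESS_TOKEN": "secret"},
)
```
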

scriptworker.context module

Most functions need access to a similar set of objects. Rather than having to pass them all around individually or create a monolithic ‘self’ object, let’s point to them from a single context object.

scriptworker.context.log

logging.Logger – the log object for the module.

class scriptworker.context.Context

Bases: object

Basic config holding object.

Avoids putting everything in single monolithic object, but allows for passing around config and easier overriding in tests.

config

dict – the running config. In production this will be a FrozenDict.

credentials_timestamp

int – the unix timestamp when we last updated our credentials.

poll_task_urls

dict – contains the Azure queues urls and an expires datestring.

proc

asyncio.subprocess.Process – when launching the script, this is the process object.

queue

taskcluster.async.Queue – the taskcluster Queue object containing the scriptworker credentials.

session

aiohttp.ClientSession – the default aiohttp session

task

dict – the task definition for the current task.

temp_queue

taskcluster.async.Queue – the taskcluster Queue object containing the task-specific temporary credentials.

claim_task

dict – The current or most recent claimTask definition json from the queue.

This contains the task definition, as well as other task-specific info.

When setting claim_task, we also set self.task and self.temp_credentials, zero out self.reclaim_task and self.proc, then write a task.json to disk.

config = None
create_queue(credentials)

Create a taskcluster queue.

Parameters: credentials (dict) – taskcluster credentials.

credentials

dict – The current scriptworker credentials, from the config or CREDS_FILES or environment.

When setting credentials, also create a new self.queue and update self.credentials_timestamp.

credentials_timestamp = None
poll_task_urls = None
proc = None
queue = None
reclaim_task

dict – The most recent reclaimTask definition.

This contains the newest expiration time and the newest temp credentials.

When setting reclaim_task, we also set self.temp_credentials.

reclaim_task will be None if there hasn’t been a claimed task yet, or if a task has been claimed more recently than the most recent reclaimTask call.

session = None
task = None
temp_credentials

dict – The latest temp credentials, or None if we haven’t claimed a task yet.

When setting, create self.temp_queue from the temp taskcluster creds.

temp_queue = None
write_json(path, contents, message)

Write json to disk.

Parameters:
  • path (str) – the path to write to
  • contents (dict) – the contents of the json blob
  • message (str) – the message to log
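
Several Context attributes above update related state when they are set (for example, setting credentials also creates a new queue and updates credentials_timestamp). A property setter is one way to get that behaviour. The toy class below is a sketch of the pattern only: MiniContext is hypothetical, and its create_queue returns a plain dict rather than a taskcluster Queue.

```python
import time


class MiniContext:
    """Toy stand-in for a context object whose setters keep related state in sync."""

    def __init__(self):
        self._credentials = None
        self.credentials_timestamp = None
        self.queue = None

    def create_queue(self, credentials):
        # Placeholder: a real implementation would build a taskcluster Queue here.
        return {"credentials": credentials}

    @property
    def credentials(self):
        return self._credentials

    @credentials.setter
    def credentials(self, creds):
        # Setting credentials refreshes the queue and the timestamp together.
        self._credentials = creds
        self.queue = self.create_queue(creds)
        self.credentials_timestamp = int(time.time())


context = MiniContext()
context.credentials = {"clientId": "example", "accessToken": "example"}
```
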

scriptworker.exceptions module

scriptworker exceptions

exception scriptworker.exceptions.DownloadError(msg)

Bases: scriptworker.exceptions.ScriptWorkerTaskException

exception scriptworker.exceptions.ScriptWorkerException

Bases: Exception

The base exception in scriptworker.

When raised inside of the run_loop loop, set the taskcluster task status to at least self.exit_code.

exit_code

int – this is set to 5 (internal-error).

exit_code = 5
exception scriptworker.exceptions.ScriptWorkerGPGException

Bases: scriptworker.exceptions.ScriptWorkerException

Scriptworker GPG error.

exit_code

int – this is set to 5 (internal-error).

exit_code = 5
exception scriptworker.exceptions.ScriptWorkerRetryException

Bases: scriptworker.exceptions.ScriptWorkerException

ScriptWorkerRetryException.

exit_code

int – this is set to 4 (resource-unavailable)

exit_code = 4
exception scriptworker.exceptions.ScriptWorkerTaskException(*args, exit_code=1, **kwargs)

Bases: scriptworker.exceptions.ScriptWorkerException

To use:

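import sys
import traceback

from scriptworker.exceptions import ScriptWorkerTaskException

try:
    ...
except ScriptWorkerTaskException as exc:
    # Print the traceback, then exit with the status carried by the exception.
    traceback.print_exc()
    sys.exit(exc.exit_code)

Parameters: exit_code (int, optional) – The exit_code we should exit with when this exception is raised. Defaults to 1 (failure).
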
exit_code

int – this is 1 by default (failure)

scriptworker.gpg module

GPG functions. These currently assume gpg 2.0.x

These GPG functions expose considerable functionality over gpg key management, data signatures, and validation, but by no means are they intended to cover all gnupg functionality. They are intended for automated key management and validation for scriptworker.

scriptworker.gpg.log

logging.Logger – the log object for this module.

scriptworker.gpg.GPG_CONFIG_MAPPING

dict – This maps the scriptworker config key names to the python-gnupg names.

scriptworker.gpg.GPG(context, gpg_home=None)

Get a python-gnupg GPG instance based on the settings in context.

Parameters:
  • context (scriptworker.context.Context) – the scriptworker context.
  • gpg_home (str, optional) – override context.config[‘gpg_home’] if desired. Defaults to None.
Returns:

the GPG instance with the appropriate configs.

Return type:

gnupg.GPG

scriptworker.gpg.check_ownertrust(context, gpg_home=None)

In theory, this will repair a broken trustdb.

Rebuild the trustdb via --import-ownertrust if not.

Parameters:
  • context (scriptworker.context.Context) – the scriptworker context.
  • gpg_home (str, optional) – override the gpg_home with a different gnupg home directory here. Defaults to None.
scriptworker.gpg.consume_valid_keys(context, keydir=None, ignore_suffixes=(), gpg_home=None)

Given a keydir, traverse the keydir, and import all gpg public keys.

Parameters:
  • context (scriptworker.context.Context) – the scriptworker context.
  • keydir (str, optional) – the path of the directory to traverse. If None, this function is noop. Default is None.
  • ignore_suffixes (list, optional) – file suffixes to ignore. Default is ().
  • gpg_home (str, optional) – override the gpg_home dir. Default is None.
Returns:

fingerprints

Return type:

list

Raises:

ScriptWorkerGPGException – on error.

scriptworker.gpg.create_gpg_conf(gpg_home, keyserver=None, my_fingerprint=None)

Create a gpg.conf with Mozilla infosec guidelines.

Parameters:
  • gpg_home (str) – the homedir for this keyring.
  • keyserver (str, optional) – The gpg keyserver to specify, e.g. hkp://gpg.mozilla.org or hkp://keys.gnupg.net. If set, we also enable auto-key-retrieve. Defaults to None.
  • my_fingerprint (str, optional) – the fingerprint of the default key. Once set, gpg will use it by default, unless a different key is specified. Defaults to None.
scriptworker.gpg.export_key(gpg, fingerprint, private=False)

Return the ascii armored key identified by fingerprint.

Parameters:
  • gpg (gnupg.GPG) – the GPG instance.
  • fingerprint (str) – the fingerprint of the key to export.
  • private (bool, optional) – If True, return the private key instead of the public key. Defaults to False.
Returns:

the ascii armored key identified by fingerprint.

Return type:

str

Raises:

ScriptWorkerGPGException – if the key isn’t found.

scriptworker.gpg.fingerprint_to_keyid(gpg, fingerprint, private=False)

Return the keyid of the key that corresponds to fingerprint.

Keyids should default to long keyids; this will happen once create_gpg_conf() is called.

Parameters:
  • gpg (gnupg.GPG) – gpg object for the appropriate gpg_home / keyring
  • fingerprint (str) – the fingerprint of the key we’re searching for.
  • private (bool, optional) – If True, search the private keyring instead of the public keyring. Defaults to False.
Returns:

keyid – the keyid of the key with fingerprint fingerprint

Return type:

str

Raises:

ScriptWorkerGPGException – if we can’t find fingerprint in this keyring.

scriptworker.gpg.generate_key(gpg, name, comment, email, key_length=4096, expiration=None)

Generate a gpg keypair.

Parameters:
  • gpg (gnupg.GPG) – the GPG instance.
  • name (str) – the name attached to the key. 1/3 of the key user id.
  • comment (str) – the comment attached to the key. 1/3 of the key user id.
  • email (str) – the email attached to the key. 1/3 of the key user id.
  • key_length (int, optional) – the key length in bits. Defaults to 4096.
  • expiration (str, optional) – The expiration of the key. This can take the forms “2009-12-31”, “365d”, “3m”, “6w”, “5y”, “seconds=<epoch>”, or 0 for no expiry. Defaults to None.
Returns:

fingerprint – the fingerprint of the key just generated.

Return type:

str

scriptworker.gpg.get_body(gpg, signed_data, gpg_home=None, **kwargs)

Verifies the signature, then returns the unsigned data from signed_data.

Parameters:
  • gpg (gnupg.GPG) – the GPG instance.
  • signed_data (str) – The ascii armored signed data.
  • gpg_home (str, optional) – override the gpg_home with a different gnupg home directory here. Defaults to None.
  • kwargs (dict, optional) – These are passed directly to gpg.decrypt(). Defaults to {}. https://pythonhosted.org/python-gnupg/#decryption
Returns:

unsigned contents on success.

Return type:

str

Raises:

ScriptWorkerGPGException – on signature verification failure.

scriptworker.gpg.get_list_sigs_output(context, key_fingerprint, gpg_home=None, validate=True, expected=None)

gpg --list-sigs, with machine parsable output, for gpg 2.0.x

Parameters:
  • context (scriptworker.context.Context) – the scriptworker context.
  • key_fingerprint (str) – the fingerprint of the key we want to get signature information about.
  • gpg_home (str, optional) – override the gpg_home with a different gnupg home directory here. Defaults to None.
  • validate (bool, optional) – Validate the output via parse_list_sigs_output(). Defaults to True.
  • expected (dict, optional) – This is passed on to parse_list_sigs_output() if validate is True. Defaults to None.
Returns:

str: the output from gpg --list-sigs, if validate is False.
dict: the output from parse_list_sigs_output(), if validate is True.

Return type:

str or dict

Raises:

ScriptWorkerGPGException – if there is an issue with the key.

scriptworker.gpg.gpg_default_args(gpg_home)

For commandline gpg calls, use these args by default.

Parameters: gpg_home (str) – The path to the gpg homedir. gpg will look for the gpg.conf, trustdb.gpg, and keyring files in here.
Returns: the list of default commandline arguments to add to the gpg call.
Return type: list

scriptworker.gpg.guess_gpg_home(obj, gpg_home=None)

Guess gpg_home. If gpg_home is specified, return that.

Parameters:
  • obj (object) – If gpg_home is set, return that. Otherwise, if obj is a context object and context.config[‘gpg_home’] is not None, return that. If obj is a GPG object and obj.gnupghome is not None, return that. Otherwise look in ~/.gnupg.
  • gpg_home (str, optional) – The path to the gpg homedir. gpg will look for the gpg.conf, trustdb.gpg, and keyring files in here. Defaults to None.
Returns:

the path to the guessed gpg homedir.

Return type:

str

Raises:

ScriptWorkerGPGException – if obj doesn’t contain the gpg home info and os.environ[‘HOME’] isn’t set.
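
The order of preference described for guess_gpg_home can be written out as a short sketch. It is illustrative only: pick_gpg_home, FakeContext, and FakeGPG are hypothetical names, the environment is passed in explicitly to keep the example testable, and KeyError stands in for ScriptWorkerGPGException.

```python
import os


def pick_gpg_home(obj, gpg_home=None, environ=os.environ):
    """Return a gpg homedir following the documented order of preference."""
    if gpg_home:
        return gpg_home
    # A context-like object: look at obj.config['gpg_home'].
    config = getattr(obj, "config", None)
    if config is not None and config.get("gpg_home") is not None:
        return config["gpg_home"]
    # A GPG-like object: look at obj.gnupghome.
    gnupghome = getattr(obj, "gnupghome", None)
    if gnupghome is not None:
        return gnupghome
    # Last resort: ~/.gnupg, which needs HOME to be set.
    if "HOME" not in environ:
        raise KeyError("HOME is not set, so ~/.gnupg cannot be resolved")
    return os.path.join(environ["HOME"], ".gnupg")


class FakeContext:
    config = {"gpg_home": "/srv/gpg"}


class FakeGPG:
    gnupghome = "/srv/other"


from_context = pick_gpg_home(FakeContext())
from_gpg = pick_gpg_home(FakeGPG())
```
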

scriptworker.gpg.guess_gpg_path(context)

Simple gpg_path guessing function.

Parameters: context (scriptworker.context.Context) – the scriptworker context.
Returns: either context.config[‘gpg_path’] or ‘gpg’ if that’s not defined.
Return type: str

scriptworker.gpg.has_suffix(path, suffixes)

Given a list of suffixes, return True if path ends with one of them.

Parameters:
  • path (str) – the file path to check
  • suffixes (list) – the suffixes to check for
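
A suffix check like this is a one-liner in Python, since str.endswith accepts a tuple of candidates. The helper name path_has_suffix is hypothetical; this is a sketch of the idea rather than the module's code.

```python
def path_has_suffix(path, suffixes):
    """Return True if path ends with any of the given suffixes."""
    # str.endswith accepts a tuple of candidate suffixes.
    return path.endswith(tuple(suffixes))


is_key_file = path_has_suffix("keys/worker.asc", [".asc", ".pub"])
is_readme_key = path_has_suffix("keys/README.md", [".asc", ".pub"])
```
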
scriptworker.gpg.import_key(gpg, key_data, return_type='fingerprints')

Import ascii key_data.

In theory this can be multiple keys. However, jenkins is barfing on multiple key import tests, although multiple key import tests are working locally. Until we identify what the problem is (likely gpg version?) we should only import 1 key at a time.

Parameters:
  • gpg (gnupg.GPG) – the GPG instance.
  • key_data (str) – ascii armored key data
  • return_type (str, optional) – if ‘fingerprints’, return the fingerprints only. Otherwise return the result list.
Returns:

if return_type is ‘fingerprints’, return the fingerprints of the

imported keys. Otherwise return the results list. https://pythonhosted.org/python-gnupg/#importing-and-receiving-keys

Return type:

list

scriptworker.gpg.keyid_to_fingerprint(gpg, keyid, private=False)

Return the fingerprint of the key that corresponds to keyid.

Keyids should default to long keyids; this will happen once create_gpg_conf() is called.

Parameters:
  • gpg (gnupg.GPG) – gpg object for the appropriate gpg_home / keyring
  • keyid (str) – the long keyid that represents the key we’re searching for.
  • private (bool, optional) – If True, search the private keyring instead of the public keyring. Defaults to False.
Returns:

fingerprint – the fingerprint of the key with keyid keyid

Return type:

str

Raises:

ScriptWorkerGPGException – if we can’t find keyid in this keyring.

scriptworker.gpg.latest_signed_git_commit(gpg, output, trusted_fingerprints)

Return the latest git commit sha that’s signed by a trusted fingerprint.

There are a number of ways to do this. git show --show-signature will show the output of gpg --verify, assuming we have the key already marked valid in our web of trust, also assuming we’re using the proper gpg_home.

This function allows for unsigned commits between signed commits. We may disallow those in the future.

Parameters:
  • gpg (gnupg.GPG) – the GPG instance.
  • output (str) – the output of git log --format='%H:%GK'
  • trusted_fingerprints (list) – the gpg fingerprints of valid keys.
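
To illustrate how sha:keyid lines can be scanned for the newest trusted signature, here is a standard-library sketch. It makes several assumptions: the helper name latest_trusted_commit is hypothetical, a plain dict stands in for the keyring lookup from key id to fingerprint, the log output lists the newest commit first (git's default order), and None is returned when nothing matches.

```python
def latest_trusted_commit(output, keyid_to_fingerprint, trusted_fingerprints):
    """Return the newest sha whose signing key maps to a trusted fingerprint."""
    for line in output.splitlines():
        sha, _, keyid = line.strip().partition(":")
        if not keyid:
            # Unsigned commit: skip it and keep looking at older commits.
            continue
        fingerprint = keyid_to_fingerprint.get(keyid)
        if fingerprint in trusted_fingerprints:
            return sha
    return None


log_output = "\n".join([
    "c3c3c3:",                  # newest commit, unsigned
    "b2b2b2:FEEDFACEFEEDFACE",  # signed by a key we do not know
    "a1a1a1:0123456789ABCDEF",  # signed by a trusted key
])
keyring = {"0123456789ABCDEF": "F" * 40}
sha = latest_trusted_commit(log_output, keyring, ["F" * 40])
```
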
scriptworker.gpg.overwrite_gpg_home(tmp_gpg_home, real_gpg_home)

Take the contents of tmp_gpg_home and copy them to real_gpg_home.

For now, back up real_gpg_home before doing so. We may want to revisit for disk space reasons: only keep N backups?

Parameters:
  • tmp_gpg_home (str) – path to the rebuilt gpg_home with the new keychains and trust models
  • real_gpg_home (str) – path to the old gpg_home to overwrite
scriptworker.gpg.parse_list_sigs_output(output, desc, expected=None)

Parse the output from --list-sigs; validate.

NOTE: This doesn’t work with complex key/subkeys; this is only written for the keys generated through the functions in this module.

  1. Field: Type of record

    pub = public key
    crt = X.509 certificate
    crs = X.509 certificate and private key available
    sub = subkey (secondary key)
    sec = secret key
    ssb = secret subkey (secondary key)
    uid = user id (only field 10 is used).
    uat = user attribute (same as user id except for field 10).
    sig = signature
    rev = revocation signature
    fpr = fingerprint: (fingerprint is in field 10)
    pkd = public key data (special field format, see below)
    grp = keygrip
    rvk = revocation key
    tru = trust database information
    spk = signature subpacket

There are also ‘gpg’ lines like

gpg: checking the trustdb
gpg: 3 marginal(s) needed, 1 complete(s) needed, PGP trust model
gpg: depth: 0 valid: 3 signed: 0 trust: 0-, 0q, 0n, 0m, 0f, 3u

This is a description of the web of trust. I’m currently not parsing these; per [1] and [2] I would need to read the source for full parsing.

[1] http://security.stackexchange.com/a/41209

[2] http://gnupg.10057.n7.nabble.com/placing-trust-in-imported-keys-td30124.html#a30125

Parameters:
  • output (str) – the output from get_list_sigs_output()
  • desc (str) – a description of the key being tested, for exception message purposes.
  • expected (dict, optional) – expected outputs. If specified and the expected doesn’t match the real, raise an exception. Expected takes keyid, fingerprint, uid, sig_keyids (list), and sig_uids (list), all optional. Defaults to None.
Returns:

real

the real values from the key. This specifies

keyid, fingerprint, uid, sig_keyids, and sig_uids.

Return type:

dict

Raises:

ScriptWorkerGPGException – on mismatched expectations, or if we found revocation markers or the like that make for a bad key.
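
The record layout above is colon-delimited, so a rough parser only needs str.split. The sketch below is not the module's parser: summarize_colon_records is a hypothetical name, the sample lines are synthetic rather than captured gpg output, ValueError stands in for ScriptWorkerGPGException, and it assumes the key id sits in field 5 and the user id or fingerprint in field 10 (1-based).

```python
def summarize_colon_records(output):
    """Collect keyid, fingerprint, uid, and signature info from colon records."""
    real = {"sig_keyids": [], "sig_uids": []}
    for line in output.splitlines():
        if line.startswith("gpg:") or not line.strip():
            # Trust-model chatter and blank lines are ignored in this sketch.
            continue
        parts = line.split(":")
        record_type = parts[0]
        if record_type == "pub":
            real["keyid"] = parts[4]
        elif record_type == "fpr":
            real["fingerprint"] = parts[9]
        elif record_type == "uid":
            real["uid"] = parts[9]
        elif record_type == "sig":
            real["sig_keyids"].append(parts[4])
            real["sig_uids"].append(parts[9])
        elif record_type in ("rev", "rvk"):
            raise ValueError("revocation marker found: {}".format(line))
    return real


# Synthetic sample, shaped like colon-delimited listing output.
sample = "\n".join([
    "gpg: checking the trustdb",
    "pub:u:4096:1:AAAAAAAAAAAAAAAA:1470000000:::u:::scESC:",
    "fpr" + ":" * 9 + "B" * 40 + ":",
    "uid:u::::1470000000::HASH::Example User <user@example.com>:",
    "sig:::1:AAAAAAAAAAAAAAAA:1470000000::::Example User <user@example.com>:13x:",
])
real = summarize_colon_records(sample)
```
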

scriptworker.gpg.rebuild_gpg_home(context, tmp_gpg_home, my_pub_key_path, my_sec_key_path)

Import my key and create gpg.conf and trustdb.gpg.

Parameters:
  • context (scriptworker.context.Context) – the scriptworker context.
  • tmp_gpg_home (str) – the path to the tmp gpg_home. This should already exist.
  • my_pub_key_path (str) – the ascii pubkey file we want to import as the primary key
  • my_sec_key_path (str) – the ascii seckey file we want to import as the primary key

scriptworker.gpg.rebuild_gpg_home_flat(context, real_gpg_home, my_pub_key_path, my_sec_key_path, consume_path, ignore_suffixes=(), consume_function=<function consume_valid_keys>)

Rebuild real_gpg_home with new trustdb, pub+secrings, gpg.conf.

In this ‘flat’ model, import all the pubkeys in consume_path and sign them directly. This makes them valid but not trusted.

Parameters:
  • context (scriptworker.context.Context) – the scriptworker context.
  • real_gpg_home (str) – the gpg_home path we want to rebuild
  • my_pub_key_path (str) – the ascii pubkey file we want to import as the primary key
  • my_sec_key_path (str) – the ascii seckey file we want to import as the primary key
  • consume_path (str) – the path to the directory tree to import pubkeys from
  • ignore_suffixes (list, optional) – the suffixes to ignore in consume_path. Defaults to ()
  • consume_function (function, optional) – the function to call to consume the public keys. Defaults to consume_valid_keys()
scriptworker.gpg.rebuild_gpg_home_signed(context, real_gpg_home, my_pub_key_path, my_sec_key_path, trusted_path, untrusted_path=None, ignore_suffixes=(), consume_function=<function consume_valid_keys>)

Rebuild real_gpg_home with new trustdb, pub+secrings, gpg.conf.

In this ‘signed’ model, import all the pubkeys in consume_path and sign them directly. This makes them valid but not trusted.

Parameters:
  • context (scriptworker.context.Context) – the scriptworker context.
  • real_gpg_home (str) – the gpg_home path we want to rebuild
  • my_pub_key_path (str) – the ascii pubkey file we want to import as the primary key
  • my_sec_key_path (str) – the ascii seckey file we want to import as the primary key
  • trusted_path (str) – the path to the directory tree to import trusted pubkeys from
  • untrusted_path (str, optional) – the path to the directory tree to import untrusted but valid pubkeys from
  • ignore_suffixes (list, optional) – the suffixes to ignore in consume_path. Defaults to ()
  • consume_function (function, optional) – the function to call to consume the public keys. Defaults to consume_valid_keys()
scriptworker.gpg.sign(gpg, data, **kwargs)

Sign data with the key kwargs[‘keyid’], or the default key if not specified

Parameters:
Returns:

the ascii armored signed data.

Return type:

str

scriptworker.gpg.sign_key(context, target_fingerprint, signing_key=None, exportable=False, gpg_home=None)

Sign the target_fingerprint key with the signing_key or default key

This signs the target key with the signing key, which adds to the web of trust.

Parameters:
  • context (scriptworker.context.Context) – the scriptworker context.
  • target_fingerprint (str) – the fingerprint of the key to sign.
  • signing_key (str, optional) – the fingerprint of the signing key to sign with. If not set, this defaults to the default-key in the gpg.conf. Defaults to None.
  • exportable (bool, optional) – whether the signature should be exportable. Defaults to False.
  • gpg_home (str, optional) – override the gpg_home with a different gnupg home directory here. Defaults to None.
Raises:

ScriptWorkerGPGException – on a failed signature.

scriptworker.gpg.update_ownertrust(context, my_fingerprint, trusted_fingerprints=None, gpg_home=None)

Trust my key ultimately; trusted_fingerprints fully

Parameters:
  • context (scriptworker.context.Context) – the scriptworker context.
  • my_fingerprint (str) – the fingerprint of the key we want to specify as ultimately trusted.
  • trusted_fingerprints (list, optional) – the list of fingerprints that we want to mark as fully trusted. These need to be signed by the my_fingerprint key before they are trusted.
  • gpg_home (str, optional) – override the gpg_home with a different gnupg home directory here. Defaults to None.
Raises:

ScriptWorkerGPGException – if there is an error.

scriptworker.gpg.verify_ownertrust(context, my_fingerprint, trusted_fingerprints=None, gpg_home=None)

Verify the ownertrust is exactly as expected.

Parameters:
  • context (scriptworker.context.Context) – the scriptworker context.
  • my_fingerprint (str) – the fingerprint of the key we specified as ultimately trusted.
  • trusted_fingerprints (list, optional) – the list of fingerprints that we marked as fully trusted.
  • gpg_home (str, optional) – override the gpg_home with a different gnupg home directory here. Defaults to None.
Raises:

ScriptWorkerGPGException – if there is an error.

scriptworker.gpg.verify_signature(gpg, signed_data, **kwargs)

Verify signed_data with the key kwargs[‘keyid’], or the default key if not specified.

Parameters:
Returns:

on success.

Return type:

gnupg.Verify

Raises:

ScriptWorkerGPGException – on failure.

scriptworker.log module

scriptworker logging

scriptworker.log.log

logging.Logger – the log object for this module.

scriptworker.log.get_log_fhs(context)

Helper contextmanager function to open the log and error filehandles.

Parameters: context (scriptworker.context.Context) – the scriptworker context.
Yields: tuple – log filehandle, error log filehandle

scriptworker.log.get_log_filenames(context)

Helper function to get the task log/error file paths.

Parameters: context (scriptworker.context.Context) – the scriptworker context.
Returns: log file path, error log file path
Return type: tuple

scriptworker.log.log_errors(reader, log_fh, error_fh)

Log STDERR from the task subprocess to the log and error filehandles.

These may not actually be errors; python logging uses STDERR for output. The running process should be able to detect its own status, rather than relying on scriptworker to do so.

Parameters:
  • reader (filehandle) – subprocess process stderr
  • log_fh (filehandle) – the stdout log filehandle
  • error_fh (filehandle) – the stderr log filehandle
scriptworker.log.read_stdout(stdout, log_fh)

Log STDOUT from the task subprocess to the log filehandle.

Parameters:
  • stdout (filehandle) – subprocess process stdout
  • log_fh (filehandle) – the stdout log filehandle
scriptworker.log.update_logging_config(context, log_name=None)

Update python logging settings from config.

By default, this sets the scriptworker log settings, but this will change if some other package calls this function or specifies the log_name.

  • Use formatting from config settings.
  • Log to screen if verbose
  • Add a rotating logfile from config settings.
Parameters:
  • context (scriptworker.context.Context) – the scriptworker context.
  • log_name (str, optional) – the name of the Logger to modify. If None, use the top level module (‘scriptworker’). Defaults to None.

scriptworker.poll module

Deal with the multi-step queue polling. At some point we may be able to just claimTask through Taskcluster; until that point we have these functions.

scriptworker.poll.log

logging.Logger – the log object for the module.

scriptworker.poll.claim_task(context, taskId, runId)

Attempt to claim a task that we found in the Azure queue.

Parameters:
  • context (scriptworker.context.Context) – the scriptworker context.
  • taskId (str) – the taskcluster taskId to claim
  • runId (int) – the taskcluster runId to claim
Returns:

claimTask definition, if successful. If unsuccessful, return None.

Return type:

dict

scriptworker.poll.find_task(context, poll_url, delete_url, request_function)

Main polling function.

For a given poll_url/delete_url pair, get the xml from the poll_url. For each message in the xml, parse and try to claim the task. Delete the message from the Azure queue whether the claim was successful or not (error 409 on claim means the task was cancelled/expired/claimed).

If the claim was successful, return the task json.

Parameters:
  • context (scriptworker.context.Context) – the scriptworker context.
  • poll_url (str) – The Azure URL to poll for tasks
  • delete_url (str) – The Azure URL to delete claimed tasks
  • request_function (function) – the function to call to poll the URLs. This should be scriptworker.utils.retry_request outside of testing.
Returns:

the claimTask json

Return type:

dict

scriptworker.poll.get_azure_urls(context)

Yield the poll_url and delete_url from the poll_task_urls, in order.

These URLs are for finding the task breadcrumbs in Azure, and for deleting them from Azure, respectively.

Parameters: context (scriptworker.context.Context) – the scriptworker context.
Yields: tuple – poll_url, delete_url

scriptworker.poll.parse_azure_message(message)

Parse a single Azure message from the xml.

Parameters: message (Element) – xml element containing a single Azure message
Returns: the relevant message info
Return type: dict

scriptworker.poll.parse_azure_xml(xml)

Generator: parse the Azure xml and pass through parse_azure_message()

Parameters: xml (str) – the contents of the xml document
Yields: dict – yields the relevant message info for each message, in order.
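
The two parsing helpers above can be pictured as a generator that walks the message elements and hands each one to a per-message parser. The sketch below is heavily assumption-based: the function names are hypothetical, and the element names (QueueMessage, MessageId, PopReceipt, MessageText) and the base64-encoded JSON hint containing taskId and runId are assumptions about the message layout, not something this reference states.

```python
import base64
import json
import xml.etree.ElementTree as ET


def parse_message(message):
    """Pull the identifiers and the decoded task hint out of one message element."""
    text = message.find("MessageText").text
    hint = json.loads(base64.b64decode(text).decode("utf-8"))
    return {
        "messageId": message.find("MessageId").text,
        "popReceipt": message.find("PopReceipt").text,
        "taskId": hint["taskId"],
        "runId": hint["runId"],
    }


def parse_xml(xml):
    """Yield one dict per QueueMessage element, in document order."""
    root = ET.fromstring(xml)
    for message in root.iter("QueueMessage"):
        yield parse_message(message)


# Build a synthetic document with a single message.
payload = base64.b64encode(
    json.dumps({"taskId": "abc123", "runId": 0}).encode("utf-8")
).decode("ascii")
sample_xml = (
    "<QueueMessagesList><QueueMessage>"
    "<MessageId>msg-1</MessageId><PopReceipt>receipt-1</PopReceipt>"
    "<MessageText>{}</MessageText>"
    "</QueueMessage></QueueMessagesList>"
).format(payload)
messages = list(parse_xml(sample_xml))
```
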
scriptworker.poll.update_poll_task_urls(context, callback, min_seconds_left=300, args=(), kwargs=None)

Update the Azure urls to poll.

Queue.pollTaskUrls() returns an ordered list of Azure url pairs to poll for task “hints”. This list is valid until expiration.

This function checks for an up-to-date poll_task_urls; if non-existent or near expiration, get new poll_task_urls.

http://docs.taskcluster.net/queue/worker-interaction/

Parameters:
  • context (scriptworker.context.Context) – the scriptworker context.
  • callback (function) – This should be context.queue.pollTaskUrls outside of testing.
  • min_seconds_left (int, optional) – We have an expiry datestring; if we have less than min_seconds_left seconds left, then update the urls. Defaults to 300.
  • args (list, optional) – the args to pass to the callback. Defaults to ()
  • kwargs (dict, optional) – the kwargs to pass to the callback. Defaults to None.
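
The "non-existent or near expiration" test at the heart of update_poll_task_urls can be sketched as a small predicate. This is an illustration under assumptions: needs_refresh is a hypothetical name, the function is synchronous, and the expires value is assumed to be an ISO 8601 datestring with fractional seconds and a trailing Z.

```python
from datetime import datetime, timezone


def needs_refresh(poll_task_urls, min_seconds_left=300, now=None):
    """Return True if the cached URLs are missing or close to expiring."""
    if not poll_task_urls:
        return True
    now = now or datetime.now(timezone.utc)
    # Assumption: 'expires' looks like 2016-01-01T12:04:00.000Z.
    expires = datetime.strptime(
        poll_task_urls["expires"], "%Y-%m-%dT%H:%M:%S.%fZ"
    ).replace(tzinfo=timezone.utc)
    return (expires - now).total_seconds() < min_seconds_left


fixed_now = datetime(2016, 1, 1, 12, 0, 0, tzinfo=timezone.utc)
soon = needs_refresh({"expires": "2016-01-01T12:04:00.000Z"}, now=fixed_now)
later = needs_refresh({"expires": "2016-01-01T13:00:00.000Z"}, now=fixed_now)
```

When the predicate is true, the caller would invoke the callback to fetch a fresh set of URLs.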

scriptworker.task module

Scriptworker task execution

scriptworker.task.log

logging.Logger – the log object for the module

scriptworker.task.complete_task(context, result)

Mark the task as completed in the queue.

Decide whether to call reportCompleted, reportFailed, or reportException based on the exit status of the script.

If the task has expired or been cancelled, we’ll get a 409 status.

Parameters: context (scriptworker.context.Context) – the scriptworker context.
Raises: taskcluster.exceptions.TaskclusterRestFailure – on non-409 error.

scriptworker.task.create_artifact(context, path, target_path, storage_type='s3', expires=None, content_type=None)

Create an artifact and upload it.

This should support s3 and azure out of the box; we’ll need some tweaking if we want to support redirect/error artifacts.

Parameters:
  • context (scriptworker.context.Context) – the scriptworker context.
  • path (str) – the path of the file to upload.
  • target_path (str) –
  • storage_type (str, optional) – the taskcluster storage type to use. Defaults to ‘s3’
  • expires (str, optional) – datestring of when the artifact expires. Defaults to None.
  • content_type (str, optional) – Specify the content type of the artifact. If None, use guess_content_type(). Defaults to None.
Raises:

ScriptWorkerRetryException – on failure.

scriptworker.task.download_artifacts(context, file_urls, parent_dir=None, session=None, download_func=<function download_file>)
scriptworker.task.get_expiration_arrow(context)

Return an arrow, artifact_expiration_hours in the future from now.

Parameters: context (scriptworker.context.Context) – the scriptworker context
Returns: now + artifact_expiration_hours
Return type: arrow

scriptworker.task.guess_content_type(path)

Guess the content type of a path, using mimetypes

Parameters: path (str) – the path to guess the mimetype of
Returns: the content type of the file
Return type: str
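
Guessing a content type with the standard mimetypes module looks roughly like this. The helper name content_type_for is hypothetical, and the real function may add special cases that this sketch omits.

```python
import mimetypes


def content_type_for(path):
    """Return the MIME type guessed from the file extension, or None if unknown."""
    content_type, _encoding = mimetypes.guess_type(path)
    return content_type


html_type = content_type_for("public/index.html")
json_type = content_type_for("public/manifest.json")
```
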
scriptworker.task.kill(pid, sleep_time=1)

Kill pid with various signals.

Parameters:
  • pid (int) – the process id to kill.
  • sleep_time (int, optional) – how long to sleep between killing the pid and checking if the pid is still running.
scriptworker.task.max_timeout(context, proc, timeout)

Make sure the proc pid’s process and process group are killed.

First, kill the process group (-pid) and then the pid.

Parameters:
  • context (scriptworker.context.Context) – the scriptworker context.
  • proc (subprocess.Process) – the subprocess proc. This is compared against context.proc to make sure we’re killing the right pid.
  • timeout (int) – Used for the log message.
scriptworker.task.reclaim_task(context, task)

Try to reclaim a task from the queue.

This is a keepalive / heartbeat. Without it the job will expire and potentially be re-queued. Since this is run async from the task, the task may complete before we run, in which case we’ll get a 409 the next time we reclaim.

Parameters:context (scriptworker.context.Context) – the scriptworker context
Raises:taskcluster.exceptions.TaskclusterRestFailure – on non-409 status_code from taskcluster.async.Queue.reclaimTask()
scriptworker.task.retry_create_artifact(*args, **kwargs)

Retry create_artifact() calls.

Parameters:
  • *args – the args to pass on to create_artifact
  • **kwargs – the args to pass on to create_artifact
scriptworker.task.run_task(context)

Run the task, sending stdout+stderr to files.

https://github.com/python/asyncio/blob/master/examples/subprocess_shell.py

Parameters:context (scriptworker.context.Context) – the scriptworker context.
Returns:exit code
Return type:int
scriptworker.task.upload_artifacts(context)

Upload the files in artifact_dir, preserving relative paths.

This function expects the directory structure in artifact_dir to remain the same. So if we want the files in public/..., create an artifact_dir/public and put the files in there.

Parameters:context (scriptworker.context.Context) – the scriptworker context.
Raises:Exception – any exceptions the tasks raise.
scriptworker.task.worst_level(level1, level2)

Given two int levels, return the larger.

Parameters:
  • level1 (int) – exit code 1.
  • level2 (int) – exit code 2.
Returns:

the larger of the two levels.

Return type:

int

scriptworker.utils module

Generic utils for scriptworker

scriptworker.utils.log

logging.Logger – the log object for the module

scriptworker.utils.cleanup(context)

Clean up the work_dir and artifact_dir between task runs, then recreate.

Parameters:context (scriptworker.context.Context) – the scriptworker context.
scriptworker.utils.create_temp_creds(client_id, access_token, start=None, expires=None, scopes=None, name=None)

Request temp TC creds with our permanent creds.

Parameters:
  • client_id (str) – the taskcluster client_id to use
  • access_token (str) – the taskcluster access_token to use
  • start (str, optional) – the datetime string when the credentials will start to be valid. Defaults to 10 minutes ago, for clock skew.
  • expires (str, optional) – the datetime string when the credentials will expire. Defaults to 31 days after 10 minutes ago.
  • scopes (list, optional) – The list of scopes to request for the temp creds. Defaults to [‘assume:project:taskcluster:worker-test-scopes’, ]
  • name (str, optional) – the name to associate with the creds.
Returns:

the temporary taskcluster credentials.

Return type:

dict
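The default validity window described above (start 10 minutes in the past, expiry 31 days after that start) can be sketched as follows. The helper name default_cred_window is hypothetical, and it returns datetime objects rather than the datetime strings the real parameters take.

```python
from datetime import datetime, timedelta, timezone


def default_cred_window(now=None):
    """Return (start, expires) for the documented default window.

    Hypothetical helper: start is 10 minutes before `now` to allow for
    clock skew, and expires is 31 days after start.
    """
    now = now or datetime.now(timezone.utc)
    start = now - timedelta(minutes=10)
    expires = start + timedelta(days=31)
    return start, expires
```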

scriptworker.utils.datestring_to_timestamp(datestring)

Create a timestamp from a taskcluster datestring.

Parameters:datestring (str) – the datestring to convert. isoformat, like “2016-04-16T03:46:24.958Z”
Returns:the corresponding timestamp.
Return type:int
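As a rough illustration of the conversion, the sketch below parses the documented datestring format with the standard library. It is not the module's implementation, and it assumes the datestring always carries fractional seconds and a trailing "Z".

```python
from datetime import datetime, timezone


def datestring_to_timestamp_sketch(datestring):
    """Convert a datestring like '2016-04-16T03:46:24.958Z' to an int.

    Assumes UTC ("Z") and fractional seconds are always present; the
    fractional part is truncated when converting to int.
    """
    parsed = datetime.strptime(datestring, "%Y-%m-%dT%H:%M:%S.%fZ")
    return int(parsed.replace(tzinfo=timezone.utc).timestamp())
```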
scriptworker.utils.download_file(context, url, abs_filename, session=None, chunk_size=128)
scriptworker.utils.filepaths_in_dir(path)

Find all files in a directory, and return the relative paths to those files.

Parameters:path (str) – the directory path to walk
Returns:
the list of relative paths to all files inside of path or its
subdirectories.
Return type:list
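One way to collect relative paths is to walk the tree with os.walk, as in this sketch. The function name is hypothetical, and sorting the result is an addition made here for predictable output.

```python
import os


def filepaths_in_dir_sketch(path):
    """Return the paths of all files under `path`, relative to `path`."""
    filepaths = []
    for root, _dirs, files in os.walk(path):
        for filename in files:
            full_path = os.path.join(root, filename)
            filepaths.append(os.path.relpath(full_path, path))
    # Sorting is not part of the documented behavior; it keeps the
    # sketch's output deterministic.
    return sorted(filepaths)
```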
scriptworker.utils.format_json(data)

Format json as a string with sorted keys and an indent of 2.

Parameters:data (dict) – the json to format.
Returns:the formatted json.
Return type:str
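With the standard json module, sorted and indented output could be produced as below. This is a sketch of the described formatting under a hypothetical name, not necessarily how format_json() is written.

```python
import json


def format_json_sketch(data):
    """Serialize `data` with keys sorted and a two-space indent."""
    return json.dumps(data, indent=2, sort_keys=True)
```

Sorting the keys makes the output stable across runs, which is useful when the formatted json is logged or compared.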
scriptworker.utils.get_hash(path, hash_alg='sha256')

Get the hash of the file at path.

I’d love to make this async, but evidently file I/O is always ready.

Parameters:
  • path (str) – the path to the file to hash.
  • hash_alg (str, optional) – the algorithm to use. Defaults to ‘sha256’.
Returns:

the hexdigest of the hash.

Return type:

str
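A minimal sketch of hashing a file with hashlib follows. The function name and the chunked read (with its chunk_size parameter) are assumptions added so large files need not be loaded into memory at once.

```python
import hashlib


def get_hash_sketch(path, hash_alg="sha256", chunk_size=65536):
    """Return the hex digest of the file at `path`.

    `chunk_size` is a hypothetical extra parameter for this sketch.
    """
    hasher = hashlib.new(hash_alg)
    with open(path, "rb") as fh:
        # Read fixed-size chunks until read() returns an empty bytes object.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest()
```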

scriptworker.utils.makedirs(path)

mkdir -p

Parameters:path (str) – the path to mkdir -p
Raises:ScriptWorkerException – if path exists already and the realpath is not a dir.
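The behavior described above can be sketched with os.makedirs plus an explicit check on an existing path. NotADirectoryStandIn is a hypothetical stand-in for ScriptWorkerException so that the example stays self-contained.

```python
import os


class NotADirectoryStandIn(Exception):
    """Hypothetical stand-in for ScriptWorkerException."""


def makedirs_sketch(path):
    """Create `path` and any missing parents, like `mkdir -p`."""
    if os.path.exists(path):
        # An existing path is fine only if it resolves to a directory.
        if not os.path.isdir(os.path.realpath(path)):
            raise NotADirectoryStandIn(
                "{} exists and is not a directory".format(path)
            )
        return
    os.makedirs(path)
```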
scriptworker.utils.raise_future_exceptions(tasks)

Given a list of futures, await them, then raise their exceptions if any.

Without something like this, a bare:

await asyncio.wait(tasks)

will swallow exceptions.

Parameters:tasks (list) – the list of futures to await and check for exceptions.
Raises:Exception – any exceptions in task.exception()
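To make the pattern concrete, here is a sketch of awaiting a list of futures and then re-raising the first stored exception. The function and demo names are hypothetical, and the guard for an empty list is an addition for this sketch (asyncio.wait() rejects an empty collection).

```python
import asyncio


async def raise_future_exceptions_sketch(tasks):
    """Await all futures in `tasks`, then raise the first exception found."""
    if not tasks:
        return
    await asyncio.wait(tasks)
    for task in tasks:
        exc = task.exception()
        if exc is not None:
            raise exc


async def _ok():
    return "ok"


async def _boom():
    raise ValueError("boom")


async def demo():
    """Show that the failure from _boom() surfaces to the caller."""
    tasks = [asyncio.ensure_future(_ok()), asyncio.ensure_future(_boom())]
    try:
        await raise_future_exceptions_sketch(tasks)
    except ValueError as exc:
        return str(exc)
    return None
```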
scriptworker.utils.request(context, url, timeout=60, method='get', good=(200, ), retry=(500, 501, 502, 503, 504, 505, 506, 507, 508, 509, 510, 511), return_type='text', **kwargs)

Async aiohttp request wrapper.

Parameters:
  • context (scriptworker.context.Context) – the scriptworker context.
  • url (str) – the url to request
  • timeout (int, optional) – timeout after this many seconds. Default is 60.
  • method (str, optional) – The request method to use. Default is ‘get’.
  • good (list, optional) – the set of good status codes. Default is (200, )
  • retry (list, optional) – the set of status codes that result in a retry. Default is tuple(range(500, 512)).
  • return_type (str, optional) – The type of value to return. Takes ‘json’ or ‘text’; other values will return the response object. Default is text.
  • **kwargs – the kwargs to send to the aiohttp request function.
Returns:

the response text() if return_type is ‘text’; the response json() if return_type is ‘json’; the aiohttp request response object otherwise.

Return type:

object

Raises:
  • ScriptWorkerRetryException – if the status code is in the retry list.
  • ScriptWorkerException – if the status code is not in the retry list or good list.
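The status-code handling described by the good and retry parameters can be sketched without any network access. The exception classes and the check_status helper below are hypothetical stand-ins, and checking the retry list before the good list is an assumption about ordering.

```python
class RetryableStatus(Exception):
    """Hypothetical stand-in for ScriptWorkerRetryException."""


class FatalStatus(Exception):
    """Hypothetical stand-in for ScriptWorkerException."""


def check_status(status, good=(200,), retry=tuple(range(500, 512))):
    """Raise according to which list `status` falls in; return it if good."""
    if status in retry:
        raise RetryableStatus("retryable status {}".format(status))
    if status not in good:
        raise FatalStatus("bad status {}".format(status))
    return status
```

Separating retryable from fatal failures lets a caller retry only on the first exception type.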
scriptworker.utils.retry_async(func, attempts=5, sleeptime_callback=<function calculateSleepTime>, retry_exceptions=(<class 'Exception'>, ), args=(), kwargs=None)

Retry func, where func is an awaitable.

Parameters:
  • func (function) – an awaitable function.
  • attempts (int, optional) – the number of attempts to make. Default is 5.
  • sleeptime_callback (function, optional) – the function to use to determine how long to sleep after each attempt. Defaults to calculateSleepTime.
  • retry_exceptions (list, optional) – the exceptions to retry on. Defaults to (Exception, )
  • args (list, optional) – the args to pass to function. Defaults to ()
  • kwargs (dict, optional) – the kwargs to pass to function. Defaults to {}.
Returns:

the value from a successful function call

Return type:

object

Raises:

Exception – the exception from a failed function call, either outside of the retry_exceptions, or one of those if we pass the max attempts.
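A simplified sketch of the retry loop described above is shown here. It is not the module's implementation: the default sleeptime_callback is a hypothetical zero-delay function rather than calculateSleepTime, and passing it the attempt number is an assumption.

```python
import asyncio


async def retry_async_sketch(func, attempts=5,
                             sleeptime_callback=lambda attempt: 0,
                             retry_exceptions=(Exception,),
                             args=(), kwargs=None):
    """Await func(*args, **kwargs), retrying on `retry_exceptions`."""
    kwargs = kwargs or {}
    attempt = 1
    while True:
        try:
            return await func(*args, **kwargs)
        except retry_exceptions:
            # Out of attempts: re-raise the last retryable exception.
            if attempt >= attempts:
                raise
            await asyncio.sleep(sleeptime_callback(attempt))
            attempt += 1


calls = []


async def flaky(value):
    """Hypothetical coroutine that fails twice, then succeeds."""
    calls.append(value)
    if len(calls) < 3:
        raise ConnectionError("transient failure")
    return value * 2
```

Exceptions outside retry_exceptions propagate immediately, because the except clause does not catch them.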

scriptworker.utils.retry_request(*args, retry_exceptions=(<class 'scriptworker.exceptions.ScriptWorkerRetryException'>, ), **kwargs)

Retry the request function

Parameters:
  • *args – the args to send to request() through retry_async().
  • retry_exceptions (list, optional) – the exceptions to retry on. Defaults to (ScriptWorkerRetryException, ).
  • **kwargs – the kwargs to send to request() through retry_async().
Returns:

the value from request().

Return type:

object

scriptworker.utils.rm(path)

rm -rf

Make sure path doesn’t exist after this call. If it’s a dir, shutil.rmtree(); if it’s a file, os.remove(); if it doesn’t exist, ignore.

Parameters:path (str) – the path to nuke.
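The three cases above (directory, file, missing path) can be sketched as follows; the function name is hypothetical.

```python
import os
import shutil


def rm_sketch(path):
    """Remove `path` whether it is a directory, a file, or absent."""
    if path and os.path.exists(path):
        if os.path.isdir(path):
            shutil.rmtree(path)
        else:
            os.remove(path)
```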
scriptworker.utils.to_unicode(line)

Avoid b'line' type messages in the logs.

Parameters:line (str) – the bytes or unicode string.
Returns:
the unicode-decoded string, if line was a bytes string.
Otherwise return line unmodified.
Return type:str
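A small sketch of the decoding step follows. The function name is hypothetical, and UTF-8 is an assumed encoding that the text above does not specify.

```python
def to_unicode_sketch(line):
    """Decode `line` if it is bytes; otherwise return it unchanged."""
    if isinstance(line, bytes):
        # UTF-8 is an assumption made for this sketch.
        return line.decode("utf-8")
    return line
```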

scriptworker.worker module

scriptworker.worker.async_main(context)

Main async loop, following the drawing at http://docs.taskcluster.net/queue/worker-interaction/

This is a simple loop, mainly to keep each function more testable.

Parameters:context (scriptworker.context.Context) – the scriptworker context.
scriptworker.worker.main()

Scriptworker entry point: get everything set up, then enter the main loop

scriptworker.worker.run_loop(context, creds_key='credentials')

Split this out of the async_main while loop for easier testing.

Parameters:
  • context (scriptworker.context.Context) – the scriptworker context.
  • creds_key (str, optional) – when reading the creds file, this dict key corresponds to the credentials value we want to use. Defaults to “credentials”.
Returns:

status

Return type:

int

Module contents