fune/taskcluster/gecko_taskgraph/__init__.py
Andrew Halberstadt 4c371dd4d8 Bug 1884364 - Create a new 'files_changed' parameter, r=taskgraph-reviewers,releng-reviewers,jcristau
We use hg.m.o's `json-automationrelevance` endpoint for a variety of reasons
such as getting the files changed for optimization purposes, or finding the
base revision for diff purposes. But this endpoint is slow and puts undue load
on hg.mozilla.org if queried too often.

The helper function that fetches this is memoized, so in theory we should only
ever make this request once per graph generation. However, there are still cases
where we request this unnecessarily:

1. When running `./mach taskgraph` locally, we first query
`json-automationrelevance` and then fall back to local VCS if the revision
wasn't found. I believe the reason for this is to be able to generate graphs
identical to those produced by CI.

2. When specifying multiple parameters (so graphs are generated in parallel),
the memoization cache isn't shared across processes, so we make the request
once per parameter set.

3. Any other time we generate tasks outside the context of a Decision task (e.g.
`./mach try`), as there are transforms that call this function.
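
Point 2 above can be illustrated with a minimal sketch. It is not the real
helper: `get_changed_files` is a hypothetical stand-in that returns a fixed
value instead of querying hg.m.o, and separate processes are simulated by
launching fresh Python interpreters. Each interpreter starts with an empty
memoization cache, so the "fetch" happens once per process rather than once
overall.

```python
import subprocess
import sys

# Hypothetical stand-in for the memoized helper. The real one queries
# hg.m.o; this one only counts how often the uncached body runs.
WORKER = """
import functools

calls = 0

@functools.lru_cache(maxsize=None)
def get_changed_files(revision):
    global calls
    calls += 1  # counts real (uncached) fetches
    return ("browser/app.js",)

get_changed_files("abc123")
get_changed_files("abc123")  # served from this process's cache
print(calls)
"""


def fetches_in_fresh_process():
    """Run the worker in a new interpreter and return its number of real fetches."""
    result = subprocess.run(
        [sys.executable, "-c", WORKER],
        capture_output=True,
        text=True,
        check=True,
    )
    return int(result.stdout.strip())


# Two processes stand in for two parameter sets generated in parallel.
per_process = [fetches_in_fresh_process() for _ in range(2)]
total_fetches = sum(per_process)
```

Within one process the second call is a cache hit, but the total across
processes still grows with the number of processes.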

By turning `files_changed` into a parameter, we can ensure that this value gets
"frozen" by the Decision task and never needs to be recomputed. For example, you
could use `-p task-id=<decision id>` and you'd still get the `files_changed`
value that the Decision task computed. This means that for all non-Decision use
cases we can rely on local VCS to give us our changed files.
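
The "frozen parameter" idea can be sketched roughly as follows. This is an
illustration under assumptions, not the code from this patch:
`resolve_files_changed` and `fake_vcs_query` are hypothetical names, and a
plain dict serialized to JSON stands in for the real parameters object.

```python
import json


def resolve_files_changed(parameters, compute):
    """Return `files_changed`, computing it only if it is not already set."""
    if "files_changed" not in parameters:
        parameters["files_changed"] = sorted(compute())
    return parameters["files_changed"]


queries = []  # records every time the expensive lookup actually runs


def fake_vcs_query():
    # Hypothetical stand-in for the expensive changed-files lookup.
    queries.append(1)
    return {"taskcluster/ci/config.yml", "browser/app.js"}


# Decision task: compute the value once and serialize the parameters.
decision_params = {"project": "try"}
resolve_files_changed(decision_params, fake_vcs_query)
frozen = json.dumps(decision_params)

# Later run that loads the Decision task's parameters: the stored value is
# reused and the lookup is not repeated.
loaded = json.loads(frozen)
files = resolve_files_changed(loaded, fake_vcs_query)
```

Because the value travels with the serialized parameters, any later graph
generation that loads them sees the same list without repeating the lookup.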

This should greatly cut back on the number of queries being made to `hg.m.o`.

Differential Revision: https://phabricator.services.mozilla.com/D204127
2024-03-19 14:13:54 +00:00


# This Source Code Form is subject to the terms of the Mozilla Public
# License, v. 2.0. If a copy of the MPL was not distributed with this
# file, You can obtain one at http://mozilla.org/MPL/2.0/.

import os

from taskgraph import config as taskgraph_config
from taskgraph import morph as taskgraph_morph
from taskgraph.util import schema
from taskgraph.util import taskcluster as tc_util

from gecko_taskgraph.config import graph_config_schema

GECKO = os.path.normpath(os.path.realpath(os.path.join(__file__, "..", "..", "..")))

# Maximum number of dependencies a single task can have
# https://firefox-ci-tc.services.mozilla.com/docs/reference/platform/queue/task-schema
# specifies 100, but we also optionally add the decision task id as a dep in
# taskgraph.create, so let's set this to 99.
MAX_DEPENDENCIES = 99

# Overwrite Taskgraph's default graph_config_schema with a custom one.
taskgraph_config.graph_config_schema = graph_config_schema

# Don't use any of the upstream morphs.
# TODO Investigate merging our morphs with upstream.
taskgraph_morph.registered_morphs = []

# Default rootUrl to use if none is given in the environment; this should point
# to the production Taskcluster deployment used for CI.
tc_util.PRODUCTION_TASKCLUSTER_ROOT_URL = "https://firefox-ci-tc.services.mozilla.com"

# Schemas for YAML files should use dashed identifiers by default. If there are
# components of the schema for which there is a good reason to use another format,
# exceptions can be added here.
schema.EXCEPTED_SCHEMA_IDENTIFIERS.extend(
    [
        "test_name",
        "json_location",
        "video_location",
        "profile_name",
        "target_path",
        "try_task_config",
    ]
)


def register(graph_config):
    """Used to register Gecko specific extensions.

    Args:
        graph_config: The graph configuration object.
    """
    import android_taskgraph
    from taskgraph import generator

    # TODO: Remove along with
    # `gecko_taskgraph.optimize.strategies.SkipUnlessChanged`
    # (see comment over there)
    from taskgraph.optimize.base import registry

    del registry["skip-unless-changed"]

    from gecko_taskgraph import (  # noqa: trigger target task method registration
        morph,  # noqa: trigger morph registration
        target_tasks,
    )

    android_taskgraph.register(graph_config)

    from gecko_taskgraph.parameters import register_parameters
    from gecko_taskgraph.util import dependencies  # noqa: trigger group_by registration
    from gecko_taskgraph.util.verify import verifications

    # Don't use the upstream verifications, and replace them with our own.
    # TODO Investigate merging our verifications with upstream.
    generator.verifications = verifications

    register_parameters()