Bug 1873265 - mach skip-fails improvements r=jmaher

Differential Revision: https://phabricator.services.mozilla.com/D198329
This commit is contained in:
Tom Marble 2024-01-12 15:56:31 +00:00
parent 23e246b6b0
commit d9e526f1a8
5 changed files with 299 additions and 73 deletions

View file

@ -16,19 +16,19 @@ Naming Convention
The build system does not enforce file naming for test manifest files.
However, the following convention is used.
mochitest.ini
mochitest.toml
For the *plain* flavor of mochitests.
chrome.ini
chrome.toml
For the *chrome* flavor of mochitests.
browser.ini
browser.toml
For the *browser chrome* flavor of mochitests.
a11y.ini
a11y.toml
For the *a11y* flavor of mochitests.
xpcshell.ini
xpcshell.toml
For *xpcshell* tests.
.. _manifestparser_manifests:
@ -36,25 +36,28 @@ xpcshell.ini
ManifestParser Manifests
==========================
ManifestParser manifests are essentially ini files that conform to a basic
ManifestParser manifests are essentially toml files that conform to a basic
set of assumptions.
The :doc:`reference documentation </mozbase/manifestparser>`
for manifestparser manifests describes the basic format of test manifests.
In summary, manifests are ini files with section names describing test files::
In summary, manifests are toml files with section names describing test files::
[test_foo.js]
[test_bar.js]
["test_foo.js"]
["test_bar.js"]
Keys under sections can hold metadata about each test::
[test_foo.js]
skip-if = os == "win"
[test_foo.js]
skip-if = os == "linux" && debug
[test_baz.js]
fail-if = os == "mac" || os == "android"
["test_foo.js"]
skip-if = ["os == 'win'"]
["test_foo.js"]
skip-if = ["os == 'linux' && debug"]
["test_baz.js"]
fail-if = [
"os == 'mac'",
"os == 'android'",
]
There is a special **DEFAULT** section whose keys/metadata apply to all
sections/tests::
@ -62,7 +65,7 @@ sections/tests::
[DEFAULT]
property = value
[test_foo.js]
["test_foo.js"]
In the above example, **test_foo.js** inherits the metadata **property = value**
from the **DEFAULT** section.
@ -105,10 +108,10 @@ support-files
in its own **support-files** entry. These use a syntax where paths
starting with ``!/`` will indicate the beginning of the path to a
shared support file starting from the root of the srcdir. For example,
if a manifest at ``dom/base/test/mochitest.ini`` has a support file,
if a manifest at ``dom/base/test/mochitest.toml`` has a support file,
``dom/base/test/server-script.sjs``, and a mochitest in
``dom/workers/test`` depends on that support file, the test manifest
at ``dom/workers/test/mochitest.ini`` must include
at ``dom/workers/test/mochitest.toml`` must include
``!/dom/base/test/server-script.sjs`` in its **support-files** entry.
generated-files
@ -131,7 +134,7 @@ dupe-manifest
Record that this manifest duplicates another manifest.
The common scenario is two manifest files will include a shared
manifest file via the ``[include:file]`` special section. The build
manifest file via the ``["include:file"]`` special section. The build
system enforces that each test file is only provided by a single
manifest. Having this key present bypasses that check.
@ -147,10 +150,11 @@ skip-if
.. parsed-literal::
[test_foo.js]
skip-if =
os == "mac" && fission # bug 123 - fails on fission
os == "windows" && debug # bug 456 - hits an assertion
["test_foo.js"]
skip-if = [
"os == 'mac' && fission", # bug 123 - fails on fission
"os == 'windows' && debug", # bug 456 - hits an assertion
]
fail-if
Expect test failure if the specified condition is true.

View file

@ -1279,6 +1279,13 @@ def manifest(_command_context):
dest="use_failures",
help="Use failures from file",
)
@CommandArgument(
"-M",
"--max-failures",
default=-1,
dest="max_failures",
help="Maximum number of failures to skip (-1 == no limit)",
)
@CommandArgument("-v", "--verbose", action="store_true", help="Verbose mode")
@CommandArgument(
"-d",
@ -1296,6 +1303,7 @@ def skipfails(
use_tasks=None,
save_failures=None,
use_failures=None,
max_failures=-1,
verbose=False,
dry_run=False,
):
@ -1307,10 +1315,19 @@ def skipfails(
except ValueError:
meta_bug_id = None
if max_failures is not None:
try:
max_failures = int(max_failures)
except ValueError:
max_failures = -1
else:
max_failures = -1
Skipfails(command_context, try_url, verbose, bugzilla, dry_run, turbo).run(
meta_bug_id,
save_tasks,
use_tasks,
save_failures,
use_failures,
max_failures,
)

View file

@ -75,8 +75,8 @@ advantages:
.. code-block:: text
['test_broken.js']
disabled = 'https://bugzilla.mozilla.org/show_bug.cgi?id=123456'
["test_broken.js"]
disabled = "https://bugzilla.mozilla.org/show_bug.cgi?id=123456"
* ability to run different (subsets of) tests on different
platforms. Traditionally, we've done a bit of magic or had the test
@ -86,10 +86,8 @@ advantages:
.. code-block:: text
['test_works_on_windows_only.js']
skip-if = [
"os != 'win'",
]
["test_works_on_windows_only.js"]
skip-if = ["os != 'win'"]
* ability to markup tests with metadata. We have a large, complicated,
and always changing infrastructure. key, value metadata may be used
@ -98,7 +96,7 @@ advantages:
number, if it were desirable.
* ability to have sane and well-defined test-runs. You can keep
different manifests for different test runs and ``['include:FILENAME.toml']``
different manifests for different test runs and ``["include:FILENAME.toml"]``
(sub)manifests as appropriate to your needs.
Manifest Format
@ -109,9 +107,9 @@ relative to the manifest:
.. code-block:: text
['foo.js']
['bar.js']
['fleem.js']
["foo.js"]
["bar.js"]
["fleem.js"]
The sections are read in order. In addition, tests may include
arbitrary key, value metadata to be used by the harness. You may also
@ -121,31 +119,31 @@ be inherited by each test unless overridden:
.. code-block:: text
[DEFAULT]
type = 'restart'
type = "restart"
['lilies.js']
color = 'white'
["lilies.js"]
color = "white"
['daffodils.js']
color = 'yellow'
type = 'other'
["daffodils.js"]
color = "yellow"
type = "other"
# override type from DEFAULT
['roses.js']
color = 'red'
["roses.js"]
color = "red"
You can also include other manifests:
.. code-block:: text
['include:subdir/anothermanifest.toml']
["include:subdir/anothermanifest.toml"]
And reference parent manifests to inherit keys and values from the DEFAULT
section, without adding possible included tests.
.. code-block:: text
['parent:../manifest.toml']
["parent:../manifest.toml"]
Manifests are included relative to the directory of the manifest with
the `[include:]` directive unless they are absolute paths.
@ -155,17 +153,17 @@ new line, or be inline.
.. code-block:: text
['roses.js']
["roses.js"]
# a valid comment
color = 'red' # another valid comment
color = "red" # another valid comment
Because in TOML all values must be quoted there is no risk of an anchor in
an URL being interpreted as a comment.
.. code-block:: text
['test1.js']
url = 'https://foo.com/bar#baz' # Bug 1234
["test1.js"]
url = "https://foo.com/bar#baz" # Bug 1234
Manifest Conditional Expressions
@ -198,7 +196,7 @@ This data corresponds to a one-line manifest:
.. code-block:: text
['testToolbar/testBackForwardButtons.js']
["testToolbar/testBackForwardButtons.js"]
If additional key, values were specified, they would be in this dict
as well.
@ -241,7 +239,7 @@ Additional manifest files may be included with an `[include:]` directive:
.. code-block:: text
['include:path-to-additional-file-manifest.toml']
["include:path-to-additional-file-manifest.toml"]
The path to included files is relative to the current manifest.
@ -317,10 +315,8 @@ files will look like this:
.. code-block:: text
['test_foo.py']
timeout-if = [
"300, os == 'win'",
]
["test_foo.py"]
timeout-if = ["300, os == 'win'"]
The value is <timeout>, <condition> where condition is the same format as the one in
`skip-if`. In the above case, if os == 'win', a timeout of 300 seconds will be
@ -499,6 +495,108 @@ manifestparser includes a suite of tests.
`test_manifest.txt` is a doctest that may be helpful in figuring out
how to use the API. Tests are run via `mach python-test testing/mozbase/manifestparser`.
Using mach manifest skip-fails
``````````````````````````````
The first of the ``mach manifest`` subcommands is ``skip-fails``. This command
can be used to *automatically* edit manifests to skip tests that are failing
as well as file the corresponding bugs for the failures. This is particularly
useful when "greening up" a new platform.
You may verify the proposed changes from ``skip-fails`` output and examine
any local manifest changes with ``hg status``.
Here is the usage:
.. code-block:: text
$ ./mach manifest skip-fails --help
usage: mach [global arguments] manifest skip-fails [command arguments]
Sub Command Arguments:
try_url Treeherder URL for try (please use quotes)
-b BUGZILLA, --bugzilla BUGZILLA
Bugzilla instance
-m META_BUG_ID, --meta-bug-id META_BUG_ID
Meta Bug id
-s, --turbo Skip all secondary failures
-t SAVE_TASKS, --save-tasks SAVE_TASKS
Save tasks to file
-T USE_TASKS, --use-tasks USE_TASKS
Use tasks from file
-f SAVE_FAILURES, --save-failures SAVE_FAILURES
Save failures to file
-F USE_FAILURES, --use-failures USE_FAILURES
Use failures from file
-M MAX_FAILURES, --max-failures MAX_FAILURES
Maximum number of failures to skip (-1 == no limit)
-v, --verbose Verbose mode
-d, --dry-run Determine manifest changes, but do not write them
$
``try_url`` --- Treeherder URL
------------------------------
This is the url (usually in single quotes) from running tests in try, for example:
'https://treeherder.mozilla.org/jobs?repo=try&revision=babc28f495ee8af2e4f059e9cbd23e84efab7d0d'
``--bugzilla BUGZILLA`` --- Bugzilla instance
---------------------------------------------
By default the Bugzilla instance is ``bugzilla.allizom.org``, but you may set it on the command
line to another value such as ``bugzilla.mozilla.org`` (or by setting the environment variable
``BUGZILLA``).
``--meta-bug-id META_BUG_ID`` --- Meta Bug id
---------------------------------------------
Any new bugs that are filed will block (be dependents of) this "meta" bug (optional).
``--turbo`` --- Skip all secondary failures
-------------------------------------------
The default ``skip-fails`` behavior is to skip only the first failure (for a given label) for each test.
In `turbo` mode, all failures for this manifest + label will skipped.
``--save-tasks SAVE_TASKS`` --- Save tasks to file
--------------------------------------------------
This feature is primarily for ``skip-fails`` development and debugging.
It will save the tasks (downloaded via mozci) to the specified JSON file
(which may be used in a future ``--use-tasks`` option)
``--use-tasks USE_TASKS`` --- Use tasks from file
-------------------------------------------------
This feature is primarily for ``skip-fails`` development and debugging.
It will uses the tasks from the specified JSON file (instead of downloading them via mozci).
See also ``--save-tasks``.
``--save-failures SAVE_FAILURES`` --- Save failures to file
-----------------------------------------------------------
This feature is primarily for ``skip-fails`` development and debugging.
It will save the failures (calculated from the tasks) to the specified JSON file
(which may be used in a future ``--use-failures`` option)
``--use-failures USE_FAILURES`` --- Use failures from file
----------------------------------------------------------
This feature is primarily for ``skip-fails`` development and debugging.
It will uses the failures from the specified JSON file (instead of downloading them via mozci).
See also ``--save-failures``.
``--max-failures MAX_FAILURES`` --- Maximum number of failures to skip
----------------------------------------------------------------------
This feature is primarily for ``skip-fails`` development and debugging.
It will limit the number of failures that are skipped (default is -1 == no limit).
``--verbose`` --- Verbose mode
------------------------------
Increase verbosity of output.
``--dry-run`` --- Dry run
-------------------------
In dry run mode, the manifest changes (and bugs top be filed) are determined, but not written.
Bugs
````

View file

@ -2,16 +2,20 @@
# License, v. 2.0. If a copy of the MPL was not distributed with this
# file, You can obtain one at http://mozilla.org/MPL/2.0/.
import gzip
import io
import json
import logging
import os
import os.path
import pprint
import re
import sys
import tempfile
import urllib.parse
from enum import Enum
from pathlib import Path
from xmlrpc.client import Fault
from yaml import load
@ -29,6 +33,14 @@ from mozci.task import TestTask
from mozci.util.taskcluster import get_task
BUGZILLA_AUTHENTICATION_HELP = "Must create a Bugzilla API key per https://github.com/mozilla/mozci-tools/blob/main/citools/test_triage_bug_filer.py"
TASK_LOG = "live_backing.log"
TASK_ARTIFACT = "public/logs/" + TASK_LOG
ATTACHMENT_DESCRIPTION = "Compressed " + TASK_ARTIFACT + " for task "
ATTACHMENT_REGEX = (
r".*Created attachment ([0-9]+)\n.*"
+ ATTACHMENT_DESCRIPTION
+ "([A-Za-z0-9_-]+)\n.*"
)
class MockResult(object):
@ -137,6 +149,7 @@ class Skipfails(object):
self.bugzilla = Skipfails.BUGZILLA_SERVER_DEFAULT
self.component = "skip-fails"
self._bzapi = None
self._attach_rx = None
self.variants = {}
self.tasks = {}
self.pp = None
@ -151,6 +164,7 @@ class Skipfails(object):
"""Lazily initializes the Bugzilla API"""
if self._bzapi is None:
self._bzapi = bugzilla.Bugzilla(self.bugzilla)
self._attach_rx = re.compile(ATTACHMENT_REGEX, flags=re.M)
def pprint(self, obj):
if self.pp is None:
@ -182,6 +196,10 @@ class Skipfails(object):
else:
print(f"INFO: {e}", file=sys.stderr, flush=True)
def vinfo(self, e):
if self.verbose:
self.info(e)
def run(
self,
meta_bug_id=None,
@ -189,6 +207,7 @@ class Skipfails(object):
use_tasks=None,
save_failures=None,
use_failures=None,
max_failures=-1,
):
"Run skip-fails on try_url, return True on success"
@ -197,7 +216,7 @@ class Skipfails(object):
if use_tasks is not None:
if os.path.exists(use_tasks):
self.info(f"use tasks: {use_tasks}")
self.vinfo(f"use tasks: {use_tasks}")
tasks = self.read_json(use_tasks)
tasks = [MockTask(task) for task in tasks]
else:
@ -208,7 +227,7 @@ class Skipfails(object):
if use_failures is not None:
if os.path.exists(use_failures):
self.info(f"use failures: {use_failures}")
self.vinfo(f"use failures: {use_failures}")
failures = self.read_json(use_failures)
else:
self.error(f"use failures JSON file does not exist: {use_failures}")
@ -216,13 +235,14 @@ class Skipfails(object):
else:
failures = self.get_failures(tasks)
if save_failures is not None:
self.info(f"save failures: {save_failures}")
self.vinfo(f"save failures: {save_failures}")
self.write_json(save_failures, failures)
if save_tasks is not None:
self.info(f"save tasks: {save_tasks}")
self.vinfo(f"save tasks: {save_tasks}")
self.write_tasks(save_tasks, tasks)
num_failures = 0
for manifest in failures:
if not manifest.endswith(".toml"):
self.warning(f"cannot process skip-fails on INI manifests: {manifest}")
@ -249,6 +269,12 @@ class Skipfails(object):
repo,
meta_bug_id,
)
num_failures += 1
if max_failures >= 0 and num_failures >= max_failures:
self.warning(
f"max_failures={max_failures} threshold reached. stopping."
)
return True
break # just use the first task_id
return True
@ -268,8 +294,7 @@ class Skipfails(object):
repo = query[Skipfails.REPO][0]
else:
repo = "try"
if self.verbose:
self.info(f"considering {repo} revision={revision}")
self.vinfo(f"considering {repo} revision={revision}")
return revision, repo
def get_tasks(self, revision, repo):
@ -284,8 +309,8 @@ class Skipfails(object):
* True (passed)
classification: Classification
* unknown (default) < 3 runs
* intermittent (not enough failures) >3 runs < 0.5 failure rate
* disable_recommended (enough repeated failures) >3 runs >= 0.5
* intermittent (not enough failures) >3 runs < 0.4 failure rate
* disable_recommended (enough repeated failures) >3 runs >= 0.4
* disable_manifest (disable DEFAULT if no other failures)
* secondary (not first failure in group)
* success
@ -397,7 +422,7 @@ class Skipfails(object):
"classification"
]
if total_runs >= 3:
if failed_runs / total_runs < 0.5:
if failed_runs / total_runs < 0.4:
if failed_runs == 0:
classification = Classification.SUCCESS
else:
@ -551,6 +576,7 @@ class Skipfails(object):
):
"""Skip a failure"""
self.vinfo(f"===== Skip failure in manifest: {manifest} =====")
skip_if = self.task_to_skip_if(task_id)
if skip_if is None:
self.warning(
@ -583,10 +609,14 @@ class Skipfails(object):
)
if log_url is not None:
comment += f"\n\nBug suggestions: {suggestions_url}"
comment += f"\nSpecifically see at line {line_number}:\n"
comment += f'\n "{line}"'
comment += f"\n\nIn the log: {log_url}"
comment += f"\nSpecifically see at line {line_number} in the attached log: {log_url}"
comment += f'\n\n "{line}"\n'
platform, testname = self.label_to_platform_testname(label)
if platform is not None:
comment += "\n\nCommand line to reproduce:\n\n"
comment += f" \"mach try fuzzy -q '{platform}' {testname}\""
bug_summary = f"MANIFEST {manifest}"
attachments = {}
bugs = self.get_bugs_by_summary(bug_summary)
if len(bugs) == 0:
description = (
@ -602,7 +632,7 @@ class Skipfails(object):
else:
bug = self.create_bug(bug_summary, description, product, component)
bugid = bug.id
self.info(
self.vinfo(
f'Created Bug {bugid} {product}::{component} : "{bug_summary}"'
)
bug_reference = f"Bug {bugid}" + bug_reference
@ -611,11 +641,22 @@ class Skipfails(object):
bug_reference = f"Bug {bugid}" + bug_reference
product = bugs[0].product
component = bugs[0].component
self.info(f'Found Bug {bugid} {product}::{component} "{bug_summary}"')
self.vinfo(f'Found Bug {bugid} {product}::{component} "{bug_summary}"')
if meta_bug_id is not None:
if meta_bug_id in bugs[0].blocks:
self.info(f" Bug {bugid} already blocks meta bug {meta_bug_id}")
self.vinfo(f" Bug {bugid} already blocks meta bug {meta_bug_id}")
meta_bug_id = None # no need to add again
comments = bugs[0].getcomments()
for i in range(len(comments)):
text = comments[i]["text"]
m = self._attach_rx.findall(text)
if len(m) == 1:
a_task_id = m[0][1]
attachments[a_task_id] = m[0][0]
if a_task_id == task_id:
self.vinfo(
f" Bug {bugid} already has the compressed log attached for this task"
)
else:
self.error(f'More than one bug found for summary: "{bug_summary}"')
return
@ -623,11 +664,16 @@ class Skipfails(object):
self.warning(f"Dry-run NOT adding comment to Bug {bugid}: {comment}")
self.info(f'Dry-run NOT editing ["{filename}"] manifest: "{manifest}"')
self.info(f'would add skip-if condition: "{skip_if}" # {bug_reference}')
if task_id not in attachments:
self.info("would add compressed log for this task")
return
self.add_bug_comment(bugid, comment, meta_bug_id)
self.info(f"Added comment to Bug {bugid}: {comment}")
if meta_bug_id is not None:
self.info(f" Bug {bugid} blocks meta Bug: {meta_bug_id}")
if task_id not in attachments:
self.add_attachment_log_for_task(bugid, task_id)
self.info("Added compressed log for this task")
mp = ManifestParser(use_toml=True, document=True)
manifest_path = os.path.join(self.topsrcdir, os.path.normpath(manifest))
mp.read(manifest_path)
@ -745,7 +791,7 @@ class Skipfails(object):
Provide defaults (in case command_context is not defined
or there isn't file info available).
"""
if self.command_context is not None:
if path != "DEFAULT" and self.command_context is not None:
reader = self.command_context.mozbuild_reader(config_mode="empty")
info = reader.files_info([path])
cp = info[path]["BUG_COMPONENT"]
@ -774,7 +820,7 @@ class Skipfails(object):
def get_push_id(self, revision, repo):
"""Return the push_id for revision and repo (or None)"""
self.info(f"Retrieving push_id for {repo} revision: {revision} ...")
self.vinfo(f"Retrieving push_id for {repo} revision: {revision} ...")
if revision in self.push_ids: # if cached
push_id = self.push_ids[revision]
else:
@ -801,7 +847,7 @@ class Skipfails(object):
def get_job_id(self, push_id, task_id):
"""Return the job_id for push_id, task_id (or None)"""
self.info(f"Retrieving job_id for push_id: {push_id}, task_id: {task_id} ...")
self.vinfo(f"Retrieving job_id for push_id: {push_id}, task_id: {task_id} ...")
if push_id in self.job_ids: # if cached
job_id = self.job_ids[push_id]
else:
@ -829,7 +875,7 @@ class Skipfails(object):
Return the (suggestions_url, line_number, line, log_url)
for the given repo and job_id
"""
self.info(
self.vinfo(
f"Retrieving bug_suggestions for {repo} job_id: {job_id}, path: {path} ..."
)
suggestions_url = f"https://treeherder.mozilla.org/api/project/{repo}/jobs/{job_id}/bug_suggestions/"
@ -844,7 +890,7 @@ class Skipfails(object):
if len(response) > 0:
for sugg in response:
if sugg["path_end"] == path:
line_number = sugg["line_number"]
line_number = sugg["line_number"] + 1
line = sugg["search"]
log_url = f"https://treeherder.mozilla.org/logviewer?repo={repo}&job_id={job_id}&lineNumber={line_number}"
break
@ -895,3 +941,54 @@ class Skipfails(object):
jtask["failure_types"] = jft
jtasks.append(jtask)
self.write_json(save_tasks, jtasks)
def label_to_platform_testname(self, label):
"""convert from label to platform, testname for mach command line"""
platform = None
testname = None
platform_details = label.split("/")
if len(platform_details) == 2:
platform, details = platform_details
words = details.split("-")
if len(words) > 2:
platform += "/" + words.pop(0) # opt or debug
try:
_chunk = int(words[-1])
words.pop()
except ValueError:
pass
words.pop() # remove test suffix
testname = "-".join(words)
else:
platform = None
return platform, testname
def add_attachment_log_for_task(self, bugid, task_id):
"""Adds compressed log for this task to bugid"""
log_url = f"https://firefox-ci-tc.services.mozilla.com/api/queue/v1/task/{task_id}/artifacts/public/logs/live_backing.log"
r = requests.get(log_url, headers=self.headers)
if r.status_code != 200:
self.error(f"Unable get log for task: {task_id}")
return
attach_fp = tempfile.NamedTemporaryFile()
fp = gzip.open(attach_fp, "wb")
fp.write(r.text.encode("utf-8"))
fp.close()
self._initialize_bzapi()
description = ATTACHMENT_DESCRIPTION + task_id
file_name = TASK_LOG + ".gz"
comment = "Added compressed log"
content_type = "application/gzip"
try:
self._bzapi.attachfile(
[bugid],
attach_fp.name,
description,
file_name=file_name,
comment=comment,
content_type=content_type,
is_private=False,
)
except Fault:
pass # Fault expected: Failed to fetch key 9372091 from network storage: The specified key does not exist.

View file

@ -166,5 +166,15 @@ def test_get_filename_in_manifest():
)
def test_label_to_platform_testname():
"""Test label_to_platform_testname"""
sf = Skipfails()
label = "test-linux2204-64-wayland/opt-mochitest-browser-chrome-swr-13"
platform, testname = sf.label_to_platform_testname(label)
assert platform == "test-linux2204-64-wayland/opt"
assert testname == "mochitest-browser-chrome"
if __name__ == "__main__":
main()