This improves the integrity of downloads of upstream artifacts when using fetch-content. If `verify-hash: True` is set on the fetch config, then the chain-of-trust.json of the upstream task is used to retrieve the expected sha256 of the artifact, and the downloaded artifact is checked against it.
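A minimal sketch of the check this enables (illustrative names, not the actual fetch-content code), assuming the expected digest has already been pulled out of the upstream task's chain-of-trust.json:

    import hashlib

    def verify_sha256(path, expected):
        # Hash the downloaded artifact in chunks and compare against the
        # sha256 recorded in the upstream chain-of-trust.json.
        h = hashlib.sha256()
        with open(path, 'rb') as fh:
            for chunk in iter(lambda: fh.read(1024 * 1024), b''):
                h.update(chunk)
        if h.hexdigest() != expected:
            raise ValueError('sha256 mismatch for %s: expected %s, got %s'
                             % (path, expected, h.hexdigest()))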
Differential Revision: https://phabricator.services.mozilla.com/D87725
There are cases where --recurse-submodules breaks things (e.g. when
newer versions of the repository remove a submodule). So don't pass
--recurse-submodules at all at clone or checkout time; instead,
initialize and update submodules after the checkout.
Also don't checkout at clone time: it's redundant with the explicit
checkout that follows, and that explicit checkout is the only one we
really trust anyway.
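Roughly, the resulting flow looks like the following sketch (a simplification, not the exact fetch-content code):

    import subprocess

    def git_checkout(repo, commit, dest):
        # Clone without checking out and without --recurse-submodules; the
        # explicit checkout below is the only one we rely on.
        subprocess.run(['git', 'clone', '--no-checkout', repo, dest], check=True)
        subprocess.run(['git', 'checkout', commit], cwd=dest, check=True)
        # Initialize and update submodules only after the checkout, so a
        # submodule removed in a newer revision can't break the clone.
        subprocess.run(['git', 'submodule', 'update', '--init'],
                       cwd=dest, check=True)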
Differential Revision: https://phabricator.services.mozilla.com/D73353
The win64-aarch64 jobs have a kind of nasty trick that makes fetch-content
download artifacts of a dependent task directly as artifacts of the task
itself. For some reason, while this pattern works on native Windows
jobs, it doesn't on Linux. What happens is essentially that
`pathlib.Path(path).joinpath('../foo').mkdir(parents=True, exist_ok=True)`
fails when `path` doesn't exist first. I guess the fetches directory
already exists on the Windows workers or something.
Unfortunately, os.path.normpath doesn't take `pathlib.Path`s in the
still-supported Python 3.5, so we have to convert to str first.
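An illustrative version of the workaround (not the exact change):

    import os
    import pathlib

    def ensure_subdir(base, relative):
        # pathlib doesn't collapse the '..' component, so mkdir() can fail on
        # Linux when the intermediate directory doesn't exist yet. Normalize
        # the path first; os.path.normpath() needs a str on Python 3.5.
        path = os.path.normpath(str(pathlib.Path(base).joinpath(relative)))
        pathlib.Path(path).mkdir(parents=True, exist_ok=True)
        return pathlib.Path(path)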
Differential Revision: https://phabricator.services.mozilla.com/D66518
--HG--
extra : moz-landing-system : lando
Note: while we can use time.monotonic in fetch-content, we can't in
mach artifact toolchain yet, because the latter is still Python 2.
Differential Revision: https://phabricator.services.mozilla.com/D65690
--HG--
extra : moz-landing-system : lando
It will be relanded once these are complete. This prevents those tasks
from getting scheduled for every push until the initial ones have completed.
Using git-archive for the fetch task means that we don't get the
submodules of a git repository included in the archive. There isn't a
straightforward way to get submodules from a bare repo included with
git-archive, so instead we can simply clone & checkout with
--recurse-submodules and then use a standard tar command to bundle up
the tree.
Adding --recurse-submodules to the commands has no effect on a repo
without submodules, so we can add it to all invocations for simplicity.
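A sketch of that approach (the function shape and the .git exclusion are illustrative, not the actual task commands):

    import subprocess

    def archive_with_submodules(repo, ref, workdir, out_tar):
        # git-archive skips submodules, so materialize a full checkout...
        subprocess.run(['git', 'clone', '--recurse-submodules', repo, workdir],
                       check=True)
        subprocess.run(['git', 'checkout', '--recurse-submodules', ref],
                       cwd=workdir, check=True)
        # ...then bundle the working tree, including submodules, with a
        # standard tar invocation, leaving out the .git metadata.
        subprocess.run(['tar', '--exclude=.git', '-cf', out_tar, '.'],
                       cwd=workdir, check=True)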
Differential Revision: https://phabricator.services.mozilla.com/D46827
--HG--
extra : moz-landing-system : lando
Python's `urllib.request.urlopen(url)` can fail when a system doesn't know how to verify a CA certificate. This patch makes use of the cafile provided by the `certifi` module, if/when it is installed, to verify certificates.
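A minimal sketch of the idea (not the exact patch):

    import urllib.request

    def open_url(url):
        # Use certifi's CA bundle when it's installed; otherwise fall back to
        # the system's default certificate verification.
        try:
            import certifi
            cafile = certifi.where()
        except ImportError:
            cafile = None
        return urllib.request.urlopen(url, cafile=cafile)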
Differential Revision: https://phabricator.services.mozilla.com/D47044
--HG--
extra : source : 92b9ffc8f37ddd16ca3f426d64df059eea38d5fa
MANUAL PUSH: to allow docker images to build without closing autoland
Differential Revision: https://phabricator.services.mozilla.com/D41038
--HG--
extra : rebase_source : 60ae00549917411d1839b6e3f8e6ae962d217470
extra : amend_source : a2531b115f5732345f8c34c88669428510d100a4
Bug 1479533 proposed adding similar functionality, but this
iteration avoids actually unpacking anything, and ensures
reproducibility by relying on the reproducible bits from the original
archives: file ordering, flags, etc. (since the archives are checksummed,
those bits are never going to change for a given archive).
Another notable difference is that this applies the repack in the fetch
task itself, rather than creating a separate task to apply the repack. The
latter has advantages, in that it allows changing the repacking without
redownloading the original file from a third-party server, but in
practice, most changes to the repacking would trigger the download tasks
anyway.
This patch only takes care of changing the archive type (zip->tar) and
the compression type (anything->zstandard).
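A simplified sketch of such a repack, assuming the third-party python-zstandard module and glossing over permission/metadata handling:

    import tarfile
    import zipfile

    import zstandard  # third-party python-zstandard module

    def repack_zip_to_tar_zst(zip_path, out_path):
        # Convert a zip to a zstandard-compressed tar without extracting to
        # disk, preserving the original member order so the output depends
        # only on the (checksummed) input archive.
        cctx = zstandard.ZstdCompressor()
        with open(out_path, 'wb') as out, \
                cctx.stream_writer(out) as writer, \
                tarfile.open(mode='w|', fileobj=writer) as tar, \
                zipfile.ZipFile(zip_path) as zf:
            for info in zf.infolist():
                member = tarfile.TarInfo(info.filename)
                member.size = info.file_size
                if info.is_dir():
                    member.type = tarfile.DIRTYPE
                with zf.open(info) as fh:
                    tar.addfile(member, fh)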
Differential Revision: https://phabricator.services.mozilla.com/D40740
This is loosely based on what was in bug 1467359, but simplified to
handle git only, and simply using git-archive because, at least for now,
it's deterministic (it uses the commit date as the timestamp in tar
archives).
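For reference, the git-archive invocation is essentially the following (a sketch, not the exact task command):

    import subprocess

    def git_archive(repo_dir, ref, prefix, out_path):
        # git-archive output is deterministic for a given commit: tar entries
        # are timestamped with the commit date.
        with open(out_path, 'wb') as out:
            subprocess.run(
                ['git', 'archive', '--format=tar', '--prefix=%s/' % prefix, ref],
                cwd=repo_dir, stdout=out, check=True)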
This also adds 4 tasks for some of the things we use for toolchains, but
doesn't hook them up yet.
This also upgrades the fetch docker image to Debian buster, and installs
the required packages in it.
Differential Revision: https://phabricator.services.mozilla.com/D39480
Rather than trying to parse strings, just pass a JSON blob. This will allow us
to easily do things like mark artifacts to be left unextracted.
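For illustration only (the key names below are assumed, not the actual schema):

    import json
    import os

    # A JSON MOZ_FETCHES value is trivially extensible compared to an ad-hoc
    # string format, e.g. with a per-artifact flag to skip extraction.
    fetches = json.loads(os.environ.get('MOZ_FETCHES', '[]'))
    for fetch in fetches:
        task_id = fetch['task']          # assumed key
        artifact = fetch['artifact']     # assumed key
        extract = fetch.get('extract', True)  # assumed key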
Differential Revision: https://phabricator.services.mozilla.com/D3553
--HG--
extra : rebase_source : 4e762c65d1c9f13361d5bae2e4608ba09bb39a91
This also moves the call to 'fetch_artifacts' in run-task down inside the
try/finally block. This way, if something goes wrong, we'll still clean up
MOZ_FETCHES_DIR.
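The resulting shape, sketched with stand-in callables for the real run-task steps:

    import os
    import shutil

    def run_with_fetches(fetch_artifacts, run_task):
        # fetch_artifacts and run_task stand in for the real steps; the point
        # is that fetching happens inside the try block, so MOZ_FETCHES_DIR is
        # cleaned up even when fetching itself fails.
        fetches_dir = os.environ.get('MOZ_FETCHES_DIR')
        try:
            fetch_artifacts()
            return run_task()
        finally:
            if fetches_dir and os.path.isdir(fetches_dir):
                shutil.rmtree(fetches_dir)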
Differential Revision: https://phabricator.services.mozilla.com/D4152
--HG--
extra : moz-landing-system : lando
Otherwise it can't be used as a context manager since it
doesn't have __enter__ or __exit__.
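For reference, the minimal protocol a class needs before it can be used in a `with` statement (the class name here is a placeholder):

    class Downloader:
        def __enter__(self):
            return self

        def __exit__(self, exc_type, exc_value, traceback):
            self.close()
            return False  # don't swallow exceptions

        def close(self):
            pass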
Differential Revision: https://phabricator.services.mozilla.com/D2672
--HG--
extra : moz-landing-system : lando
This is what a lot of programs do.
We do logging in a helper function so we can flush after every write.
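A sketch of such a helper (illustrative, not the exact code):

    import sys

    def log(msg):
        # Route all output through one helper and flush immediately, so
        # messages show up in order in CI logs even when stdout is a pipe.
        print(msg)
        sys.stdout.flush()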
Differential Revision: https://phabricator.services.mozilla.com/D2526
--HG--
extra : rebase_source : 98563aee129c16662a783122241623b8ed2fe457
Previously, we told `tar` or `unzip` to operate on an explicit file.
This worked when `tar` understood the compression format of the file.
And this worked in the majority of cases.
But `tar` does not support zstandard compression (at least not outside
extremely new versions, which aren't yet widely deployed). And not all
versions of `tar` support the `-a` argument.
This commit changes our invocation of `tar` so input data is piped
to it from Python. In the case of `tar`, we perform decompression in
Python, if possible. This allows us to support zstandard, as well as `tar`
binaries that don't support `-a` to auto-detect the compression format.
I wanted to be consistent and always pipe the raw data via stdin.
But `unzip` doesn't appear to like this. Oh well.
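A simplified sketch of the piping approach (only zstandard decompression is shown, and the third-party python-zstandard module is assumed):

    import subprocess

    import zstandard  # third-party python-zstandard module

    def extract_tar(path, dest):
        # Open the archive in Python and pipe plain tar data to tar's stdin,
        # so the tar binary doesn't need to understand the compression format
        # (or support -a).
        with open(path, 'rb') as raw:
            fh = (zstandard.ZstdDecompressor().stream_reader(raw)
                  if path.endswith('.zst') else raw)
            proc = subprocess.Popen(['tar', '-xf', '-'], cwd=dest,
                                    stdin=subprocess.PIPE)
            while True:
                chunk = fh.read(131072)
                if not chunk:
                    break
                proc.stdin.write(chunk)
            proc.stdin.close()
            if proc.wait():
                raise Exception('tar extraction failed for %s' % path)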
We also refactor the logic around detecting archives. We have a
function to identify the archive type based on a filename. We then
pass the archive type to the extraction function and key off it
within that function. We also conditionally call extract_archive() and
fail hard in extract_archive() when things fail. This will make
future archive code easier to reason about.
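The filename-based detection amounts to something like the following (the suffix list is illustrative, not exhaustive):

    def archive_type(path):
        # Identify the archive type from the filename so extraction can key
        # off it.
        if path.endswith('.zip'):
            return 'zip'
        if path.endswith(('.tar', '.tar.gz', '.tgz', '.tar.bz2',
                          '.tar.xz', '.tar.zst')):
            return 'tar'
        return None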
Differential Revision: https://phabricator.services.mozilla.com/D1576
--HG--
extra : rebase_source : 1c66396cced1b2a94a959386eecc3f512b033308
Currently 'fetch' artifacts are all extracted into the same directory, which
could make the extdir messy or, in the worst case, cause file name collisions.
Some artifacts are OK to extract into the same directory, as they're already
bundled within the archive. But other artifacts are not. This patch keeps the
default behaviour (extracting everything into the same directory), but allows
task authors to specify per-artifact directories to extract into.
The syntax is:
path[>dest]@<task>
The 'dest' value will be a subdirectory of the MOZ_FETCHES_DIR environment
variable.
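Parsing that syntax is straightforward; a sketch (the function name is illustrative):

    def parse_fetch_spec(spec):
        # "path[>dest]@<task>": dest defaults to the top of MOZ_FETCHES_DIR
        # when omitted.
        path, _, task = spec.rpartition('@')
        if '>' in path:
            path, dest = path.rsplit('>', 1)
        else:
            dest = ''
        return task, path, dest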
Depends on D2102.
Differential Revision: https://phabricator.services.mozilla.com/D2166
--HG--
extra : moz-landing-system : lando
Currently, many tasks fetch content from the Internets. A problem with
that is fetching from the Internets is unreliable: servers may have
outages or be slow; content may disappear or change out from under us.
The unreliability of 3rd party services poses a risk to Firefox CI.
If services aren't available, we could potentially not run some CI tasks.
In the worst case, we might not be able to release Firefox. That would
be bad. In fact, as I write this, gmplib.org has been unavailable for
~24 hours and Firefox CI is unable to retrieve the GMP source code.
As a result, building GCC toolchains is failing.
A solution to this is to make tasks more hermetic by depending on
fewer network services (which by definition aren't reliable over time
and therefore introduce instability).
This commit attempts to mitigate some external service dependencies
by introducing the *fetch* task kind.
The primary goal of the *fetch* kind is to obtain remote content and
re-expose it as a task artifact. By making external content available
as a cached task artifact, we allow dependent tasks to consume this
content without touching the service originally providing that
content, thus eliminating a run-time dependency and making tasks more
hermetic and reproducible over time.
We introduce a single "fetch-url" "using" flavor to define tasks that
fetch single URLs and then re-expose that URL as an artifact. Powering
this is a new, minimal "fetch" Docker image that contains a
"fetch-content" Python script that does the work for us.
We have added tasks to fetch source archives used to build the GCC
toolchains.
Fetching remote content and re-exposing it as an artifact is not
very useful by itself: the value is in having tasks use those
artifacts.
We introduce a taskgraph transform that allows tasks to define an
array of "fetches." Each entry corresponds to the name of a "fetch"
task kind. When present, the corresponding "fetch" task is added as a
dependency. And the task ID and artifact path from that "fetch" task
are added to the MOZ_FETCHES environment variable of the task depending
on it. Our "fetch-content" script has a "task-artifacts"
sub-command that tasks can execute to perform retrieval of all
artifacts listed in MOZ_FETCHES.
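In rough terms, the consumer side looks like this sketch (the entry format and queue URL layout are assumptions for illustration, not the exact implementation):

    import os
    import urllib.request

    QUEUE = 'https://queue.taskcluster.net/v1'  # assumed queue root URL

    def task_artifacts(dest_dir):
        # Each MOZ_FETCHES entry names an artifact path and the task that
        # produced it; download each into MOZ_FETCHES_DIR.
        for entry in os.environ.get('MOZ_FETCHES', '').split():
            path, _, task_id = entry.rpartition('@')
            url = '%s/task/%s/artifacts/%s' % (QUEUE, task_id, path)
            dest = os.path.join(dest_dir, os.path.basename(path))
            urllib.request.urlretrieve(url, dest)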
To prove all of this works, the code for fetching dependencies when
building GCC toolchains has been updated to use `fetch-content`. The
now-unused legacy code has been deleted.
This commit improves the reliability and efficiency of GCC toolchain
tasks. Dependencies now all come from task artifacts and should always
be available in the common case. In addition, `fetch-content` downloads
and extracts files concurrently. This makes it faster than the serial
approach we were previously using.
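The concurrency itself is the standard `concurrent.futures` pattern; a sketch with the per-artifact work passed in as a callable:

    import concurrent.futures

    def fetch_all(fetch_one, items, max_workers=4):
        # Run fetch_one(item) for every item concurrently and surface the
        # first failure, instead of downloading and extracting serially.
        with concurrent.futures.ThreadPoolExecutor(max_workers) as executor:
            futures = [executor.submit(fetch_one, item) for item in items]
            for future in concurrent.futures.as_completed(futures):
                future.result()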
There are some things I don't like about this commit.
First, a new Docker image and Python script for downloading URLs feels
a bit heavyweight. The Docker image is definitely overkill as things
stand. I can eventually justify it because I want to implement support
for fetching and repackaging VCS repositories and for caching Debian
packages. These will require more packages than what I'm comfortable
installing on the base Debian image, therefore justifying a dedicated
image.
The `fetch-content static-url` sub-command could definitely be
implemented as a shell script. But Python is readily available and
is more pleasant to maintain than shell, so I wrote it in Python.
`fetch-content task-artifacts` is more advanced and writing it in
Python is more justified, IMO. FWIW, the script is Python 3 only,
which conveniently gives us access to `concurrent.futures`, which
facilitates concurrent download.
`fetch-content` also duplicates functionality found elsewhere.
generic-worker's task payload supports a "mounts" feature which
facilitates downloading remote content, including from a task
artifact. However, this feature doesn't exist on docker-worker.
So we have to implement downloading inside the task rather than
at the worker level. I concede that if all workers had generic-worker's
"mounts" feature and supported concurrent download, `fetch-content`
wouldn't need to exist.
`fetch-content` also duplicates functionality of
`mach artifact toolchain`. I probably could have used
`mach artifact toolchain` instead of writing
`fetch-content task-artifacts`. However, I didn't want to introduce
the requirement of a VCS checkout. `mach artifact toolchain` has its
origins in providing a feature to the build system. And "fetching
artifacts from tasks" is a more generic feature than that. I think
it should be implemented as a generic feature and not something that is
"toolchain" specific.
I think the best place for a generic "fetch content" feature is in
the worker, where content can be defined in the task payload. But as
explained above, that feature isn't universally available. The next
best place is probably run-task. run-task already performs generic,
very-early task preparation steps, such as performing a VCS checkout.
I would like to fold `fetch-content` into run-task and make it all
driven by environment variables. But run-task is currently Python 2
and achieving concurrency would involve a bit of programming (or
adding package dependencies). I may very well port run-task to Python
3 and then fold fetch-content into it. Or maybe we leave
`fetch-content` as a standalone script.
MozReview-Commit-ID: AGuTcwNcNJR
--HG--
extra : source : 0b941cbdca76fb2fbb98dc5bbc1a0237c69954d0
extra : histedit_source : a3e43bdd8a9a58550bef02fec3be832ca304ea93