From 3d9d459e5206f7bb9b91cf20677a8ccc3a347df9 Mon Sep 17 00:00:00 2001
From: Pierre-Yves David
Date: Wed, 1 May 2024 14:54:59 +0000
Subject: [PATCH] Bug 1894160: hgignore: simplify the egginfo pattern; r=sheehan
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

This lookahead expression prevents the use of more modern and efficient
regexp engines. This slows down "hg status" and other operations.

Since the exceptions are only about vendored content whose addition is
managed by a script (`mach vendor`), that script can deal with this exception
by itself, and it does since the last changeset.

So we drop the exception to unlock various performance improvements for
status.

### Why does this improve things?

The improvement can come from different sources:

* Using the "re2" regexp engine to match ignored files and directories
  provides a performance boost for both vanilla Mercurial installations and
  fsmonitor ones in various cases. To benefit from it, just install the
  "google-re2" package and Mercurial will automatically use it.

* Installing a Mercurial compiled with the Rust extensions unlocks the use of
  a more efficient code path for status that performs the necessary actions
  in a smarter and parallel way, providing a significant boost. These
  extensions are available on Linux and macOS, and some distributions have
  started to enable them by default.

* Moving to a more modern "dirstate" format. The dirstate tracks the state of
  the working copy. For a couple of years, Mercurial has had a new format for
  this information that is more efficient to read and update and tracks
  finer-grained information. This allows substantial improvements in the way
  we run status. The Rust extensions are required to use this format
  efficiently.

* Using a pure-Rust executable. Mercurial has a pure-Rust version (called
  "rhg") that can handle a limited set of commands.
  It runs without the overhead of starting and initializing Python, providing
  another very significant boost to performance, but it obviously requires
  the Rust code path to be usable.

### Quick Conclusion of the Benchmarks

(Putting that first for people who just want a quick read.)

* fsmonitor struggles on working copies with many modifications.
* Using the "re2" binding from "google-re2" helps, especially for these
  cases.
* On a typical Mozilla developer machine, the Rust variants match the
  fsmonitor performance at worst and exceed it in multiple cases. In
  particular, they do not struggle with the "many modifications" case.
* On smaller machines, the Rust variants still provide a solid and reliable
  performance win across all operations. That makes them preferable to
  fsmonitor.
* The Rust variants match "git status" performance on equivalent workloads.
  The pure Rust version significantly outperforms it.

### Benchmarks descriptions

Machines
--------

We ran benchmarks on two different machines:

* An i7-7700K, 4 physical / 4 logical cores, released in Jan 2017
  To see performance in a "low" parallelism case.
* An i9-9900K, 8 physical / 16 logical cores, released in October 2018
  To see performance in a "high" parallelism case.

In both cases the repositories lived in a btrfs file system backed by solid
state disks (ssd or nvme) and the machines had enough RAM to keep caches in
memory.

I also ran benchmarks on a more modern i7-1370P, released in Jan 2023, and
the results were consistent with the i9-9900K ones.

Variants
--------

Benchmarks were run with multiple variants of Mercurial:

* python-re:
  * no Rust extensions used,
  * regex engine is the std-lib "re" module.
  * fsmonitor is disabled
  * using the dirstate-v1 format
* python-re2:
  * no Rust extensions used,
  * regex engine is the "re2" module from the "google-re2" package.
  * fsmonitor is disabled
  * using the dirstate-v1 format
* fsmonitor-re:
  * no Rust extensions used,
  * regex engine is the std-lib "re" module.
  * fsmonitor is enabled and working at its best
  * using the dirstate-v1 format
* fsmonitor-re2:
  * no Rust extensions used,
  * regex engine is the "re2" module from the "google-re2" package.
  * fsmonitor is enabled and working at its best
  * using the dirstate-v1 format
* rust-ds1:
  * Rust extensions are used,
  * regex engine from the Rust "regex" crate.
  * fsmonitor is disabled
  * using the dirstate-v1 format
* rust-ds2:
  * Rust extensions are used,
  * regex engine from the Rust "regex" crate.
  * fsmonitor is disabled
  * using the dirstate-v2 format
* rhg-ds1:
  * Pure Rust executable is used,
  * regex engine from the Rust "regex" crate.
  * fsmonitor is disabled
  * using the dirstate-v1 format
* rhg-ds2:
  * Pure Rust executable is used,
  * regex engine from the Rust "regex" crate.
  * fsmonitor is disabled
  * using the dirstate-v2 format

Commands
--------

We ran two kinds of operations:

* `hg status` with the default output.
  This command needs to search for ignored and unknown files. In this case,
  improving the regex engine usually provides a significant performance gain.
* `hg status --modified --added --removed --deleted`.
  This command only needs to check the state of tracked files. In this case,
  improving the regex engine does not have much effect, but it is interesting
  to compare the performance of the various implementations.
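
To make the link between the ignore pattern and the default `hg status` run
more concrete, here is a minimal sketch comparing the old lookahead pattern
with the simplified one from this patch. It only uses the Python std-lib "re"
module, which accepts both forms (lookaround is the part that RE2-style
engines do not support). The sample paths and the `is_ignored` helper are
hypothetical, and using `search` is an assumption made for illustration, not
a description of Mercurial's matcher.

```python
import re

# Old .hgignore pattern: needs lookahead to carve out the vendored trees.
OLD = re.compile(
    r"^(?=.*\.egg-info/)"
    r"(?!^third_party/python/)"
    r"(?!^testing/web-platform/tests/tools/third_party/)"
)
# New .hgignore pattern: no lookaround, so simpler engines can handle it.
NEW = re.compile(r".*\.egg-info/")

# Hypothetical paths, chosen only to exercise each branch of the patterns.
PATHS = [
    "python/example/example.egg-info/PKG-INFO",      # first-party
    "third_party/python/foo/foo.egg-info/PKG-INFO",  # vendored
    "python/example/example/base.py",                # unrelated file
]


def is_ignored(pattern, path):
    """Return True when the pattern matches somewhere in the path."""
    return pattern.search(path) is not None


# Maps each path to (ignored by old pattern, ignored by new pattern).
results = {path: (is_ignored(OLD, path), is_ignored(NEW, path)) for path in PATHS}

for path, (old, new) in results.items():
    print(f"{path}: old={old} new={new}")
```

The only behavioural difference is on the vendored path, which the new
pattern now ignores as well. That is the exception the commit message says
the vendoring script handles by itself.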

Working copies
--------------

Case 1: pristine-928b0540e421

  Working copy parent is 928b0540e421

  * 341 759 tracked files
  * 21 253 directories
  * no untracked files

Case 2: pristine-8f96f8c756ae

  Working copy parent is 8f96f8c756ae
  (an older changeset for which I had a dirty working copy)

  * 246 855 tracked files
  * 15 047 directories
  * no untracked files

Case 3: clean-8f96f8c756ae

  Working copy parent is 8f96f8c756ae

  * 246 855 tracked files
  * 23 540 directories
  * 79 901 ignored files

Case 4: dirty-8f96f8c756ae

  Working copy parent is 8f96f8c756ae

  * 246 855 tracked files
  * 33 720 directories
  * 244 386 clean files
  * 1 065 modified files
  * 247 added files
  * 1 040 removed files
  * 364 missing files
  * 63 455 unknown files
  * 79 915 ignored files

### Results Analysis

(full, raw numbers after this section)

About fsmonitor
---------------

Before diving into the improvements related to the regex engine, we can note
that the benchmarks show that fsmonitor provides a good boost in the
pristine/clean cases, and a noticeable but disappointing improvement in the
very dirty case.

                            python-re   fsmonitor-re
    pristine-928b0540e421:  1.884     → 0.293 (-85%)
    dirty-8f96f8c756ae:     2.157     → 1.440 (-33%)

Surprisingly, when only listing tracked files (during commit, for example),
fsmonitor actually becomes counterproductive in the very dirty case:

    pristine-928b0540e421:  1.313     → 0.297 (-77%)
    dirty-8f96f8c756ae:     0.993     → 1.272 (+28%)

In addition to being disappointing in the very dirty case, the performance
with fsmonitor collapses when it cannot use its cache. I observed 4 second
execution times while setting up the benchmark.

Improvement without involving Rust:
-----------------------------------

Using the re2 binding from the google-re2 package provides a small
improvement to plain Python execution (about 15%). This case is relevant
because this is the one that will be used when fsmonitor cannot help or
start.

                            python-re   python-re2
    pristine-928b0540e421:  1.884     → 1.650 (-15%)
    dirty-8f96f8c756ae:     2.157     → 1.718 (-20%)

It does not make a difference when only listing tracked files, as the
hgignore is not involved.

                            python-re   python-re2
    pristine-928b0540e421:  1.313     → 1.332
    dirty-8f96f8c756ae:     0.993     → 0.998

However, surprisingly, it helps fsmonitor quite a lot in the dirty case
(dirty-8f96f8c756ae), bringing fsmonitor performance in line with the plain
Python one.

                   fsmonitor-re   fsmonitor-re2
    list-unknown   1.440        → 1.012 (-30%)
    tracked only   1.272        → 0.840 (-34%)

To conclude, being able to use the "re2" regex engine saves up to ⅓ of the
runtime of some operations and never slows things down. That's a good win.

Improvement involving Rust variants:
------------------------------------

For the pristine-928b0540e421 case (all tracked files clean, no ignored
files), Rust provides a speed boost "equivalent" to (or better than) the one
from fsmonitor. The precise comparison depends on the parallelism level.

With the 4 physical / 4 logical core machine, the Python+Rust version is
slower than fsmonitor, with dirstate-v2 helping to close some of the gap.
Using dirstate-v2 also allows the "rhg" version to become twice as fast as
the fsmonitor version. Also keep in mind that even when a bit slower, the
performance of the Rust version will be much more stable than fsmonitor.

    python-re2:     1.650
    fsmonitor-re2:  0.296 (-82%)
    rust-ds1:       0.542 (-67%)
    rust-ds2:       0.368 (-77%)
    rhg-ds1:        0.401 (-75%)
    rhg-ds2:        0.132 (-92%)

With the 8 physical / 16 logical core machine, the Rust variants catch up
with fsmonitor performance much quicker. The dirstate-v1 version is a little
slower, but the dirstate-v2 version is already faster. The pure Rust
executable is always faster.

    python-re2:     1.430
    fsmonitor-re2:  0.278 (-80%)
    rust-ds1:       0.359 (-74%)
    rust-ds2:       0.259 (-81%)
    rhg-ds1:        0.235 (-83%)
    rhg-ds2:        0.052 (-96%)

Talking about parallelism.
We see that the code scales well: doubling the number of cores brings about
twice the performance, which is great.

    pristine-928b0540e421
                4/4     8/16
    rhg-ds1:    0.401 → 0.235 (× 1.70)
    rhg-ds2:    0.132 → 0.052 (× 2.54)
    clean-8f96f8c756ae
    rhg-ds1:    0.286 → 0.169 (× 1.70)
    rhg-ds2:    0.101 → 0.040 (× 2.52)
    dirty-8f96f8c756ae
    rhg-ds1:    0.380 → 0.234 (× 1.62)
    rhg-ds2:    0.232 → 0.124 (× 1.87)

Comparing with git performance on the pristine-928b0540e421 case also yields
great results. Surprisingly, the variants with a Python overhead still beat
(or match) git performance in this case. The pure Rust executable is always
significantly faster. Below is a comparison grouped by comparable formats.

    git status -s:  0.554 (without untracked cache)
    rust-ds1:       0.359 (- 35%)
    rhg-ds1:        0.235 (- 57%)

    git status -s:  0.232 (with untracked cache)
    rust-ds2:       0.259 (+ 11%)
    rhg-ds2:        0.052 (- 77%)

The clean-8f96f8c756ae case (all tracked files clean, many ignored files)
shows results similar to pristine-928b0540e421. "Low" parallelism gives good
gains without fully matching the fsmonitor performance. "High" parallelism
provides similar performance. In both cases we gain the benefit of more
stable performance.

    (cores)         4/4            | 8/16
    python-re2:     1.282          | 1.119
    fsmonitor-re2:  0.243 (-81%)   | 0.225 (-80%)
    rust-ds1:       0.416 (-68%)   | 0.282 (-75%)
    rust-ds2:       0.303 (-76%)   | 0.222 (-80%)
    rhg-ds1:        0.286 (-78%)   | 0.169 (-85%)
    rhg-ds2:        0.101 (-92%)   | 0.040 (-96%)

Things change quite a lot in the dirty-8f96f8c756ae case, where fsmonitor
struggled. The Rust variants still provide great speedups, significantly
beating the fsmonitor variants on both machines.
(comparing to fsmonitor-re this time)

    (cores)         4/4            | 8/16
    fsmonitor-re:   1.440          | 1.501
    fsmonitor-re2:  1.012 (-30%)   | 1.051 (-30%)
    rust-ds1:       0.624 (-56%)   | 0.519 (-65%)
    rust-ds2:       0.553 (-62%)   | 0.483 (-68%)
    rhg-ds1:        0.380 (-73%)   | 0.234 (-84%)
    rhg-ds2:        0.232 (-83%)   | 0.124 (-91%)

This is confirmed in the "listing tracked only" version of the
dirty-8f96f8c756ae case, where fsmonitor was not really improving the
situation compared to Python.

    (cores)         4/4            | 8/16
    python-re:      0.993          | 0.843076
    python-re2:     0.998          | 0.843324
    fsmonitor-re:   1.272 (+28%)   | 1.291313 (+53%)
    fsmonitor-re2:  0.840 (-15%)   | 0.844374
    rust-ds1:       0.364 (-63%)   | 0.273305 (-68%)
    rust-ds2:       0.301 (-70%)   | 0.233230 (-72%)
    rhg-ds1:        0.231 (-77%)   | 0.153346 (-82%)
    rhg-ds2:        0.099 (-90%)   | 0.039545 (-95%)

### Full benchmark numbers for `hg status`

Here are the exhaustive numbers, all times in seconds.

Case 1: pristine-928b0540e421

  (4/4 cores i7-7700K Jan 2017)
    python-re:      1.884
    python-re2:     1.650
    fsmonitor-re:   0.293 (more like 4 seconds when confused)
    fsmonitor-re2:  0.296
    rust-ds1:       0.542
    rust-ds2:       0.368
    rhg-ds1:        0.401
    rhg-ds2:        0.132

  (8/16 cores i9-9900K CPU October 2018)
    python-re:      1.674
    python-re2:     1.430
    fsmonitor-re:   0.272
    fsmonitor-re2:  0.278
    rust-ds1:       0.359
    rust-ds2:       0.259
    rhg-ds1:        0.235
    rhg-ds2:        0.052

  For reference, I also gathered timings for `git status` on this machine
  and repo

    git status -s:  0.554 (without untracked cache)
    git status -s:  0.232 (with untracked cache)

Case 2: pristine-8f96f8c756ae

  (4/4 cores i7-7700K)
    python-re:      1.306
    python-re2:     1.227
    fsmonitor-re:   0.243
    fsmonitor-re2:  0.242
    rust-ds1:       0.416
    rust-ds2:       0.308
    rhg-ds1:        0.287
    rhg-ds2:        0.102

  (8/16 cores i9-9900K CPU)
    python-re:      1.131
    python-re2:     1.076
    fsmonitor-re:   0.222
    fsmonitor-re2:  0.222
    rust-ds1:       0.279
    rust-ds2:       0.222
    rhg-ds1:        0.168
    rhg-ds2:        0.038

Case 3: clean-8f96f8c756ae

  (4/4 cores i7-7700K)
    python-re:      1.294
    python-re2:     1.282
    fsmonitor-re:   0.241
    fsmonitor-re2:  0.243
    rust-ds1:       0.416
    rust-ds2:       0.303
    rhg-ds1:        0.286
    rhg-ds2:        0.101

  (8/16 cores i9-9900K CPU)
    python-re:      1.170
    python-re2:     1.119
    fsmonitor-re:   0.224
    fsmonitor-re2:  0.225
    rust-ds1:       0.282
    rust-ds2:       0.222
    rhg-ds1:        0.169
    rhg-ds2:        0.040

Case 4: dirty-8f96f8c756ae

  (4/4 cores i7-7700K)
    python-re:      2.157
    python-re2:     1.718
    fsmonitor-re:   1.440
    fsmonitor-re2:  1.012
    rust-ds1:       0.624
    rust-ds2:       0.553
    rhg-ds1:        0.380
    rhg-ds2:        0.232

  (8/16 cores i9-9900K CPU)
    python-re:      2.031
    python-re2:     1.560
    fsmonitor-re:   1.501
    fsmonitor-re2:  1.051
    rust-ds1:       0.519
    rust-ds2:       0.483
    rhg-ds1:        0.234
    rhg-ds2:        0.124

### Benchmark numbers for `hg status --modified --added --removed --deleted`

With this invocation, status no longer needs to list directory content (or
use a cache to skip that step). Status just needs to check the known list of
tracked files.

Case 1: pristine-928b0540e421

  (4/4 cores i7-7700K CPU)
    python-re:      1.313
    python-re2:     1.332
    fsmonitor-re:   0.297
    fsmonitor-re2:  0.296
    rust-ds1:       0.455
    rust-ds2:       0.369
    rhg-ds1:        0.316
    rhg-ds2:        0.130

  (8/16 cores i9-9900K CPU)
    python-re:      1.129
    python-re2:     1.133
    fsmonitor-re:   0.273
    fsmonitor-re2:  0.271
    rust-ds1:       0.330
    rust-ds2:       0.244
    rhg-ds1:        0.207
    rhg-ds2:        0.050

  For reference, I also gathered timings for `git status` on this machine
  and repo

    git status -s --untracked-files=no:  0.110

Case 2: pristine-8f96f8c756ae

  (4/4 cores i7-7700K)
    python-re:      0.993
    python-re2:     0.987
    fsmonitor-re:   0.241
    fsmonitor-re2:  0.243
    rust-ds1:       0.358
    rust-ds2:       0.307
    rhg-ds1:        0.228
    rhg-ds2:        0.100

  (8/16 cores i9-9900K CPU)
    python-re:      0.856
    python-re2:     0.839
    fsmonitor-re:   0.221
    fsmonitor-re2:  0.222
    rust-ds1:       0.262
    rust-ds2:       0.221
    rhg-ds1:        0.152
    rhg-ds2:        0.038

Case 3: clean-8f96f8c756ae

  (4/4 cores i7-7700K)
    python-re:      0.973
    python-re2:     0.979
    fsmonitor-re:   0.242
    fsmonitor-re2:  0.242
    rust-ds1:       0.357
    rust-ds2:       0.304
    rhg-ds1:        0.224
    rhg-ds2:        0.098

  (8/16 cores i9-9900K CPU)
    python-re:      0.838
    python-re2:     0.837
    fsmonitor-re:   0.222
    fsmonitor-re2:  0.221
    rust-ds1:       0.263
    rust-ds2:       0.219
    rhg-ds1:        0.152
    rhg-ds2:        0.037

Case 4: dirty-8f96f8c756ae

  (4/4 cores i7-7700K)
    python-re:      0.993
    python-re2:     0.998
    fsmonitor-re:   1.272
    fsmonitor-re2:  0.840
    rust-ds1:       0.364
    rust-ds2:       0.301
    rhg-ds1:        0.231
    rhg-ds2:        0.099

  (8/16 cores i9-9900K CPU)
    python-re:      0.843
    python-re2:     0.843
    fsmonitor-re:   1.291
    fsmonitor-re2:  0.844
    rust-ds1:       0.273
    rust-ds2:       0.233
    rhg-ds1:        0.153
    rhg-ds2:        0.040

Differential Revision: https://phabricator.services.mozilla.com/D208966
---
 .gitignore | 12 ++++--------
 .hgignore  |  8 ++++----
 2 files changed, 8 insertions(+), 12 deletions(-)

diff --git a/.gitignore b/.gitignore
index 6eaf50745af7..5fec67ea842f 100644
--- a/.gitignore
+++ b/.gitignore
@@ -31,14 +31,10 @@ ID
 # Filesystem temporaries
 .fuse_hidden*
 
-# Ignore Python .egg-info directories for first-party modules (but,
-# still add vendored packages' .egg-info directories)
-# lint-ignore-next-line: syntax-difference
-*.egg-info
-# lint-ignore-next-line: syntax-difference
-!third_party/python/**/*.egg-info
-# lint-ignore-next-line: syntax-difference
-!testing/web-platform/tests/tools/third_party/**/*.egg-info
+# Ignore Python .egg-info directories.
+# This is only relevant for first-party modules, but adding that directory for
+# third-party packages is dealt with by the script vendoring them.
+*.egg-info/
 
 # Vim swap files.
 .*.sw[a-z]
diff --git a/.hgignore b/.hgignore
index 7a596d31e1b1..5bdfb43a1c51 100644
--- a/.hgignore
+++ b/.hgignore
@@ -27,10 +27,10 @@
 # Filesystem temporaries
 (^|/)\.fuse_hidden.*$
 
-# Ignore Python .egg-info directories for first-party modules (but,
-# still add vendored packages' .egg-info directories)
-# lint-ignore-next-line: syntax-difference
-^(?=.*\.egg-info/)(?!^third_party/python/)(?!^testing/web-platform/tests/tools/third_party/)
+# Ignore Python .egg-info directories.
+# This is only relevant for first-party modules, but adding that directory for
+# third-party packages is dealt with by the script vendoring them.
+.*\.egg-info/
 
 # Vim swap files.
 ^\.sw[a-z]$