forked from mirrors/gecko-dev

Fune (船) is a Firefox ESR fork with the intent of bringing back the Firefox 2.0 look and overall decrapifying the browser.

|  3d9d459e52 This lookahead expression prevents the use of a more modern and efficient
regexp engine. This slows down "hg status" and other operations.
Since the exceptions only concern vendored content whose addition is managed by
a script (`mach vendor`), that script can deal with these exceptions by itself,
and it has done so since the last changeset.
So we drop the exception to unlock various performance improvements for status.
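To illustrate why a single lookahead matters: RE2 deliberately does not support lookaround assertions, so one `(?!...)` pattern is enough to force a fallback to a backtracking engine. The sketch below uses only Python's standard `re` module, and its paths and patterns are hypothetical (they are not the actual `.hgignore` entries). It shows that a lookahead-free pattern, combined with an exception handled outside the regex, can give the same answers.

```python
import re

# Hypothetical rule: ignore every ".orig" file except under "vendor/keep/".
# The negative lookahead "(?!...)" is the construct RE2 cannot compile.
WITH_LOOKAHEAD = re.compile(r"^(?!vendor/keep/).*\.orig$")

# Lookahead-free variant: a plain pattern, with the exception handled
# outside the regex. (In the commit above, the vendoring script plays
# that role; the explicit prefix check here is only an illustration.)
PLAIN = re.compile(r"^.*\.orig$")
EXCEPTION_PREFIX = "vendor/keep/"


def ignored_with_lookahead(path: str) -> bool:
    return WITH_LOOKAHEAD.match(path) is not None


def ignored_without_lookahead(path: str) -> bool:
    if path.startswith(EXCEPTION_PREFIX):
        return False
    return PLAIN.match(path) is not None


paths = ["a/b.orig", "vendor/keep/c.orig", "vendor/other/d.orig", "e.txt"]
for p in paths:
    print(p, ignored_with_lookahead(p), ignored_without_lookahead(p))
```

Both functions agree on every sample path, but only the second pattern stays within the syntax an RE2-style engine accepts.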
### Why does this improve things?
The improvement can come from different sources:
* Using the "re2" regexp engine to match ignored files and directories provides
  a performance boost for vanilla Mercurial installations and fsmonitor ones in
  various cases. To benefit from it, just install the "google-re2" package and
  Mercurial will automatically use it.
* Installing a Mercurial compiled with the Rust extensions unlocks the use of a
  more efficient code path for status that performs the necessary actions in a
  smarter and parallel way, providing a significant boost. These extensions
  are available on Linux and macOS, and some distributions have started to enable
  them by default.
* Moving to a more modern "dirstate" format. The dirstate tracks the state of
  the working copy. For a couple of years, Mercurial has had a new format for this
  information that is more efficient to read and update and tracks finer
  grained information. This allows substantial improvements in the way we run
  status. The Rust extensions are required to use this format efficiently.
* Using a pure-Rust executable. Mercurial has a pure Rust version (called
  "rhg") that can handle a limited set of commands. It runs without the
  overhead of starting and initializing Python, providing another very
  significant boost to performance… but obviously requiring the Rust code path
  to be usable.
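As a rough way to see which of these pieces are present on a machine, here is a minimal sketch. It assumes that the "google-re2" package exposes a Python module named `re2` and that the pure-Rust executable is installed under the name `rhg`. The function name is hypothetical, and the check only covers the interpreter running the script, which may not be the one Mercurial itself uses.

```python
import importlib.util
import shutil


def detect_status_speedups() -> dict:
    """Report which optional pieces mentioned above look available."""
    return {
        # Assumption: "google-re2" installs an importable module named "re2".
        "re2_module": importlib.util.find_spec("re2") is not None,
        # Assumption: the pure-Rust executable is on PATH as "rhg".
        "rhg_on_path": shutil.which("rhg") is not None,
        "hg_on_path": shutil.which("hg") is not None,
    }


if __name__ == "__main__":
    for name, available in detect_status_speedups().items():
        print(f"{name}: {'yes' if available else 'no'}")
```

This does not detect the Rust extensions or the dirstate format in use; those depend on how Mercurial was built and how the repository is configured.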
### Quick Conclusion of the Benchmarks
(Putting that first for people who just want a quick read.)
* fsmonitor struggles on working copies with many modifications.
* Using the "re2" binding from "google-re2" helps, especially for these cases.
* On a typical Mozilla developer machine, the Rust variants match the fsmonitor
  performance at worst and exceed it in multiple cases. In particular, they do not
  struggle with the "many modifications" case.
* On smaller machines, the Rust variants still provide a solid and reliable
  performance win across all operations. That makes them preferable to fsmonitor.
* The Rust variants match "git status" performance on equivalent workloads.
  The pure Rust version significantly outperforms it.
### Benchmark descriptions
Machines
--------
We ran benchmarks on two different machines:
* An i7-7700K, 4 physical / 4 logical cores, released in Jan 2017,
  to see performance in a "low" parallelism case.
* An i9-9900K, 8 physical / 16 logical cores, released in October 2018,
  to see performance in a "high" parallelism case.
In both cases the repositories lived on a btrfs file system backed by solid
state disks (SSD or NVMe) and the machines had enough RAM to keep caches in
memory.
I also ran benchmarks on a more modern i7-1370P, released in Jan 2023, and the
results were consistent with the i9-9900K ones.
Variants
--------
Benchmarks were run with multiple variants of Mercurial:
  * python-re:
    * no Rust extensions used,
    * regex engine is the std-lib "re" module,
    * fsmonitor is disabled,
    * using the dirstate-v1 format.
  * python-re2:
    * no Rust extensions used,
    * regex engine is the "re2" module (from the "google-re2" package),
    * fsmonitor is disabled,
    * using the dirstate-v1 format.
  * fsmonitor-re:
    * no Rust extensions used,
    * regex engine is the std-lib "re" module,
    * fsmonitor is enabled and working at its best,
    * using the dirstate-v1 format.
  * fsmonitor-re2:
    * no Rust extensions used,
    * regex engine is the "re2" module (from the "google-re2" package),
    * fsmonitor is enabled and working at its best,
    * using the dirstate-v1 format.
  * rust-ds1:
    * Rust extensions are used,
    * regex engine from the Rust "regex" crate,
    * fsmonitor is disabled,
    * using the dirstate-v1 format.
  * rust-ds2:
    * Rust extensions are used,
    * regex engine from the Rust "regex" crate,
    * fsmonitor is disabled,
    * using the dirstate-v2 format.
  * rhg-ds1:
    * pure Rust executable is used,
    * regex engine from the Rust "regex" crate,
    * fsmonitor is disabled,
    * using the dirstate-v1 format.
  * rhg-ds2:
    * pure Rust executable is used,
    * regex engine from the Rust "regex" crate,
    * fsmonitor is disabled,
    * using the dirstate-v2 format.
Commands
--------
We ran two kinds of operations:
* `hg status` with the default output.
    This command needs to search for ignored and unknown files.
    In this case, improving the regex engine usually provides a significant performance gain.
* `hg status --modified --added --removed --deleted`.
    This command only needs to check the state of tracked files.
    In this case, improving the regex engine does not have much effect, but it
    is interesting to compare the performance of the various implementations.
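For readers who want to reproduce this kind of measurement, the following is a minimal timing sketch. It is not the harness used for the numbers in this message: the helper name `time_command`, the number of runs, and the use of the median are all assumptions made for illustration.

```python
import shutil
import statistics
import subprocess
import sys
import time

# The two invocations described above.
STATUS_DEFAULT = ["hg", "status"]
STATUS_TRACKED_ONLY = [
    "hg", "status", "--modified", "--added", "--removed", "--deleted",
]


def time_command(cmd, cwd=None, runs=5):
    """Run `cmd` several times; return the median wall-clock time in seconds."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        subprocess.run(
            cmd,
            cwd=cwd,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            check=False,
        )
        samples.append(time.perf_counter() - start)
    return statistics.median(samples)


# Only attempt the real measurement when an "hg" executable is on PATH;
# it is meant to be started from inside a Mercurial working copy.
if __name__ == "__main__" and shutil.which("hg"):
    for cmd in (STATUS_DEFAULT, STATUS_TRACKED_ONLY):
        print(" ".join(cmd), f"{time_command(cmd):.3f}s")
```

Wall-clock timing of a subprocess includes interpreter startup, which is what a user experiences and which is part of the cost the "rhg" variants avoid.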
Working copies
--------------
Case 1: pristine-928b0540e421
    Working copy parent is 928b0540e421
      * 341 759 tracked files
      *  21 253 directories
      * no untracked files
Case 2: pristine-8f96f8c756ae
    Working copy parent is 8f96f8c756ae
        (an older changeset I had a dirty working copy for)
      * 246 855 tracked files
      *  15 047 directories
      * no untracked files
Case 3: clean-8f96f8c756ae
    Working copy parent is 8f96f8c756ae
      * 246 855 tracked files
      *  23 540 directories
      *  79 901 ignored files
Case 4: dirty-8f96f8c756ae
    Working copy parent is 8f96f8c756ae
      * 246 855 tracked files
      *  33 720 directories
      * 244 386   clean files
      *   1 065 modified files
      *     247   added files
      *   1 040 removed files
      *     364 missing files
      *  63 455 unknown files
      *  79 915 ignored files
### Results Analysis
(full, raw numbers follow this section)
About fsmonitor
---------------
Before diving into the improvements related to the regex engine, we can note that
the benchmarks show that fsmonitor provides a good boost in the pristine/clean cases, and
a noticeable but disappointing improvement in the very dirty case.
                           python-re fsmonitor-re
    pristine-928b0540e421:     1.884 →      0.293 (-85%)
    dirty-8f96f8c756ae:        2.157 →      1.440 (-33%)
Surprisingly, when only listing tracked files (during commit, for example), fsmonitor actually
becomes counterproductive in the very dirty case:
    pristine-928b0540e421:     1.313 →      0.297 (-77%)
    dirty-8f96f8c756ae:        0.993 →      1.272 (+28%)
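The percentages in these tables are relative changes in runtime. As a small worked example, the sketch below recomputes them from the rounded timings shown above; the helper name `relative_change` is hypothetical. Because the inputs are already rounded to milliseconds, a recomputed value can differ by a point or so from a published one.

```python
def relative_change(before: float, after: float) -> float:
    """Return the change from `before` to `after` as a signed percentage."""
    return (after - before) / before * 100.0


# Timings in seconds, copied from the tables above (python-re, fsmonitor-re).
default_status = {
    "pristine-928b0540e421": (1.884, 0.293),
    "dirty-8f96f8c756ae": (2.157, 1.440),
}
tracked_only = {
    "pristine-928b0540e421": (1.313, 0.297),
    "dirty-8f96f8c756ae": (0.993, 1.272),
}

for label, rows in (("default", default_status), ("tracked only", tracked_only)):
    for case, (before, after) in rows.items():
        change = relative_change(before, after)
        print(f"{label:13} {case:23} {before:.3f} -> {after:.3f} ({change:+.0f}%)")
```

A negative value means the second variant is faster; the positive value for the tracked-only dirty case is the slowdown discussed above.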
In addition to being disappointing in the very dirty case, the performance
with fsmonitor collapses when fsmonitor cannot use its cache. I observed
execution times of about 4 seconds while setting up the benchmark.
Improvement without involving Rust:
-----------------------------------
Using the re2 binding from the google-re2 package provides a small improvement
to plain Python execution (roughly 15 to 20%). This case is relevant because it is
the one that will be used when fsmonitor cannot help or start.
                           python-re  python-re2
    pristine-928b0540e421:      1.884 →   1.650 (-15%)
    dirty-8f96f8c756ae:         2.157 →   1.718 (-20%)
It does not make a difference when only listing tracked files as the hgignore is not involved.
                           python-re  python-re2
    pristine-928b0540e421:      1.313 →    1.332
    dirty-8f96f8c756ae:         0.993 →    0.998
However, surprisingly, it helps fsmonitor quite a lot in the dirty case
(dirty-8f96f8c756ae), bringing fsmonitor performance in line with the plain
Python one.
                   fsmonitor-re fsmonitor-re2
    list-unknown          1.440 →       1.012 (-30%)
    tracked only          1.272 →       0.840 (-34%)
So to conclude, being able to use the "re2" regex engine saves up to ⅓ of the
runtime of some operations and never slows things down. So that's a good win.
Improvement involving Rust variants:
------------------------------------
For the pristine-928b0540e421 case (all tracked files clean, no ignored files),
Rust provides a speed boost "equivalent" (or better) to the one from fsmonitor.
The precise comparison depends on the parallelism level.
On the 4 physical / 4 logical core machine, the Python+Rust version is slower
than fsmonitor, although using dirstate-v2 helps close some of the gap with
fsmonitor. Using dirstate-v2 also allows the "rhg" version to become twice
as fast as the fsmonitor version. Also keep in mind that even when a bit
slower, the performance of the Rust version will be much more stable than
that of fsmonitor.
    python-re2:    1.650
    fsmonitor-re2: 0.296 (-82%)
    rust-ds1:      0.542 (-67%)
    rust-ds2:      0.368 (-77%)
    rhg-ds1:       0.401 (-75%)
    rhg-ds2:       0.132 (-92%)
On the 8 physical / 16 logical core machine, the Rust variants catch up with
fsmonitor performance much more quickly. The dirstate-v1 version is a little slower, but the
dirstate-v2 version is already faster. The pure Rust executable is always faster.
    python-re2:    1.430
    fsmonitor-re2: 0.278 (-80%)
    rust-ds1:      0.359 (-74%)
    rust-ds2:      0.259 (-81%)
    rhg-ds1:       0.235 (-83%)
    rhg-ds2:       0.052 (-96%)
Regarding parallelism, we see that the code scales well: doubling the
number of cores brings about twice the performance, which is great.
    pristine-928b0540e421     4/4    8/16
        rhg-ds1:            0.401 → 0.235 (× 1.70)
        rhg-ds2:            0.132 → 0.052 (× 2.54)
    clean-8f96f8c756ae
        rhg-ds1:            0.286 → 0.169 (× 1.70)
        rhg-ds2:            0.101 → 0.040 (× 2.52)
    dirty-8f96f8c756ae
        rhg-ds1:            0.380 → 0.234 (× 1.62)
        rhg-ds2:            0.232 → 0.124 (× 1.87)
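The scaling factors above are simply the ratio of the two timings. The sketch below recomputes them from the rounded figures in the table and averages them to check the "about twice" claim. The helper name `speedup` is hypothetical, and because the inputs are rounded, the last digit may differ slightly from the published factors.

```python
import statistics


def speedup(slow: float, fast: float) -> float:
    """How many times faster `fast` is compared to `slow`."""
    return slow / fast


# (4/4 cores, 8/16 cores) timings in seconds, copied from the table above.
timings = {
    ("pristine-928b0540e421", "rhg-ds1"): (0.401, 0.235),
    ("pristine-928b0540e421", "rhg-ds2"): (0.132, 0.052),
    ("clean-8f96f8c756ae", "rhg-ds1"): (0.286, 0.169),
    ("clean-8f96f8c756ae", "rhg-ds2"): (0.101, 0.040),
    ("dirty-8f96f8c756ae", "rhg-ds1"): (0.380, 0.234),
    ("dirty-8f96f8c756ae", "rhg-ds2"): (0.232, 0.124),
}

factors = []
for (case, variant), (four_cores, eight_cores) in timings.items():
    factor = speedup(four_cores, eight_cores)
    factors.append(factor)
    print(f"{case:23} {variant}: x{factor:.2f}")

print(f"mean speedup: x{statistics.mean(factors):.2f}")
```

Note that the two machines also differ in CPU generation and clock speed, so the ratio is not a pure measure of core-count scaling.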
Comparing with git performance on the pristine-928b0540e421 case also yields
great results. Surprisingly, the variants with a Python overhead still beat (or
nearly match) git performance in this case. The pure Rust executable is always
significantly faster. Below is a comparison grouped by comparable formats.
    git status -s: 0.554 (without untracked cache)
    rust-ds1:      0.359 (- 35%)
    rhg-ds1:       0.235 (- 57%)
    git status -s: 0.232 (with untracked cache)
    rust-ds2:      0.259 (+ 11%)
    rhg-ds2:       0.052 (- 77%)
The clean-8f96f8c756ae case (all tracked files clean, many ignored files) shows
results similar to pristine-928b0540e421. "Low" parallelism gives good gains
without fully matching the fsmonitor performance. "High" parallelism provides
similar performance. In both cases we gain the benefit of more stable
performance.
        (cores)          4/4           8/16
        python-re2:    1.282        | 1.119
        fsmonitor-re2: 0.243 (-81%) | 0.225 (-80%)
        rust-ds1:      0.416 (-68%) | 0.282 (-75%)
        rust-ds2:      0.303 (-76%) | 0.222 (-80%)
        rhg-ds1:       0.286 (-78%) | 0.169 (-85%)
        rhg-ds2:       0.101 (-92%) | 0.040 (-96%)
Things change quite a lot in the dirty-8f96f8c756ae case, where fsmonitor
struggled. The Rust variants still provide great speedups, significantly
beating the fsmonitor variants on both machines (comparing to fsmonitor-re
this time).
        (cores)          4/4           8/16
        fsmonitor-re:  1.440        | 1.501
        fsmonitor-re2: 1.012 (-30%) | 1.051 (-30%)
        rust-ds1:      0.624 (-56%) | 0.519 (-65%)
        rust-ds2:      0.553 (-62%) | 0.483 (-68%)
        rhg-ds1:       0.380 (-73%) | 0.234 (-84%)
        rhg-ds2:       0.232 (-83%) | 0.124 (-91%)
This is confirmed in the "listing tracked only" version of the dirty-8f96f8c756ae
case, where fsmonitor was not really improving the situation compared to Python.
        (cores)          4/4           8/16
        python-re:     0.993        | 0.843076
        python-re2:    0.998        | 0.843324
        fsmonitor-re:  1.272 (+28%) | 1.291313 (+53%)
        fsmonitor-re2: 0.840 (-15%) | 0.844374
        rust-ds1:      0.364 (-63%) | 0.273305 (-68%)
        rust-ds2:      0.301 (-70%) | 0.233230 (-72%)
        rhg-ds1:       0.231 (-77%) | 0.153346 (-82%)
        rhg-ds2:       0.099 (-90%) | 0.039545 (-95%)
### Full benchmark numbers for `hg status`
Here are the exhaustive numbers, all times in seconds.
Case 1: pristine-928b0540e421
    (4/4 cores i7-7700K Jan 2017)
        python-re:     1.884
        python-re2:    1.650
        fsmonitor-re:  0.293 (about 4 seconds when confused)
        fsmonitor-re2: 0.296
        rust-ds1:      0.542
        rust-ds2:      0.368
        rhg-ds1:       0.401
        rhg-ds2:       0.132
    (8/16 cores i9-9900K CPU October 2018)
        python-re:     1.674
        python-re2:    1.430
        fsmonitor-re:  0.272
        fsmonitor-re2: 0.278
        rust-ds1:      0.359
        rust-ds2:      0.259
        rhg-ds1:       0.235
        rhg-ds2:       0.052
        For reference, I also gathered timing for `git status` on this machine and repo
        git status -s: 0.554 (without untracked cache)
        git status -s: 0.232 (with untracked cache)
Case 2: pristine-8f96f8c756ae
    (4/4 cores i7-7700K)
        python-re:     1.306
        python-re2:    1.227
        fsmonitor-re:  0.243
        fsmonitor-re2: 0.242
        rust-ds1:      0.416
        rust-ds2:      0.308
        rhg-ds1:       0.287
        rhg-ds2:       0.102
    (8/16 cores i9-9900K CPU)
        python-re:     1.131
        python-re2:    1.076
        fsmonitor-re:  0.222
        fsmonitor-re2: 0.222
        rust-ds1:      0.279
        rust-ds2:      0.222
        rhg-ds1:       0.168
        rhg-ds2:       0.038
Case 3: clean-8f96f8c756ae
    (4/4 cores i7-7700K)
        python-re:     1.294
        python-re2:    1.282
        fsmonitor-re:  0.241
        fsmonitor-re2: 0.243
        rust-ds1:      0.416
        rust-ds2:      0.303
        rhg-ds1:       0.286
        rhg-ds2:       0.101
    (8/16 cores i9-9900K CPU)
        python-re:     1.170
        python-re2:    1.119
        fsmonitor-re:  0.224
        fsmonitor-re2: 0.225
        rust-ds1:      0.282
        rust-ds2:      0.222
        rhg-ds1:       0.169
        rhg-ds2:       0.040
Case 4: dirty-8f96f8c756ae
    (4/4 cores i7-7700K)
        python-re:     2.157
        python-re2:    1.718
        fsmonitor-re:  1.440
        fsmonitor-re2: 1.012
        rust-ds1:      0.624
        rust-ds2:      0.553
        rhg-ds1:       0.380
        rhg-ds2:       0.232
    (8/16 cores i9-9900K CPU)
        python-re:     2.031
        python-re2:    1.560
        fsmonitor-re:  1.501
        fsmonitor-re2: 1.051
        rust-ds1:      0.519
        rust-ds2:      0.483
        rhg-ds1:       0.234
        rhg-ds2:       0.124
### Benchmark numbers for `hg status --modified --added --removed --deleted`
With this invocation, status no longer needs to list directory contents (or use
a cache to skip that step). Status just needs to check the known list of tracked
files.
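To make the difference concrete, here is a deliberately simplified model of a "tracked files only" check. It is not Mercurial's algorithm: the function name and the recorded `(size, mtime_ns)` pair are assumptions for illustration, and a real implementation also has to deal with ambiguous timestamps, content comparison, and added or removed entries. The point it shows is that only known paths are visited with `stat()`, and no directory is ever listed.

```python
import os
import tempfile


def tracked_status(root, recorded):
    """Classify tracked files using only stat() data.

    `recorded` maps a relative path to the (size, mtime_ns) remembered for it.
    """
    result = {"clean": [], "modified": [], "deleted": []}
    for path, (size, mtime_ns) in sorted(recorded.items()):
        try:
            st = os.stat(os.path.join(root, path))
        except FileNotFoundError:
            result["deleted"].append(path)  # tracked but missing on disk
            continue
        if (st.st_size, st.st_mtime_ns) == (size, mtime_ns):
            result["clean"].append(path)
        else:
            result["modified"].append(path)
    return result


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        recorded = {}
        for name in ("a.txt", "b.txt", "c.txt"):
            full = os.path.join(root, name)
            with open(full, "w") as fh:
                fh.write("hello\n")
            st = os.stat(full)
            recorded[name] = (st.st_size, st.st_mtime_ns)
        with open(os.path.join(root, "b.txt"), "a") as fh:
            fh.write("more\n")  # size changes, so b.txt counts as modified
        os.remove(os.path.join(root, "c.txt"))  # c.txt counts as deleted
        print(tracked_status(root, recorded))
```

Since each path is checked independently, this kind of loop is also easy to spread across cores, which fits the scaling seen for the Rust variants.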
Case 1: pristine-928b0540e421
    (4/4 cores i7-7700K CPU)
        python-re:     1.313
        python-re2:    1.332
        fsmonitor-re:  0.297
        fsmonitor-re2: 0.296
        rust-ds1:      0.455
        rust-ds2:      0.369
        rhg-ds1:       0.316
        rhg-ds2:       0.130
    (8/16 cores i9-9900K CPU)
        python-re:     1.129
        python-re2:    1.133
        fsmonitor-re:  0.273
        fsmonitor-re2: 0.271
        rust-ds1:      0.330
        rust-ds2:      0.244
        rhg-ds1:       0.207
        rhg-ds2:       0.050
        For reference, I also gathered timing for `git status` on this machine and repo
        git status -s --untracked-files=no: 0.110
Case 2: pristine-8f96f8c756ae
    (4/4 cores i7-7700K)
        python-re:     0.993
        python-re2:    0.987
        fsmonitor-re:  0.241
        fsmonitor-re2: 0.243
        rust-ds1:      0.358
        rust-ds2:      0.307
        rhg-ds1:       0.228
        rhg-ds2:       0.100
    (8/16 cores i9-9900K CPU)
        python-re:     0.856
        python-re2:    0.839
        fsmonitor-re:  0.221
        fsmonitor-re2: 0.222
        rust-ds1:      0.262
        rust-ds2:      0.221
        rhg-ds1:       0.152
        rhg-ds2:       0.038
Case 3: clean-8f96f8c756ae
    (4/4 cores i7-7700K)
        python-re:     0.973
        python-re2:    0.979
        fsmonitor-re:  0.242
        fsmonitor-re2: 0.242
        rust-ds1:      0.357
        rust-ds2:      0.304
        rhg-ds1:       0.224
        rhg-ds2:       0.098
    (8/16 cores i9-9900K CPU)
        python-re:     0.838
        python-re2:    0.837
        fsmonitor-re:  0.222
        fsmonitor-re2: 0.221
        rust-ds1:      0.263
        rust-ds2:      0.219
        rhg-ds1:       0.152
        rhg-ds2:       0.037
Case 4: dirty-8f96f8c756ae
    (4/4 cores i7-7700K)
        python-re:     0.993
        python-re2:    0.998
        fsmonitor-re:  1.272
        fsmonitor-re2: 0.840
        rust-ds1:      0.364
        rust-ds2:      0.301
        rhg-ds1:       0.231
        rhg-ds2:       0.099
    (8/16 cores i9-9900K CPU)
        python-re:     0.843
        python-re2:    0.843
        fsmonitor-re:  1.291
        fsmonitor-re2: 0.844
        rust-ds1:      0.273
        rust-ds2:      0.233
        rhg-ds1:       0.153
        rhg-ds2:       0.040
Differential Revision: https://phabricator.services.mozilla.com/D208966 | ||
|---|---|---|
| .cargo | ||
| .github/workflows | ||
| .vscode | ||
| accessible | ||
| browser | ||
| build | ||
| caps | ||
| chrome | ||
| config | ||
| devtools | ||
| docs | ||
| docshell | ||
| dom | ||
| editor | ||
| extensions | ||
| gfx | ||
| gradle/wrapper | ||
| hal | ||
| image | ||
| intl | ||
| ipc | ||
| js | ||
| layout | ||
| media | ||
| memory | ||
| mfbt | ||
| mobile | ||
| modules | ||
| mozglue | ||
| netwerk | ||
| nsprpub | ||
| other-licenses | ||
| parser | ||
| python | ||
| remote | ||
| security | ||
| services | ||
| servo | ||
| startupcache | ||
| storage | ||
| supply-chain | ||
| taskcluster | ||
| testing | ||
| third_party | ||
| toolkit | ||
| tools | ||
| uriloader | ||
| view | ||
| widget | ||
| xpcom | ||
| xpfe/appshell | ||
| .arcconfig | ||
| .babel-eslint.rc.js | ||
| .clang-format | ||
| .clang-format-ignore | ||
| .cron.yml | ||
| .eslintignore | ||
| .eslintrc-test-paths.js | ||
| .eslintrc.js | ||
| .git-blame-ignore-revs | ||
| .gitattributes | ||
| .gitignore | ||
| .hg-annotate-ignore-revs | ||
| .hg-format-source | ||
| .hgignore | ||
| .hgtags | ||
| .lando.ini | ||
| .lldbinit | ||
| .mailmap | ||
| .prettierignore | ||
| .prettierrc.js | ||
| .stylelintignore | ||
| .stylelintrc.js | ||
| .taskcluster.yml | ||
| .trackerignore | ||
| .yamllint | ||
| .ycm_extra_conf.py | ||
| aclocal.m4 | ||
| AUTHORS | ||
| build.gradle | ||
| Cargo.lock | ||
| Cargo.toml | ||
| client.mk | ||
| client.py | ||
| CLOBBER | ||
| configure | ||
| configure.py | ||
| GNUmakefile | ||
| gradle.properties | ||
| gradlew | ||
| gradlew.bat | ||
| LICENSE | ||
| mach | ||
| mach.cmd | ||
| mach.ps1 | ||
| Makefile.in | ||
| mots.yaml | ||
| moz.build | ||
| moz.configure | ||
| mozilla-config.h.in | ||
| old-configure.in | ||
| package-lock.json | ||
| package.json | ||
| pyproject.toml | ||
| README.txt | ||
| settings.gradle | ||
| substitute-local-geckoview.gradle | ||
| test.mozbuild | ||
An explanation of the Firefox Source Code Directory Structure and links to
project pages with documentation can be found at:
    https://firefox-source-docs.mozilla.org/contributing/directory_structure.html
For information on how to build Firefox from the source code and create the patch see:
    https://firefox-source-docs.mozilla.org/contributing/contribution_quickref.html
If you have a question about developing Firefox, and can't find the solution
on https://firefox-source-docs.mozilla.org/, you can try asking your question on Matrix at chat.mozilla.org in the `Introduction` channel (https://chat.mozilla.org/#/room/#introduction:mozilla.org).
Nightly development builds can be downloaded from:
    https://archive.mozilla.org/pub/firefox/nightly/latest-mozilla-central/
            - or -
    https://www.mozilla.org/firefox/channel/desktop/#nightly
Keep in mind that nightly builds, which are used by Firefox developers for
testing, may be buggy.