Most of this is fixing functions that in some cases return a value but then
can also run to completion without returning anything. ESLint 2 catches this
where previous versions didn't. Unless there was an obvious other choice I just
made these functions return undefined at the end which is effectively what
already happens.
MozReview-Commit-ID: KHYdAkRvhVr
--HG--
extra : rebase_source : 720cc9d864175248508146a7f4704b4bf3f02439
extra : histedit_source : 6cc0ecbf21a571e1a41d517b67512a3452fac19a
We have some oddities in our jemalloc stats reporting.
- "heap-overhead-ratio" is a strange measurement: overhead / non-overhead,
expressed as a percentage. And it omits "bin_unused", which appears to be an
oversight.
- "heap-committed" also omits "bin_unused".
- There are some minor errors in memory report descriptions.
This patch fixes these and improves the heap reporting. It makes the following
reporting changes:
- "heap-allocated": Duplicated as "heap-committed/allocated". (We keep
"heap-allocated" because that's a special value used in the computation of
"heap-unclassified".)
- "heap-committed/overhead": Added; it's the same as the sum of the
"explicit/heap-overhead/*" values. Together with "heap-committed/allocated"
it shows clearly what fraction of the heap is overhead and what fraction is
useful.
- "heap-committed": Removed; now implicit as the "heap-committed/" node.
- "heap-overhead-ratio":
- Removed from memory reports; now shown as the percentage of the new
"heap-committed/overhead" node.
- Still available as a distinguished amount (because it's useful in
isolation) but renamed to heapOverheadFraction, and the telemetry ID is
renamed as MEMORY_HEAP_OVERHEAD_FRACTION.
- "heap-chunks": Removed; it's not that interesting, and can be manually
computed as "heap-mapped" / "heap-chunksize" if necessary.
--HG--
extra : rebase_source : 6f238cda780eb17b2de2f8b9a0b04377c93b109c
What it does:
Adds a new function, TelemetrySession.getChildThreadHangs(), which returns a promise resolving to an array of threadHangStats [1], one per process.
Note that processes that spawn or die while the function's promise is created but not resolved may be excluded from the final result.
How we do this:
1. Parent sends a MESSAGE_TELEMETRY_GET_CHILD_PAYLOAD message to each child, promising the results of these messages.
2. Child processes respond to parent with a MESSAGE_TELEMETRY_THREAD_HANGS, which contains BHR stats in the payload.
3. Parent combines all the child responses together and resolves the promise.
Plus a bunch of synchronization stuff and handling of edge cases since the number of child processes can change at any time.
Also, there is a 200ms timeout since we can't handle all of these cases. Specifically, when a child dies without responding, after all other child processes have responded.
Why we do this:
* We can technically get thread hang stats by retrieving Telemetry pings (see requestChildPayloads() in TelemetrySession for details), but this is very slow and can only be done for one process at a time.
* TelemetrySession is responsible for various Telemetry IPC-related tasks, and so is a natural place to expose this function (i.e., the function blends in well with the rest of the API).
* Statuser [2] uses this for quickly obtaining child process BHR stats. This allows us to get realtime hang monitoring for child processes.
[1]: https://dxr.mozilla.org/mozilla-central/source/toolkit/components/telemetry/nsITelemetry.idl#146
[2]: https://github.com/chutten/statuser
What it does:
Adds a new function, TelemetrySession.getChildThreadHangs(), which returns a promise resolving to an array of threadHangStats [1], one per process.
Note that processes that spawn or die while the function's promise is created but not resolved may be excluded from the final result.
How we do this:
1. Parent sends a MESSAGE_TELEMETRY_GET_CHILD_PAYLOAD message to each child, promising the results of these messages.
2. Child processes respond to parent with a MESSAGE_TELEMETRY_THREAD_HANGS, which contains BHR stats in the payload.
3. Parent combines all the child responses together and resolves the promise.
Plus a bunch of synchronization stuff and handling of edge cases since the number of child processes can change at any time.
Also, there is a 200ms timeout since we can't handle all of these cases. Specifically, when a child dies without responding, after all other child processes have responded.
Why we do this:
* We can technically get thread hang stats by retrieving Telemetry pings (see requestChildPayloads() in TelemetrySession for details), but this is very slow and can only be done for one process at a time.
* TelemetrySession is responsible for various Telemetry IPC-related tasks, and so is a natural place to expose this function (i.e., the function blends in well with the rest of the API).
* Statuser [2] uses this for quickly obtaining child process BHR stats. This allows us to get realtime hang monitoring for child processes.
[1]: https://dxr.mozilla.org/mozilla-central/source/toolkit/components/telemetry/nsITelemetry.idl#146
[2]: https://github.com/chutten/statuser
--HG--
extra : rebase_source : d323afc3de716e5c978f1120f43af77f193f0b8a
We now have all the necessary measurement APIs to get a full memory picture for
a running multi-process instance. However, there's no way to correlate one
particular RSS measurement on chrome with its USS measurements on content
processes.
So do that in TelemetrySession and report it.
--HG--
extra : rebase_source : d410307f557923cf140105acfeaa486e274731a5
We now have all the necessary measurement APIs to get a full memory picture for
a running multi-process instance. However, there's no way to correlate one
particular RSS measurement on chrome with its USS measurements on content
processes.
So do that in TelemetrySession and report it.
Unique Set Size (USS) could be described as "the amount of memory you could
expect to reclaim if you killed this process"
Resident Set Size (RSS) is USS with the addition of memory allocated by
shared memory.
To get a full picture of the memory use of a multi-process application is
impossible. A better guess than most is
Parent Process' RSS + sum(Child Processes' USS)
Or, from Telemetry:
Parent MEMORY_RESIDENT + sum(Children's MEMORY_UNIQUE)