This is some sort of followup to bug 1423813, providing a minimalistic
way to undo elfhack when the elfhack sections are in separate segments,
which has been the case since bug 1385783 but didn't cause problems
on Android builds until bug 1423822.
Depends on D9622
Differential Revision: https://phabricator.services.mozilla.com/D9623
--HG--
extra : moz-landing-system : lando
This effectively backs out bug 822584, which worked around a similar
problem to what we are facing with Android xpcshell, being that the
crash reporter doesn't handle the address space "fragmentation" induced
by elfhack. The work around worked, at the expense of some added
complexity.
It was used for B2G only, and has effectively been unused since B2G was
retired.
Differential Revision: https://phabricator.services.mozilla.com/D9622
--HG--
extra : moz-landing-system : lando
When checking whether the new relocations sizes are going to be a win, we
need to account for the fact that there are non-elfhacked relocation
remaining.
Differential Revision: https://phabricator.services.mozilla.com/D5837
--HG--
extra : moz-landing-system : lando
In some rare cases, it is possible for one of the eh_frame sections'
original address to be larger than the address of the injected code
section, which is added before the first executable section. Namely,
this happens in the rare case where that eh_frame section is smaller
than the injected code section, and is adjacent to the first executable
section. We obviously want to move the eh_frame sections in that case,
since one of them is in the way.
Bug 1423822 moved the injected code section before the .text section.
When linking with lld, the text section is usually page aligned, and
starting a PT_LOAD. We inject code at the beginning of the PT_LOAD,
which means the PT_LOAD is going to be extended at least a page
downwards. And it means the preceding PT_LOAD can't finish in that same
page, so the overhead of the injected code is needs to account for the
page alignment.
If the .eh_frame_hdr and .eh_frame sections are not between the elfhack
relocation and elfhack code sections, it's not going to change anything
to try to move it, so don't even try.
While here, adjust the adjacency test to error out when the section name
doesn't match, and account for the fact that the eh_frame_hdr section
might appear after eh_frame.
--HG--
extra : rebase_source : 7d3525abe75b5a014b39ce0bd406e8f78592ec39
If the .eh_frame_hdr and .eh_frame sections are not between the elfhack
relocation and elfhack code sections, it's not going to change anything
to try to move it, so don't even try.
While here, adjust the adjacency test to error out when the section name
doesn't match.
--HG--
extra : rebase_source : 4b31712576fd3472bb94a2b9ab9542253f04cba8
Somehow, when building with LTO, clang can end up creating a eh_frame
section with only one, empty, entry (which just looks like a 4-bytes
long section full of 0x00).
--HG--
extra : rebase_source : 385c05c7e447fe1c4bc261b79c7d56138e268458
When linking with ld.bfd or gold, this changes the PT_LOAD in which the
elfhack code section ends up, making it go in the same one as .init, .text,
etc. rather than .rel.*. When linking with lld, this completely
avoids doing a PT_LOAD split, because lld already splits .rel.* and
.text.
--HG--
extra : rebase_source : 1f69b8f4b48b055892ea24eaa6226859cc4ffd50
The current check makes assumption wrt what PT_LOAD the injected sections
end up in, and won't work with upcoming changes.
--HG--
extra : rebase_source : b7cfb65ea13c16f977fe523aaf9f39eafeb2cdce
When a binary has a PT_GNU_RELRO segment, the elfhack injected code
uses mprotect to add the writable flag to relocated pages before
applying relocations, removing it afterwards. To do so, the elfhack
program uses the location and size of the PT_GNU_RELRO segment, and
adjusts it to be aligned according to the PT_LOAD alignment.
The problem here is that the PT_LOAD alignment doesn't necessarily match
the actual page alignment, and the resulting mprotect may end up not
covering the full extent of what the dynamic linker has protected
read-only according to the PT_GNU_RELRO segment. In turn, this can lead
to a crash on startup when trying to apply relocations to the still
read-only locations.
Practically speaking, this doesn't end up being a problem on x86, where
the PT_LOAD alignment is usually 4096, which happens to be the page
size, but on Debian armhf, it is 64k, while the run time page size can be
4k.
--HG--
extra : rebase_source : 5ac7356f685d87c1628727e6c84f7615409c57a5
Bug 1385783 changed things such that the two elfhack sections are not
adjacent anymore. They can even be in different segments in some cases,
but the undo code doesn't know how to actually handle that case.
So for now, allow non adjacent sections, but still verify that they are
in the same segment.
--HG--
extra : rebase_source : da95ef7df19eeea8dfd07b24f22e7bee18939b69
In bug 635961, elfhack was made to (ab)use the bss section as a
temporary space for a pointer. To find it, it scanned writable PT_LOAD
segments to find one that has a different file and memory size,
indicating the presence of .bss. This usually works fine, but when
the binary is linked with lld and relro is enabled, the end of the
file-backed part of the PT_LOAD segment containing the .bss section
ends up in the RELRO segment, making that location read-only and
subsequently making the elfhacked binary crash when it tries to restore
the .bss to a clean state, because it's not actually writing in the .bss
section: lld page aligns it after the RELRO segment.
So instead of scanning PT_LOAD segments, we scan for SHT_NOBITS
sections that are not SHF_TLS (i.e. not .tbss).
--HG--
extra : rebase_source : f18c43897fd0139aa8535f983e13eb785088cb18
The lld linker creates separate segments for purely executable sections
(such as .text) and sections preceding those (such as .rel.dyn). Neither
gold nor bfd ld do that, and just put all those sections in the same
executable segment.
Since elfhack is putting its executable code between the two relocation
sections, it ends up in a non-executable segment, leading to a crash
when it's time to run that code.
We thus insert the elfhack code before the first executable section
instead of between the two relocation sections (which is where the
elfhack data lies, and stays).
--HG--
extra : rebase_source : ab18eb9ac518d69a8639ad0e785741395b662112
Somehow, with the Android toolchain, we end up with
non-empty-but-really-empty DT_INIT_ARRAYs.
In practical terms, they are arrays with no relocations, and content
that is meaningless:
$ objdump -s -j .init_array libnss3.so
libnss3.so: file format elf32-little
Contents of section .init_array:
1086e0 00000000 ....
$ readelf -r libnss3.so | grep 1086e0
$ objdump -s -j .init_array libplugin-container-pie.so
libplugin-container-pie.so: file format elf32-little
Contents of section .init_array:
4479c ffffffff 00000000 ffffffff 00000000 ................
$ readelf -r libplugin-container-pie.so | grep 4479c
Because so far, elfhack expected meaningful DT_INIT_ARRAYs, it bailed out
early in that case.
--HG--
extra : rebase_source : 217aacb42fdfabb466ed1f8149dfaeb4a595eda8