MFC r339877-r339879,r343564-r343566,r343580,r343754: Untangle jemalloc and mutexes initialization. The merge includes required warnings cleanup by arichardson, both to avoid conflicts and to make rtld_malloc.c compilable with the libthr WARNS settings.
MFC r342113: Improve R_AARCH64_TLSDESC relocation. The original code did not support dynamically loaded libraries and used suboptimal access to TLS variables. The new implementation removes lazy resolving of TLS relocations: due to a flaw in the TLSDESC design, it is impossible to switch the resolver function at runtime without expensive locking.
MFC r341738: Implement R_AARCH64_TLS_DTPMOD64 and R_AARCH64_TLS_DTPREL64 relocations. Although these are slightly obsolete in favor of R_AARCH64_TLSDESC, gcc -mtls-dialect=trad still uses them.
MFC r341511,r341512,r341513: r341511: Fix style(9). Not a functional change. r341512: Implement arm64 version of __tls_get_addr(). r341513: Tidy up arm64 reloc_jmpslots() implementation. - don't relocate jump slots multiple times (if LD_BIND_NOW is defined). - process only R_AARCH64_JUMP_SLOT here, other relocation types are handled by reloc_plt().
Add STT_GNU_IFUNC and R_AARCH64_IRELATIVE support on arm64. This is based on the amd64 implementation. Both PLT and non-PLT (e.g. a global variable initialised with a pointer to an ifunc) cases are supported. We don't pass anything to the resolver as it is expected they will read the ID registers directly, with the number of registers with CPU info likely to increase in the future. Reviewed by: kib Approved by: re (gjb) Differential Revision: https://reviews.freebsd.org/D17341
Rework rtld's TLS Variant I implementation to match r326794. The above commit fixed handling overaligned TLS segments in libc's TLS Variant I implementation, but rtld provides its own implementation for dynamically-linked executables which lacks these fixes. Thus, port these changes to rtld. This was previously committed as r337978 and reverted in r338149 due to exposing a bug in the ARM rtld. This bug was fixed in r338317 by mmel. Submitted by: James Clarke Approved by: re (kib) Reviewed by: kbowling Testing by: kbowling (powerpc64), br (riscv), kevans (armv7) Obtained from: CheriBSD Sponsored by: DARPA, AFRL Differential Revision: https://reviews.freebsd.org/D16510
Revert r337978: Rework rtld's TLS Variant I implementation to match r326794. Michal Meloun reports that it breaks ctype (isspace()..) related functions on armv7 so back out while we diagnose the issue. Reported by: Michal Meloun
Rework rtld's TLS Variant I implementation to match r326794. The above commit fixed handling overaligned TLS segments in libc's TLS Variant I implementation, but rtld provides its own implementation for dynamically-linked executables which lacks these fixes. Thus, port these changes to rtld. Submitted by: James Clarke Reviewed by: kbowling Testing by: kbowling (powerpc64), br (riscv), kevans (armv7) Obtained from: CheriBSD Sponsored by: DARPA, AFRL Differential Revision: https://reviews.freebsd.org/D16510
Make rtld_bind_start() debugger friendly. Save the link register and annotate the call frame structure so a debugger can unwind the call frame created by rtld_bind_start(). MFC after: 2 weeks
o Let rtld(1) set up psABI user trap handlers prior to executing the objects' init functions instead of doing the setup via a constructor in libc, as the init functions may already depend on these handlers being in place. This rids us of: - the undefined order in which libc constructors such as __guard_setup() and jemalloc_constructor() are executed WRT __sparc_utrap_setup(), - the requirement to link libc last so __sparc_utrap_setup() gets called prior to constructors in other libraries (see r122883). For static binaries, crt1.o still sets up the user trap handlers. o Move misplaced prototypes for MD functions into the MD prototype section of rtld.h. o Sprinkle nitems().
Implement LD_BIND_NOT knob for rtld. From the manpage: When set to a nonempty string, prevents modifications of the PLT slots when doing bindings. As a result, each call of the PLT-resolved function is resolved. In combination with debug output, this provides a complete account of all bind actions at runtime. The same feature exists on Linux and Solaris. Sponsored by: The FreeBSD Foundation MFC after: 2 weeks
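The effect of LD_BIND_NOT described above can be illustrated with a small model. This is only a sketch of the idea, not rtld's code: `plt_slot`, `bind_slot()`, `call_via_plt()` and `target()` are hypothetical names, and the "PLT slot" is a plain function pointer rather than a real GOT entry.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef int (*func_t)(int);

/* Hypothetical stand-in for a function exported by a shared library. */
static int target(int x) { return x + 1; }

/* Toy model of one PLT slot (assumption: not rtld's real data layout). */
struct plt_slot {
    func_t resolved;   /* NULL until the slot has been patched */
    int bind_count;    /* how many times the resolver ran for this slot */
};

/* Hypothetical resolver: finds the target and optionally patches the slot. */
static func_t bind_slot(struct plt_slot *slot, bool bind_not)
{
    assert(slot != NULL);
    slot->bind_count++;
    if (!bind_not)
        slot->resolved = target;   /* normal lazy binding: patch once */
    return target;                 /* with bind_not the slot stays unpatched */
}

/* Call through the slot the way a PLT stub would. */
static int call_via_plt(struct plt_slot *slot, bool bind_not, int arg)
{
    func_t f = slot->resolved != NULL ? slot->resolved
                                      : bind_slot(slot, bind_not);
    return f(arg);
}
```

With `bind_not` false the resolver runs once per slot; with it true the resolver runs on every call, which is what makes each bind action visible in debug output.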
Pull the R_AARCH64_TLSDESC code out into a common function and use it in both the plt and non-plt case. This fixes an issue where libraries built with LLD can fail with "Unhandled relocation 1031". PR: 214971 Obtained from: 1 week Sponsored by: DARPA, AFRL
Retire long-broken/unused static rtld support. rtld-elf has some vestigial support for building as a static executable. r45501 introduced a partial implementation with a prescient note that it "might never be enabled." r153515 introduced ELF symbol versioning support, and removed part of the unused build infrastructure for static rtld. GNU ld populates rela relocation addends and GOT entries with the same values, and rtld's run-time dynamic executable check relied on this. Alternate toolchains may not populate the GOT entries, which caused RTLD_IS_DYNAMIC to return false. Simplify rtld by just removing the unused check. If we want to restore static rtld support later on we ought to introduce a build-time #ifdef flag. PR: 214972 Reviewed by: kan MFC after: 1 month Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D8687
Adjust r308689 to make rtld compilable with either in-tree or (hopefully) stock gcc 4.2.1 on i386 and other arches. In particular: - Do not use %ebx in the asm constraints on i386, since rtld is compiled with -fPIC and gcc cannot handle GOT-base register reload (clang and newer gcc can). - Avoid direct use of the [static N] construct in the function declaration/definition. In-tree gcc was patched to support this, but stock 4.2.1 cannot handle the feature. Requested by: bde Sponsored by: The FreeBSD Foundation MFC after: 1 week
Pass CPUID[1] %edx (cpu_feature), %ecx (cpu_feature2) and CPUID[7].%ebx (cpu_stdext_feature), %ecx (cpu_stdext_feature2) to the ifunc resolvers on x86. It is much cleaner to use the CPUID instruction in usermode to retrieve this information than to pass the AT_HWCAP aux vector from the kernel, on x86. Still, the change does allow for use of AT_HWCAP on arches where it is needed, by passing the aux array to the ifunc_init() initializer, which should prepare arguments for ifunc resolvers. Current signature for resolvers on x86 is func_t iresolve(uint32_t cpu_feature, uint32_t cpu_feature2, uint32_t cpu_stdext_feature, uint32_t cpu_stdext_feature2); where arguments have identical meaning as the kernel variables of the same name. The ABIs allow resolvers to be declared with a void or shortened list of arguments. Reviewed by: jhb Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D8448
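A resolver written against the signature quoted above might look roughly like the following. This is an illustrative sketch only: `popcount_resolve()`, `popcount_generic()`, `popcount_fast()` and the `FEATURE2_POPCNT` macro are hypothetical names introduced here, both implementations are portable C stand-ins, and the resolver is called by hand instead of being wired up through the toolchain's ifunc mechanism.

```c
#include <stdint.h>

typedef unsigned (*popcount_fn)(uint32_t);

/* Hypothetical local name for the POPCNT bit (bit 23) of CPUID[1].%ecx. */
#define FEATURE2_POPCNT (UINT32_C(1) << 23)

/* Baseline implementation: clear the lowest set bit until none remain. */
static unsigned popcount_generic(uint32_t v)
{
    unsigned n = 0;

    while (v != 0) {
        v &= v - 1;
        n++;
    }
    return n;
}

/*
 * Stand-in for an optimized variant. A real one would use the hardware
 * instruction; this portable bit-parallel version keeps the sketch runnable.
 */
static unsigned popcount_fast(uint32_t v)
{
    v = v - ((v >> 1) & 0x55555555u);
    v = (v & 0x33333333u) + ((v >> 2) & 0x33333333u);
    v = (v + (v >> 4)) & 0x0f0f0f0fu;
    return (v * 0x01010101u) >> 24;
}

/* Resolver with the four-argument x86 signature from the commit message. */
static popcount_fn popcount_resolve(uint32_t cpu_feature,
    uint32_t cpu_feature2, uint32_t cpu_stdext_feature,
    uint32_t cpu_stdext_feature2)
{
    (void)cpu_feature;
    (void)cpu_stdext_feature;
    (void)cpu_stdext_feature2;
    return (cpu_feature2 & FEATURE2_POPCNT) != 0 ?
        popcount_fast : popcount_generic;
}
```

The resolver only inspects the feature words it is handed, so it needs no CPUID access of its own and can run before most of the process is initialized.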
Do not call callbacks for dl_iterate_phdr(3) with the rtld bind and phdr locks locked. This allows calling rtld services from the callback, which is only reasonable for dlopen(path, RTLD_NOLOAD) to test existence of the library in the image, and for dlsym(). The latter might still be not quite safe, due to the lazy resolution of filters. To allow dropping the locks around iteration in dl_iterate_phdr(3), we insert markers to track the current position between relocks. The global objects list is converted to a tailq and all iterators skip markers; globallist_next() and globallist_curr() helpers are added. Reported and tested by: davide Reviewed by: kan Sponsored by: The FreeBSD Foundation MFC after: 3 weeks
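The marker technique can be sketched with the `<sys/queue.h>` tailq macros. This is a simplified model under stated assumptions, not rtld's implementation: `struct obj`, `list_curr()`, `list_next()`, `iterate()` and `count_cb()` are hypothetical names (loosely mirroring the globallist_curr()/globallist_next() helpers mentioned above), and the lock and unlock points are only indicated by comments.

```c
#include <stdbool.h>
#include <stddef.h>
#include <sys/queue.h>

struct obj {
    TAILQ_ENTRY(obj) next;
    const char *name;
    bool marker;            /* true for position markers, not real objects */
};
TAILQ_HEAD(obj_list, obj);

/* First non-marker entry at or after o, or NULL at the end of the list. */
static struct obj *list_curr(struct obj *o)
{
    while (o != NULL && o->marker)
        o = TAILQ_NEXT(o, next);
    return o;
}

/* First non-marker entry strictly after o. */
static struct obj *list_next(struct obj *o)
{
    return list_curr(TAILQ_NEXT(o, next));
}

typedef int (*iter_cb)(struct obj *, void *);

/* Example callback: counts the objects it is shown. */
static int count_cb(struct obj *o, void *arg)
{
    (void)o;
    (*(int *)arg)++;
    return 0;
}

static int iterate(struct obj_list *list, iter_cb cb, void *arg)
{
    struct obj marker = { .marker = true };
    struct obj *o;
    int error = 0;

    /* (the list lock would be taken here) */
    o = list_curr(TAILQ_FIRST(list));
    while (o != NULL) {
        /* Remember our position, then run the callback unlocked. */
        TAILQ_INSERT_AFTER(list, o, &marker, next);
        /* (lock dropped here) */
        error = cb(o, arg);
        /* (lock retaken here) */
        o = list_next(&marker);
        TAILQ_REMOVE(list, &marker, next);
        if (error != 0)
            break;
    }
    /* (lock released here) */
    return error;
}
```

Because the iterator resumes from its own marker and not from the object it handed to the callback, the position stays valid even if other entries were added to the list while the lock was dropped.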
Remove the compat code to handle the kernel passing us an unaligned stack pointer. Userland expects the kernel to pass it an aligned sp and pass a pointer to the arguments in x0. The kernel side was updated in r289502, 3 months ago. Sponsored by: ABT Systems Ltd
Create a generalized exec hook that different architectures can hook into if they need to, but default to no action. Differential Review: https://reviews.freebsd.org/D2718
Fix how we place each object's thread-local data. The code used was based on the Variant II code, however arm64 uses Variant I. The former placed the thread pointer after the data, pointing at the thread control block, while the latter places these before said data. Because of this we need to use the size of the previous entry to calculate where to place the current entry. We also need to reserve 16 bytes at the start for the thread control block. This also fixes the value of TLS_TCB_SIZE to be correct. This is the size of two unsigned longs, i.e. 2 * 8 bytes. While here remove the bogus adjustment of the pointer in the R_AARCH64_TLS_TPREL64 case. It should be the offset of the data relative to the thread pointer, including the thread control block. Sponsored by: ABT Systems Ltd
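The placement rule described above (a 16-byte thread control block first, then each object's block placed after the previous one) can be worked through with a small calculation. This is a simplified sketch, not the rtld code: `tls_next_offset()` and `round_up()` are hypothetical helpers, alignment is assumed to be a power of two, and a previous offset of 0 is used to mean "no previous block".

```c
#include <assert.h>
#include <stddef.h>

/* Two unsigned longs on arm64, as stated in the commit message. */
#define TLS_TCB_SIZE 16

/* Round x up to a multiple of align (align must be a power of two). */
static size_t round_up(size_t x, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (x + align - 1) & ~(align - 1);
}

/*
 * Variant I: offsets grow upward from the thread pointer. The new block
 * starts after the previous block (offset + size), or right after the
 * TCB if there is no previous block (prev_offset == 0).
 */
static size_t tls_next_offset(size_t prev_offset, size_t prev_size,
    size_t align)
{
    size_t start;

    start = prev_offset == 0 ? TLS_TCB_SIZE : prev_offset + prev_size;
    return round_up(start, align);
}
```

For example, a first object with 8-byte alignment lands at offset 16; if it is 20 bytes long, the next 8-byte-aligned object lands at offset 40.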
Add the addend in the R_AARCH64_ABS64 and R_AARCH64_GLOB_DAT cases. This fixes at least sshd, and some of the boehm-gc tests. Sponsored by: ABT Systems Ltd
Save & restore the floating-point argument registers before calling _rtld_bind. The compiler may generate code using these registers and not save them. Unfortunately, as we make use of libc, we are unable to disallow rtld from using floating-point registers without also doing the same for the parts of libc we use, or by limiting what _rtld_bind is able to call. Obtained from: ABT Systems Ltd Sponsored by: The FreeBSD Foundation
Also save x8. It may be passed into a function as the indirect result location pointer when the return value doesn't fit in a register, e.g. when returning a struct. Obtained from: ABT Systems Ltd Sponsored by: The FreeBSD Foundation
Add a workaround to correctly align the stack before calling into C code. When enough time has passed for users to update their userland the kernel fix will be applied. This will change the ABI to have x0 point to the args and sp be correctly aligned. It is expected this compatibility code can be removed when the kernel and qemu usermode emulation have both been updated for the new ABI. This fixes clang failures, and most likely other crashes. Obtained from: ABT Systems Ltd Sponsored by: The FreeBSD Foundation
Use the correct value to get the offset of the object's TLS data. Sponsored by: The FreeBSD Foundation
Add support for thread local storage on arm64 to the runtime linker. The ABI specifies that, for R_AARCH64_TLSDESC relocations, we use the symbol value, addend, and object tls offset to calculate the offset from the tls base. We then cache this value for future reference. Differential Revision: https://reviews.freebsd.org/D2183 Reviewed by: kib Sponsored by: The FreeBSD Foundation
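The calculation and caching described above amount to a sum of three terms that is computed once. The following is a minimal sketch under that reading, not the actual relocation code: `struct tlsdesc_cache` and `tlsdesc_offset()` are hypothetical, and the real descriptor layout and resolver are not modelled.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-relocation cache slot. */
struct tlsdesc_cache {
    bool valid;         /* true once offset has been computed */
    uint64_t offset;    /* cached offset from the TLS base */
    int computed;       /* how many times the sum was evaluated */
};

/*
 * Offset from the TLS base for one TLSDESC relocation:
 * symbol value + addend + the object's TLS offset, computed on first
 * use and served from the cache afterwards.
 */
static uint64_t tlsdesc_offset(struct tlsdesc_cache *c, uint64_t sym_value,
    int64_t addend, uint64_t obj_tls_offset)
{
    if (!c->valid) {
        c->offset = sym_value + (uint64_t)addend + obj_tls_offset;
        c->valid = true;
        c->computed++;
    }
    return c->offset;
}
```

For instance, a symbol value of 0x10, an addend of 8 and an object TLS offset of 16 give an offset of 40, and a second lookup returns the cached value without recomputing it.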