MFC r350575: rtld-elf: Remove x86 elf_rtld.x linker scripts.
MFC r339877-r339879, r343564-r343566, r343580, r343754: Untangle jemalloc and mutexes initialization.
The merge includes required warnings cleanup by arichardson, both to avoid conflicts and to make rtld_malloc.c compilable with the libthr WARNS settings.
MFC r342113: Improve R_AARCH64_TLSDESC relocation.
The original code did not support dynamically loaded libraries and used suboptimal access to TLS variables. The new implementation removes lazy resolving of TLS relocations: due to a flaw in the TLSDESC design, it is impossible to switch the resolver function at runtime without expensive locking.
MFC r340842: Silence gcc warnings.
MFC r339897: Remove rtld use of libc amd64_set_fsbase().
o Let rtld(1) set up psABI user trap handlers prior to executing the objects' init functions instead of doing the setup via a constructor in libc, as the init functions may already depend on these handlers being in place. This gets rid of:
  - the undefined order in which libc constructors such as __guard_setup() and jemalloc_constructor() are executed WRT __sparc_utrap_setup(),
  - the requirement to link libc last so __sparc_utrap_setup() gets called prior to constructors in other libraries (see r122883).
  For static binaries, crt1.o still sets up the user trap handlers.
o Move misplaced prototypes for MD functions into the MD prototype section of rtld.h.
o Sprinkle nitems().
libexec: adoption of SPDX licensing ID tags.
Mainly focus on files that use BSD 2-Clause license, however the tool I was using misidentified many licenses so this was mostly a manual - error prone - task.
The Software Package Data Exchange (SPDX) group provides a specification to make it easier for automated tools to detect and summarize well known opensource licenses. We are gradually adopting the specification, noting that the tags are considered only advisory and do not, in any way, supersede or replace the license texts.
No functional change intended.
Implement LD_BIND_NOT knob for rtld.
From the manpage: When set to a nonempty string, prevents modifications of the PLT slots when doing bindings. As result, each call of the PLT-resolved function is resolved. In combination with debug output, this provides complete account of all bind actions at runtime.
Same feature exists on Linux and Solaris.
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
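The effect of the knob can be pictured with a toy model. The sketch below is not rtld's code: `func_t`, `bind_slot()`, `call_via_slot()` and `knob_enabled()` are hypothetical names, and a plain function pointer stands in for a PLT slot. It only shows that, when the slot is never updated, every call goes through the binder again.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for a PLT slot and the function it resolves to. */
typedef int (*func_t)(void);

static int resolve_count;               /* number of times the binder ran */

static int target(void)
{
    return 42;
}

/* Knob semantics as quoted from the manpage: set to a nonempty string. */
static bool knob_enabled(const char *value)
{
    return value != NULL && value[0] != '\0';
}

/* Toy binder: finds the target and caches it in the slot unless bind_not. */
static func_t bind_slot(func_t *slot, bool bind_not)
{
    assert(slot != NULL);
    resolve_count++;
    if (!bind_not)
        *slot = target;
    return target;
}

/* Toy PLT call: use the cached slot if filled, else go through the binder. */
static int call_via_slot(func_t *slot, bool bind_not)
{
    func_t f = (*slot != NULL) ? *slot : bind_slot(slot, bind_not);

    return f();
}
```

With `bind_not` false, two calls through the same slot resolve once; with `bind_not` true, they resolve twice and the slot stays empty.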
rtld: do not rely on a populated GOT on amd64
On rela architectures GNU BFD ld and gold store the relocation addend in GOT entries (in addition to the relocation's r_addend field). rtld previously relied on this to access its own _DYNAMIC symbol in order to apply its own relocations.
However, recording addends in the GOT is not specified by the ABI, and some versions of LLVM's LLD linker leave the GOT uninitialized on rela architectures.
BFD ld does not populate the GOT on sparc64, and sparc64 rtld has a machine-dependent rtld_dynamic_addr() function that returns the _DYNAMIC address. Use the same approach on amd64, obtaining the %rip-relative _DYNAMIC address following a suggestion from Rafael Espíndola.
Architectures other than amd64 should be addressed in future work.
PR: 214972
Reviewed by: kib
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D9180
Adjust r308689 to make rtld compilable with either in-tree or (hopefully) stock gcc 4.2.1 on i386 and other arches.
In particular:
- Do not use %ebx in the asm constraints on i386, since rtld is compiled with -fPIC and gcc cannot handle GOT-base register reload (clang and newer gcc can).
- Avoid direct use of the [static N] construct in the function declaration/definition. In-tree gcc was patched to support this, but stock 4.2.1 cannot handle the feature.
Requested by: bde
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
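One common way to avoid writing [static N] directly is to hide it behind a macro that expands to nothing for compilers that cannot parse it. The sketch below illustrates that idea only. `MIN_SIZE()`, `NO_STATIC_ARRAY_PARAM` and `sum_features()` are hypothetical names, and the commit itself may have used a different mechanism.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical wrapper: expands to the C99 "static n" array-parameter
 * qualifier when the compiler is assumed to accept it, and to nothing
 * otherwise (NO_STATIC_ARRAY_PARAM is an illustrative opt-out switch).
 */
#if defined(__STDC_VERSION__) && __STDC_VERSION__ >= 199901L && \
    !defined(NO_STATIC_ARRAY_PARAM)
#define MIN_SIZE(n) static (n)
#else
#define MIN_SIZE(n)
#endif

/* The caller promises that p points to at least four elements. */
static unsigned sum_features(const unsigned p[MIN_SIZE(4)])
{
    unsigned sum = 0;

    assert(p != NULL);
    for (size_t i = 0; i < 4; i++)
        sum += p[i];
    return sum;
}
```

Either expansion declares the same pointer parameter. The `static` form only adds a promise about the minimum array length that newer compilers can use for diagnostics.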
Pass CPUID[1] %edx (cpu_feature), %ecx (cpu_feature2) and CPUID[7].%ebx (cpu_stdext_feature), %ecx (cpu_stdext_feature2) to the ifunc resolvers on x86.
It is much cleaner to use the CPUID instruction in usermode to retrieve this information than to pass the AT_HWCAP aux vector from the kernel, on x86. Still, the change does allow for use of AT_HWCAP on arches where it is needed, by passing the aux array to the ifunc_init() initializer, which should prepare arguments for ifunc resolvers.
Current signature for resolvers on x86 is
  func_t iresolve(uint32_t cpu_feature, uint32_t cpu_feature2, uint32_t cpu_stdext_feature, uint32_t cpu_stdext_feature2);
where arguments have identical meaning as the kernel variables of the same name. The ABIs allow the use of resolvers with a void or shortened list of arguments.
Reviewed by: jhb
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D8448
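A resolver with the signature quoted above might look like the following sketch. It is written as an ordinary function so that it can be exercised with hand-made feature words, and it is not marked as an actual ifunc. The `func_t` typedef, the `impl_*` functions, `call_resolved()` and the two bit macros are assumptions made for illustration. The bit positions follow the CPUID documentation (SSE4.2 is CPUID[1].%ecx bit 20, AVX2 is CPUID[7].%ebx bit 5). A production resolver would also need to confirm operating-system support for the AVX state, which is omitted here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed shape of func_t for this sketch: a pointer to the chosen routine. */
typedef int (*func_t)(void);

/* Illustrative names for two documented CPUID feature bits. */
#define FEATURE2_SSE42  (UINT32_C(1) << 20)   /* CPUID[1].%ecx, bit 20 */
#define STDEXT_AVX2     (UINT32_C(1) << 5)    /* CPUID[7].%ebx, bit 5 */

/* Hypothetical implementations; the return value identifies the variant. */
static int impl_generic(void) { return 0; }
static int impl_sse42(void)   { return 1; }
static int impl_avx2(void)    { return 2; }

/* Pick the most capable variant the feature words allow. */
static func_t iresolve(uint32_t cpu_feature, uint32_t cpu_feature2,
    uint32_t cpu_stdext_feature, uint32_t cpu_stdext_feature2)
{
    (void)cpu_feature;            /* unused in this sketch */
    (void)cpu_stdext_feature2;    /* unused in this sketch */

    if ((cpu_stdext_feature & STDEXT_AVX2) != 0)
        return impl_avx2;
    if ((cpu_feature2 & FEATURE2_SSE42) != 0)
        return impl_sse42;
    return impl_generic;
}

/* Convenience wrapper: resolve, then call the selected variant. */
static int call_resolved(uint32_t f, uint32_t f2, uint32_t s, uint32_t s2)
{
    func_t fn = iresolve(f, f2, s, s2);

    assert(fn != NULL);
    return fn();
}
```

Because the feature words arrive as arguments, the resolver has no need to call other library functions, which matters since resolvers run while relocation is still in progress.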
Do not call callbacks for dl_iterate_phdr(3) with the rtld bind and phdr locks locked. This allows calling rtld services from the callback, which is only reasonable for dlopen(path, RTLD_NOLOAD) to test existence of the library in the image, and for dlsym(). The latter might still be not quite safe, due to the lazy resolution of filters.
To allow dropping the locks around iteration in dl_iterate_phdr(3), we insert markers to track the current position between relocks. The global objects list is converted to tailq and all iterators skip markers; globallist_next() and globallist_curr() helpers are added.
Reported and tested by: davide
Reviewed by: kan
Sponsored by: The FreeBSD Foundation
MFC after: 3 weeks
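The marker technique can be sketched with the TAILQ macros from <sys/queue.h>. This is a simplified model under stated assumptions, not rtld's implementation: `struct obj`, `skip_markers()`, `iterate()` and `remove_visited()` are hypothetical, and the places where real code would drop and retake its locks are only indicated by comments.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/queue.h>

/* Hypothetical, simplified object record; not rtld's Obj_Entry. */
struct obj {
    TAILQ_ENTRY(obj) link;
    const char *name;
    bool marker;                /* true for placeholder entries */
};
TAILQ_HEAD(obj_list, obj);

typedef void (*visit_fn)(struct obj_list *, struct obj *, void *);

/* Return the first non-marker entry at or after o, or NULL. */
static struct obj *skip_markers(struct obj *o)
{
    while (o != NULL && o->marker)
        o = TAILQ_NEXT(o, link);
    return o;
}

/*
 * Visit every real entry. A marker is parked right after the current
 * entry while the callback runs, so the walk can resume from the marker
 * even if the callback modifies the list (for example, removes the entry).
 */
static int iterate(struct obj_list *list, visit_fn visit, void *arg)
{
    struct obj marker = { .name = NULL, .marker = true };
    struct obj *cur;
    int visited = 0;

    assert(list != NULL && visit != NULL);
    cur = skip_markers(TAILQ_FIRST(list));
    while (cur != NULL) {
        TAILQ_INSERT_AFTER(list, cur, &marker, link);
        /* A real implementation would release its locks here. */
        visit(list, cur, arg);
        visited++;
        /* ... and re-acquire them here before touching the list. */
        cur = skip_markers(TAILQ_NEXT(&marker, link));
        TAILQ_REMOVE(list, &marker, link);
    }
    return visited;
}

/* Example callback that unlinks the entry it was handed. */
static void remove_visited(struct obj_list *list, struct obj *o, void *arg)
{
    TAILQ_REMOVE(list, o, link);
    (*(int *)arg)++;
}
```

The walk never dereferences the entry it just visited after the callback returns. Its position is carried by the marker alone, which is why the callback is free to change the list.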
rtld: wrap a comment to 80 columns
Create a generalized exec hook that different architectures can hook into if they need to, but default to no action.
Differential Review: https://reviews.freebsd.org/D2718
Disable SSE in libthr
Clang emits SSE instructions on amd64 in the common path of pthread_mutex_unlock. If the thread does not otherwise use SSE, this usage incurs a context-switch of the FPU/SSE state, which reduces the performance of multiple real-world applications by a non-trivial amount (3-5% in one application).
Instead of this change, I experimented with eagerly switching the FPU state at context-switch time. This did not help. Most of the cost seems to be in the read/write of memory--as kib@ stated--and not in the #NM handling. I tested on machines with and without XSAVEOPT.
One counter-argument to this change is that most applications already use SIMD, and the number of applications and amount of SIMD usage are only increasing. This is absolutely true. I agree that--in general and in principle--this change is in the wrong direction. However, there are applications that do not use enough SSE to offset the extra context-switch cost. SSE does not provide a clear benefit in the current libthr code with the current compiler, but it does provide a clear loss in some cases. Therefore, disabling SSE in libthr is a non-loss for most, and a gain for some.
I refrained from disabling SSE in libc--as was suggested--because I can't make the above argument for libc. It provides a wide variety of code; each case should be analyzed separately.
https://lists.freebsd.org/pipermail/freebsd-current/2015-March/055193.html
Suggestions from: dim, jmg, rpaulo
Approved by: kib (mentor)
MFC after: 2 weeks
Sponsored by: Dell Inc.
Change compiler setting to make default visibility of the symbols for rtld on x86 to be hidden. This is a micro-optimization, which allows intrinsic references inside rtld to be handled without indirection through PLT. The visibility of rtld symbols for other objects in the symbol namespace is controlled by a version script.
Reviewed by: kan, jilles
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
Optimize r270798, only do the second pass over non-plt relocations when the first pass found IFUNCs.
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
IFUNC symbol type shall be processed for non-PLT relocations, e.g. when a global variable is initialized with a pointer to ifunc. Add symbol type check and call resolver for STT_GNU_IFUNC symbol types when processing non-PLT relocations, but only after non-IFUNC relocations are done. The two-phase processing is required since resolvers may reference other symbols, which must be ready to use when resolver calls are done.
Restructure reloc_non_plt() on x86 to call find_symdef() and handle IFUNC in a single place.
For non-x86 reloc_non_plt(), check for call for IFUNC relocation and do nothing, to avoid processing relocs twice.
PR: 193048
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
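The two-phase ordering can be illustrated with a small model. Everything in the sketch is hypothetical (`struct reloc`, `reloc_pass()`, `relocate()`, and the array of `long` slots standing in for relocated words); it does not reproduce reloc_non_plt(). It shows plain relocations being applied first and resolvers being called afterwards, with the second pass skipped when the first pass saw no IFUNC entries.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical resolver type: may read slots that were already relocated. */
typedef long (*resolver_t)(const long *slots);

struct reloc {
    size_t slot;            /* index of the word to fill */
    bool is_ifunc;          /* stand-in for an STT_GNU_IFUNC symbol */
    long value;             /* plain symbol value, used when !is_ifunc */
    resolver_t resolver;    /* called when is_ifunc */
};

/* Example resolver that depends on slot 0 being relocated already. */
static long pick_from_slot0(const long *slots)
{
    return slots[0] + 1;
}

/* One pass over the table; returns true if any IFUNC entry was seen. */
static bool reloc_pass(const struct reloc *r, size_t n, long *slots,
    bool ifunc_pass)
{
    bool saw_ifunc = false;

    for (size_t i = 0; i < n; i++) {
        if (r[i].is_ifunc) {
            saw_ifunc = true;
            if (ifunc_pass) {
                assert(r[i].resolver != NULL);
                slots[r[i].slot] = r[i].resolver(slots);
            }
        } else if (!ifunc_pass) {
            slots[r[i].slot] = r[i].value;
        }
    }
    return saw_ifunc;
}

/* Returns the number of passes made over the relocation table. */
static int relocate(const struct reloc *r, size_t n, long *slots)
{
    int passes = 1;

    if (reloc_pass(r, n, slots, false)) {
        reloc_pass(r, n, slots, true);
        passes++;
    }
    return passes;
}
```

Even when an IFUNC entry is listed before the plain entry it depends on, its resolver runs only after the plain value is in place.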
Add dwarf annotations to the amd64 _rtld_bind_start to allow debuggers to unwind around the calls from PLT to binder.
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Add GNU hash support for rtld.
Based on dragonflybsd support for GNU hash by John Marino <draco marino st>
Reviewed by: kan
Tested by: bapt
MFC after: 2 weeks
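For context, the string hash conventionally used with GNU-style hash sections starts from 5381 and multiplies by 33 per byte. The function below is a minimal sketch of that hash alone. The name `gnu_hash()` is chosen for illustration, and the Bloom filter and bucket/chain lookup that a loader performs around it are not shown.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Symbol-name hash: h = h * 33 + c, starting from 5381, modulo 2^32. */
static uint32_t gnu_hash(const char *name)
{
    const unsigned char *p = (const unsigned char *)name;
    uint32_t h = 5381;

    assert(name != NULL);
    while (*p != '\0')
        h = h * 33 + *p++;
    return h;
}
```

For example, `gnu_hash("")` is 5381 and `gnu_hash("printf")` is 0x156b2bb8.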
Fix several problems with our ELF filters implementation.
Do not relocate twice an object which happens to be needed by the loaded binary (or dso) and some filtee opened due to symbol resolution when relocating needed objects. Record the state of the relocation processing in Obj_Entry and short-circuit relocate_objects() if the current object is already processed.
Do not call constructors for filtees loaded during the early relocation processing before the image is initialized enough to run user-provided code. Filtees are loaded using dlopen_object(), which normally performs relocation and initialization. If a filtee is lazy-loaded during the relocation of a dso needed by the main object, dlopen_object() runs too early, when most runtime services are not yet ready.
Postpone the constructors call to the time when the main binary and dependent libraries constructors are run, passing the new flag RTLD_LO_EARLY to dlopen_object(). Symbol lookup callers inform symlook_* functions about the early stage of initialization with SYMLOOK_EARLY. Pass flags through all functions participating in object relocation.
Use the opportunity and fix the flags argument to find_symdef() in arch-specific reloc.c to use the proper name SYMLOOK_IN_PLT instead of true, which happens to have the same numeric value.
Reported and tested by: theraven
Reviewed by: kan
MFC after: 2 weeks
Add support for preinit, init and fini arrays. Some ABIs, in particular on ARM, do require working init arrays.
Traditional FreeBSD crt1 calls _init and _fini of the binary, instead of allowing the runtime linker to arrange the calls. This was probably done to have the same crt code serve both statically and dynamically linked binaries. Since the ABI mandates that preinit array functions are called first, then init, and then init array functions, init has to be called from rtld now.
To provide binary compatibility with old FreeBSD crt1, which calls _init itself, rtld only calls initializers and finalizers for the main binary if the binary has a note indicating that the new crt was used for linking. Add parsing of ELF notes to rtld, and cache the p_osrel value since we parsed it anyway.
The patch is inspired by init_array support for DragonflyBSD, written by John Marino.
Reviewed by: kan
Tested by: andrew (arm, previous version), flo (sparc64, previous version)
MFC after: 3 weeks
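The call order described above (preinit array, then init, then init array) can be written down as a short sketch. The `struct init_info` layout, `run_initializers()`, and the trace helpers are hypothetical and exist only to make the ordering observable. rtld's real data structures and the ELF note check are not modeled.

```c
#include <assert.h>
#include <stddef.h>

typedef void (*init_fn)(void);

/* Hypothetical, simplified view of one object's initializers. */
struct init_info {
    const init_fn *preinit_array;
    size_t preinit_count;
    init_fn init;               /* legacy _init-style entry, may be NULL */
    const init_fn *init_array;
    size_t init_count;
};

/* Trace buffer that records the order in which initializers ran. */
static char trace[16];
static size_t trace_len;

static void record(char c)
{
    assert(trace_len < sizeof(trace) - 1);
    trace[trace_len++] = c;
    trace[trace_len] = '\0';
}

/* Example initializers, each leaving a one-letter footprint. */
static void pre_a(void)       { record('p'); }
static void legacy_init(void) { record('i'); }
static void arr_a(void)       { record('a'); }
static void arr_b(void)       { record('b'); }

/* Run initializers in the order the commit message attributes to the ABI. */
static void run_initializers(const struct init_info *o)
{
    assert(o != NULL);
    for (size_t i = 0; i < o->preinit_count; i++)
        o->preinit_array[i]();
    if (o->init != NULL)
        o->init();
    for (size_t i = 0; i < o->init_count; i++)
        o->init_array[i]();
}
```

Running it on an object with one preinit function, a legacy init, and two init array entries leaves the trace "piab".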
Remove unneeded dtv variable.
It is only assigned and not used at all. The object files stay identical when the variables are removed.
Approved by: kib
_rtld_bind() read-locks the bind lock, and possible plt resolution from the dispatcher would also acquire bind lock in read mode, which is the supported operation. plt is explicitly designed to allow safe multithreaded updates, so the shared lock does not cause problems.
The error in r228435 is that it allows read lock acquisition after the write lock for the bind block. If we dlopened the shared object that contains IRELATIVE or jump slot whose target is STT_GNU_IFUNC, then possible recursive plt resolve from the dispatcher would cause it.
Postpone the resolution for irelative/ifunc right before initializers are called, and drop bind lock around calls to dispatcher. Use initlist to iterate over the objects instead of the ->next, due to drop of the bind lock in iteration.
For i386/reloc.c:reloc_iresolve(), fix calculation of the dispatch function address for dso, by taking into account possible non-zero relocbase.
MFC after: 3 weeks
Add support for STT_GNU_IFUNC and R_MACHINE_IRELATIVE GNU extensions to rtld on i386 and amd64. This adds runtime bits necessary for the use of the dispatch functions from the dynamically-linked executables and shared libraries.
To allow use of external references from the dispatch function, resolution of the R_MACHINE_IRESOLVE relocations in PLT is postponed until GOT entries for PLT are prepared, and normal resolution of the GOT entries is finished. Similar to how it is done by GNU, IRELATIVE relocations are resolved in advance, instead of normal lazy handling for PLT.
Move the init_pltgot() call before the relocations for the object are processed.
MFC after: 3 weeks