|
Revision tags: release/12.2.0, release/11.4.0, release/12.1.0, release/11.3.0 |
|
| #
4bd91f6e |
| 15-Mar-2019 |
Konstantin Belousov <[email protected]> |
MFC r341689, r341711, r341712, r341809: Add getfhat(2), fhlink(2), fhlinkat(2), fhreadlink(2) file handle system calls.
To easier potential MFC of the AT_BENEATH feature, some vestiges of it were le
MFC r341689, r341711, r341712, r341809: Add getfhat(2), fhlink(2), fhlinkat(2), fhreadlink(2) file handle system calls.
To easier potential MFC of the AT_BENEATH feature, some vestiges of it were left in the merged product but commented out.
Due to a lot of conflicts, it was impossible to split the merge and regeneration of the syscall tables, because I needed to test the result. It is fine for stable branch to commit the whole change with the generated diff.
show more ...
|
|
Revision tags: release/12.0.0 |
|
| #
7cc923f8 |
| 10-Jul-2018 |
Brooks Davis <[email protected]> |
Get rid of netbsd_lchown and netbsd_msync syscall entries.
No valid FreeBSD binary very called them (they would call lchown and msync directly) and we haven't supported NetBSD binaries in ages.
Thi
Get rid of netbsd_lchown and netbsd_msync syscall entries.
No valid FreeBSD binary very called them (they would call lchown and msync directly) and we haven't supported NetBSD binaries in ages.
This is a respin of r335983 with a workaround for the ancient BFD linker in the libc stubs.
Reviewed by: kib Sponsored by: DARPA, AFRL Differential Revision: https://reviews.freebsd.org/D16193
show more ...
|
| #
714c03c8 |
| 05-Jul-2018 |
Brooks Davis <[email protected]> |
Revert r335983.
The bfd linker in tree doesn't support multiple names for the same symbol (at least with current flags).
|
| #
5b04a71d |
| 05-Jul-2018 |
Brooks Davis <[email protected]> |
Get rid of netbsd_lchown and netbsd_msync syscall entries.
No valid FreeBSD binary ever called them (they would call lchown and msync directly) and we haven't supported NetBSD binaries in ages.
Rev
Get rid of netbsd_lchown and netbsd_msync syscall entries.
No valid FreeBSD binary ever called them (they would call lchown and msync directly) and we haven't supported NetBSD binaries in ages.
Reviewed by: kib Sponsored by: DARPA, AFRL Differential Revision: https://reviews.freebsd.org/D15814
show more ...
|
|
Revision tags: release/11.2.0 |
|
| #
7351a8bd |
| 25-May-2018 |
Brooks Davis <[email protected]> |
Make vadvise compat freebsd11.
The vadvise syscall (aka ovadvise) is undocumented and has always been implmented as returning EINVAL. Put the syscall under COMPAT11 and provide a userspace implemen
Make vadvise compat freebsd11.
The vadvise syscall (aka ovadvise) is undocumented and has always been implmented as returning EINVAL. Put the syscall under COMPAT11 and provide a userspace implementation.
Reviewed by: kib Sponsored by: DARPA, AFRL Differential Revision: https://reviews.freebsd.org/D15557
show more ...
|
| #
08a7e74c |
| 21-Mar-2018 |
Conrad Meyer <[email protected]> |
getentropy(3): Fallback to kern.arandom sysctl on older kernels
On older kernels, when userspace program disables SIGSYS, catch ENOSYS and emulate getrandom(2) syscall with the kern.arandom sysctl (
getentropy(3): Fallback to kern.arandom sysctl on older kernels
On older kernels, when userspace program disables SIGSYS, catch ENOSYS and emulate getrandom(2) syscall with the kern.arandom sysctl (via existing arc4_sysctl wrapper).
Special care is taken to faithfully emulate EFAULT on NULL pointers, because sysctl(3) as used by kern.arandom ignores NULL oldp. (This was caught by getentropy(3) ATF tests.)
Reported by: kib Reviewed by: kib Discussed with: delphij Sponsored by: Dell EMC Isilon Differential Revision: https://reviews.freebsd.org/D14785
show more ...
|
| #
e9ac2743 |
| 21-Mar-2018 |
Conrad Meyer <[email protected]> |
Implement getrandom(2) and getentropy(3)
The general idea here is to provide userspace programs with well-defined sources of entropy, in a fashion that doesn't require opening a new file descriptor
Implement getrandom(2) and getentropy(3)
The general idea here is to provide userspace programs with well-defined sources of entropy, in a fashion that doesn't require opening a new file descriptor (ulimits) or accessing paths (/dev/urandom may be restricted by chroot or capsicum).
getrandom(2) is the more general API, and comes from the Linux world. Since our urandom and random devices are identical, the GRND_RANDOM flag is ignored.
getentropy(3) is added as a compatibility shim for the OpenBSD API.
truss(1) support is included.
Tests for both system calls are provided. Coverage is believed to be at least as comprehensive as LTP getrandom(2) test coverage. Additionally, instructions for running the LTP tests directly against FreeBSD are provided in the "Test Plan" section of the Differential revision linked below. (They pass, of course.)
PR: 194204 Reported by: David CARLIER <david.carlier AT hardenedbsd.org> Discussed with: cperciva, delphij, jhb, markj Relnotes: maybe Differential Revision: https://reviews.freebsd.org/D14500
show more ...
|
| #
3f289c3f |
| 12-Jan-2018 |
Jeff Roberson <[email protected]> |
Implement 'domainset', a cpuset based NUMA policy mechanism. This allows userspace to control NUMA policy administratively and programmatically.
Implement domainset based iterators in the page laye
Implement 'domainset', a cpuset based NUMA policy mechanism. This allows userspace to control NUMA policy administratively and programmatically.
Implement domainset based iterators in the page layer.
Remove the now legacy numa_* syscalls.
Cleanup some header polution created by having seq.h in proc.h.
Reviewed by: markj, kib Discussed with: alc Tested by: pho Sponsored by: Netflix, Dell/EMC Isilon Differential Revision: https://reviews.freebsd.org/D13403
show more ...
|
|
Revision tags: release/10.4.0, release/11.1.0 |
|
| #
1bf9ff76 |
| 20-Jul-2017 |
Alan Somers <[email protected]> |
Remove some private symbols from librt
Private functions like __aio_read and _aio_read were exposed in FBSDprivate_1.0 by r169090, even though they've never been used outside of librt. Also, remove
Remove some private symbols from librt
Private functions like __aio_read and _aio_read were exposed in FBSDprivate_1.0 by r169090, even though they've never been used outside of librt. Also, remove some weak references from r156136 that have never resolved.
Reviewed by: kib MFC after: 3 weeks Sponsored by: Spectra Logic Corp Differential Revision: https://reviews.freebsd.org/D11649
show more ...
|
| #
2b34e843 |
| 17-Jun-2017 |
Konstantin Belousov <[email protected]> |
Add abstime kqueue(2) timers and expand struct kevent members.
This change implements NOTE_ABSTIME flag for EVFILT_TIMER, which specifies that the data field contains absolute time to fire the event
Add abstime kqueue(2) timers and expand struct kevent members.
This change implements NOTE_ABSTIME flag for EVFILT_TIMER, which specifies that the data field contains absolute time to fire the event.
To make this useful, data member of the struct kevent must be extended to 64bit. Using the opportunity, I also added ext members. This changes struct kevent almost to Apple struct kevent64, except I did not changed type of ident and udata, the later would cause serious API incompatibilities.
The type of ident was kept uintptr_t since EVFILT_AIO returns a pointer in this field, and e.g. CHERI is sensitive to the type (discussed with brooks, jhb).
Unlike Apple kevent64, symbol versioning allows us to claim ABI compatibility and still name the new syscall kevent(2). Compat shims are provided for both host native and compat32.
Requested by: bapt Reviewed by: bapt, brooks, ngie (previous version) Sponsored by: The FreeBSD Foundation Differential revision: https://reviews.freebsd.org/D11025
show more ...
|
| #
69921123 |
| 23-May-2017 |
Konstantin Belousov <[email protected]> |
Commit the 64-bit inode project.
Extend the ino_t, dev_t, nlink_t types to 64-bit ints. Modify struct dirent layout to add d_off, increase the size of d_fileno to 64-bits, increase the size of d_na
Commit the 64-bit inode project.
Extend the ino_t, dev_t, nlink_t types to 64-bit ints. Modify struct dirent layout to add d_off, increase the size of d_fileno to 64-bits, increase the size of d_namlen to 16-bits, and change the required alignment. Increase struct statfs f_mntfromname[] and f_mntonname[] array length MNAMELEN to 1024.
ABI breakage is mitigated by providing compatibility using versioned symbols, ingenious use of the existing padding in structures, and by employing other tricks. Unfortunately, not everything can be fixed, especially outside the base system. For instance, third-party APIs which pass struct stat around are broken in backward and forward incompatible ways.
Kinfo sysctl MIBs ABI is changed in backward-compatible way, but there is no general mechanism to handle other sysctl MIBS which return structures where the layout has changed. It was considered that the breakage is either in the management interfaces, where we usually allow ABI slip, or is not important.
Struct xvnode changed layout, no compat shims are provided.
For struct xtty, dev_t tty device member was reduced to uint32_t. It was decided that keeping ABI compat in this case is more useful than reporting 64-bit dev_t, for the sake of pstat.
Update note: strictly follow the instructions in UPDATING. Build and install the new kernel with COMPAT_FREEBSD11 option enabled, then reboot, and only then install new world.
Credits: The 64-bit inode project, also known as ino64, started life many years ago as a project by Gleb Kurtsou (gleb). Kirk McKusick (mckusick) then picked up and updated the patch, and acted as a flag-waver. Feedback, suggestions, and discussions were carried by Ed Maste (emaste), John Baldwin (jhb), Jilles Tjoelker (jilles), and Rick Macklem (rmacklem). Kris Moore (kris) performed an initial ports investigation followed by an exp-run by Antoine Brodin (antoine). Essential and all-embracing testing was done by Peter Holm (pho). The heavy lifting of coordinating all these efforts and bringing the project to completion were done by Konstantin Belousov (kib).
Sponsored by: The FreeBSD Foundation (emaste, kib) Differential revision: https://reviews.freebsd.org/D10439
show more ...
|
| #
73065ae8 |
| 20-Mar-2017 |
Xin LI <[email protected]> |
Make space style consistent with earlier entries.
X-MFC with: r315526
|
| #
3f8455b0 |
| 19-Mar-2017 |
Eric van Gyzen <[email protected]> |
Add clock_nanosleep()
Add a clock_nanosleep() syscall, as specified by POSIX. Make nanosleep() a wrapper around it.
Attach the clock_nanosleep test from NetBSD. Adjust it for the FreeBSD behavior o
Add clock_nanosleep()
Add a clock_nanosleep() syscall, as specified by POSIX. Make nanosleep() a wrapper around it.
Attach the clock_nanosleep test from NetBSD. Adjust it for the FreeBSD behavior of updating rmtp only when interrupted by a signal. I believe this to be POSIX-compliant, since POSIX mentions the rmtp parameter only in the paragraph about EINTR. This is also what Linux does. (NetBSD updates rmtp unconditionally.)
Copy the whole nanosleep.2 man page from NetBSD because it is complete and closely resembles the POSIX description. Edit, polish, and reword it a bit, being sure to keep any relevant text from the FreeBSD page.
Reviewed by: kib, ngie, jilles MFC after: 3 weeks Relnotes: yes Sponsored by: Dell EMC Differential Revision: https://reviews.freebsd.org/D10020
show more ...
|
|
Revision tags: release/11.0.1, release/11.0.0 |
|
| #
b3879151 |
| 17-Aug-2016 |
Bryan Drewery <[email protected]> |
Garbage collect _umtx_lock(2)/_umtx_unlock(2) references removed in r263318.
This has no real impact on the resulting libc.so file.
MFC after: 3 days Sponsored by: EMC / Isilon Storage Division
|
| #
295af703 |
| 15-Aug-2016 |
Konstantin Belousov <[email protected]> |
Add an implementation of fdatasync(2).
The syscall is a trivial wrapper around new VOP_FDATASYNC(), sharing code with fsync(2). For all filesystems, this commit provides the implementation which de
Add an implementation of fdatasync(2).
The syscall is a trivial wrapper around new VOP_FDATASYNC(), sharing code with fsync(2). For all filesystems, this commit provides the implementation which delegates the work of VOP_FDATASYNC() to VOP_FSYNC(). This is functionally correct but not efficient.
This is not yet POSIX-compliant implementation, because it does not ensure that queued AIO requests are completed before returning.
Reviewed by: mckusick Discussed with: avg (ZFS), jhb (AIO part) Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 2 weeks Differential revision: https://reviews.freebsd.org/D7471
show more ...
|
|
Revision tags: release/10.3.0 |
|
| #
6d3eca24 |
| 12-Mar-2016 |
John Baldwin <[email protected]> |
Remove Symbol.map entries for old AIO system calls for FreeBSD 6 compat.
These entries should have never been present since they only exist for compat with FreeBSD 6.x (and older) binaries. This wa
Remove Symbol.map entries for old AIO system calls for FreeBSD 6 compat.
These entries should have never been present since they only exist for compat with FreeBSD 6.x (and older) binaries. This was missed in r296572. Technically this breaks the ABI by removing versioned symbols. However, no binaries should be linked against these symbols. No release has shipped with a header that contained a prototype for these functions.
Reviewed by: kib Differential Revision: https://reviews.freebsd.org/D5615
show more ...
|
| #
bf420ace |
| 29-Jan-2016 |
Konstantin Belousov <[email protected]> |
Add implementations of sendmmsg(3) and recvmmsg(3) functions which wraps sendmsg(2) and recvmsg(2) into batch send and receive operation. The goal of this implementation is only to provide API compat
Add implementations of sendmmsg(3) and recvmmsg(3) functions which wraps sendmsg(2) and recvmsg(2) into batch send and receive operation. The goal of this implementation is only to provide API compatibility with Linux.
The cancellation behaviour of the functions is not quite right, but due to relative rare use of cancellation it is considered acceptable comparing with the complexity of the correct implementation. If functions are reimplemented as syscalls, the fix would come almost trivial. The direct use of the syscall trampolines instead of libc wrappers for sendmsg(2) and recvmsg(2) is to avoid data loss on cancellation.
Submitted by: Boris Astardzhiev <[email protected]> Discussed with: jilles (cancellation behaviour) MFC after: 1 month
show more ...
|
| #
842898ce |
| 14-Aug-2015 |
Pedro F. Giffuni <[email protected]> |
Remove a stale comment and clarify the original where it was taken from
The comment in the libc/sys symbol map referenced the generated symbols for the syscall trampolines. Such comment was out of p
Remove a stale comment and clarify the original where it was taken from
The comment in the libc/sys symbol map referenced the generated symbols for the syscall trampolines. Such comment was out of place in the secure symbol map so remove the stale comment and attempt to clarify the old one to avoid risks of confusion.
Pointed out by: kib
show more ...
|
| #
fe0d386c |
| 14-Aug-2015 |
Pedro F. Giffuni <[email protected]> |
Move the stack protector to a new "secure" directory
As part of the code refactoring to support FORTIFY_SOURCE we want a new subdirectory "secure" to keep the files related to security. Move the sta
Move the stack protector to a new "secure" directory
As part of the code refactoring to support FORTIFY_SOURCE we want a new subdirectory "secure" to keep the files related to security. Move the stack protector functions to this new directory.
No functional change.
Differential Review: https://reviews.freebsd.org/D3333
show more ...
|
|
Revision tags: release/10.2.0 |
|
| #
6520495a |
| 11-Jul-2015 |
Adrian Chadd <[email protected]> |
Add an initial NUMA affinity/policy configuration for threads and processes.
This is based on work done by jeff@ and jhb@, as well as the numa.diff patch that has been circulating when someone asks
Add an initial NUMA affinity/policy configuration for threads and processes.
This is based on work done by jeff@ and jhb@, as well as the numa.diff patch that has been circulating when someone asks for first-touch NUMA on -10 or -11.
* Introduce a simple set of VM policy and iterator types. * tie the policy types into the vm_phys path for now, mirroring how the initial first-touch allocation work was enabled. * add syscalls to control changing thread and process defaults. * add a global NUMA VM domain policy. * implement a simple cascade policy order - if a thread policy exists, use it; if a process policy exists, use it; use the default policy. * processes inherit policies from their parent processes, threads inherit policies from their parent threads. * add a simple tool (numactl) to query and modify default thread/process policities. * add documentation for the new syscalls, for numa and for numactl. * re-enable first touch NUMA again by default, as now policies can be set in a variety of methods.
This is only relevant for very specific workloads.
This doesn't pretend to be a final NUMA solution.
The previous defaults in -HEAD (with MAXMEMDOM set) can be achieved by 'sysctl vm.default_policy=rr'.
This is only relevant if MAXMEMDOM is set to something other than 1. Ie, if you're using GENERIC or a modified kernel with non-NUMA, then this is a glorified no-op for you.
Thank you to Norse Corp for giving me access to rather large (for FreeBSD!) NUMA machines in order to develop and verify this.
Thank you to Dell for providing me with dual socket sandybridge and westmere v3 hardware to do NUMA development with.
Thank you to Scott Long at Netflix for providing me with access to the two-socket, four-domain haswell v3 hardware.
Thank you to Peter Holm for running the stress testing suite against the NUMA branch during various stages of development!
Tested:
* MIPS (regression testing; non-NUMA) * i386 (regression testing; non-NUMA GENERIC) * amd64 (regression testing; non-NUMA GENERIC) * westmere, 2 socket (thankyou norse!) * sandy bridge, 2 socket (thankyou dell!) * ivy bridge, 2 socket (thankyou norse!) * westmere-EX, 4 socket / 1TB RAM (thankyou norse!) * haswell, 2 socket (thankyou norse!) * haswell v3, 2 socket (thankyou dell) * haswell v3, 2x18 core (thankyou scott long / netflix!)
* Peter Holm ran a stress test suite on this work and found one issue, but has not been able to verify it (it doesn't look NUMA related, and he only saw it once over many testing runs.)
* I've tested bhyve instances running in fixed NUMA domains and cpusets; all seems to work correctly.
Verified:
* intel-pcm - pcm-numa.x and pcm-memory.x, whilst selecting different NUMA policies for processes under test.
Review:
This was reviewed through phabricator (https://reviews.freebsd.org/D2559) as well as privately and via emails to freebsd-arch@. The git history with specific attributes is available at https://github.com/erikarn/freebsd/ in the NUMA branch (https://github.com/erikarn/freebsd/compare/local/adrian_numa_policy).
This has been reviewed by a number of people (stas, rpaulo, kib, ngie, wblock) but not achieved a clear consensus. My hope is that with further exposure and testing more functionality can be implemented and evaluated.
Notes:
* The VM doesn't handle unbalanced domains very well, and if you have an overly unbalanced memory setup whilst under high memory pressure, VM page allocation may fail leading to a kernel panic. This was a problem in the past, but it's much more easily triggered now with these tools.
* This work only controls the path through vm_phys; it doesn't yet strongly/predictably affect contigmalloc, KVA placement, UMA, etc. So, driver placement of memory isn't really guaranteed in any way. That's next on my plate.
Sponsored by: Norse Corp, Inc.; Dell
show more ...
|
| #
2205e0d1 |
| 23-Jan-2015 |
Jilles Tjoelker <[email protected]> |
Add futimens and utimensat system calls.
The core kernel part is patch file utimes.2008.4.diff from [email protected]. I updated the code for API changes, added the manual page and added compatibi
Add futimens and utimensat system calls.
The core kernel part is patch file utimes.2008.4.diff from [email protected]. I updated the code for API changes, added the manual page and added compatibility code for old kernels. There is also audit and Capsicum support.
A new UTIME_* constant might allow setting birthtimes in future.
Differential Revision: https://reviews.freebsd.org/D1426 Submitted by: pluknet (partially) Reviewed by: delphij, pluknet, rwatson Relnotes: yes
show more ...
|
| #
1a744fef |
| 05-Jan-2015 |
Konstantin Belousov <[email protected]> |
Avoid calling internal libc function through PLT or accessing data though GOT, by staticizing and hiding. Add setter for __error_selector to hide it as well.
Suggested and reviewed by: jilles Spons
Avoid calling internal libc function through PLT or accessing data though GOT, by staticizing and hiding. Add setter for __error_selector to hide it as well.
Suggested and reviewed by: jilles Sponsored by: The FreeBSD Foundation MFC after: 1 week
show more ...
|
| #
8495e8b1 |
| 03-Jan-2015 |
Konstantin Belousov <[email protected]> |
Fix known issues which blow up the process after dlopen("libthr.so") (or loading a dso linked to libthr.so into process which was not linked against threading library).
- Remove libthr interposers o
Fix known issues which blow up the process after dlopen("libthr.so") (or loading a dso linked to libthr.so into process which was not linked against threading library).
- Remove libthr interposers of the libc functions, including __error(). Instead, functions calls are indirected through the interposing table, similar to how pthread stubs in libc are already done. Libc by default points either to syscall trampolines or to existing libc implementations. On libthr load, libthr rewrites the pointers to the cancellable implementations already in libthr. The interposition table is separate from pthreads stubs indirection table to not pull pthreads stubs into static binaries.
- Postpone the malloc(3) internal mutexes initialization until libthr is loaded. This avoids recursion between calloc(3) and static pthread_mutex_t initialization.
- Reinstall signal handlers with wrapper on libthr load. The _rtld_is_dlopened(3) is used to avoid useless calls to sigaction(2) when libthr is statically referenced from the main binary.
In the process, fix openat(2), swapcontext(2) and setcontext(2) interposing. The libc symbols were exported at different versions than libthr interposers. Export both libc and libthr versions from libc now, with default set to the higher version from libthr.
Remove unused and disconnected swapcontext(3) userspace implementation from libc/gen.
No objections from: deischen Tested by: pho, antoine (exp-run) (previous versions) Sponsored by: The FreeBSD Foundation MFC after: 1 week
show more ...
|
| #
186d9c34 |
| 13-Nov-2014 |
Dmitry Chagin <[email protected]> |
Add the ppoll() system call. Export kern_poll() needed by an upcoming Linuxulator change.
Differential Revision: https://reviews.freebsd.org/D1133 Reviewed by: kib, wblock MFC after: 1 month
|
|
Revision tags: release/10.1.0, release/9.3.0, release/10.0.0, release/9.2.0 |
|
| #
55648840 |
| 19-Sep-2013 |
John Baldwin <[email protected]> |
Extend the support for exempting processes from being killed when swap is exhausted. - Add a new protect(1) command that can be used to set or revoke protection from arbitrary processes. Similar t
Extend the support for exempting processes from being killed when swap is exhausted. - Add a new protect(1) command that can be used to set or revoke protection from arbitrary processes. Similar to ktrace it can apply a change to all existing descendants of a process as well as future descendants. - Add a new procctl(2) system call that provides a generic interface for control operations on processes (as opposed to the debugger-specific operations provided by ptrace(2)). procctl(2) uses a combination of idtype_t and an id to identify the set of processes on which to operate similar to wait6(). - Add a PROC_SPROTECT control operation to manage the protection status of a set of processes. MADV_PROTECT still works for backwards compatability. - Add a p_flag2 to struct proc (and a corresponding ki_flag2 to kinfo_proc) the first bit of which is used to track if P_PROTECT should be inherited by new child processes.
Reviewed by: kib, jilles (earlier version) Approved by: re (delphij) MFC after: 1 month
show more ...
|