.\" Copyright (c) 2000 Jonathan Lemon
.\" All rights reserved.
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" are met:
.\" 1. Redistributions of source code must retain the above copyright
.\"    notice, this list of conditions and the following disclaimer.
.\" 2. Redistributions in binary form must reproduce the above copyright
.\"    notice, this list of conditions and the following disclaimer in the
.\"    documentation and/or other materials provided with the distribution.
.\"
.\" THIS SOFTWARE IS PROVIDED ``AS IS'' AND
.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
.\" ARE DISCLAIMED.  IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
.\" SUCH DAMAGE.
.\"
.\" $FreeBSD$
.\"
.Dd April 14, 2000
.Dt KQUEUE 2
.Os
.Sh NAME
.Nm kqueue ,
.Nm kevent
.Nd kernel event notification mechanism
.Sh LIBRARY
.Lb libc
.Sh SYNOPSIS
.In sys/types.h
.In sys/event.h
.In sys/time.h
.Ft int
.Fn kqueue "void"
.Ft int
.Fn kevent "int kq" "const struct kevent *changelist" "int nchanges" "struct kevent *eventlist" "int nevents" "const struct timespec *timeout"
.Fn EV_SET "&kev" ident filter flags fflags data udata
.Sh DESCRIPTION
The
.Fn kqueue
system call
provides a generic method of notifying the user when an event
happens or a condition holds, based on the results of small
pieces of kernel code termed filters.
A kevent is identified by the (ident, filter) pair; there may only
be one unique kevent per kqueue.
.Pp
The filter is executed upon the initial registration of a kevent
in order to detect whether a preexisting condition is present, and is also
executed whenever an event is passed to the filter for evaluation.
If the filter determines that the condition should be reported,
then the kevent is placed on the kqueue for the user to retrieve.
.Pp
The filter is also run when the user attempts to retrieve the kevent
from the kqueue.
If the filter indicates that the condition that triggered
the event no longer holds, the kevent is removed from the kqueue and
is not returned.
.Pp
Multiple events which trigger the filter do not result in multiple
kevents being placed on the kqueue; instead, the filter will aggregate
the events into a single struct kevent.
Calling
.Fn close
on a file descriptor will remove any kevents that reference the descriptor.
.Pp
The
.Fn kqueue
system call
creates a new kernel event queue and returns a descriptor.
The queue is not inherited by a child created with
.Xr fork 2 .
However, if
.Xr rfork 2
is called without the
.Dv RFFDG
flag, then the descriptor table is shared,
which will allow sharing of the kqueue between two processes.
.Pp
The
.Fn kevent
system call
is used to register events with the queue, and return any pending
events to the user.
The
.Fa changelist
argument
is a pointer to an array of
.Va kevent
structures, as defined in
.Aq Pa sys/event.h .
All changes contained in the
.Fa changelist
are applied before any pending events are read from the queue.
The
.Fa nchanges
argument
gives the size of
.Fa changelist .
The
.Fa eventlist
argument
is a pointer to an array of kevent structures.
The
.Fa nevents
argument
determines the size of
.Fa eventlist .
If
.Fa timeout
is a non-NULL pointer, it specifies a maximum interval to wait
for an event, which will be interpreted as a struct timespec.  If
.Fa timeout
is a NULL pointer,
.Fn kevent
waits indefinitely.  To effect a poll, the
.Fa timeout
argument should be non-NULL, pointing to a zero-valued
.Va timespec
structure.  The same array may be used for the
.Fa changelist
and
.Fa eventlist .
.Pp
The
.Fn EV_SET
macro is provided for ease of initializing a
kevent structure.
.Pp
The
.Va kevent
structure is defined as:
.Bd -literal
struct kevent {
	uintptr_t ident;	/* identifier for this event */
	short	  filter;	/* filter for event */
	u_short	  flags;	/* action flags for kqueue */
	u_int	  fflags;	/* filter flag value */
	intptr_t  data;		/* filter data value */
	void	  *udata;	/* opaque user data identifier */
};
.Ed
.Pp
The fields of
.Fa struct kevent
are:
.Bl -tag -width XXXfilter
.It ident
Value used to identify this event.
The exact interpretation is determined by the attached filter,
but often is a file descriptor.
.It filter
Identifies the kernel filter used to process this event.  The pre-defined
system filters are described below.
.It flags
Actions to perform on the event.
.It fflags
Filter-specific flags.
.It data
Filter-specific data value.
.It udata
Opaque user-defined value passed through the kernel unchanged.
.El
.Pp
The
.Va flags
field can contain the following values:
.Bl -tag -width XXXEV_ONESHOT
.It EV_ADD
Adds the event to the kqueue.  Re-adding an existing event
will modify the parameters of the original event, and not result
in a duplicate entry.  Adding an event automatically enables it,
unless overridden by the EV_DISABLE flag.
.It EV_ENABLE
Permit
.Fn kevent
to return the event if it is triggered.
.It EV_DISABLE
Disable the event so
.Fn kevent
will not return it.  The filter itself is not disabled.
.It EV_DELETE
Removes the event from the kqueue.  Events which are attached to
file descriptors are automatically deleted on the last close of
the descriptor.
.It EV_ONESHOT
Causes the event to return only the first occurrence of the filter
being triggered.  After the user retrieves the event from the kqueue,
it is deleted.
.It EV_CLEAR
After the event is retrieved by the user, its state is reset.
This is useful for filters which report state transitions
instead of the current state.  Note that some filters may automatically
set this flag internally.
.It EV_EOF
Filters may set this flag to indicate filter-specific EOF condition.
.It EV_ERROR
See
.Sx RETURN VALUES
below.
.El
.Pp
The predefined system filters are listed below.
Arguments may be passed to and from the filter via the
.Va fflags
and
.Va data
fields in the kevent structure.
.Bl -tag -width EVFILT_SIGNAL
.It EVFILT_READ
Takes a descriptor as the identifier, and returns whenever
there is data available to read.
The behavior of the filter is slightly different depending
on the descriptor type.
.Pp
.Bl -tag -width 2n
.It Sockets
Sockets which have previously been passed to
.Fn listen
return when there is an incoming connection pending.
.Va data
contains the size of the listen backlog.
.Pp
Other socket descriptors return when there is data to be read,
subject to the
.Dv SO_RCVLOWAT
value of the socket buffer.
This may be overridden with a per-filter low water mark at the
time the filter is added by setting the
NOTE_LOWAT
flag in
.Va fflags ,
and specifying the new low water mark in
.Va data .
On return,
.Va data
contains the number of bytes of protocol data available to read.
.Pp
If the read direction of the socket has been shut down, then the filter
also sets EV_EOF in
.Va flags ,
and returns the socket error (if any) in
.Va fflags .
It is possible for EOF to be returned (indicating the connection is gone)
while there is still data pending in the socket buffer.
.It Vnodes
Returns when the file pointer is not at the end of file.
.Va data
contains the offset from current position to end of file,
and may be negative.
.It "Fifos, Pipes"
Returns when there is data to read;
.Va data
contains the number of bytes available.
.Pp
When the last writer disconnects, the filter will set EV_EOF in
.Va flags .
This may be cleared by passing in EV_CLEAR, at which point the
filter will resume waiting for data to become available before
returning.
.El
.It EVFILT_WRITE
Takes a descriptor as the identifier, and returns whenever
it is possible to write to the descriptor.  For sockets, pipes
and fifos,
.Va data
will contain the amount of space remaining in the write buffer.
The filter will set EV_EOF when the reader disconnects, and for
the fifo case, this may be cleared by use of EV_CLEAR.
Note that this filter is not supported for vnodes.
.Pp
For sockets, the low water mark and socket error handling are
identical to the EVFILT_READ case.
.It EVFILT_AIO
The sigevent portion of the AIO request is filled in, with
.Va sigev_notify_kqueue
containing the descriptor of the kqueue that the event should
be attached to,
.Va sigev_value
containing the udata value, and
.Va sigev_notify
set to SIGEV_KEVENT.
When the
.Fn aio_*
system call is made, the event will be registered
with the specified kqueue, and the
.Va ident
argument set to the
.Fa struct aiocb
returned by the
.Fn aio_*
system call.
The filter returns under the same conditions as aio_error.
.Pp
Alternatively, a kevent structure may be initialized, with
.Va ident
containing the descriptor of the kqueue, and the
address of the kevent structure placed in the
.Va aio_lio_opcode
field of the AIO request.  However, this approach will not work on
architectures with 64-bit pointers, and should be considered deprecated.
.It EVFILT_VNODE
Takes a file descriptor as the identifier and the events to watch for in
.Va fflags ,
and returns when one or more of the requested events occurs on the descriptor.
The events to monitor are:
.Bl -tag -width XXNOTE_RENAME
.It NOTE_DELETE
The
.Fn unlink
system call
was called on the file referenced by the descriptor.
.It NOTE_WRITE
A write occurred on the file referenced by the descriptor.
.It NOTE_EXTEND
The file referenced by the descriptor was extended.
.It NOTE_ATTRIB
The file referenced by the descriptor had its attributes changed.
.It NOTE_LINK
The link count on the file changed.
.It NOTE_RENAME
The file referenced by the descriptor was renamed.
.It NOTE_REVOKE
Access to the file was revoked via
.Xr revoke 2
or the underlying filesystem was unmounted.
.El
.Pp
On return,
.Va fflags
contains the events which triggered the filter.
.It EVFILT_PROC
Takes the process ID to monitor as the identifier and the events to watch for
in
.Va fflags ,
and returns when the process performs one or more of the requested events.
If a process can normally see another process, it can attach an event to it.
The events to monitor are:
.Bl -tag -width XXNOTE_TRACKERR
.It NOTE_EXIT
The process has exited.
.It NOTE_FORK
The process has called
.Fn fork .
.It NOTE_EXEC
The process has executed a new process via
.Xr execve 2
or similar call.
.It NOTE_TRACK
Follow a process across
.Fn fork
calls.  The parent process will return with NOTE_TRACK set in the
.Va fflags
field, while the child process will return with NOTE_CHILD set in
.Va fflags
and the parent PID in
.Va data .
.It NOTE_TRACKERR
This flag is returned if the system was unable to attach an event to
the child process, usually due to resource limitations.
.El
.Pp
On return,
.Va fflags
contains the events which triggered the filter.
.It EVFILT_SIGNAL
Takes the signal number to monitor as the identifier and returns
when the given signal is delivered to the process.
This coexists with the
.Fn signal
and
.Fn sigaction
facilities, and has a lower precedence.  The filter will record
all attempts to deliver a signal to a process, even if the signal has
been marked as SIG_IGN.  Event notification happens after normal
signal delivery processing.
.Va data
returns the number of times the signal has occurred since the last call to
.Fn kevent .
This filter automatically sets the EV_CLEAR flag internally.
.It EVFILT_TIMER
Establishes an arbitrary timer identified by
.Va ident .
When adding a timer,
.Va data
specifies the timeout period in milliseconds.
The timer will be periodic unless EV_ONESHOT is specified.
On return,
.Va data
contains the number of times the timeout has expired since the last call to
.Fn kevent .
This filter automatically sets the EV_CLEAR flag internally.
.El
.Sh RETURN VALUES
The
.Fn kqueue
system call
creates a new kernel event queue and returns a file descriptor.
If there was an error creating the kernel event queue, a value of -1 is
returned and errno set.
.Pp
The
.Fn kevent
system call
returns the number of events placed in the
.Fa eventlist ,
up to the value given by
.Fa nevents .
If an error occurs while processing an element of the
.Fa changelist
and there is enough room in the
.Fa eventlist ,
then the event will be placed in the
.Fa eventlist
with
.Dv EV_ERROR
set in
.Va flags
and the system error in
.Va data .
Otherwise,
.Dv -1
will be returned, and
.Dv errno
will be set to indicate the error condition.
If the time limit expires, then
.Fn kevent
returns 0.
.Sh ERRORS
The
.Fn kqueue
system call fails if:
.Bl -tag -width Er
.It Bq Er ENOMEM
The kernel failed to allocate enough memory for the kernel queue.
.It Bq Er EMFILE
The per-process descriptor table is full.
.It Bq Er ENFILE
The system file table is full.
.El
.Pp
The
.Fn kevent
system call fails if:
.Bl -tag -width Er
.It Bq Er EACCES
The process does not have permission to register a filter.
.It Bq Er EFAULT
There was an error reading or writing the
.Va kevent
structure.
.It Bq Er EBADF
The specified descriptor is invalid.
.It Bq Er EINTR
A signal was delivered before the timeout expired and before any
events were placed on the kqueue for return.
.It Bq Er EINVAL
The specified time limit or filter is invalid.
.It Bq Er ENOENT
The event could not be found to be modified or deleted.
.It Bq Er ENOMEM
No memory was available to register the event.
.It Bq Er ESRCH
The specified process to attach to does not exist.
.El
.Sh SEE ALSO
.Xr aio_error 2 ,
.Xr aio_read 2 ,
.Xr aio_return 2 ,
.Xr poll 2 ,
.Xr read 2 ,
.Xr select 2 ,
.Xr sigaction 2 ,
.Xr write 2 ,
.Xr signal 3
.Sh HISTORY
The
.Fn kqueue
and
.Fn kevent
system calls first appeared in
.Fx 4.1 .
.Sh AUTHORS
The
.Fn kqueue
system and this manual page were written by
.An Jonathan Lemon Aq [email protected] .
.Sh BUGS
It is currently not possible to watch a
.Xr vnode 9
that resides on anything but
a UFS file system.