.\" Copyright (c) 2003, David G. Lawrence
.\" All rights reserved.
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" are met:
.\" 1. Redistributions of source code must retain the above copyright
.\"    notice unmodified, this list of conditions, and the following
.\"    disclaimer.
.\" 2. Redistributions in binary form must reproduce the above copyright
.\"    notice, this list of conditions and the following disclaimer in the
.\"    documentation and/or other materials provided with the distribution.
.\"
.\" THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
.\" ARE DISCLAIMED.  IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
.\" SUCH DAMAGE.
.\"
.\" $FreeBSD$
.\"
.Dd January 7, 2016
.Dt SENDFILE 2
.Os
.Sh NAME
.Nm sendfile
.Nd send a file to a socket
.Sh LIBRARY
.Lb libc
.Sh SYNOPSIS
.In sys/types.h
.In sys/socket.h
.In sys/uio.h
.Ft int
.Fo sendfile
.Fa "int fd" "int s" "off_t offset" "size_t nbytes"
.Fa "struct sf_hdtr *hdtr" "off_t *sbytes" "int flags"
.Fc
.Sh DESCRIPTION
The
.Fn sendfile
system call
sends a regular file or shared memory object specified by descriptor
.Fa fd
out a stream socket specified by descriptor
.Fa s .
.Pp
The
.Fa offset
argument specifies where to begin in the file.
Should
.Fa offset
fall beyond the end of file, the system will return
success and report 0 bytes sent as described below.
The
.Fa nbytes
argument specifies how many bytes of the file should be sent, with 0 having the special
meaning of send until the end of file has been reached.
.Pp
An optional header and/or trailer can be sent before and after the file data by specifying
a pointer to a
.Vt "struct sf_hdtr" ,
which has the following structure:
.Pp
.Bd -literal -offset indent -compact
struct sf_hdtr {
	struct iovec *headers;	/* pointer to header iovecs */
	int hdr_cnt;		/* number of header iovecs */
	struct iovec *trailers;	/* pointer to trailer iovecs */
	int trl_cnt;		/* number of trailer iovecs */
};
.Ed
.Pp
The
.Fa headers
and
.Fa trailers
pointers, if
.Pf non- Dv NULL ,
point to arrays of
.Vt "struct iovec"
structures.
See the
.Fn writev
system call for information on the iovec structure.
The number of iovecs in these
arrays is specified by
.Fa hdr_cnt
and
.Fa trl_cnt .
.Pp
If
.Pf non- Dv NULL ,
the system will write the total number of bytes sent on the socket to the
variable pointed to by
.Fa sbytes .
.Pp
The least significant 16 bits of the
.Fa flags
argument are a bitmap of these values:
.Bl -tag -offset indent
.It Dv SF_NODISKIO
This flag causes
.Nm
to return
.Er EBUSY
instead of blocking when a busy page is encountered.
This rare situation can happen if some other process is now working
with the same region of the file.
It is advised to retry the operation after a short period.
.Pp
Note that in older
.Fx
versions the
.Dv SF_NODISKIO
flag had a slightly different meaning.
The flag prevented
.Nm
from running I/O operations when an invalid (not cached) page was encountered,
thus avoiding blocking on I/O.
Starting with
.Fx 11 ,
.Nm
sending files off the
.Xr ffs 7
filesystem doesn't block on I/O
.Po see
.Sx IMPLEMENTATION NOTES
.Pc ,
so the condition no longer applies.
However, it is safe if an application utilizes
.Dv SF_NODISKIO
and on
.Er EBUSY
performs the same action as it did in
older
.Fx
versions, e.g.
.Xr aio_read 2 ,
.Xr read 2
or
.Nm
in a different context.
.It Dv SF_NOCACHE
The data sent to the socket will not be cached by the virtual memory system,
and will be freed directly to the pool of free pages.
.It Dv SF_SYNC
.Nm
sleeps until the network stack no longer references the VM pages
of the file, making subsequent modifications to it safe.
Please note that this is not a guarantee that the data has actually
been sent.
.El
.Pp
The most significant 16 bits of
.Fa flags
specify the number of pages that
.Nm
may read ahead when reading the file.
A macro
.Fn SF_FLAGS
is provided to combine readahead amount and flags.
The following example specifies a readahead of 16 pages and the
.Dv SF_NOCACHE
flag:
.Pp
.Bd -literal -offset indent -compact
	SF_FLAGS(16, SF_NOCACHE)
.Ed
.Pp
When using a socket marked for non-blocking I/O,
.Fn sendfile
may send fewer bytes than requested.
In this case, the number of bytes successfully
written is returned in
.Fa *sbytes
(if specified),
and the error
.Er EAGAIN
is returned.
.Sh IMPLEMENTATION NOTES
The
.Fx
implementation of
.Fn sendfile
doesn't block on disk I/O when it sends a file off the
.Xr ffs 7
filesystem.
The syscall returns success before the actual I/O completes, and data
is put into the socket later unattended.
However, the order of data in the socket is preserved, so it is safe
to do further writes to the socket.
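.Pp
As an illustration of the
.Fa flags
layout described in
.Sx DESCRIPTION ,
the following sketch packs and unpacks such a value by hand.
The helper names
.Fn sf_pack ,
.Fn sf_readahead
and
.Fn sf_flag_bits
are hypothetical and exist only for this example; they assume a 32-bit
.Vt "unsigned int" .
Real programs should use the
.Fn SF_FLAGS
macro and the flag constants from the system headers.
.Bd -literal -offset indent
```c
#include <assert.h>

/*
 * Hypothetical helpers mirroring the layout described in this page:
 * flag bits in the least significant 16 bits, readahead page count
 * in the most significant 16 bits.  Not part of any system header.
 */
unsigned int
sf_pack(unsigned int readahead_pages, unsigned int flag_bits)
{
	/* Both fields must fit in 16 bits. */
	assert(readahead_pages <= 0xffffu);
	assert(flag_bits <= 0xffffu);
	return ((readahead_pages << 16) | flag_bits);
}

/* Extract the readahead page count (upper 16 bits). */
unsigned int
sf_readahead(unsigned int flags)
{
	return (flags >> 16);
}

/* Extract the flag bitmap (lower 16 bits). */
unsigned int
sf_flag_bits(unsigned int flags)
{
	return (flags & 0xffffu);
}
```
.Ed
.Pp
For instance, packing a readahead of 16 pages with a placeholder flag bit of
0x1 and passing the result to
.Fn sf_readahead
gives back 16; the bit value is chosen only for illustration.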
.Pp
The
.Fx
implementation of
.Fn sendfile
is
.Dq zero-copy ,
meaning that it has been optimized so that copying of the file data is avoided.
.Sh TUNING
On some architectures, this system call internally uses a special
.Fn sendfile
buffer
.Pq Vt "struct sf_buf"
to handle sending file data to the client.
If the sending socket is
blocking, and there are not enough
.Fn sendfile
buffers available,
.Fn sendfile
will block and report a state of
.Dq Li sfbufa .
If the sending socket is non-blocking and there are not enough
.Fn sendfile
buffers available, the call will block and wait for the
necessary buffers to become available before finishing the call.
.Pp
The number of
.Vt sf_buf Ns 's
allocated should be proportional to the number of nmbclusters used to
send data to a client via
.Fn sendfile .
Tune accordingly to avoid blocking!
Busy installations that make extensive use of
.Fn sendfile
may want to increase these values to be in line with their
.Va kern.ipc.nmbclusters
.Po see
.Xr tuning 7
for details
.Pc .
.Pp
The number of
.Fn sendfile
buffers available is determined at boot time by either the
.Va kern.ipc.nsfbufs
.Xr loader.conf 5
variable or the
.Dv NSFBUFS
kernel configuration tunable.
The number of
.Fn sendfile
buffers scales with
.Va kern.maxusers .
The
.Va kern.ipc.nsfbufsused
and
.Va kern.ipc.nsfbufspeak
read-only
.Xr sysctl 8
variables show current and peak
.Fn sendfile
buffer usage respectively.
These values may also be viewed through
.Nm netstat Fl m .
.Pp
If a value of zero is reported for
.Va kern.ipc.nsfbufs ,
your architecture does not need to use
.Fn sendfile
buffers because their task can be efficiently performed
by the generic virtual memory structures.
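.Sh EXAMPLES
The following is a minimal sketch, not a definitive implementation, of
sending a whole file on a socket marked for non-blocking I/O, retrying after
.Er EAGAIN
and
.Er EINTR
as described in this page.
The names
.Fn sf_advance
and
.Fn send_whole_file
are hypothetical.
The sketch assumes that the byte count stored through
.Fa sbytes
is valid on both errors, and it uses
.Xr poll 2
to wait until the socket is writable again.
The part that calls
.Fn sendfile
is compiled only on
.Fx .
.Bd -literal -offset indent
```c
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/uio.h>
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <stddef.h>

/*
 * Hypothetical helper: account for a partial send of "sent" bytes.
 * Returns 1 if more data remains to be sent, 0 if the request is done.
 * An nbytes of 0 means "until end of file" and is left unchanged.
 */
int
sf_advance(off_t *offset, size_t *nbytes, off_t sent)
{
	assert(sent >= 0);
	*offset += sent;
	if (*nbytes == 0)
		return (1);
	if ((size_t)sent >= *nbytes)
		return (0);
	*nbytes -= (size_t)sent;
	return (1);
}

#ifdef __FreeBSD__
/*
 * Hypothetical wrapper: send all of fd, starting at offset 0, on the
 * non-blocking stream socket s.  Returns 0 on success, -1 on error.
 */
int
send_whole_file(int fd, int s)
{
	struct pollfd pfd = { .fd = s, .events = POLLOUT };
	off_t offset = 0, sent;
	size_t nbytes = 0;		/* 0: send until end of file */
	int error;

	for (;;) {
		sent = 0;
		if (sendfile(fd, s, offset, nbytes, NULL, &sent, 0) == 0)
			return (0);
		error = errno;
		if (error != EAGAIN && error != EINTR)
			return (-1);
		/* Skip what was already sent before retrying. */
		(void)sf_advance(&offset, &nbytes, sent);
		/* On EAGAIN, wait until the socket is writable again. */
		if (error == EAGAIN && poll(&pfd, 1, -1) == -1 &&
		    errno != EINTR)
			return (-1);
	}
}
#endif
```
.Ed
.Pp
A caller would typically mark
.Fa s
non-blocking beforehand and then call
.Fn send_whole_file
with an open file descriptor and the connected socket.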
.Sh RETURN VALUES
.Rv -std sendfile
.Sh ERRORS
.Bl -tag -width Er
.It Bq Er EAGAIN
The socket is marked for non-blocking I/O and not all data was sent due to
the socket buffer being filled.
If specified, the number of bytes successfully sent will be returned in
.Fa *sbytes .
.It Bq Er EBADF
The
.Fa fd
argument
is not a valid file descriptor.
.It Bq Er EBADF
The
.Fa s
argument
is not a valid socket descriptor.
.It Bq Er EBUSY
A busy page was encountered and
.Dv SF_NODISKIO
had been specified.
Partial data may have been sent.
.It Bq Er EFAULT
An invalid address was specified for an argument.
.It Bq Er EINTR
A signal interrupted
.Fn sendfile
before it could be completed.
If specified, the number
of bytes successfully sent will be returned in
.Fa *sbytes .
.It Bq Er EINVAL
The
.Fa fd
argument
is not a regular file.
.It Bq Er EINVAL
The
.Fa s
argument
is not a SOCK_STREAM type socket.
.It Bq Er EINVAL
The
.Fa offset
argument
is negative.
.It Bq Er EIO
An error occurred while reading from
.Fa fd .
.It Bq Er ENOBUFS
The system was unable to allocate an internal buffer.
.It Bq Er ENOTCONN
The
.Fa s
argument
points to an unconnected socket.
.It Bq Er ENOTSOCK
The
.Fa s
argument
is not a socket.
.It Bq Er EOPNOTSUPP
The file system for descriptor
.Fa fd
does not support
.Fn sendfile .
.It Bq Er EPIPE
The socket peer has closed the connection.
.El
.Sh SEE ALSO
.Xr netstat 1 ,
.Xr open 2 ,
.Xr send 2 ,
.Xr socket 2 ,
.Xr writev 2 ,
.Xr tuning 7
.Rs
.%A K. Elmeleegy
.%A A. Chanda
.%A A. L. Cox
.%A W. Zwaenepoel
.%T A Portable Kernel Abstraction for Low-Overhead Ephemeral Mapping Management
.%J The Proceedings of the 2005 USENIX Annual Technical Conference
.%P pp 223-236
.%D 2005
.Re
.Sh HISTORY
The
.Fn sendfile
system call
first appeared in
.Fx 3.0 .
This manual page first appeared in
.Fx 3.1 .
In
.Fx 10
support for sending shared memory descriptors was introduced.
In
.Fx 11
a non-blocking implementation was introduced.
.Sh AUTHORS
The initial implementation of the
.Fn sendfile
system call
and this manual page were written by
.An David G. Lawrence Aq Mt [email protected] .
The
.Fx 11
implementation was written by
.An Gleb Smirnoff Aq Mt [email protected] .