.\" Copyright (c) 2003, David G. Lawrence
.\" All rights reserved.
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" are met:
.\" 1. Redistributions of source code must retain the above copyright
.\"    notice unmodified, this list of conditions, and the following
.\"    disclaimer.
.\" 2. Redistributions in binary form must reproduce the above copyright
.\"    notice, this list of conditions and the following disclaimer in the
.\"    documentation and/or other materials provided with the distribution.
.\"
.\" THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
.\" ARE DISCLAIMED.  IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
.\" SUCH DAMAGE.
.\"
.\" $FreeBSD$
.\"
.Dd January 7, 2016
.Dt SENDFILE 2
.Os
.Sh NAME
.Nm sendfile
.Nd send a file to a socket
.Sh LIBRARY
.Lb libc
.Sh SYNOPSIS
.In sys/types.h
.In sys/socket.h
.In sys/uio.h
.Ft int
.Fo sendfile
.Fa "int fd" "int s" "off_t offset" "size_t nbytes"
.Fa "struct sf_hdtr *hdtr" "off_t *sbytes" "int flags"
.Fc
.Sh DESCRIPTION
The
.Fn sendfile
system call
sends a regular file or shared memory object specified by descriptor
.Fa fd
out a stream socket specified by descriptor
.Fa s .
.Pp
The
.Fa offset
argument specifies where to begin in the file.
Should
.Fa offset
fall beyond the end of file, the system will return
success and report 0 bytes sent as described below.
The
.Fa nbytes
argument specifies how many bytes of the file should be sent,
with 0 having the special meaning of send until the end of file
has been reached.
.Pp
An optional header and/or trailer can be sent before and after the file data
by specifying a pointer to a
.Vt "struct sf_hdtr" ,
which has the following structure:
.Pp
.Bd -literal -offset indent -compact
struct sf_hdtr {
	struct iovec *headers;	/* pointer to header iovecs */
	int hdr_cnt;		/* number of header iovecs */
	struct iovec *trailers;	/* pointer to trailer iovecs */
	int trl_cnt;		/* number of trailer iovecs */
};
.Ed
.Pp
The
.Fa headers
and
.Fa trailers
pointers, if
.Pf non- Dv NULL ,
point to arrays of
.Vt "struct iovec"
structures.
See the
.Fn writev
system call for information on the iovec structure.
The number of iovecs in these
arrays is specified by
.Fa hdr_cnt
and
.Fa trl_cnt .
.Pp
If
.Fa sbytes
is
.Pf non- Dv NULL ,
the system will write the total number of bytes sent on the socket to the
variable it points to.
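.Pp
Because the count stored through
.Fa sbytes
covers everything written to the socket, a caller that supplies headers
may need to separate header bytes from file bytes after a partial send.
The following is a minimal sketch of that arithmetic and is not part of the
.Fn sendfile
interface.
The helper names
.Fn iov_total
and
.Fn file_bytes_sent
are hypothetical, and the sketch assumes that header data always
precedes file data on the socket.
.Bd -literal -offset indent
```c
#include <sys/types.h>
#include <sys/uio.h>
#include <stddef.h>

/* Sum the lengths of an iovec array, such as the headers array. */
static size_t
iov_total(const struct iovec *iov, int cnt)
{
	size_t total = 0;
	int i;

	for (i = 0; i < cnt; i++)
		total += iov[i].iov_len;
	return (total);
}

/*
 * Given the total reported through sbytes, estimate how many bytes of
 * file data reached the socket.  Assumes headers go out first and that
 * file_len bytes of file data were requested; anything beyond
 * hdr_len + file_len is treated as trailer data.
 */
static off_t
file_bytes_sent(off_t sbytes, size_t hdr_len, size_t file_len)
{
	off_t body;

	if (sbytes <= (off_t)hdr_len)
		return (0);
	body = sbytes - (off_t)hdr_len;
	if ((size_t)body > file_len)
		body = (off_t)file_len;
	return (body);
}
```
.Ed
.Pp
A caller resuming a partial transfer could advance its file offset by the
value returned from
.Fn file_bytes_sent
and resend only the header bytes that were not yet written.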
.Pp
The least significant 16 bits of the
.Fa flags
argument are a bitmap of these values:
.Bl -tag -offset indent
.It Dv SF_NODISKIO
This flag causes
.Nm
to return
.Er EBUSY
instead of blocking when a busy page is encountered.
This rare situation can happen if some other process is currently working
with the same region of the file.
It is advised to retry the operation after a short period.
.Pp
Note that in older
.Fx
versions the
.Dv SF_NODISKIO
flag had a slightly different meaning.
The flag prevented
.Nm
from running I/O operations when an invalid (not cached) page was encountered,
thus avoiding blocking on I/O.
Starting with
.Fx 11 ,
.Nm
sending files off the
.Xr ffs 7
filesystem doesn't block on I/O
(see
.Sx IMPLEMENTATION NOTES ) ,
so the condition no longer applies.
However, it is safe if an application utilizes
.Dv SF_NODISKIO
and on
.Er EBUSY
performs the same action as it did in
older
.Fx
versions, e.g.,
.Xr aio_read 2 ,
.Xr read 2
or
.Nm
in a different context.
.It Dv SF_NOCACHE
The data sent to the socket will not be cached by the virtual memory system,
and will be freed directly to the pool of free pages.
.It Dv SF_SYNC
.Nm
sleeps until the network stack no longer references the VM pages
of the file, making subsequent modifications to it safe.
Please note that this is not a guarantee that the data has actually
been sent.
.El
.Pp
The most significant 16 bits of
.Fa flags
specify the number of pages that
.Nm
may read ahead when reading the file.
A macro
.Fn SF_FLAGS
is provided to combine the readahead amount and flags.
The following example specifies a readahead of 16 pages and the
.Dv SF_NOCACHE
flag:
.Pp
.Bd -literal -offset indent -compact
	SF_FLAGS(16, SF_NOCACHE)
.Ed
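.Pp
The split of
.Fa flags
into two 16-bit halves can be illustrated with a small sketch.
The fallback definitions below are assumptions that mirror the layout
described above and take effect only where
.In sys/socket.h
does not already provide the names; the value given for
.Dv SF_NOCACHE
is likewise assumed for illustration.
The helper names
.Fn pack_flags ,
.Fn readahead_pages
and
.Fn flag_bits
are hypothetical.
.Bd -literal -offset indent
```c
#include <sys/types.h>
#include <sys/socket.h>

/* Assumed fallbacks for systems whose headers lack these names. */
#ifndef SF_NOCACHE
#define SF_NOCACHE 0x00000010	/* assumed value, illustration only */
#endif
#ifndef SF_FLAGS
#define SF_FLAGS(rh, flags) (((rh) << 16) | (flags))
#endif

/*
 * Combine a readahead page count with the flag bitmap.
 * The count is masked to 15 bits so the result stays a non-negative int.
 */
static int
pack_flags(int pages, int flags)
{
	return (SF_FLAGS(pages & 0x7fff, flags & 0xffff));
}

/* Recover the readahead page count from the upper 16 bits. */
static int
readahead_pages(int packed)
{
	return ((int)(((unsigned int)packed >> 16) & 0xffff));
}

/* Recover the flag bitmap from the lower 16 bits. */
static int
flag_bits(int packed)
{
	return (packed & 0xffff);
}
```
.Ed
.Pp
With these helpers,
.Li pack_flags(16, SF_NOCACHE)
produces the same value as the
.Fn SF_FLAGS
expression shown above.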
.Pp
When using a socket marked for non-blocking I/O,
.Fn sendfile
may send fewer bytes than requested.
In this case, the number of bytes successfully
written is returned in
.Fa *sbytes
(if specified),
and the error
.Er EAGAIN
is returned.
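.Pp
A caller on a non-blocking socket therefore has to track its own progress
between calls.
The following is a minimal sketch of one way to do so, assuming no headers
or trailers are in use; the names
.Vt "struct xfer_progress" ,
.Fn xfer_account
and
.Fn send_range
are hypothetical.
The part guarded by
.Dv __FreeBSD__
shows the bookkeeping driving an actual
.Fn sendfile
loop and waits for the socket with
.Xr poll 2 .
.Bd -literal -offset indent
```c
#include <sys/types.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical bookkeeping for one in-progress transfer. */
struct xfer_progress {
	off_t	offset;		/* next file offset to send from */
	size_t	remaining;	/* file bytes left; must start above 0 */
};

enum xfer_next { XFER_DONE, XFER_RETRY, XFER_FAIL };

/*
 * Update the bookkeeping after one sendfile() call.  rc and err are the
 * return value and errno of the call; sbytes is the count it stored.
 * EAGAIN, EINTR and EBUSY are treated as partial sends to be resumed.
 */
static enum xfer_next
xfer_account(struct xfer_progress *p, int rc, int err, off_t sbytes)
{
	int partial = (rc == -1 &&
	    (err == EAGAIN || err == EINTR || err == EBUSY));

	if ((rc == -1 && !partial) || sbytes < 0)
		return (XFER_FAIL);
	if ((size_t)sbytes > p->remaining)
		sbytes = (off_t)p->remaining;
	p->offset += sbytes;
	p->remaining -= (size_t)sbytes;
	if (p->remaining == 0)
		return (XFER_DONE);
	/* Success with bytes left over means end of file was reached. */
	return (partial ? XFER_RETRY : XFER_DONE);
}

#ifdef __FreeBSD__
#include <sys/socket.h>
#include <sys/uio.h>
#include <poll.h>

/* Send len (above 0) bytes of fd, starting at off, over socket s. */
static int
send_range(int fd, int s, off_t off, size_t len)
{
	struct xfer_progress p = { off, len };
	struct pollfd pfd = { s, POLLOUT, 0 };
	off_t sbytes;
	int rc;

	for (;;) {
		sbytes = 0;
		rc = sendfile(fd, s, p.offset, p.remaining, NULL,
		    &sbytes, 0);
		switch (xfer_account(&p, rc, errno, sbytes)) {
		case XFER_DONE:
			return (0);
		case XFER_FAIL:
			return (-1);
		case XFER_RETRY:
			(void)poll(&pfd, 1, -1);	/* wait for space */
			break;
		}
	}
}
#endif
```
.Ed
.Pp
The sketch stops as soon as the remaining count reaches zero rather than
calling
.Fn sendfile
again, because an
.Fa nbytes
of 0 has the special meaning of sending until the end of file.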
.Sh IMPLEMENTATION NOTES
The
.Fx
implementation of
.Fn sendfile
doesn't block on disk I/O when it sends a file off the
.Xr ffs 7
filesystem.
The syscall returns success before the actual I/O completes, and data
is put into the socket later unattended.
However, the order of data in the socket is preserved, so it is safe
to do further writes to the socket.
.Pp
The
.Fx
implementation of
.Fn sendfile
is
.Dq zero-copy ,
meaning that it has been optimized so that copying of the file data is avoided.
.Sh TUNING
On some architectures, this system call internally uses a special
.Fn sendfile
buffer
.Pq Vt "struct sf_buf"
to handle sending file data to the client.
If the sending socket is
blocking, and there are not enough
.Fn sendfile
buffers available,
.Fn sendfile
will block and report a state of
.Dq Li sfbufa .
If the sending socket is non-blocking and there are not enough
.Fn sendfile
buffers available, the call will block and wait for the
necessary buffers to become available before finishing the call.
.Pp
The number of
.Vt sf_buf Ns 's
allocated should be proportional to the number of nmbclusters used to
send data to a client via
.Fn sendfile .
Tune accordingly to avoid blocking!
Busy installations that make extensive use of
.Fn sendfile
may want to increase these values to be in line with their
.Va kern.ipc.nmbclusters
(see
.Xr tuning 7
for details).
.Pp
The number of
.Fn sendfile
buffers available is determined at boot time by either the
.Va kern.ipc.nsfbufs
.Xr loader.conf 5
variable or the
.Dv NSFBUFS
kernel configuration tunable.
The number of
.Fn sendfile
buffers scales with
.Va kern.maxusers .
The
.Va kern.ipc.nsfbufsused
and
.Va kern.ipc.nsfbufspeak
read-only
.Xr sysctl 8
variables show current and peak
.Fn sendfile
buffer usage respectively.
These values may also be viewed through
.Nm netstat Fl m .
.Pp
If a value of zero is reported for
.Va kern.ipc.nsfbufs ,
your architecture does not need to use
.Fn sendfile
buffers because their task can be efficiently performed
by the generic virtual memory structures.
.Sh RETURN VALUES
.Rv -std sendfile
.Sh ERRORS
.Bl -tag -width Er
.It Bq Er EAGAIN
The socket is marked for non-blocking I/O and not all data was sent due to
the socket buffer being filled.
If specified, the number of bytes successfully sent will be returned in
.Fa *sbytes .
.It Bq Er EBADF
The
.Fa fd
argument
is not a valid file descriptor.
.It Bq Er EBADF
The
.Fa s
argument
is not a valid socket descriptor.
.It Bq Er EBUSY
A busy page was encountered and
.Dv SF_NODISKIO
had been specified.
Partial data may have been sent.
.It Bq Er EFAULT
An invalid address was specified for an argument.
.It Bq Er EINTR
A signal interrupted
.Fn sendfile
before it could be completed.
If specified, the number
of bytes successfully sent will be returned in
.Fa *sbytes .
.It Bq Er EINVAL
The
.Fa fd
argument
is not a regular file.
.It Bq Er EINVAL
The
.Fa s
argument
is not a
.Dv SOCK_STREAM
type socket.
.It Bq Er EINVAL
The
.Fa offset
argument
is negative.
.It Bq Er EIO
An error occurred while reading from
.Fa fd .
.It Bq Er ENOBUFS
The system was unable to allocate an internal buffer.
.It Bq Er ENOTCONN
The
.Fa s
argument
points to an unconnected socket.
.It Bq Er ENOTSOCK
The
.Fa s
argument
is not a socket.
.It Bq Er EOPNOTSUPP
The file system for descriptor
.Fa fd
does not support
.Fn sendfile .
.It Bq Er EPIPE
The socket peer has closed the connection.
.El
.Sh SEE ALSO
.Xr netstat 1 ,
.Xr open 2 ,
.Xr send 2 ,
.Xr socket 2 ,
.Xr writev 2 ,
.Xr tuning 7
.Rs
.%A K. Elmeleegy
.%A A. Chanda
.%A A. L. Cox
.%A W. Zwaenepoel
.%T A Portable Kernel Abstraction for Low-Overhead Ephemeral Mapping Management
.%J The Proceedings of the 2005 USENIX Annual Technical Conference
.%P pp 223-236
.%D 2005
.Re
.Sh HISTORY
The
.Fn sendfile
system call
first appeared in
.Fx 3.0 .
This manual page first appeared in
.Fx 3.1 .
In
.Fx 10
support for sending shared memory descriptors was introduced.
In
.Fx 11
a non-blocking implementation was introduced.
.Sh AUTHORS
The initial implementation of the
.Fn sendfile
system call
and this manual page were written by
.An David G. Lawrence Aq Mt [email protected] .
The
.Fx 11
implementation was written by
.An Gleb Smirnoff Aq Mt [email protected] .
378