zstd(1) -- zstd, zstdmt, unzstd, zstdcat - Compress or decompress .zst files
============================================================================

SYNOPSIS
--------

`zstd` [*OPTIONS*] [-|_INPUT-FILE_] [-o _OUTPUT-FILE_]

`zstdmt` is equivalent to `zstd -T0`

`unzstd` is equivalent to `zstd -d`

`zstdcat` is equivalent to `zstd -dcf`


DESCRIPTION
-----------
`zstd` is a fast lossless compression algorithm and data compression tool,
with command line syntax similar to `gzip (1)` and `xz (1)`.
It is based on the **LZ77** family, with further FSE & huff0 entropy stages.
`zstd` offers highly configurable compression speed,
with fast modes at > 200 MB/s per core,
and strong modes nearing lzma compression ratios.
It also features a very fast decoder, with speeds > 500 MB/s per core.

`zstd` command line syntax is generally similar to gzip,
but features the following differences:

  - Source files are preserved by default.
    It's possible to remove them automatically by using the `--rm` option.
  - When compressing a single file, `zstd` displays progress notifications
    and a result summary by default.
    Use `-q` to turn them off.
  - `zstd` does not accept input from the console,
    but it properly accepts `stdin` when it's not the console.
  - `zstd` displays a short help page when the command line contains an error.
    Use `-q` to turn it off.

`zstd` compresses or decompresses each _file_ according to the selected
operation mode.
If no _files_ are given or _file_ is `-`, `zstd` reads from standard input
and writes the processed data to standard output.
`zstd` will refuse to write compressed data to standard output
if it is a terminal: it will display an error message and skip the _file_.
Similarly, `zstd` will refuse to read compressed data from standard input
if it is a terminal.

Unless `--stdout` or `-o` is specified, _files_ are written to a new file
whose name is derived from the source _file_ name:

* When compressing, the suffix `.zst` is appended to the source filename to
  get the target filename.
* When decompressing, the `.zst` suffix is removed from the source filename to
  get the target filename.

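The naming rule above can be sketched as follows. This is a minimal
illustration in Python, not `zstd`'s actual implementation: the helper name
`derive_output_name` is hypothetical, and rejecting a source name that lacks
the `.zst` suffix is an assumption of this sketch rather than documented behavior.

```python
ZST_SUFFIX = ".zst"


def derive_output_name(source: str, decompress: bool = False) -> str:
    """Return the default target filename for a source filename (sketch)."""
    if not decompress:
        # Compression: the suffix is appended to the source filename.
        return source + ZST_SUFFIX
    if not source.endswith(ZST_SUFFIX):
        # Assumption of this sketch: names without the suffix are rejected.
        raise ValueError(f"{source!r} does not end with {ZST_SUFFIX}")
    # Decompression: the trailing suffix is removed.
    return source[: -len(ZST_SUFFIX)]


print(derive_output_name("data.tar"))                       # data.tar.zst
print(derive_output_name("data.tar.zst", decompress=True))  # data.tar
```
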
### Concatenation with .zst files
It is possible to concatenate `.zst` files as is.
`zstd` will decompress such files as if they were a single `.zst` file.

OPTIONS
-------

### Integer suffixes and special values
In most places where an integer argument is expected,
an optional suffix is supported to easily indicate large integers.
There must be no space between the integer and the suffix.

* `KiB`:
    Multiply the integer by 1,024 (2\^10).
    `Ki`, `K`, and `KB` are accepted as synonyms for `KiB`.
* `MiB`:
    Multiply the integer by 1,048,576 (2\^20).
    `Mi`, `M`, and `MB` are accepted as synonyms for `MiB`.

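To make the suffix arithmetic concrete, here is a small Python sketch of how
such an argument could be interpreted. The function name `parse_size` is
hypothetical, and matching the suffix spellings exactly as listed above
(case-sensitive) is an assumption of this sketch; it does not reproduce
`zstd`'s own parser.

```python
import re

# Multipliers taken from the list above; "" means no suffix.
_MULTIPLIERS = {
    "": 1,
    "K": 1 << 10, "KB": 1 << 10, "Ki": 1 << 10, "KiB": 1 << 10,
    "M": 1 << 20, "MB": 1 << 20, "Mi": 1 << 20, "MiB": 1 << 20,
}


def parse_size(text: str) -> int:
    """Convert an integer with an optional suffix into a plain integer."""
    # No space is allowed between the digits and the suffix.
    match = re.fullmatch(r"(\d+)([A-Za-z]*)", text)
    if match is None or match.group(2) not in _MULTIPLIERS:
        raise ValueError(f"unsupported size argument: {text!r}")
    return int(match.group(1)) * _MULTIPLIERS[match.group(2)]


print(parse_size("4K"))      # 4096
print(parse_size("128MiB"))  # 134217728
```

A value written with a space, such as `128 MiB`, is rejected by this sketch,
mirroring the rule that the integer and the suffix must be adjacent.
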
### Operation mode
If multiple operation mode options are given,
the last one takes effect.

* `-z`, `--compress`:
    Compress.
    This is the default operation mode when no operation mode option is specified
    and no other operation mode is implied from the command name
    (for example, `unzstd` implies `--decompress`).
* `-d`, `--decompress`, `--uncompress`:
    Decompress.
* `-t`, `--test`:
    Test the integrity of compressed _files_.
    This option is equivalent to `--decompress --stdout` except that the
    decompressed data is discarded instead of being written to standard output.
    No files are created or removed.
* `-b#`:
    Benchmark file(s) using compression level #
* `--train FILEs`:
    Use FILEs as a training set to create a dictionary.
    The training set should contain a lot of small files (> 100).
* `-l`, `--list`:
    Display information related to a zstd compressed file, such as size, ratio, and checksum.
    Some of these fields may not be available.
    This command can be augmented with the `-v` modifier.

### Operation modifiers

* `-#`:
    `#` compression level \[1-19] (default: 3)
* `--ultra`:
    unlocks high compression levels 20+ (maximum 22), using a lot more memory.
    Note that decompression will also require more memory when using these levels.
* `--fast[=#]`:
    switch to ultra-fast compression levels.
    If `=#` is not present, it defaults to `1`.
    The higher the value, the faster the compression speed,
    at the cost of some compression ratio.
    This setting overwrites compression level if one was set previously.
    Similarly, if a compression level is set after `--fast`, it overrides it.
* `-T#`, `--threads=#`:
    Compress using `#` working threads (default: 1).
    If `#` is 0, attempt to detect and use the number of physical CPU cores.
    In all cases, the number of threads is capped to ZSTDMT_NBWORKERS_MAX==200.
    This modifier does nothing if `zstd` is compiled without multithread support.
* `--single-thread`:
    Do not spawn a thread for compression; use a single thread for both I/O and compression.
    In this mode, compression is serialized with I/O, which is slightly slower.
    (This is different from `-T1`, which spawns 1 compression thread in parallel with I/O).
    This mode is the only one available when multithread support is disabled.
    Single-thread mode features lower memory usage.
    Final compressed result is slightly different from `-T1`.
* `--adapt[=min=#,max=#]`:
    `zstd` will dynamically adapt compression level to perceived I/O conditions.
    Compression level adaptation can be observed live by using the `-v` option.
    Adaptation can be constrained between supplied `min` and `max` levels.
    The feature works when combined with multi-threading and `--long` mode.
    It does not work with `--single-thread`.
    It sets window size to 8 MB by default (can be changed manually, see `wlog`).
    Due to the chaotic nature of dynamic adaptation, compressed result is not reproducible.
    _note_: at the time of this writing, `--adapt` can remain stuck at low speed
    when combined with multiple worker threads (>=2).
* `--long[=#]`:
    enables long distance matching with `#` `windowLog`.
    If `#` is not present, it defaults to `27`.
    This increases the window size (`windowLog`) and memory usage for both the
    compressor and decompressor.
    This setting is designed to improve the compression ratio for files with
    long matches at a large distance.

    Note: If `windowLog` is set to larger than 27, `--long=windowLog` or
    `--memory=windowSize` needs to be passed to the decompressor.
* `-D DICT`:
    use `DICT` as Dictionary to compress or decompress FILE(s)
* `--patch-from FILE`:
    Specify the file to be used as a reference point for zstd's diff engine.
    This is effectively dictionary compression with some convenient parameter
    selection, namely that windowSize > srcSize.

    Note: cannot use both this and `-D` together.
    Note: `--long` mode will be automatically activated if chainLog < fileLog
        (fileLog being the windowLog required to cover the whole file). You
        can also manually force it.
    Note: for all levels, you can use `--patch-from` in `--single-thread` mode
        to improve compression ratio at the cost of speed.
    Note: for level 19, you can get increased compression ratio at the cost
        of speed by specifying `--zstd=targetLength=` to be something large
        (e.g. 4096), and by setting a large `--zstd=chainLog=`.
* `--rsyncable`:
    `zstd` will periodically synchronize the compression state to make the
    compressed file more rsync-friendly. There is a negligible impact to
    compression ratio, and the faster compression levels will see a small
    compression speed hit.
    This feature does not work with `--single-thread`. You probably don't want
    to use it with long range mode, since it will decrease the effectiveness of
    the synchronization points, but your mileage may vary.
* `-C`, `--[no-]check`:
    add integrity check computed from uncompressed data (default: enabled)
* `--[no-]content-size`:
    enable / disable whether or not the original size of the file is placed in
    the header of the compressed file. The default option is
    `--content-size` (meaning that the original size will be placed in the header).
* `--no-dictID`:
    do not store dictionary ID within frame header (dictionary compression).
    The decoder will have to rely on implicit knowledge about which dictionary to use;
    it won't be able to check if it's correct.
* `-M#`, `--memory=#`:
    Set a memory usage limit. By default, Zstandard uses 128 MB for decompression
    as the maximum amount of memory the decompressor is allowed to use, but you can
    override this manually if need be in either direction (i.e. you can increase or
    decrease it).

    This is also used during compression when used with `--patch-from`. In this case,
    this parameter overrides the maximum size allowed for a dictionary (128 MB).
* `--stream-size=#`:
    Sets the pledged source size of input coming from a stream. This value must be exact, as it
    will be included in the produced frame header. Incorrect stream sizes will cause an error.
    This information will be used to better optimize compression parameters, resulting in
    better and potentially faster compression, especially for smaller source sizes.
* `--size-hint=#`:
    When handling input from a stream, `zstd` must guess how large the source size
    will be when optimizing compression parameters. If the stream size is relatively
    small, this guess may be a poor one, resulting in a worse compression ratio than
    expected. This feature allows for controlling the guess when needed.
    Exact guesses result in better compression ratios. Overestimates result in slightly
    degraded compression ratios, while underestimates may result in significant degradation.
* `-o FILE`:
    save result into `FILE`
* `-f`, `--force`:
    overwrite output without prompting, and (de)compress symbolic links
* `-c`, `--stdout`:
    force write to standard output, even if it is the console
* `--[no-]sparse`:
    enable / disable sparse FS support,
    to make files with many zeroes smaller on disk.
    Creating sparse files may save disk space and speed up decompression by
    reducing the amount of disk I/O.
    default: enabled when output is into a file,
    and disabled when output is stdout.
    This setting overrides default and can force sparse mode over stdout.
* `--rm`:
    remove source file(s) after successful compression or decompression. If used in combination with
    `-o`, will trigger a confirmation prompt (which can be silenced with `-f`), as this is a destructive operation.
* `-k`, `--keep`:
    keep source file(s) after successful compression or decompression.
    This is the default behavior.
* `-r`:
    operate recursively on directories
* `--filelist FILE`:
    read a list of files to process as content from `FILE`.
    Format is compatible with `ls` output, with one file per line.
* `--output-dir-flat DIR`:
    resulting files are stored into target `DIR` directory,
    instead of same directory as origin file.
    Be aware that this option can introduce name collision issues,
    if multiple files, from different directories, end up having the same name.
    Collision resolution ensures first file with a given name will be present in `DIR`,
    while in combination with `-f`, the last file will be present instead.
* `--output-dir-mirror DIR`:
    similar to `--output-dir-flat`,
    the output files are stored underneath target `DIR` directory,
    but this option will replicate input directory hierarchy into output `DIR`.

    If input directory contains "..", the files in this directory will be ignored.
    If input directory is an absolute directory (e.g. "/var/tmp/abc"),
    it will be stored into "output-dir/var/tmp/abc".
    If there are multiple input files or directories,
    name collision resolution will follow the same rules as `--output-dir-flat`.
* `--format=FORMAT`:
    compress and decompress in other formats. If compiled with
    support, zstd can compress to or decompress from other compression algorithm
    formats. Possibly available options are `zstd`, `gzip`, `xz`, `lzma`, and `lz4`.
    If no such format is provided, `zstd` is the default.
* `-h`/`-H`, `--help`:
    display help/long help and exit
* `-V`, `--version`:
    display version number and exit.
    Advanced: `-vV` also displays supported formats.
    `-vvV` also displays POSIX support.
    `-q` will only display the version number, suitable for machine reading.
* `-v`, `--verbose`:
    verbose mode, display more information
* `-q`, `--quiet`:
    suppress warnings, interactivity, and notifications.
    specify twice to suppress errors too.
* `--no-progress`:
    do not display the progress bar, but keep all other messages.
* `--show-default-cparams`:
    Shows the default compression parameters that will be used for a
    particular src file. If the provided src file is not a regular file
    (e.g. a named pipe), the CLI will just output the default parameters.
    That is, the parameters that are used when the src size is unknown.
* `--`:
    All arguments after `--` are treated as files.

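As a rough illustration of the `--long` and `--memory` entries above: a
`windowLog` of _n_ corresponds to a window of 2\^_n_ bytes, so the default
`--long` value of 27 gives 128 MiB, which appears to line up with the default
decompression memory limit. The Python helpers below use hypothetical names
and only restate that arithmetic; they are not part of `zstd`.

```python
DEFAULT_LONG_WINDOW_LOG = 27  # default for --long, per the option text


def window_size_bytes(window_log: int) -> int:
    """Window size implied by a windowLog value: 2**window_log bytes."""
    return 1 << window_log


def decoder_needs_override(window_log: int) -> bool:
    """True when the note on --long applies (windowLog larger than 27)."""
    return window_log > DEFAULT_LONG_WINDOW_LOG


print(window_size_bytes(27) // (1 << 20))  # 128 (MiB)
print(decoder_needs_override(27))          # False
print(decoder_needs_override(30))          # True
```

In the last case, the decompressor would need `--long=30` or a matching
`--memory` value, as described in the note on `--long`.
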
### Restricted usage of Environment Variables

Using environment variables to set parameters has security implications.
Therefore, this avenue is intentionally restricted.
Only `ZSTD_CLEVEL` and `ZSTD_NBTHREADS` are currently supported.
They set the compression level and number of threads to use during compression, respectively.

`ZSTD_CLEVEL` can be used to set the level between 1 and 19 (the "normal" range).
If the value of `ZSTD_CLEVEL` is not a valid integer, it will be ignored with a warning message.
`ZSTD_CLEVEL` just replaces the default compression level (`3`).

`ZSTD_NBTHREADS` can be used to set the number of threads `zstd` will attempt to use during compression.
If the value of `ZSTD_NBTHREADS` is not a valid unsigned integer, it will be ignored with a warning message.
`ZSTD_NBTHREADS` has a default value of `1`, and is capped at ZSTDMT_NBWORKERS_MAX==200. `zstd` must be
compiled with multithread support for this to have any effect.

They can both be overridden by corresponding command line arguments:
`-#` for compression level and `-T#` for number of compression threads.


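The precedence described above (command line over environment variable over
built-in default) can be sketched as follows. This is only an illustration in
Python with a hypothetical function name, `effective_level`; it does not model
how `zstd` treats integer values outside the 1 to 19 range, and the warning
text is invented for the sketch.

```python
import sys
from typing import Mapping, Optional

DEFAULT_LEVEL = 3  # default compression level stated above


def effective_level(env: Mapping[str, str],
                    cli_level: Optional[int] = None) -> int:
    """Pick the compression level: CLI first, then ZSTD_CLEVEL, then default."""
    if cli_level is not None:
        # A `-#` argument overrides the environment variable.
        return cli_level
    raw = env.get("ZSTD_CLEVEL")
    if raw is None:
        return DEFAULT_LEVEL
    try:
        return int(raw)
    except ValueError:
        # Not a valid integer: ignored with a warning (wording is illustrative).
        print(f"warning: ignoring ZSTD_CLEVEL={raw!r}", file=sys.stderr)
        return DEFAULT_LEVEL


print(effective_level({}))                                 # 3
print(effective_level({"ZSTD_CLEVEL": "7"}))               # 7
print(effective_level({"ZSTD_CLEVEL": "7"}, cli_level=12)) # 12
```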
DICTIONARY BUILDER
------------------
`zstd` offers _dictionary_ compression,
which greatly improves efficiency on small files and messages.
It's possible to train `zstd` with a set of samples,
the result of which is saved into a file called a `dictionary`.
Then during compression and decompression, reference the same dictionary,
using the `-D dictionaryFileName` option.
Compression of small files similar to the sample set will be greatly improved.

* `--train FILEs`:
    Use FILEs as training set to create a dictionary.
    The training set should contain a lot of small files (> 100),
    and typically weigh 100x the target dictionary size
    (for example, 10 MB for a 100 KB dictionary).

    Supports multithreading if `zstd` is compiled with threading support.
    Additional parameters can be specified with `--train-fastcover`.
    The legacy dictionary builder can be accessed with `--train-legacy`.
    The cover dictionary builder can be accessed with `--train-cover`.
    Equivalent to `--train-fastcover=d=8,steps=4`.
* `-o file`:
    Dictionary saved into `file` (default name: dictionary).
* `--maxdict=#`:
    Limit dictionary to specified size (default: 112640).
* `-#`:
    Use `#` compression level during training (optional).
    Will generate statistics more tuned for selected compression level,
    resulting in a _small_ compression ratio improvement for this level.
* `-B#`:
    Split input files in blocks of size # (default: no split)
* `--dictID=#`:
    A dictionary ID is a locally unique ID that a decoder can use to verify it is
    using the right dictionary.
    By default, zstd will create a 4-byte random number ID.
    It's possible to give a precise number instead.
    Short numbers have an advantage: an ID < 256 will only need 1 byte in the
    compressed frame header, and an ID < 65536 will only need 2 bytes.
    This compares favorably to the 4-byte default.
    However, it's up to the dictionary manager not to assign the same ID to
    two different dictionaries.
* `--train-cover[=k=#,d=#,steps=#,split=#,shrink[=#]]`:
    Select parameters for the default dictionary builder algorithm named cover.
    If _d_ is not specified, then it tries _d_ = 6 and _d_ = 8.
    If _k_ is not specified, then it tries _steps_ values in the range [50, 2000].
    If _steps_ is not specified, then the default value of 40 is used.
    If _split_ is not specified or _split_ <= 0, then the default value of 100 is used.
    Requires that _d_ <= _k_.
    If _shrink_ flag is not used, then the default value for _shrinkDict_ of 0 is used.
    If _shrink_ is not specified, then the default value for _shrinkDictMaxRegression_ of 1 is used.

    Selects segments of size _k_ with highest score to put in the dictionary.
    The score of a segment is computed by the sum of the frequencies of all the
    subsegments of size _d_.
    Generally _d_ should be in the range [6, 8], occasionally up to 16, but the
    algorithm will run faster with _d_ <= 8.
    Good values for _k_ vary widely based on the input data, but a safe range is
    [2 * _d_, 2000].
    If _split_ is 100, all input samples are used for both training and testing
    to find optimal _d_ and _k_ to build dictionary.
    Supports multithreading if `zstd` is compiled with threading support.
    Having _shrink_ enabled takes a truncated dictionary of minimum size and doubles
    in size until compression ratio of the truncated dictionary is at most
    _shrinkDictMaxRegression%_ worse than the compression ratio of the largest dictionary.

    Examples:

    `zstd --train-cover FILEs`

    `zstd --train-cover=k=50,d=8 FILEs`

    `zstd --train-cover=d=8,steps=500 FILEs`

    `zstd --train-cover=k=50 FILEs`

    `zstd --train-cover=k=50,split=60 FILEs`

    `zstd --train-cover=shrink FILEs`

    `zstd --train-cover=shrink=2 FILEs`

* `--train-fastcover[=k=#,d=#,f=#,steps=#,split=#,accel=#]`:
    Same as cover but with extra parameters _f_ and _accel_ and a different default value of _split_.
    If _split_ is not specified, then it tries _split_ = 75.
    If _f_ is not specified, then it tries _f_ = 20.
    Requires that 0 < _f_ < 32.
    If _accel_ is not specified, then it tries _accel_ = 1.
    Requires that 0 < _accel_ <= 10.
    Requires that _d_ = 6 or _d_ = 8.

    _f_ is log of size of array that keeps track of frequency of subsegments of size _d_.
    The subsegment is hashed to an index in the range [0,2^_f_ - 1].
    It is possible that 2 different subsegments are hashed to the same index, and they are considered as the same subsegment when computing frequency.
    Using a higher _f_ reduces collisions but takes longer.

    Examples:

    `zstd --train-fastcover FILEs`

    `zstd --train-fastcover=d=8,f=15,accel=2 FILEs`

* `--train-legacy[=selectivity=#]`:
    Use legacy dictionary builder algorithm with the given dictionary
    _selectivity_ (default: 9).
    The smaller the _selectivity_ value, the denser the dictionary,
    improving its efficiency but reducing its possible maximum size.
    `--train-legacy=s=#` is also accepted.

    Examples:

    `zstd --train-legacy FILEs`

    `zstd --train-legacy=selectivity=8 FILEs`


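Two of the sizing rules in this section are simple enough to restate as
arithmetic: the frame-header cost of a dictionary ID (see `--dictID`) and the
suggested training-set size of about 100x the dictionary size (see `--train`).
The Python sketch below uses hypothetical helper names; treating only IDs from
1 to 2\^32 - 1 as valid is an assumption of the sketch.

```python
def dict_id_header_bytes(dict_id: int) -> int:
    """Bytes needed to store a dictionary ID in the compressed frame header."""
    if dict_id <= 0 or dict_id >= 1 << 32:
        # Assumption of this sketch: only IDs in 1 .. 2**32 - 1 are considered.
        raise ValueError(f"dictionary ID out of range: {dict_id}")
    if dict_id < 256:
        return 1
    if dict_id < 65536:
        return 2
    return 4


def suggested_training_bytes(dict_size: int) -> int:
    """Rule of thumb from --train: about 100x the target dictionary size."""
    return 100 * dict_size


print(dict_id_header_bytes(200))         # 1
print(dict_id_header_bytes(40000))       # 2
print(dict_id_header_bytes(70000))       # 4
print(suggested_training_bytes(112640))  # 11264000
```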
405*22ce4affSfengbojiangBENCHMARK
406*22ce4affSfengbojiang---------
407*22ce4affSfengbojiang
408*22ce4affSfengbojiang* `-b#`:
409*22ce4affSfengbojiang    benchmark file(s) using compression level #
410*22ce4affSfengbojiang* `-e#`:
411*22ce4affSfengbojiang    benchmark file(s) using multiple compression levels, from `-b#` to `-e#` (inclusive)
412*22ce4affSfengbojiang* `-i#`:
413*22ce4affSfengbojiang    minimum evaluation time, in seconds (default: 3s), benchmark mode only
414*22ce4affSfengbojiang* `-B#`, `--block-size=#`:
415*22ce4affSfengbojiang    cut file(s) into independent blocks of size # (default: no block)
416*22ce4affSfengbojiang* `--priority=rt`:
417*22ce4affSfengbojiang    set process priority to real-time
418*22ce4affSfengbojiang
419*22ce4affSfengbojiang**Output Format:** CompressionLevel#Filename : IntputSize -> OutputSize (CompressionRatio), CompressionSpeed, DecompressionSpeed
420*22ce4affSfengbojiang
421*22ce4affSfengbojiang**Methodology:** For both compression and decompression speed, the entire input is compressed/decompressed in-memory to measure speed. A run lasts at least 1 sec, so when files are small, they are compressed/decompressed several times per run, in order to improve measurement accuracy.
422*22ce4affSfengbojiang
423*22ce4affSfengbojiangADVANCED COMPRESSION OPTIONS
424*22ce4affSfengbojiang----------------------------
425*22ce4affSfengbojiang### --zstd[=options]:
426*22ce4affSfengbojiang`zstd` provides 22 predefined compression levels.
427*22ce4affSfengbojiangThe selected or default predefined compression level can be changed with
428*22ce4affSfengbojiangadvanced compression options.
429*22ce4affSfengbojiangThe _options_ are provided as a comma-separated list.
430*22ce4affSfengbojiangYou may specify only the options you want to change and the rest will be
431*22ce4affSfengbojiangtaken from the selected or default compression level.
432*22ce4affSfengbojiangThe list of available _options_:
433*22ce4affSfengbojiang
434*22ce4affSfengbojiang- `strategy`=_strat_, `strat`=_strat_:
435*22ce4affSfengbojiang    Specify a strategy used by a match finder.
436*22ce4affSfengbojiang
437*22ce4affSfengbojiang    There are 9 strategies numbered from 1 to 9, from faster to stronger:
438*22ce4affSfengbojiang    1=ZSTD\_fast, 2=ZSTD\_dfast, 3=ZSTD\_greedy,
439*22ce4affSfengbojiang    4=ZSTD\_lazy, 5=ZSTD\_lazy2, 6=ZSTD\_btlazy2,
440*22ce4affSfengbojiang    7=ZSTD\_btopt, 8=ZSTD\_btultra, 9=ZSTD\_btultra2.
441*22ce4affSfengbojiang
442*22ce4affSfengbojiang- `windowLog`=_wlog_, `wlog`=_wlog_:
443*22ce4affSfengbojiang    Specify the maximum number of bits for a match distance.
444*22ce4affSfengbojiang
445*22ce4affSfengbojiang    The higher number of increases the chance to find a match which usually
446*22ce4affSfengbojiang    improves compression ratio.
447*22ce4affSfengbojiang    It also increases memory requirements for the compressor and decompressor.
448*22ce4affSfengbojiang    The minimum _wlog_ is 10 (1 KiB) and the maximum is 30 (1 GiB) on 32-bit
449*22ce4affSfengbojiang    platforms and 31 (2 GiB) on 64-bit platforms.
450*22ce4affSfengbojiang
    Note: If `windowLog` is set to a value larger than 27, `--long=windowLog` or
    `--memory=windowSize` needs to be passed to the decompressor.
453*22ce4affSfengbojiang
454*22ce4affSfengbojiang- `hashLog`=_hlog_, `hlog`=_hlog_:
455*22ce4affSfengbojiang    Specify the maximum number of bits for a hash table.
456*22ce4affSfengbojiang
    Bigger hash tables cause fewer collisions, which usually makes compression
    faster, but they require more memory during compression.
459*22ce4affSfengbojiang
460*22ce4affSfengbojiang    The minimum _hlog_ is 6 (64 B) and the maximum is 30 (1 GiB).
461*22ce4affSfengbojiang
462*22ce4affSfengbojiang- `chainLog`=_clog_, `clog`=_clog_:
463*22ce4affSfengbojiang    Specify the maximum number of bits for a hash chain or a binary tree.
464*22ce4affSfengbojiang
    Higher numbers of bits increase the chance of finding a match, which usually
    improves compression ratio.
    It also slows down compression speed and increases memory requirements for
    compression.
    This option is ignored for the ZSTD\_fast strategy.
470*22ce4affSfengbojiang
    The minimum _clog_ is 6 (64 B) and the maximum is 29 (512 MiB) on 32-bit platforms
    and 30 (1 GiB) on 64-bit platforms.
473*22ce4affSfengbojiang
474*22ce4affSfengbojiang- `searchLog`=_slog_, `slog`=_slog_:
475*22ce4affSfengbojiang    Specify the maximum number of searches in a hash chain or a binary tree
476*22ce4affSfengbojiang    using logarithmic scale.
477*22ce4affSfengbojiang
    More searches increase the chance of finding a match, which usually increases
    compression ratio but decreases compression speed.

    The minimum _slog_ is 1 and the maximum is `windowLog` - 1.
482*22ce4affSfengbojiang
483*22ce4affSfengbojiang- `minMatch`=_mml_, `mml`=_mml_:
484*22ce4affSfengbojiang    Specify the minimum searched length of a match in a hash table.
485*22ce4affSfengbojiang
486*22ce4affSfengbojiang    Larger search lengths usually decrease compression ratio but improve
487*22ce4affSfengbojiang    decompression speed.
488*22ce4affSfengbojiang
489*22ce4affSfengbojiang    The minimum _mml_ is 3 and the maximum is 7.
490*22ce4affSfengbojiang
491*22ce4affSfengbojiang- `targetLength`=_tlen_, `tlen`=_tlen_:
    The impact of this field varies depending on the selected strategy.

    For ZSTD\_btopt, ZSTD\_btultra and ZSTD\_btultra2, it specifies
    the minimum match length that causes the match finder to stop searching.
    A larger `targetLength` usually improves compression ratio
    but decreases compression speed.

    For ZSTD\_fast, it triggers ultra-fast mode when > 0.
    The value represents the amount of data skipped between match sampling.
    The impact is reversed: a larger `targetLength` increases compression speed
    but decreases compression ratio.
503*22ce4affSfengbojiang
504*22ce4affSfengbojiang    For all other strategies, this field has no impact.
505*22ce4affSfengbojiang
    The minimum _tlen_ is 0 and the maximum is 128 KiB.
507*22ce4affSfengbojiang
508*22ce4affSfengbojiang- `overlapLog`=_ovlog_,  `ovlog`=_ovlog_:
    Determine `overlapSize`, the amount of data reloaded from the previous job.
    This parameter is only available when multithreading is enabled.
    Reloading more data improves compression ratio, but decreases speed.

    The minimum _ovlog_ is 0, and the maximum is 9.
    1 means "no overlap", hence completely independent jobs.
    9 means "full overlap", meaning up to `windowSize` is reloaded from the previous job.
    Reducing _ovlog_ by 1 reduces the reloaded amount by a factor of 2.
    For example, 8 means "windowSize/2", and 6 means "windowSize/8".
    Value 0 is special and means "default": _ovlog_ is automatically determined by `zstd`,
    in which case it ranges from 6 to 9, depending on the selected _strat_.
520*22ce4affSfengbojiang
521*22ce4affSfengbojiang- `ldmHashLog`=_lhlog_, `lhlog`=_lhlog_:
522*22ce4affSfengbojiang    Specify the maximum size for a hash table used for long distance matching.
523*22ce4affSfengbojiang
524*22ce4affSfengbojiang    This option is ignored unless long distance matching is enabled.
525*22ce4affSfengbojiang
526*22ce4affSfengbojiang    Bigger hash tables usually improve compression ratio at the expense of more
527*22ce4affSfengbojiang    memory during compression and a decrease in compression speed.
528*22ce4affSfengbojiang
529*22ce4affSfengbojiang    The minimum _lhlog_ is 6 and the maximum is 30 (default: 20).
530*22ce4affSfengbojiang
531*22ce4affSfengbojiang- `ldmMinMatch`=_lmml_, `lmml`=_lmml_:
532*22ce4affSfengbojiang    Specify the minimum searched length of a match for long distance matching.
533*22ce4affSfengbojiang
534*22ce4affSfengbojiang    This option is ignored unless long distance matching is enabled.
535*22ce4affSfengbojiang
    Both larger and very small values usually decrease compression ratio.
537*22ce4affSfengbojiang
538*22ce4affSfengbojiang    The minimum _lmml_ is 4 and the maximum is 4096 (default: 64).
539*22ce4affSfengbojiang
540*22ce4affSfengbojiang- `ldmBucketSizeLog`=_lblog_, `lblog`=_lblog_:
541*22ce4affSfengbojiang    Specify the size of each bucket for the hash table used for long distance
542*22ce4affSfengbojiang    matching.
543*22ce4affSfengbojiang
544*22ce4affSfengbojiang    This option is ignored unless long distance matching is enabled.
545*22ce4affSfengbojiang
546*22ce4affSfengbojiang    Larger bucket sizes improve collision resolution but decrease compression
547*22ce4affSfengbojiang    speed.
548*22ce4affSfengbojiang
549*22ce4affSfengbojiang    The minimum _lblog_ is 1 and the maximum is 8 (default: 3).
550*22ce4affSfengbojiang
551*22ce4affSfengbojiang- `ldmHashRateLog`=_lhrlog_, `lhrlog`=_lhrlog_:
552*22ce4affSfengbojiang    Specify the frequency of inserting entries into the long distance matching
553*22ce4affSfengbojiang    hash table.
554*22ce4affSfengbojiang
555*22ce4affSfengbojiang    This option is ignored unless long distance matching is enabled.
556*22ce4affSfengbojiang
557*22ce4affSfengbojiang    Larger values will improve compression speed. Deviating far from the
558*22ce4affSfengbojiang    default value will likely result in a decrease in compression ratio.
559*22ce4affSfengbojiang
560*22ce4affSfengbojiang    The default value is `wlog - lhlog`.
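
As a rough illustration of how the log-scaled parameters above map to sizes,
the following shell sketch converts a bit count to a size and applies the
`windowLog` note about values larger than 27.
The helper names `log_to_size` and `needs_long_flag` are hypothetical and are
not part of `zstd`.

```shell
#!/bin/sh
# Hypothetical helpers (not part of zstd): convert a log2 parameter to a size.

# log_to_size LOG -> prints 2^LOG (e.g. wlog=10 -> 1024, i.e. 1 KiB)
log_to_size() {
    echo $(( 1 << $1 ))
}

# needs_long_flag WLOG -> prints "yes" when the decompressor must be given
# --long=WLOG or --memory=windowSize, i.e. when WLOG is larger than 27.
needs_long_flag() {
    if [ "$1" -gt 27 ]; then echo yes; else echo no; fi
}

log_to_size 10        # minimum windowLog: prints 1024
log_to_size 27        # largest windowLog that needs no extra flag
needs_long_flag 30    # prints yes
```

The same conversion applies to `hashLog` and `chainLog`, whose limits are
quoted above as both a bit count and a size.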
561*22ce4affSfengbojiang
562*22ce4affSfengbojiang### Example
The following parameters set advanced compression options to something
similar to predefined level 19 for files bigger than 256 KB:
565*22ce4affSfengbojiang
566*22ce4affSfengbojiang`--zstd`=wlog=23,clog=23,hlog=22,slog=6,mml=3,tlen=48,strat=6
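
A minimal sketch of using this option string in practice: it compresses a
generated sample file with the options above, decompresses the result, and
compares it with the input.
The function name `roundtrip_demo` and the temporary file names are
placeholders; the sketch assumes `zstd`, `mktemp` and `cmp` are available, and
skips the demo when `zstd` is not installed.

```shell
#!/bin/sh
# Sketch: compress with the advanced options from the example, then verify
# that decompression restores the input. File names are placeholders.
roundtrip_demo() {
    tmpdir=$(mktemp -d) || return 1

    # Build a compressible sample input (placeholder data).
    i=0
    while [ "$i" -lt 2000 ]; do
        echo "sample line $i: the quick brown fox jumps over the lazy dog"
        i=$((i + 1))
    done > "$tmpdir/input.txt"

    # Compress, decompress, and compare; stop at the first failing step.
    zstd -q --zstd=wlog=23,clog=23,hlog=22,slog=6,mml=3,tlen=48,strat=6 \
        "$tmpdir/input.txt" -o "$tmpdir/input.txt.zst" &&
    zstd -q -d "$tmpdir/input.txt.zst" -o "$tmpdir/output.txt" &&
    cmp -s "$tmpdir/input.txt" "$tmpdir/output.txt"
    status=$?

    rm -rf "$tmpdir"
    return "$status"
}

if command -v zstd >/dev/null 2>&1; then
    roundtrip_demo && echo "round trip OK"
else
    echo "zstd not found; skipping demo"
fi
```

Because `wlog=23` does not exceed 27, the decompression step needs no
`--long` or `--memory` option.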
567*22ce4affSfengbojiang
568*22ce4affSfengbojiang### -B#:
569*22ce4affSfengbojiangSelect the size of each compression job.
570*22ce4affSfengbojiangThis parameter is available only when multi-threading is enabled.
571*22ce4affSfengbojiangDefault value is `4 * windowSize`, which means it varies depending on compression level.
572*22ce4affSfengbojiang`-B#` makes it possible to select a custom value.
573*22ce4affSfengbojiangNote that job size must respect a minimum value which is enforced transparently.
This minimum is either 1 MB, or `overlapSize`, whichever is larger.
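
The job-size rules above can be sketched as shell arithmetic.
This is an illustration of the description only, not `zstd`'s actual code:
the helper names `overlap_size` and `effective_job_size` are hypothetical,
"1 MB" is assumed to mean 1048576 bytes, "enforced transparently" is read as
raising a too-small value to the minimum, and the automatic `ovlog` value 0 is
not modelled.

```shell
#!/bin/sh
# Hypothetical helpers (not part of zstd) mirroring the rules described above.
# Assumption: "1 MB" is taken as 1048576 bytes.

# overlap_size WLOG OVLOG -> bytes reloaded from the previous job.
# OVLOG must be 1..9; 0 ("default") is chosen by zstd and is not modelled here.
overlap_size() {
    wlog=$1; ovlog=$2
    if [ "$ovlog" -lt 1 ] || [ "$ovlog" -gt 9 ]; then
        echo "overlap_size: ovlog must be between 1 and 9" >&2
        return 1
    fi
    if [ "$ovlog" -eq 1 ]; then
        echo 0                                  # no overlap
    else
        echo $(( (1 << wlog) >> (9 - ovlog) ))  # windowSize / 2^(9 - ovlog)
    fi
}

# effective_job_size WLOG OVLOG [REQUESTED] -> job size in bytes.
# Without REQUESTED, the default 4 * windowSize is used; the result is then
# raised to the minimum: the larger of 1 MB and overlapSize.
effective_job_size() {
    wlog=$1
    overlap=$(overlap_size "$1" "$2") || return 1
    size=${3:-$(( 4 << wlog ))}
    min=1048576
    if [ "$overlap" -gt "$min" ]; then min=$overlap; fi
    if [ "$size" -lt "$min" ]; then size=$min; fi
    echo "$size"
}

effective_job_size 23 9          # default: 4 * 8 MiB = 33554432
effective_job_size 23 9 65536    # too small: raised to overlapSize, 8388608
```

With `wlog=23` and full overlap, a requested job size of 64 KiB falls below
`overlapSize`, so under these assumptions the minimum applies instead.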
575*22ce4affSfengbojiang
576*22ce4affSfengbojiangBUGS
577*22ce4affSfengbojiang----
578*22ce4affSfengbojiangReport bugs at: https://github.com/facebook/zstd/issues
579*22ce4affSfengbojiang
580*22ce4affSfengbojiangAUTHOR
581*22ce4affSfengbojiang------
582*22ce4affSfengbojiangYann Collet