zstd(1) -- zstd, zstdmt, unzstd, zstdcat - Compress or decompress .zst files
============================================================================

SYNOPSIS
--------

`zstd` [*OPTIONS*] [-|_INPUT-FILE_] [-o _OUTPUT-FILE_]

`zstdmt` is equivalent to `zstd -T0`

`unzstd` is equivalent to `zstd -d`

`zstdcat` is equivalent to `zstd -dcf`


DESCRIPTION
-----------
`zstd` is a fast lossless compression algorithm and data compression tool,
with command line syntax similar to `gzip (1)` and `xz (1)`.
It is based on the **LZ77** family, with further FSE & huff0 entropy stages.
`zstd` offers highly configurable compression speed,
with fast modes at > 200 MB/s per core,
and strong modes nearing lzma compression ratios.
It also features a very fast decoder, with speeds > 500 MB/s per core.

`zstd` command line syntax is generally similar to gzip,
but features the following differences:

  - Source files are preserved by default.
    It's possible to remove them automatically by using the `--rm` command.
  - When compressing a single file, `zstd` displays progress notifications
    and result summary by default.
    Use `-q` to turn them off.
  - `zstd` does not accept input from console,
    but it properly accepts `stdin` when it's not the console.
  - `zstd` displays a short help page when the command line is invalid.
    Use `-q` to turn it off.

`zstd` compresses or decompresses each _file_ according to the selected
operation mode.
If no _files_ are given or _file_ is `-`, `zstd` reads from standard input
and writes the processed data to standard output.
`zstd` will refuse to write compressed data to standard output
if it is a terminal: it will display an error message and skip the _file_.
Similarly, `zstd` will refuse to read compressed data from standard input
if it is a terminal.

Unless `--stdout` or `-o` is specified, _files_ are written to a new file
whose name is derived from the source _file_ name:

* When compressing, the suffix `.zst` is appended to the source filename to
  get the target filename.
* When decompressing, the `.zst` suffix is removed from the source filename to
  get the target filename.

### Concatenation with .zst files
It is possible to concatenate `.zst` files as is.
`zstd` will decompress such files as if they were a single `.zst` file.
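
As an illustration of the default naming rule described earlier in this section (append `.zst` when compressing, strip it when decompressing), here is a minimal Python sketch. The helper name `target_name` is hypothetical and not part of `zstd`. The sketch only covers the `.zst` suffix, and raising an error for any other name is an assumption of the sketch, not documented `zstd` behavior.

```python
def target_name(source: str, decompress: bool = False, suffix: str = ".zst") -> str:
    """Hypothetical helper: derive the default output name from a source name."""
    if not decompress:
        # Compression: the suffix is appended to the source filename.
        return source + suffix
    # Decompression: the suffix is removed from the source filename.
    # Assumption of this sketch: names without the suffix are rejected.
    if not source.endswith(suffix) or source == suffix:
        raise ValueError(f"{source!r} does not end with {suffix!r}")
    return source[: -len(suffix)]


if __name__ == "__main__":
    print(target_name("archive.tar"))                       # compress
    print(target_name("archive.tar.zst", decompress=True))  # decompress
```

With `--stdout` or `-o`, this derivation does not apply, since the destination is given explicitly.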

OPTIONS
-------

### Integer suffixes and special values
In most places where an integer argument is expected,
an optional suffix is supported to easily indicate large integers.
There must be no space between the integer and the suffix.

* `KiB`:
    Multiply the integer by 1,024 (2\^10).
    `Ki`, `K`, and `KB` are accepted as synonyms for `KiB`.
* `MiB`:
    Multiply the integer by 1,048,576 (2\^20).
    `Mi`, `M`, and `MB` are accepted as synonyms for `MiB`.

### Operation mode
If multiple operation mode options are given,
the last one takes effect.

* `-z`, `--compress`:
    Compress.
    This is the default operation mode when no operation mode option is specified
    and no other operation mode is implied from the command name
    (for example, `unzstd` implies `--decompress`).
* `-d`, `--decompress`, `--uncompress`:
    Decompress.
* `-t`, `--test`:
    Test the integrity of compressed _files_.
    This option is equivalent to `--decompress --stdout` except that the
    decompressed data is discarded instead of being written to standard output.
    No files are created or removed.
* `-b#`:
    Benchmark file(s) using compression level #.
* `--train FILEs`:
    Use FILEs as a training set to create a dictionary.
    The training set should contain a lot of small files (> 100).
* `-l`, `--list`:
    Display information related to a zstd compressed file, such as size, ratio, and checksum.
    Some of these fields may not be available.
    This command can be augmented with the `-v` modifier.

### Operation modifiers

* `-#`:
    `#` compression level \[1-19] (default: 3)
* `--ultra`:
    unlocks high compression levels 20+ (maximum 22), using a lot more memory.
    Note that decompression will also require more memory when using these levels.
* `--fast[=#]`:
    switch to ultra-fast compression levels.
    If `=#` is not present, it defaults to `1`.
    The higher the value, the faster the compression speed,
    at the cost of some compression ratio.
    This setting overrides the compression level if one was set previously.
    Similarly, if a compression level is set after `--fast`, it overrides it.
* `-T#`, `--threads=#`:
    Compress using `#` working threads (default: 1).
    If `#` is 0, attempt to detect and use the number of physical CPU cores.
    In all cases, the number of threads is capped to ZSTDMT_NBWORKERS_MAX==200.
    This modifier does nothing if `zstd` is compiled without multithread support.
* `--single-thread`:
    Does not spawn a thread for compression; uses a single thread for both I/O and compression.
    In this mode, compression is serialized with I/O, which is slightly slower.
    (This is different from `-T1`, which spawns 1 compression thread in parallel with I/O).
    This mode is the only one available when multithread support is disabled.
    Single-thread mode features lower memory usage.
    Final compressed result is slightly different from `-T1`.
* `--adapt[=min=#,max=#]`:
    `zstd` will dynamically adapt compression level to perceived I/O conditions.
    Compression level adaptation can be observed live by using command `-v`.
    Adaptation can be constrained between supplied `min` and `max` levels.
    The feature works when combined with multi-threading and `--long` mode.
    It does not work with `--single-thread`.
    It sets window size to 8 MB by default (can be changed manually, see `wlog`).
    Due to the chaotic nature of dynamic adaptation, compressed result is not reproducible.
    _note_: at the time of this writing, `--adapt` can remain stuck at low speed
    when combined with multiple worker threads (>=2).
* `--long[=#]`:
    enables long distance matching with `#` `windowLog`; if `#` is not
    present, it defaults to `27`.
    This increases the window size (`windowLog`) and memory usage for both the
    compressor and decompressor.
    This setting is designed to improve the compression ratio for files with
    long matches at a large distance.

    Note: If `windowLog` is set to larger than 27, `--long=windowLog` or
    `--memory=windowSize` needs to be passed to the decompressor.
* `-D DICT`:
    use `DICT` as Dictionary to compress or decompress FILE(s)
* `--patch-from FILE`:
    Specify the file to be used as a reference point for zstd's diff engine.
    This is effectively dictionary compression with some convenient parameter
    selection, namely that windowSize > srcSize.

    Note: this option cannot be used together with `-D`.
    Note: `--long` mode will be automatically activated if chainLog < fileLog
        (fileLog being the windowLog required to cover the whole file). You
        can also manually force it.
    Note: for all levels, you can use `--patch-from` in `--single-thread` mode
        to improve compression ratio at the cost of speed.
    Note: for level 19, you can get increased compression ratio at the cost
        of speed by specifying `--zstd=targetLength=` to be something large
        (e.g. 4096), and by setting a large `--zstd=chainLog=`.
* `--rsyncable`:
    `zstd` will periodically synchronize the compression state to make the
    compressed file more rsync-friendly. There is a negligible impact to
    compression ratio, and the faster compression levels will see a small
    compression speed hit.
    This feature does not work with `--single-thread`. You probably don't want
    to use it with long range mode, since it will decrease the effectiveness of
    the synchronization points, but your mileage may vary.
* `-C`, `--[no-]check`:
    add integrity check computed from uncompressed data (default: enabled)
* `--[no-]content-size`:
    enable / disable whether or not the original size of the file is placed in
    the header of the compressed file. The default option is
    `--content-size` (meaning that the original size will be placed in the header).
* `--no-dictID`:
    do not store dictionary ID within frame header (dictionary compression).
    The decoder will have to rely on implicit knowledge about which dictionary to use;
    it won't be able to check if it's correct.
* `-M#`, `--memory=#`:
    Set a memory usage limit. By default, Zstandard uses 128 MB for decompression
    as the maximum amount of memory the decompressor is allowed to use, but you can
    override this manually if need be in either direction (i.e. you can increase or
    decrease it).

    This is also used during compression when used with `--patch-from=`. In this case,
    this parameter overrides the maximum size allowed for a dictionary (128 MB).
* `--stream-size=#`:
    Sets the pledged source size of input coming from a stream. This value must be exact, as it
    will be included in the produced frame header. Incorrect stream sizes will cause an error.
    This information will be used to better optimize compression parameters, resulting in
    better and potentially faster compression, especially for smaller source sizes.
* `--size-hint=#`:
    When handling input from a stream, `zstd` must guess how large the source size
    will be when optimizing compression parameters. If the stream size is relatively
    small, this guess may be a poor one, resulting in a worse compression ratio than
    expected. This feature allows for controlling the guess when needed.
    Exact guesses result in better compression ratios. Overestimates result in slightly
    degraded compression ratios, while underestimates may result in significant degradation.
* `-o FILE`:
    save result into `FILE`
* `-f`, `--force`:
    overwrite output without prompting, and (de)compress symbolic links
* `-c`, `--stdout`:
    force write to standard output, even if it is the console
* `--[no-]sparse`:
    enable / disable sparse FS support,
    to make files with many zeroes smaller on disk.
    Creating sparse files may save disk space and speed up decompression by
    reducing the amount of disk I/O.
    default: enabled when output is into a file,
    and disabled when output is stdout.
    This setting overrides default and can force sparse mode over stdout.
* `--rm`:
    remove source file(s) after successful compression or decompression.
    If used in combination with `-o`, this triggers a confirmation prompt
    (which can be silenced with `-f`), as this is a destructive operation.
* `-k`, `--keep`:
    keep source file(s) after successful compression or decompression.
    This is the default behavior.
* `-r`:
    operate recursively on directories
* `--filelist FILE`:
    read a list of files to process as content from `FILE`.
    Format is compatible with `ls` output, with one file per line.
* `--output-dir-flat DIR`:
    resulting files are stored into target `DIR` directory,
    instead of same directory as origin file.
    Be aware that this command can introduce name collision issues,
    if multiple files, from different directories, end up having the same name.
    Collision resolution ensures the first file with a given name will be present in `DIR`,
    while in combination with `-f`, the last file will be present instead.
* `--output-dir-mirror DIR`:
    similar to `--output-dir-flat`,
    the output files are stored underneath target `DIR` directory,
    but this option will replicate input directory hierarchy into output `DIR`.

    If input directory contains "..", the files in this directory will be ignored.
    If input directory is an absolute directory (e.g. "/var/tmp/abc"),
    it will be stored into the "output-dir/var/tmp/abc".
    If there are multiple input files or directories,
    name collision resolution will follow the same rules as `--output-dir-flat`.
* `--format=FORMAT`:
    compress and decompress in other formats. If compiled with
    support, zstd can compress to or decompress from other compression algorithm
    formats. Possibly available options are `zstd`, `gzip`, `xz`, `lzma`, and `lz4`.
    If no such format is provided, `zstd` is the default.
* `-h`/`-H`, `--help`:
    display help/long help and exit
* `-V`, `--version`:
    display version number and exit.
    Advanced: `-vV` also displays supported formats.
    `-vvV` also displays POSIX support.
    `-q` will only display the version number, suitable for machine reading.
* `-v`, `--verbose`:
    verbose mode, display more information
* `-q`, `--quiet`:
    suppress warnings, interactivity, and notifications.
    specify twice to suppress errors too.
* `--no-progress`:
    do not display the progress bar, but keep all other messages.
* `--show-default-cparams`:
    Shows the default compression parameters that will be used for a
    particular src file. If the provided src file is not a regular file
    (e.g. a named pipe), the cli will just output the default parameters.
    That is, the parameters that are used when the src size is unknown.
* `--`:
    All arguments after `--` are treated as files

### Restricted usage of Environment Variables

Using environment variables to set parameters has security implications.
Therefore, this avenue is intentionally restricted.
Only `ZSTD_CLEVEL` and `ZSTD_NBTHREADS` are currently supported.
They set the compression level and number of threads to use during compression, respectively.

`ZSTD_CLEVEL` can be used to set the level between 1 and 19 (the "normal" range).
If the value of `ZSTD_CLEVEL` is not a valid integer, it will be ignored with a warning message.
`ZSTD_CLEVEL` just replaces the default compression level (`3`).

`ZSTD_NBTHREADS` can be used to set the number of threads `zstd` will attempt to use during compression.
If the value of `ZSTD_NBTHREADS` is not a valid unsigned integer, it will be ignored with a warning message.
`ZSTD_NBTHREADS` has a default value of `1`, and is capped at ZSTDMT_NBWORKERS_MAX==200. `zstd` must be
compiled with multithread support for this to have any effect.

They can both be overridden by corresponding command line arguments:
`-#` for compression level and `-T#` for number of compression threads.


DICTIONARY BUILDER
------------------
`zstd` offers _dictionary_ compression,
which greatly improves efficiency on small files and messages.
It's possible to train `zstd` with a set of samples,
the result of which is saved into a file called a `dictionary`.
Then during compression and decompression, reference the same dictionary,
using command `-D dictionaryFileName`.
Compression of small files similar to the sample set will be greatly improved.

* `--train FILEs`:
    Use FILEs as training set to create a dictionary.
    The training set should contain a lot of small files (> 100),
    and weigh typically 100x the target dictionary size
    (for example, 10 MB for a 100 KB dictionary).

    Supports multithreading if `zstd` is compiled with threading support.
    Additional parameters can be specified with `--train-fastcover`.
    The legacy dictionary builder can be accessed with `--train-legacy`.
    The cover dictionary builder can be accessed with `--train-cover`.
    Default `--train` is equivalent to `--train-fastcover=d=8,steps=4`.
* `-o file`:
    Dictionary saved into `file` (default name: dictionary).
* `--maxdict=#`:
    Limit dictionary to specified size (default: 112640).
* `-#`:
    Use `#` compression level during training (optional).
    Will generate statistics more tuned for selected compression level,
    resulting in a _small_ compression ratio improvement for this level.
* `-B#`:
    Split input files in blocks of size # (default: no split)
* `--dictID=#`:
    A dictionary ID is a locally unique ID that a decoder can use to verify it is
    using the right dictionary.
    By default, zstd will create a 4-byte random number ID.
    It's possible to give a precise number instead.
    Short numbers have an advantage: an ID < 256 will only need 1 byte in the
    compressed frame header, and an ID < 65536 will only need 2 bytes.
    This compares favorably to the 4-byte default.
    However, it's up to the dictionary manager not to assign the same ID to
    two different dictionaries.
* `--train-cover[=k=#,d=#,steps=#,split=#,shrink[=#]]`:
    Select parameters for the default dictionary builder algorithm named cover.
    If _d_ is not specified, then it tries _d_ = 6 and _d_ = 8.
    If _k_ is not specified, then it tries _steps_ values in the range [50, 2000].
    If _steps_ is not specified, then the default value of 40 is used.
    If _split_ is not specified or split <= 0, then the default value of 100 is used.
    Requires that _d_ <= _k_.
    If _shrink_ flag is not used, then the default value for _shrinkDict_ of 0 is used.
    If _shrink_ is not specified, then the default value for _shrinkDictMaxRegression_ of 1 is used.

    Selects segments of size _k_ with highest score to put in the dictionary.
    The score of a segment is computed by the sum of the frequencies of all the
    subsegments of size _d_.
    Generally _d_ should be in the range [6, 8], occasionally up to 16, but the
    algorithm will run faster with _d_ <= 8.
    Good values for _k_ vary widely based on the input data, but a safe range is
    [2 * _d_, 2000].
    If _split_ is 100, all input samples are used for both training and testing
    to find optimal _d_ and _k_ to build dictionary.
    Supports multithreading if `zstd` is compiled with threading support.
    Having _shrink_ enabled takes a truncated dictionary of minimum size and doubles
    its size until the compression ratio of the truncated dictionary is at most
    _shrinkDictMaxRegression%_ worse than the compression ratio of the largest dictionary.

    Examples:

    `zstd --train-cover FILEs`

    `zstd --train-cover=k=50,d=8 FILEs`

    `zstd --train-cover=d=8,steps=500 FILEs`

    `zstd --train-cover=k=50 FILEs`

    `zstd --train-cover=k=50,split=60 FILEs`

    `zstd --train-cover=shrink FILEs`

    `zstd --train-cover=shrink=2 FILEs`

* `--train-fastcover[=k=#,d=#,f=#,steps=#,split=#,accel=#]`:
    Same as cover but with extra parameters _f_ and _accel_ and a different default value of _split_.
    If _split_ is not specified, then it tries _split_ = 75.
    If _f_ is not specified, then it tries _f_ = 20.
    Requires that 0 < _f_ < 32.
    If _accel_ is not specified, then it tries _accel_ = 1.
    Requires that 0 < _accel_ <= 10.
    Requires that _d_ = 6 or _d_ = 8.

    _f_ is log of size of array that keeps track of frequency of subsegments of size _d_.
    The subsegment is hashed to an index in the range [0,2^_f_ - 1].
    It is possible that 2 different subsegments are hashed to the same index, and they are considered as the same subsegment when computing frequency.
    Using a higher _f_ reduces collisions but takes longer.

    Examples:

    `zstd --train-fastcover FILEs`

    `zstd --train-fastcover=d=8,f=15,accel=2 FILEs`

* `--train-legacy[=selectivity=#]`:
    Use legacy dictionary builder algorithm with the given dictionary
    _selectivity_ (default: 9).
    The smaller the _selectivity_ value, the denser the dictionary,
    improving its efficiency but reducing its possible maximum size.
    `--train-legacy=s=#` is also accepted.

    Examples:

    `zstd --train-legacy FILEs`

    `zstd --train-legacy=selectivity=8 FILEs`


BENCHMARK
---------

* `-b#`:
    benchmark file(s) using compression level #
* `-e#`:
    benchmark file(s) using multiple compression levels, from `-b#` to `-e#` (inclusive)
* `-i#`:
    minimum evaluation time, in seconds (default: 3s), benchmark mode only
* `-B#`, `--block-size=#`:
    cut file(s) into independent blocks of size # (default: no block)
* `--priority=rt`:
    set process priority to real-time

**Output Format:** CompressionLevel#Filename : InputSize -> OutputSize (CompressionRatio), CompressionSpeed, DecompressionSpeed

**Methodology:** For both compression and decompression speed, the
entire input is compressed/decompressed in-memory to measure speed. A run lasts at least 1 sec, so when files are small, they are compressed/decompressed several times per run, in order to improve measurement accuracy.

ADVANCED COMPRESSION OPTIONS
----------------------------
### --zstd[=options]:
`zstd` provides 22 predefined compression levels.
The selected or default predefined compression level can be changed with
advanced compression options.
The _options_ are provided as a comma-separated list.
You may specify only the options you want to change and the rest will be
taken from the selected or default compression level.
The list of available _options_:

- `strategy`=_strat_, `strat`=_strat_:
    Specify a strategy used by a match finder.

    There are 9 strategies numbered from 1 to 9, from faster to stronger:
    1=ZSTD\_fast, 2=ZSTD\_dfast, 3=ZSTD\_greedy,
    4=ZSTD\_lazy, 5=ZSTD\_lazy2, 6=ZSTD\_btlazy2,
    7=ZSTD\_btopt, 8=ZSTD\_btultra, 9=ZSTD\_btultra2.

- `windowLog`=_wlog_, `wlog`=_wlog_:
    Specify the maximum number of bits for a match distance.

    A higher number of bits increases the chance to find a match, which usually
    improves compression ratio.
    It also increases memory requirements for the compressor and decompressor.
    The minimum _wlog_ is 10 (1 KiB) and the maximum is 30 (1 GiB) on 32-bit
    platforms and 31 (2 GiB) on 64-bit platforms.

    Note: If `windowLog` is set to larger than 27, `--long=windowLog` or
    `--memory=windowSize` needs to be passed to the decompressor.

- `hashLog`=_hlog_, `hlog`=_hlog_:
    Specify the maximum number of bits for a hash table.

    Bigger hash tables cause fewer collisions, which usually makes compression
    faster, but require more memory during compression.

    The minimum _hlog_ is 6 (64 B) and the maximum is 30 (1 GiB).

- `chainLog`=_clog_, `clog`=_clog_:
    Specify the maximum number of bits for a hash chain or a binary tree.

    A higher number of bits increases the chance to find a match, which usually
    improves compression ratio.
    It also slows down compression speed and increases memory requirements for
    compression.
    This option is ignored for the ZSTD\_fast strategy.

    The minimum _clog_ is 6 (64 B) and the maximum is 29 (512 MiB) on 32-bit platforms
    and 30 (1 GiB) on 64-bit platforms.

- `searchLog`=_slog_, `slog`=_slog_:
    Specify the maximum number of searches in a hash chain or a binary tree
    using logarithmic scale.

    More searches increase the chance to find a match, which usually increases
    compression ratio but decreases compression speed.

    The minimum _slog_ is 1 and the maximum is `windowLog` - 1.

- `minMatch`=_mml_, `mml`=_mml_:
    Specify the minimum searched length of a match in a hash table.

    Larger search lengths usually decrease compression ratio but improve
    decompression speed.

    The minimum _mml_ is 3 and the maximum is 7.

- `targetLength`=_tlen_, `tlen`=_tlen_:
    The impact of this field varies depending on the selected strategy.

    For ZSTD\_btopt, ZSTD\_btultra and ZSTD\_btultra2, it specifies
    the minimum match length that causes the match finder to stop searching.
    A larger `targetLength` usually improves compression ratio
    but decreases compression speed.

    For ZSTD\_fast, it triggers ultra-fast mode when > 0.
    The value represents the amount of data skipped between match sampling.
    The impact is reversed: a larger `targetLength` increases compression speed
    but decreases compression ratio.

    For all other strategies, this field has no impact.

    The minimum _tlen_ is 0 and the maximum is 128 KiB.

- `overlapLog`=_ovlog_, `ovlog`=_ovlog_:
    Determine `overlapSize`, the amount of data reloaded from the previous job.
    This parameter is only available when multithreading is enabled.
    Reloading more data improves compression ratio, but decreases speed.

    The minimum _ovlog_ is 0, and the maximum is 9.
    1 means "no overlap", hence completely independent jobs.
    9 means "full overlap", meaning up to `windowSize` is reloaded from the previous job.
    Reducing _ovlog_ by 1 reduces the reloaded amount by a factor of 2.
    For example, 8 means "windowSize/2", and 6 means "windowSize/8".
    Value 0 is special and means "default": _ovlog_ is automatically determined by `zstd`,
    in which case it ranges from 6 to 9, depending on the selected _strat_.

- `ldmHashLog`=_lhlog_, `lhlog`=_lhlog_:
    Specify the maximum size for a hash table used for long distance matching.

    This option is ignored unless long distance matching is enabled.

    Bigger hash tables usually improve compression ratio at the expense of more
    memory during compression and a decrease in compression speed.

    The minimum _lhlog_ is 6 and the maximum is 30 (default: 20).

- `ldmMinMatch`=_lmml_, `lmml`=_lmml_:
    Specify the minimum searched length of a match for long distance matching.

    This option is ignored unless long distance matching is enabled.

    Larger or very small values usually decrease compression ratio.

    The minimum _lmml_ is 4 and the maximum is 4096 (default: 64).

- `ldmBucketSizeLog`=_lblog_, `lblog`=_lblog_:
    Specify the size of each bucket for the hash table used for long distance
    matching.

    This option is ignored unless long distance matching is enabled.

    Larger bucket sizes improve collision resolution but decrease compression
    speed.

    The minimum _lblog_ is 1 and the maximum is 8 (default: 3).

- `ldmHashRateLog`=_lhrlog_, `lhrlog`=_lhrlog_:
    Specify the frequency of inserting entries into the long distance matching
    hash table.

    This option is ignored unless long distance matching is enabled.

    Larger values will improve compression speed. Deviating far from the
    default value will likely result in a decrease in compression ratio.

    The default value is `wlog - lhlog`.

### Example
The following parameters set advanced compression options to something
similar to predefined level 19 for files bigger than 256 KB:

`--zstd`=wlog=23,clog=23,hlog=22,slog=6,mml=3,tlen=48,strat=6

### -B#:
Select the size of each compression job.
This parameter is available only when multi-threading is enabled.
Default value is `4 * windowSize`, which means it varies depending on compression level.
`-B#` makes it possible to select a custom value.
Note that job size must respect a minimum value which is enforced transparently.
This minimum is either 1 MB, or `overlapSize`, whichever is larger.

BUGS
----
Report bugs at: https://github.com/facebook/zstd/issues

AUTHOR
------
Yann Collet
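
WORKED EXAMPLE
--------------
The size rules stated for `windowLog`, `overlapLog` and `-B#` in ADVANCED COMPRESSION OPTIONS come down to a little arithmetic.
The Python sketch below is illustrative only and is not part of `zstd`.
Its function names are hypothetical, "1 MB" is assumed to mean 2^20 bytes, and the automatic `ovlog`=0 case is not modeled.

```python
ONE_MB = 1 << 20  # assumption: the manual's "1 MB" is read as 2**20 bytes


def window_size(wlog: int) -> int:
    """Window size in bytes implied by windowLog: 2**wlog."""
    return 1 << wlog


def overlap_size(wlog: int, ovlog: int) -> int:
    """Bytes reloaded from the previous job, following the manual's wording.

    ovlog=9 reloads a full window, each step down halves the amount,
    and ovlog=1 means no overlap. ovlog=0 ("default") is not modeled.
    """
    if not 1 <= ovlog <= 9:
        raise ValueError("ovlog must be in 1..9 here (0 means automatic)")
    if ovlog == 1:
        return 0
    return window_size(wlog) >> (9 - ovlog)


def effective_job_size(wlog: int, ovlog: int, requested=None) -> int:
    """Job size after applying the documented default and minimum.

    Default is 4 * windowSize; the minimum is the larger of 1 MB
    and overlapSize.
    """
    size = 4 * window_size(wlog) if requested is None else requested
    return max(size, ONE_MB, overlap_size(wlog, ovlog))


if __name__ == "__main__":
    wlog = 23  # value used in the --zstd example of this manual
    for ovlog in (9, 8, 6, 1):
        print(ovlog, overlap_size(wlog, ovlog), effective_job_size(wlog, ovlog))
```

Under these assumptions, `wlog`=23 with `ovlog`=6 gives an overlap of `windowSize/8`, and a requested job size smaller than the minimum is raised to it.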