# Using `VTune`

[VTune][main] is a popular performance profiling tool that targets both 32-bit
and 64-bit x86 architectures. The tool collects profiling data during runtime
and then, either through the command line or GUI, provides a variety of options
for viewing and analyzing that data. VTune Profiler is available in both
commercial and free options. The free, downloadable version is available
[here][download] and is backed by a community forum for support. This version is
appropriate for detailed analysis of your Wasm program.

VTune support in Wasmtime is provided through the JIT profiling APIs from the
[`ittapi`] library. This library provides code generators (or the runtimes that
use them) a way to report JIT activities. The APIs are implemented in a static
library (see [`ittapi`] source) which Wasmtime links to when VTune support is
specified through the `vtune` Cargo feature flag; this feature is not enabled by
default. When the VTune collector is run, the `ittapi` library collects
Wasmtime's reported JIT activities. This connection to `ittapi` is provided by
the [`ittapi-rs`] crate.

For more information on VTune and the analysis tools it provides see its
[documentation].

[main]: https://www.intel.com/content/www/us/en/developer/tools/oneapi/vtune-profiler.html
[download]: https://www.intel.com/content/www/us/en/docs/vtune-profiler/installation-guide/current/overview.html
[documentation]: https://www.intel.com/content/www/us/en/docs/vtune-profiler/user-guide/current/overview.html
[`ittapi`]: https://github.com/intel/ittapi
[`ittapi-rs`]: https://crates.io/crates/ittapi-rs

### Turn on VTune support

For JIT profiling with VTune, Wasmtime currently builds with the `vtune` feature
enabled by default. This ensures the compiled binary understands how to inform
the `ittapi` library of JIT events. But it must still be enabled at
runtime--enable runtime support based on how you use Wasmtime:

* **Rust API** - call the [`Config::profiler`] method with
  `ProfilingStrategy::VTune` to enable profiling of your wasm modules.

* **C API** - call the `wasmtime_config_profiler_set` API with a
  `WASMTIME_PROFILING_STRATEGY_VTUNE` value.

* **Command Line** - pass the `--profile=vtune` flag on the command line.


### Profiling Wasmtime itself

Note that VTune is capable of profiling a single process or all system
processes. Like `perf`, VTune is capable of profiling the Wasmtime runtime
itself without any added support. However, the [`ittapi`] APIs also provide an
interface for marking the start and stop of code regions for easy isolation in
the VTune Profiler. Support for these APIs is expected to be added in the
future.


### Example: Getting Started

With VTune [properly installed][download], if you are using the CLI execute:

```console
cargo build
vtune -run-pass-thru=--no-altstack -collect hotspots target/debug/wasmtime --profile=vtune foo.wasm
```

This command tells the VTune collector (`vtune`) to collect hot spot
profiling data as Wasmtime is executing `foo.wasm`. The `--profile=vtune` flag enables
VTune support in Wasmtime so that the collector is also alerted to JIT events
that take place during runtime. The first time this is run, the result of the
command is a results directory `r000hs/` which contains profiling data for
Wasmtime and the execution of `foo.wasm`. This data can then be read and
displayed via the command line or via the VTune GUI by importing the result.

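
If the collection step needs to be scripted, for example from a benchmark harness, the invocation above can be assembled with `std::process::Command`. The following is a minimal sketch and not part of Wasmtime or VTune: the helper names `vtune_args` and `vtune_command` are hypothetical, and the flags are copied verbatim from the command shown above.

```rust
use std::process::Command;

/// Builds the argument list for `vtune`, mirroring the command shown above.
/// `vtune_args` is a hypothetical helper, not a Wasmtime or VTune API.
fn vtune_args(wasmtime_bin: &str, module: &str) -> Vec<String> {
    vec![
        "-run-pass-thru=--no-altstack".to_string(),
        "-collect".to_string(),
        "hotspots".to_string(),
        wasmtime_bin.to_string(),
        // Runtime switch that makes Wasmtime report JIT events.
        "--profile=vtune".to_string(),
        module.to_string(),
    ]
}

/// Wraps the arguments in a `Command`. Nothing is executed until the caller
/// invokes `.status()` or `.output()` on the returned value.
fn vtune_command(wasmtime_bin: &str, module: &str) -> Command {
    let mut cmd = Command::new("vtune");
    cmd.args(vtune_args(wasmtime_bin, module));
    cmd
}

fn main() {
    let args = vtune_args("target/debug/wasmtime", "foo.wasm");
    println!("vtune {}", args.join(" "));

    // Constructed but not run here; call `.status()` to start a collection,
    // assuming `vtune` is installed and available on the PATH.
    let _cmd = vtune_command("target/debug/wasmtime", "foo.wasm");
}
```

Keeping the argument list in its own function makes it easy to inspect or log the exact command before any profiling run is started.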

### Example: CLI Collection

Using a familiar algorithm, we'll start with the following Rust code:

```rust
fn main() {
    let n = 45;
    println!("fib({}) = {}", n, fib(n));
}

fn fib(n: u32) -> u32 {
    if n <= 2 {
        1
    } else {
        fib(n - 1) + fib(n - 2)
    }
}
```

We compile the example to Wasm:

```console
rustc --target wasm32-wasip1 fib.rs -C opt-level=z -C lto=yes
```

Then we execute the Wasmtime runtime (built with the `vtune` feature and
executed with the `--profile=vtune` flag to enable reporting) inside the VTune CLI
application, `vtune`, which must already be installed and available on the
path. To collect hot spot profiling information, we execute:

```console
$ rustc --target wasm32-wasip1 fib.rs -C opt-level=z -C lto=yes
$ vtune -run-pass-thru=--no-altstack -v -collect hotspots target/debug/wasmtime --profile=vtune fib.wasm
fib(45) = 1134903170
amplxe: Collection stopped.
amplxe: Using result path /home/jlb6740/wasmtime/r000hs
amplxe: Executing actions 7 % Clearing the database
amplxe: The database has been cleared, elapsed time is 0.239 seconds.
amplxe: Executing actions 14 % Updating precomputed scalar metrics
amplxe: Raw data has been loaded to the database, elapsed time is 0.792 seconds.
amplxe: Executing actions 19 % Processing profile metrics and debug information
...
Top Hotspots
Function                                                                                      Module          CPU Time
--------------------------------------------------------------------------------------------  --------------  --------
h2bacf53cb3845acf                                                                             [Dynamic code]    3.480s
__memmove_avx_unaligned_erms                                                                  libc.so.6         0.222s
cranelift_codegen::ir::instructions::InstructionData::opcode::hee6f5b6a72fc684e               wasmtime          0.122s
core::ptr::slice_from_raw_parts::hc5cb6f1b39a0e7a1                                            wasmtime          0.066s
_$LT$usize$u20$as$u20$core..slice..SliceIndex$LT$$u5b$T$u5d$$GT$$GT$::get::h70c7f142eeeee8bd  wasmtime          0.066s
```


### Example: Importing Results into GUI

Results directories created by the `vtune` CLI can be imported in the VTune GUI
by clicking "Open > Result". Below is a visualization of the collected data as
seen in VTune's GUI:


### Example: GUI Collection

VTune can collect data in multiple ways (see `vtune` CLI discussion above);
another way is to use the VTune GUI directly. A standard work flow might look
like:

- Open VTune Profiler
- "Configure Analysis" with
  - "Application" set to `/path/to/wasmtime` (e.g., `target/debug/wasmtime`)
  - "Application parameters" set to `--profile=vtune /path/to/module.wasm`
  - "Working directory" set as appropriate
  - Enable "Hardware Event-Based Sampling," which may require some system
    configuration, e.g. `sysctl -w kernel.perf_event_paranoid=0`
- Start the analysis