# Using `VTune`

[VTune][main] is a popular performance profiling tool that targets both 32-bit
and 64-bit x86 architectures. The tool collects profiling data at runtime
and then, through either the command line or the GUI, provides a variety of
options for viewing and analyzing that data. VTune Profiler is available in both
commercial and free versions. The free version is available for
[download][download] and is supported through a community forum. It is
appropriate for detailed analysis of your Wasm program.

VTune support in Wasmtime is provided through the JIT profiling APIs from the
[`ittapi`] library. This library gives code generators (or the runtimes that
use them) a way to report JIT activities. The APIs are implemented in a static
library (see the [`ittapi`] source), which Wasmtime links to when it is built
with the `vtune` Cargo feature. When the VTune collector is running, the
`ittapi` library collects Wasmtime's reported JIT activities. This connection to
`ittapi` is provided by the [`ittapi-rs`] crate.

For more information on VTune and the analysis tools it provides see its
[documentation].

[main]: https://www.intel.com/content/www/us/en/developer/tools/oneapi/vtune-profiler.html
[download]: https://www.intel.com/content/www/us/en/docs/vtune-profiler/installation-guide/current/overview.html
[documentation]: https://www.intel.com/content/www/us/en/docs/vtune-profiler/user-guide/current/overview.html
[`ittapi`]: https://github.com/intel/ittapi
[`ittapi-rs`]: https://crates.io/crates/ittapi-rs

### Turn on VTune support

For JIT profiling with VTune, Wasmtime currently builds with the `vtune` feature
enabled by default. This ensures the compiled binary knows how to inform
the `ittapi` library of JIT events. Profiling must still be enabled at
runtime; how you do so depends on how you use Wasmtime:

* **Rust API** - call the [`Config::profiler`] method with
  `ProfilingStrategy::VTune` to enable profiling of your wasm modules.

* **C API** - call the `wasmtime_config_profiler_set` API with a
  `WASMTIME_PROFILING_STRATEGY_VTUNE` value.

* **Command Line** - pass the `--profile=vtune` flag on the command line.


### Profiling Wasmtime itself

VTune can profile a single process or all system processes. Like `perf`, it can
profile the Wasmtime runtime itself without any added support. The [`ittapi`]
APIs also provide an interface for marking the start and stop of code regions
so that they can be isolated easily in the VTune Profiler. Wasmtime does not
use these APIs yet; support is expected to be added in the future.


### Example: Getting Started

With VTune [properly installed][download], if you are using the CLI, execute:

```console
cargo build
vtune -run-pass-thru=--no-altstack -collect hotspots target/debug/wasmtime --profile=vtune foo.wasm
```

This command tells the VTune collector (`vtune`) to collect hot spot
profiling data while Wasmtime executes `foo.wasm`. The `--profile=vtune` flag
enables VTune support in Wasmtime so that the collector is also notified of JIT
events that take place at runtime. The first time the command is run, it
produces a results directory, `r000hs/`, which contains profiling data for
Wasmtime and the execution of `foo.wasm`. This data can then be read and
displayed via the command line or via the VTune GUI by importing the result.


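The same invocation can also be assembled programmatically, for example from a
benchmark harness. The following is a minimal sketch and not part of Wasmtime:
`vtune_hotspots_command` is a hypothetical helper, and it only builds the
command from the flags shown above without running it.

```rust
use std::process::Command;

/// Builds (but does not run) the `vtune` hotspots invocation shown above.
/// `wasmtime` and `module` are caller-supplied paths; this helper is
/// hypothetical and exists only for illustration.
fn vtune_hotspots_command(wasmtime: &str, module: &str) -> Command {
    let mut cmd = Command::new("vtune");
    cmd.arg("-run-pass-thru=--no-altstack") // option forwarded by `vtune`, as in the command above
        .args(["-collect", "hotspots"]) // hot spot analysis
        .arg(wasmtime) // the application being profiled
        .arg("--profile=vtune") // Wasmtime flag that enables JIT reporting
        .arg(module); // the Wasm module to execute
    cmd
}
```

Calling `.status()` on the returned `Command` would start the collection,
assuming `vtune` is installed and available on the path.
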
### Example: CLI Collection

Using a familiar algorithm, we'll start with the following Rust code:

```rust
fn main() {
    let n = 45;
    println!("fib({}) = {}", n, fib(n));
}

// Naive doubly-recursive Fibonacci (exponential time), which gives the
// profiler a clear hotspot.
fn fib(n: u32) -> u32 {
    if n <= 2 {
        1
    } else {
        fib(n - 1) + fib(n - 2)
    }
}
```

We compile the example to Wasm:

```console
rustc --target wasm32-wasip1 fib.rs -C opt-level=z -C lto=yes
```

Then we run Wasmtime (built with the `vtune` feature and passed the
`--profile=vtune` flag to enable reporting) under the VTune CLI application,
`vtune`, which must already be installed and available on the path. To collect
hot spot profiling information, we execute:

```console
$ rustc --target wasm32-wasip1 fib.rs -C opt-level=z -C lto=yes
$ vtune -run-pass-thru=--no-altstack -v -collect hotspots target/debug/wasmtime --profile=vtune fib.wasm
fib(45) = 1134903170
amplxe: Collection stopped.
amplxe: Using result path /home/jlb6740/wasmtime/r000hs
amplxe: Executing actions  7 % Clearing the database
amplxe: The database has been cleared, elapsed time is 0.239 seconds.
amplxe: Executing actions 14 % Updating precomputed scalar metrics
amplxe: Raw data has been loaded to the database, elapsed time is 0.792 seconds.
amplxe: Executing actions 19 % Processing profile metrics and debug information
...
Top Hotspots
Function                                                                                      Module          CPU Time
--------------------------------------------------------------------------------------------  --------------  --------
h2bacf53cb3845acf                                                                             [Dynamic code]    3.480s
__memmove_avx_unaligned_erms                                                                  libc.so.6         0.222s
cranelift_codegen::ir::instructions::InstructionData::opcode::hee6f5b6a72fc684e               wasmtime          0.122s
core::ptr::slice_from_raw_parts::hc5cb6f1b39a0e7a1                                            wasmtime          0.066s
_$LT$usize$u20$as$u20$core..slice..SliceIndex$LT$$u5b$T$u5d$$GT$$GT$::get::h70c7f142eeeee8bd  wasmtime          0.066s
```


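As a sanity check on the `fib(45) = 1134903170` line in the output above, the
value can be recomputed natively with an iterative variant. This is an
illustrative sketch only: `fib_iter` is a hypothetical helper that is not part
of the profiled program. It follows the same convention as `fib` (inputs up to
2 return 1) and reports `u32` overflow instead of wrapping or panicking.

```rust
/// Iterative Fibonacci using the same convention as the example's `fib`
/// (`n <= 2` yields 1). Returns `None` if the result does not fit in a `u32`.
fn fib_iter(n: u32) -> Option<u32> {
    // Start from fib(1) and fib(2).
    let (mut prev, mut curr) = (1u32, 1u32);
    for _ in 2..n {
        // `checked_add` returns `None` on overflow, which `?` propagates.
        let next = prev.checked_add(curr)?;
        prev = curr;
        curr = next;
    }
    Some(curr)
}
```

`fib_iter(45)` returns `Some(1134903170)`, matching the profiled run. Under this
convention `fib_iter(47)` is the largest result that fits in a `u32`, so
`fib_iter(48)` returns `None`; the example's `n = 45` stays within that limit.
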
### Example: Importing Results into GUI

Results directories created by the `vtune` CLI can be imported in the VTune GUI
by clicking "Open > Result". Below is a visualization of the collected data as
seen in VTune's GUI:

![vtune report output](assets/vtune-gui-fib.png)


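When several collections have been run, it can help to locate the most recent
results directory before importing it. The sketch below assumes that hotspot
results are named like `r000hs` with an increasing three-digit counter; only
`r000hs` is confirmed above, so the numbering of later runs is an assumption.
`is_hotspots_result` and `latest_hotspots_result` are hypothetical helpers, not
VTune or Wasmtime APIs.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Returns true if `name` looks like `r000hs`: `r`, three digits, then `hs`.
/// The naming pattern is an assumption based on the `r000hs` example.
fn is_hotspots_result(name: &str) -> bool {
    let b = name.as_bytes();
    b.len() == 6
        && b[0] == b'r'
        && b[1..4].iter().all(u8::is_ascii_digit)
        && &b[4..] == b"hs"
}

/// Finds the result directory with the highest counter inside `dir`,
/// or `None` if no matching directory exists.
fn latest_hotspots_result(dir: &Path) -> io::Result<Option<PathBuf>> {
    let mut names: Vec<String> = Vec::new();
    for entry in fs::read_dir(dir)? {
        let entry = entry?;
        if entry.file_type()?.is_dir() {
            if let Some(name) = entry.file_name().to_str() {
                if is_hotspots_result(name) {
                    names.push(name.to_owned());
                }
            }
        }
    }
    // Fixed-width counters sort correctly as plain strings.
    names.sort();
    Ok(names.pop().map(|name| dir.join(name)))
}
```

For example, `latest_hotspots_result(Path::new("."))` looks for results in the
current working directory, which is where the collection above wrote its
output.
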
### Example: GUI Collection

VTune can collect data in multiple ways (see the `vtune` CLI discussion above);
another way is to use the VTune GUI directly. A standard workflow might look
like:

- Open VTune Profiler
- "Configure Analysis" with
  - "Application" set to `/path/to/wasmtime` (e.g., `target/debug/wasmtime`)
  - "Application parameters" set to `--profile=vtune /path/to/module.wasm`
  - "Working directory" set as appropriate
  - Enable "Hardware Event-Based Sampling," which may require some system
    configuration, e.g. `sysctl -w kernel.perf_event_paranoid=0`
- Start the analysis