Measuring Basic Transactional Success with perf stat -T

The first step after the program is running with TSX is to use perf stat -T to measure the basic transactional success. If the program is long running in a steady state, perf stat can instead be run in parallel from another terminal:

    perf stat -T -a sleep 1

Using -a may require being root or setting /proc/sys/kernel/perf_event_paranoid to -1 first. With -a the complete machine is measured. Alternatively, it is also possible to attach to specific pids with -p.

The -T option reports the number of transactional cycles. When that number is low, either the program does not spend much time in locks or the locks are not enabled for TSX lock elision.

-T also reports the aborted cycles, that is, cycles spent in doomed transactions that did not commit. The goal of TSX tuning is normally to make that number as small as possible, in other words to make the commit rate of transactions as large as possible.

These numbers should only be trusted for relatively long running processes. At startup there are typically various transient abort causes (for example faulting in the working set) that disappear later. If the startup phase of the application is very expensive, it is preferable to use -a or -p in parallel so that measurement only happens once the program is past the start-up phase. Newer versions of perf also have a -I option to enable interval sampling. For programs with very different phases it can be useful to combine -I with -T to get separate measurements for the different phases.

In addition, -T reports the number of transactions, separately for HLE (el) and RTM (tx), and their average length. In general it is preferable if transactions are not too short.

The overhead of perf stat -T counting is normally low and should not affect the run time of the program significantly.

perf stat -T counts both kernel and user transactions. When an RTM enabled kernel is used but only the user program should be measured, the events used by -T can be specified manually with -e to perf stat, with an additional :u qualifier so that they are only counted for ring 3. The computations for the various ratios will still be done.

Profiling Abort Causes with perf record Sampling

When the number of aborted cycles reported by perf stat -T is high, the location of the aborts should be profiled using sampling. Abort sampling does not affect the transaction commit rate, because the transactions have already aborted when they are sampled (it does still add some overhead, however).

HLE and RTM use different sampling events. perf stat -T reports whether HLE or RTM is used (el-starts or tx-starts). When el-starts is high, el-aborts is the abort event; when tx-starts is high, tx-aborts is the abort event. It is also possible to sample for both at the same time, but it is recommended not to specify an event that is not needed.

The example below samples for RTM aborts.
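As a rough illustration of how the perf stat -T numbers discussed in this post relate to each other, the following Python sketch derives the transactional and aborted cycle percentages and the average transaction length from raw counts, and applies the el-starts versus tx-starts rule to pick an abort event to sample. The field names (cycles_t, cycles_ct), the exact formulas, and the sample counts are assumptions made for illustration; they are not taken from perf itself, and perf's own computation may differ in detail.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TsxCounts:
    """Hypothetical raw counts, as one might copy them from perf stat output."""
    cycles: int      # all cycles
    cycles_t: int    # cycles spent inside transactions (assumed name)
    cycles_ct: int   # cycles inside transactions that committed (assumed name)
    tx_starts: int   # RTM transaction starts
    el_starts: int   # HLE elision starts


def summarize(c: TsxCounts) -> dict:
    """Derive the ratios described in the text (assumed formulas)."""
    if c.cycles == 0:
        return {"transactional_pct": 0.0, "aborted_pct": 0.0, "avg_cycles": 0.0}
    starts = c.tx_starts + c.el_starts
    return {
        # share of all cycles spent inside transactions
        "transactional_pct": 100.0 * c.cycles_t / c.cycles,
        # share of all cycles spent in transactions that did not commit
        "aborted_pct": 100.0 * (c.cycles_t - c.cycles_ct) / c.cycles,
        # average transaction length in cycles
        "avg_cycles": c.cycles_t / starts if starts else 0.0,
    }


def abort_event_to_sample(c: TsxCounts) -> Optional[str]:
    """Pick the abort event matching whichever start count dominates."""
    if c.tx_starts == 0 and c.el_starts == 0:
        return None  # no transactions seen, nothing to sample
    return "tx-aborts" if c.tx_starts >= c.el_starts else "el-aborts"


# Made-up example numbers, not measurements.
sample = TsxCounts(cycles=1_000_000, cycles_t=400_000, cycles_ct=300_000,
                   tx_starts=2_000, el_starts=0)
summary = summarize(sample)
event = abort_event_to_sample(sample)
print(summary, event)
```

With these made-up counts, 40% of the cycles are transactional and 10% are aborted, so the tuning goal described above would be to shrink that second number. The event names returned by abort_event_to_sample follow the spelling used in this post; the names exposed by a given perf version should be checked with perf list.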