Linux performance tuning with perf + flame graphs: common commands

This article is copied straight from my own notes, so please bear with the rough formatting.

 
  Performance analysis with perf:
 

 
  Generating a flame graph (run steps 1-4):
 

 
  1. perf record -e cpu-clock -g -p pid (or perf record -F 99 -g -p pid to sample at 99 Hz)

 
  -g tells perf record to also record the call graph (the stack of each sample)
 

 
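  -e cpu-clock selects the event perf record samples on: cpu-clock is a software, timer-based CPU clock event (not the hardware cycles counter)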
 

 
  -p specifies the PID of the process to record
 

 
      perf report -i perf.data 
 

 
  -i specifies the input file to view
 

 
  2. perf script -i perf.data > perf.unfold
 

 
  Use perf script to parse perf.data into text
 

 
  3. ./stackcollapse-perf.pl perf.unfold > perf.folded
 

 
  Fold the stacks in perf.unfold into one line per unique stack
 

 
  4. ./flamegraph.pl perf.folded > perf.svg
 

 
  Finally, render the SVG flame graph
 

 
  FlameGraph project: git clone https://github.com/brendangregg/FlameGraph.git
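
  The four steps above can be chained into one helper. This is a minimal sketch rather than part of the original notes: the function name make_flamegraph, the FLAMEGRAPH_DIR variable and the 10-second default are assumptions. It expects the FlameGraph repository to be cloned locally and perf to be run with sufficient privileges.

```shell
# Sketch: run steps 1-4 for one PID and write perf.svg to the current directory.
# Assumptions (hypothetical names): make_flamegraph, FLAMEGRAPH_DIR, and a
# default sampling window of 10 seconds.
make_flamegraph() {
    local pid="$1"
    local seconds="${2:-10}"
    local fg_dir="${FLAMEGRAPH_DIR:-./FlameGraph}"

    # Step 1: sample call stacks of the PID at 99 Hz for the given duration
    perf record -F 99 -g -p "$pid" -o perf.data -- sleep "$seconds" || return 1
    # Step 2: dump the recorded samples as text
    perf script -i perf.data > perf.unfold || return 1
    # Step 3: fold each stack into a single line
    "$fg_dir/stackcollapse-perf.pl" perf.unfold > perf.folded || return 1
    # Step 4: render the folded stacks as an SVG
    "$fg_dir/flamegraph.pl" perf.folded > perf.svg
}

# Usage (hypothetical PID):
#   FLAMEGRAPH_DIR=~/FlameGraph make_flamegraph 1234 30
```

  Only stdout is redirected in steps 2-4, so warnings from perf or the Perl scripts stay on the terminal instead of ending up in the intermediate files.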

 
  1. Counting events (stat: statistics)
 

 
  # CPU counter statistics for the specified command: 
 

 
  perf stat command

 
  # Detailed CPU counter statistics (includes extras) for the specified command:

 
  perf stat -d command

 
  # CPU counter statistics for the specified PID, until Ctrl-C:

 
  perf stat -p PID

 
  # CPU counter statistics for the entire system, for 5 seconds:

 
  perf stat -a sleep 5

 
  # Various basic CPU statistics, system wide, for 10 seconds:

 
  perf stat -e cycles,instructions,cache-references,cache-misses,bus-cycles -a sleep 10
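
  The counters above are often easier to post-process in CSV form. The sketch below derives instructions per cycle (IPC) from perf stat output; the helper name ipc_from_csv is hypothetical, and it assumes the "perf stat -x," layout with the counter value in field 1 and the event name in field 3 (event names can differ on some CPUs, for example hybrid ones).

```shell
# Sketch: compute IPC from "perf stat -x," CSV lines read on stdin.
# Assumption: field 1 is the counter value, field 3 is the event name.
ipc_from_csv() {
    awk -F, '
        # "+ 0" forces a number, so "<not counted>" becomes 0
        $3 ~ /^cycles/       { cycles = $1 + 0 }
        $3 ~ /^instructions/ { instructions = $1 + 0 }
        END {
            if (cycles > 0)
                printf "IPC: %.2f\n", instructions / cycles
            else
                print "IPC: n/a (no cycles counted)"
        }
    '
}

# Usage (system-wide for 10 seconds; perf stat writes its report to stderr):
#   perf stat -x, -e cycles,instructions -a sleep 10 2>&1 | ipc_from_csv
```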

 
  2. Profiling
 

 
  # Sample on-CPU functions for the specified command, at 99 Hertz: 
 

 
  perf record -F 99 command

 
  # Sample on-CPU functions for the specified PID, at 99 Hertz, until Ctrl-C:

 
  perf record -F 99 -p PID

 
  # Sample on-CPU functions for the specified PID, at 99 Hertz, for 10 seconds:

 
  perf record -F 99 -p PID sleep 10

 
  # Sample CPU stack traces (via frame pointers) for the specified PID, at 99 Hertz, for 10 seconds:

 
  perf record -F 99 -p PID -g -- sleep 10

 
  Common options
 

 
  -e：Select the PMU event. 
 

 
  -a：System-wide collection from all CPUs. 
 

 
  -p：Record events on existing process ID (comma separated list). 
 

 
  -A：Append to the output file to do incremental profiling (older perf versions; may not be accepted by recent releases).
 

 
  -f：Overwrite existing data file (older perf versions; may not be accepted by recent releases).
 

 
  -o：Output file name. 
 

 
  -g：Do call-graph (stack chain/backtrace) recording. 
 

 
  -C：Collect samples only on the list of CPUs provided. 
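
  As an illustration of how these options combine, the sketch below records call graphs on selected CPUs into a named file. The function name record_cpus and its defaults (CPUs 0 and 1, perf-cpus.data, 5 seconds) are assumptions chosen for the example.

```shell
# Sketch: combine -e, -a, -C, -g and -o in one recording.
# Assumptions (hypothetical): function name, default CPU list and output file.
record_cpus() {
    local cpus="${1:-0,1}"
    local out="${2:-perf-cpus.data}"
    local seconds="${3:-5}"
    # -e cpu-clock : event to sample on
    # -a -C LIST   : system-wide, restricted to the listed CPUs
    # -g           : record call graphs
    # -o FILE      : write samples to FILE instead of perf.data
    perf record -e cpu-clock -a -C "$cpus" -g -o "$out" -- sleep "$seconds"
}

# Usage:
#   record_cpus 2,3 cpu23.data 10
#   perf report -i cpu23.data
```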
 

 
  3. Static Tracing
 

 
  # Trace new processes, until Ctrl-C: 
 

 
  perf record -e sched:sched_process_exec -a

 
  # Trace all context-switches, until Ctrl-C:

 
  perf record -e context-switches -a

 
  # Trace context-switches via sched tracepoint, until Ctrl-C:

 
  perf record -e sched:sched_switch -a

 
  # Trace all context-switches with stack traces, until Ctrl-C:

 
  perf record -e context-switches -ag

 
  # Trace all context-switches with stack traces, for 10 seconds:

 
  perf record -e context-switches -ag -- sleep 10
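
  A tracepoint recording is usually followed by perf script to read the events back. The sketch below traces new processes for a fixed window and then lists them; the function name trace_execs, the exec.data file name and the chosen output fields are assumptions.

```shell
# Sketch: trace exec() calls system-wide for N seconds, then list them.
# Assumptions (hypothetical): function name, default duration, data file name.
trace_execs() {
    local seconds="${1:-10}"
    local data="${2:-exec.data}"
    # Record the sched:sched_process_exec tracepoint on all CPUs
    perf record -e sched:sched_process_exec -a -o "$data" -- sleep "$seconds" || return 1
    # Print timestamp, command name and PID for each recorded event
    perf script -i "$data" -F time,comm,pid
}

# Usage:
#   trace_execs 30
```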

 
  4. Dynamic Tracing
 

 
  # Add a tracepoint for the kernel tcp_sendmsg() function entry ("--add" is optional): 
 

 
  perf probe --add tcp_sendmsg

 
  # Remove the tcp_sendmsg() tracepoint (or use "--del"):

 
  perf probe -d tcp_sendmsg

 
  # Add a tracepoint for the kernel tcp_sendmsg() function return:

 
  perf probe 'tcp_sendmsg%return'

 
  # Show available variables for the kernel tcp_sendmsg() function (needs debuginfo):

 
  perf probe -V tcp_sendmsg

 
  # Show available variables for the kernel tcp_sendmsg() function, plus external vars (needs debuginfo):

 
  perf probe -V tcp_sendmsg --externs
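
  A probe stays registered until it is deleted, so a typical session adds it, records it, and removes it again. The sketch below shows that lifecycle; the function name probe_and_record and the probe.data file name are assumptions, and it relies on perf probe exposing the new event as probe:<function>.

```shell
# Sketch: add a kernel probe, record it system-wide, then always remove it.
# Assumptions (hypothetical): function name, default duration, data file name.
probe_and_record() {
    local func="${1:-tcp_sendmsg}"
    local seconds="${2:-5}"
    local rc=0
    # Create the dynamic tracepoint at the function entry
    perf probe --add "$func" || return 1
    # Record hits of the new probe:<func> event on all CPUs
    perf record -e "probe:$func" -a -o probe.data -- sleep "$seconds" || rc=$?
    # Remove the probe even if recording failed
    perf probe --del "$func"
    return "$rc"
}

# Usage (needs root):
#   probe_and_record tcp_sendmsg 10
#   perf script -i probe.data
```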

 
  5. Mixed
 

 
  # Sample stacks at 99 Hertz, and, context switches: 
 

 
  perf record -F99 -e cpu-clock -e cs -a -g

 
  # Sample stacks to 2 levels deep, and context switch stacks to 5 levels (needs Linux 4.8):

 
  perf record -F99 -e cpu-clock/max-stack=2/ -e cs/max-stack=5/ -a -g

 
  6. Reporting
 

 
  # Show perf.data in an ncurses browser (TUI) if possible: 
 

 
  perf report

 
  # Show perf.data with a column for sample count:

 
  perf report -n

 
  # Show perf.data as a text report, with data coalesced and percentages:

 
  perf report --stdio

 
  # Report, with stacks in folded format: one line per stack (needs Linux 4.4):

 
  perf report --stdio -n -g folded

 
  # List all events from perf.data:

 
  perf script

 
  # List all perf.data events, with data header (newer kernels; was previously default):

 
  perf script --header
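
  Folded stacks are also easy to summarize without rendering an SVG. The sketch below assumes a file such as the perf.folded written by stackcollapse-perf.pl in the flame graph steps, where each line is a semicolon-separated stack followed by a sample count, and prints the functions that were on-CPU most often. The function name top_leaf_functions is hypothetical.

```shell
# Sketch: rank leaf (on-CPU) functions by sample count from a folded file.
# Assumption: each line looks like "comm;frame1;frame2;leaf COUNT".
top_leaf_functions() {
    local folded="${1:-perf.folded}"
    local n="${2:-10}"
    awk '
        {
            count = $NF                    # last field is the sample count
            stack = $0
            sub(/ [0-9]+$/, "", stack)     # strip the trailing count
            k = split(stack, frames, ";")  # frames[k] is the leaf function
            samples[frames[k]] += count
        }
        END { for (f in samples) print samples[f], f }
    ' "$folded" | sort -k1,1nr | head -n "$n"
}

# Usage:
#   top_leaf_functions perf.folded 20
```

  Stripping only the trailing count keeps frame names that contain spaces intact.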
