Performance Profiling in OpenBMC
OpenBMC has a lot of advanced features that need to fit in a fairly small footprint (e.g. the AST2400 or AST2500). The CPU has been the most limiting resource, and its constraints can result in long BMC boot times, missed mapper discoveries, and/or the inability to respond to the host within allotted times.
Most of the fixes for the above issues have been the obvious ones so far:
- Remove unused services, especially Python-based ones
- Minimize time spent in kernel spin locks, especially in the FSI driver
  - The continuous collection of OCC data over FSI is intense
- Minimize D-Bus traffic where possible
  - Cache mapper-provided names
  - Grab as much info as possible in single D-Bus calls (e.g. mapper subtree)
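The two D-Bus suggestions above can be combined: fetch a whole subtree from the mapper in one call, then answer later lookups from a cache. The following is a minimal Python sketch of that idea, not OpenBMC code. It assumes a hypothetical `get_subtree` callable that wraps the mapper's subtree query and returns a dict of object path to `{service: [interfaces]}`; the `SubtreeCache` class and its method names are made up for illustration.

```python
from typing import Callable, Dict, List, Tuple

# Assumed shape of a mapper subtree reply:
# {object_path: {service_name: [interface, ...]}}
Subtree = Dict[str, Dict[str, List[str]]]
Key = Tuple[str, int, Tuple[str, ...]]


class SubtreeCache:
    """Cache subtree lookups so repeated queries cause no D-Bus traffic.

    `get_subtree` is a hypothetical callable (path, depth, interfaces);
    in a real daemon it would wrap the mapper's subtree D-Bus call.
    """

    def __init__(self, get_subtree: Callable[[str, int, Tuple[str, ...]], Subtree]):
        self._get_subtree = get_subtree
        self._cache: Dict[Key, Subtree] = {}

    def lookup(self, path: str, depth: int = 0,
               interfaces: Tuple[str, ...] = ()) -> Subtree:
        key = (path, depth, tuple(interfaces))
        if key not in self._cache:
            # One round trip returns every matching object at once.
            self._cache[key] = self._get_subtree(path, depth, key[2])
        return self._cache[key]

    def service_for(self, root: str, obj_path: str) -> str:
        """Return the first service owning obj_path, using the cached subtree."""
        services = self.lookup(root)[obj_path]
        return next(iter(services))

    def invalidate(self) -> None:
        """Drop cached data, for example when objects are added or removed."""
        self._cache.clear()
```

With a cache like this, calling `service_for()` many times for objects under the same root triggers a single subtree query. The trade-off is staleness: the cache has to be invalidated whenever the set of objects or owning services changes.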
There are a lot of tools to profile systemd-based Linux distributions. Here are pointers to some that have been used so far.
systemd-analyze generates a graph of all services started by systemd, ordered by time stamp, and shows how long each one took to start. An example can be found here.
To run it, just copy the systemd-analyze application onto your OpenBMC and run the following:
systemd-analyze plot > /tmp/systemd-analyze.svg
You can load the output .svg file directly into most web browsers using the "File:////systemd-analyze.svg" syntax.
To build the tool, you can simply add it to your package (as done in this link) or do a "bitbake packagegroup-core-tools-profile" and copy it out of there.
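When a browser is not handy, the same startup data can be inspected as text with the "systemd-analyze blame" subcommand, which lists units by the time they took to initialize. The sketch below is a hedged Python helper for sorting such output off-target. It assumes each line looks like "1min 2.500s some.service" or "312ms some.service"; the exact format can vary between systemd versions, and the unit names in `SAMPLE` are invented.

```python
import re
from typing import List, Tuple

# One component of a time span, such as "1min", "2.345s" or "312ms".
_SPAN = re.compile(r"(\d+(?:\.\d+)?)(h|min|ms|us|s)")
_SECONDS = {"h": 3600.0, "min": 60.0, "s": 1.0, "ms": 1e-3, "us": 1e-6}

# Invented sample in the assumed "systemd-analyze blame" layout.
SAMPLE = """\
        312ms example-fast.service
  1min 2.500s example-slow.service
       5.012s example-mid.service
"""


def parse_blame(text: str) -> List[Tuple[float, str]]:
    """Return (seconds, unit_name) pairs, slowest unit first."""
    results = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) < 2:
            continue  # blank or malformed line
        unit = parts[-1]
        seconds = 0.0
        for token in parts[:-1]:
            match = _SPAN.fullmatch(token)
            if match is None:
                break  # not a time span; skip this line
            seconds += float(match.group(1)) * _SECONDS[match.group(2)]
        else:
            results.append((seconds, unit))
    results.sort(key=lambda row: row[0], reverse=True)
    return results
```

Calling `parse_blame(SAMPLE)` puts `example-slow.service` first, which gives a quick list of candidates to look for in the plot.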
systemd-bootchart shows similar data to systemd-analyze, except that it is based on the actual applications being run in the services and it also shows CPU utilization. It does require that you recompile your kernel with some profiling flags enabled, as seen in this commit.
It supports being started via kernel command line but I didn't have any luck with that in OpenBMC. The tool can also just be run standalone (which did work).
The tool has a lot of parameters, but a good working example for OpenBMC is:
systemd-bootchart -n 1200 -f 3 -C
This will take 3 samples per second, for a total of 1200 samples (roughly 400 seconds of data). Be careful with the samples per second: the default is much higher and ends up locking up the CPU on the AST2500, which makes the collected data meaningless.
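The relationship between the sample count, the sample rate and the time covered is simple division, and it helps to work it out before starting a capture. A small Python sketch follows; the function names are illustrative and not part of systemd-bootchart.

```python
import math


def capture_window_seconds(samples: int, freq_hz: float) -> float:
    """Seconds of wall-clock time covered by `samples` taken at `freq_hz`."""
    if samples <= 0 or freq_hz <= 0:
        raise ValueError("samples and freq_hz must be positive")
    return samples / freq_hz


def samples_needed(duration_s: float, freq_hz: float) -> int:
    """Smallest sample count (the -n value) covering `duration_s` at `freq_hz`."""
    if duration_s <= 0 or freq_hz <= 0:
        raise ValueError("duration_s and freq_hz must be positive")
    return math.ceil(duration_s * freq_hz)
```

For the example above, `capture_window_seconds(1200, 3)` gives 400.0 seconds. Going the other way, `samples_needed(90, 3)` gives 270, the -n value for a 90 second capture at the same low rate.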
There are some scripts to help set this up in the openbmc-tools repo.
perf can be (and has been) very valuable for finding bottlenecks in the kernel, especially in drivers custom written for OpenBMC.
Useful perf operations are:
- perf record -a -g sleep 10 && perf report --call-graph
- perf sched record && perf sched latency
- perf timechart record && perf timechart
pyflame is a tool from Uber that generates flamegraphs of Python processes. Patches porting it to ARM and dealing with prelinked binaries have been sent upstream, as has a series to integrate pyflame into meta-openembedded. In the meantime, @amboar has a tree integrating the whole lot into the debug tools tarball.