The information contained in this article has been verified as up-to-date on the date of the original publication of the article. HashiCorp endeavors to keep this information up-to-date and correct, but it makes no representations or warranties of any kind, express or implied, about the ongoing completeness, accuracy, reliability, or suitability of the information provided.
All information contained in this article is for general information purposes only. Any reliance you place on such information as it applies to your use of your HashiCorp product is therefore strictly at your own risk.
Introduction
pprof data contains valuable CPU, memory, goroutine, and trace profiles for the Consul agent, which are essential for debugging performance issues such as high CPU and/or memory consumption, as well as for analyzing the Consul agent's behavior. When you execute the consul debug command, it captures the profile, heap, and goroutine statistics in an archive file. The pprof tool can then generate both text and graphical reports from this data.
pprof data has proven highly valuable for investigating performance issues at the agent level. We strongly recommend attaching this data to your support ticket whenever you encounter performance-related problems.
Instructions
Below are the steps to read and visualize the data using go tool pprof.
Generating the files
- Begin by collecting these profiles from the consul debug command (see the example after this list). This documentation explains how to execute this command.
- After extracting the debug archive file, you will have a data directory with the following structure:
├── 2023-07-11T02-54-32Z
│ ├── goroutine.prof
│ └── heap.prof
├── agent.json
├── consul.log
├── host.json
├── index.json
├── members.json
├── metrics.json
├── profile.prof
└── trace.out
- To analyze the profiling data, focus on profile.prof, which contains CPU information.
- heap.prof provides in-use memory allocations.
- goroutine.prof contains stack traces of all current goroutines.
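For reference, a typical invocation might look like the following; the duration and output path are illustrative, and the linked documentation covers all available options:
$ consul debug -duration=2m -output=consul-debug
$ tar -xzf consul-debug.tar.gz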
Visualize pprof in the web interface
- Run the go tool pprof command with the option -http=[host]:[port]:
$ go tool pprof -http=localhost:8080 profile.prof
Serving web UI on http://localhost:8080
- The above example serves a web UI on http://localhost:8080 and opens it in your local web browser.
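The same workflow applies to the other profiles in the archive. For example, to inspect the heap profile in the web UI (the port here is arbitrary):
$ go tool pprof -http=localhost:8081 heap.prof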
Graph View
The web UI will show a graph of which Go functions and call paths consume the most CPU resources. Here are some quick tips to help you interpret the graph based on node color and font size:
Node Colors:
- Red nodes: Functions or code areas with a large positive overall impact
- Green nodes: Functions or code areas with a large negative overall impact (negative values appear only when comparing two profiles)
- Grey nodes: Functions or code areas with a minimal impact
Node Font Size:
- Larger font size: Indicates functions or code areas with higher absolute time or memory usage
- Smaller font size: Indicates functions or code areas with lower absolute time or memory usage
For more in-depth information, you can visit the pprof documentation on GitHub.
Flame Graphs
You can switch to different views from the "VIEW" button, such as the Flame graph view.
Flame graphs are visual representations of CPU profiler output, showing how functions are called and their CPU time.
- The boxes represent functions, and the width of a box indicates how frequently that function appeared in the stack traces.
- Wider boxes mean the function was more active in the program.
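In recent Go versions, each view in the web UI is also reachable at its own path, so you can jump straight to the flame graph. Assuming the server from the earlier example is still running on port 8080, the URL would be:
http://localhost:8080/ui/flamegraph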
In the below example, pbpeering._PeeringService_TrustBundleListByService_Handler was identified as the component responsible for consuming approximately 43% of CPU resources at the time the samples were captured by the consul debug command.
This finding provides an explanation for the high CPU resource utilization observed in the Consul agent.
Analyze pprof from the command prompt
- If you have a machine that only provides a terminal, you can read the profile data from the CLI by omitting the -http option, which takes you directly to an interactive command prompt. This allows you to work with the profile data in the terminal without opening a web interface:
$ go tool pprof profile.prof
File: consul
Type: cpu
Time: Jul 11, 2023 at 12:52pm (AEST)
Duration: 122.12s, Total samples = 1701.01s (1392.86%)
Entering interactive mode (type "help" for commands, "o" for options)
(pprof)
- There is a list of commands available for exploring the data, such as top, tree, etc. For example, if you type top5 at the command prompt, the output will show the top five CPU consumers. To sort by the fourth and fifth columns, use the -cum (for cumulative) flag:
(pprof) top5 -cum
Showing nodes accounting for 3.78s, 0.22% of 1701.01s total
Dropped 2032 nodes (cum <= 8.51s)
Showing top 5 nodes out of 363
flat flat% sum% cum cum%
0.74s 0.044% 0.044% 824.53s 48.47% google.golang.org/grpc.(*Server).serveStreams.func1.2
0.69s 0.041% 0.084% 822.63s 48.36% google.golang.org/grpc.(*Server).handleStream
1.57s 0.092% 0.18% 820.01s 48.21% google.golang.org/grpc.(*Server).processUnaryRPC
0.40s 0.024% 0.2% 731.51s 43.00% github.com/hashicorp/consul/proto/pbpeering._PeeringService_TrustBundleListByService_Handler
0.38s 0.022% 0.22% 714.75s 42.02% github.com/grpc-ecosystem/go-grpc-middleware/recovery.UnaryServerInterceptor.func1
- Additionally, you can generate visualization graphs in PNG, SVG, or GIF format, similar to what you previously saw in the web browser. This allows you to capture and save visual representations of the profiling data for further analysis or sharing:
(pprof) png
Generating report in profile001.png
- You can use the same commands to generate both text and graphical reports from the heap and goroutine profiles.
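pprof also accepts the report format as a command-line flag, so you can generate a report without entering interactive mode. For example, to render the heap profile directly to a PNG file:
$ go tool pprof -png heap.prof > heap.png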
- In the following example, the output tells us that github.com/hashicorp/go-immutable-radix.(*Txn).insert is directly consuming 8.50MB of in-use memory:
$ go tool pprof heap.prof
File: consul
Type: inuse_space
Time: Jul 11, 2023 at 12:52pm (AEST)
Entering interactive mode (type "help" for commands, "o" for options)
(pprof) top5
Showing nodes accounting for 34.34MB, 33.05% of 103.89MB total
Dropped 141 nodes (cum <= 0.52MB)
Showing top 5 nodes out of 256
flat flat% sum% cum cum%
8.50MB 8.18% 8.18% 16.50MB 15.88% github.com/hashicorp/go-immutable-radix.(*Txn).insert
7MB 6.74% 14.92% 7MB 6.74% github.com/hashicorp/go-immutable-radix.(*Txn).writeNode
6.82MB 6.57% 21.49% 6.82MB 6.57% bufio.NewWriterSize
6.51MB 6.27% 27.75% 6.51MB 6.27% github.com/DataDog/datadog-go/statsd.newStatsdBuffer
5.51MB 5.30% 33.05% 5.51MB 5.30% runtime.allocm
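By default, the heap profile reports inuse_space (memory currently in use). Go heap profiles also record other sample types, which you can select with the -sample_index flag; for example, to view cumulative allocations instead of live memory:
$ go tool pprof -sample_index=alloc_space heap.prof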
- Similarly, the goroutine profile reports the stack traces of all current goroutines, which is useful for seeing how many goroutines are currently running and what they are doing:
$ go tool pprof goroutine.prof
...
(pprof) top3 -cum
Showing nodes accounting for 2300, 98.21% of 2342 total
Dropped 366 nodes (cum <= 11)
Showing top 3 nodes out of 87
flat flat% sum% cum cum%
2300 98.21% 98.21% 2300 98.21% runtime.gopark
0 0% 98.21% 2122 90.61% runtime.selectgo
0 0% 98.21% 928 39.62% github.com/hashicorp/consul-net-rpc/net/rpc.(*Server).ServeRequest
In the example above, we can see that the rpc ServeRequest has created 928 goroutines. Among these goroutines, 109 were associated with the rpc readRequest, while a substantial 818 were linked to the execution of the function GetNetRPCInterceptor.
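To break a figure like this down further from the command prompt, you can use the peek command, which shows the callers and callees of functions matching a pattern (the function name below is taken from the output above):
(pprof) peek ServeRequest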