KEMBAR78
fgtrace package - github.com/felixge/fgtrace - Go Packages

fgtrace

package module
v0.2.0 Latest Latest
Warning

This package is not in the latest version of its module.

Go to latest
Published: Sep 21, 2022 License: MIT Imports: 11 Imported by: 4

README

fgtrace - The Full Go Tracer

ci test status documentation

fgtrace is an experimental profiler/tracer that is capturing wallclock timelines for each goroutine. It's very similar to the Chrome profiler.

⚠️ fgtrace may cause noticeable stop-the-world pauses in your applications. It is intended for dev and testing environments for now.

Quick Start

To capture an fgtrace of your program, simply add the one-liner shown below. This will cause the creation of a fgtrace.json file in the current working directory that you can view by opening it in the Perfetto UI.

package main

import "github.com/felixge/fgtrace"

func main() {
	defer fgtrace.Config{Dst: fgtrace.File("fgtrace.json")}.Trace().Stop()

	// <code to trace>
}

Alternatively you can configure fgtrace as a http.Handler and request traces on-demand by hitting http://localhost:1234/debug/fgtrace?seconds=30&hz=100.

package main

import (
	"net/http"
	"github.com/felixge/fgtrace"
)

func main() {
	http.DefaultServeMux.Handle("/debug/fgtrace", fgtrace.Config{})
	http.ListenAndServe(":1234", nil)
}

For more advanced use cases, have a look at the API Documentation.

Comparison with Similar Tools

Below is a simple program that spends its time sleeping, requesting a website, capturing the response body and then hashing it a few times.

for i := 0; i < 10; i++ {
	time.Sleep(10 * time.Millisecond)
}

res, err := http.Get("https://github.com/")
if err != nil {
	panic(err)
}
defer res.Body.Close()

var buf bytes.Buffer
if _, err := io.Copy(&buf, res.Body); err != nil {
	panic(err)
}

for i := 0; i < 1000; i++ {
	sha1.Sum(buf.Bytes())
}

Now let's have a look at how fgtrace and other tools allow you to understand the performance of such a program.

fgtrace

Looking at our main goroutine (G1), we can easily recognize the operations of the program, their order, and how long they are taking (~100ms time.Sleep, ~65ms http.Get, ~30ms io.Copying the response and ~300ms calling sha1.Sum to hash it).

However, it's important to note that this data is captured by sampling goroutine stack traces rather than actual tracing. Therefore fgtrace does not know that there were ten time.Sleep() function calls lasting 10ms each. Instead it just merges its samples into one big time.Sleep() call that appears to take 100ms.

Another detail are the virtual goroutine state indicators on top, e.g. sleep, select, sync.Cond.Wait and running/runnable. These are not part of the real stack traces and meant to help understanding On-CPU activity (running/runnable) vs Off-CPU states. You can disable them via configuration.

To break down the latency of our main goroutine, we can also look at other goroutines used by the program. E.g. below is a closer look on how the http.Get operation is broken down into resolving the IP address, connecting to it, and performing a TLS handshake.

So as you can see, fgtrace offers an intuitive, yet powerful way to understand the operation of Go programs. However, since it always captures the activity of all goroutines and has no information about how they communicate with each other, it may create overwhelming amounts of data in some cases.

fgprof

You can think of fgprof as a more simplified version of fgtrace. Instead of capturing a timeline for each goroutine, it aggregates the same data into a single profile as shown in the flame graph below.

This means that the x-axis represents duration rather than time, so function calls are ordered alphabetically rather than chronologically. E.g. notice how time.Sleep is shown after sha1.Sum in the graph above even so it's the first operation completed by our program.

Additionally the data of all goroutines ends up in the same graph which can be difficult to read without having a good understanding of the underlaying code and number of goroutines that are involved.

Despite these disadvantages, fgprof may still be useful in certain situations where the detail provided by the timeline may be overwhelming and a simpler view of the average program behavior is desirable. Additionally fgprof under Go 1.19 has less negative impact on the performance of the profiled program than fgtrace.

runtime/trace

The runtime/trace package is a true execution tracer that is capable of capturing even more detailed information than fgtrace. However, it's mostly designed to understand the decisions made by the Go scheduler. So the default timeline is focused on how goroutines are scheduled onto the CPU (processors). This means only the sha1.Sum operation stands out in green, and full stack traces can only be seen by clicking on the individual scheduler activities.

The goroutine analysis view offers a more useful breakdown. Here we can see that our goroutine is spending 271ms in Execution on CPU, but it's not clear from this view alone that this is the sha1.Sum operation. Our networking activity (http.Get and io.Copy) gets grouped into Sync block rather than Network wait because the networking is done through channels via other goroutines. And our time.Sleep activity is shown as a grey component of the bar diagram, but not explicitly listed in the table. So while a lot of information is available here, it's difficult to interpret for casual users.

Last but not least it's possible to click on the goroutine id in the view above in order to see a timeline for the individual goroutine, as well as the other goroutines it is communicating with. However, the view is also CPU-centric, so remains difficult to understand the sleep and networking operations of our program.

That being said, some of the limitations of runtime/trace could probably be resolved with changes to the UI or converting the traces into a format that Perfetto UI can understand which might be a fun project for another time.

How it Works

The current implementation of fgtrace is incredibly hacky. It calls runtime.Stack() on a regular frequency (default 100 Hz) to capture textual stack traces of all goroutines and parses them using the gostackparse package. Each call to runtime.Stack() is a blocking stop-the-world operation, so it scales very poorly to programs using ten thousand or more goroutines.

After the data is captured, it is converted into the Trace Event Format which is one of the data formats understood by Perfetto UI.

The Future

fgtrace is mostly a "Do Things that Don't Scale" kind of project. If enough people like it, it will motivate me and perhaps others to invest into putting it on a solid technical foundation.

The Go team has previously declined the idea of adding wallclock profiling capabilities similar to fgprof (which is similar to fgtrace) to the Go project and is more likely to invest in runtime/trace going forward.

That being said, I still think fgtrace can help by:

  1. Showing the usefulness of stack-trace/wallclock focused timeline views in addition to the CPU-centric views used by runtime/trace to perhaps implement the future developement of the runtime tracer.
  2. Starting a conversation (link to GH issue will follow ...) to offer more powerful goroutine profiling APIs to allow user-space tooling like this to thrive without having to hack around the existing APIs while reducing their overhead.

License

fgtrace is licensed under the MIT License.

Documentation

Index

Examples

Constants

This section is empty.

Variables

This section is empty.

Functions

func File

func File(name string) io.WriteCloser

File is a helper for Config.Dst that returns an io.WriteCloser that creates and writes to the file with the given name.

func Writer

func Writer(w io.Writer) io.WriteCloser

Writer is a helper for for Config.Dst that returns an io.WriteCloser that writes to w and does nothing when Close() is called.

Types

type Config

type Config struct {
	// Hz determines how often the stack traces of all goroutines are captured
	// per second. WithDefaults() sets it to 99 Hz if it is 0.
	Hz int
	// IncludeSelf controls if the trace contains its own internal goroutines.
	// It's disabled by default because they are usually not of interest.
	IncludeSelf bool
	// StateFrames allows adding the state of goroutines as a virtual frame when
	// their stack traces are captured. WithDefaults() sets it to StateFramesRoot
	// if it is "".
	StateFrames StateFrames
	// Dst is the destination for traces created by calling Trace().
	// WithDefaults() sets it to File("fgtrace.json") if it is nil. Also see
	// Writer().
	Dst io.WriteCloser
	// HTTPDuration is the default duration for traces served via ServeHTTP().
	// WithDefaults() sets it to 30s if it is 0. It is ignored by Trace().
	HTTPDuration time.Duration
}

Config configures the capturing of traces as well as serving them via http. The zero value is a valid configuration.

func (Config) ServeHTTP

func (c Config) ServeHTTP(w http.ResponseWriter, r *http.Request)

ServeHTTP applies WithDefaults to c and serves a trace. The query parameters "hz" and "seconds" can be used to overwrite the defaults.

Example
// Serve traces via /debug/fgtrace endpoint
http.DefaultServeMux.Handle("/debug/fgtrace", Config{})
http.ListenAndServe(":1234", nil)

func (Config) Trace

func (c Config) Trace() *Trace

Trace applies WithDefaults to c and starts capturing a trace at c.Hz to c.Dst. Callers are responsible for calling Trace.Stop() to finish the trace.

Example
// Write trace to the default fgtrace.json file
defer Config{}.Trace().Stop()

// Write high-resolution trace
defer Config{Hz: 10000}.Trace().Stop()

// Write trace to a custom file
defer Config{Dst: File("path/to/myfile.json")}.Trace().Stop()

// Write trace to a buffer
var buf bytes.Buffer
defer Config{Dst: Writer(&buf)}.Trace().Stop()

func (Config) WithDefaults

func (c Config) WithDefaults() Config

WithDefaults returns a copy of c with default values applied as described in the type documentation. This is done automatically by Trace() and ServeHTTP(), but can be useful to log the effective configuration.

type StateFrames

type StateFrames string

StateFrames describes if and where virtual goroutine state frames are added.

const (
	// StateFramesRoot causes virtual goroutine state frames to be added at the
	// root of stack traces (e.g. above main).
	StateFramesRoot StateFrames = "root"
	// StateFramesLeaf causes virtual goroutine state frames to be added at the
	// leaf of stack traces.
	StateFramesLeaf StateFrames = "leaf"
	// StateFramesNo casuses no virtual goroutine state frames to be added to
	// stack traces.
	StateFramesNo StateFrames = "no"
)

type Trace

type Trace struct {
	// contains filtered or unexported fields
}

Trace represents a trace that is being captured.

func (*Trace) Stop

func (t *Trace) Stop() error

Stop stops the trace, calls Close() on the configured dst and returns nil on success. Calling Stop() more than once returns the previous error or an error indicating that the tracer has already been stopped.

Directories

Path Synopsis

Jump to

Keyboard shortcuts

? : This menu
/ : Search site
f or F : Jump to
y or Y : Canonical URL