-
Notifications
You must be signed in to change notification settings - Fork 937
nativemem - Native memory profiler #1064
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Native memory allocations profiling and memory leak reporting, via GOT patching.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great work!
Makefile
Outdated
build-test-java-libs: $(addsuffix /.build-test,$(TESTS)) | ||
%/.build-test: | ||
@mkdir -p $(TEST_LIB_DIR) | ||
@if [ -f test/test/$*/Makefile ]; then \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd leave everything in one root Makefile for now to keep things simple. IMO, recursive make is an overkill for running just one extra command. We can get back to this idea when the structure becomes more branched.
- macOS x64/arm64: [async-profiler-3.0-macos.zip](https://github.com/async-profiler/async-profiler/releases/download/v3.0/async-profiler-3.0-macos.zip) | ||
- Converters between profile formats: [converter.jar](https://github.com/async-profiler/async-profiler/releases/download/v3.0/converter.jar) | ||
(JFR to Flame Graph, JFR to pprof, collapsed stacks to Flame Graph) | ||
- Converters between profile formats: [converter.jar](https://github.com/async-profiler/async-profiler/releases/download/v3.0/converter.jar) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Two spaces intentionally meant a line break here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changed to a list.
--nativemem Generate native memory allocation profile | ||
--leak Only include memory leaks in nativemem |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Move them after --live
, since --live
relates to --alloc
as --leak
to --nativemem
.
Btw, will a single option that can be applied to either alloc
or nativemem
be better? (just an idea, I'm not proposing this).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could have alloc=native
, alloc=jvm
, alloc=both
options perhaps?
docs/ConverterUsage.md
Outdated
jfrconv --cpu foo.jfr | ||
# which is equivalent to: | ||
# jfrconv --cpu foo.jfr -o foo.html |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The example is wrong. Sorry I haven't noticed that before.
It should be jfrconv --cpu -o flamegraph foo.jfr foo.html
src/profiler.cpp
Outdated
if (VMStructs::_get_stack_trace != NULL) { | ||
// Object allocation in HotSpot happens at known places where it is safe to call JVM TI, | ||
// but not directly, since the thread is in_vm rather than in_native | ||
num_frames += getJavaTraceInternal(jvmti_frames + num_frames, frames + num_frames, _max_stack_depth); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
getJavaTraceInternal
must not be used for nativemem
, since it can allocate itself.
More generally, getJavaTraceInternal
must not be used ever. I have recently removed it completely.
src/mallocTracer.cpp
Outdated
return false; | ||
} | ||
|
||
CodeCache* lib = Profiler::instance()->findLibraryByName("libasyncProfiler"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's not necessarily called like this. A library can have an arbitrary name, especially when extracted as a temporary file. Use findLibraryByAddress
instead.
src/mallocTracer.cpp
Outdated
|
||
lib->mark( | ||
[](const char* s) -> bool { | ||
return strncmp(s, "_ZL11malloc_hook", 16) == 0 || strncmp(s, "_ZL11calloc_hook", 16) == 0 || strncmp(s, "_ZL12realloc_hook", 16) == 0 || strncmp(s, " _ZL9free_hook", 13) == 0; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Break expression into several lines for better readability.
return false; | ||
} | ||
|
||
MutexLocker ml(_patch_lock); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In theory, this may conflict with Hooks::patchLibraries()
which has its own lock.
It makes sense to unify these two functions (also, to reduce copy-paste).
Can be done in a follow-up PR.
src/mallocTracer.cpp
Outdated
event._address = (uintptr_t)address; | ||
event._size = 0; | ||
|
||
Profiler::instance()->recordSample(NULL, 0, MALLOC_SAMPLE, &event); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
free
calls are not sampled according to _interval
.
It may be too expensive to record stack traces for each free
. I therefore propose to record free
events without a stack trace.
src/mallocTracer.cpp
Outdated
} | ||
|
||
Error MallocTracer::check(Arguments& args) { | ||
#ifndef __linux__ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Try to avoid ifdef
s in shared code when possible.
OS::isLinux()
can help here (should be optimized by the compiler).
System.loadLibrary("doesmalloc"); | ||
} | ||
|
||
public static native long nativeMalloc(int size); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To reduce code duplication, malloc wrappers can be moved to a separate Java class.
src/profiler.cpp
Outdated
char mark; | ||
if (current_method_name != NULL && (mark = NativeFunc::mark(current_method_name)) != 0) { | ||
if (mark == MARK_VM_RUNTIME && event_type >= ALLOC_SAMPLE) { | ||
if ((mark == MARK_VM_RUNTIME || mark == MARK_ASYNC_PROFILER) && event_type >= MALLOC_SAMPLE) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For malloc events, it will be better to leave the first native frame to make it clear what function exactly was called (malloc/calloc/realloc).
long startTicks = args.from != 0 ? toTicks(args.from) : Long.MIN_VALUE; | ||
long endTicks = args.to != 0 ? toTicks(args.to) : Long.MAX_VALUE; | ||
boolean allowZeroes = !args.nativemem; | ||
IEventReader reader = eventFilter != null ? eventFilter : jfr; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Need to revise this approach - it will not work for multi-chunk JFRs.
An address map should persist between chunks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IEventReader
instance persists between collectEvents
calls, as it is created in the JfrConverter
ctor.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Went with a different design around the EventAggregator
class.
Addressed comments that have reactions or replies. |
Makefile
Outdated
|
||
build-test-java-libs: | ||
@mkdir -p $(TEST_LIB_DIR) | ||
$(CC) -shared -fPIC $(INCLUDES) -I$(realpath src) -o $(TEST_LIB_DIR)/libdoesmalloc.$(SOEXT) test/test/nativemem/doesmalloc.c |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why realpath
? Other targets are compiled fine without relative paths.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Was useful in the multi-makefile version, removing.
Makefile
Outdated
JAVA_TARGET=8 | ||
JAVAC_OPTIONS=--release $(JAVA_TARGET) -Xlint:-options | ||
|
||
TEST_LIB_DIR=$(abspath build/test/lib) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Doesn't it work with relative path?
Makefile
Outdated
PROFILER_VERSION := $(PROFILER_VERSION)-$(COMMIT_TAG) | ||
endif | ||
|
||
SOEXT ?= so |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
SOEXT
is already defined appropriately depending on ifeq ($(OS),Darwin)
Makefile
Outdated
echo "Running tests against $(LIB_PROFILER)" | ||
$(JAVA) $(TEST_FLAGS) -ea -cp "build/test.jar:build/jar/*:build/lib/*" one.profiler.test.Runner $(TESTS) | ||
|
||
LD_LIBRARY_PATH="$(TEST_LIB_DIR)" $(JAVA) $(TEST_FLAGS) -ea -cp "build/test.jar:build/jar/*:build/lib/*" one.profiler.test.Runner $(TESTS) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LD_LIBRARY_PATH
is Linux-specific. Use -Djava.library.path
instead.
test/test/nativemem/doesmalloc.c
Outdated
} | ||
|
||
JNIEXPORT void JNICALL Java_test_nativemem_Native_free(JNIEnv* env, jclass clazz, jlong addr) { | ||
free((void*)addr); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this compile on a 32-bit system? I guess an intermediate conversion to intptr_t
is needed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we have a more realistic flamegraph?
I used to show native memory leaks on this example (strictly speaking, it's not fully a leak, since finalizers are supposed to free native resources, but in practice, finalizers can be very unpredictable).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changed with a "ByteBuffer leak while reading a stream" scenario.
docs/ProfilingModes.md
Outdated
|
||
## Native memory leaks | ||
|
||
It is possible to profile `malloc` and `mmap` calls in Java context before, not always helpful. A large amount of allocations does not yet mean a leak, in case all the allocated memory is released on time. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Something wrong with the first sentence, can you rephrase it?
docs/ProfilingModes.md
Outdated
``` | ||
asprof start --nativemem[=N] <YourApp> | ||
asprof stop -f <app.jfr> <YourApp> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With JFR output, filename must be given in start
command.
test/test/nativemem/Native.java
Outdated
|
||
public class Native { | ||
static { | ||
System.loadLibrary("doesmalloc"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unobvious name. Maybe jnimalloc
?
} | ||
|
||
#ifdef __linux__ | ||
extern "C" WEAK DLLEXPORT void* malloc(size_t size) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does nativemem profiler work with non-Java apps if async-profiler is LD_PRELOAD'ed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, tested manually.
Very nice, thanks for contributing this |
im_pthread_exit, | ||
im_pthread_setspecific, | ||
im_poll, | ||
im_malloc, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In case it is helpful, here are some of the functions we had to hook: https://github.com/DataDog/ddprof/blob/bccb05d89ec9e8473cd7f4f26d4e4948a70839c8/src/lib/symbol_overrides.cc#L852-L899
Though I realize you might not need to have a full coverage for now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the link. We could implement these in another iteration or prioritize based on customer requests.
} | ||
} | ||
|
||
void MallocTracer::recordFree(void* address) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Am I correct in saying we do not have a sampling decision here (as we do not want to miss matching frees) ?
Here we proposed to record only free events for tracked addresses. However that meant having a lock free data structure containing tracked addresses.
The ideal solution might be to have more control over the allocator being used, to embed such a logic within the allocated object.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, we do not have a sampling decision for free
and wanted to avoid having to keep a list of allocation addresses.
void* result = dlopen(filename, flags); | ||
if (result != NULL) { | ||
instance()->updateSymbols(false); | ||
MallocTracer::installHooks(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
While relying on a similar strategy, I recall we were not able to catch all loaded libraries. We eventually decided to put a timer that would trigger and recheck loaded libraries every X seconds (in addition to the dlopen hook). If this is of interest, I can find more details.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be interesting to understand when this can happen.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I found the commit thanks to @nsavoire. A private symbol can be used to dlopen __libc_dlopen_mode
.
DataDog/ddprof@d0bb6c6
Exciting to see memory profiling as part of async profiler! 👏 I mentioned a few of the gotchas we encountered while implementing something similar in case this can be of interest. |
docs/ProfilingModes.md
Outdated
|
||
``` | ||
asprof start -e nativemem -f app.jfr <YourApp> | ||
# asprof start --nativemem=N -f app.jfr <YourApp> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
--nativemem N
(no =)
public abstract class JfrConverter extends Classifier { | ||
protected final JfrReader jfr; | ||
protected final Arguments args; | ||
private final IEventAggregator leakAggregator; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It feels like a leaky abstraction (pun intended): we pretend it to be some abstract aggregator, although we know it can be only MallocLeakAggregator.
leakAggregator.coarsen(args.grain); | ||
} | ||
|
||
convertChunk(leakAggregator); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
An illustration to the above comment: we have a special case for leakAggregator, but trying to deal with it as with some generic aggregator.
Can we design it in a way that we either have a common sequence for any aggregator or don't pretend that leak processor is an aggregator at all and just make leak processor explicitly a different thing?
For example, to consolidate EventAggregator (currently created for each new chunk) and LeakAggegator (a single instance for the whole process) we may probably replace creation of a new EventAggregator with clearing/resetting the same instance of EventAggregator. (Just as an idea; I'm open for other options too).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removing the leak and special-cased invocations.
} | ||
|
||
public void collect(Event e, long samples, long value) { | ||
protected void collect(Event e, long samples, long value) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a public API that is currently used in other projects.
import java.util.ArrayList; | ||
import java.util.Collections; | ||
|
||
public abstract class SortedEventAggregator extends EventAggregator { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Redundant abstractions, IMO.
There are currently only two aggregators. If we go this way, we can just make MallocLeakAggregator a subclass of EventAggregator.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Folding SortedEventAggregator into MallocLeakAggregator.
src/mallocTracer.cpp
Outdated
|
||
__attribute__((constructor)) static void getOrigAddresses() { | ||
// Set malloc and calloc which may be called from libc during getOrigAddresses. | ||
_orig_malloc = OS::safeAlloc; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So is it really necessary? I remember you were saying allocation in dlsym only happens on error path, and returning NULL does not break anything.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For a specific libc version yes. Removing to see test results.
#define __ASSERT_OR_CHECK_OP(isAssert, val1, op, val2) \ | ||
{ \ | ||
_Pragma("GCC diagnostic push"); \ | ||
_Pragma("GCC diagnostic ignored \"-Waddress\""); \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is getting more and more intricate.
Would not it be better to have a special version of macros for strings?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could split it at some point.
this.jfr = jfr; | ||
this.args = args; | ||
|
||
IEventAggregator agg = new EventAggregator(args.threads, args.total, args.lock ? 1e9 / jfr.ticksPerSec : 1.0); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ticksPerSec
is only available after reading chunk header. In theory, it may be different for different chunks.
Actually, this value is required only during post-processing (in forEach), so it probably makes sense to move it out of constructor.
args.live ? LiveObject.class : | ||
args.alloc ? AllocationSample.class : | ||
args.lock ? ContendedLock.class : ExecutionSample.class; | ||
protected IEventAggregator collectEvents() throws IOException { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Previously, collectEvents
created and owned an aggregator. Now it always uses a singleton created in a constructor. There is no sense in returning an aggregator then?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Similarly, convertChunk
does not need an acceptor
argument.
protected IEventAggregator collectEvents() throws IOException { | ||
Class<? extends Event> eventClass = args.live ? LiveObject.class | ||
: args.alloc ? AllocationSample.class | ||
: args.lock ? ContendedLock.class : args.nativemem ? MallocEvent.class : ExecutionSample.class; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IMO, previous formatting was more readable.
|
||
// millis can be an absolute timestamp or an offset from the beginning/end of the recording | ||
// millis can be an absolute timestamp or an offset from the beginning/end of | ||
// the recording |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems like an automatic reformatting. In general, prefer to avoid reformatting code unrelated to the PR.
} | ||
|
||
public void resetChunk() { | ||
this.fraction = 0; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fraction
is relevant only for coarsen
method - I'd reset it in the beginning of coarsen
instead.
|
||
package one.jfr.event; | ||
|
||
public interface IEventAcceptor { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I understand why you added this, but imo this is unnecessary level of abstraction.
Fortunately, as per my other comment, the only use of IEventAcceptor
(and therefore IEventAcceptor
itself) can be safely removed.
src/mallocTracer.cpp
Outdated
|
||
#ifdef __linux__ | ||
extern "C" WEAK DLLEXPORT void* malloc(size_t size) { | ||
if (unlikely(!_orig_malloc)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we check MallocTracer::initialized()
first? (to make less comparisons on the fast path)
src/profiler.h
Outdated
# ifndef likely | ||
# define likely(x) (__builtin_expect(!!(x), 1)) | ||
# endif | ||
|
||
# ifndef unlikely | ||
# define unlikely(x) (__builtin_expect(!!(x), 0)) | ||
# endif |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
profiler.h
is probably not the right place for these macros, they are only used in mallocTracer.cpp
, you can put them there. Or, to make them available for a wider use, arch.h
will fit better.
} | ||
} | ||
|
||
return new ArrayList<>(addresses.values()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The list of "filtered" events is not actually used, is it? Therefore, creating ArrayList
seems redundant.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is used in multi-chunk jfrs, filtered events will be sorted together with new events, before filtering again.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But why? Timestamps of malloc events from different chunks should not overlap. Do they?
eventAggregator.finishChunk(); | ||
|
||
if (args.grain > 0) { | ||
agg.coarsen(args.grain); | ||
eventAggregator.coarsen(args.grain); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
finishChunk
and coarsen
can be moved out of collectEvents
.
convertChunk(eventAggregator); | ||
eventAggregator.resetChunk(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From the design perspective, it looks strange that convertChunk/resetChunk
are called again after finish
.
I understand they are needed only for MallocLeakAggregator
to process the final filtered list, but I think this should be somehow hidden in finish
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Moving them to their own method, to make the intent clearer.
Signed-off-by: Andrei Pangin <1749416+apangin@users.noreply.github.com>
I've pushed a fix for recording timestamps correctly. Other than that, PR looks good. There is still one question about filtering events after each chunk, but it's not a blocker right now. Merged. Thanks a lot! |
@linking12 Yes, please try a nightly build. |
Native memory allocations profiling and memory leak reporting, via GOT patching.
There is a new argument in the profiler
nativemem
, andjfrconv
gains native memory tracking and memory leak reporting.Usage
New argument
--nativemem [N]
/nativemem[=N]
is used to enable theMallocTracer
engine.It can profile
malloc
,calloc
,realloc
andfree
invocations that are dynamically linked to the target.When
N
is specified, only one allocation everyN
bytes is sampled.Profiling with
nativemem
A flame graph can be generated directly as the profiler output option, which includes all sampled malloc calls. Alternatively, use
jfrconv
to process a jfr file and chart only the memory leaks:Memory leak analysis via
jfrconv
Generated flame graphs unit is in bytes if
total
is specified, otherwise in number of invocations.asprof
If the profiler is attached via
asprof
,nativemem
profiling can be started and stopped multiple times for the same process as necessary.How has this been tested?
malloc
functions with and without multi-threading.Sample outputs
All allocations
From a test application.
Memory leaks only
Profile of the same application, processed via
jfrconv
.By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.