KEMBAR78
Support posix_memalign and aligned_alloc in the native memory tracker by krk · Pull Request #1130 · async-profiler/async-profiler · GitHub
Skip to content

Conversation

@krk
Copy link
Contributor

@krk krk commented Feb 7, 2025

In addition to malloc, calloc, realloc and free; now posix_memalign and aligned_alloc are supported for tracking native memory allocations.

Related issues

#1121

How has this been tested?

New nativemem test added with posix_memalign and aligned_alloc usage.


By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

}

extern "C" void* aligned_alloc_hook(size_t alignment, size_t size) {
void* ret = _orig_aligned_alloc(alignment, size);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If my understanding is correct, musl implementation of aligned_alloc calls malloc under the hood; in turn, posix_memalign calls aligned_alloc. Can you verify these are not double counted?

Copy link
Contributor Author

@krk krk Feb 7, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. posix_memalign triggers posix_memalign_hook as well as aligned_alloc_hook.
  2. aligned_alloc only triggers aligned_alloc_hook.

Tested on Alpine 3.21:

profiler.Malloc {
  startTime = 17:26:33.908 (2025-02-07)
  address = 0x7F3358D43660
  size = 28.6 MB
  eventThread = "main" (osThreadId = 355)
  stackTrace = [
    libasyncProfiler.so.Profiler::getNativeTrace() line: 0
    libasyncProfiler.so.Profiler::recordSample() line: 0
    libasyncProfiler.so.MallocTracer::recordMalloc() line: 0
    libasyncProfiler.so.aligned_alloc_hook() line: 0
    ld-musl-x86_64.so.1.posix_memalign() line: 0
    main.main() line: 0
    ./lib/ld-musl-x86_64.so.1() line: 0
  ]
}

profiler.Malloc {
  startTime = 17:26:33.908 (2025-02-07)
  address = 0x7F3358D43660
  size = 28.6 MB
  eventThread = "main" (osThreadId = 355)
  stackTrace = [
    libasyncProfiler.so.Profiler::getNativeTrace() line: 0
    libasyncProfiler.so.Profiler::recordSample() line: 0
    libasyncProfiler.so.MallocTracer::recordMalloc() line: 0
    libasyncProfiler.so.posix_memalign_hook() line: 0
    main.main() line: 0
    ./lib/ld-musl-x86_64.so.1() line: 0
  ]
}

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, thanks. This needs to be fixed then?

}

JNIEXPORT jlong JNICALL Java_test_nativemem_Native_alignedAlloc(JNIEnv* env, jclass clazz, jlong alignment, jlong size) {
return (jlong)aligned_alloc((size_t)alignment, (size_t)size);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interim intptr_t cast needed for 32-bit?

cc->patchImport(im_realloc, (void*)realloc_hook);
cc->patchImport(im_free, (void*)free_hook);
cc->patchImport(im_posix_memalign, (void*)posix_memalign_hook);
cc->patchImport(im_aligned_alloc, (void*)aligned_alloc_hook);
Copy link
Member

@apangin apangin Feb 7, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't forget to add new hooks to MallocTracer::initialize()

Comment on lines 43 to 44
static thread_local bool in_musl_posix_memalign = false;
static thread_local bool in_musl_aligned_alloc = false;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thread_local may allocate (using malloc!), so it's problematic to use thread_local in malloc hooks.
Why not just skip these hooks entirely on musl? Will malloc_hook catch these functions?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In musl, the next-gen aligned_alloc implementation no longer relies on malloc, unlike the current-gen version, which uses malloc in certain cases.

As of 2012, posix_memalign has consistently called aligned_alloc instead of using malloc, so hooking posix_memalign in musl is unnecessary.

However, since aligned_alloc no longer necessarily calls malloc, it still requires hooking to ensure proper behavior.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wait, new aligned_alloc does call malloc, am I wrong?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, yes. I haven't seen this in jfr profiles of aligned_alloc calls. Need to check this again.

MARK_ASYNC_PROFILER);

// _CS_GNU_LIBC_VERSION is not defined on musl
is_musl = confstr(_CS_GNU_LIBC_VERSION, NULL, 0) == 0 && errno != 0;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To avoid code duplication, can we move this logic to os_linux.cpp and expose OS::isMusl()?

// If in_musl_aligned_alloc, do not recordMalloc in malloc_hook.
// Do not access thread_local variables in non-musl.
static bool is_musl = false;
static thread_local bool in_musl_posix_memalign = false;

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We hit several issues with deadlocks and thread locals.
Our solution was to control the creation and deletion of the thread local variables with get/set_specific.
That implies hooking thread creation to ensure you can create these states. Though I think you already have such hooks.
We also added guards to avoid re-entrance (this avoids double counting mentioned bellow).
I'm not sure if this is preferable to your approach, though feel free to compare.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, we are looking into a simpler implementation that does not avoid double-accounting, yet still has correct leak detection.

@apangin apangin merged commit 3beae04 into async-profiler:master Feb 13, 2025
10 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants