KEMBAR78
Build and run tests on Alpine by fandreuz · Pull Request #1226 · async-profiler/async-profiler · GitHub
Skip to content

Conversation

fandreuz
Copy link
Contributor

@fandreuz fandreuz commented Apr 11, 2025

Description

In this PR I add a new Alpine build/test target.

Related issues

Fixes #1136

Motivation and context

Find musl-specific issues early.

How has this been tested?

GHA


By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

@fandreuz
Copy link
Contributor Author

fandreuz commented Apr 11, 2025

Async-Profiler complains about missing debug symbols: Install JVM debug symbols to improve profile accuracy

Can we base the new target on zulu-openjdk-alpine? It seems that people are already using AP on that image: zulu-openjdk/zulu-openjdk#76

@apangin
Copy link
Member

apangin commented Apr 11, 2025

Why not Corretto then? It also has Alpine images

@fandreuz
Copy link
Contributor Author

Why not Corretto then? It also has Alpine images

Missed that, thanks

@fandreuz
Copy link
Contributor Author

fandreuz commented Apr 11, 2025

/async-profiler # grep test.nativemem.Native.malloc output.txt
test/nativemem/CallsMallocCalloc.main;test/nativemem/Native.malloc;Java_test_nativemem_Native_malloc;malloc_hook 9999965

/async-profiler # grep test.nativemem.Native.calloc output.txt
test/nativemem/CallsMallocCalloc.main;test/nativemem/Native.calloc;calloc;malloc_hook 10000735

Java_test_nativemem_Native_calloc is missing in the traces for calloc events. It seems that on Alpine StackWalker::walkFp does not properly jump from calloc to the previous stack frame:

(gdb) bt
#0  StackWalker::walkFP (ucontext=0x0, callchain=0x7f57fa22aeb0, max_depth=128, java_ctx=0x7f57fa22b360) at src/stackWalker.cpp:103
#1  0x00007f57fa0218c5 in Profiler::getNativeTrace (this=0x7f57f9fc4330, ucontext=0x0, frames=0x7f57914a1160, event_type=MALLOC_SAMPLE, tid=607, java_ctx=0x7f57fa22b360) at src/profiler.cpp:341
#2  0x00007f57fa022d3a in Profiler::recordSample (this=0x7f57f9fc4330, ucontext=0x0, counter=2000147, event_type=MALLOC_SAMPLE, event=0x7f57fa22b3a0) at src/profiler.cpp:655
#3  0x00007f57fa01c4ea in MallocTracer::recordMalloc (address=0x7f57904ea170, size=2000147) at src/mallocTracer.cpp:160
#4  0x00007f57fa01bfa0 in malloc_hook (size=2000147) at src/mallocTracer.cpp:52
#5  0x00007f57fb582ee0 in calloc (m=<optimized out>, n=2000147) at src/malloc/calloc.c:40
#6  0x00007f57910b520a in Java_test_nativemem_Native_calloc () from /async-profiler/build/test/lib/libjnimalloc.so
#7  0x00007f57e01cda10 in ?? ()
#8  0x00007f57906d30c0 in ?? ()
#9  0x0000000000000000 in ?? ()

(gdb) print pc
$27 = (const void *) 0x7f57fb582ee0 <calloc+32>
(gdb) next
(gdb) print pc
$28 = (const void *) 0x7f57e01cda10

The new pc points to the instruction just before Java_test_nativemem_Native_calloc in the stack trace.

@apangin
Copy link
Member

apangin commented Apr 11, 2025

Right, it seems that Alpine libc in this image does not preserve frame pointers, but even worse, it does not include DWARF unwinding info either. I see the following options:

  • Switch to cstack=dwarf and install debuginfo package that includes dwarf unwind info.
  • Find another Alpine image where libc is compiled with frame pointers and/or dwarf unwind info.
  • Modify test to look for different frames.

@fandreuz
Copy link
Contributor Author

Right, it seems that Alpine libc in this image does not preserve frame pointers, but even worse, it does not include DWARF unwinding info either. I see the following options:

  • Switch to cstack=dwarf and install debuginfo package that includes dwarf unwind info.
  • Find another Alpine image where libc is compiled with frame pointers and/or dwarf unwind info.
  • Modify test to look for different frames.

I noticed that the following change:

diff --git a/src/mallocTracer.cpp b/src/mallocTracer.cpp
index a7d0724..9082754 100644
--- a/src/mallocTracer.cpp
+++ b/src/mallocTracer.cpp
@@ -56,7 +56,7 @@ extern "C" void* malloc_hook(size_t size) {
 
 extern "C" void* calloc_hook(size_t num, size_t size) {
     void* ret = _orig_calloc(num, size);
-    if (MallocTracer::running() && ret && num && size) {
+    if (!OS::isMusl() && MallocTracer::running() && ret && num && size) {
         MallocTracer::recordMalloc(ret, num * size);
     }
     return ret;
@@ -140,11 +140,11 @@ void MallocTracer::patchLibraries() {
         cc->patchImport(im_realloc, (void*)realloc_hook);
         cc->patchImport(im_free, (void*)free_hook);
         cc->patchImport(im_aligned_alloc, (void*)aligned_alloc_hook);
+        cc->patchImport(im_calloc, (void*)calloc_hook);
 
         if (!OS::isMusl()) {
             // On musl, calloc() calls malloc() internally, and posix_memalign() calls aligned_alloc().
             // Skip the following hooks to prevent double-accounting.
-            cc->patchImport(im_calloc, (void*)calloc_hook);
             cc->patchImport(im_posix_memalign, (void*)posix_memalign_hook);
         }
     }

fixes the problem:

/async-profiler # grep test.nativemem.Native.calloc x.txt
test/nativemem/CallsMallocCalloc.main;test/nativemem/Native.calloc;Java_test_nativemem_Native_calloc;calloc;malloc_hook 10000735

but I can't understand why.

@apangin
Copy link
Member

apangin commented Apr 12, 2025

This is because calloc_hook frame (which is a part of async-profiler and is compiled with frame pointers) saves frame pointer of the caller - Java_test_nativemem_Native_calloc. Without calloc_hook, calloc is called directly, and there is no frame link between Java_test_nativemem_Native_calloc and calloc.

@fandreuz
Copy link
Contributor Author

This is because calloc_hook frame (which is a part of async-profiler and is compiled with frame pointers) saves frame pointer of the caller - Java_test_nativemem_Native_calloc. Without calloc_hook, calloc is called directly, and there is no frame link between Java_test_nativemem_Native_calloc and calloc.

I see, thanks. Could this also be a solution? We could have two different versions of calloc_hook, a dummy musl-specific one, and one for the other cases.

@apangin
Copy link
Member

apangin commented Apr 14, 2025

Sounds interesting. This may be a simple and effective way to solve the problem.

@krk
Copy link
Contributor

krk commented Apr 24, 2025

please ping here when prerequisite PRs are merged.

@fandreuz
Copy link
Contributor Author

We're only missing #1254 here

make COMMIT_TAG=${GITHUB_SHA:0:7} FAT_BINARY=true release coverage -j
;;
alpine*)
make COMMIT_TAG=$HASH release coverage -j
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is $HASH for this case and ${GITHUB_SHA:0:7} for other?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

git merge troubles, fixed in 4618514

java-version: 11
java-distribution: corretto
image: "public.ecr.aws/async-profiler/asprof-builder-arm:latest"
image: public.ecr.aws/async-profiler/asprof-builder-arm:latest
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On Alpine, we need to test binaries that were built on Debian.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

name: ubuntu-latest
image: public.ecr.aws/async-profiler/asprof-builder-x86:latest
- runson:
display: alpine
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will artifacts from this build be published in a release? They should not.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought Alpine should have its own artifacts, fixed in 91e17e2

make COMMIT_TAG=${GITHUB_SHA:0:7} FAT_BINARY=true release coverage -j
make COMMIT_TAG=$HASH FAT_BINARY=true release coverage -j
;;
alpine*)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we no longer build on alpine, this case is not needed?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, and same for setup-java a bit above, thanks: 2cd2700

@apangin apangin merged commit d2c85c1 into async-profiler:master Apr 28, 2025
15 checks passed
krk pushed a commit to krk/async-profiler that referenced this pull request May 1, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Run tests on Alpine Linux in CI

3 participants