KEMBAR78
On JDK8, alloc profiling may cause heap corrupt and crash · Issue #694 · async-profiler/async-profiler · GitHub
Skip to content

On JDK8, alloc profiling may cause heap corrupt and crash #694

@yanglong1010

Description

@yanglong1010

hi, Andrei

One of our applications has been running stably for more than 3 years. Recently, we used async-profiler to continuously trace the cpu and alloc of the application, and found that it would crash every once in a while (it may be 1 hour after startup, or it may be 1 day later, the time is not very sure). If turn off alloc and only turn on cpu tracing, every thing goes well, never crash again.

java version "1.8.0_202"
Java(TM) SE Runtime Environment (build 1.8.0_202-b08)
Java HotSpot(TM) 64-Bit Server VM (build 25.202-b08, mixed mode)

OracleJDK hs_err_pid15474.log
OpenJDK hs_err_pid4529.log

OpenJDK hs_err_pid4529.log
RDI=0x00000000877bbf60 is pointing into object: 0x00000000877bbf58
java.util.concurrent.ConcurrentHashMap$Node

  • klass: 'java/util/concurrent/ConcurrentHashMap$Node'
#0  0x00007f8c550e81d7 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
#1  0x00007f8c550e98c8 in __GI_abort () at abort.c:90
#2  0x00007f8c549fe109 in os::abort (dump_core=<optimized out>) at /home/yibo.yl/openjdk/jdk8u/hotspot/src/os/linux/vm/os_linux.cpp:1565
#3  0x00007f8c54b97186 in VMError::report_and_die (this=this@entry=0x7f8c52f22920) at /home/yibo.yl/openjdk/jdk8u/hotspot/src/share/vm/utilities/vmError.cpp:1107
#4  0x00007f8c54a07bbf in JVM_handle_linux_signal (sig=11, info=0x7f8c52f22bb0, ucVoid=0x7f8c52f22a80, abort_if_unrecognized=<optimized out>) at /home/yibo.yl/openjdk/jdk8u/hotspot/src/os_cpu/linux_x86/vm/os_linux_x86.cpp:541
#5  0x00007f8c549fb5a8 in signalHandler (sig=11, info=0x7f8c52f22bb0, uc=0x7f8c52f22a80) at /home/yibo.yl/openjdk/jdk8u/hotspot/src/os/linux/vm/os_linux.cpp:4538
#6  <signal handler called>
#7  0x00007f8c547a1fe1 in oopDesc::size_given_klass (this=0x877bbf60, klass=0x474ae0140) at /home/yibo.yl/openjdk/jdk8u/hotspot/src/share/vm/oops/oop.inline.hpp:405
#8  0x00007f8c54796a16 in do_oop_work<unsigned int> (gc_barrier=true, root_scan=false, p=0x9593df14, this=0x7f8b380938f0) at /home/yibo.yl/openjdk/jdk8u/hotspot/src/share/vm/gc_implementation/parNew/parOopClosures.inline.hpp:118
#9  do_oop_nv (p=0x9593df14, this=0x7f8b380938f0) at /home/yibo.yl/openjdk/jdk8u/hotspot/src/share/vm/gc_implementation/parNew/parOopClosures.inline.hpp:139
#10 InstanceKlass::oop_oop_iterate_nv (this=0x100001610, obj=0x9593df08, closure=0x7f8b380938f0) at /home/yibo.yl/openjdk/jdk8u/hotspot/src/share/vm/oops/instanceKlass.cpp:2352
#11 0x00007f8c54a1e75e in oop_iterate (blk=0x7f8b380938f0, this=<optimized out>) at /home/yibo.yl/openjdk/jdk8u/hotspot/src/share/vm/oops/oop.inline.hpp:734
#12 ParScanThreadState::trim_queues (this=0x7f8b380937f0, max_size=20) at /home/yibo.yl/openjdk/jdk8u/hotspot/src/share/vm/gc_implementation/parNew/parNewGeneration.cpp:175
#13 0x00007f8c54a215e7 in do_oop_work<unsigned int> (root_scan=true, gc_barrier=true, p=0x89890404, this=<optimized out>) at /home/yibo.yl/openjdk/jdk8u/hotspot/src/share/vm/gc_implementation/parNew/parOopClosures.inline.hpp:125
#14 ParRootScanWithBarrierTwoGensClosure::do_oop (this=0x7f8b380939a0, p=0x89890404) at /home/yibo.yl/openjdk/jdk8u/hotspot/src/share/vm/gc_implementation/parNew/parNewGeneration.cpp:506
#15 0x00007f8c549da0e9 in do_oop_work<unsigned int> (this=<optimized out>, this=<optimized out>, p=0x89890404) at /home/yibo.yl/openjdk/jdk8u/hotspot/src/share/vm/memory/genOopClosures.hpp:165
#16 do_oop_nv (p=0x89890404, this=<optimized out>) at /home/yibo.yl/openjdk/jdk8u/hotspot/src/share/vm/memory/genOopClosures.hpp:176
#17 ObjArrayKlass::oop_oop_iterate_nv_m (this=<optimized out>, obj=0x8988a508, closure=0x7f8c52f23970, mr=...) at /home/yibo.yl/openjdk/jdk8u/hotspot/src/share/vm/oops/objArrayKlass.cpp:557
#18 0x00007f8c545f827b in oop_iterate (mr=..., blk=0x7f8c52f23970, this=0x8988a508) at /home/yibo.yl/openjdk/jdk8u/hotspot/src/share/vm/oops/oop.inline.hpp:734
#19 FreeListSpace_DCTOC::walk_mem_region_with_cl_par (this=0x7f8b280c1190, mr=..., bottom=0x8988a508, top=0x89890800, cl=0x7f8c52f23970) at /home/yibo.yl/openjdk/jdk8u/hotspot/src/share/vm/gc_implementation/concurrentMarkSweep/compactibleFreeListSpace.cpp:751
#20 0x00007f8c545f919d in FreeListSpace_DCTOC::walk_mem_region_with_cl (this=<optimized out>, mr=..., bottom=<optimized out>, top=<optimized out>, cl=<optimized out>) at /home/yibo.yl/openjdk/jdk8u/hotspot/src/share/vm/gc_implementation/concurrentMarkSweep/compactibleFreeListSpace.cpp:751
#21 0x00007f8c54ac2798 in Filtering_DCTOC::walk_mem_region (this=<optimized out>, mr=..., bottom=<optimized out>, top=<optimized out>) at /home/yibo.yl/openjdk/jdk8u/hotspot/src/share/vm/memory/space.cpp:226
#22 0x00007f8c54ac21dc in DirtyCardToOopClosure::do_MemRegion (this=0x7f8b280c1190, mr=...) at /home/yibo.yl/openjdk/jdk8u/hotspot/src/share/vm/memory/space.cpp:171
#23 0x00007f8c5457da06 in ClearNoncleanCardWrapper::do_MemRegion (this=this@entry=0x7f8c52f23b10, mr=...) at /home/yibo.yl/openjdk/jdk8u/hotspot/src/share/vm/memory/cardTableRS.cpp:212
#24 0x00007f8c54a18fa7 in CardTableModRefBS::process_stride (this=this@entry=0x7f8c4c0202d0, sp=sp@entry=0x7f8c4c022c90, used=..., stride=<optimized out>, n_strides=n_strides@entry=4, cl=cl@entry=0x7f8b380939a0, ct=ct@entry=0x7f8c4c020290, lowest_non_clean=0x7f8c3c14b560,
    lowest_non_clean_base_chunk_index=547815073504, lowest_non_clean_chunk_size=10697) at /home/yibo.yl/openjdk/jdk8u/hotspot/src/share/vm/gc_implementation/parNew/parCardTableModRefBS.cpp:159
#25 0x00007f8c54a1952b in CardTableModRefBS::non_clean_card_iterate_parallel_work (this=0x7f8c4c0202d0, sp=sp@entry=0x7f8c4c022c90, mr=..., cl=cl@entry=0x7f8b380939a0, ct=ct@entry=0x7f8c4c020290, n_threads=<optimized out>)
    at /home/yibo.yl/openjdk/jdk8u/hotspot/src/share/vm/gc_implementation/parNew/parCardTableModRefBS.cpp:74
#26 0x00007f8c5457d597 in CardTableModRefBS::non_clean_card_iterate_possibly_parallel (this=<optimized out>, sp=sp@entry=0x7f8c4c022c90, mr=..., cl=cl@entry=0x7f8b380939a0, ct=ct@entry=0x7f8c4c020290) at /home/yibo.yl/openjdk/jdk8u/hotspot/src/share/vm/memory/cardTableModRefBS.cpp:485
#27 0x00007f8c5457d809 in CardTableRS::younger_refs_in_space_iterate (this=0x7f8c4c020290, sp=0x7f8c4c022c90, cl=0x7f8b380939a0) at /home/yibo.yl/openjdk/jdk8u/hotspot/src/share/vm/memory/cardTableRS.cpp:311
#28 0x00007f8c54634b5e in ConcurrentMarkSweepGeneration::younger_refs_iterate (this=<optimized out>, cl=0x7f8b380939a0) at /home/yibo.yl/openjdk/jdk8u/hotspot/src/share/vm/gc_implementation/concurrentMarkSweep/concurrentMarkSweepGeneration.cpp:3199
#29 0x00007f8c5473e928 in GenCollectedHeap::gen_process_roots (this=this@entry=0x7f8c4c019c00, level=<optimized out>, younger_gens_as_roots=younger_gens_as_roots@entry=true, activate_scope=activate_scope@entry=false, so=so@entry=GenCollectedHeap::SO_ScavengeCodeCache,
    only_strong_roots=only_strong_roots@entry=false, not_older_gens=not_older_gens@entry=0x7f8b38093948, older_gens=older_gens@entry=0x7f8b380939a0, cld_closure=cld_closure@entry=0x7f8c52f23df0) at /home/yibo.yl/openjdk/jdk8u/hotspot/src/share/vm/memory/genCollectedHeap.cpp:740
#30 0x00007f8c54a1b88d in ParNewGenTask::work (this=0x7f8c5012a6b0, worker_id=<optimized out>) at /home/yibo.yl/openjdk/jdk8u/hotspot/src/share/vm/gc_implementation/parNew/parNewGeneration.cpp:629
#31 0x00007f8c54bad1ea in GangWorker::loop (this=0x7f8c4c01e800) at /home/yibo.yl/openjdk/jdk8u/hotspot/src/share/vm/utilities/workgroup.cpp:329
#32 0x00007f8c549fd222 in java_start (thread=0x7f8c4c01e800) at /home/yibo.yl/openjdk/jdk8u/hotspot/src/os/linux/vm/os_linux.cpp:840
#33 0x00007f8c55896dc5 in start_thread (arg=0x7f8c52f24700) at pthread_create.c:308
#34 0x00007f8c551aa76d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:113

(gdb) x/16wx 0x9593df08 (an oop)
0x9593df08: 0x00000007 0x00000000 0x200002c2 0x877bbf60
0x9593df18: 0x00000000 0x00000000 0x0000000d 0x00000000
0x9593df28: 0x2000ac29 0x00000000 0xe2a6fb68 0x00000184
0x9593df38: 0x0000000d 0x00000000 0x200002c2 0x935a3288

(gdb) x/16wx (0x200002c2l << 3) java.lang.String
0x100001610: 0x55037430 0x00007f8c 0x00000018 0x00000030 (address of char[])
0x100001620: 0x55ae4100 0x00007f8c 0x00001260 0x00000001
0x100001630: 0x398020a0 0x00007f8c 0x00000ea8 0x00000001
0x100001640: 0x00001610 0x00000001 0x00000000 0x00000000

(gdb) x/64bc 0x00007f8c55ae4100
0x7f8c55ae4100: 16 '\020' 0 '\000' -1 '\377' -1 '\377' -17 '\357' 122 'z' 117 'u' 48 '0'
0x7f8c55ae4108: 106 'j' 97 'a' 118 'v' 97 'a' 47 '/' 108 'l' 97 'a' 110 'n'
0x7f8c55ae4110: 103 'g' 47 '/' 83 'S' 116 't' 114 'r' 105 'i' 110 'n' 103 'g'
0x7f8c55ae4118: 0 '\000' 0 '\000' 0 '\000' 0 '\000' 0 '\000' 0 '\000' 0 '\000' 0 '\000'
0x7f8c55ae4120: 16 '\020' 0 '\000' -1 '\377' -1 '\377' -61 '\303' -7 '\371' -39 '\331' 112 'p'
0x7f8c55ae4128: 106 'j' 97 'a' 118 'v' 97 'a' 47 '/' 108 'l' 97 'a' 110 'n'
0x7f8c55ae4130: 103 'g' 47 '/' 84 'T' 104 'h' 114 'r' 101 'e' 97 'a' 100 'd'
0x7f8c55ae4138: 0 '\000' 0 '\000' 0 '\000' 0 '\000' 0 '\000' 0 '\000' 0 '\000' 0 '\000'

(gdb) x/32wx 0x877bbf30 (0x877bbf60 should point to char[], but actually points into oop, which means the heap is corrupted)
0x877bbf30: 0x8d846ee0 0x8d846ee0 0x8bb317a0 0x8d326f68
0x877bbf40: 0x8d326f80 0x89c27740 0x89c27740 0x899ff620
0x877bbf50: 0x899ff620 0x00000000 0x88716033 0x00000000
0x877bbf60: 0x2000674c 0x4d15b72b 0x8e95c028 0x877bbf10
0x877bbf70: 0x00000000 0x00000000 0x00000005 0x00000000
0x877bbf80: 0x2012ec84 0x00000080 0x00000080 0x00000080
0x877bbf90: 0x00000080 0x93806ff0 0x00000000 0x93941db8
0x877bbfa0: 0xb3fe8b08 0x00000000 0x88716243 0x00000000

In order to further locate the reason, D-D-H and I add some debug code to OpenJDK. If in send_allocation_outside_tlab_event or send_allocation_in_new_tlab_event, and a safepoint occurs, triggering core dump.
Finally we found that JvmtiEnv::GetStackTrace may enter safepoint, the memory corresponding to the tlab address allocated by the current thread may be allocated to other threads after GC, resulting in Heap corruption.

https://github.com/openjdk/jdk8u.git
commit 04a31b454cd853fb88aafffd411dd113e3f4045f (tag: jdk8u202-b08)

pid31086.bt.log

Thread 264 (Thread 0x7f7d36d75700 (LWP 31772)):
#0  pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
#1  0x00007f7f631eef8b in os::PlatformEvent::park (this=this@entry=0x7f7d40014c00) at /home/yibo.yl/openjdk/jdk8u/hotspot/src/os/linux/vm/os_linux.cpp:5980
#2  0x00007f7f631a8fa8 in ParkCommon (timo=0, ev=<optimized out>) at /home/yibo.yl/openjdk/jdk8u/hotspot/src/share/vm/runtime/mutex.cpp:424
#3  Monitor::ILock (this=0x7f7f5c009530, Self=0x7f7d401a9800) at /home/yibo.yl/openjdk/jdk8u/hotspot/src/share/vm/runtime/mutex.cpp:491
#4  0x00007f7f631a98f6 in lock_without_safepoint_check (Self=0x7f7d401a9800, this=0x7f7f5c009530) at /home/yibo.yl/openjdk/jdk8u/hotspot/src/share/vm/runtime/mutex.cpp:959
#5  Monitor::lock_without_safepoint_check (this=0x7f7f5c009530) at /home/yibo.yl/openjdk/jdk8u/hotspot/src/share/vm/runtime/mutex.cpp:965
#6  0x00007f7f63283e46 in SafepointSynchronize::block (thread=0x7f7d401a9800) at /home/yibo.yl/openjdk/jdk8u/hotspot/src/share/vm/runtime/safepoint.cpp:708
#7  0x00007f7f631a97db in transition_and_fence (to=_thread_in_vm, from=_thread_blocked, thread=0x7f7d401a9800) at /home/yibo.yl/openjdk/jdk8u/hotspot/src/share/vm/runtime/interfaceSupport.hpp:184
#8  trans_and_fence (to=_thread_in_vm, from=_thread_blocked, this=<synthetic pointer>) at /home/yibo.yl/openjdk/jdk8u/hotspot/src/share/vm/runtime/interfaceSupport.hpp:232
#9  ~ThreadBlockInVM (this=<synthetic pointer>, __in_chrg=<optimized out>) at /home/yibo.yl/openjdk/jdk8u/hotspot/src/share/vm/runtime/interfaceSupport.hpp:314
#10 Monitor::lock (this=this@entry=0x7f7f5c008c30, Self=0x7f7d401a9800) at /home/yibo.yl/openjdk/jdk8u/hotspot/src/share/vm/runtime/mutex.cpp:940
#11 0x00007f7f631a98ab in Monitor::lock (this=this@entry=0x7f7f5c008c30) at /home/yibo.yl/openjdk/jdk8u/hotspot/src/share/vm/runtime/mutex.cpp:949
#12 0x00007f7f62f87585 in MutexLocker (mutex=0x7f7f5c008c30, this=<synthetic pointer>) at /home/yibo.yl/openjdk/jdk8u/hotspot/src/share/vm/runtime/mutexLocker.hpp:185
#13 InstanceKlass::get_jmethod_id (ik_h=..., method_h=...) at /home/yibo.yl/openjdk/jdk8u/hotspot/src/share/vm/oops/instanceKlass.cpp:1778
#14 0x00007f7f63075b9c in jmethod_id (this=0x7f7eb4ebfa88) at /home/yibo.yl/openjdk/jdk8u/hotspot/src/share/vm/oops/method.hpp:776
#15 JvmtiEnvBase::get_stack_trace (this=<optimized out>, java_thread=java_thread@entry=0x7f7d401a9800, start_depth=<optimized out>, max_count=max_count@entry=2048, frame_buffer=frame_buffer@entry=0x7f7ef404e930, count_ptr=count_ptr@entry=0x7f7d36d72730, this=<optimized out>) at /home/yibo.yl/openjdk/jdk8u/hotspot/src/share/vm/prims/jvmtiEnvBase.cpp:874
#16 0x00007f7f6306be14 in JvmtiEnv::GetStackTrace (this=0x0, java_thread=0x7f7d401a9800, start_depth=0, max_frame_count=2048, frame_buffer=0x7f7ef404e930, count_ptr=0x7f7d36d72730) at /home/yibo.yl/openjdk/jdk8u/hotspot/src/share/vm/prims/jvmtiEnv.cpp:1303
#17 0x00007f7f1a796c5b in Profiler::recordSample(void*, unsigned long long, int, Event*) () from /tmp/libasyncProfiler7868072734751652305.so
#18 0x00007f7f1a797b4b in AllocTracer::recordAllocation(void*, int, unsigned long, unsigned long, unsigned long) () from /tmp/libasyncProfiler7868072734751652305.so
#19 <signal handler called>
#20 CollectedHeap::allocate_from_tlab_slow (klass=..., klass@entry=..., thread=thread@entry=0x7f7d401a9800, size=size@entry=4) at /home/yibo.yl/openjdk/jdk8u/hotspot/src/share/vm/gc_interface/collectedHeap.cpp:300
#21 0x00007f7f62f83a9e in allocate_from_tlab (size=4, thread=0x7f7d401a9800, klass=...) at /home/yibo.yl/openjdk/jdk8u/hotspot/src/share/vm/gc_interface/collectedHeap.inline.hpp:236
#22 common_mem_allocate_noinit (__the_thread__=0x7f7d401a9800, size=4, klass=...) at /home/yibo.yl/openjdk/jdk8u/hotspot/src/share/vm/gc_interface/collectedHeap.inline.hpp:131
#23 common_mem_allocate_init (__the_thread__=0x7f7d401a9800, size=4, klass=...) at /home/yibo.yl/openjdk/jdk8u/hotspot/src/share/vm/gc_interface/collectedHeap.inline.hpp:223
#24 obj_allocate (__the_thread__=0x7f7d401a9800, size=4, klass=...) at /home/yibo.yl/openjdk/jdk8u/hotspot/src/share/vm/gc_interface/collectedHeap.inline.hpp:251
#25 InstanceKlass::allocate_instance (this=this@entry=0x100033a60, __the_thread__=__the_thread__@entry=0x7f7d401a9800) at /home/yibo.yl/openjdk/jdk8u/hotspot/src/share/vm/oops/instanceKlass.cpp:1150
#26 0x00007f7f63280575 in OptoRuntime::new_instance_C (klass=0x100033a60, thread=0x7f7d401a9800) at /home/yibo.yl/openjdk/jdk8u/hotspot/src/share/vm/opto/runtime.cpp:245

A possible fix is to use getJavaTraceAsync.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions