-
Notifications
You must be signed in to change notification settings - Fork 937
#1174: Detect JVM in non-Java application & attach it to the async-profiler #1192
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
#1174: Detect JVM in non-Java application & attach it to the async-profiler #1192
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for your PR. The idea makes sense, but the solution to handle JVM start needs revising. I think it will be simpler to move forward with two-step approach: let's split this in two PRs, with the first one solving the problem of attaching to the JVM when it is already running by the time profiler has started. What do you think?
src/codeCache.cpp
Outdated
| break; | ||
| case 'J': | ||
| if (strcmp(name, "JVM_StartThread") == 0) { | ||
| saveImport(im_JVM_StartThread, entry); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The choice of JVM_StartThread function to intercept is not clear to me. This is a native implementation of Java Thread.start method. It called for every new Java thread, while we are interested in something that is preferably called only at JVM startup. On the other hand, JVM_StartThread calls pthread_start under the hood, and we already have a pthread_start hook.
In general, library hooks is a subtle tool, which should be used only in exceptional cases.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After some experimentation the pthread hooks don't seem to be a good option currently
However I do agree that using the JVM_StartThread is sub-optimal .
pthread_create
All threads that go through the pthread_create step were observed & found that only one thread has the JVM in a good enough state to attach to it (JVM is visible via JNI method) but using that thread to initiate the operation would result in a deadlock condition on the JNI GetEnv, please refer to the following stack dump for reference
#0 0x00007f0732f61377 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1 0x00007f073228e8fb in PlatformMonitor::wait(unsigned long) () from /usr/lib/jvm/java-21-amazon-corretto/lib/server/libjvm.so
#2 0x00007f073222abc5 in Monitor::wait(unsigned long) () from /usr/lib/jvm/java-21-amazon-corretto/lib/server/libjvm.so
#3 0x00007f07325ab642 in VMThread::wait_until_executed(VM_Operation*) () from /usr/lib/jvm/java-21-amazon-corretto/lib/server/libjvm.so
#4 0x00007f07325ac003 in VMThread::execute(VM_Operation*) () from /usr/lib/jvm/java-21-amazon-corretto/lib/server/libjvm.so
#5 0x00007f07320858eb in JvmtiEnvBase::enable_virtual_threads_notify_jvmti() () from /usr/lib/jvm/java-21-amazon-corretto/lib/server/libjvm.so
#6 0x00007f073208e42b in JvmtiExport::get_jvmti_interface(JavaVM_*, void**, int) () from /usr/lib/jvm/java-21-amazon-corretto/lib/server/libjvm.so
#7 0x00007f07306ba1e5 in VM::init(JavaVM_*, bool) () from build/lib/libasyncProfiler.so
#8 0x00007f07306ba28f in VM::VMManualLoad(bool) () from build/lib/libasyncProfiler.so
#9 0x00007f07306ba337 in pthread_create_hook(unsigned long*, pthread_attr_t const*, void* (*)(void*), void*) () from build/lib/libasyncProfiler.so
#10 0x00007f0732282c4f in os::create_thread(Thread*, os::ThreadType, unsigned long) () from /usr/lib/jvm/java-21-amazon-corretto/lib/server/libjvm.so
[#11](https://amzn-aws.slack.com/archives/C03K52T0BV2) 0x00007f0731f7bca9 in JVM_StartThread () from /usr/lib/jvm/java-21-amazon-corretto/lib/server/libjvm.so
** pthread_start**
Using this method will result segmentation fault when trying to initialize the JVM with the async profiler.
=> A deeper investigation will be needed on why this happens & how to prevent it.
pthread_exit
Using this method is too late in many case as it could happen at the end of the life cycle of the application being profiled.
src/hooks.cpp
Outdated
| MallocTracer::installHooks(); | ||
|
|
||
| // only install Java hooks if needed | ||
| if (!VM::loaded() && strstr(filename, OS::isLinux() ? "libjava.so" : "libjava.dylib" ) != NULL) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's not very reliable to look for any occurrence of a string. It should be better to check if a string ends with /libjava.so
src/vmEntry.cpp
Outdated
|
|
||
| void VM::JVM_StartThread_hook(JNIEnv *env, jobject thread) { | ||
| if (!VM::loaded()) { | ||
| VM::VMManualLoad(false); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If load does not succeed for any reason, we will keep calling it again and again, causing extra overhead for every thread start.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That is by intention as the JVM internally start multiple threads.
Not all threads will be started when the JVM is fully initialized and fully available via the JNI APIs.
However I agree that we should add a counter for example to prevent trying too many times.
src/vmEntry.cpp
Outdated
| _JNI_GetCreatedJavaVMs_func = (jint (*)(JavaVM **, jsize, jsize *))dlsym(libHandle, "JNI_GetCreatedJavaVMs"); | ||
| _original_JVM_StartThread = (void (*)(JNIEnv*, jobject)) dlsym(libHandle, "JVM_StartThread"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is not guaranteed to work with all JVMs, since libHandle here is a handle to libjava.so, whereas JNI_GetCreatedJavaVMs and JVM_StartThread are defined in libjvm.so.
src/vmEntry.cpp
Outdated
| CodeCacheArray* native_libs = Profiler::instance()->nativeLibs(); | ||
| int native_lib_count = native_libs->count(); | ||
|
|
||
| while (_patched_libs < native_lib_count) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Regardless of it being already the third copy of a patching loop in the sources, the whole idea of going through all loaded libraries does not seem right to me, since JVM_StartThread is called from one library only - libjava.so.
src/vmEntry.cpp
Outdated
| Log::debug("No JVM is yet detected in manual load"); | ||
| return; | ||
| } | ||
| VM::init(jvm, true); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
attach=true is for cases when we are attaching to a running JVM.
But what about a case when JVM is started after profiler?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I mean, this code is supposed to handle the case when JVM is just being started. So can find an earlier hook where we are able to call VM::init with attach=false?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When implementing the Thread solution which handles then Java after Async profiler case
It was observed that the that the JVM will already be in a running state when it's first detected via the Thread
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As suggested this PR currently only handles the JVM starting first case
So in this case the attache will always be true as the VM is running
src/profiler.cpp
Outdated
| // Try loading JVM if not already loaded | ||
| if (!VM::loaded()) { | ||
| VM::VMManualLoad(true); | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not the best place for the call. This function is specifically for setting up signal handlers.
asprof_init() looks a better choice.
|
@apangin thanks for the review a simple solution would be starting a separate pthread that simply does a relatively long sleep (1-5seconds) & then attempt to load the JVM into async profiler, this thread could be started only if a dlopen is detected on the libjava.so & the VM isn't loaded library I.E. shouldn't have any effect on native Java applications & in cases where the VM starts first |
|
@apangin PR has been updated |
src/asprof.cpp
Outdated
|
|
||
| DLLEXPORT void asprof_init() { | ||
| // Try loading JVM if not already loaded | ||
| if (!VM::loaded()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I realized that my previous suggestion about asprof_init is also imperfect, because asprof_init is normally called only once, but we want to check if JVM has been loaded in the beginning of every profiling session. Sorry for the confusion.
So the right location will be Profiler::start.
src/asprof.cpp
Outdated
| DLLEXPORT void asprof_init() { | ||
| // Try loading JVM if not already loaded | ||
| if (!VM::loaded()) { | ||
| VM::VMManualLoad(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IMO, name choice is not great (double "VM"; what is "manual load"?)
Maybe VM::detect or VM::tryAttach?
src/vmEntry.cpp
Outdated
| JavaVM* jvm; | ||
| jsize nVMs; | ||
|
|
||
| jint (*handle)(JavaVM **, jsize, jsize *) = (jint (*)(JavaVM **, jsize, jsize *))dlsym(RTLD_DEFAULT, "JNI_GetCreatedJavaVMs"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use typedef to avoid repeating cryptic declaration.
src/vmEntry.cpp
Outdated
| JavaVM* jvm; | ||
| jsize nVMs; | ||
|
|
||
| jint (*handle)(JavaVM **, jsize, jsize *) = (jint (*)(JavaVM **, jsize, jsize *))dlsym(RTLD_DEFAULT, "JNI_GetCreatedJavaVMs"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder why RTLD_DEFAULT works. What if libjvm is loaded with dlopen(..., RTLD_LOCAL)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did some checking here, it's due to the compile command I'm using which will use the RTLD_GLOBAL
You're right this needs to change to account for the fact it's possible that JVM will be loaded in a RTLD_LOCAL manner
src/vmEntry.cpp
Outdated
| } | ||
| } | ||
|
|
||
| void VM::VMManualLoad() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since this function will be called on every profiling session of a native app, it should use something lightweight to detect if libjvm is loaded. Check what is the most efficient: dlopen("libjvm.so", RTLD_NOLOAD), dlsym, dl_iterate_phdr or something else.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Problems that could happen:
libjvm.sois initially loaded withRTLD_LOCALoptionvoid* libHandle = dlopen("/usr/lib/jvm/java-21-amazon-corretto/lib/server/libjvm.so", RTLD_NOW | RTLD_LOCAL);dlsymwon’t detect theJNI_GetCreatedJavaVMsfunction as thelibjvm.soisn’t visible in the async-profiler scope => check first pointdl_iterate_phdrwill allow us to detect iflibjvm.sowas indeed loaded but won’t guarantee the loading options => check first point
Suggested Solution Steps:
- Limit the amount of
dlopen&dlsymoperation to at most to 2 operations of each.
The point above can be handled via simply adding a Flag that is initiallyTRUE.
If the Flag is turned on the async-profiler on the next profiling session will attempt to load theJNI_GetCreatedJavaVMsfunction.
This however will require first loading thelibjvm.sofirst due the first problem mentioned above.
If the load operation is successful the function reference can be stored & used in future profiling session starts to try & detect the JVM to load it into async-profiler.
If the operation is a failure the Flag is turned off (LIB or function wasn’t found via the respective dlopen & dlsym to prevent this loading from happening again in future profiling sessions.
- If a
dlopenwas detected on thelibjvmbetween profiling sessions the flag can be turned back on for a reattempt of Step 1
Known issues & fixes if possible:
- If
libjvmis detected mid profiling session it won’t attach until a new profiling session starts, so the JVM symbols won’t be visible for the initial profiling session
Suggested fix is to find a method that could be hooked into to detect that the JVM is ready for use & attach it mid profiling session to the async-profiler
=> TODO: this should be studied in depth & handled in a different PR libjvmis decide but a JVM is never started.
To prevent constant calls to try to load the JVM (JNI_GetCreatedJavaVMs was detected successfully) over many profiling sessions, a simple stopgap solution of a counter could be introduced to limit the attempts to detect & load the JVM to maximum amountX-> I would suggest 5 attempts here at maximum
@apangin what do you think?
|
|
||
| public class NonenativeTests { | ||
|
|
||
| @Test(sh = "%testbin/java_first_none_native_app %s.html", output = true) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is "none native"? You mean "non-native"? There are Java applications and Native (or Non-Java) applications, but "non-native" sounds unusual.
|
@apangin code has been updated Make file updated & test refactored to use dlopen & dlsym to create the VM multiple more tests are added to handle JVM starting between session cases Known issues: JVM won't attach for an already run session Solution: This will be automatically be solved once we have the JVM hook. solution online to detect a JVM start & attach it the profiler so it won't be handled here Please feel free to check the updates made here |
src/profiler.cpp
Outdated
| return Error("Profiler already started"); | ||
| } | ||
|
|
||
| // Try loading JVM if not already loaded |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The comment is misleading: tryAttach does not load a JVM, it rather checks if the JVM is already loaded.
src/vmEntry.cpp
Outdated
| jsize nVMs; | ||
|
|
||
| if (_getJvm == NULL) { | ||
| void* libHandle = dlopen(OS::isLinux() ? "libjvm.so" : "libjvm.dylib", RTLD_LAZY | RTLD_NOLOAD); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Local variables follow snake_case naming convention.
src/vmEntry.cpp
Outdated
|
|
||
| if (_getJvm == NULL) { | ||
| void* libHandle = dlopen(OS::isLinux() ? "libjvm.so" : "libjvm.dylib", RTLD_LAZY | RTLD_NOLOAD); | ||
| _getJvm = (GetJvm)dlsym(libHandle, "JNI_GetCreatedJavaVMs"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should be called only if the above dlopen succeeds.
src/vmEntry.cpp
Outdated
| return; | ||
| } | ||
|
|
||
| // Try actually loading the JVM |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Again, the comment is misleading.
src/vmEntry.h
Outdated
| static jvmtiError JNICALL RedefineClassesHook(jvmtiEnv* jvmti, jint class_count, const jvmtiClassDefinition* class_definitions); | ||
| static jvmtiError JNICALL RetransformClassesHook(jvmtiEnv* jvmti, jint class_count, const jclass* classes); | ||
|
|
||
| /** |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Javadoc-style comments do not make sense for C++ code.
| } | ||
|
|
||
| void CommonNonJava::loadProfiler() { | ||
| #ifdef __linux__ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IIRC, this won't compile with some older gcc. There should not be whilespace before #.
Btw, inline #ifdefs in a method implementation are discouraged anyway.
| exit(1); | ||
| } | ||
| char libPath[4096]; | ||
| #ifdef __linux__ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
#ifdef inside a function again.
Since you use this for choosing between .so/.dylib in multiple places, it may be better to declare a platform-specific constant or a macro at the top of the file.
| jmethodID constructor = env->GetMethodID(customClass, "<init>", "()V"); | ||
| jobject customObject = env->NewObject(customClass, constructor); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why creating an instance? These lines won't be needed if you declare the method static.
| fprintf(stderr, "Can't find cpuHeavyTask"); | ||
| exit(1); | ||
| } | ||
| env->CallVoidMethod(customObject, cpuHeavyTask); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Exception must be checked after JNI-to-Java upcall.
You may run program with -Xcheck:jni to find potential issues with JNI code.
test/test/nonjava/common_non_java.h
Outdated
|
|
||
| typedef jint (*CreateJvm)(JavaVM **, void **, void *); | ||
|
|
||
| class CommonNonJava { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In what sense it is "common"?
Having 3 source files for a couple of tests seems an overkill. Can it be just one .cpp file?
|
Thanks @apangin for the feedback |
test/test/nonjava/non_java_app.cpp
Outdated
| std::cerr << "JAVA_HOME is not set" << std::endl; | ||
| exit(1); | ||
| } | ||
| char lib_path[4096]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use PATH_MAX limit when constructing paths.
test/test/nonjava/non_java_app.cpp
Outdated
| } | ||
|
|
||
| void loadJvmLib() { | ||
| char* java_home = getenv("JAVA_HOME"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This may break test, since JAVA_HOME is not always set.
test/test/nonjava/non_java_app.cpp
Outdated
| exit(1); | ||
| } | ||
| char lib_path[4096]; | ||
| snprintf(lib_path, sizeof(lib_path), "%s/%s", java_home, jvm_lib_path); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will not work with JDK 8, where there is additional jre directory on the path.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's correct.
I will make do the changes to prepare to make it work with JDK 8 in the future
However as if now Async-profiler doesn't compile using JDK-8, so as if now JDK-8 can't be used to run the tests
Some changes will be needed on the Async-profiler as a whole to support compiling & running the tests in JDK 8
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's true that async-profiler requires JDK 11+ to compile. This is because it uses JDK 11 APIs.
However, this does not mean tests cannot run with JDK 8. async-profiler compiled with JDK 11 can perfectly run on JDK 8.
test/test/nonjava/non_java_app.cpp
Outdated
| options[1].optionString = const_cast<char*>("-Xcheck:jni"); | ||
|
|
||
| // Configure JVM | ||
| vm_args.version = JNI_VERSION_10; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why 10?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please note in JNI_OnLoad currently is using JNI_VERSION_1_6
This is because JNI_VERSION depends on the JDK version
In my case JNI_VERSION_10 was used as it the highest common version between JDK11 -> JDK21
I will make the value more dynamic depending the JDK version
But note this will require redefining some constants to avoid compile errors
test/test/nonjava/non_java_app.cpp
Outdated
| if (_env->ExceptionCheck()) { | ||
| jthrowable exception = _env->ExceptionOccurred(); | ||
| _env->ExceptionDescribe(); | ||
| _env->ExceptionClear(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You've listed all possible JNI exception functions :)
ExceptionDescribe is enough. It does everything: prints exception message with a stack trace and clears current exception.
src/vmEntry.cpp
Outdated
| return; | ||
| } | ||
| VM::init(jvm, true); | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
New line character at EOF is missing
Makefile
Outdated
| @mkdir -p $(TEST_BIN_DIR) | ||
| gcc -o $(TEST_BIN_DIR)/malloc_plt_dyn test/test/nativemem/malloc_plt_dyn.c | ||
| gcc -o $(TEST_BIN_DIR)/native_api -Isrc test/test/c/native_api.c -ldl | ||
| $(CXX) -o $(TEST_BIN_DIR)/non_java_app $(CPP_TEST_INCLUDES) $(INCLUDES) $(LIBS) test/test/nonjava/non_java_app.cpp -ldl |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is $(LIBS) needed here?
src/vmEntry.cpp
Outdated
|
|
||
| jint result = _getJvm(&jvm, 1, &nVMs); | ||
| if (result != JNI_OK || nVMs != 1) { | ||
| Log::debug("No JVM is yet detected in manual load"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"manual load" is unclear.
Anyways, I'd remove debug printout altogether.
|
@apangin |
src/vmEntry.cpp
Outdated
|
|
||
| if (attach_thread) { | ||
| JNIEnv* env = nullptr; | ||
| if (_vm->AttachCurrentThreadAsDaemon((void**)&env, NULL) != JNI_OK) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Attaching thread to a JVM has side effects: e.g. it becomes visible in Java thread dumps, JDK Flight Recorder starts to report this thread, etc. If this is needed only to call VM::init, make sure the thread is detached as soon as initialization completes. But note that a thread may be already owned by the JVM - in this case, it must not be detached.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| private static final String JAVA_VERSION; | ||
|
|
||
| static { | ||
| String[] javaVersionParts = System.getProperty("java.version").split("\\."); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is not fully correct way to get the JDK version.
Runner class already gets the JVM version here.
However, it's likely that you don't need this at all.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will need it as to load the libjvm.so lib dynamically
The path is different between JDK8 & other versions on linux
=> it includes the arch as amd64 or i386 (x86) As well as aarch64 or arm (arm)
test/test/nonjava/non_java_app.cpp
Outdated
| #ifdef __linux__ | ||
| const char profiler_lib_path[] = "build/lib/libasyncProfiler.so"; | ||
| const char jvm_lib_path[] = "lib/server/libjvm.so"; | ||
| const char jvm8_lib_path[] = "lib/amd64/server/libjvm.so"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's not good to have the architecture hard-coded.
test/test/nonjava/non_java_app.cpp
Outdated
| #ifndef JNI_VERSION_9 | ||
| #define JNI_VERSION_9 0x00090000 | ||
| #endif | ||
|
|
||
| #ifndef JNI_VERSION_10 | ||
| #define JNI_VERSION_10 0x000a0000 | ||
| #endif | ||
|
|
||
| #ifndef JNI_VERSION_19 | ||
| #define JNI_VERSION_19 0x00130000 | ||
| #endif | ||
|
|
||
| #ifndef JNI_VERSION_20 | ||
| #define JNI_VERSION_20 0x00140000 | ||
| #endif | ||
|
|
||
| #ifndef JNI_VERSION_21 | ||
| #define JNI_VERSION_21 0x00150000 | ||
| #endif | ||
|
|
||
| #ifndef JNI_VERSION_24 | ||
| #define JNI_VERSION_24 0x00180000 | ||
| #endif |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is that? Can't you just always use JNI_VERSION_1_6?
test/test/nonjava/non_java_app.cpp
Outdated
| //options[1].optionString = const_cast<char*>("-Xcheck:jni"); | ||
|
|
||
| // Configure JVM | ||
| vm_args.version = getJniVersion();; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can be a constant. Also, double ; at the end.
|
I realized that in all the tests async-profiler starts in the very same thread that creates a JVM. It will not this way in real Rust applications. Can you create a tests that starts async-profiler in a different thread not attached to the JVM? |
|
@apangin the code has been updated to handle JVM starting in another thread & a test was added to check for this flow Please feel free to check it when you can Please note that creating a solid reproducer for the Bug will be hard as it depends on multiple factors that are hard to control & it will be different from one machine to another As it's hard to exactly control the start of the profiler to be after the VM is created but not fully initialized So I dependent on a local reproduction that works reliably on my machine (for the JDK bug) & made sure that the work around is working as expected |
src/vmEntry.cpp
Outdated
| return NULL; | ||
| } | ||
|
|
||
| void VM::initWrapper(JavaVM* vm) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"Wrapper" is a too generic word. It's not clear who wraps what.
It would be better to restructure the algorithm not to have wrappers at all. E.g., code that looks for JVM threads can be extracted in a separate function.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Doing that would effect the performance & readability of the normal init method
As we need to use vm->GetEnv((void**)&env, JNI_VERSION_1_6);
Using the GetEnv to get JVM TI isn't safe in this specific case
So moving this extra read & other thread logic outside the logic of original init is more readable IMO than adding a flag to only do this flow if the flag is enabled (Not having this wrapping will most likely result in more nesting in the init method)
I however agree that the initWrapper name is somewhat vague
src/vmEntry.cpp
Outdated
|
|
||
| ThreadList* list = OS::listThreads(); | ||
| while (list->hasNext()) { | ||
| if (!OS::threadName(list->next(), thread_name, 4096)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sizeof(thread_name) instead of a constant.
Btw, 4096 looks too much when the longest thread name we care about is 20 characters.
src/vmEntry.cpp
Outdated
| } | ||
|
|
||
| // For MacOs Thread name starts with "Java: " | ||
| if (!OS::isLinux() && thread_name[0] == 'J' && strncmp(thread_name, "Java: ", 6) == 0) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Checking first character explicitly is a redundant micro-optimization that worsens readability.
src/vmEntry.cpp
Outdated
| } | ||
|
|
||
| // Check that VM is in a good state to attach the thread | ||
| if (service_thread && vm_thread && vm->AttachCurrentThreadAsDaemon((void**)&env, NULL) == JNI_OK) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you attach a thread that is not under async-profiler control, you need to detach it afterwards.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This will cause async-profiler to loose access to the JVM
Which in worst case could lead to crash & in best case would cause us to not sample Java stack frames correctly
|
@apangin Applied the latest changes, please feel free to check it when possible |
…to the async-profiler
…gic, and handle async profiler first case in a separate issue
… JVM start between profiling session cases
…a single file with better error reporting
… for future use & code cleanup
…vm for better monitoring
…t thread than the async-profiler
4c2663d to
754d0a3
Compare
| thread_name_offset = 6; | ||
| } | ||
|
|
||
| if (strcmp(thread_name + thread_name_offset, "VM Thread") == 0) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
May cause out-of-bounds access, when thread_name_offset == 6, but thread_name does not start with "Java: "
src/vmEntry.cpp
Outdated
| jsize nVMs; | ||
|
|
||
| if (_getJvm == NULL) { | ||
| void* lib_handle = dlopen(OS::isLinux() ? "libjvm.so" : "libjvm.dylib", RTLD_LAZY | RTLD_NOLOAD); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Possible handle leak: need dlclose(lib_handle) after use.
| * SPDX-License-Identifier: Apache-2.0 | ||
| */ | ||
|
|
||
| public class JavaClass { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should be in test.nonjava package.
| public class NonjavaTests { | ||
|
|
||
| // jvm is loaded before the profiling session is started | ||
| @Test(sh = "%testbin/non_java_app 1 %s.html", output = true) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why output = true if stdout is never read?
test/test/nonjava/non_java_app.cpp
Outdated
| pthread_t thread; | ||
| pthread_create(&thread, NULL, jvmThreadWrapper, NULL); | ||
|
|
||
| // busy wait for JVM to start on the thread |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's not busy wait
test/test/nonjava/non_java_app.cpp
Outdated
| // libjvm wasn't found under standard path, this could happen in JDK 8 where the path is formated like: | ||
| // ${TEST_JAVA_HOME}/lib/${ARCH}/server/libjvm.(so|dylib) | ||
| snprintf(java_lib_home, sizeof(java_lib_home), "%s/lib", java_home); | ||
| dir = opendir(java_lib_home); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
closedir is missing
|
I fixed above issues together with minor typos and aligned the code with the project style. |



Description
Dynamically detecting & linking JVM instances in none Java applications to the async-profiler.
Related issues
#1174
Motivation and context
When async profiler is used on none Java applications (C/CPP/Rust) that contain a JVM instance (Created via JNI for example), the async-profiler won't sample Java stack correctly as the JVM instance isn't linked to the profiler.
By dynamically detecting & linking the JVM instance, it would provide better sampling for any made calls to the Java layer.
Please refer to the following before & after screen shots
How has this been tested?
Two tests have been added to handle the two major use cases where the order of the JVM & async-profiler loading happens differently, in both tests logs are checked for certain calls made into the JVM layer.
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.