KEMBAR78
Add `Unsafe` array access sanitizer by Marcono1234 · Pull Request #932 · CodeIntelligenceTesting/jazzer · GitHub
Skip to content

Conversation

@Marcono1234
Copy link
Contributor

@Marcono1234 Marcono1234 commented Jul 3, 2025

Adds a sanitizer which looks for invalid array access performed by sun.misc.Unsafe. Unlike native memory access performed by Unsafe, array access can be sanitized in a stateless way just based on the arguments passed to the Unsafe method. And multiple Java libraries have used Unsafe for arrays in the past to improve performance, see related security advisories GHSA-8wh2-6qhj-h7j9 and GHSA-973x-65j7-xcf4.

Note that fuzzing likely cannot find all such invalid accesses because some of it occurs for numeric overflow respectively multiple MB of processed data, which the fuzzer might be unable to generate.

See related #891

I used #915 as reference for how to implement a sanitizer. However, I am not very familiar with Bazel and this project setup here, and also have / had issues with building it locally on Windows. Therefore I have marked this PR as Draft for now.

Any feedback is appreciated!

@Marcono1234
Copy link
Contributor Author

I think this PR is now ready for an initial review; any feedback is appreciated! Please also let me know what you think about the points I mentioned above in my review comments.

Also, is there a way to let Bazel install the Maven artifacts (especially the Jazzer JUnit artifact and its dependencies) as SNAPSHOT to the local Maven repository (similar to what mvn install does)? I hope I have integrated the sanitizer correctly but I wasn't able to verify this yet by running Jazzer for a separate project.

@Marcono1234 Marcono1234 marked this pull request as ready for review July 5, 2025 13:43
@fmeum
Copy link
Contributor

fmeum commented Jul 5, 2025

bazel run //deploy:deploy_local should deploy into your local maven repo with the version 0.0.0-dev.

@Marcono1234
Copy link
Contributor Author

Marcono1234 commented Jul 5, 2025

Thanks! That only works for Linux though because deploy/deploy_local.sh is a Bash script, right? Unfortunately I am on Windows. And even if I try to run the commands there individually, it then fails at a later point, apparently because the jazzer.publish-publisher script it generates is again a Bash script.

I will see if I can maybe get it working with WSL 2 or a Docker container. But if that then only produces the Linux and not the Windows native libraries for Jazzer (not completely sure how the native build for Jazzer works) it will also be a bit cumbersome to then use these artifacts.

(Side note: I had to update locally some of the bazel_dep entries in MODULE.bazel especially rules_kotlin to be able to build the project on Windows 11. But I don't think this is related to the deploy_local issues I am facing now.)

Edit: Was able to build it now in a Docker container, but had to copy the native libraries which I had previously built on Windows into the JAR. The sanitizer seems to work.

@fmeum
Copy link
Contributor

fmeum commented Jul 5, 2025

If you follow the steps at https://bazel.build/install/windows#install-compilers to install MSYS2 and Visual Studio, sh_binarys should run just fine on Windows (via bazel run) and the native libraries are built for Windows. You can build your own release jars that way.

The publishing script for java_export is unfortunately hardcoded to a Bash script without a launcher even on Windows, so it won't work. I will see whether I can fix that in rules_jvm_external.

@Marcono1234
Copy link
Contributor Author

Thanks for the hints!

The publishing script for java_export is unfortunately hardcoded to a Bash script without a launcher even on Windows, so it won't work. I will see whether I can fix that in rules_jvm_external.

I worked around it now by building within a Docker container, but maybe other users would benefit from that as well, so thanks for your work on this.

Copy link
Contributor

@oetr oetr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for this awesome contribution!
I am not familiar with the unsafe, but this PR looks good already!

@Marcono1234
Copy link
Contributor Author

Side note: The current test failure for the GitHub workflow seems to be unrelated to my changes:

 [5,521 / 5,709] 123 / 336 tests; Testing //sanitizers/src/test/java/com/example:OsCommandInjectionProcessBuilder; 300s local ... (4 actions running)

TIMEOUT: //sanitizers/src/test/java/com/example:OsCommandInjectionProcessBuilder (Summary)
      C:/users/runneradmin/_bazel_runneradmin/2gbstkcc/execroot/_main/bazel-out/x64_windows-opt/testlogs/sanitizers/src/test/java/com/example/OsCommandInjectionProcessBuilder/test.log

@oetr
Copy link
Contributor

oetr commented Aug 4, 2025

Side note: The current test failure for the GitHub workflow seems to be unrelated to my changes:

 [5,521 / 5,709] 123 / 336 tests; Testing //sanitizers/src/test/java/com/example:OsCommandInjectionProcessBuilder; 300s local ... (4 actions running)

TIMEOUT: //sanitizers/src/test/java/com/example:OsCommandInjectionProcessBuilder (Summary)
      C:/users/runneradmin/_bazel_runneradmin/2gbstkcc/execroot/_main/bazel-out/x64_windows-opt/testlogs/sanitizers/src/test/java/com/example/OsCommandInjectionProcessBuilder/test.log

It turns out, using String input = data.consumeRemainingAsAsciiString(); is very inefficient for fuzzing, because that function uses clamping that is in general very bad for guided comparison.
Changing the test to use the mutation framework resolves the timeout problem---here the mutator saves the clamped value and the fuzzing works way better (see the changes here: 45405d2, and the test results here: https://github.com/CodeIntelligenceTesting/jazzer/actions/runs/16728441276/job/47350237276?pr=940)

What's interesting though, is this:

  • The original test was already taking about 2 minutes to run (without the new unsafe sanitizer).
  • After enabling the unsafe sanitizer, the run time on Windows increase to > 5m.
  • Using the mutation framework, the run time diff on Windows is: 4.5s -> 5.7s

I am still investigating what adds the >3 minutes overhead on Windows.

Can you already add 45405d2 to your PR? It's already good improvement to use 6 seconds instead of 2 minutes!

Copy link
Contributor

@oetr oetr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good to me, thanks! Only the timeout test needs to be fixed like shown in 45405d2

I don't know why some comments appear twice in different places 🙈

@Marcono1234
Copy link
Contributor Author

Thanks! I have included 45405d2 now. Should I also squash the commits or will you do that on merge?

Also, maybe the OsCommandInjectionProcessBuilder test is in general a bit risky if it invokes arbitrary OS commands. Or are there any security / safety measures in place which prevent the actual execution of the command? (I might have overlooked them)

I don't know why some comments appear twice in different places 🙈

Do you mean the review comments here on the GitHub UI? I think that happens when you submit review comments on an existing comment thread. Then GitHub shows the comments on that thread but additionally (without any context...) for the review event. Hopefully GitHub improves this some day.

@oetr
Copy link
Contributor

oetr commented Aug 5, 2025

Should I also squash the commits or will you do that on merge?

I am planning to rebase and merge. Can you group meaningful commits together?

Also, maybe the OsCommandInjectionProcessBuilder test is in general a bit risky if it invokes arbitrary OS commands. Or are there any security / safety measures in place which prevent the actual execution of the command? (I might have overlooked them)

It is totally unsafe! I usually run only specific tests locally, and the rest on Github.
An idea to make it safer would be to add custom replace hook that does noops.

@Marcono1234
Copy link
Contributor Author

Can you group meaningful commits together?

Is it ok if I just squash all my commits into a single one and keep your commit (with slightly different message)? I don't think for my commits it is worth keeping any of the intermediate state as separate commit.

@oetr
Copy link
Contributor

oetr commented Aug 6, 2025

Is it ok if I just squash all my commits into a single one and keep your commit (with slightly different message)? I don't think for my commits it is worth keeping any of the intermediate state as separate commit.

Yes, thank you!

@Marcono1234 Marcono1234 force-pushed the unsafe-array-sanitizer branch from 45405d2 to 28c431d Compare August 6, 2025 12:32
@Marcono1234
Copy link
Contributor Author

Marcono1234 commented Aug 6, 2025

Have squashed the commits now.

Regarding your comment above:

After enabling the unsafe sanitizer, the run time on Windows increase to > 5m.

Jazzer uses Unsafe internally as well, but I hope this new sanitizer does not interfere with it, especially because it should only cover the Unsafe method overloads with Object parameter while Jazzer uses mostly those without Object parameter. And the method hooks of the sanitizer are applied at the call site, and not by transforming the code of Unsafe (if I understand it correctly).

Marcono1234 and others added 2 commits August 6, 2025 14:45
Previous usage of `FuzzedDataProvider#consumeRemainingAsAsciiString` caused a timeout.
@Marcono1234 Marcono1234 force-pushed the unsafe-array-sanitizer branch from 28c431d to 6b8d574 Compare August 6, 2025 12:46
@oetr oetr enabled auto-merge (rebase) August 6, 2025 12:56
@oetr oetr merged commit f6fb442 into CodeIntelligenceTesting:main Aug 6, 2025
5 checks passed
@Marcono1234 Marcono1234 deleted the unsafe-array-sanitizer branch August 6, 2025 15:33
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants