add a native C demangler by arielb1 · Pull Request #75 · rust-lang/rustc-demangle · GitHub
Merged
1 change: 1 addition & 0 deletions .github/workflows/main.yml
@@ -14,6 +14,7 @@ jobs:
run: rustup update ${{ matrix.rust }} && rustup default ${{ matrix.rust }}
- run: cargo build --all
- run: cargo test --all
- run: cd crates/native-c && cargo test --all
- run: cargo build --features std

fuzz_targets:
2 changes: 1 addition & 1 deletion Cargo.toml
@@ -23,7 +23,7 @@ rustc-dep-of-std = ['core', 'compiler_builtins']
std = []

[profile.release]
-lto = true
+#lto = true

[package.metadata.docs.rs]
features = ["std"]
9 changes: 9 additions & 0 deletions README.md
@@ -30,6 +30,15 @@ You'll then find `target/release/librustc_demangle.a` and
platform). These objects implement the interface specified in
`crates/capi/include/rustc_demangle.h`.

If your build system does not support Rust, there is also a mostly-identical
C version in `crates/native-c`, which you can use via copy-paste or as
a git submodule. Read `crates/native-c/README.md` for more details. It is
likely to be less well supported than the Rust version, so prefer
the Rust version if your build system supports it.

Neither the Rust nor the C version requires memory allocation or any other
operating-system support.

# License

This project is licensed under either of
15 changes: 15 additions & 0 deletions crates/native-c/Cargo.toml
@@ -0,0 +1,15 @@
[package]
name = "rustc-demangle-native-c"
version = "0.1.0"
authors = ["automatically generated"]
description = """
Native C version of the rustc_demangle crate
"""
license = "MIT/Apache-2.0"
repository = "https://github.com/rust-lang/rustc-demangle"

[lib]
name = "rustc_demangle_native_c"

[build-dependencies]
cc = "1"
7 changes: 7 additions & 0 deletions crates/native-c/README
Member

It would probably also make sense to update the top-level README noting this exists and indicating that expected usage is probably copy/paste or git submodule?

Contributor Author

added it

@@ -0,0 +1,7 @@
A portable native C demangler, which should mostly have byte-for-byte identical outputs to the Rust one, including in error cases.

This code is intended to be safe to run on untrusted inputs and has been fuzzed, but only its author has tried to find security issues in it, so a security review is probably wise before using it as a serious security barrier.

The only difference is that, since it's hard to include up-to-date Unicode tables in portable C code, strings in constants (do you know that feature exists?) have all non-ASCII characters escaped (as `\u{ABCD}`) rather than having only non-printable characters escaped. Unicode in identifiers is still translated as-is, allowing non-printable characters just like rustc. If you care, the code intentionally includes `unicode_isprint` and `unicode_isgraphemextend`, which can be replaced with actual Unicode tables.

This has a Cargo.toml to make it easy to test, but people whose build systems can use Rust are expected to use the `rustc-demangle-capi` crate which uses the Rust `rustc-demangle` implementation instead. Since the crate is intended only for users with weird build systems, there is no build system provided.
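
The escaping rule for constant strings described in this README can be sketched in isolation as follows. This is an illustration only, not the code from `src/demangle.c`: the function name `escape_non_ascii` is hypothetical, the input is assumed to be valid UTF-8, ASCII bytes are copied through unchanged (quoting and control characters are out of scope), and lowercase hex digits are an assumption based on how Rust's `char::escape_unicode` prints.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper, not taken from src/demangle.c.
 * Copies `in` (assumed valid UTF-8, `in_len` bytes) into `out`, replacing
 * every non-ASCII code point with an escape such as "\\u{e9}".
 * Returns the number of bytes written (excluding the NUL terminator),
 * or (size_t)-1 if `out` is too small or the input is truncated. */
size_t escape_non_ascii(const char *in, size_t in_len, char *out, size_t out_len)
{
    size_t i = 0, o = 0;
    while (i < in_len) {
        unsigned char b = (unsigned char)in[i];
        if (b < 0x80) {
            /* ASCII byte: copy as-is, keeping one byte free for the NUL. */
            if (o + 1 >= out_len) return (size_t)-1;
            out[o++] = (char)b;
            i++;
            continue;
        }
        /* The lead byte encodes how many continuation bytes follow. */
        size_t extra;
        uint32_t cp;
        if ((b & 0xE0) == 0xC0)      { extra = 1; cp = b & 0x1F; }
        else if ((b & 0xF0) == 0xE0) { extra = 2; cp = b & 0x0F; }
        else if ((b & 0xF8) == 0xF0) { extra = 3; cp = b & 0x07; }
        else return (size_t)-1;      /* stray continuation or invalid lead byte */
        if (in_len - i <= extra) return (size_t)-1; /* truncated sequence */
        for (size_t k = 1; k <= extra; k++) {
            /* Continuation bytes are not validated (input assumed valid). */
            cp = (cp << 6) | ((unsigned char)in[i + k] & 0x3F);
        }
        i += extra + 1;
        /* The longest escape, for U+10FFFF, is 10 characters. */
        char esc[16];
        int n = snprintf(esc, sizeof esc, "\\u{%lx}", (unsigned long)cp);
        if (n < 0 || o + (size_t)n >= out_len) return (size_t)-1;
        for (int k = 0; k < n; k++) out[o++] = esc[k];
    }
    if (o >= out_len) return (size_t)-1;
    out[o] = '\0';
    return o;
}
```

Like the demangler itself, this sketch writes into a caller-provided buffer and needs no allocation. Swapping the "escape everything non-ASCII" rule for "escape only non-printable characters" is where real Unicode tables would be needed.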
8 changes: 8 additions & 0 deletions crates/native-c/build.rs
@@ -0,0 +1,8 @@
fn main() {
    cc::Build::new()
        .file("src/demangle.c")
        .include("include")
        .compile("demangle_native_c");
    println!("cargo::rerun-if-changed=src/demangle.c");
    println!("cargo::rerun-if-changed=include/demangle.h");
}
80 changes: 80 additions & 0 deletions crates/native-c/include/demangle.h
@@ -0,0 +1,80 @@
#ifndef _H_DEMANGLE_V0_H
#define _H_DEMANGLE_V0_H

#ifdef __cplusplus
extern "C" {
#endif

#include <stddef.h>

#if defined(__GNUC__) || defined(__clang__)
#define DEMANGLE_NODISCARD __attribute__((warn_unused_result))
#else
#define DEMANGLE_NODISCARD
#endif

typedef enum {
    OverflowOk,
    OverflowOverflow
} overflow_status;

enum demangle_style {
    DemangleStyleUnknown = 0,
    DemangleStyleLegacy,
    DemangleStyleV0,
};

// Not using a union here to make the struct easier to copy-paste if needed.
struct demangle {
    enum demangle_style style;
    // points to the "mangled" part of the name,
    // not including `ZN` or `R` prefixes.
    const char *mangled;
    size_t mangled_len;
    // In DemangleStyleLegacy, is the number of path elements
    size_t elements;
    // while it's called "original", it will not contain `.llvm.9D1C9369@@16` suffixes
    // that are to be ignored.
    const char *original;
    size_t original_len;
    // Contains the part after the mangled name that is to be outputted,
    // which can be `.exit.i.i` suffixes LLVM sometimes adds.
    const char *suffix;
    size_t suffix_len;
};

// if the length of the output buffer is less than `output_len+OVERFLOW_MARGIN`,
// the demangler will return `OverflowOverflow` even if there is no overflow.
#define OVERFLOW_MARGIN 4

/// Demangle a C string that refers to a Rust symbol and put the intermediate demangling result in `res`.
/// Beware that `res` contains references into `s`. If `s` is modified (or freed) before calling
/// `rust_demangle_display_demangle`, the behavior is undefined.
///
/// Use `rust_demangle_display_demangle` to convert it to an actual string.
void rust_demangle_demangle(const char *s, struct demangle *res);

/// Write the string in a `struct demangle` into a buffer.
///
/// Return `OverflowOk` if the output buffer was sufficiently big, `OverflowOverflow` if it wasn't.
/// This function is `O(n)` in the length of the input + *output* [$], but the demangled output of demangling a symbol can
Member

IIUC, output here refers specifically to the produced demangling, not the output buffer's size, right? We don't do anything with unused bytes in the output buffer.

(Just to check my understanding, no changes needed here I think).

Contributor Author

@arielb1 arielb1 Dec 3, 2024

Yea, "output-sensitive algorithm"

/// be exponentially[$$] large, therefore it is recommended to have a sane bound (`rust-demangle`
/// uses 1,000,000 bytes) on `len`.
///
/// `alternate`, if true, selects the less verbose alternate formatting (Rust `{:#}`), which does not show
/// symbol hashes and types of constant ints.
///
/// [$] It's `O(n * MAX_DEPTH)`, but `MAX_DEPTH` is a constant 300 and therefore it's `O(n)`
/// [$$] Technically, bounded by `O(n^MAX_DEPTH)`, but this is practically exponential.
DEMANGLE_NODISCARD overflow_status rust_demangle_display_demangle(struct demangle const *res, char *out, size_t len, bool alternate);

/// Returns true if `res` refers to a known valid Rust demangling style, false if it's an unknown style.
bool rust_demangle_is_known(struct demangle *res);

#undef DEMANGLE_NODISCARD

#ifdef __cplusplus
}
#endif

#endif
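
The header describes an allocation-free design: the caller supplies the output buffer and gets an `overflow_status` back. That general pattern can be sketched on its own as below. This is an illustration under stated assumptions, not the implementation in `src/demangle.c`: `bounded_writer`, `bw_init`, `bw_write`, `bw_puts`, and `bw_finish` are hypothetical names that do not appear in `demangle.h`, and the sketch does not model `OVERFLOW_MARGIN`. Only the `overflow_status` enum mirrors the header.

```c
#include <stddef.h>
#include <string.h>

/* Mirrors the enum declared in demangle.h. */
typedef enum { OverflowOk, OverflowOverflow } overflow_status;

/* Hypothetical writer over a caller-provided buffer; it never allocates. */
struct bounded_writer {
    char *out;   /* next byte to write */
    size_t left; /* bytes still available, including room for the NUL */
};

void bw_init(struct bounded_writer *w, char *out, size_t len)
{
    w->out = out;
    w->left = len;
}

/* Append `n` bytes. Only the bytes actually written are touched, so the
 * cost depends on the produced output, not on the buffer's capacity. */
overflow_status bw_write(struct bounded_writer *w, const char *s, size_t n)
{
    /* Keep one byte in reserve for the terminating NUL. */
    if (w->left == 0 || n > w->left - 1) return OverflowOverflow;
    memcpy(w->out, s, n);
    w->out += n;
    w->left -= n;
    return OverflowOk;
}

/* Append a NUL-terminated string. */
overflow_status bw_puts(struct bounded_writer *w, const char *s)
{
    return bw_write(w, s, strlen(s));
}

/* NUL-terminate the output once everything has been written. */
overflow_status bw_finish(struct bounded_writer *w)
{
    if (w->left == 0) return OverflowOverflow;
    *w->out = '\0';
    return OverflowOk;
}
```

With the real header, the calling sequence is `rust_demangle_demangle(s, &res)` followed by `rust_demangle_display_demangle(&res, out, len, alternate)`, and the returned status must be checked before `out` is used. Because the demangled text can be far longer than the mangled input, `len` is the caller's bound on how much work and output to accept.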