KEMBAR78
Fix invalid slice access in String::retain by SkiFire13 · Pull Request #82554 · rust-lang/rust · GitHub
Skip to content

Conversation

@SkiFire13
Copy link
Contributor

As noted in #78499, the previous fix was technically still unsound because it accessed elements of a slice outside its bounds (even though they were still inside the same allocation). This PR addresses that concern by switching to a dropguard approach.

@rust-highfive
Copy link
Contributor

r? @m-ou-se

(rust-highfive has picked a reviewer for you, use r? to override)

@rust-highfive rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Feb 26, 2021
@eddyb
Copy link
Member

eddyb commented Feb 26, 2021

cc @RalfJung

@RalfJung
Copy link
Member

somewhat suspicious - it's calling get_unchecked on a &str obtained via auto-deref which is zero-length now. So we more or less expect this needs to go through a raw pointer directly or something similar.

Yeah, that seems potentially problematic. Until we have a better idea of what the aliasing model will be long-term, it's better to be conservative and always remain in-bounds even for dynamically sized references.

I haven't carefully checked the code here, but a set-len-on-drop guard is a common approach elsewhere in the stdlib. Is this function covered by a liballoc test case? If yes, any idea why Miri is not complaining about this? If it is possible to cause a Miri error with the current implementation of that function, it'd be good to add that as a testcase.

@SkiFire13
Copy link
Contributor Author

The function was covered by this test, but it contains a catch_unwind (also added by #78499) that makes it not runnable in miri.

#[test]
fn test_retain() {
let mut s = String::from("α_β_γ");
s.retain(|_| true);
assert_eq!(s, "α_β_γ");
s.retain(|c| c != '_');
assert_eq!(s, "αβγ");
s.retain(|c| c != 'β');
assert_eq!(s, "αγ");
s.retain(|c| c == 'α');
assert_eq!(s, "α");
s.retain(|_| false);
assert_eq!(s, "");
let mut s = String::from("0è0");
let _ = panic::catch_unwind(panic::AssertUnwindSafe(|| {
let mut count = 0;
s.retain(|_| {
count += 1;
match count {
1 => false,
2 => true,
_ => panic!(),
}
});
}));
assert!(std::str::from_utf8(s.as_bytes()).is_ok());
}

However even when run without the catch_unwind part it doesn't trigger miri. Heck, even the following program, which seems obviously wrong, doesn't trigger miri:

fn main() {
    unsafe {
        let mut v = vec![1, 2, 3];
        v.set_len(0);
        assert_eq!(*v.get_unchecked(1), 2);
        
        let a = [1, 2, 3];
        let s = &a[0..0];
        assert_eq!(s.len(), 0);
        assert_eq!(*s.get_unchecked(1), 2);
    }
}

I guess miri doesn't correctly track slices bounds but only the allocation ones?

@RalfJung
Copy link
Member

RalfJung commented Feb 27, 2021

but it contains a catch_unwind (also added by #78499) that makes it not runnable in miri.

Miri has supported unwinding for more than a year at this point. ;)

I guess miri doesn't correctly track slices bounds but only the allocation ones?

Interesting... I first thought this was just a missing -Zmiri-track-raw-pointers, but that's not it. The following does error though (with -Zmiri-track-raw-pointers):

fn main() {
    unsafe {
        let a = [1, 2, 3];
        let s = &a[0..1];
        assert_eq!(s.len(), 1);
        assert_eq!(*s.get_unchecked(1), 2);
    }
}

Something strange is going on with zero-sized references here.

@SkiFire13
Copy link
Contributor Author

Miri has supported unwinding for more than a year at this point. ;)

Oops, I copypasted the test without importing the panic module and got an error, then blindly assumed it was because Miri didn't support it. I feel dumb...

bors added a commit to rust-lang/miri that referenced this pull request Feb 27, 2021
fix reborrowing of tagged ZST references

`@SkiFire13` [pointed out](rust-lang/rust#82554 (comment)) that Miri fails to detect illegal use of empty slices. This PR fixes that. In so doing, it uncovers a flaw of Stacked Borrows: it is incompatible with how the formatting machinery uses `extern type`, so for now we skip reborrowing when we cannot determine the exact size of the pointee.
@RalfJung
Copy link
Member

RalfJung commented Mar 1, 2021

Heck, even the following program, which seems obviously wrong, doesn't trigger miri:

This is fixed now in latest Miri, but you need to pass -Zmiri-track-raw-pointers to get an error.

@m-ou-se
Copy link
Member

m-ou-se commented Mar 3, 2021

I'm a bit worried that the resulting code here is worse with a drop guard. Did you check?

It might be better to keep the set_len(0) and use std::slice::from_raw_parts and std::str::from_utf8_unchecked to create the slice.

Alternatively, you could keep use Vec::spare_capacity_mut right after set_len(0) together with slice_assume_init_mut() to get the full slice while temporarily having the Vec's len at zero. (You'd have to change the ptr::copy call though.)

@m-ou-se m-ou-se added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Mar 3, 2021
@DJMcNab
Copy link
Contributor

DJMcNab commented Mar 3, 2021

I do already have most of an impl of the raw_parts and from_utf8_unchecked version in https://github.com/DJMcNab/retain_more/blob/c87ddd74b8b0c6660781dc9b5bb4b8f8a3187cc5/src/string.rs#L141-L223

I think the safety comments are correct, although the actual reasons that I think alloc::String::retain is broken are wrong. Obviously that impl can be simplified to not expose the additional power that it provides.

@SkiFire13
Copy link
Contributor Author

I'm a bit worried that the resulting code here is worse with a drop guard. Did you check?

The difference looks pretty minimal, it adds just a couple of methods and some instructions outside the main loop. https://godbolt.org/z/4rqM9P

@SkiFire13
Copy link
Contributor Author

I forgot to update the labels
@rustbot label -S-waiting-on-author +S-waiting-on-review

Also, I correct my previous statement, it doesn't add code to the function, it just moves the return label.

@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Mar 10, 2021
@m-ou-se
Copy link
Member

m-ou-se commented Mar 21, 2021

@bors r+

@bors
Copy link
Collaborator

bors commented Mar 21, 2021

📌 Commit c89e643 has been approved by m-ou-se

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Mar 21, 2021
@m-ou-se m-ou-se added the T-libs Relevant to the library team, which will review and decide on the PR/issue. label Mar 21, 2021
Dylan-DPC-zz pushed a commit to Dylan-DPC-zz/rust that referenced this pull request Mar 21, 2021
…ness, r=m-ou-se

Fix invalid slice access in String::retain

As noted in rust-lang#78499, the previous fix was technically still unsound because it accessed elements of a slice outside its bounds (even though they were still inside the same allocation). This PR addresses that concern by switching to a dropguard approach.
bors added a commit to rust-lang-ci/rust that referenced this pull request Mar 22, 2021
Rollup of 9 pull requests

Successful merges:

 - rust-lang#80193 (stabilize `feature(osstring_ascii)`)
 - rust-lang#80771 (Make NonNull::as_ref (and friends) return refs with unbound lifetimes)
 - rust-lang#81607 (Implement TrustedLen and TrustedRandomAccess for Range<integer>, array::IntoIter, VecDequeue's iterators)
 - rust-lang#82554 (Fix invalid slice access in String::retain)
 - rust-lang#82686 (Move `std::sys::unix::platform` to `std::sys::unix::ext`)
 - rust-lang#82771 (slice: Stabilize IterMut::as_slice.)
 - rust-lang#83329 (Cleanup LLVM debuginfo module docs)
 - rust-lang#83336 (Fix ICE with `use clippy::a::b;`)
 - rust-lang#83350 (Download a more recent LLVM version if `src/version` is modified)

Failed merges:

r? `@ghost`
`@rustbot` modify labels: rollup
@bors bors merged commit da143d3 into rust-lang:master Mar 22, 2021
@rustbot rustbot added this to the 1.53.0 milestone Mar 22, 2021
@SkiFire13 SkiFire13 deleted the fix-string-retain-unsoundness branch March 22, 2021 09:31
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. T-libs Relevant to the library team, which will review and decide on the PR/issue.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

8 participants