- 
          
- 
                Notifications
    You must be signed in to change notification settings 
- Fork 33.2k
gh-129349: Accept bytes in bytes.fromhex()/bytearray.fromhex() #129844
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need more tests where invalid non-hexadecimal bytes are given (e.g., b"à", '\uD834\uDD1E'.encode() or bytes with NULs) and where non-bytes/str/bytearray objects are also passed.
cc @vstinner
        
          
                Misc/NEWS.d/next/Core_and_Builtins/2025-02-08-09-55-33.gh-issue-129349.PkcG-l.rst
              
                Outdated
          
            Show resolved
            Hide resolved
        
      | Although the use case of  | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm sorry I reviewed the PR before thinking more about this. Actually, we have binascii.unhexlify that already supports bytes inputs. And the docs say:
Similar functionality (accepting only text string arguments, but more liberal towards whitespace) is also accessible using the bytes.fromhex() class method.
So I think the text-only feature for bytes.fromhex was actually desired.
I also tried to search on GitHub whether there are use cases for this, but I couldn't find any that actually used a non-string input. So I think this modification is likely not ideal. We might want to really restrict the input type for built-in types and perhaps be less restrictive with binascii.
Thus, sorry for having reviewed the PR (which seemed I endorsed the change), but I don't think we need to make that change. For your use-case, I think you can definitely use binascii.unhexlify instead (for instance binascii.unhexlify(b'012abc') would return b'\x01\x2a\xbc as you tested).
Note that binascii actually supports any object implementing the buffer protocol, so it's likely the one that needs to be used in such cases (assuming that whitespaces are cleaned). See https://github.com/python/cpython/blob/main/Python/pystrhex.c.
TL;DR: I'm rather -0.5 for the inclusion of this feature but if others disagree and find it good to align it with binascii.unhexlify, then I won't be against.
| Note: The PR you wrote is of good quality for a first contribution. It's just that the feature can be quite niche, especially if it's added on a built-in class. | 
| 
 
 It seems like a logical extension that  It's not my first credited contribution, just the first under my own GitHub account maybe. | 
| 
 Sorry I got confused with another account (but the fact that the PR is good remains). 
 Yes, but OTOH, it's a built-in and except for your use case, I couldn't find other use cases where  | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
        
          
                Misc/NEWS.d/next/Core_and_Builtins/2025-02-08-09-55-33.gh-issue-129349.PkcG-l.rst
              
                Outdated
          
            Show resolved
            Hide resolved
        
      There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe add also a test for memoryview or array.array?
        
          
                Misc/NEWS.d/next/Core_and_Builtins/2025-02-08-09-55-33.gh-issue-129349.PkcG-l.rst
              
                Outdated
          
            Show resolved
            Hide resolved
        
      ad5cc10    to
    1a18438      
    Compare
  
    | 
 Ok, added this and support for the  | 
| (Contrubuted by Sergey B Kirpichev in :gh:`87790`.) | ||
|  | ||
| * The :func:`bytes.fromhex` and :func:`bytearray.fromhex` methods now accept | ||
| ASCII :class:`bytes` and :term:`bytes-like objects <bytes-like object>`. | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Question: are bytes also bytes-like objects? if so, you can just link the term. Or more generally, isn't it objects that support the buffer protocol?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, but it seems less accessible to users to just link bytes-like object because that dives into a description of the buffer protocol which is lower level than the audience I was writing for in builtins docs.
Maybe the fault is with :term:\bytes-like object`` because it could just list types that duck-type like bytes.
So I hedged and did both.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You could sa "that support the buffer protocol such as memoryviews" and add a link to whatever example of a type that supports the buffer protocol you used
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I prefer to say "bytes and bytes-like", it's more explicit.
91566c9    to
    ba05050      
    Compare
  
    be4b9a4    to
    1e035db      
    Compare
  
    | Oh no, there is now a conflict on  | 
Co-authored-by: Bénédikt Tran <10796600+picnixz@users.noreply.github.com>
1e035db    to
    31b9ab0      
    Compare
  
    | Merged, thank you @lordmauve! | 
…ython#129844) Co-authored-by: Bénédikt Tran <10796600+picnixz@users.noreply.github.com> Co-authored-by: Victor Stinner <vstinner@python.org>
Change
bytes.fromhex()andbytearray.fromhex()to accept abytesobject interpreted as ASCII.This matches the behaviour of
inte.gFixes #129349