gh-48241: Clarify URL needs to be encoded when provided to urlopen and Request by mblahay · Pull Request #103855 · python/cpython · GitHub
Conversation

@mblahay
Contributor

@mblahay mblahay commented Apr 25, 2023

Adds a note that the URL must be encoded when passed to the urlopen function or the Request class.

This is a documentation change.

@ghost

ghost commented Apr 25, 2023

All commit authors signed the Contributor License Agreement.
CLA signed

@mblahay
Contributor Author

mblahay commented Apr 25, 2023

@ambv This is ready

Contributor

@ambv ambv left a comment

Do you think we can drop the comma between "encoded" and "URL"?

@arhadthedev arhadthedev added the docs (Documentation in the Doc dir) and awaiting review labels Apr 26, 2023
mblahay and others added 4 commits April 26, 2023 09:39
Co-authored-by: Łukasz Langa <lukasz@langa.pl>
Co-authored-by: Łukasz Langa <lukasz@langa.pl>
@ambv ambv added the needs backport to 3.11 only security fixes label Apr 26, 2023
@ambv ambv merged commit 44010d0 into python:main Apr 26, 2023
@miss-islington
Contributor

Thanks @mblahay for the PR, and @ambv for merging it 🌮🎉.. I'm working now to backport this PR to: 3.11.
🐍🍒⛏🤖

miss-islington pushed a commit to miss-islington/cpython that referenced this pull request Apr 26, 2023
…pen and Request (pythonGH-103855)

(cherry picked from commit 44010d0)

Co-authored-by: Michael Blahay <mblahay@users.noreply.github.com>
Co-authored-by: Łukasz Langa <lukasz@langa.pl>
@bedevere-bot

GH-103891 is a backport of this pull request to the 3.11 branch.

@bedevere-bot bedevere-bot removed the needs backport to 3.11 only security fixes label Apr 26, 2023
@ambv ambv changed the title gh-48241: Adding note about url needing to be encoded when provided to the urlopen and Request gh-48241: Clarify URL needs to be encoded when provided to urlopen and Request Apr 26, 2023
erlend-aasland pushed a commit that referenced this pull request May 9, 2023
…open and Request (GH-103855) (#103891)

(cherry picked from commit 44010d0)

Co-authored-by: Michael Blahay <mblahay@users.noreply.github.com>
Co-authored-by: Łukasz Langa <lukasz@langa.pl>
@vadimkantorov

vadimkantorov commented Jan 23, 2024

It is a bit hard to understand what "Open url, which can be either a string containing a valid, properly encoded URL" means, and in particular what "valid" and "properly encoded" mean. If it accepts a string, there probably needs to be an RFC reference or some examples of what counts as properly encoded and how to encode something properly.

My problem: urllib.request.urlopen('https://google.com/?q=…') fails with an encoding error, and urllib.request.urlopen(urllib.parse.quote('https://google.com/?q=…', safe=':/')) fails with a not-found error, because quote also escapes the question mark and the equals sign: 'https://google.com/%3Fq%3D%E2%80%A6'

I've tried

import urllib.parse

urlparsed = urllib.parse.urlparse(url)
# Percent-encode only the query component; every other component is passed through unchanged
urlparsed_unicode_sanitized_query = urllib.parse.ParseResult(urlparsed.scheme, urlparsed.netloc, urlparsed.path, urlparsed.params, urllib.parse.quote(urlparsed.query), urlparsed.fragment)
urlopen_url = urllib.parse.urlunparse(urlparsed_unicode_sanitized_query)

but this also fails if the query string is already url-encoded :(
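
One possible way around the double-encoding problem described above is to percent-encode the whole URL while treating `%` and the RFC 3986 reserved characters as safe. The sketch below is not part of this PR or of the urllib documentation: `encode_for_urlopen` and `SAFE_CHARS` are hypothetical names, and the code assumes that the host name is already ASCII and that every `%` in the input starts a valid escape sequence.

```python
from urllib.parse import quote

# RFC 3986 reserved characters plus "%", so delimiters such as "?" and "="
# and any existing percent-escapes are left untouched (assumption: each "%"
# in the input already begins a valid escape).
SAFE_CHARS = "!#$%&'()*+,/:;=?@[]~"


def encode_for_urlopen(url: str) -> str:
    """Hypothetical helper: escape only characters that may not appear in a URL."""
    # Non-ASCII characters are encoded as UTF-8 and then percent-escaped.
    return quote(url, safe=SAFE_CHARS)


print(encode_for_urlopen("https://google.com/?q=…"))
# prints https://google.com/?q=%E2%80%A6
```

Because `%` is in the safe set, calling the helper on an already-encoded URL should return it unchanged, which is the case the `ParseResult` attempt above trips over. The trade-off is that a literal percent sign that is not part of an escape would also be passed through as-is.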

Labels

docs Documentation in the Doc dir

6 participants