email.message.EmailMessage accepts invalid header field names without error, which raise an error when parsed #127794

@TauPan

Bug report

Bug description:

email.message.EmailMessage accepts invalid header field names without error. When the resulting message is parsed, they produce a defect regardless of policy (raised as an exception under the strict policy), and the email is corrupted.

Case in point (with Python 3.13.1 installed via pyenv; this occurs in 3.11
and earlier as well):

delgado@tuxedo-e101776:~> python3.13
Python 3.13.1 (main, Dec 10 2024, 15:13:47) [GCC 7.5.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import email.message
>>> message = email.message.EmailMessage()
>>> message.add_header('From', 'me@example.example')
None
>>> message.add_header('To', 'me@example.example')
None
>>> message.add_header('Subject', 'Example Subject')
None
>>> message.add_header('Invalid Header', 'Contains a space, which is illegal')
None
>>> message.add_header('X-Valid Header', 'Custom header as recommended')
None
>>> message.set_content('Hello, this is an example!')
None
>>> message.defects
[]
>>> message._headers
[('From', 'me@example.example'),
 ('To', 'me@example.example'),
 ('Subject', 'Example Subject'),
 ('Invalid Header', 'Contains a space, which is illegal'),
 ('X-Valid Header', 'Custom header as recommended'),
 ('Content-Type', 'text/plain; charset="utf-8"'),
 ('Content-Transfer-Encoding', '7bit'),
 ('MIME-Version', '1.0')]
>>> message.as_string()
('From: me@example.example\n'
 'To: me@example.example\n'
 'Subject: Example Subject\n'
 'Invalid Header: Contains a space, which is illegal\n'
 'X-Valid Header: Custom header as recommended\n'
 'Content-Type: text/plain; charset="utf-8"\n'
 'Content-Transfer-Encoding: 7bit\n'
 'MIME-Version: 1.0\n'
 '\n'
 'Hello, this is an example!\n')
>>> message.policy
EmailPolicy()
>>> msg_string = message.as_string()
>>> msg_string
('From: me@example.example\n'
 'To: me@example.example\n'
 'Subject: Example Subject\n'
 'Invalid Header: Contains a space, which is illegal\n'
 'X-Valid Header: Custom header as recommended\n'
 'Content-Type: text/plain; charset="utf-8"\n'
 'Content-Transfer-Encoding: 7bit\n'
 'MIME-Version: 1.0\n'
 '\n'
 'Hello, this is an example!\n')
>>> import email.parser
>>> parsed_message = email.parser.Parser().parsestr(msg_string)
>>> parsed_message._headers
[('From', 'me@example.example'),
 ('To', 'me@example.example'),
 ('Subject', 'Example Subject')]
>>> parsed_message.as_string()
('From: me@example.example\n'
 'To: me@example.example\n'
 'Subject: Example Subject\n'
 '\n'
 'Invalid Header: Contains a space, which is illegal\n'
 'X-Valid Header: Custom header as recommended\n'
 'Content-Type: text/plain; charset="utf-8"\n'
 'Content-Transfer-Encoding: 7bit\n'
 'MIME-Version: 1.0\n'
 '\n'
 'Hello, this is an example!\n')
>>> parsed_message.policy
Compat32()
>>> parsed_message.defects
[MissingHeaderBodySeparatorDefect()]
>>> import email.policy
>>> parsed_message_strict = email.parser.Parser(policy=email.policy.strict).parsestr(msg_string)
Traceback (most recent call last):
  File "<python-input-19>", line 1, in <module>
    parsed_message_strict = email.parser.Parser(policy=email.policy.strict).parsestr(msg_string)
  File "/home/delgado/git/pyenv/versions/3.13.1/lib/python3.13/email/parser.py", line 64, in parsestr
    return self.parse(StringIO(text), headersonly=headersonly)
           ~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/delgado/git/pyenv/versions/3.13.1/lib/python3.13/email/parser.py", line 53, in parse
    feedparser.feed(data)
    ~~~~~~~~~~~~~~~^^^^^^
  File "/home/delgado/git/pyenv/versions/3.13.1/lib/python3.13/email/feedparser.py", line 176, in feed
    self._call_parse()
    ~~~~~~~~~~~~~~~~^^
  File "/home/delgado/git/pyenv/versions/3.13.1/lib/python3.13/email/feedparser.py", line 180, in _call_parse
    self._parse()
    ~~~~~~~~~~~^^
  File "/home/delgado/git/pyenv/versions/3.13.1/lib/python3.13/email/feedparser.py", line 234, in _parsegen
    self.policy.handle_defect(self._cur, defect)
    ~~~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^
  File "/home/delgado/git/pyenv/versions/3.13.1/lib/python3.13/email/_policybase.py", line 193, in handle_defect
    raise defect
email.errors.MissingHeaderBodySeparatorDefect
>>> parsed_message_nonstrict = email.parser.Parser(policy=email.policy.default).parsestr(msg_string)
>>> parsed_message_nonstrict.as_string()
('From: me@example.example\n'
 'To: me@example.example\n'
 'Subject: Example Subject\n'
 '\n'
 'Invalid Header: Contains a space, which is illegal\n'
 'X-Valid Header: Custom header as recommended\n'
 'Content-Type: text/plain; charset="utf-8"\n'
 'Content-Transfer-Encoding: 7bit\n'
 'MIME-Version: 1.0\n'
 '\n'
 'Hello, this is an example!\n')
>>> parsed_message_nonstrict.defects
[MissingHeaderBodySeparatorDefect()]

EmailMessage accepts the illegal header field name without recording a defect. When the resulting message is parsed, however, it looks to me like header parsing stops at that line regardless of policy, and the line with the defective header is treated as the first line of the body, which leads to the MissingHeaderBodySeparatorDefect.
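The parsing half of this can be condensed into a short script. This is only a sketch: it feeds the parser the same serialized text as in the session above (copied into a hypothetical constant `RAW`) instead of building it with EmailMessage, so it exercises the parser only.

```python
import email
import email.errors
import email.policy

# Serialized message text, copied from the as_string() output in the session.
RAW = (
    "From: me@example.example\n"
    "To: me@example.example\n"
    "Subject: Example Subject\n"
    "Invalid Header: Contains a space, which is illegal\n"
    "X-Valid Header: Custom header as recommended\n"
    'Content-Type: text/plain; charset="utf-8"\n'
    "Content-Transfer-Encoding: 7bit\n"
    "MIME-Version: 1.0\n"
    "\n"
    "Hello, this is an example!\n"
)

# Parse with the non-strict default policy so defects are recorded, not raised.
parsed = email.message_from_string(RAW, policy=email.policy.default)

header_names = parsed.keys()                       # headers the parser recognized
defect_types = [type(d) for d in parsed.defects]   # defects recorded on the message
body = parsed.get_payload()                        # everything after the header block

print(header_names)
print(defect_types)
print(body.splitlines()[0])  # first line of what the parser considers the body
```

Only the headers before the offending line survive as headers; the offending line and everything after it end up in the payload.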

It's interesting that email.header contains the following:

# Field name regexp, including trailing colon, but not separating whitespace,
# according to RFC 2822.  Character range is from tilde to exclamation mark.
# For use with .match()
fcre = re.compile(r'[\041-\176]+:$')

This is the correct regex according to the RFC (including the final colon), but apparently it isn't used anywhere in the code.
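As an illustration of what such a check could look like on the caller's side, here is a minimal sketch. The helpers `is_valid_field_name` and `add_header_checked` are hypothetical names, not part of the email package, and the pattern is my reading of the RFC 5322 field-name rule (printable US-ASCII without the colon), applied to the name without its trailing colon.

```python
import re
from email.message import EmailMessage

# Assumption: RFC 5322 field names consist of printable US-ASCII characters
# (33-126) except the colon (58). Octal escapes mirror the style of fcre.
_FIELD_NAME_RE = re.compile(r"[\041-\071\073-\176]+")


def is_valid_field_name(name: str) -> bool:
    """Return True if name consists only of characters allowed in a field name."""
    return _FIELD_NAME_RE.fullmatch(name) is not None


def add_header_checked(message: EmailMessage, name: str, value: str) -> None:
    """Hypothetical wrapper: reject bad field names before they reach the message."""
    if not is_valid_field_name(name):
        raise ValueError(f"invalid header field name: {name!r}")
    message.add_header(name, value)


message = EmailMessage()
add_header_checked(message, "Subject", "Example Subject")
try:
    add_header_checked(message, "Invalid Header", "Contains a space")
except ValueError as exc:
    print(exc)
```

Note that under this check 'X-Valid Header' from the session above is rejected as well, since it also contains a space.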

An MUA (such as Claws Mail or Mutt) will display the resulting email with the remaining headers as part of the body, breaking any MIME multipart rendering.
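Until the field name is rejected at construction time, one possible guard before sending is to re-parse the serialized text under the strict policy and treat any raised defect as a failure. This is a sketch of a caller-side workaround, not something the library does itself; `find_parse_defect` is a hypothetical helper name.

```python
import email
import email.errors
import email.policy


def find_parse_defect(raw: str):
    """Parse raw under the strict policy; return the first defect raised, or None."""
    try:
        email.message_from_string(raw, policy=email.policy.strict)
    except email.errors.MessageDefect as defect:
        return defect
    return None


good = "From: me@example.example\nSubject: Example Subject\n\nHello!\n"
bad = (
    "From: me@example.example\n"
    "Invalid Header: Contains a space, which is illegal\n"
    "\n"
    "Hello!\n"
)

print(find_parse_defect(good))
print(type(find_parse_defect(bad)).__name__)
```

Passing `message.as_string()` to this helper before handing the message to an SMTP client would flag the broken header instead of silently producing a corrupt email.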

CPython versions tested on:

3.11, 3.13

Operating systems tested on:

Linux
