KEMBAR78
gh-103092: Isolate `_decimal` by CharlieZhao95 · Pull Request #103381 · python/cpython · GitHub
Skip to content

Conversation

@CharlieZhao95
Copy link
Contributor

@CharlieZhao95 CharlieZhao95 commented Apr 8, 2023

This PR is based on @erlend-aasland 's great work. Currently, most of the test cases can be passed, but still some work needs to be done. I make this PR as a draft so anyone can remind me of missing work or improve this code.

TODO

  • Test references leak and performance
  • Solve compilation warnings
  • Convert cond_map to heap type (not recorded in global-to-fix, but it seems like we should handle it)
  • Adapt multi-phase initialization
  • Remove redundant code and improve some dirty code

@erlend-aasland
Copy link
Contributor

  • Convert cond_map to heap type (not recorded in global-to-fix, but it seems like we should handle it)

I'm not sure I follow; cond_map is not a type.

Remove redundant code and improve some dirty code

Can you please spell this out? We cannot guess what you mean by this :)

@CharlieZhao95
Copy link
Contributor Author

CharlieZhao95 commented Apr 9, 2023

  • Convert cond_map to heap type (not recorded in global-to-fix, but it seems like we should handle it)

I'm not sure I follow; cond_map is not a type.

Background

When I executed test_decimal.py after moving all variables in global-to-fix.csv into decimal_state, it raised an error:

Traceback (most recent call last):
  File "F:\CPython\github\cpython\Lib\test\test_decimal.py", line 5900, in <module>
    test_main(arith=True, verbose=True)
  File "F:\CPython\github\cpython\Lib\test\test_decimal.py", line 5840, in test_main
    init(C)
  File "F:\CPython\github\cpython\Lib\test\test_decimal.py", line 110, in init
    DefaultTestContext = m.Context(
                         ^^^^^^^^^^
KeyError: 'invalid signal dict'

I found that the place where the error occurs is PyDict_GetItemWithError(val, cm->ex) in dict_as_flags. This seems to be caused by the static signal_map whose members are assigned to types in module state.

So I guess signal_map should be moved to decimal_state (you can view my latest commit, maybe it's not the best solution). Likewise, cond_map[i].ex will be assigned to signal_map[i].ex, so it probably needs to be moved as well.

I'm not sure if these works are necessary, please correct me if I'm wrong :)

Remove redundant code and improve some dirty code

Can you please spell this out? We cannot guess what you mean by this :)

Of course, there is currently some unclean code. For example, signal_map_init or some code with FIXME comments, they work but are ugly. I will list the code that need to be modified after I have finished thinking. :)

@erlend-aasland
Copy link
Contributor

erlend-aasland commented Apr 9, 2023

Remove redundant code and improve some dirty code

Can you please spell this out? We cannot guess what you mean by this :)

Of course, there is currently some unclean code. For example, signal_map_init or some code with FIXME comments, they work but are ugly. I will list the code that need to be modified after I have finished thinking. :)

Is this code added by this PR, or code that is already there? For the latter, please do not include such changes in this PR. Instead, create an issue (unless there already is one), explain what you intend to do, and create a separate PR. This PR should focus on isolating _decimal, and nothing else.

@CharlieZhao95
Copy link
Contributor Author

If this code added by this PR, or code that is already there?

Sorry for not being clear. I mean the code added by this PR. I like to leave "FIXME" in my code to remind myself of potential work :)

@CharlieZhao95
Copy link
Contributor Author

I tested the reference counts using erlend's script and they look well.

$./python measure.py
before=90491, after=93999
before=94000, after=94000
before=94000, after=94000
before=94000, after=94000
before=94000, after=94000

Also, I compared the execution time of the test_decimal between the current branch and python3.12, and there is no significant change in them. More detailed benchmark results can be viewed in Python Speed ​​Center after the merge.

@CharlieZhao95
Copy link
Contributor Author

Of course, there is currently some unclean code. For example, signal_map_init or some code with FIXME comments, they work but are ugly. I will list the code that need to be modified after I have finished thinking. :)

The code that I can think of that may need to be improved is as follows:

  1. If we call the mpd_setminalloc function multiple times, this function is called in decimal_exec, it will raise a warning. I temporarily use a static variable to ensure that it is not called multiple times. This doesn't seem pretty, any suggestions for improvement?
  2. Should we move signal_map into decimal_state(to fix the failing test cases, please see comment), is there a better solution?

@CharlieZhao95 CharlieZhao95 marked this pull request as ready for review May 26, 2023 08:54
@CharlieZhao95 CharlieZhao95 requested a review from rhettinger as a code owner May 26, 2023 08:54
@rhettinger rhettinger removed their request for review May 26, 2023 16:16
Co-authored-by: Kumar Aditya <59607654+kumaraditya303@users.noreply.github.com>
@kumaraditya303 kumaraditya303 added the 🔨 test-with-refleak-buildbots Test PR w/ refleak buildbots; report in status section label Jun 1, 2023
@bedevere-bot
Copy link

🤖 New build scheduled with the buildbot fleet by @kumaraditya303 for commit 58f0049 🤖

If you want to schedule another build, you need to add the 🔨 test-with-refleak-buildbots label again.

@bedevere-bot bedevere-bot removed the 🔨 test-with-refleak-buildbots Test PR w/ refleak buildbots; report in status section label Jun 1, 2023
@erlend-aasland
Copy link
Contributor

I'm sorry, but I don't have the bandwidth to review this in the near future.

int n, mem;

assert(PyDecContext_Check(self));
decimal_state *state = get_module_state_by_def(Py_TYPE(self));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

guard with Py_DEBUG

@erlend-aasland
Copy link
Contributor

IMO, we should break this up into multiple PRs, like we did with _io; it will be easier to bisect if (when) bugs appear. See also PEP-687 for suggestions on how to break this up.

@CharlieZhao95
Copy link
Contributor Author

IMO, we should break this up into multiple PRs, like we did with _io; it will be easier to bisect if (when) bugs appear. See also PEP-687 for suggestions on how to break this up.

Thanks. This PR made a lot of changes at the same time, which might make it difficult to review. But the _deciaml module is simpler than the _io module, and I think 4~5 PRs are enough.

I plan to break the PR into the following parts:

  1. Establish a global module state and port some simple static types
  2. Move other global static variables to the global module state
  3. Convert the global module state to true module state

Potential work: Add Argument Clinic (It might be worth opening a new issue)

And I noticed that @kumaraditya303 has already done some review work, what do you think?

@kumaraditya303
Copy link
Contributor

SGTM, ping me when you are done,

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants