refactor ps benchmark #60784
Conversation
💊 CI failures summary and remediations: as of commit c52cc12 (more details on the Dr. CI page and at hud.pytorch.org/pr/60784):

🕵️ 2 new failures recognized by patterns. The following CI failures do not appear to be due to upstream breakages:
| Job | Step | Action |
|---|---|---|
| | Chown workspace | 🔁 rerun |
Codecov Report

```diff
@@               Coverage Diff                @@
##  gh/gcramer23/16/base   #60784       +/-  ##
========================================================
- Coverage        76.22%   76.22%    -0.01%
========================================================
  Files             2061     2061
  Lines           205068   205068
========================================================
- Hits            156316   156307        -9
- Misses           48752    48761        +9
```
This PR refactors the parameter server (PS) benchmark to use modular trainers.
Three general comments:
- Let's modularize only the components that we expect users to override (i.e., the ones whose overrides have an impact specific to PS training efficiency).
- Let's try to improve the file structure a bit, to avoid tiny and fragmented files.
- The rest of PyTorch is usually very careful about introducing new arguments to APIs, because more arguments usually means more confusion. Let's apply the same spirit here as well.
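The modular-component idea behind these comments can be sketched as a small registry that maps the benchmark's config strings to pluggable factories, so users override only the pieces that matter for PS training efficiency. This is a hypothetical illustration, not the PR's actual code: the registry helpers, placeholder return values, and `build_trainer` signature are assumptions; only the component names (`cel`, `sgd_optimizer`) mirror file names visible in this review.

```python
# Hedged sketch (hypothetical names) of the modular-trainer pattern discussed
# above: component factories are registered under config-string keys, so
# adding a new criterion or optimizer is one registered function rather than
# changes scattered across the trainer.

CRITERIONS = {}
OPTIMIZERS = {}

def register(registry, name):
    """Decorator that registers a component factory under a config name."""
    def wrap(fn):
        registry[name] = fn
        return fn
    return wrap

@register(CRITERIONS, "cel")
def cel(rank):
    # Placeholder for a cross-entropy criterion factory
    # (mirrors criterion_functions/cel.py in this PR).
    return f"CrossEntropyLoss(rank={rank})"

@register(OPTIMIZERS, "sgd")
def sgd_optimizer(params, lr):
    # Placeholder for an SGD optimizer factory
    # (mirrors optimizer_functions/sgd_optimizer.py in this PR).
    return f"SGD(lr={lr})"

def build_trainer(criterion_name, optimizer_name, rank, lr):
    # The trainer looks components up by name instead of hard-coding them;
    # the benchmark CLI would pass these names through from its config.
    return {
        "criterion": CRITERIONS[criterion_name](rank),
        "optimizer": OPTIMIZERS[optimizer_name](None, lr),
    }

trainer = build_trainer("cel", "sgd", rank=0, lr=0.1)
```

The trade-off the reviewer raises applies here too: every new registry and argument is surface area, so only components whose overrides genuinely affect PS training efficiency should be made pluggable.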
Resolved review threads:
- benchmarks/distributed/rpc/parameter_server/criterion_functions/cel.py (outdated)
- benchmarks/distributed/rpc/parameter_server/preprocess_data_functions/preprocess_dummy_data.py
- benchmarks/distributed/rpc/parameter_server/optimizer_functions/sgd_optimizer.py (outdated)
- benchmarks/distributed/rpc/parameter_server/trainers/DdpTrainer.py (3 threads, outdated)
Resolved review thread:
- benchmarks/distributed/rpc/parameter_server/trainer/criterions.py (outdated)
@gcramer23 has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
@gcramer23 merged this pull request in 304c02e.
Stack from ghstack:
This PR refactors the parameter server (PS) benchmark to use modular trainers.
Differential Revision: D29697291