[MPS] Add searchsorted op #112829

qqaatw · 2023-11-03T06:39:34Z

Stack from ghstack (oldest at bottom):

The Metal kernels closely follow `Bucketization.cu`.
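For readers unfamiliar with the op, here is a minimal pure-Python sketch of the per-value binary search that `searchsorted` performs (lower bound by default, upper bound with `right=True`, optional `sorter` indirection). The function name `searchsorted_1d` is hypothetical and exists only for illustration; this is not the Metal kernel from this PR, just the 1-D semantics as I understand them.

```python
from typing import List, Optional, Sequence


def searchsorted_1d(
    boundaries: Sequence[float],
    values: Sequence[float],
    right: bool = False,
    sorter: Optional[Sequence[int]] = None,
) -> List[int]:
    """Hypothetical reference: insertion index of each value in `boundaries`.

    If `sorter` is given, `boundaries` may be unsorted and `sorter[i]` is the
    index of the i-th smallest element (i.e. an argsort of `boundaries`).
    """
    result = []
    for v in values:
        lo, hi = 0, len(boundaries)
        while lo < hi:
            mid = lo + (hi - lo) // 2
            # Look up the element at sorted position `mid`.
            elem = boundaries[sorter[mid]] if sorter is not None else boundaries[mid]
            if right:
                go_right = elem <= v  # upper bound: first position with elem > v
            else:
                go_right = elem < v   # lower bound: first position with elem >= v
            if go_right:
                lo = mid + 1
            else:
                hi = mid
        result.append(lo)
    return result


if __name__ == "__main__":
    print(searchsorted_1d([1, 3, 5, 7, 9], [3, 6, 9]))
    print(searchsorted_1d([1, 3, 5, 7, 9], [3, 6, 9], right=True))
    # Unsorted input plus an argsort-style sorter.
    print(searchsorted_1d([5, 1, 9, 3, 7], [3, 6, 9], sorter=[1, 3, 0, 4, 2]))
```

Each value is searched independently, which is why this kind of op maps naturally onto one GPU thread per value.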

Benchmark:

[----------------------------- searchsorted ----------------------------]
                                                         |  cpu   |  mps 
1 threads: --------------------------------------------------------------
      Batch size: 8; In features: 64; Sorter: True       |    44  |   530
      Batch size: 8; In features: 64; Sorter: False      |    31  |    12
      Batch size: 8; In features: 256; Sorter: True      |   131  |   520
      Batch size: 8; In features: 256; Sorter: False     |   107  |    12
      Batch size: 8; In features: 1024; Sorter: True     |   499  |   590
      Batch size: 8; In features: 1024; Sorter: False    |   398  |    12
      Batch size: 16; In features: 64; Sorter: True      |    71  |   540
      Batch size: 16; In features: 64; Sorter: False     |    57  |    12
      Batch size: 16; In features: 256; Sorter: True     |   242  |   610
      Batch size: 16; In features: 256; Sorter: False    |   200  |    12
      Batch size: 16; In features: 1024; Sorter: True    |   999  |   720
      Batch size: 16; In features: 1024; Sorter: False   |   842  |    12
      Batch size: 32; In features: 64; Sorter: True      |   124  |   509
      Batch size: 32; In features: 64; Sorter: False     |   103  |    12
      Batch size: 32; In features: 256; Sorter: True     |   477  |   650
      Batch size: 32; In features: 256; Sorter: False    |   407  |    12
      Batch size: 32; In features: 1024; Sorter: True    |  1940  |   833
      Batch size: 32; In features: 1024; Sorter: False   |  1710  |    12
      Batch size: 64; In features: 64; Sorter: True      |   231  |   590
      Batch size: 64; In features: 64; Sorter: False     |   194  |    12
      Batch size: 64; In features: 256; Sorter: True     |   937  |   710
      Batch size: 64; In features: 256; Sorter: False    |   800  |    13
      Batch size: 64; In features: 1024; Sorter: True    |  3980  |  1290
      Batch size: 64; In features: 1024; Sorter: False   |  3330  |    12
      Batch size: 128; In features: 64; Sorter: True     |   448  |   650
      Batch size: 128; In features: 64; Sorter: False    |   390  |    13
      Batch size: 128; In features: 256; Sorter: True    |  1830  |   850
      Batch size: 128; In features: 256; Sorter: False   |  1590  |    12
      Batch size: 128; In features: 1024; Sorter: True   |  7790  |  2850
      Batch size: 128; In features: 1024; Sorter: False  |  6670  |    13
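
To make the table easier to read, the following sketch computes the CPU/MPS ratio for each row and lists the configurations where MPS is faster. The numbers are copied from the table above; the time unit is whatever the benchmark reported (it is not stated here), and the helper names `speedup` and `mps_faster` are hypothetical.

```python
# (batch size, in features, sorter, cpu, mps), copied from the benchmark table.
ROWS = [
    (8, 64, True, 44, 530), (8, 64, False, 31, 12),
    (8, 256, True, 131, 520), (8, 256, False, 107, 12),
    (8, 1024, True, 499, 590), (8, 1024, False, 398, 12),
    (16, 64, True, 71, 540), (16, 64, False, 57, 12),
    (16, 256, True, 242, 610), (16, 256, False, 200, 12),
    (16, 1024, True, 999, 720), (16, 1024, False, 842, 12),
    (32, 64, True, 124, 509), (32, 64, False, 103, 12),
    (32, 256, True, 477, 650), (32, 256, False, 407, 12),
    (32, 1024, True, 1940, 833), (32, 1024, False, 1710, 12),
    (64, 64, True, 231, 590), (64, 64, False, 194, 12),
    (64, 256, True, 937, 710), (64, 256, False, 800, 13),
    (64, 1024, True, 3980, 1290), (64, 1024, False, 3330, 12),
    (128, 64, True, 448, 650), (128, 64, False, 390, 13),
    (128, 256, True, 1830, 850), (128, 256, False, 1590, 12),
    (128, 1024, True, 7790, 2850), (128, 1024, False, 6670, 13),
]


def speedup(cpu: float, mps: float) -> float:
    """CPU time divided by MPS time; values above 1 mean MPS is faster."""
    return cpu / mps


def mps_faster(rows, sorter: bool):
    """Return (batch size, in features) pairs where MPS beats CPU."""
    return [(b, f) for b, f, s, cpu, mps in rows if s == sorter and cpu > mps]


if __name__ == "__main__":
    for b, f, s, cpu, mps in ROWS:
        print(f"batch={b:<4} features={f:<5} sorter={s!s:<5} speedup={speedup(cpu, mps):.2f}x")
    print("MPS faster with sorter:", mps_faster(ROWS, sorter=True))
```

In these numbers, MPS wins in every `Sorter: False` row, while with a sorter it only wins once batch size times in-features reaches 16384. That is an observation about this table, not a general claim.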

[ghstack-poisoned]

pytorch-bot · 2023-11-03T06:39:38Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/112829

📄 Preview Python docs built from this PR
📄 Preview C++ docs built from this PR
❓ Need help or want to give feedback on the CI? Visit the bot commands wiki or our office hours

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit 96a4912 with merge base 3a284da:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

qqaatw · 2023-11-06T19:45:20Z

Hi @albanD @malfet, can you take a look at this stack please?

malfet

Overall LGTM, but please add a description (saying that it implements the operator as a Metal kernel, closely following Bucketization.cu).
Also, it would be good to add some perf numbers (to show that it's faster than CPU for large enough tensors).

aten/src/ATen/native/mps/operations/Bucketization.mm


Pull Request resolved: #112830 Approved by: https://github.com/kulinseth, https://github.com/malfet ghstack dependencies: #112829

Pull Request resolved: pytorch#112829 Approved by: https://github.com/malfet

Pull Request resolved: pytorch#112830 Approved by: https://github.com/kulinseth, https://github.com/malfet ghstack dependencies: pytorch#112829

[MPS] Add searchsorted op

415b999

[ghstack-poisoned]

qqaatw requested review from kulinseth, mruberry and ngimel as code owners November 3, 2023 06:39

pytorch-bot bot added the ciflow/mps (Run MPS tests (subset of trunk)) and release notes: mps (Release notes category) labels Nov 3, 2023

qqaatw mentioned this pull request Nov 3, 2023

[MPS] Add bucketize op #112830

Closed

pytorchbot added the open source label Nov 3, 2023

qqaatw requested review from albanD and malfet November 3, 2023 15:03

malfet approved these changes Nov 6, 2023


pytorchmergebot added the Merged label Nov 7, 2023

pytorchmergebot closed this in c4bb773 Nov 7, 2023

pytorchmergebot pushed a commit that referenced this pull request Nov 7, 2023

[MPS] Add bucketize op (#112830)

740137d

Pull Request resolved: #112830 Approved by: https://github.com/kulinseth, https://github.com/malfet ghstack dependencies: #112829

facebook-github-bot deleted the gh/qqaatw/26/head branch November 11, 2023 15:24

Skylion007 pushed a commit to Skylion007/pytorch that referenced this pull request Nov 14, 2023

[MPS] Add bucketize op (pytorch#112830)

7ea9687

Pull Request resolved: pytorch#112830 Approved by: https://github.com/kulinseth, https://github.com/malfet ghstack dependencies: pytorch#112829
