[MPS] Add scatter_reduce.two by malfet · Pull Request #141948 · pytorch/pytorch · GitHub

Conversation

@malfet
Contributor

@malfet malfet commented Dec 3, 2024

This op, which has been requested 20+ times on #77764, is just a flavor of the out-of-the-box scatter-reduce, so all it does is redispatch to the existing implementation.
Unsupported dtype/reduction type combinations:

  • min/max for int64
  • min/max for int32 on macOS 14 or older
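To make the redispatched semantics concrete, here is a plain-Python sketch of a 1-D scatter-reduce along axis 0. This is purely illustrative (the PR itself redispatches to PyTorch's existing MPS scatter implementation), and the helper name is hypothetical:

```python
# Illustrative sketch of 1-D scatter-reduce semantics (not the MPS implementation).
def scatter_reduce_1d(data, indices, updates, mode):
    ops = {
        "sum": lambda a, b: a + b,
        "prod": lambda a, b: a * b,
        "min": min,
        "max": max,
    }
    reduce_op = ops[mode]
    out = list(data)  # include_self=True: reduction starts from the existing values
    for idx, upd in zip(indices, updates):
        out[idx] = reduce_op(out[idx], upd)
    return out

print(scatter_reduce_1d([1, 2, 3, 4], [0, 1, 2, 3], [10, 0, 10, 0], "min"))
# → [1, 0, 3, 0]
```

With `mode="min"` and in-range updates this should leave smaller existing values untouched, which is exactly what the int64 repro below fails to do.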

The following Swift code demonstrates the problem with the `scatterAlongAxis` MPS call:

```swift
import Metal
import MetalPerformanceShadersGraph

func scatterMPS(device: MTLDevice,
                inp_buf: MTLBuffer, upd_buf: MTLBuffer,
                idx_buf: MTLBuffer, out_buf: MTLBuffer,
                inp_elem: Int, upd_elem: Int) {
  let graph = MPSGraph()
  let inputPlaceholder = graph.placeholder(shape: [inp_elem as NSNumber], dataType: .int64, name: nil)
  let updatesPlaceholder = graph.placeholder(shape: [upd_elem as NSNumber], dataType: .int64, name: nil)
  let indicesPlaceholder = graph.placeholder(shape: [upd_elem as NSNumber], dataType: .int64, name: nil)
  let outNode = graph.scatterAlongAxis(0, data: inputPlaceholder, updates: updatesPlaceholder, indices: indicesPlaceholder, mode: .min, name: nil)
  let mpsInputBuffer = MPSGraphTensorData(inp_buf, shape: [inp_elem as NSNumber], dataType: .int64)
  let mpsUpdatesBuffer = MPSGraphTensorData(upd_buf, shape: [upd_elem as NSNumber], dataType: .int64)
  let mpsIndicesBuffer = MPSGraphTensorData(idx_buf, shape: [upd_elem as NSNumber], dataType: .int64)
  let mpsOutputBuffer = MPSGraphTensorData(out_buf, shape: [inp_elem as NSNumber], dataType: .int64)
  guard let queue = device.makeCommandQueue() else { fatalError("Can't make queue") }
  graph.run(with: queue, feeds: [inputPlaceholder: mpsInputBuffer,
                               updatesPlaceholder: mpsUpdatesBuffer,
                               indicesPlaceholder: mpsIndicesBuffer],
            targetOperations: nil, resultsDictionary: [outNode: mpsOutputBuffer])
}

func makeBufferWithValues(device: MTLDevice, values: [Int64]) -> MTLBuffer {
  guard let buf = device.makeBuffer(length: values.count * MemoryLayout<Int64>.size, options: [.storageModeShared]) else { fatalError("Can't alloc") }
  let buf_data = buf.contents().assumingMemoryBound(to: Int64.self)
  for i in 0..<values.count {
    buf_data[i] = values[i]
  }
  return buf
}

guard let device = MTLCopyAllDevices().first else { fatalError("No Metal device found") }
print("Using device \(device.name)")

let inp_elem = 4
let upd_elem = 4
let inp_buf = makeBufferWithValues(device: device, values: [1, 2, 3, 4])
let upd_buf = makeBufferWithValues(device: device, values: [Int64.max - 1, Int64.max - 2, Int64.max >> 16, 11])
let idx_buf = makeBufferWithValues(device: device, values: [0, 1, 2, 3])
guard let out_buf = device.makeBuffer(length: inp_elem * MemoryLayout<Int64>.size, options: [.storageModeShared]) else { fatalError("Can't alloc") }

scatterMPS(device: device,
           inp_buf: inp_buf, upd_buf: upd_buf,
           idx_buf: idx_buf, out_buf: out_buf,
           inp_elem: inp_elem, upd_elem: upd_elem)

let obuf_data = out_buf.contents().assumingMemoryBound(to: Int64.self)
for i in 0..<inp_elem {
    print("out_buf[\(i)] = \(obuf_data[i])")
}
```

It prints `4294967294, 4294967293, 4294967295, 4` instead of the expected `1, 2, 3, 4`,
whereas `torch.tensor([[1, 9223372036854775806], [2, 9223372036854775805], [3, 140737488355327], [4, 11]], dtype=torch.int64, device='mps').max(1)` yields the expected result.
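The wrong values look like the low 32 bits of the updates interpreted as signed int32. A minimal Python sketch reproduces the observed output under that hypothesis; note this is an assumption inferred from the numbers, not a confirmed root cause inside MPSGraph:

```python
# Hypothesis: MPSGraph truncates the int64 updates to signed int32 before the
# min, then zero-extends the 32-bit result back to 64 bits. Assumption only,
# inferred from the observed output.
def to_int32(x):
    """Reinterpret the low 32 bits of x as a signed 32-bit integer."""
    return ((x + 2**31) % 2**32) - 2**31

inputs = [1, 2, 3, 4]
# Int64.max - 1, Int64.max - 2, Int64.max >> 16, 11
updates = [2**63 - 2, 2**63 - 3, (2**63 - 1) >> 16, 11]

# min over truncated values, then zero-extend (mask to the low 32 bits):
result = [min(inp, to_int32(upd)) % 2**32 for inp, upd in zip(inputs, updates)]
print(result)  # → [4294967294, 4294967293, 4294967295, 4]
```

For example, `Int64.max - 1` truncates to `-2`, `min(1, -2)` is `-2`, and zero-extending `-2` gives `4294967294`, matching `out_buf[0]` above.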

@malfet malfet requested a review from kulinseth as a code owner December 3, 2024 06:51
@pytorch-bot

pytorch-bot bot commented Dec 3, 2024

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/141948

Note: Links to docs will display an error until the docs builds have been completed.

❌ 1 New Failure

As of commit 1ff4994 with merge base 78543e6:

NEW FAILURE - The following job has failed:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@pytorch-bot pytorch-bot bot added ciflow/mps Run MPS tests (subset of trunk) release notes: mps Release notes category labels Dec 3, 2024
@malfet malfet requested a review from Skylion007 December 3, 2024 06:52
@github-actions
Contributor

github-actions bot commented Dec 3, 2024

Attention! native_functions.yaml was changed

If you are adding a new function or defaulted argument to native_functions.yaml, you cannot use it from pre-existing Python frontend code until our FC window passes (two weeks). Split your PR into two PRs, one which adds the new C++ functionality, and one that makes use of it from Python, and land them two weeks apart. See https://github.com/pytorch/pytorch/wiki/PyTorch's-Python-Frontend-Backward-and-Forward-Compatibility-Policy#forwards-compatibility-fc for more info.


@malfet malfet force-pushed the malfet/mps-add-scatter-reduce-two branch from b8c9b2d to c63b817 on December 3, 2024 18:07
@malfet malfet added this to the 2.6.0 milestone Dec 3, 2024
@malfet
Contributor Author

malfet commented Dec 4, 2024

@pytorchbot merge -f "MPS tests + Lint are green"

@pytorchmergebot
Collaborator

Merge started

Your change will be merged immediately since you used the force (-f) flag, bypassing any CI checks (ETA: 1-5 minutes). Please use -f as last resort and instead consider -i/--ignore-current to continue the merge ignoring current failures. This will allow currently pending tests to finish and report signal before the merge.

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team


pobin6 pushed a commit to pobin6/pytorch that referenced this pull request Dec 5, 2024
Pull Request resolved: pytorch#141948
Approved by: https://github.com/manuelcandales
AmdSampsa pushed a commit to AmdSampsa/pytorch that referenced this pull request Dec 9, 2024
@malfet malfet deleted the malfet/mps-add-scatter-reduce-two branch December 12, 2024 22:30
@atalman
Contributor

atalman commented Jan 21, 2025

Results of running the final RC for 2.6 on macOS 15.1.1:

python test_mps.py -v -k scatter_reduce
Fail to import hypothesis in common_utils, tests are not derandomized
test_output_grad_match_scatter_reduce_amax_cpu_float16 (__main__.TestConsistencyCPU) ... ok
test_output_grad_match_scatter_reduce_amax_cpu_float32 (__main__.TestConsistencyCPU) ... ok
test_output_grad_match_scatter_reduce_amin_cpu_float16 (__main__.TestConsistencyCPU) ... ok
test_output_grad_match_scatter_reduce_amin_cpu_float32 (__main__.TestConsistencyCPU) ... ok
test_output_grad_match_scatter_reduce_mean_cpu_float16 (__main__.TestConsistencyCPU) ... ok
test_output_grad_match_scatter_reduce_mean_cpu_float32 (__main__.TestConsistencyCPU) ... ok
test_output_grad_match_scatter_reduce_prod_cpu_float16 (__main__.TestConsistencyCPU) ... ok
test_output_grad_match_scatter_reduce_prod_cpu_float32 (__main__.TestConsistencyCPU) ... ok
test_output_grad_match_scatter_reduce_sum_cpu_float16 (__main__.TestConsistencyCPU) ... ok
test_output_grad_match_scatter_reduce_sum_cpu_float32 (__main__.TestConsistencyCPU) ... ok
test_output_match_scatter_reduce_amax_cpu_bfloat16 (__main__.TestConsistencyCPU) ... ok
test_output_match_scatter_reduce_amax_cpu_bool (__main__.TestConsistencyCPU) ... ok
test_output_match_scatter_reduce_amax_cpu_float16 (__main__.TestConsistencyCPU) ... ok
test_output_match_scatter_reduce_amax_cpu_float32 (__main__.TestConsistencyCPU) ... ok
test_output_match_scatter_reduce_amax_cpu_int16 (__main__.TestConsistencyCPU) ... ok
test_output_match_scatter_reduce_amax_cpu_int32 (__main__.TestConsistencyCPU) ... ok
test_output_match_scatter_reduce_amax_cpu_int64 (__main__.TestConsistencyCPU) ... expected failure
test_output_match_scatter_reduce_amax_cpu_int8 (__main__.TestConsistencyCPU) ... ok
test_output_match_scatter_reduce_amax_cpu_uint8 (__main__.TestConsistencyCPU) ... ok
test_output_match_scatter_reduce_amin_cpu_bfloat16 (__main__.TestConsistencyCPU) ... ok
test_output_match_scatter_reduce_amin_cpu_bool (__main__.TestConsistencyCPU) ... ok
test_output_match_scatter_reduce_amin_cpu_float16 (__main__.TestConsistencyCPU) ... ok
test_output_match_scatter_reduce_amin_cpu_float32 (__main__.TestConsistencyCPU) ... ok
test_output_match_scatter_reduce_amin_cpu_int16 (__main__.TestConsistencyCPU) ... ok
test_output_match_scatter_reduce_amin_cpu_int32 (__main__.TestConsistencyCPU) ... ok
test_output_match_scatter_reduce_amin_cpu_int64 (__main__.TestConsistencyCPU) ... expected failure
test_output_match_scatter_reduce_amin_cpu_int8 (__main__.TestConsistencyCPU) ... ok
test_output_match_scatter_reduce_amin_cpu_uint8 (__main__.TestConsistencyCPU) ... ok
test_output_match_scatter_reduce_mean_cpu_bfloat16 (__main__.TestConsistencyCPU) ... ok
test_output_match_scatter_reduce_mean_cpu_float16 (__main__.TestConsistencyCPU) ... ok
test_output_match_scatter_reduce_mean_cpu_float32 (__main__.TestConsistencyCPU) ... ok
test_output_match_scatter_reduce_mean_cpu_int16 (__main__.TestConsistencyCPU) ... ok
test_output_match_scatter_reduce_mean_cpu_int32 (__main__.TestConsistencyCPU) ... ok
test_output_match_scatter_reduce_mean_cpu_int64 (__main__.TestConsistencyCPU) ... ok
test_output_match_scatter_reduce_mean_cpu_int8 (__main__.TestConsistencyCPU) ... ok
test_output_match_scatter_reduce_mean_cpu_uint8 (__main__.TestConsistencyCPU) ... ok
test_output_match_scatter_reduce_prod_cpu_bfloat16 (__main__.TestConsistencyCPU) ... ok
test_output_match_scatter_reduce_prod_cpu_bool (__main__.TestConsistencyCPU) ... ok
test_output_match_scatter_reduce_prod_cpu_float16 (__main__.TestConsistencyCPU) ... ok
test_output_match_scatter_reduce_prod_cpu_float32 (__main__.TestConsistencyCPU) ... ok
test_output_match_scatter_reduce_prod_cpu_int16 (__main__.TestConsistencyCPU) ... ok
test_output_match_scatter_reduce_prod_cpu_int32 (__main__.TestConsistencyCPU) ... ok
test_output_match_scatter_reduce_prod_cpu_int64 (__main__.TestConsistencyCPU) ... ok
test_output_match_scatter_reduce_prod_cpu_int8 (__main__.TestConsistencyCPU) ... ok
test_output_match_scatter_reduce_prod_cpu_uint8 (__main__.TestConsistencyCPU) ... ok
test_output_match_scatter_reduce_sum_cpu_bfloat16 (__main__.TestConsistencyCPU) ... ok
test_output_match_scatter_reduce_sum_cpu_bool (__main__.TestConsistencyCPU) ... ok
test_output_match_scatter_reduce_sum_cpu_float16 (__main__.TestConsistencyCPU) ... ok
test_output_match_scatter_reduce_sum_cpu_float32 (__main__.TestConsistencyCPU) ... ok
test_output_match_scatter_reduce_sum_cpu_int16 (__main__.TestConsistencyCPU) ... ok
test_output_match_scatter_reduce_sum_cpu_int32 (__main__.TestConsistencyCPU) ... ok
test_output_match_scatter_reduce_sum_cpu_int64 (__main__.TestConsistencyCPU) ... ok
test_output_match_scatter_reduce_sum_cpu_int8 (__main__.TestConsistencyCPU) ... ok
test_output_match_scatter_reduce_sum_cpu_uint8 (__main__.TestConsistencyCPU) ... ok
test_scatter_reduce (__main__.TestMPS) ... /Users/atalman/Downloads/release26/pytorch/test/test_mps.py:7655: UserWarning: The reduce argument of torch.scatter with Tensor src is deprecated and will be removed in a future PyTorch release. Use torch.scatter_reduce instead for more reduction options. (Triggered internally at /Users/runner/work/pytorch/pytorch/pytorch/aten/src/ATen/native/TensorAdvancedIndexing.cpp:234.)
  scatter_result = torch.scatter(x, dim=dim, index=idx, src=src, reduce=reduce_str)
ok

----------------------------------------------------------------------
Ran 55 tests in 5.660s

OK (expected failures=2)

The final RC running on macOS 14.4:

python test_mps.py -v -k scatter_reduce
Fail to import hypothesis in common_utils, tests are not derandomized
test_output_grad_match_scatter_reduce_amax_cpu_float16 (__main__.TestConsistencyCPU) ... ok
test_output_grad_match_scatter_reduce_amax_cpu_float32 (__main__.TestConsistencyCPU) ... ok
test_output_grad_match_scatter_reduce_amin_cpu_float16 (__main__.TestConsistencyCPU) ... ok
test_output_grad_match_scatter_reduce_amin_cpu_float32 (__main__.TestConsistencyCPU) ... ok
test_output_grad_match_scatter_reduce_mean_cpu_float16 (__main__.TestConsistencyCPU) ... ok
test_output_grad_match_scatter_reduce_mean_cpu_float32 (__main__.TestConsistencyCPU) ... ok
test_output_grad_match_scatter_reduce_prod_cpu_float16 (__main__.TestConsistencyCPU) ... ok
test_output_grad_match_scatter_reduce_prod_cpu_float32 (__main__.TestConsistencyCPU) ... ok
test_output_grad_match_scatter_reduce_sum_cpu_float16 (__main__.TestConsistencyCPU) ... ok
test_output_grad_match_scatter_reduce_sum_cpu_float32 (__main__.TestConsistencyCPU) ... ok
test_output_match_scatter_reduce_amax_cpu_bfloat16 (__main__.TestConsistencyCPU) ... ok
test_output_match_scatter_reduce_amax_cpu_bool (__main__.TestConsistencyCPU) ... ok
test_output_match_scatter_reduce_amax_cpu_float16 (__main__.TestConsistencyCPU) ... ok
test_output_match_scatter_reduce_amax_cpu_float32 (__main__.TestConsistencyCPU) ... ok
test_output_match_scatter_reduce_amax_cpu_int16 (__main__.TestConsistencyCPU) ... ok
test_output_match_scatter_reduce_amax_cpu_int32 (__main__.TestConsistencyCPU) ... expected failure
test_output_match_scatter_reduce_amax_cpu_int64 (__main__.TestConsistencyCPU) ... expected failure
test_output_match_scatter_reduce_amax_cpu_int8 (__main__.TestConsistencyCPU) ... ok
test_output_match_scatter_reduce_amax_cpu_uint8 (__main__.TestConsistencyCPU) ... ok
test_output_match_scatter_reduce_amin_cpu_bfloat16 (__main__.TestConsistencyCPU) ... ok
test_output_match_scatter_reduce_amin_cpu_bool (__main__.TestConsistencyCPU) ... ok
test_output_match_scatter_reduce_amin_cpu_float16 (__main__.TestConsistencyCPU) ... ok
test_output_match_scatter_reduce_amin_cpu_float32 (__main__.TestConsistencyCPU) ... ok
test_output_match_scatter_reduce_amin_cpu_int16 (__main__.TestConsistencyCPU) ... ok
test_output_match_scatter_reduce_amin_cpu_int32 (__main__.TestConsistencyCPU) ... expected failure
test_output_match_scatter_reduce_amin_cpu_int64 (__main__.TestConsistencyCPU) ... expected failure
test_output_match_scatter_reduce_amin_cpu_int8 (__main__.TestConsistencyCPU) ... ok
test_output_match_scatter_reduce_amin_cpu_uint8 (__main__.TestConsistencyCPU) ... ok
test_output_match_scatter_reduce_mean_cpu_bfloat16 (__main__.TestConsistencyCPU) ... ok
test_output_match_scatter_reduce_mean_cpu_float16 (__main__.TestConsistencyCPU) ... ok
test_output_match_scatter_reduce_mean_cpu_float32 (__main__.TestConsistencyCPU) ... ok
test_output_match_scatter_reduce_mean_cpu_int16 (__main__.TestConsistencyCPU) ... ok
test_output_match_scatter_reduce_mean_cpu_int32 (__main__.TestConsistencyCPU) ... ok
test_output_match_scatter_reduce_mean_cpu_int64 (__main__.TestConsistencyCPU) ... ok
test_output_match_scatter_reduce_mean_cpu_int8 (__main__.TestConsistencyCPU) ... ok
test_output_match_scatter_reduce_mean_cpu_uint8 (__main__.TestConsistencyCPU) ... ok
test_output_match_scatter_reduce_prod_cpu_bfloat16 (__main__.TestConsistencyCPU) ... ok
test_output_match_scatter_reduce_prod_cpu_bool (__main__.TestConsistencyCPU) ... ok
test_output_match_scatter_reduce_prod_cpu_float16 (__main__.TestConsistencyCPU) ... ok
test_output_match_scatter_reduce_prod_cpu_float32 (__main__.TestConsistencyCPU) ... ok
test_output_match_scatter_reduce_prod_cpu_int16 (__main__.TestConsistencyCPU) ... ok
test_output_match_scatter_reduce_prod_cpu_int32 (__main__.TestConsistencyCPU) ... ok
test_output_match_scatter_reduce_prod_cpu_int64 (__main__.TestConsistencyCPU) ... ok
test_output_match_scatter_reduce_prod_cpu_int8 (__main__.TestConsistencyCPU) ... ok
test_output_match_scatter_reduce_prod_cpu_uint8 (__main__.TestConsistencyCPU) ... ok
test_output_match_scatter_reduce_sum_cpu_bfloat16 (__main__.TestConsistencyCPU) ... ok
test_output_match_scatter_reduce_sum_cpu_bool (__main__.TestConsistencyCPU) ... ok
test_output_match_scatter_reduce_sum_cpu_float16 (__main__.TestConsistencyCPU) ... ok
test_output_match_scatter_reduce_sum_cpu_float32 (__main__.TestConsistencyCPU) ... ok
test_output_match_scatter_reduce_sum_cpu_int16 (__main__.TestConsistencyCPU) ... ok
test_output_match_scatter_reduce_sum_cpu_int32 (__main__.TestConsistencyCPU) ... ok
test_output_match_scatter_reduce_sum_cpu_int64 (__main__.TestConsistencyCPU) ... ok
test_output_match_scatter_reduce_sum_cpu_int8 (__main__.TestConsistencyCPU) ... ok
test_output_match_scatter_reduce_sum_cpu_uint8 (__main__.TestConsistencyCPU) ... ok
test_scatter_reduce (__main__.TestMPS) ... /Users/ec2-user/test/pytorch/test/test_mps.py:7655: UserWarning: The reduce argument of torch.scatter with Tensor src is deprecated and will be removed in a future PyTorch release. Use torch.scatter_reduce instead for more reduction options. (Triggered internally at /Users/runner/work/pytorch/pytorch/pytorch/aten/src/ATen/native/TensorAdvancedIndexing.cpp:234.)
  scatter_result = torch.scatter(x, dim=dim, index=idx, src=src, reduce=reduce_str)
ok

----------------------------------------------------------------------
Ran 55 tests in 10.426s

OK (expected failures=4)

Labels

ciflow/mps Run MPS tests (subset of trunk) Merged release notes: mps Release notes category
