[MPS] Fix memory leak #142052
Closed
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/142052
Note: Links to docs will display an error until the docs builds have been completed.
✅ No Failures as of commit 9403bd2 with merge base 61dc5e9.
This comment was automatically generated by Dr. CI and updates every 15 minutes.
This was referenced Dec 4, 2024
manuelcandales approved these changes Dec 4, 2024
pytorchmergebot pushed a commit that referenced this pull request Dec 4, 2024
Fixes the leak by releasing the retained `id<MTLFunction>` and `id<MTLComputePipelineState>` objects.
Please note that the `id<MTLLibrary>` objects associated with the class are currently leaked, which is by design; all dynamic shader allocations should use `DynamicMetalShaderLibrary`.
Test plan: `leaks --atExit -- ./bin/mps_test_metal_library`
Before:
```
STACK OF 1 INSTANCE OF 'ROOT LEAK: <_MTLFunctionInternal>':
18 dyld 0x197a94274 start + 2840
17 mps_test_metal_library 0x1002cb420 main + 68
16 mps_test_metal_library 0x1002fa388 testing::UnitTest::Run() + 124
15 mps_test_metal_library 0x1002fa40c bool testing::internal::HandleExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) + 80
14 mps_test_metal_library 0x1002fac50 testing::internal::UnitTestImpl::RunAllTests() + 1588
13 mps_test_metal_library 0x1002e9934 testing::TestSuite::Run() + 1032
12 mps_test_metal_library 0x1002e8688 testing::TestInfo::Run() + 960
11 mps_test_metal_library 0x1002e715c testing::Test::Run() + 812
10 mps_test_metal_library 0x1002e7200 void testing::internal::HandleExceptionsInMethodIfSupported<testing::TestSuite, void>(testing::TestSuite*, void (testing::TestSuite::*)(), char const*) + 80
9 mps_test_metal_library 0x1002c5518 MPSTestMetalLibrary_ArangeShader_Test::TestBody() + 420
8 libtorch_cpu.dylib 0x10fdd3804 at::native::mps::MetalShaderLibrary::getKernelFunction(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>> const&) + 56
7 libtorch_cpu.dylib 0x10fdd3394 at::native::mps::MetalShaderLibrary::getLibraryPipelineState(id<MTLLibrary>, std::__1::basic_string<char, id<MTLLibrary>::char_traits<char>, id<MTLLibrary>::allocator<char>> const&) + 268
6 com.apple.Metal 0x1a2be43b4 -[_MTLLibrary newFunctionWithName:] + 28
5 com.apple.Metal 0x1a2be4498 -[_MTLLibrary newFunctionWithNameInternal:] + 148
4 com.apple.Metal 0x1a2be4580 MTLLibraryContainer::functionWithName(NSString*, id<MTLDevice>) + 68
3 com.apple.Metal 0x1a2be4724 MTLLibraryDataWithArchive::newFunction(NSString*, id<MTLDevice>) + 368
2 libobjc.A.dylib 0x197a49ddc _objc_rootAllocWithZone + 48
1 libsystem_malloc.dylib 0x197c3baf8 _calloc + 88
0 libsystem_malloc.dylib 0x197c4e9bc _malloc_zone_calloc_instrumented_or_legacy + 128
====
2 (592 bytes) ROOT LEAK: <_MTLFunctionInternal 0x1325e5550> [448]
1 (144 bytes) _functionQueue --> <dispatch_queue_t (serial) 0x13254c340> [144] "function queue" (from Metal)
```
After:
```
Process: mps_test_metal_library [30687]
Path: /Users/USER/*/mps_test_metal_library
Load Address: 0x100f74000
Identifier: mps_test_metal_library
Version: 0
Code Type: ARM64
Platform: macOS
Parent Process: leaks [30686]
Date/Time: 2024-12-04 07:57:01.020 -0800
Launch Time: 2024-12-04 07:56:59.030 -0800
OS Version: macOS 15.1.1 (24B2091)
Report Version: 7
Analysis Tool: /usr/bin/leaks
Physical footprint: 177.2M
Physical footprint (peak): 236.5M
Idle exit: untracked
----
leaks Report Version: 4.0, multi-line stacks
Process 30687: 40691 nodes malloced for 5575 KB
Process 30687: 0 leaks for 0 total leaked bytes.
```
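The test plan above reads the `leaks` summary line by eye. A minimal sketch of automating that check follows, assuming the summary line keeps the format seen in the reports in this thread (`Process <pid>: <n> leaks for <m> total leaked bytes.`). The names `LeakSummary` and `parse_leaks_summary` are hypothetical helpers for illustration; they are not part of PyTorch or of the `leaks` tool.

```python
import re
from typing import NamedTuple, Optional


class LeakSummary(NamedTuple):
    pid: int
    leaks: int
    leaked_bytes: int


# Matches e.g. "Process 30687: 0 leaks for 0 total leaked bytes."
# The optional "s" for a singular "leak" is an assumption, not taken from the reports.
_SUMMARY_RE = re.compile(
    r"^Process (\d+): (\d+) leaks? for (\d+) total leaked bytes\.",
    re.MULTILINE,
)


def parse_leaks_summary(report: str) -> Optional[LeakSummary]:
    """Return the parsed summary line, or None if the report has none."""
    match = _SUMMARY_RE.search(report)
    if match is None:
        return None
    pid, leaks, leaked_bytes = (int(group) for group in match.groups())
    return LeakSummary(pid, leaks, leaked_bytes)
```

A test wrapper could capture the output of `leaks --atExit -- ./bin/mps_test_metal_library`, pass it to `parse_leaks_summary`, and fail when the result is `None` or its `leaks` field is nonzero.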
Pull Request resolved: #142053
Approved by: https://github.com/manuelcandales
ghstack dependencies: #142052
pobin6 pushed a commit to pobin6/pytorch that referenced this pull request Dec 5, 2024
`NSProcessInfo` was allocated inside an autorelease pool but was never added to the pool.
Test plan: `leaks --atExit -- ./bin/mps_test_print`
Before the fix, it reported the following leaks:
```
leaks Report Version: 4.0, multi-line stacks
Process 30066: 39595 nodes malloced for 5034 KB
Process 30066: 7 leaks for 448 total leaked bytes.
STACK OF 1 INSTANCE OF 'ROOT LEAK: <NSProcessInfo>':
29 dyld 0x197a94274 start + 2840
28 mps_test_print 0x10224440c main + 68
27 mps_test_print 0x1022733e4 testing::UnitTest::Run() + 124
26 mps_test_print 0x102273468 bool testing::internal::HandleExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) + 80
25 mps_test_print 0x102273cac testing::internal::UnitTestImpl::RunAllTests() + 1588
24 mps_test_print 0x102262990 testing::TestSuite::Run() + 1032
23 mps_test_print 0x1022616e4 testing::TestInfo::Run() + 960
22 mps_test_print 0x1022601b8 testing::Test::Run() + 812
21 mps_test_print 0x10226025c void testing::internal::HandleExceptionsInMethodIfSupported<testing::TestSuite, void>(testing::TestSuite*, void (testing::TestSuite::*)(), char const*) + 80
20 mps_test_print 0x102240f88 MPSPrintTest_PrintFloatMatrix_Test::TestBody() + 88
19 mps_test_print 0x1022414f4 torch::randn(c10::ArrayRef<long long>, c10::TensorOptions) + 72
18 libtorch_cpu.dylib 0x10de1cb34 at::_ops::randn::call(c10::ArrayRef<c10::SymInt>, std::__1::optional<c10::ScalarType>, std::__1::optional<c10::Layout>, std::__1::optional<c10::Device>, std::__1::optional<bool>) + 280
17 libtorch_cpu.dylib 0x10de1cf1c at::_ops::randn::redispatch(c10::DispatchKeySet, c10::ArrayRef<c10::SymInt>, std::__1::optional<c10::ScalarType>, std::__1::optional<c10::Layout>, std::__1::optional<c10::Device>, std::__1::optional<bool>) + 152
16 libtorch_cpu.dylib 0x10d9b1078 at::native::randn(c10::ArrayRef<long long>, std::__1::optional<c10::ScalarType>, std::__1::optional<c10::Layout>, std::__1::optional<c10::Device>, std::__1::optional<bool>) + 60
15 libtorch_cpu.dylib 0x10d9b1220 at::native::randn(c10::ArrayRef<long long>, std::__1::optional<at::Generator>, std::__1::optional<c10::ScalarType>, std::__1::optional<c10::Layout>, std::__1::optional<c10::Device>, std::__1::optional<bool>) + 256
14 libtorch_cpu.dylib 0x10e0151f8 at::_ops::normal_::call(at::Tensor&, double, double, std::__1::optional<at::Generator>) + 476
13 libtorch_cpu.dylib 0x10f08ceac c10::impl::wrap_kernel_functor_unboxed_<c10::impl::detail::WrapFunctionIntoFunctor_<c10::CompileTimeFunctionPointer<at::Tensor& (at::Tensor&, double, double, std::__1::optional<at::Generator>), &at::(anonymous namespace)::(anonymous namespace)::wrapper_MPS__normal_(at::Tensor&, double, double, std::__1::optional<at::Generator>)>, at::Tensor&, c10::guts::typelist::typelist<at::Tensor&, double, double, std::__1::optional<at::Generator>>>, at::Tensor& (at::Tensor&, double, double, std::__1::optional<at::Generator>)>::call(c10::OperatorKernel*, c10::DispatchKeySet, at::Tensor&, double, double, std::__1::optional<at::Generator>) + 84
12 libtorch_cpu.dylib 0x10f037674 at::(anonymous namespace)::(anonymous namespace)::wrapper_MPS__normal_(at::Tensor&, double, double, std::__1::optional<at::Generator>) + 72
11 libtorch_cpu.dylib 0x111d8bde8 at::native::normal_mps_(at::Tensor&, double, double, std::__1::optional<at::Generator>) + 132
10 libtorch_cpu.dylib 0x111d8c334 at::native::mps::normal_mps_impl(at::Tensor&, double, double, std::__1::optional<at::Tensor> const&, std::__1::optional<at::Tensor> const&, std::__1::optional<at::Generator>, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>) + 884
9 libtorch_cpu.dylib 0x111d8b8d8 at::Tensor& at::native::mps::random_mps_impl<double>(at::Tensor&, double, double, std::__1::optional<at::Tensor> const&, std::__1::optional<at::Tensor> const&, MPSGraphRandomDistribution, std::__1::optional<at::Generator>, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>, MPSGraphTensor* (at::native::mps::RandomCachedGraph*, MPSGraphTensor*) block_pointer) + 2508
8 libtorch_cpu.dylib 0x111d453bc at::native::mps::Placeholder::Placeholder(MPSGraphTensor*, at::Tensor const&, NSArray<NSNumber*>*, bool, MPSDataType, bool) + 5120
7 libtorch_cpu.dylib 0x111d2dbc8 at::mps::MPSDevice::isMacOS13Plus(at::mps::MacOSVersion) const + 404
6 libtorch_cpu.dylib 0x111d2ddf0 at::mps::MPSDevice::isMacOS13Plus(at::mps::MacOSVersion) const::$_0::operator()(int, int) const + 48
5 libobjc.A.dylib 0x197a7b3f4 objc_alloc_init + 80
4 com.apple.Foundation 0x19995fbe4 +[NSProcessInfo alloc] + 112
3 com.apple.Foundation 0x19995faec +[NSProcessInfo allocWithZone:] + 120
2 libobjc.A.dylib 0x197a49ddc _objc_rootAllocWithZone + 48
1 libsystem_malloc.dylib 0x197c3baf8 _calloc + 88
0 libsystem_malloc.dylib 0x197c4e9bc _malloc_zone_calloc_instrumented_or_legacy + 128
====
1 (64 bytes) ROOT LEAK: <NSProcessInfo 0x102ce4de0> [64]
```
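When a report lists several leaks, it can help to total the leaked bytes per class. The sketch below assumes root-leak lines keep the shape seen in this thread (`<nodes> (<bytes> bytes) ROOT LEAK: <ClassName 0xADDRESS> [size]`) and treats the parenthesized byte count as the total for that leak tree, which matches the `_MTLFunctionInternal` report earlier in this thread (448 + 144 = 592). The function name `leaked_bytes_by_class` is hypothetical and is not part of any tool.

```python
import re
from collections import Counter

# Matches e.g. "1 (64 bytes) ROOT LEAK: <NSProcessInfo 0x102ce4de0> [64]".
# Child lines (such as "_functionQueue --> ...") and the "STACK OF ..." header
# do not match, so each leak tree is counted once.
_ROOT_LEAK_RE = re.compile(
    r"^[ \t]*(\d+) \((\d+) bytes\) ROOT LEAK: <(.+?) 0x[0-9a-fA-F]+>",
    re.MULTILINE,
)


def leaked_bytes_by_class(report: str) -> Counter:
    """Sum the per-tree byte totals of every ROOT LEAK line, keyed by class name."""
    totals = Counter()
    for _nodes, nbytes, cls in _ROOT_LEAK_RE.findall(report):
        totals[cls] += int(nbytes)
    return totals
```

For the report above, this would yield a single entry for `NSProcessInfo` with 64 bytes.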
After the fix, the test run finishes with no leaks reported:
```
Process 29875 is not debuggable. Due to security restrictions, leaks can only show or save contents of readonly memory of restricted processes.
Process: mps_test_print [29875]
Path: /Users/USER/*/mps_test_print
Load Address: 0x10223c000
Identifier: mps_test_print
Version: 0
Code Type: ARM64
Platform: macOS
Parent Process: leaks [29874]
Date/Time: 2024-12-04 07:43:15.287 -0800
Launch Time: 2024-12-04 07:43:14.400 -0800
OS Version: macOS 15.1.1 (24B2091)
Report Version: 7
Analysis Tool: /usr/bin/leaks
Physical footprint: 172.0M
Physical footprint (peak): 234.1M
Idle exit: untracked
----
leaks Report Version: 4.0, multi-line stacks
Process 29875: 39508 nodes malloced for 5021 KB
Process 29875: 0 leaks for 0 total leaked bytes.
```
Pull Request resolved: pytorch#142052
Approved by: https://github.com/manuelcandales
Stack from ghstack (oldest at bottom):
`NSProcessInfo` was allocated inside an autorelease pool but was never added to the pool.
Test plan: `leaks --atExit -- ./bin/mps_test_print`
Before the fix, the run reported leaks; after the fix, it finishes with no leaks reported.