[release/9.0.1xx] [src/runtime] Don't allow native code to resolve a weak reference for an object in the finalizer queue. by rolfbjarne · Pull Request #23086 · dotnet/macios · GitHub

@rolfbjarne
Member

The following happens:

  1. An instance of a custom NSObject subclass is created, with both a native instance
    and the corresponding managed wrapper.

  2. Native code creates a weak reference to the native instance.

  3. No other managed code references the managed instance, and there's no non-weak
    native reference to the native instance, so the GC schedules the managed instance
    for finalization.

  4. Native code fetches the native instance from the weak reference.

  5. Native code thinks it can do whatever it wants with the native instance, because
    it got a perfectly valid (and retained) native instance. Unfortunately, there's no way
    to stop the managed instance from being finalized (which happens on a background
    thread anyway, so there's no thread-safe way to do it), which means havoc ensues.

The most frequent manifestation of this problem has been like this:

ObjCRuntime.RuntimeException: Failed to marshal the Objective-C object 0x13d566800 (type: Microsoft_Maui_Platform_MauiTextField). Could not find an existing managed instance for this object, nor was it possible to create a new managed instance (because the type 'Microsoft.Maui.Platform.MauiTextField' does not have a constructor that takes one NativeHandle argument).
   at ObjCRuntime.Runtime.MissingCtor(IntPtr , IntPtr , Type , MissingCtorResolution , IntPtr , RuntimeMethodHandle )
   at ObjCRuntime.Runtime.ConstructNSObject[NSObject](IntPtr , Type , MissingCtorResolution , IntPtr , RuntimeMethodHandle )
   at ObjCRuntime.Runtime.ConstructNSObject[NSObject](IntPtr , Type , MissingCtorResolution )
   at ObjCRuntime.Runtime.ConstructNSObject(IntPtr , IntPtr , MissingCtorResolution )
   at ObjCRuntime.Runtime.GetNSObject(IntPtr , MissingCtorResolution , Boolean )
   at ObjCRuntime.Runtime.GetNSObject(IntPtr )
   at ObjCRuntime.Runtime.InvokeConformsToProtocol(IntPtr , IntPtr )
   at ObjCRuntime.Runtime.invoke_conforms_to_protocol(IntPtr obj, IntPtr protocol, IntPtr* exception_gchandle)
   Exception_EndOfInnerExceptionStack

but other problems can also occur.

This is a rather complicated issue to fix, because:

  • There's no way to be notified when native code creates a weak reference to a
    native instance.
  • There's not even a way to know if anybody has a weak reference to a native
    instance.
  • I also looked into manually clearing the weak references for a native
    instance, but that's not possible either.

However, it is possible to basically say "nope!" when native code tries to
fetch a native instance from a weak reference:

We override the '[NSObject retainWeakReference]' method for all custom
NSObject subclasses, and return FALSE if the corresponding managed object has
been scheduled for finalization (otherwise call the base class implementation).

The ['[NSObject retainWeakReference]'][1] method is public, but it's not
documented how it's supposed to behave. Fortunately, the corresponding [source
code][2] is public, so we can figure out the semantics ourselves: return
`TRUE` if the object can be / was retained, `FALSE` otherwise.
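
As a rough illustration of the approach described above, here is a toy model in Python. It is a sketch under stated assumptions, not the actual runtime code (which lives in the native runtime): the class, the `queued_for_finalization` flag, and the helper names are all hypothetical, and the base-class behavior is simplified to "retain unless the object is already going away".

```python
class NativeObject:
    """Toy stand-in for a native instance that has a managed wrapper.

    All names here are hypothetical and only model the idea in the text.
    """

    def __init__(self):
        self.retain_count = 1
        # Set by the (simulated) GC once the managed wrapper has been
        # scheduled for finalization.
        self.queued_for_finalization = False

    def base_retain_weak_reference(self):
        # Simplified base-class semantics: retain and report success if the
        # object can still be retained, otherwise report failure.
        if self.retain_count <= 0:
            return False
        self.retain_count += 1
        return True

    def retain_weak_reference(self):
        # The override: say "nope!" once the managed wrapper is queued for
        # finalization, otherwise defer to the base-class behavior.
        if self.queued_for_finalization:
            return False
        return self.base_retain_weak_reference()


def load_weak(obj):
    """Models native code resolving a weak reference; None stands in for nil."""
    return obj if obj.retain_weak_reference() else None


if __name__ == "__main__":
    obj = NativeObject()
    print(load_weak(obj) is obj)    # wrapper still alive: reference resolves
    obj.queued_for_finalization = True
    print(load_weak(obj) is None)   # wrapper queued: reference resolves to nil
```

In this model, native code that resolves the weak reference after the wrapper has been queued simply sees nil, so it never gets hold of an instance whose managed peer is about to be finalized.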

Backport of #23072.

rolfbjarne and others added 3 commits June 19, 2025 14:36
… an object in the finalizer queue.

* Fixes #21648.
* Fixes dotnet/maui#21485.
* Fixes #19076.

Might also fix the following issues:

* #22867
* #19579
* #4207
* https://bugzilla.xamarin.com/show_bug.cgi?id=34242 (https://web.archive.org/web/20170630214134/https://bugzilla.xamarin.com/show_bug.cgi?id=34242)

[1]: https://developer.apple.com/documentation/foundation/nsproxy/retainweakreference
[2]: https://github.com/apple-oss-distributions/objc4/blob/f126469408dc82bd3f327217ae678fd0e6e3b37c/runtime/NSObject.mm#L539-L541
@vs-mobiletools-engineering-service2
Collaborator

✅ [CI Build #b4e4349] Build passed (Build packages) ✅

Pipeline on Agent
Hash: b4e43497e0aa6a71fccc59d769d89a0adc300c15 [PR build]

@vs-mobiletools-engineering-service2
Collaborator

✅ [PR Build #b4e4349] Build passed (Detect API changes) ✅

@vs-mobiletools-engineering-service2
Collaborator

✅ API diff for current PR / commit

.NET ( No breaking changes )

✅ API diff vs stable

.NET ( No breaking changes )

ℹ️ Generator diff

Generator Diff: vsdrops (html) vsdrops (raw diff) gist (raw diff) - Please review changes)

@vs-mobiletools-engineering-service2
Collaborator

✅ [CI Build #b4e4349] Build passed (Build macOS tests) ✅

@vs-mobiletools-engineering-service2
Collaborator

💻 [CI Build #b4e4349] Tests on macOS X64 - Mac Sonoma (14) passed 💻

All tests on macOS X64 - Mac Sonoma (14) passed.

@vs-mobiletools-engineering-service2
Collaborator

💻 [CI Build #b4e4349] Tests on macOS M1 - Mac Monterey (12) passed 💻

All tests on macOS M1 - Mac Monterey (12) passed.

@vs-mobiletools-engineering-service2
Collaborator

💻 [CI Build #b4e4349] Tests on macOS arm64 - Mac Sequoia (15) passed 💻

All tests on macOS arm64 - Mac Sequoia (15) passed.

@vs-mobiletools-engineering-service2
Collaborator

💻 [CI Build #b4e4349] Tests on macOS M1 - Mac Ventura (13) passed 💻

All tests on macOS M1 - Mac Ventura (13) passed.

@vs-mobiletools-engineering-service2
Collaborator

🚀 [CI Build #b4e4349] Test results 🚀

Test results

✅ All tests passed on VSTS: test results.

🎉 All 115 tests passed 🎉

Tests counts

✅ cecil: All 1 tests passed.
✅ dotnettests (iOS): All 1 tests passed.
✅ dotnettests (MacCatalyst): All 1 tests passed.
✅ dotnettests (macOS): All 1 tests passed.
✅ dotnettests (Multiple platforms): All 1 tests passed.
✅ dotnettests (tvOS): All 1 tests passed.
✅ framework: All 2 tests passed.
✅ fsharp: All 4 tests passed.
✅ generator: All 5 tests passed.
✅ interdependent-binding-projects: All 4 tests passed.
✅ introspection: All 4 tests passed.
✅ linker: All 44 tests passed.
✅ monotouch (iOS): All 8 tests passed.
✅ monotouch (MacCatalyst): All 11 tests passed.
✅ monotouch (macOS): All 9 tests passed.
✅ monotouch (tvOS): All 8 tests passed.
✅ msbuild: All 2 tests passed.
✅ windows: All 3 tests passed.
✅ xcframework: All 4 tests passed.
✅ xtro: All 1 tests passed.

@rolfbjarne rolfbjarne merged commit 204dbeb into release/9.0.1xx Jun 19, 2025
44 checks passed
@rolfbjarne rolfbjarne deleted the dev/rolf/backport-pr-23072-release/9.0.1xx-2025-06-19 branch June 19, 2025 14:48
@shnaz

shnaz commented Jun 30, 2025

@rolfbjarne Hi Rolf, thanks for looking into this. Do you know when this fix will be released? :)

@rolfbjarne
Member Author

> @rolfbjarne Hi Rolf, thanks for looking into this. Do you know when this fix will be released? :)

This fix will be included in our next service release, which is scheduled to come out soon (unfortunately I can't be more specific).

@bnSonic

bnSonic commented Jul 8, 2025

May I ask a possibly stupid question? :-S

If the label says "9.0.1xx" does this mean that it will be in a Maui-Release with version-number "9.0.100" or greater?
So currently 9.0.81 is available, so this fix will be in the next one, which has the number 9.0.100(+)?

I'm still trying to understand / learn about the version numbers of MAUI, .NET, SDK, … :D

And: is there any workaround available until then? I see this crash happen rather frequently with our app, which I just upgraded to .NET 9, and now I have a problem because I cannot ship it in this state ... what can I do?

@fadisafarpan

I have the same question- how can we check if this was included in the latest release from July 8th?

@rolfbjarne
Member Author

> If the label says "9.0.1xx" does this mean that it will be in a Maui-Release with version-number "9.0.100" or greater?

No, the bug is in the iOS workload, not MAUI, so the fix will be in the iOS workload.

> I have the same question- how can we check if this was included in the latest release from July 8th?

It's included.

It's mentioned in the release notes: https://github.com/dotnet/macios/releases/tag/dotnet-9.0.1xx-xcode16.4-9207

@bnSonic

bnSonic commented Jul 23, 2025

Pardon me for an urgent question:

  • A Tester is reporting crashes with an iOS Release Build
  • These crashes happen on an iPhone with iOS 15.8.4 (yes, rather old iOS Version, I know)

Edit:
Happens on iOS 16 and iOS 17 too

Edit 2:
It was a CustomRenderer for a ListView
This renderer added a

    Control.AddObserver(
        "contentOffset",
        Foundation.NSKeyValueObservingOptions.New,
        HandleAction
    );

and even if this one was disposed in the "Disposed" override, something went wrong at some point …
I was lucky that I was able to remove this CustomRenderer completely, as I didn't need it for this app anymore … so I didn't look deeper into it …

… maybe it clicks and you say: "oh, wait, maybe there's a thing we need to fix" … or maybe you say: "sorry, but without a repro I cannot do anything" :D Or maybe even: "oh wait, no, such observer doesn't work anymore with maui - you need(!) to make it into a handler now" …

Question is:

Are there known problems with .NET 9 / MAUI on iOS versions older than iOS 18 (or maybe older than iOS 17)?
Should this service release fix problems with older iOS versions too?
Or is this a different problem? Currently I cannot see that this is a problem in my code, as the tester has not been able to reproduce this on iOS 18 yet; it looks like the problem is a bit deeper down in the framework?!
It looks similar to what we had on iOS 18 devices before this service release, too.

It doesn't happen all the time, but rather often.
For example: in the app you tap somewhere to open a ModalPage with a ListView to select something; selecting closes the dialog and refreshes the view. Now you tap again to open the same dialog and select a different thing … you do this 5 or 6 times … sometimes 3 times, sometimes 20 times or so … then it crashes … This "smells" like garbage collection …

And it happens almost everywhere in the App so it's not some specific function / code in my own CodeBehind or Classes

Any idea? Is the Crashlog helpful?

and no, I'm pretty sure I'm not able to create a small sample project :( as I don't know what causes this in the first place

I have this fix installed:

dotnet workload --info
 Workload version: 9.0.302.0
 Configured to use workload sets when installing new manifests.

 [maui]
   Installation Source: SDK 9.0.300
   Manifest Version:    9.0.51/9.0.100
   Manifest Path:       /usr/local/share/dotnet/sdk-manifests/9.0.100/microsoft.net.sdk.maui/9.0.51/WorkloadManifest.json
   Install Type:        FileBased

Crash log TestFlight

I was able to get a crash-log via TestFlight that looks like this (the crashed thread)
I'm not very good at reading these, but it seems that after a "dealloc" and "removeFrom…" the code somewhere tries to access or call something

Incident Identifier: 2E6AF1F0-A7B8-45BD-AE94-A21356F9FC74
Hardware Model:      iPhone8,1
Process:             MyCompany.MyApp.iOS [318]
Path:                /private/var/containers/Bundle/Application/4A7EA2E3-E83E-4153-8DFA-73B844AC9A5A/MyCompany.MyApp.iOS.app/MyCompany.MyApp.iOS
Identifier:          de.MyCompany.MyApp
Version:             25.1.0 (25.1.0.2)
AppStoreTools:       16F7
AppVariant:          1:iPhone8,1:15
Beta:                YES
Code Type:           ARM-64 (Native)
Role:                Foreground
Parent Process:      launchd [1]
Coalition:           de.MyCompany.MyApp [449]

Date/Time:           2025-07-23 15:16:25.7523 +0200
Launch Time:         2025-07-23 15:15:58.2957 +0200
OS Version:          iPhone OS 15.8.4 (19H390)
Release Type:        User
Baseband Version:    9.61.00
Report Version:      104

Exception Type:  EXC_BAD_ACCESS (SIGSEGV)
Exception Subtype: KERN_INVALID_ADDRESS at 0x0000000df0619c80
Exception Codes: 0x0000000000000001, 0x0000000df0619c80
VM Region Info: 0xdf0619c80 is not in any region.  Bytes after previous region: 48593214593  Bytes before following region: 7778231168
      REGION TYPE                 START - END      [ VSIZE] PRT/MAX SHRMOD  REGION DETAIL
      MALLOC_NANO              280000000-2a0000000 [512.0M] rw-/rwx SM=COW  
--->  GAP OF 0xd20000000 BYTES
      commpage (reserved)      fc0000000-1000000000 [  1.0G] ---/--- SM=NUL  ...(unallocated)
Exception Note:  EXC_CORPSE_NOTIFY
Triggered by Thread:  0


Thread 0 name:
Thread 0 Crashed:
0   libobjc.A.dylib               	0x00000001996ee18c class_getMethodImplementation + 32 (objc-class.mm:676)
1   Foundation                    	0x0000000183589fd0 _NSKVONotifyingOriginalClassForIsa + 28 (NSKeyValueObserverNotifying.m:937)
2   Foundation                    	0x000000018358ba68 _NSKeyValueObservationInfoGetObservances + 284 (NSKeyValueObservationInfo.m:815)
3   Foundation                    	0x0000000183576534 -[NSObject(NSKeyValueObservingPrivate) _changeValueForKeys:count:maybeOldValuesDict:maybeNewValuesDict:usingBlock:] + 244 (NSKeyValueObserving.m:2599)
4   Foundation                    	0x000000018356c9c4 -[NSObject(NSKeyValueObservingPrivate) _changeValueForKey:key:key:usingBlock:] + 68 (NSKeyValueObserving.m:2652)
5   Foundation                    	0x000000018356ab9c _NSSetPointValueAndNotify + 300 (NSKeyValueObserverNotifying.m:107)
6   UIKitCore                     	0x000000018441ee40 -[UIScrollView _adjustContentOffsetIfNecessary] + 100 (UIScrollView.m:13524)
7   UIKitCore                     	0x0000000184366aa8 -[UIScrollView _stopScrollingNotify:pin:tramplingDragFlags:] + 432 (UIScrollView.m:11728)
8   UIKitCore                     	0x0000000184613210 -[UITableView _stopScrollingNotify:pin:] + 48 (UITableView.m:12635)
9   UIKitCore                     	0x00000001846214c4 -[UIScrollView removeFromSuperview] + 40 (UIScrollView.m:2925)
10  UIKitCore                     	0x000000018435738c -[UIView dealloc] + 432 (UIView.m:4665)
11  MyCompany.MyApp.iOS             	0x0000000104fb279c 0x1007d0000 + 75376540
12  MyCompany.MyApp.iOS             	0x00000001051a4d70 0x1007d0000 + 77417840
13  MyCompany.MyApp.iOS             	0x0000000104fb017c 0x1007d0000 + 75366780
14  MyCompany.MyApp.iOS             	0x0000000101578f58 0x1007d0000 + 14323544
15  MyCompany.MyApp.iOS             	0x0000000104ed195c 0x1007d0000 + 74455388
16  MyCompany.MyApp.iOS             	0x0000000104ed5c24 0x1007d0000 + 74472484
17  MyCompany.MyApp.iOS             	0x00000001051f05f8 0x1007d0000 + 77727224
18  Foundation                    	0x0000000183587660 __NSThreadPerformPerform + 164 (NSThread.m:1058)
19  CoreFoundation                	0x0000000181eec448 __CFRUNLOOP_IS_CALLING_OUT_TO_A_SOURCE0_PERFORM_FUNCTION__ + 24 (CFRunLoop.c:1972)
20  CoreFoundation                	0x0000000181efc578 __CFRunLoopDoSource0 + 204 (CFRunLoop.c:2016)
21  CoreFoundation                	0x0000000181e3e734 __CFRunLoopDoSources0 + 256 (CFRunLoop.c:2053)
22  CoreFoundation                	0x0000000181e43e08 __CFRunLoopRun + 768 (CFRunLoop.c:2951)
23  CoreFoundation                	0x0000000181e57174 CFRunLoopRunSpecific + 572 (CFRunLoop.c:3268)
24  GraphicsServices              	0x00000001a2976988 GSEventRunModal + 160 (GSEvent.c:2200)
25  UIKitCore                     	0x0000000184659a88 -[UIApplication _run] + 1080 (UIApplication.m:3511)
26  UIKitCore                     	0x00000001843f2f78 UIApplicationMain + 336 (UIApplication.m:5064)
27  MyCompany.MyApp.iOS             	0x0000000104fad120 0x1007d0000 + 75354400
28  MyCompany.MyApp.iOS             	0x0000000101569710 0x1007d0000 + 14259984
29  MyCompany.MyApp.iOS             	0x0000000104e0340c 0x1007d0000 + 73610252
30  MyCompany.MyApp.iOS             	0x0000000102184b30 0x1007d0000 + 26954544
31  MyCompany.MyApp.iOS             	0x0000000105132da4 0x1007d0000 + 76950948
32  MyCompany.MyApp.iOS             	0x00000001050de3b0 0x1007d0000 + 76604336
33  MyCompany.MyApp.iOS             	0x00000001050e40e4 0x1007d0000 + 76628196
34  MyCompany.MyApp.iOS             	0x0000000105139540 0x1007d0000 + 76977472
35  MyCompany.MyApp.iOS             	0x0000000104fb7d00 0x1007d0000 + 75398400
36  MyCompany.MyApp.iOS             	0x00000001051a426c 0x1007d0000 + 77415020
37  dyld                          	0x000000010eb184d0 start + 444 (dyldMain.cpp:879)

Viewed in Xcode

In Xcode I can see a bit more for the addresses of my App:
[Image: Screenshot 2025-07-23 at 15 45 08]
