
Massive Problems with random crashes (SIGSEGV and SIGABRT) #21479

Closed
pijnappel opened this issue Mar 27, 2024 · 27 comments
Labels
area-core-platform Integration with platforms platform/android 🤖 s/needs-attention Issue has more information and needs another look t/bug Something isn't working

Comments

@pijnappel

Description

We are still experiencing random crashes of our app. I've been trying for months to debug and locate them. It's increasingly frustrating because there is no way to locate the origin of these errors.

03-27 09:19:59.028 4846 4860 F libc : Fatal signal 11 (SIGSEGV), code 1 (SEGV_MAPERR), fault addr 0x2c0030002c0039 in tid 4860 (SGen worker), pid 4846 (ctro.mobile.xrf)

Most of the time I see 'SGen worker' in the message; sometimes it is '.NET TP Worker'. I am always running in release mode when this happens and was not able to reproduce it in DEBUG.

  • Could this be a memory issue? I'm logging memory consumption and it is always around 30%
  • Could this be a threading issue? How would I debug this?
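One detail in the signal line above may be worth a look while debugging: the fault address 0x2c0030002c0039 has a zero byte in every second position, which is what short ASCII text looks like when stored as UTF-16. The sketch below reinterprets a fault address as four little-endian UTF-16 code units and reports the text if all four are printable ASCII. The function name is hypothetical, and a match does not identify a cause; it only suggests that the value used as a pointer may have contained string data.

```python
def decode_fault_addr_as_utf16(addr_hex):
    """Interpret a 64-bit fault address as four UTF-16LE code units.

    Returns the decoded text if every code unit is printable ASCII,
    otherwise None. This is a heuristic, not a diagnosis.
    """
    value = int(addr_hex, 16)
    raw = value.to_bytes(8, "little")  # arm64 Android is little-endian
    try:
        text = raw.decode("utf-16-le")
    except UnicodeDecodeError:
        return None
    if all(32 <= ord(ch) < 127 for ch in text):
        return text
    return None


# Fault address from the crash reported in this issue.
print(decode_fault_addr_as_utf16("0x2c0030002c0039"))  # prints: 9,0,
# A null fault address has no text-like pattern.
print(decode_fault_addr_as_utf16("0x0"))               # prints: None
```

Running the same check over the fault addresses of several crashes shows whether the pattern repeats or was a one-off.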

Steps to Reproduce

The app is very complex and it is not possible to reproduce these crashes in a minimal repro app. I am running an automation service in a foreground service which performs measurements and calculations using custom hardware components (serial communication). A client (in this case the same Android device) triggers a measurement process. The process runs in a separate task, creates result data and stores it on the device (in-memory JSON serialization, then writing the bytes to a file), then adds it to an index (Lucene.Net). On the UI side, status events are observed and progress and result information is shown.

Most of the time the process runs flawlessly. Then suddenly the app crashes (all global exception handling is ignored). I tried locating the line of code where the crash happens (with traces), but I could only determine that it is most likely happening in the processing, not in the UI.

Link to public reproduction project repository

No response

Version with bug

8.0.10 SR3

Is this a regression from previous behavior?

Not sure, did not test other versions

Last version that worked well

Unknown/Other

Affected platforms

Android

Affected platform versions

Android 9

Did you find any workaround?

I switched several processes (saving files, indexing) from running in separate threads to running synchronously. I thought this resolved the problem. But now it happened again.

Relevant log output

--------- beginning of crash
03-27 09:19:59.028  4846  4860 F libc    : Fatal signal 11 (SIGSEGV), code 1 (SEGV_MAPERR), fault addr 0x2c0030002c0039 in tid 4860 (SGen worker), pid 4846 (ctro.mobile.xrf)
03-27 09:19:59.041  3040 12463 I NuPlayerDecoder: [audio] saw output EOS
03-27 09:19:59.290  3040 12462 D AudioTrack: stop() called with 71424 frames delivered
03-27 09:19:59.421 12477 12477 I crash_dump64: obtaining output fd from tombstoned, type: kDebuggerdTombstone
03-27 09:19:59.423  3066  3066 I /system/bin/tombstoned: received crash request for pid 4860
03-27 09:19:59.425 12477 12477 I crash_dump64: performing dump of process 4846 (target tid = 4860)
03-27 09:19:59.450 12477 12477 F DEBUG   : *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** ***
03-27 09:19:59.451 12477 12477 F DEBUG   : Build fingerprint: 'Android/aek/aek:9/2.3.4-ga-rc2/root01172037:userdebug/dev-keys'
03-27 09:19:59.451 12477 12477 F DEBUG   : Revision: '0'
03-27 09:19:59.451 12477 12477 F DEBUG   : ABI: 'arm64'
03-27 09:19:59.451 12477 12477 F DEBUG   : pid: 4846, tid: 4860, name: SGen worker  >>> com.myApp.mobile.xrf <<<
03-27 09:19:59.451 12477 12477 F DEBUG   : signal 11 (SIGSEGV), code 1 (SEGV_MAPERR), fault addr 0x2c0030002c0039
03-27 09:19:59.451 12477 12477 F DEBUG   :     x0  002c0030002c0039  x1  0000f5b54c4c9348  x2  0000f5b57b16cfd8  x3  0000f5b5a733b010
03-27 09:19:59.451 12477 12477 F DEBUG   :     x4  0000f5b5a733b010  x5  0000f5b59eeb77cd  x6  0100000000000000  x7  0000000000000000
03-27 09:19:59.451 12477 12477 F DEBUG   :     x8  0000000000000001  x9  0000f5b54c4c8000  x10 0000f5b58dcf4278  x11 0000f5b57b0b9880
03-27 09:19:59.451 12477 12477 F DEBUG   :     x12 0000f5b54c4c8470  x13 0000000000000000  x14 0000000040180003  x15 0000000000000001
03-27 09:19:59.451 12477 12477 F DEBUG   :     x16 0000f5b58dcd7338  x17 0000f5b58dbd72a8  x18 0000fffff2198dca  x19 0000000000000003
03-27 09:19:59.451 12477 12477 F DEBUG   :     x20 0000000000000000  x21 0000f5b54c4c9348  x22 0000f5b57b16cfd8  x23 0000000000000018
03-27 09:19:59.451 12477 12477 F DEBUG   :     x24 0000f5b58dcf21d0  x25 ffffffffffffffff  x26 0000f5b58dcf2180  x27 0000f5b58dcf1000
03-27 09:19:59.451 12477 12477 F DEBUG   :     x28 0000f5b58dcf1000  x29 0000f5b57b6cb1c0
03-27 09:19:59.451 12477 12477 F DEBUG   :     sp  0000f5b57b6cb1c0  lr  0000f5b58dcbc6c0  pc  0000f5b58dcbc6e8
03-27 09:19:59.462 12477 12477 F DEBUG   : 
03-27 09:19:59.462 12477 12477 F DEBUG   : backtrace:
03-27 09:19:59.462 12477 12477 F DEBUG   :     #00 pc 00000000002ec6e8  /data/app/com.myApp.mobile.xrf-jAm7z2gKIusMvQJmGcs8wg==/lib/arm64/libmonosgen-2.0.so
03-27 09:19:59.462 12477 12477 F DEBUG   :     #01 pc 00000000002dca54  /data/app/com.myApp.mobile.xrf-jAm7z2gKIusMvQJmGcs8wg==/lib/arm64/libmonosgen-2.0.so
03-27 09:19:59.462 12477 12477 F DEBUG   :     #02 pc 00000000002a70d0  /data/app/com.myApp.mobile.xrf-jAm7z2gKIusMvQJmGcs8wg==/lib/arm64/libmonosgen-2.0.so
03-27 09:19:59.462 12477 12477 F DEBUG   :     #03 pc 00000000002bf9b4  /data/app/com.myApp.mobile.xrf-jAm7z2gKIusMvQJmGcs8wg==/lib/arm64/libmonosgen-2.0.so
03-27 09:19:59.462 12477 12477 F DEBUG   :     #04 pc 00000000002d2984  /data/app/com.myApp.mobile.xrf-jAm7z2gKIusMvQJmGcs8wg==/lib/arm64/libmonosgen-2.0.so
03-27 09:19:59.462 12477 12477 F DEBUG   :     #05 pc 00000000002c85fc  /data/app/com.myApp.mobile.xrf-jAm7z2gKIusMvQJmGcs8wg==/lib/arm64/libmonosgen-2.0.so
03-27 09:19:59.462 12477 12477 F DEBUG   :     #06 pc 00000000002fafc4  /data/app/com.myApp.mobile.xrf-jAm7z2gKIusMvQJmGcs8wg==/lib/arm64/libmonosgen-2.0.so
03-27 09:19:59.462 12477 12477 F DEBUG   :     #07 pc 0000000000083114  /system/lib64/libc.so (__pthread_start(void*)+36)
03-27 09:19:59.462 12477 12477 F DEBUG   :     #08 pc 00000000000233bc  /system/lib64/libc.so (__start_thread+68)
03-27 09:19:59.504 12477 12477 I crash_dump64: type=1400 audit(0.0:855): avc: denied { read } for name="com.google.android.datatransport.events-shm" dev="mmcblk0p13" ino=1180139 scontext=u:r:crash_dump:s0:c68,c256,c512,c768 tcontext=u:object_r:app_data_file:s0:c68,c256,c512,c768 tclass=file permissive=1
03-27 09:19:59.532 12477 12477 I crash_dump64: type=1400 audit(0.0:856): avc: denied { open } for path="/data/data/com.myApp.mobile.xrf/databases/com.google.android.datatransport.events-shm" dev="mmcblk0p13" ino=1180139 scontext=u:r:crash_dump:s0:c68,c256,c512,c768 tcontext=u:object_r:app_data_file:s0:c68,c256,c512,c768 tclass=file permissive=1
03-27 09:19:59.532 12477 12477 I crash_dump64: type=1400 audit(0.0:857): avc: denied { getattr } for path="/data/data/com.myApp.mobile.xrf/databases/com.google.android.datatransport.events-shm" dev="mmcblk0p13" ino=1180139 scontext=u:r:crash_dump:s0:c68,c256,c512,c768 tcontext=u:object_r:app_data_file:s0:c68,c256,c512,c768 tclass=file permissive=1
03-27 09:19:59.532 12477 12477 I crash_dump64: type=1400 audit(0.0:858): avc: denied { map } for path="/data/data/com.myApp.mobile.xrf/databases/com.google.android.datatransport.events-shm" dev="mmcblk0p13" ino=1180139 scontext=u:r:crash_dump:s0:c68,c256,c512,c768 tcontext=u:object_r:app_data_file:s0:c68,c256,c512,c768 tclass=file permissive=1
03-27 09:19:59.675  3040 12458 D NuPlayerDriver: notifyListener_l(0xe9e3f000), (2, 0, 0, -1), loop setting(0, 0)
03-27 09:19:59.676  3040 12458 D NuPlayerDriver: notifyListener_l(0xe9e3f000), (211, 0, 0, 20), loop setting(0, 0)
03-27 09:19:59.808  4846  4880 I ctro.mobile.xr: Explicit concurrent copying GC freed 434(86KB) AllocSpace objects, 0(0B) LOS objects, 50% free, 6MB/13MB, paused 225us total 158.581ms
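The backtrace above only lists program-counter offsets inside libmonosgen-2.0.so, so any symbolization has to start from those offsets. The following is a minimal sketch that pulls the frame number, offset and library name out of tombstone lines in the format shown in this log. The helper name and the regular expression are assumptions written for this format only; turning the offsets into function names additionally needs a copy of the library with symbols that matches the shipped build, which this thread does not establish is available.

```python
import re

# Matches tombstone frame lines such as:
#   "#00 pc 00000000002ec6e8  /data/app/.../lib/arm64/libmonosgen-2.0.so"
FRAME_RE = re.compile(r"#(\d+)\s+pc\s+([0-9a-fA-F]+)\s+(\S+)")


def extract_frames(log_text):
    """Return (frame_index, pc_offset, library_file_name) for each frame line."""
    frames = []
    for line in log_text.splitlines():
        match = FRAME_RE.search(line)
        if match:
            index, pc, path = match.groups()
            frames.append((int(index), int(pc, 16), path.rsplit("/", 1)[-1]))
    return frames


# Two lines copied from the log in this issue.
sample = """\
03-27 09:19:59.462 12477 12477 F DEBUG   :     #00 pc 00000000002ec6e8  /data/app/com.myApp.mobile.xrf-jAm7z2gKIusMvQJmGcs8wg==/lib/arm64/libmonosgen-2.0.so
03-27 09:19:59.462 12477 12477 F DEBUG   :     #07 pc 0000000000083114  /system/lib64/libc.so (__pthread_start(void*)+36)
"""

for index, pc, lib in extract_frames(sample):
    print(f"#{index:02d} {lib} +{pc:#x}")
```

The offsets are relative to the start of the named library, so they stay comparable across runs of the same build even though the absolute addresses in the register dump change.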
@pijnappel pijnappel added the t/bug Something isn't working label Mar 27, 2024
@PureWeen
Member

Can you attach a binlog?
https://learn.microsoft.com/en-us/xamarin/android/deploy-test/debugging/android-debug-log?tabs=windows

A repro would also be helpful

@PureWeen PureWeen added platform/android 🤖 s/needs-info Issue needs more info from the author labels Mar 27, 2024
@pijnappel
Author

test7-1.log

This is the full logcat. Since then I activated additional Mono logging. I'll try to reproduce it, but it will take several hours to make it happen again.

I am unable to provide a repro. I had no success reproducing the error in a repro app.

@dotnet-policy-service dotnet-policy-service bot added s/needs-attention Issue has more information and needs another look and removed s/needs-info Issue needs more info from the author labels Mar 28, 2024
@jfversluis
Member

Have you looked through the logcat yourself? Do you see anything that makes sense? While we can glance over it, we might miss something, since we don't know anything about your app or what might be relevant.

@pijnappel
Author

I went through the log in detail and am not able to find any reason for the error to be caused by my code.

I am aware that I am causing it indirectly. From the crash report I can see that the crash happens in the Mono Garbage Collector process (SGen worker). How would I be able to cause such a crash from a MAUI app? From my understanding this should not be possible. Please correct me if I'm wrong.

I even added a lot of logging and commented out sections of code to find a possible location where this happens. But for now it seems absolutely random.

@ac-lap

ac-lap commented Apr 16, 2024

@jfversluis I am also facing a similar crash; on the Play Store it accounts for a 2% crash rate in my app.

I tried reading through the entire logcat file, but I am not able to figure anything out. Please assist.

03-09 06:04:44.827: D/SSense(897): onTouchRateChanged: rate:9.0
03-09 06:04:44.861: I/os.AppName(32521): Explicit concurrent copying GC freed 5(31KB) AllocSpace objects, 0(0B) LOS objects, 66% free, 16MB/48MB, paused 57us total 69.944ms
03-09 06:04:45.096: F/crashpad(19532): dlopen: dlopen failed: library "libandroidicu.so" not found: needed by /system/lib64/libharfbuzz_ng.so in namespace (default)
03-09 06:04:45.096: F/crashpad(19532): --------- beginning of crash
03-09 06:04:45.097: F/libc(32521): Fatal signal 11 (SIGSEGV), code 1 (SEGV_MAPERR), fault addr 0xf890ed7006b203fc in tid 32748 (SGen worker), pid 32521 (os.AppName)
03-09 06:04:45.152: I/os.AppName(32521): Explicit concurrent copying GC freed 64(58KB) AllocSpace objects, 0(0B) LOS objects, 66% free, 16MB/48MB, paused 132us total 118.972ms
03-09 06:04:45.275: I/crash_dump64(19549): obtaining output fd from tombstoned, type: kDebuggerdTombstone
03-09 06:04:45.283: I/tombstoned(458): received crash request for pid 32748
03-09 06:04:45.285: I/crash_dump64(19549): performing dump of process 32521 (target tid = 32748)
03-09 06:04:45.304: F/DEBUG(19549): *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** ***
03-09 06:04:45.304: F/DEBUG(19549): Native Crash TIME: 17379577
03-09 06:04:45.304: F/DEBUG(19549): *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** ***
03-09 06:04:45.304: F/DEBUG(19549): Build fingerprint: 'motorola/java_retail/java:11/RTAS31.68-66-3/66-3:user/release-keys'
03-09 06:04:45.304: F/DEBUG(19549): Revision: '0'
03-09 06:04:45.304: F/DEBUG(19549): ABI: 'arm64'
03-09 06:04:45.305: F/DEBUG(19549): Timestamp: 2024-03-09 06:04:45-0800
03-09 06:04:45.305: F/DEBUG(19549): pid: 32521, tid: 32748, name: SGen worker  >>> com.optimiliastudios.AppName <<<
03-09 06:04:45.305: F/DEBUG(19549): uid: 10264
03-09 06:04:45.305: F/DEBUG(19549): signal 11 (SIGSEGV), code 1 (SEGV_MAPERR), fault addr 0xf890ed7006b203fc
03-09 06:04:45.305: F/DEBUG(19549):     x0  f890ed7006b203fc  x1  0000006ce97f5030  x2  0000006d4e284e88  x3  0000006d54662ec4
03-09 06:04:45.305: F/DEBUG(19549):     x4  000000704f131010  x5  00000070515b0288  x6  0000000000000001  x7  0000000000000000
03-09 06:04:45.305: F/DEBUG(19549):     x8  0000000000000001  x9  0000006ce97f4000  x10 0000006d546ac048  x11 0000006d4e284e88
03-09 06:04:45.305: F/DEBUG(19549):     x12 0000000000000000  x13 0000000000000000  x14 00000000402b0010  x15 0000000000000001
03-09 06:04:45.305: F/DEBUG(19549):     x16 0000006d5468e3e8  x17 0000007051b0a180  x18 0000006d4c4e0000  x19 0000000000000028
03-09 06:04:45.305: F/DEBUG(19549):     x20 1fffffffffffffff  x21 0000006dae53c258  x22 0000006d4e284e88  x23 0000006ce97f5030
03-09 06:04:45.305: F/DEBUG(19549):     x24 0000000000000018  x25 0000006d546a9fa0  x26 ffffffffffffffff  x27 0000006d546a9f50
03-09 06:04:45.305: F/DEBUG(19549):     x28 0000000000000000  x29 0000006d54343a70
03-09 06:04:45.305: F/DEBUG(19549):     lr  0000006d546744a8  sp  0000006d54343a70  pc  0000006d546744d0  pst 0000000060000000
03-09 06:04:45.313: F/DEBUG(19549): backtrace:
03-09 06:04:45.314: F/DEBUG(19549):       #00 pc 00000000002ec4d0  /data/app/~~3-r7G91C5QClodx0Zg-2pw==/com.optimiliastudios.AppName-jQD6-7njwVI3JkOGdD2E2g==/split_config.arm64_v8a.apk!libmonosgen-2.0.so (offset 0x18d8000) (BuildId: b4eabab966ddf1c49d3c8b923cca5b14598b39f3)
03-09 06:04:45.314: F/DEBUG(19549):       #01 pc 00000000002db4c4  /data/app/~~3-r7G91C5QClodx0Zg-2pw==/com.optimiliastudios.AppName-jQD6-7njwVI3JkOGdD2E2g==/split_config.arm64_v8a.apk!libmonosgen-2.0.so (offset 0x18d8000) (BuildId: b4eabab966ddf1c49d3c8b923cca5b14598b39f3)
03-09 06:04:45.314: F/DEBUG(19549):       #02 pc 00000000002d276c  /data/app/~~3-r7G91C5QClodx0Zg-2pw==/com.optimiliastudios.AppName-jQD6-7njwVI3JkOGdD2E2g==/split_config.arm64_v8a.apk!libmonosgen-2.0.so (offset 0x18d8000) (BuildId: b4eabab966ddf1c49d3c8b923cca5b14598b39f3)
03-09 06:04:45.314: F/DEBUG(19549):       #03 pc 00000000002c83e4  /data/app/~~3-r7G91C5QClodx0Zg-2pw==/com.optimiliastudios.AppName-jQD6-7njwVI3JkOGdD2E2g==/split_config.arm64_v8a.apk!libmonosgen-2.0.so (offset 0x18d8000) (BuildId: b4eabab966ddf1c49d3c8b923cca5b14598b39f3)
03-09 06:04:45.314: F/DEBUG(19549):       #04 pc 00000000002fadac  /data/app/~~3-r7G91C5QClodx0Zg-2pw==/com.optimiliastudios.AppName-jQD6-7njwVI3JkOGdD2E2g==/split_config.arm64_v8a.apk!libmonosgen-2.0.so (offset 0x18d8000) (BuildId: b4eabab966ddf1c49d3c8b923cca5b14598b39f3)
03-09 06:04:45.314: F/DEBUG(19549):       #05 pc 00000000000af97c  /apex/com.android.runtime/lib64/bionic/libc.so (__pthread_start(void*)+64) (BuildId: ddedea5a1d1071f97c5321841b6be985)
03-09 06:04:45.314: F/DEBUG(19549):       #06 pc 00000000000500d0  /apex/com.android.runtime/lib64/bionic/libc.so (__start_thread+64) (BuildId: ddedea5a1d1071f97c5321841b6be985)
03-09 06:04:45.458: I/os.AppName(32521): Explicit concurrent copying GC freed 14(79KB) AllocSpace objects, 0(0B) LOS objects, 66% free, 16MB/48MB, paused 112us total 90.546ms
03-09 06:04:45.499: W/System(897): A resource failed to call release. 
03-09 06:04:45.502: I/chatty(897): uid=1000(system) FinalizerDaemon identical 16 lines
03-09 06:04:45.502: W/System(897): A resource failed to call release. 
03-09 06:04:45.669: I/os.AppName(32521): Explicit concurrent copying GC freed 14(45KB) AllocSpace objects, 0(0B) LOS objects, 66% free, 16MB/48MB, paused 149us total 74.048ms
03-09 06:04:45.802: I/os.AppName(32521): Explicit concurrent copying GC freed 30(28KB) AllocSpace objects, 0(0B) LOS objects, 66% free, 16MB/48MB, paused 101us total 70.885ms
03-09 06:04:45.933: I/os.AppName(32521): Explicit concurrent copying GC freed 29(47KB) AllocSpace objects, 0(0B) LOS objects, 66% free, 16MB/48MB, paused 96us total 71.138ms
03-09 06:04:46.047: I/os.AppName(32521): Explicit concurrent copying GC freed 9(46KB) AllocSpace objects, 0(0B) LOS objects, 66% free, 16MB/48MB, paused 72us total 71.358ms
03-09 06:04:46.169: I/os.AppName(32521): Explicit concurrent copying GC freed 17(28KB) AllocSpace objects, 0(0B) LOS objects, 66% free, 16MB/48MB, paused 95us total 71.317ms
03-09 06:04:46.284: I/os.AppName(32521): Explicit concurrent copying GC freed 27(47KB) AllocSpace objects, 0(0B) LOS objects, 66% free, 16MB/48MB, paused 97us total 73.855ms
03-09 06:04:46.434: I/os.AppName(32521): Explicit concurrent copying GC freed 25(47KB) AllocSpace objects, 0(0B) LOS objects, 66% free, 16MB/48MB, paused 87us total 73.892ms
03-09 06:04:46.641: I/os.AppName(32521): Explicit concurrent copying GC freed 67(58KB) AllocSpace objects, 0(0B) LOS objects, 66% free, 16MB/48MB, paused 69us total 70.731ms
03-09 06:04:46.681: E/tombstoned(458): Tombstone written to: /data/tombstones/tombstone_00
03-09 06:04:46.683: I/DropBoxManagerService(897): add tag=data_app_native_crash isTagEnabled=true flags=0x2
03-09 06:04:46.686: I/BootReceiver(897): Copying /data/tombstones/tombstone_00 to DropBox (SYSTEM_TOMBSTONE)
03-09 06:04:46.687: I/DropBoxManagerService(897): add tag=SYSTEM_TOMBSTONE isTagEnabled=true flags=0x2
03-09 06:04:46.688: W/BroadcastQueue(897): Background execution not allowed: receiving Intent { act=android.intent.action.DROPBOX_ENTRY_ADDED flg=0x10 (has extras) } to com.google.android.gms/.chimera.GmsIntentOperationService$PersistentTrustedReceiver
03-09 06:04:46.689: W/BroadcastQueue(897): Background execution not allowed: receiving Intent { act=android.intent.action.DROPBOX_ENTRY_ADDED flg=0x10 (has extras) } to com.google.android.gms/.stats.service.DropBoxEntryAddedReceiver
03-09 06:04:46.693: D/DropBoxManagerService(897): get detail log for important dropbox tag SYSTEM_TOMBSTONE
03-09 06:04:46.697: D/DropBoxManagerService(897): append detail log for tag:SYSTEM_TOMBSTONE
03-09 06:04:46.719: I/DEBUG(32521): Crash thread undumpable
03-09 06:04:46.781: E/SELinux(320): avc:  denied  { find } for interface=android.hardware.memtrack::IMemtrack sid=u:r:gmscore_app:s0:c512,c768 pid=30834 scontext=u:r:gmscore_app:s0:c512,c768 tcontext=u:object_r:hal_memtrack_hwservice:s0 tclass=hwservice_manager permissive=0
03-09 06:04:46.781: E/memtrack(30834): Couldn't load memtrack module
03-09 06:04:46.838: W/BroadcastQueue(897): Background execution not allowed: receiving Intent { act=android.intent.action.DROPBOX_ENTRY_ADDED flg=0x10 (has extras) } to com.google.android.gms/.chimera.GmsIntentOperationService$PersistentTrustedReceiver
03-09 06:04:46.838: W/BroadcastQueue(897): Background execution not allowed: receiving Intent { act=android.intent.action.DROPBOX_ENTRY_ADDED flg=0x10 (has extras) } to com.google.android.gms/.stats.service.DropBoxEntryAddedReceiver
03-09 06:04:46.992: W/InputDispatcher(897): channel '8ff4c63 com.optimiliastudios.AppName/crc641e510553ababc772.MainActivity (server)' ~ Consumer closed input channel or an error occurred.  events=0x9
03-09 06:04:46.992: E/InputDispatcher(897): channel '8ff4c63 com.optimiliastudios.AppName/crc641e510553ababc772.MainActivity (server)' ~ Channel is unrecoverably broken and will be disposed!
03-09 06:04:46.993: W/Robo(1101): Platform to host thread finished.

@ultimategrandson

I'm having the same problem, updated from 8.0.7 to 8.0.21 and started getting SIGSEGVs.

I downgraded back to 8.0.7 and that seems to have stopped the crashes.

When I was on 8.0.21, I started using Sentry and that made the crashes even worse. Removing Sentry made it less severe, but it wasn't until I downgraded that the issue went away completely.

@pijnappel
Author

I found that the more I/O happens, the more likely it is to crash. I was able to reduce the number of crashes by limiting the data written to and read from storage. It is better, but it still happens, and it is not a permanent solution. In my case it happens in earlier versions, like 8.0.7, too.

@PureWeen PureWeen added the area-core-platform Integration with platforms label May 31, 2024
@taublast
Contributor

taublast commented Jul 9, 2024

@ac-lap @pijnappel In your apps could this be related to accessing already disposed objects?

@RobTF

RobTF commented Aug 27, 2024

Hi,

We are getting this too - it kills our app whilst it's backgrounding;

08-25 12:28:14.820 10396  4884  4982 F libc    : Fatal signal 11 (SIGSEGV), code 1 (SEGV_MAPERR), fault addr 0x0 in tid 4982 (.NET TP Worker), pid 4884 (smo.app.Locator)
08-25 12:28:19.418 10396  5062  5062 F DEBUG   : *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** ***
08-25 12:28:19.418 10396  5062  5062 F DEBUG   : Build fingerprint: 'google/raven/raven:14/AP2A.240805.005.F1/12043167:user/release-keys'
08-25 12:28:19.418 10396  5062  5062 F DEBUG   : Revision: 'MP1.0'
08-25 12:28:19.418 10396  5062  5062 F DEBUG   : ABI: 'arm64'
08-25 12:28:19.418 10396  5062  5062 F DEBUG   : Timestamp: 2024-08-25 12:28:15.017864039+0100
08-25 12:28:19.418 10396  5062  5062 F DEBUG   : Process uptime: 17s
08-25 12:28:19.418 10396  5062  5062 F DEBUG   : Cmdline: com.vismo.app.Locator
08-25 12:28:19.418 10396  5062  5062 F DEBUG   : pid: 4884, tid: 4982, name: .NET TP Worker  >>> com.vismo.app.Locator <<<
08-25 12:28:19.418 10396  5062  5062 F DEBUG   : uid: 10396
08-25 12:28:19.418 10396  5062  5062 F DEBUG   : tagged_addr_ctrl: 0000000000000001 (PR_TAGGED_ADDR_ENABLE)
08-25 12:28:19.418 10396  5062  5062 F DEBUG   : signal 11 (SIGSEGV), code 1 (SEGV_MAPERR), fault addr 0x0000000000000000
08-25 12:28:19.418 10396  5062  5062 F DEBUG   : Cause: null pointer dereference
08-25 12:28:19.418 10396  5062  5062 F DEBUG   :     x0  000000005c000000  x1  0000000000000000  x2  b4000073c24ae110  x3  00000072add04150
08-25 12:28:19.418 10396  5062  5062 F DEBUG   :     x4  000000720064d478  x5  b4000073524bc520  x6  00000080040b3163  x7  6e6f6d00726f7463
08-25 12:28:19.418 10396  5062  5062 F DEBUG   :     x8  00000072ae286000  x9  000000755bf55220  x10 0000000000000244  x11 0000000000000250
08-25 12:28:19.418 10396  5062  5062 F DEBUG   :     x12 0000000000000000  x13 0000000000000002  x14 0000000000000008  x15 0000000000000007
08-25 12:28:19.418 10396  5062  5062 F DEBUG   :     x16 0000000000000001  x17 000000754b976210  x18 000000717fe40000  x19 000000006f55b918
08-25 12:28:19.418 10396  5062  5062 F DEBUG   :     x20 0000000000000000  x21 0000000000000000  x22 b4000074a24230d0  x23 0000007200650ac0
08-25 12:28:19.418 10396  5062  5062 F DEBUG   :     x24 0000007200650ac0  x25 000000729e4850f0  x26 0000000000004a0a  x27 0000000000000000
08-25 12:28:19.418 10396  5062  5062 F DEBUG   :     x28 0000007200650ac0  x29 000000720064d400
08-25 12:28:19.418 10396  5062  5062 F DEBUG   :     lr  00000072add041d4  sp  000000720064d3c0  pc  00000072add0439c  pst 0000000060001000
08-25 12:28:19.418 10396  5062  5062 F DEBUG   : 2 total frames
08-25 12:28:19.418 10396  5062  5062 F DEBUG   : backtrace:
08-25 12:28:19.418 10396  5062  5062 F DEBUG   :       #00 pc 000000000069339c  /apex/com.android.art/lib64/libart.so (art::JNI<false>::IsInstanceOf(_JNIEnv*, _jobject*, _jclass*)+588) (BuildId: 2452917c4ff69cbb6e75e5512260946b)
08-25 12:28:19.418 10396  5062  5062 F DEBUG   :       #01 pc 000000000000c0dc  <anonymous:755bd25000>

No ideas how to fight this one as there is no stack trace into our .NET code, and it sounds too low level. I have the full Android bug report file I can supply if it would be helpful.

This is a release build installed from Google Play.

@pijnappel
Author

This really needs to be addressed! We are still struggling with these crashes. There is no chance to reproduce or trace the problem, and no support.

Can someone please give instructions on debugging or isolating the problem? Is it possible to get information about these errors from the garbage collector?

@RobTF

RobTF commented Aug 27, 2024

I think this sort of thing risks flying under the radar as for a lot of apps this issue will result in a crash which manifests as the app disappearing in front of the user, after which the user will roll their eyes before simply restarting the app and continuing to use it. You might risk a few poor reviews but generally the app will function.

However, the moment you get into backgrounding things get worse as the user isn't there to work around the problem for you. Your app is killed and just never comes back, the user might not know for days until they see some other side effect such as something not updating on a web portal.

A good analogy would be that you're trying to write a Windows service but due to the platform/toolchain/APIs/whatever your program could simply terminate at any moment without warning.

We're going to work around this by employing countermeasures such as popping up an "emergency" push notification asking the user to tap to re-open the app if things stop functioning but it is not at all ideal.

@kcrg

kcrg commented Aug 27, 2024

I also see this in the latest MAUI version 8.0.81 in Sentry dashboard

@MitchBomcanhao

MitchBomcanhao commented Aug 27, 2024

in the latest MAUI version 8.0.81

which version is that?

@kcrg

kcrg commented Aug 27, 2024

which version is that?

Ooops, I mean 8.0.80 😵‍💫

@cactusjack66

Hi,

we are experiencing this as well on our MAUI app using the newest MAUI version.
This is very annoying and we do not know how to fix it.

Most of the time it is something like

pid: 1111, tid: 1111, name: appName >>> com.AppName<<<
08-27 10:55:21.316  1111 1111 F DEBUG   : signal 11 (SIGSEGV), code 1 (SEGV_MAPERR), fault addr 0x0
08-27 10:55:21.316  1111 1111 F DEBUG   : Cause: null pointer dereference

where the name can also be .NET TP Worker with a different fault address.
The crashes happen completely at random, once every few hours, and are not reproducible at all, which makes it nearly impossible to find the reason for them.

What causes the null pointer dereference? Where is it caused? We do not know, and there is no hint or clue for finding a solution. It is like fishing in muddy waters, very frustrating.

We catch every single exception that happens in our code, so we are safe from that side, but when a native exception occurs we are completely helpless. The app just closes without any useful hint.

We also tried to use ndk-stack for more information, but no chance.
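Several reports in this thread note that the crashing thread is sometimes 'SGen worker' and sometimes '.NET TP Worker'. When crashes are hours apart, it can help to tally them mechanically across collected logcat files. The sketch below counts 'Fatal signal' lines by signal name and thread name. The regular expression is an assumption based on the libc lines quoted in this thread; the fault address part is treated as optional on the assumption that some signal lines omit it.

```python
import re
from collections import Counter

# Based on libc lines quoted in this thread, for example:
#   "Fatal signal 11 (SIGSEGV), code 1 (SEGV_MAPERR), fault addr 0x0
#    in tid 4982 (.NET TP Worker), pid 4884 (smo.app.Locator)"
SIGNAL_RE = re.compile(
    r"Fatal signal \d+ \((\w+)\), code -?\d+ \(\w+\)"
    r"(?:, fault addr \S+)? in tid \d+ \(([^)]*)\), pid \d+"
)


def tally_crash_threads(log_text):
    """Count fatal-signal lines grouped by (signal name, thread name)."""
    counts = Counter()
    for match in SIGNAL_RE.finditer(log_text):
        counts[(match.group(1), match.group(2))] += 1
    return counts


# Signal lines copied from three different reports in this thread.
sample = """\
03-27 09:19:59.028  4846  4860 F libc    : Fatal signal 11 (SIGSEGV), code 1 (SEGV_MAPERR), fault addr 0x2c0030002c0039 in tid 4860 (SGen worker), pid 4846 (ctro.mobile.xrf)
03-09 06:04:45.097: F/libc(32521): Fatal signal 11 (SIGSEGV), code 1 (SEGV_MAPERR), fault addr 0xf890ed7006b203fc in tid 32748 (SGen worker), pid 32521 (os.AppName)
08-25 12:28:14.820 10396  4884  4982 F libc    : Fatal signal 11 (SIGSEGV), code 1 (SEGV_MAPERR), fault addr 0x0 in tid 4982 (.NET TP Worker), pid 4884 (smo.app.Locator)
"""

for (signal_name, thread_name), count in tally_crash_threads(sample).most_common():
    print(f"{count}x {signal_name} in '{thread_name}'")
```

A tally like this does not explain the crash, but it gives a number to compare before and after changing a setting or a library version.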

@RobTF

RobTF commented Aug 27, 2024

I don't think it's an issue with our code - I think it's an internal issue with the runtime. At best we can possibly try to avoid it if some pattern is more likely to cause it, but I don't think it will end up being a bug in our respective code we can "fix" as such.

That's sort of the problem: developers can't fix it themselves. What tends to happen is that the MS developers working at the level where the problem lies (e.g. the runtime) seem to lack examples of apps of moderate to high complexity that would trigger it, so they ask the community for simple "base-case" examples. Those are extremely hard to create, as simple "hello world" apps often don't have the issue and/or there is a temporal component (e.g. the crash happens once a day/week/whatever).

If MS had more apps using MAUI they would likely naturally have those examples to hand and be seeing these issues in the field - this is probably how APIs such as the Windows API itself became so battle hardened in comparison to these "Web 2.0 nu-APIs" where there is a much fatter layer of chaos and undefined behaviour which appears to linger a lot longer.

@pijnappel
Author

We found a workaround. The crashes vanish if we deactivate concurrent garbage collection in the project settings.

<AndroidEnableSGenConcurrent>False</AndroidEnableSGenConcurrent>

I am not sure about the side effects and impacts on performance. When running calculation processes, we see random pauses of the execution for 300ms or more. I assume this happens when the GC is triggered. I haven't found a way to pause the GC for the duration of our calculations.

@gwise-vision

Android : AOT Off, Startup Tracing Off, Garbage Collection Off, Enable Trimming Off
Doesn't this cause a crash?

@gwise-vision

We found a workaround. The crashes vanish if we deactivate concurrent garbage collection in the project settings.

<AndroidEnableSGenConcurrent>False</AndroidEnableSGenConcurrent>

I am not sure about the side effects and impacts on performance. When running calculation processes, we see random pauses of the execution for 300ms or more. I assume this happens when the GC is triggered. I haven't found a way to pause the GC for the duration of our calculations.

Android : AOT Off, Startup Tracing Off, Garbage Collection Off, Enable Trimming Off
Doesn't this cause a crash?

@Picao84

Picao84 commented Oct 7, 2024

Suffering from this as well. Initially I thought it could be my fault after looking at my logs, but I have since added additional logging and think it must be something with the framework. In my case it happens at app startup and like others I have trouble reproducing it! It began happening after I updated from MAUI controls 8.0.6 to 8.0.80 and the .net Android framework to the latest version.

@Picao84

Picao84 commented Oct 8, 2024

Just to update that it seems that disabling the concurrent garbage collector together with enabling AOT fixed my issue as well.

@divil5000

This is definitely not a Maui issue, it’s a runtime issue. Thanks for tagging my own issue above which appears to be the same as this.

For us the issue started when we “upgraded” from the older xamarin framework to .net.

@RobbiewOnline

I have a segfault too but I've not explicitly enabled AndroidEnableSGenConcurrent so I don't think it's related since the default is false.

Mine appears to be network / connectivity related (see #25446) and I only mention it in case it's relevant here for others.

@Picao84

Picao84 commented Oct 23, 2024

In .net android the default is True. In Xamarin.Android the default was false.

@divil5000

In MAUI the default is True. In Xamarin the default was false.

I think you mean in .net-android the default is true? MAUI has nothing to do with it.

@Picao84

Picao84 commented Oct 23, 2024

Yes, I meant .net android.

@jaosnz-rep jaosnz-rep added s/triaged Issue has been reviewed and removed s/triaged Issue has been reviewed labels Oct 24, 2024
@jfversluis
Member

Looks like another discussion is happening here which is probably the more appropriate place seeing that this is not something that .NET MAUI influences directly. Closing this one here.

@jfversluis jfversluis closed this as not planned Oct 31, 2024