Signal 11 crash still present after SIMCTL_CHILD_NSZombieEnabled=YES + latest Firebase version #3207

Open
2 tasks done
shamilovtim opened this issue Feb 5, 2022 · 71 comments

Comments

@shamilovtim

shamilovtim commented Feb 5, 2022

Description


  • I have tested this issue on the latest Detox release and it still reproduces

Reproduction

We still see Signal 11 crashes in our project even after running with SIMCTL_CHILD_NSZombieEnabled=YES detox test -c ios.sim.release, updating firebase and firebase perf to the latest versions, and ensuring that GoogleUtilities is at 7.7+.

The error is commonly reproduced on CircleCI and is incredibly difficult / practically impossible to reproduce locally.

In order to help debug I've provided a list of all of the dependencies in our project. We use Expo 43 (no expo updates), React Native 0.65.2 and Codepush (probably unrelated, the error happens in the middle of a test, not at the beginning). I've found that the Stripe iOS SDK (via tipsi-stripe) is a source of method swizzling. Also, while Flipper is a source of method swizzling, we are building a release project so Flipper should be excluded.

Let me know if there's anything else I can provide to help debug and resolve this issue.

The stacktrace is:

DetoxRuntimeError: The pending request #99 ("invoke") has been rejected due to the following error:

The app has crashed, see the details below:

Signal 11 was raised
(
	0   Detox                               0x00000001127a56a5 +[NSThread(DetoxUtils) dtx_demangledCallStackSymbols] + 37
	1   Detox                               0x00000001127a8230 __DTXHandleCrash + 464
	2   Detox                               0x00000001127a8971 __DTXHandleSignal + 59
	3   libsystem_platform.dylib            0x0000000118ef1d7d _sigtramp + 29
	4   ???                                 0x00006000162b0180 0x0 + 105553488183680
	5   DetoxSync                           0x0000000154151b37 -[_DTXTimerTrampoline fire:] + 188
	6   DetoxSync                           0x0000000154139b70 _DTXCFTimerTrampoline + 74
	7   CoreFoundation                      0x00000001156416b6 __CFRUNLOOP_IS_CALLING_OUT_TO_A_TIMER_CALLBACK_FUNCTION__ + 20
	8   CoreFoundation                      0x00000001156411b2 __CFRunLoopDoTimer + 923
	9   CoreFoundation                      0x0000000115640771 __CFRunLoopDoTimers + 265
	10  CoreFoundation                      0x000000011563adb0 __CFRunLoopRun + 2010
	11  CoreFoundation                      0x000000011563a0f3 CFRunLoopRunSpecific + 567
	12  Shipt                               0x000000010ff87c74 +[RCTCxxBridge runRunLoop] + 281
	13  DetoxSync                           0x000000015414173a swz_runRunLoopThread + 291
	14  Foundation                          0x0000000113539550 __NSThread__start__ + 1025
	15  libsystem_pthread.dylib             0x0000000118e8a8fc _pthread_start + 224
	16  libsystem_pthread.dylib             0x0000000118e86443 thread_start + 15
)
    at _callee4$ (/Users/distiller/nebula/e2e/helpers/helpers.ts:22:42)
    at tryCatch (/Users/distiller/nebula/node_modules/regenerator-runtime/runtime.js:63:40)

List of dependencies present in the project:

	"dependencies": {
		"@babel/preset-typescript": "7.16.7",
		"@bugsnag/react-native": "7.15.1",
		"@formatjs/intl-datetimeformat": "^2.8.3",
		"@formatjs/intl-numberformat": "^5.7.5",
		"@oracle/react-native-pushiomanager": "https://github.com/shamilovtim/pushiomanager-react-native#shipt-next",
		"@react-native-community/async-storage": "^1.6.2",
		"@react-native-community/geolocation": "^2.0.2",
		"@react-native-community/hooks": "2.8.1",
		"@react-native-community/push-notification-ios": "^1.10.1",
		"@react-native-firebase/app": "14.2.4",
		"@react-native-firebase/perf": "14.2.4",
		"@react-navigation/bottom-tabs": "^6.0.9",
		"@react-navigation/native": "^6.0.6",
		"@react-navigation/routers": "^6.1.0",
		"@react-navigation/stack": "^6.0.11",
		"@segment/analytics-react-native": "^1.5.0",
		"@segment/analytics-react-native-firebase": "^1.5.0",
		"accounting-js": "^1.1.1",
		"concurrently": "6.5.1",
		"core-js": "3.20.1",
		"date-fns": "2.7.0",
		"date-fns-tz": "1.0.12",
		"expo": "43.0.4",
		"expo-apple-authentication": "^4.1.0",
		"expo-facebook": "12.0.3",
		"expo-google-sign-in": "10.0.3",
		"expo-linear-gradient": "10.0.3",
		"expo-secure-store": "^11.1.0",
		"expo-tracking-transparency": "2.0.3",
		"graphql": "^15.7.0",
		"graphql-request": "3.7.0",
		"imagemin": "^8.0.1",
		"immer": "9.0.12",
		"inquirer": "^8.2.0",
		"intl": "^1.2.5",
		"invariant": "2.2.4",
		"jsdom": "19.0.0",
		"lodash": "^4.17.21",
		"lottie-ios": "^3.2.3",
		"lottie-react-native": "4.1.3",
		"mailosaur": "^8.1.0",
		"medallia-digital": "https://repository.medallia.com/digital-npm/medallia-digital-rn/medallia-digital-rn-3.8.1.tgz",
		"multi-progress": "4.0.0",
		"nano-memoize": "1.2.1",
		"node-emoji": "^1.11.0",
		"normalizr": "^3.6.1",
		"prop-types": "15.8.0",
		"query-string": "7.0.1",
		"react": "17.0.2",
		"react-native": "0.65.2",
		"react-native-action-sheet": "2.2.0",
		"react-native-android-open-settings": "1.3.0",
		"react-native-branch": "5.3.0",
		"react-native-camera": "4.2.1",
		"react-native-code-push": "^7.0.4",
		"react-native-device-info": "8.4.8",
		"react-native-easing-gradient": "^1.0.0",
		"react-native-fs": "^2.18.0",
		"react-native-geolocation-service": "5.2.0",
		"react-native-gesture-handler": "1.10.3",
		"react-native-get-random-values": "1.7.2",
		"react-native-haptic-feedback": "1.13.0",
		"react-native-html-parser": "^0.1.0",
		"react-native-in-app-review": "^3.2.3",
		"react-native-keyboard-aware-scroll-view": "0.9.5",
		"react-native-mmkv": "1.6.3",
		"react-native-modal": "13.0.0",
		"react-native-progress": "5.0.0",
		"react-native-push-notification": "8.1.1",
		"react-native-reanimated": "1.13.1",
		"react-native-redash": "^15.6.0",
		"react-native-render-html": "4.2.4",
		"react-native-safe-area-context": "3.3.2",
		"react-native-screens": "3.10.1",
		"react-native-share": "7.3.2",
		"react-native-svg": "^12.1.1",
		"react-native-v8": "0.65.1-patch.1",
		"react-native-webview": "11.15.0",
		"react-query": "3.34.6",
		"react-redux": "^7.2.6",
		"reactotron-redux": "^3.1.3",
		"redux": "4.1.2",
		"redux-persist": "^6.0.0",
		"redux-saga": "^1.1.3",
		"reselect": "4.1.5",
		"shelljs": "^0.8.5",
		"sift-react-native": "^0.1.3",
		"tipsi-stripe": "adkenyon/tipsi-stripe#5cab5f303bedc5ffd3fb543b14dc229f5259b0a0",
		"uri-js": "4.4.1",
		"uuid": "^8.3.2",
		"v8-android-jit-nointl": "9.93.0"
	}

Expected behavior

Expect no signal 11

Screenshots / Video

Not necessary right now, will add later if needed.

Environment

  • Detox: 19.4.2
  • React Native: 0.65.2
  • Node: 14.17.3
  • Device: iOS Emulator
  • Xcode: 13.2.1
  • iOS: latest
  • macOS: 11.6
  • Test-runner (select one): jest-circus

Logs

Device and verbose Detox logs

  • I have run my tests using the --loglevel trace argument and am providing the verbose log below:
Detox logs: the stack trace is identical to the one shown under Reproduction above.
@asafkorem
Contributor

Hi @shamilovtim, thanks for the report!

Try replacing NSZombieEnabled=YES detox test -c ios.sim.release with SIMCTL_CHILD_NSZombieEnabled=YES detox test -c ios.sim.release

Also, note that the flag we now pass as a fix for the Firebase-related Signal 11 deallocation errors is GULGeneratedClassDisposeDisabled=YES. It replaced the NSZombieEnabled flag that Detox passed previously (see #3167, GoogleUtilities/#66).
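
For context, simctl forwards any variable in the calling environment that carries the SIMCTL_CHILD_ prefix into the launched app's environment, which is why a bare NSZombieEnabled=YES never reaches the app. A minimal TypeScript sketch of building such an environment before spawning detox test follows; `toSimctlChildEnv` is a hypothetical helper name, not part of Detox:

```typescript
// Hypothetical helper (not a Detox API): prefix each variable so that
// `xcrun simctl` passes it through to the app launched in the simulator.
const SIMCTL_CHILD_PREFIX = "SIMCTL_CHILD_";

function toSimctlChildEnv(appEnv: Record<string, string>): Record<string, string> {
  const result: Record<string, string> = {};
  for (const key of Object.keys(appEnv)) {
    // Leave already-prefixed keys alone to avoid a doubled prefix.
    const name = key.indexOf(SIMCTL_CHILD_PREFIX) === 0 ? key : SIMCTL_CHILD_PREFIX + key;
    result[name] = appEnv[key];
  }
  return result;
}

// The two flags discussed in this thread.
const simctlEnv = toSimctlChildEnv({
  NSZombieEnabled: "YES",
  GULGeneratedClassDisposeDisabled: "YES",
});
console.log(simctlEnv);
```

The resulting object could be merged into the environment of whatever process runs detox test -c ios.sim.release. Recent Detox versions already pass the GULGeneratedClassDisposeDisabled flag themselves, so setting it by hand is shown only for illustration.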

@shamilovtim
Author

shamilovtim commented Feb 5, 2022

Hey @asafkorem! Just confirmed that we do pass SIMCTL_CHILD_NSZombieEnabled=YES correctly; I had just typed it wrong on the GitHub ticket, and I've updated the original ticket to reflect that. [Screenshot attached: Screen Shot 2022-02-05 at 5 12 05 PM]

I know that you're passing the more specific flag inside of Detox itself but I was hoping that the Zombie flag might deal with any other method swizzling related crashes. Unfortunately the crashes were still present so the more general Zombie flag did not work. Passing no flag doesn't work either.

@shamilovtim shamilovtim changed the title from "Signal 11 crash still present after NSZombieEnabled=YES + latest Firebase version" to "Signal 11 crash still present after SIMCTL_CHILD_NSZombieEnabled=YES + latest Firebase version" Feb 5, 2022
@asafkorem
Contributor

asafkorem commented Feb 6, 2022

@shamilovtim ah okay, I see..
Was it working with the zombie-enable workaround before upgrading Detox to the latest version?

@asafkorem asafkorem self-assigned this Feb 6, 2022
@shamilovtim
Author

shamilovtim commented Feb 7, 2022

  • On December 17 I added SIMCTL_CHILD_NSZombieEnabled=YES to yarn detox test -c ci.ios.release in our project.
  • On December 29 I introduced Detox 19.4.1 to the project, which is supposed to include this fix inside Detox itself, so I removed SIMCTL_CHILD_NSZombieEnabled=YES from the detox test -c command. Between December 17 and December 29 we still saw Signal 11 in the project.
  • After December 29, we continued to see Signal 11.
  • On January 25, we upgraded to Detox 19.4.2 and after that continued to receive signal 11 in the project.

No matter which version of Detox we use, and whether or not we apply SIMCTL_CHILD_NSZombieEnabled=YES, Signal 11 persists in our project. We used to see it far more often, so its frequency has gone down. I assume that's because the Firebase fix removed some instances of it, but others are still occurring.

@asafkorem
Contributor

asafkorem commented Feb 8, 2022

@shamilovtim this sounds like the current Signal 11 errors are more Expo related (and aren't coming from Firebase), based on some previous reports and my gut feeling...

Also, can you tell me if you have some other dependency that uses swizzling in iOS?

That would be hard to investigate based on the logs only, so we must have some project I can work with that reproduces this issue.
I can try to invest some time soon in testing an Expo + Detox project with the Expo version you have, but if you have something ready that reproduces the crash on the latest Detox version and can send it to me, that would be very helpful.

@shamilovtim
Author

@asafkorem In my opinion the suspects are:

Expo (we don't use expo updates), Stripe (v14 stripe-ios via tipsi stripe), Bugsnag, Codepush, Segment Analytics

Stripe iOS (v14) for sure has method swizzling. I was able to find it by scanning the source code of the Pod.

Flipper uses method swizzling but will not be active during Release builds. (However, I do wonder whether Flipper would crash Detox in debug builds.)

Excuse my ignorance but is there any way of making Detox play nice with any and all method swizzling rather than the Firebase workaround created in the Google repo?
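
The kind of source scan mentioned above can be approximated by searching Pod sources for the Objective-C runtime functions that swizzling typically relies on. A rough TypeScript sketch follows; the function list is an assumption about what is worth flagging rather than an exhaustive rule, and `findSwizzlingCalls` is a hypothetical name:

```typescript
// Objective-C runtime calls commonly used for method or ISA swizzling.
// Assumption: this list is illustrative, not exhaustive.
const SWIZZLING_APIS = [
  "method_exchangeImplementations",
  "method_setImplementation",
  "class_replaceMethod",
  "object_setClass",
];

interface SwizzleHit {
  line: number; // 1-based line number within the scanned source
  api: string;  // which runtime function was found on that line
}

// Plain substring search, line by line; it does not parse Objective-C.
function findSwizzlingCalls(source: string): SwizzleHit[] {
  const hits: SwizzleHit[] = [];
  const lines = source.split("\n");
  for (let i = 0; i < lines.length; i++) {
    for (const api of SWIZZLING_APIS) {
      if (lines[i].indexOf(api + "(") !== -1) {
        hits.push({ line: i + 1, api: api });
      }
    }
  }
  return hits;
}

// Made-up Objective-C snippet used only to exercise the scanner.
const swizzleSample = [
  "Method a = class_getInstanceMethod(cls, @selector(viewDidLoad));",
  "Method b = class_getInstanceMethod(cls, @selector(xxx_viewDidLoad));",
  "method_exchangeImplementations(a, b);",
].join("\n");
console.log(findSwizzlingCalls(swizzleSample));
```

A hit only shows that a library swizzles something, not that it conflicts with Detox, so this is a way to shortlist suspects rather than to confirm one.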

@asafkorem
Contributor

asafkorem commented Feb 13, 2022

@shamilovtim the problem in Firebase/Performance wasn't on Detox's end but in their code (it pretty much assumed that no other framework does dynamic ISA-swizzling of iOS classes), so it's hard to see a general solution to the problem from within Detox's code.

However, once we finish the Detox-iOS transition to the XCUITest framework, we may be able to avoid ISA-swizzling iOS classes, which will hopefully resolve such issues.

@asafkorem
Contributor

> @asafkorem In my opinion the suspects are:
>
> Expo (we don't use expo updates), Stripe (v14 stripe-ios via tipsi stripe), Bugsnag, Codepush, Segment Analytics
>
> Stripe iOS (v14) for sure has method swizzling. I was able to find it by scanning the source code of the Pod.
>
> Flipper uses method swizzling but will not be active during Release builds. (However, I do wonder if Flipper would crash detox debug)?
>
> Excuse my ignorance but is there any way of making Detox play nice with any and all method swizzling rather than the Firebase workaround created in the Google repo?

These definitely sound suspicious 😅 It would be very helpful for us if you could run DetoxTemplate (or any example RN project) with Detox tests and any of your dependencies that might cause this error, and try to reproduce the bug. I know it might take a lot of time, but it is also hard for me to reproduce this issue with libraries or frameworks I'm not familiar with.

@ball-hayden
Contributor

ball-hayden commented Mar 9, 2022

We're seeing this too. I can maybe narrow it down a bit, as we overlap with some of those dependencies.

We're using bits of Expo and CodePush.
(We don't use Stripe, Bugsnag, or Segment Analytics)

We also use reanimated (which I see in your dependency list), and our tests do seem to fail with Signal 11s around places where there are animations. Would that match up with what you're seeing @shamilovtim?

@shamilovtim shamilovtim mentioned this issue Mar 28, 2022
1 task
@shamilovtim
Author

We can't really nail down the cause. What version of reanimated are you on?

@ball-hayden
Contributor

ball-hayden commented Mar 28, 2022

We're running react-native-reanimated: ^2.3.1

@shamilovtim
Author

We're on 1.13.1. Makes me skeptical it's reanimated. Could it just be Expo in some way?

@ball-hayden
Contributor

ball-hayden commented Mar 29, 2022

Possibly, although I can't help but feel it might be animation-timing related given the places we see this are always slightly after an animation has completed.

The places we are seeing this aren't places where we're interacting with Expo directly.

I've tried creating a project based on the Detox Repro Template that has Expo and reanimated in it but I haven't managed to get it to crash yet.

Something I have noticed is that the amount of time it takes Detox to "react" (i.e. continue to the next command) to an animation finishing seems to vary somewhat - see the video below at 19 seconds vs the other repetitions from my attempted reproduction:

test.mov

I also wondered if it might be a race between an animation finishing and a request completing, although again haven't managed to coax the reproduction into crashing.

@shamilovtim
Author

> Possibly, although I can't help but feel it might be animation-timing related given the places we see this are always slightly after an animation has completed.
>
> The places we are seeing this aren't places where we're interacting with Expo directly.
>
> I've tried creating a project based on the Detox Repro Template that has Expo and reanimated in it but I haven't managed to get it to crash yet.
>
> Something I have noticed is that the amount of time it takes Detox to "react" (i.e. continue to the next command) to an animation finishing seems to vary somewhat - see the video below at 19 seconds vs the other repetitions from my attempted reproduction:
>
> test.mov
>
> I also wondered if it might be a race between an animation finishing and a request completing, although again haven't managed to coax the reproduction into crashing.

Are you using the template in CI? Do you have your code up somewhere so I can pull it or add to it?

We only get these crashes in CI fyi

@ball-hayden
Contributor

> We only get these crashes in CI

^ that is a point. I don't think I've ever seen it locally.

Fighting some other fires, but I'll get the template pushed somewhere when I can.

@shamilovtim
Author

shamilovtim commented Mar 29, 2022

@asafkorem I wonder if we can add reanimated, gesture handler, expo and react navigation to the official Detox e2e tests? By adding all of the common dependencies it might help to catch some of this crashing. What CI provider do you guys use for the project?

@noomorph
Collaborator

@shamilovtim

> What CI provider do you guys use for the project?

Buildkite + Macstadium

@ball-hayden
Contributor

@shamilovtim sorry for the delay.

https://github.com/PlayerData/DetoxReanimatedRepro

Just trying to set up GitHub actions to see if that might result in a reproduction.

@ball-hayden
Contributor

@shamilovtim can I ask, do you run multiple simulators at once in CI (as in, do you use the --max-workers flag)?

@ball-hayden
Contributor

ball-hayden commented Apr 11, 2022

Running PlayerData/DetoxReanimatedRepro#4 locally, Detox is consistently not waiting for animations to finish for some workers (between 1 and 3).

It doesn't, however, raise Signal 11.

@ball-hayden
Contributor

To brain dump a bit more - the stack trace takes us to here (specifically, the block):

https://github.com/wix/DetoxSync/blob/f5cd8cbde60311bb8e41df7c773850cd73c8f84b/DetoxSync/DetoxSync/Utils/_DTXTimerTrampoline.m#L120

Could fire be running for a timer that's been disposed, leading to mode or timer being invalid by the time we execute the block?

Sorry to ping @LeoNatan, but I'm waaay out of my depth here - would you have any ideas around this space?
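
One way to picture the hypothesis above: a trampoline's fire callback can still be queued on the run loop after the timer has been invalidated, so anything it dereferences may already be gone. The TypeScript model below only illustrates that ordering and one defensive check. It is not DetoxSync's code, and every name in it is hypothetical:

```typescript
// Toy model of a timer trampoline. Assumption: names and structure are
// invented for illustration and do not mirror DetoxSync internals.
class TimerTrampolineModel {
  private callback: (() => void) | null;
  private valid = true;
  public firedCount = 0;

  constructor(callback: () => void) {
    this.callback = callback;
  }

  // Stands in for "the timer has been disposed": drop what fire() needs.
  invalidate(): void {
    this.valid = false;
    this.callback = null;
  }

  // Unsafe variant: assumes the callback is still there.
  fireUnchecked(): void {
    (this.callback as () => void)(); // throws if invalidate() ran first
    this.firedCount++;
  }

  // Defensive variant: take a local snapshot and bail out if invalidated.
  fire(): boolean {
    const cb = this.callback;
    if (!this.valid || cb === null) {
      return false;
    }
    cb();
    this.firedCount++;
    return true;
  }
}

const lateTrampoline = new TimerTrampolineModel(() => {});
lateTrampoline.invalidate(); // disposal wins the race
console.log(lateTrampoline.fire()); // false: the late fire is ignored
```

In native code the unchecked path would show up as a bad memory access rather than a catchable exception, which would be consistent with a Signal 11, though nothing here confirms that this is the actual cause.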

@shamilovtim
Author

> @shamilovtim can I ask, do you run multiple simulators at once in CI (as in, do you use the --max-workers flag)?

No, we don't. CircleCI doesn't allow hardware acceleration, so we are limited performance-wise.

@shamilovtim
Author

> To brain dump a bit more - the stack trace takes us to here (specifically, the block):
>
> https://github.com/wix/DetoxSync/blob/f5cd8cbde60311bb8e41df7c773850cd73c8f84b/DetoxSync/DetoxSync/Utils/_DTXTimerTrampoline.m#L120
>
> Could fire be running for a timer that's been disposed, leading to mode or timer being invalid by the time we execute the block?
>
> Sorry to ping @LeoNatan, but I'm waaay out of my depth here - would you have any ideas around this space?

I wonder if this has something to do with software-rendered, slow CI VMs? Maybe the VM is so slow that the timer times out and gets GC'd or whatever, like you said? That would explain why we don't see this when Detox runs on bare-metal instances.

@ball-hayden
Contributor

Our CI doesn't use VMs but instead a pair of slightly ageing Mac Minis (2018) so the same theory holds.

@ball-hayden
Contributor

Couple of patches I tried yesterday.

Firstly, this change which guards against the returned proxy being null.

diff --git a/node_modules/detox/DetoxSync/DetoxSync/DetoxSync/Spies/CADisplayLink+DTXSpy.m b/node_modules/detox/DetoxSync/DetoxSync/DetoxSync/Spies/CADisplayLink+DTXSpy.m
index 023eabc..c296644 100644
--- a/node_modules/detox/DetoxSync/DetoxSync/DetoxSync/Spies/CADisplayLink+DTXSpy.m
+++ b/node_modules/detox/DetoxSync/DetoxSync/DetoxSync/Spies/CADisplayLink+DTXSpy.m
@@ -102,13 +102,16 @@ extern atomic_cfrunloop __RNRunLoop;
 	if(self.isPaused != paused)
 	{
 		id<DTXTimerProxy> proxy = [DTXTimerSyncResource existingTimerProxyWithDisplayLink:self create:NO];
-		if(paused == YES)
-		{
-			[proxy untrack];
-		}
-		else
-		{
-			[self _detox_sync_trackIfNeeded];
+
+		if(proxy) {
+			if(paused == YES)
+			{
+				[proxy untrack];
+			}
+			else
+			{
+				[self _detox_sync_trackIfNeeded];
+			}
 		}
 	}

This didn't help, which kinda makes sense - there should always be a proxy at this point if I've understood properly.

The more interesting change is this one:

diff --git a/node_modules/detox/DetoxSync/DetoxSync/DetoxSync/Spies/NSTimer+DTXSpy.m b/node_modules/detox/DetoxSync/DetoxSync/DetoxSync/Spies/NSTimer+DTXSpy.m
index 8f5549c..9a82b9e 100644
--- a/node_modules/detox/DetoxSync/DetoxSync/DetoxSync/Spies/NSTimer+DTXSpy.m
+++ b/node_modules/detox/DetoxSync/DetoxSync/DetoxSync/Spies/NSTimer+DTXSpy.m
@@ -78,7 +78,7 @@ CFRunLoopTimerRef __detox_sync_CFRunLoopTimerCreateWithHandler(CFAllocatorRef al
 static void (*__orig_CFRunLoopAddTimer)(CFRunLoopRef rl, CFRunLoopTimerRef timer, CFRunLoopMode mode);
 void __detox_sync_CFRunLoopAddTimer(CFRunLoopRef rl, CFRunLoopTimerRef timer, CFRunLoopMode mode)
 {
-//	NSLog(@"🤦‍♂️ addTimer: %@", NS(timer));
+	NSLog(@"🤦‍♂️ addTimer: %@", NS(timer));

 	id<DTXTimerProxy> trampoline = [DTXTimerSyncResource existingTimerProxyWithTimer:NS(timer)];
 	trampoline.runLoop = rl;
@@ -91,7 +91,7 @@ void __detox_sync_CFRunLoopAddTimer(CFRunLoopRef rl, CFRunLoopTimerRef timer, CF
 static void (*__orig_CFRunLoopRemoveTimer)(CFRunLoopRef rl, CFRunLoopTimerRef timer, CFRunLoopMode mode);
 void __detox_sync_CFRunLoopRemoveTimer(CFRunLoopRef rl, CFRunLoopTimerRef timer, CFRunLoopMode mode)
 {
-//	NSLog(@"🤦‍♂️ removeTimer: %@", NS(timer));
+	NSLog(@"🤦‍♂️ removeTimer: %@", NS(timer));

 	id<DTXTimerProxy> trampoline = [DTXTimerSyncResource existingTimerProxyWithTimer:NS(timer)];
 	[trampoline untrack];
@@ -102,7 +102,7 @@ void __detox_sync_CFRunLoopRemoveTimer(CFRunLoopRef rl, CFRunLoopTimerRef timer,
 static void (*__orig_CFRunLoopTimerInvalidate)(CFRunLoopTimerRef timer);
 void __detox_sync_CFRunLoopTimerInvalidate(CFRunLoopTimerRef timer)
 {
-//	NSLog(@"🤦‍♂️ invalidate: %@", NS(timer));
+	NSLog(@"🤦‍♂️ invalidate: %@", NS(timer));

 	id<DTXTimerProxy> trampoline = [DTXTimerSyncResource existingTimerProxyWithTimer:NS(timer)];
 	[trampoline untrack];
@@ -113,7 +113,7 @@ void __detox_sync_CFRunLoopTimerInvalidate(CFRunLoopTimerRef timer)
 static void (*__orig___NSCFTimer_invalidate)(NSTimer* timer);
 void __detox_sync___NSCFTimer_invalidate(NSTimer* timer)
 {
-	//	NSLog(@"🤦‍♂️ invalidate: %@", timer);
+    NSLog(@"🤦‍♂️ invalidate: %@", timer);

 	id<DTXTimerProxy> trampoline = [DTXTimerSyncResource existingTimerProxyWithTimer:timer];
 	[trampoline untrack];
diff --git a/node_modules/detox/DetoxSync/DetoxSync/DetoxSync/Utils/_DTXTimerTrampoline.m b/node_modules/detox/DetoxSync/DetoxSync/DetoxSync/Utils/_DTXTimerTrampoline.m
index 5235fdc..b66d945 100644
--- a/node_modules/detox/DetoxSync/DetoxSync/DetoxSync/Utils/_DTXTimerTrampoline.m
+++ b/node_modules/detox/DetoxSync/DetoxSync/DetoxSync/Utils/_DTXTimerTrampoline.m
@@ -162,6 +162,8 @@ const void* __DTXTimerTrampolineKey = &__DTXTimerTrampolineKey;
 		return;
 	}

+    NSLog(@"🤦‍♂️ track: %@", _timer);
+	
 	_tracking = YES;
 	[DTXTimerSyncResource.sharedInstance trackTimerTrampoline:self];
 }
@@ -173,7 +175,7 @@ const void* __DTXTimerTrampolineKey = &__DTXTimerTrampolineKey;
 		return;
 	}

-	//	NSLog(@"🤦‍♂️ untrack: %@", _timer);
+    NSLog(@"🤦‍♂️ untrack: %@", _timer);

 	[DTXTimerSyncResource.sharedInstance untrackTimerTrampoline:self];
 	_tracking = NO;

All this change is doing is re-enabling some of the commented out logging (and adding a little bit extra).

With this change, I wasn't able to reproduce a segfault over around 10 runs (usually, I would have seen at least one if not two in that space).

Presumably, adding the logging is causing a reference to the timer to be retained for longer than it would be otherwise. Either that, or the extra slowness of logging makes the race no longer happen.
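
As a rough sanity check on how much ten clean runs prove: if each run crashes independently with probability p, the chance of n consecutive clean runs is (1 - p)^n. The TypeScript sketch below works that out; the per-run crash rates are assumptions loosely based on "one if not two" crashes in ten runs, not measured values:

```typescript
// Probability that `runs` independent runs all pass when each one crashes
// with probability `crashRate`. Assumption: runs are independent and the
// crash rate is constant, which a flaky CI box may not honour.
function probabilityAllClean(crashRate: number, runs: number): number {
  if (crashRate < 0 || crashRate > 1 || runs < 0) {
    throw new RangeError("crashRate must be in [0, 1] and runs must be >= 0");
  }
  return Math.pow(1 - crashRate, runs);
}

// Smallest number of clean runs needed before a lucky streak becomes
// less likely than `threshold`.
function runsNeeded(crashRate: number, threshold: number): number {
  if (crashRate <= 0 || crashRate >= 1 || threshold <= 0 || threshold >= 1) {
    throw new RangeError("crashRate and threshold must be strictly between 0 and 1");
  }
  return Math.ceil(Math.log(threshold) / Math.log(1 - crashRate));
}

// Assumed crash rates: one or two crashes per ten runs.
for (const p of [0.1, 0.2]) {
  console.log(
    "p=" + p,
    "P(10 clean)=" + probabilityAllClean(p, 10).toFixed(3),
    "runs for <5% chance=" + runsNeeded(p, 0.05)
  );
}
```

Under those assumed rates, ten clean runs would happen by chance roughly 11% to 35% of the time, so a longer streak (about 14 to 29 runs) would be needed before crediting the logging change with much confidence.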

@shamilovtim
Author

valid

@stale stale bot removed the 🏚 stale label Nov 22, 2022
@shamilovtim
Author

I once repro'd this locally on my M1 Max. Wonder how many Detox runs it would take to see it again.

@matthewparavati

This comment was marked as off-topic.

@ball-hayden
Contributor

@matthewparavati if you can reliably reproduce your issue, that sounds like a different problem.

I'd suggest creating a reproduction repo based on https://github.com/wix-incubator/DetoxTemplate and filing a separate issue.

@shamilovtim
Author

Signal 6 is different than Signal 11. In most cases we have found that Signal 6 is a valid crash in your business logic and a problem in your app.
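
For reference, on macOS and iOS signal 6 is SIGABRT (an explicit abort, typically from an uncaught exception or a failed assertion) and signal 11 is SIGSEGV (an invalid memory access). A small TypeScript sketch for pulling the signal out of a Detox crash message of the form shown earlier in this thread follows; `parseCrashSignal` is a hypothetical helper, not a Detox API:

```typescript
// Signal numbers as defined on macOS/iOS; only the two discussed here.
const SIGNAL_NAMES: Record<number, string> = {
  6: "SIGABRT",
  11: "SIGSEGV",
};

interface CrashSignal {
  number: number;
  name: string; // "unknown" when the number is not in SIGNAL_NAMES
}

// Looks for the "Signal N was raised" line seen in the crash reports above.
function parseCrashSignal(message: string): CrashSignal | null {
  const match = /Signal (\d+) was raised/.exec(message);
  if (match === null) {
    return null;
  }
  const num = parseInt(match[1], 10);
  const known = SIGNAL_NAMES[num];
  return { number: num, name: known !== undefined ? known : "unknown" };
}

console.log(
  parseCrashSignal("The app has crashed, see the details below:\n\nSignal 11 was raised")
);
```

Sorting CI failures by this value could help keep app-level aborts out of the Signal 11 statistics.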

@matthewparavati

Thanks for the responses @ball-hayden and @shamilovtim. Apparently our android e2e tests were a false positive pass. They were actually crashing, too, but for some reason said they finished as passing 😕 I'll look to create a new issue

@d4vidi
Collaborator

d4vidi commented Jan 11, 2023

@shamilovtim In the next reproduction of this issue... Could you be kind enough to provide all of the up-to-date artifacts and info so we could take another round at investigating this?

Or better yet, let's take this up on Discord.

@stale

stale bot commented Feb 11, 2023

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs.
If you believe the issue is still relevant, please test on the latest Detox and report back.

Thank you for your contributions!

For more information on bots in this repository, read this discussion.

@stale stale bot added the 🏚 stale label Feb 11, 2023
@ball-hayden
Contributor

Still seeing this in the latest versions.
(Appreciate this isn't massively helpful, but reproducing it continues to be very difficult)

@d4vidi what artifacts and further information would you like - I can make sure our CI is archiving them so I can send them over?

@stale

stale bot commented Feb 21, 2023

The issue has been closed for inactivity.

@stale stale bot closed this as completed Feb 21, 2023
@ball-hayden
Contributor

Oh. Okay. Definitely still seeing this. Sorry.

@d4vidi
Collaborator

d4vidi commented Feb 23, 2023

> Still seeing this in the latest versions. (Appreciate this isn't massively helpful, but reproducing it continues to be very difficult)
>
> @d4vidi what artifacts and further information would you like - I can make sure our CI is archiving them so I can send them over?

Screenshots, videos, logs, view-hierarchy - Essentially everything 😄 You can start here

@owens-ben

+1 seeing this in latest versions. Will try and gather some logs if possible

@CharlesMangwa

CharlesMangwa commented Jun 7, 2023

also getting the Signal 11 error sporadically, with these 2 logs being the ones that show up the most:

1. most frequent:
Signal 11 was raised
(
	0   Detox                               0x0000000102727118 +[NSThread(DetoxUtils) dtx_demangledCallStackSymbols] + 36
	1   Detox                               0x0000000102729d28 __DTXHandleCrash + 440
	2   Detox                               0x000000010272a358 __DTXHandleSignal + 72
	3   libsystem_platform.dylib            0x00000001b0591760 _sigtramp + 52
	4   STAGING                             0x0000000100ae51f4 reanimated::Scheduler::triggerUI() + 92
	5   STAGING                             0x0000000100ae51f4 reanimated::Scheduler::triggerUI() + 92
	6   DetoxSync                           0x000000010d5a0454 ____detox_sync_dispatch_wrapper_block_invoke + 44
	7   libdispatch.dylib                   0x0000000180133fa4 _dispatch_call_block_and_release + 24
	8   libdispatch.dylib                   0x0000000180135768 _dispatch_client_callout + 16
	9   libdispatch.dylib                   0x0000000180145018 _dispatch_main_queue_drain + 1220
	10  libdispatch.dylib                   0x0000000180144b44 _dispatch_main_queue_callback_4CF + 40
	11  CoreFoundation                      0x0000000180372ca4 __CFRUNLOOP_IS_SERVICING_THE_MAIN_DISPATCH_QUEUE__ + 12
	12  CoreFoundation                      0x000000018036d360 __CFRunLoopRun + 1956
	13  CoreFoundation                      0x000000018036c7a4 CFRunLoopRunSpecific + 584
	14  GraphicsServices                    0x0000000188ff7c98 GSEventRunModal + 160
	15  UIKitCore                           0x000000011021237c -[UIApplication _run] + 868
	16  DetoxSync                           0x000000010d5a68fc __detox_sync_UIApplication_run + 376
	17  UIKitCore                           0x0000000110216374 UIApplicationMain + 124
	18  STAGING                             0x00000001008dabdc main + 80
	19  dyld                                0x000000010235dfa0 start_sim + 20
	20  ???                                 0x00000001022ad08c 0x0 + 4331327628
	21  ???                                 0xce6a000000000000 0x0 + -3573043354365067264
)
2. once every 10 random crashes approximately:
Signal 11 was raised
(
	0   Detox                               0x00000001028bb118 +[NSThread(DetoxUtils) dtx_demangledCallStackSymbols] + 36
	1   Detox                               0x00000001028bdd28 __DTXHandleCrash + 440
	2   Detox                               0x00000001028be358 __DTXHandleSignal + 72
	3   libsystem_platform.dylib            0x00000001b0591760 _sigtramp + 52
	4   STAGING                             0x0000000100dc11f4 reanimated::Scheduler::triggerUI() + 92
	5   STAGING                             0x0000000100dc11f4 reanimated::Scheduler::triggerUI() + 92
	6   STAGING                             0x0000000100d9c49c invocation function for block in reanimated::createReanimatedModule(RCTBridge*, std::__1::shared_ptr) + 60
	7   STAGING                             0x0000000100daae54 -[REAAnimationsManager onViewCreate:after:] + 84
	8   STAGING                             0x0000000100dbb450 __51-[REAUIManager uiBlockWithLayoutUpdateForRootView:]_block_invoke.45 + 1824
	9   STAGING                             0x0000000100e57c50 __44-[RCTUIManager flushUIBlocksWithCompletion:]_block_invoke + 164
	10  STAGING                             0x0000000100e57d40 __44-[RCTUIManager flushUIBlocksWithCompletion:]_block_invoke.146 + 28
	11  DetoxSync                           0x0000000114c5c454 ____detox_sync_dispatch_wrapper_block_invoke + 44
	12  libdispatch.dylib                   0x0000000180133fa4 _dispatch_call_block_and_release + 24
	13  libdispatch.dylib                   0x0000000180135768 _dispatch_client_callout + 16
	14  libdispatch.dylib                   0x0000000180145018 _dispatch_main_queue_drain + 1220
	15  libdispatch.dylib                   0x0000000180144b44 _dispatch_main_queue_callback_4CF + 40
	16  CoreFoundation                      0x0000000180372ca4 __CFRUNLOOP_IS_SERVICING_THE_MAIN_DISPATCH_QUEUE__ + 12
	17  CoreFoundation                      0x000000018036d360 __CFRunLoopRun + 1956
	18  CoreFoundation                      0x000000018036c7a4 CFRunLoopRunSpecific + 584
	19  GraphicsServices                    0x0000000188ff7c98 GSEventRunModal + 160
	20  UIKitCore                           0x0000000103fe637c -[UIApplication _run] + 868
	21  DetoxSync                           0x0000000114c628fc __detox_sync_UIApplication_run + 376
	22  UIKitCore                           0x0000000103fea374 UIApplicationMain + 124
	23  STAGING                             0x0000000100bb6bdc main + 80
	24  dyld                                0x00000001023edfa0 start_sim + 20
	25  ???                                 0x000000010249108c 0x0 + 4333310092
	26  ???                                 0x4a72000000000000 0x0 + 5364350106151682048
)

As you can see, in our case the crash seems to be related to reanimated (neither firebase nor expo is installed or used).

Update: we're also seeing crashes on Android (here again, reanimated seems to be involved):

bad_weak_ptr exception on android:
The pending request #65 ("invoke") has been rejected due to the following error:
The app has crashed, see the details below:

@Thread main(2):
com.facebook.jni.CppException: bad_weak_ptr
	at com.swmansion.reanimated.Scheduler.triggerUI(Native Method)
	at com.swmansion.reanimated.layoutReanimation.AnimationsManager.onViewCreate(AnimationsManager.java:135)
	at com.swmansion.reanimated.layoutReanimation.ReaLayoutAnimator.applyLayoutUpdate(ReanimatedNativeHierarchyManager.java:89)
	at com.facebook.react.uimanager.NativeViewHierarchyManager.updateLayout(NativeViewHierarchyManager.java:252)
	at com.facebook.react.uimanager.NativeViewHierarchyManager.updateLayout(NativeViewHierarchyManager.java:222)
	at com.facebook.react.uimanager.UIViewOperationQueue$UpdateLayoutOperation.execute(UIViewOperationQueue.java:169)
	at com.facebook.react.uimanager.UIViewOperationQueue$1.run(UIViewOperationQueue.java:915)
	at com.facebook.react.uimanager.UIViewOperationQueue.flushPendingBatches(UIViewOperationQueue.java:1026)
	at com.facebook.react.uimanager.UIViewOperationQueue.access$2600(UIViewOperationQueue.java:47)
	at com.facebook.react.uimanager.UIViewOperationQueue$DispatchUIFrameCallback.doFrameGuarded(UIViewOperationQueue.java:1086)
	at com.facebook.react.uimanager.GuardedFrameCallback.doFrame(GuardedFrameCallback.java:29)
	at com.facebook.react.modules.core.ReactChoreographer$ReactChoreographerDispatcher.doFrame(ReactChoreographer.java:175)
	at com.facebook.react.modules.core.ChoreographerCompat$FrameCallback$1.doFrame(ChoreographerCompat.java:85)
	at android.view.Choreographer$CallbackRecord.run(Choreographer.java:1035)
	at android.view.Choreographer.doCallbacks(Choreographer.java:845)
	at android.view.Choreographer.doFrame(Choreographer.java:775)
	at android.view.Choreographer$FrameDisplayEventReceiver.run(Choreographer.java:1022)
	at android.os.Handler.handleCallback(Handler.java:938)
	at android.os.Handler.dispatchMessage(Handler.java:99)
	at android.os.Looper.loopOnce(Looper.java:201)
	at android.os.Looper.loop(Looper.java:288)
	at android.app.ActivityThread.main(ActivityThread.java:7839)
	at java.lang.reflect.Method.invoke(Native Method)
	at com.android.internal.os.RuntimeInit$MethodAndArgsCaller.run(RuntimeInit.java:548)
	at com.android.internal.os.ZygoteInit.main(ZygoteInit.java:1003)

and fwiw:

environment:
  • detox: 20.8.0
  • react native: 0.67.5
  • node: 16.20.0
  • device: ios emulator & android emulator
  • ios: 16.4
  • android: 12 (api 31)
  • xcode: 14.3.1
  • macos: 13.4
  • soc: apple silicon
  • test-runner: jest

@ball-hayden
Contributor

the crash seems to be related to reanimated

Yeah - that's about as far as we've narrowed it down.
That's in line with the investigations I'd done a while back. I have a reasonable suspicion that it's something to do with the timer trampoline objects being garbage collected (or otherwise becoming invalid).

Our workaround is to stub out everything that's animated with a non-animated variant, which isn't great :-(
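A minimal sketch of what that kind of stubbing could look like, assuming the app can tell at build or launch time that it is running under E2E tests. The `isE2E` flag and the `pickVariant` helper are hypothetical names for illustration, not part of Detox or Reanimated; strings stand in for component references so the snippet runs on its own.

```typescript
// Hypothetical helper: choose a non-animated stand-in when running under E2E tests.
// Neither `pickVariant` nor `isE2E` is a Detox or Reanimated API; both are assumptions.

type Variants<T> = {
  animated: T; // the normal, animated implementation
  plain: T;    // a static replacement with no animation
};

function pickVariant<T>(isE2E: boolean, variants: Variants<T>): T {
  // Under E2E the plain variant is returned, so the animated code path is never reached.
  return isE2E ? variants.plain : variants.animated;
}

// Usage: in a real app the values would be component references
// (for example an animated view and a regular view).
const isE2E = true; // assumption: set by whatever mechanism your build uses
const Container = pickVariant(isE2E, {
  animated: "AnimatedContainer",
  plain: "PlainContainer",
});
console.log(Container);
```

The trade-off is the one mentioned above: the tests no longer exercise the animated code paths at all.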

@LucidityDesign

LucidityDesign commented Jul 3, 2023

For me this is related to react-native-camera. I will investigate further.

Edit: I mocked all files that import suspicious modules, and so far both react-native-tab-view and react-native-reanimated cause crashes. react-native-camera doesn't seem to cause crashes.

@noomorph
Collaborator

noomorph commented Jul 4, 2023

@d4vidi @asafkorem we might need to set up a test suite with popular libraries someday. 😌

@ball-hayden
Contributor

^ ultimately, the problems here come down to Detox's tracking of animations.

Any library that triggers animations (react-native-tab-view uses Animated from react-native) is likely to cause this issue.

If we can solve the underlying issue in the handling of animations through the trampolines, this problem will go away (and we probably don't need to have a test suite with popular libraries - a basic test of animation tracking functionality should be sufficient).

@noomorph
Collaborator

noomorph commented Jul 4, 2023

@ball-hayden Detox has some basic self-test suite for animations: https://github.com/wix/Detox/blob/master/detox/test/e2e/12.animations.test.js

Seems like it is not sufficient. A good question would be "what's missing".

Here's the source code of the Animations screen:

https://github.com/wix/Detox/blob/master/detox/test/src/Screens/AnimationsScreen.js

@shamilovtim
Author

Try animations that use JSI, such as reanimated. I've experienced many JSI crashes due to reload, hard reload or hot reload.

@mklb

mklb commented Oct 2, 2023

I've had this error for over a year now. I finally solved it by upgrading react-native-reanimated to the latest version.
