Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Subscription failed with chip.exceptions.ChipStackError: IM Error 0x00000580: General error: 0x80 (INVALID_ACTION) . #37171

Open
Saravana-kr22 opened this issue Jan 23, 2025 · 1 comment
Assignees
Labels

Comments

@Saravana-kr22
Copy link
Contributor

Saravana-kr22 commented Jan 23, 2025

Description
When executing the script TC_Darwin_pair.py as a part of stress testing, in the CASE session when sending wildcard subscription using 2 controller there is a exception being thrown which is chip.exceptions.ChipStackError: IM Error 0x00000580: General error: 0x80 (INVALID_ACTION) . This script will mimic on a high level how the pairing occurs in the Darwin ecosystem. Here there are two fabrics is created and wildcard subcription is performed using both the controllers. But this INVALID_ACTION error occured only on the 2nd controller.

GitHub hash of the SDK that were being used are : 36f81cd

Platforms
I Have performed the execution on Raspberrypi using on-network provisioning
Test Bed :
Controller Platform - python controller running on Raspberrypi 4B -8GB,
RPI as DUT - all-cluster-app running all-cluster-app running on Raspberrypi 4B -8GB
Provisioning method - On-network

Test scenario
Pair/Provision the DUT (All-cluster-app running on raspi) as per drawiin pairing flow using On-network.
Read device information after the device is provisioned
sleep for 2 secs
Unpair the DUT (i.e. removing the fabric)
Factory reset the DUT
repeat step1 - 6 for 200 iterations
Script reference: https://github.com/CHIP-Specifications/matter-qa/blob/main/src/matter_qa/scripts/reliability_scripts/TC_Darwin_Pair.py

Execution Details

In this Test bed for Iteration [ 138, 139, 140, 141] second controller is failed to perform wildcard subscription
Iterations scheduled to run: 200
Iterations Failed: 7
Iterations Passed: 193
Execution Logs (logs consists of 1 Pass iteration:142 and 4 Fail Iterations):

138_failed.zip
139_failed.zip
140_failed.zip
141_failed.zip
142_passed.zip

@Saravana-kr22 Saravana-kr22 changed the title Subcription failed with chip.exceptions.ChipStackError: IM Error 0x00000580: General error: 0x80 (INVALID_ACTION) . Subscription failed with chip.exceptions.ChipStackError: IM Error 0x00000580: General error: 0x80 (INVALID_ACTION) . Jan 23, 2025
@bzbarsky-apple
Copy link
Contributor

From 138_failed.zip from the DUT log, here's the relevant bit:

[1737585692.750899][29350:29350] CHIP:EM: >>> [E:40690r S:22244 M:49634145] (S) Msg RX from 2:000000000001B66A [99AB] to 0000000012344322 --- Type 0001:03 (IM:SubscribeRequest) (B:51)
...
[1737585718.960936][29350:29350] CHIP:EM: <<< [E:40690r S:22244 M:198522001 (Ack:49634217)] (S) Msg TX from 0000000012344322 to 2:000000000001B66A [99AB] [UDP:[fe80::3c73:93e0:a746:d2c5%wlan0]:43012] --- Type 0001:05 (IM:ReportData) (B:1184)
...
[1737585718.963309][29350:29350] CHIP:EM: >>> [E:40691r S:22244 M:49634218] (S) Msg RX from 2:000000000001B66A [99AB] to 0000000012344322 --- Type 0001:03 (IM:SubscribeRequest) (B:51)
...
[1737585718.964428][29350:29350] CHIP:IM: Deleting previous subscription from NodeId: 000000000001B66A, FabricIndex: 2
...
[1737585720.611132][29350:29350] CHIP:EM: >>> [E:40690r S:22244 M:49634221 (Ack:198522001)] (S) Msg RX from 2:000000000001B66A [99AB] to 0000000012344322 --- Type 0001:01 (IM:StatusResponse) (B:42)
...
[1737585720.611341][29350:29350] CHIP:IM: Msg type 1 not supported
...
[1737585720.613503][29350:29350] CHIP:EM: <<< [E:40690r S:22244 M:198522003 (Ack:49634221)] (S) Msg TX from 0000000012344322 to 2:000000000001B66A [99AB] [UDP:[fe80::3c73:93e0:a746:d2c5%wlan0]:43012] --- Type 0001:01 (IM:StatusResponse) (B:42)

So we're in the middle of a priming read for that first SubscribeRequest; the DUT sent a ReportData and are waiting for the StatusResponse. Then another SubscribeRequest from the same node comes in, with keepSubscriptions set to false, so that in-progress subscription is torn down. When the StatusResponse then comes in, the DUT is not expecting it anymore, so it responds INVALID_ACTION. Then the client tears down its side of the first subscription and reports the error.

But the fundamental problem is the client doing the new subscribe with keepSubscriptions set to false before it's done with the previous subscription.

OK, so why is the controller doing that second subscribe? From the controller side of the log there:

2025-01-22 22:40:38,364 - base_tc - INFO - matter_qa_base_test_class.py:_create_iteration_log_file:221 - Started Iteration 138
...
2025-01-22 22:40:41,172 - chip.native.DMG - INFO - __init__.py:_RedirectToPythonLogging:41 - OnResubscribeTimerCallback: ForceCASE = 1

Looks to me like there is state left over from previous iterations that then messes with this one. So basically: the test script is buggy.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Development

No branches or pull requests

3 participants