[Tizen] Modify dnssd resolve operation process #37003

Open
wants to merge 6 commits into
base: master

hyunuktak
Contributor

@hyunuktak hyunuktak commented Jan 9, 2025

Problem

In multi-thread environment, Tizen dnssd's resolve operation process may cause deadlock.

Changes

Modify the resolve operation process so that it can operate on one thread.
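
For readers unfamiliar with the pattern, one common way to make an operation "operate on one thread" is to funnel every request through a single worker. The following is only a minimal sketch under that assumption, not the code in this PR: `SerialExecutor` and `CountExecutingThreads` are hypothetical names, and the posted lambda stands in for a resolve request.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <functional>
#include <mutex>
#include <queue>
#include <set>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical helper: runs every posted task on one dedicated worker thread.
class SerialExecutor
{
public:
    SerialExecutor() : mWorker([this] { Run(); }) {}

    ~SerialExecutor()
    {
        {
            std::lock_guard<std::mutex> guard(mMutex);
            mStopping = true;
        }
        mCond.notify_one();
        mWorker.join(); // the worker drains the queue before it exits
    }

    // May be called from any thread; the task itself runs on the worker.
    void Post(std::function<void()> task)
    {
        assert(task != nullptr);
        {
            std::lock_guard<std::mutex> guard(mMutex);
            mTasks.push(std::move(task));
        }
        mCond.notify_one();
    }

private:
    void Run()
    {
        for (;;)
        {
            std::function<void()> task;
            {
                std::unique_lock<std::mutex> lock(mMutex);
                mCond.wait(lock, [this] { return mStopping || !mTasks.empty(); });
                if (mTasks.empty())
                    return; // stopping and nothing left to run
                task = std::move(mTasks.front());
                mTasks.pop();
            }
            task(); // executed without holding the queue mutex
        }
    }

    std::mutex mMutex;
    std::condition_variable mCond;
    std::queue<std::function<void()>> mTasks;
    bool mStopping = false;
    std::thread mWorker; // declared last so the members above exist before Run() starts
};

// Posts fake resolve requests from several caller threads and returns how
// many distinct threads actually executed them.
std::size_t CountExecutingThreads(int callers, int requestsPerCaller, int & handled)
{
    std::set<std::thread::id> seen; // only touched by the worker thread
    handled = 0;
    {
        SerialExecutor executor;
        std::vector<std::thread> threads;
        for (int i = 0; i < callers; ++i)
        {
            threads.emplace_back([&] {
                for (int j = 0; j < requestsPerCaller; ++j)
                    executor.Post([&] {
                        seen.insert(std::this_thread::get_id());
                        ++handled;
                    });
            });
        }
        for (auto & t : threads)
            t.join();
    } // executor destructor joins the worker here
    return seen.size();
}
```

Because all requests run on the same worker, the tasks need no lock of their own to touch shared state, which removes the opportunity for two resolve callbacks to contend for a lock in conflicting order.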

Testing

Tested locally with TestDnssd test on Tizen:

Samsung's SmartThings and similar apps and services use Matter to set up a hub environment. When several Things are connected and the Hub is restarted or rebooted, the Hub runs the dnssd resolve process to rediscover the existing Things. Because the resolve process runs asynchronously for many Things at once, we confirmed that timing issues cause the threads to conflict over lock/unlock operations. The sequence is roughly as follows:
1. Connect two or more Matter Things.
2. Restart the Matter Hub.
3. The dnssd resolve process is called multiple times in parallel.
4. Timing issues appear in the multi-thread lock/unlock handling of the asynchronous resolve processes.

Whether this can be automated still needs to be verified.

@hyunuktak hyunuktak requested a review from arkq as a code owner January 9, 2025 05:57

@github-actions github-actions bot added the platform and tizen (For Tizen platform) labels Jan 9, 2025

github-actions bot commented Jan 9, 2025

PR #37003: Size comparison from f25f635 to 12bcab0

Full report (70 builds for bl602, bl702, bl702l, cc13x4_26x4, cc32xx, cyw30739, efr32, esp32, linux, nrfconnect, nxp, psoc6, qpg, stm32, telink, tizen)
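| platform | target | config | section | f25f635 | 12bcab0 | change | % change |
|---|---|---|---|---|---|---|---|
| bl602 | lighting-app | bl602+mfd+littlefs+rpc | FLASH | 1354956 | 1354956 | 0 | 0.0 |
| | | | RAM | 104152 | 104152 | 0 | 0.0 |
| bl702 | lighting-app | bl702+eth | FLASH | 726256 | 726256 | 0 | 0.0 |
| | | | RAM | 25361 | 25361 | 0 | 0.0 |
| | | bl702+wifi | FLASH | 913126 | 913126 | 0 | 0.0 |
| | | | RAM | 14101 | 14101 | 0 | 0.0 |
| | | bl706+mfd+rpc+littlefs | FLASH | 1173960 | 1173960 | 0 | 0.0 |
| | | | RAM | 23941 | 23941 | 0 | 0.0 |
| bl702l | lighting-app | bl702l+mfd+littlefs | FLASH | 1083028 | 1083028 | 0 | 0.0 |
| | | | RAM | 16612 | 16612 | 0 | 0.0 |
| cc13x4_26x4 | lighting-app | LP_EM_CC1354P10_6 | FLASH | 840208 | 840208 | 0 | 0.0 |
| | | | RAM | 123696 | 123696 | 0 | 0.0 |
| | lock-ftd | LP_EM_CC1354P10_6 | FLASH | 825748 | 825748 | 0 | 0.0 |
| | | | RAM | 125584 | 125584 | 0 | 0.0 |
| | pump-app | LP_EM_CC1354P10_6 | FLASH | 772568 | 772568 | 0 | 0.0 |
| | | | RAM | 114060 | 114060 | 0 | 0.0 |
| | pump-controller-app | LP_EM_CC1354P10_6 | FLASH | 756748 | 756748 | 0 | 0.0 |
| | | | RAM | 114260 | 114260 | 0 | 0.0 |
| cc32xx | air-purifier | CC3235SF_LAUNCHXL | FLASH | 540049 | 540049 | 0 | 0.0 |
| | | | RAM | 205800 | 205800 | 0 | 0.0 |
| | lock | CC3235SF_LAUNCHXL | FLASH | 574217 | 574217 | 0 | 0.0 |
| | | | RAM | 205944 | 205944 | 0 | 0.0 |
| cyw30739 | light | CYW30739B2-P5-EVK-01 | unknown | 2040 | 2040 | 0 | 0.0 |
| | | | FLASH | 681745 | 681745 | 0 | 0.0 |
| | | | RAM | 78756 | 78756 | 0 | 0.0 |
| | | CYW30739B2-P5-EVK-02 | unknown | 2040 | 2040 | 0 | 0.0 |
| | | | FLASH | 701597 | 701597 | 0 | 0.0 |
| | | | RAM | 81396 | 81396 | 0 | 0.0 |
| | | CYW30739B2-P5-EVK-03 | unknown | 2040 | 2040 | 0 | 0.0 |
| | | | FLASH | 701597 | 701597 | 0 | 0.0 |
| | | | RAM | 81396 | 81396 | 0 | 0.0 |
| | | CYW930739M2EVB-02 | unknown | 2040 | 2040 | 0 | 0.0 |
| | | | FLASH | 658525 | 658525 | 0 | 0.0 |
| | | | RAM | 73824 | 73824 | 0 | 0.0 |
| | light-switch | CYW30739B2-P5-EVK-01 | unknown | 2040 | 2040 | 0 | 0.0 |
| | | | FLASH | 618369 | 618369 | 0 | 0.0 |
| | | | RAM | 71748 | 71748 | 0 | 0.0 |
| | | CYW30739B2-P5-EVK-02 | unknown | 2040 | 2040 | 0 | 0.0 |
| | | | FLASH | 637997 | 637997 | 0 | 0.0 |
| | | | RAM | 74292 | 74292 | 0 | 0.0 |
| | | CYW30739B2-P5-EVK-03 | unknown | 2040 | 2040 | 0 | 0.0 |
| | | | FLASH | 637997 | 637997 | 0 | 0.0 |
| | | | RAM | 74292 | 74292 | 0 | 0.0 |
| | lock | CYW30739B2-P5-EVK-01 | unknown | 2040 | 2040 | 0 | 0.0 |
| | | | FLASH | 637769 | 637769 | 0 | 0.0 |
| | | | RAM | 74756 | 74756 | 0 | 0.0 |
| | | CYW30739B2-P5-EVK-02 | unknown | 2040 | 2040 | 0 | 0.0 |
| | | | FLASH | 657477 | 657477 | 0 | 0.0 |
| | | | RAM | 77300 | 77300 | 0 | 0.0 |
| | | CYW30739B2-P5-EVK-03 | unknown | 2040 | 2040 | 0 | 0.0 |
| | | | FLASH | 657477 | 657477 | 0 | 0.0 |
| | | | RAM | 77300 | 77300 | 0 | 0.0 |
| | thermostat | CYW30739B2-P5-EVK-01 | unknown | 2040 | 2040 | 0 | 0.0 |
| | | | FLASH | 614389 | 614389 | 0 | 0.0 |
| | | | RAM | 68844 | 68844 | 0 | 0.0 |
| | | CYW30739B2-P5-EVK-02 | unknown | 2040 | 2040 | 0 | 0.0 |
| | | | FLASH | 634241 | 634241 | 0 | 0.0 |
| | | | RAM | 71476 | 71476 | 0 | 0.0 |
| | | CYW30739B2-P5-EVK-03 | unknown | 2040 | 2040 | 0 | 0.0 |
| | | | FLASH | 634241 | 634241 | 0 | 0.0 |
| | | | RAM | 71476 | 71476 | 0 | 0.0 |
| efr32 | lock-app | BRD4187C | FLASH | 932676 | 932676 | 0 | 0.0 |
| | | | RAM | 160228 | 160228 | 0 | 0.0 |
| | | BRD4338a | FLASH | 747160 | 747152 | -8 | -0.0 |
| | | | RAM | 233356 | 233356 | 0 | 0.0 |
| | window-app | BRD4187C | FLASH | 1025592 | 1025592 | 0 | 0.0 |
| | | | RAM | 128332 | 128332 | 0 | 0.0 |
| esp32 | all-clusters-app | c3devkit | DRAM | 95352 | 95352 | 0 | 0.0 |
| | | | FLASH | 1541956 | 1541956 | 0 | 0.0 |
| | | | IRAM | 82552 | 82552 | 0 | 0.0 |
| | | m5stack | DRAM | 116332 | 116332 | 0 | 0.0 |
| | | | FLASH | 1548154 | 1548154 | 0 | 0.0 |
| | | | IRAM | 117039 | 117039 | 0 | 0.0 |
| linux | air-purifier-app | debug | unknown | 4752 | 4752 | 0 | 0.0 |
| | | | FLASH | 2730137 | 2730137 | 0 | 0.0 |
| | | | RAM | 133096 | 133096 | 0 | 0.0 |
| | all-clusters-app | debug | unknown | 5560 | 5560 | 0 | 0.0 |
| | | | FLASH | 6018726 | 6018726 | 0 | 0.0 |
| | | | RAM | 523992 | 523992 | 0 | 0.0 |
| | all-clusters-minimal-app | debug | unknown | 5456 | 5456 | 0 | 0.0 |
| | | | FLASH | 5355204 | 5355204 | 0 | 0.0 |
| | | | RAM | 243008 | 243008 | 0 | 0.0 |
| | bridge-app | debug | unknown | 5472 | 5472 | 0 | 0.0 |
| | | | FLASH | 4703618 | 4703618 | 0 | 0.0 |
| | | | RAM | 221760 | 221760 | 0 | 0.0 |
| | chip-tool | debug | unknown | 5992 | 5992 | 0 | 0.0 |
| | | | FLASH | 12866788 | 12866788 | 0 | 0.0 |
| | | | RAM | 582586 | 582586 | 0 | 0.0 |
| | chip-tool-ipv6only | arm64 | unknown | 21400 | 21400 | 0 | 0.0 |
| | | | FLASH | 10995728 | 10995728 | 0 | 0.0 |
| | | | RAM | 633584 | 633584 | 0 | 0.0 |
| | fabric-admin | debug | unknown | 5816 | 5816 | 0 | 0.0 |
| | | | FLASH | 11272273 | 11272273 | 0 | 0.0 |
| | | | RAM | 582930 | 582930 | 0 | 0.0 |
| | fabric-bridge-app | debug | unknown | 4728 | 4728 | 0 | 0.0 |
| | | | FLASH | 4528852 | 4528852 | 0 | 0.0 |
| | | | RAM | 208880 | 208880 | 0 | 0.0 |
| | fabric-sync | debug | unknown | 4968 | 4968 | 0 | 0.0 |
| | | | FLASH | 5639429 | 5639429 | 0 | 0.0 |
| | | | RAM | 475880 | 475880 | 0 | 0.0 |
| | lighting-app | debug+rpc+ui | unknown | 6136 | 6136 | 0 | 0.0 |
| | | | FLASH | 5639409 | 5639409 | 0 | 0.0 |
| | | | RAM | 232008 | 232008 | 0 | 0.0 |
| | lock-app | debug | unknown | 5408 | 5408 | 0 | 0.0 |
| | | | FLASH | 4751986 | 4751986 | 0 | 0.0 |
| | | | RAM | 208008 | 208008 | 0 | 0.0 |
| | ota-provider-app | debug | unknown | 4768 | 4768 | 0 | 0.0 |
| | | | FLASH | 4378612 | 4378612 | 0 | 0.0 |
| | | | RAM | 201696 | 201696 | 0 | 0.0 |
| | ota-requestor-app | debug | unknown | 4720 | 4720 | 0 | 0.0 |
| | | | FLASH | 4517520 | 4517520 | 0 | 0.0 |
| | | | RAM | 206280 | 206280 | 0 | 0.0 |
| | shell | debug | unknown | 4248 | 4248 | 0 | 0.0 |
| | | | FLASH | 3036685 | 3036685 | 0 | 0.0 |
| | | | RAM | 160736 | 160736 | 0 | 0.0 |
| | thermostat-no-ble | arm64 | unknown | 9584 | 9584 | 0 | 0.0 |
| | | | FLASH | 4118968 | 4118968 | 0 | 0.0 |
| | | | RAM | 246296 | 246296 | 0 | 0.0 |
| | tv-app | debug | unknown | 5736 | 5736 | 0 | 0.0 |
| | | | FLASH | 5988693 | 5988693 | 0 | 0.0 |
| | | | RAM | 599312 | 599312 | 0 | 0.0 |
| | tv-casting-app | debug | unknown | 5320 | 5320 | 0 | 0.0 |
| | | | FLASH | 11092685 | 11092685 | 0 | 0.0 |
| | | | RAM | 695496 | 695496 | 0 | 0.0 |
| nrfconnect | all-clusters-app | nrf52840dk_nrf52840 | FLASH | 918100 | 918100 | 0 | 0.0 |
| | | | RAM | 143332 | 143332 | 0 | 0.0 |
| | | nrf7002dk_nrf5340_cpuapp | FLASH | 890592 | 890592 | 0 | 0.0 |
| | | | RAM | 141519 | 141519 | 0 | 0.0 |
| | all-clusters-minimal-app | nrf52840dk_nrf52840 | FLASH | 852164 | 852164 | 0 | 0.0 |
| | | | RAM | 142244 | 142244 | 0 | 0.0 |
| nxp | contact | k32w0+release | FLASH | 586048 | 586048 | 0 | 0.0 |
| | | | RAM | 71112 | 71112 | 0 | 0.0 |
| | | mcxw71+release | FLASH | 601576 | 601576 | 0 | 0.0 |
| | | | RAM | 63328 | 63328 | 0 | 0.0 |
| | light | k32w0+release | FLASH | 612700 | 612700 | 0 | 0.0 |
| | | | RAM | 70504 | 70504 | 0 | 0.0 |
| | | k32w1+release | FLASH | 687320 | 687320 | 0 | 0.0 |
| | | | RAM | 48920 | 48920 | 0 | 0.0 |
| | lock | mcxw71+release | FLASH | 763656 | 763656 | 0 | 0.0 |
| | | | RAM | 70956 | 70956 | 0 | 0.0 |
| psoc6 | all-clusters | cy8ckit_062s2_43012 | FLASH | 1647500 | 1647500 | 0 | 0.0 |
| | | | RAM | 212128 | 212128 | 0 | 0.0 |
| | all-clusters-minimal | cy8ckit_062s2_43012 | FLASH | 1555132 | 1555132 | 0 | 0.0 |
| | | | RAM | 208944 | 208944 | 0 | 0.0 |
| | light | cy8ckit_062s2_43012 | FLASH | 1470236 | 1470236 | 0 | 0.0 |
| | | | RAM | 200912 | 200912 | 0 | 0.0 |
| | lock | cy8ckit_062s2_43012 | FLASH | 1467956 | 1467956 | 0 | 0.0 |
| | | | RAM | 225272 | 225272 | 0 | 0.0 |
| qpg | lighting-app | qpg6105+debug | FLASH | 664328 | 664328 | 0 | 0.0 |
| | | | RAM | 105456 | 105456 | 0 | 0.0 |
| | lock-app | qpg6105+debug | FLASH | 622156 | 622156 | 0 | 0.0 |
| | | | RAM | 99908 | 99908 | 0 | 0.0 |
| stm32 | light | STM32WB5MM-DK | FLASH | 485080 | 485080 | 0 | 0.0 |
| | | | RAM | 144912 | 144912 | 0 | 0.0 |
| telink | bridge-app | tlsr9258a | FLASH | 683634 | 683634 | 0 | 0.0 |
| | | | RAM | 91248 | 91248 | 0 | 0.0 |
| | contact-sensor-app | tlsr9528a_retention | FLASH | 623874 | 623874 | 0 | 0.0 |
| | | | RAM | 31488 | 31488 | 0 | 0.0 |
| | light-app-ota-compress-lzma-shell-factory-data | tl3218x | FLASH | 772708 | 772708 | 0 | 0.0 |
| | | | RAM | 49348 | 49348 | 0 | 0.0 |
| | light-app-ota-shell-factory-data | tl7218x | FLASH | 777324 | 777324 | 0 | 0.0 |
| | | | RAM | 99812 | 99812 | 0 | 0.0 |
| | light-switch-app-ota-compress-lzma-shell-factory-data | tlsr9528a | FLASH | 711316 | 711316 | 0 | 0.0 |
| | | | RAM | 73544 | 73544 | 0 | 0.0 |
| | lighting-app-ota-factory-data | tlsr9118bdk40d | FLASH | 628320 | 628320 | 0 | 0.0 |
| | | | RAM | 142180 | 142180 | 0 | 0.0 |
| | lighting-app-ota-rpc-factory-data-4mb | tlsr9518adk80d | FLASH | 814334 | 814334 | 0 | 0.0 |
| | | | RAM | 99724 | 99724 | 0 | 0.0 |
| tizen | all-clusters-app | arm | unknown | 5160 | 5160 | 0 | 0.0 |
| | | | FLASH | 1780980 | 1780952 | -28 | -0.0 |
| | | | RAM | 93684 | 93684 | 0 | 0.0 |
| | chip-tool-ubsan | arm | unknown | 10844 | 10844 | 0 | 0.0 |
| | | | FLASH | 17999086 | 17999190 | 104 | 0.0 |
| | | | RAM | 7855832 | 7855912 | 80 | 0.0 |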

Contributor

@andy31415 andy31415 left a comment

@hyunuktak the "Tested locally with TestDnssd test on Tizen." note is too brief for manual testing. We are actively asking for detailed manual-testing descriptions, both because the steps are not obvious to anyone unfamiliar with the platform and because the extra effort of writing them up encourages writing automated tests where possible.

Please describe how you tested: what commands you ran, in what environment, and what you observed. Also explain why adding automated tests for this is not possible (probably because it is platform specific ... however, I believe Tizen also has some QEMU setup ... is that usable?).

@hyunuktak
Contributor Author

Samsung's SmartThings and similar apps and services use Matter to set up a hub environment. When several Things are connected and the Hub is restarted or rebooted, the Hub runs the dnssd resolve process to rediscover the existing Things. Because the resolve process runs asynchronously for many Things at once, we confirmed that timing issues cause the threads to conflict over lock/unlock operations. The sequence is roughly as follows:
1. Connect two or more Matter Things.
2. Restart the Matter Hub.
3. The dnssd resolve process is called multiple times in parallel.
4. Timing issues appear in the multi-thread lock/unlock handling of the asynchronous resolve processes.

Whether this can be automated still needs to be verified.
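
On the automation question, one possible shape for a regression test is to issue many resolve calls at once and fail if they do not all return within a deadline. This is a minimal sketch under stated assumptions, not an existing Matter or Tizen test: `AllCallsFinishInTime` is a hypothetical helper, and the `resolve` callable stands in for whatever entry point triggers the dnssd resolve.

```cpp
#include <chrono>
#include <condition_variable>
#include <functional>
#include <memory>
#include <mutex>
#include <thread>

// Runs `resolve` from `callers` threads at once and reports whether every call
// returned before `timeout` expired. A false result suggests a hang.
bool AllCallsFinishInTime(const std::function<void(int)> & resolve, int callers,
                          std::chrono::milliseconds timeout)
{
    struct State
    {
        std::mutex mutex;
        std::condition_variable cond;
        int finished = 0;
    };
    // Shared ownership keeps the state alive for threads that outlive this call.
    auto state = std::make_shared<State>();

    for (int i = 0; i < callers; ++i)
    {
        // Detached so that a hung call cannot block the test itself.
        std::thread([state, resolve, i] {
            resolve(i);
            {
                std::lock_guard<std::mutex> guard(state->mutex);
                ++state->finished;
            }
            state->cond.notify_one();
        }).detach();
    }

    std::unique_lock<std::mutex> lock(state->mutex);
    return state->cond.wait_for(lock, timeout,
                                [&] { return state->finished == callers; });
}
```

A timeout-based check like this can only flag a suspected deadlock; it cannot prove one, and the deadline has to be generous enough for slow CI runners such as an emulated target.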

@arkq
Contributor

arkq commented Jan 13, 2025

@hyunuktak This PR causes a segfault in the Tizen integration test. Please take a look: https://github.com/project-chip/connectedhomeip/actions/runs/12738786647/job/35501600912?pr=37003#step:5:1996

```diff
-    chip::DeviceLayer::StackLock lock;
-    rCtx->Finalize(CHIP_NO_ERROR);
-}
+rCtx->Finalize(CHIP_NO_ERROR);
```
Contributor

Could we get a description in the summary? You updated the testing section, but the description is still vague: "In multi-thread environment, Tizen dnssd's resolve operation process may cause deadlock."

It seems the deadlock is probably caused by the stack lock already being acquired. Why is that? Why is the comment that was here before (saying this must be locked) no longer applicable? Should we have an `assertChipStackLockedByCurrentThread()` here instead, since we assume the stack is already locked? There should be a comment explaining why we know this is already called in the CHIP stack context.

Contributor Author

The stack lock was probably taken here for safety inside the callback function. However, if the app or service (e.g., SmartThings) already holds the stack lock when it makes the call, we found that taking the lock a second time can deadlock during the asynchronous process.
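
The trade-off discussed in this thread (re-acquire the lock in the callback versus verify that the caller already holds it) can be illustrated with a small standalone sketch. It does not use the Matter SDK: `TrackedLock` and `FinalizeIfLocked` are hypothetical stand-ins for the stack lock and for the callback body, and the sketch only assumes that the real lock is not safe to take twice from the same thread.

```cpp
#include <atomic>
#include <cassert>
#include <mutex>
#include <thread>

// Hypothetical stand-in for a stack lock that remembers which thread owns it.
class TrackedLock
{
public:
    void Lock()
    {
        mMutex.lock();
        mOwner.store(std::this_thread::get_id());
    }

    void Unlock()
    {
        mOwner.store(std::thread::id()); // default id means "no owner"
        mMutex.unlock();
    }

    bool IsHeldByCurrentThread() const
    {
        return mOwner.load() == std::this_thread::get_id();
    }

private:
    std::mutex mMutex; // non-recursive: locking it twice on one thread is an error
    std::atomic<std::thread::id> mOwner{ std::thread::id() };
};

// Hypothetical callback body: it relies on the caller holding the lock and
// checks that precondition instead of locking a second time.
bool FinalizeIfLocked(const TrackedLock & lock, int & finalizedCount)
{
    if (!lock.IsHeldByCurrentThread())
        return false; // real code would more likely assert or abort here
    ++finalizedCount;
    return true;
}
```

Checking ownership documents the locking contract at the call site and fails loudly when a caller violates it, whereas silently dropping the lock leaves the callback unprotected if it is ever invoked from another thread.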

4 participants