NXP LPSPI: Convert to native zephyr driver, remove MCUX shim #85010

decsny · 2025-02-01T05:44:44Z

Convert last pieces of the LPSPI driver completely to a native Zephyr driver since mostly it already is.

Main commit message:

Remove the shim to the SDK driver, and only use direct register writes in the zephyr driver instead. This will save flash space.

Remove mcux branding from all code.

While doing this part of the rewrite, I discovered also some issues that I went ahead and fixed:
- The driver was hardcoded to always only use PCS 0 no matter what the user provided spi_config set slave to.
- Active high CS configuration was not respected.

Fixes an issue where SDK driver performed an althgorithm on 2,048 possible clock configurations before every transfer.
This was replaced with a binary search in the native version.
And make it so that this configuration does not need to happen every transfer, if the same spi_cfg is used, to improve performance.
Various optimizations also due to the fact that the use case is more clear than the SDK was written for, so we can be more straightforward and simpler.

Also, I have calculated the difference in the code size of the LPSPI driver in 3.7.0 (including HAL + shim driver) vs the version in this PR after all the recent changes, and have the following results:

Code Size Difference

Zephyr 3.7.0 LPSPI:
- 2.793 KB from HAL driver itself
- 1.488 KB from Zephyr shim driver
- 188 bytes from common Zephyr SPI code
= Total 4.465 KB (4,572 bytes)

Latest Changes + This PR (Hopefully Zephyr 4.1.0 LPSPI):
- 1.352 KB from CPU-based LPSPI specific driver (spi_nxp_lpspi.c)
- 1.230 KB from common LPSPI driver code (spi_nxp_lpspi_common.c)
- 190 bytes from common Zephyr SPI code
= Total 2.768 KB (2,834 bytes)

Tested on MCXN, pending testing on RT and MCXW

Depends on #84931
Depends on zephyrproject-rtos/hal_nxp#504
Depends on hardware testing for MCXW, and RT

zephyrbot · 2025-02-01T05:45:43Z

The following west manifest projects have changed revision in this Pull Request:

Name	Old Revision	New Revision	Diff
hal_nxp	zephyrproject-rtos/hal_nxp@`49ff7e3` (`master`)	zephyrproject-rtos/hal_nxp#504	zephyrproject-rtos/hal_nxp#504/files

⛔ DNM label due to: 1 project with PR revision

Note: This message is automatically posted and updated by the Manifest GitHub Action.

butok · 2025-02-03T08:05:23Z

Is it possible to add improvements to the original MCUX driver and avoid adding and maintaining new Zephyr specific drivers?

decsny · 2025-02-03T13:55:27Z

Is it possible to add improvements to the original MCUX driver and avoid adding and maintaining new Zephyr specific drivers?

It was internally requested for this driver to enhance the API for this SDK driver 5 years ago to be compatible with zephyr and rejected by SDK team due to having to rewrite the entire fsl_lpspi to achieve it basically. So why write a new fsl_lpspi specifically designed for the purpose of working with zephyr and then shim to it, when you can just write it properly in zephyr?

In my experience, the shim drivers cause the most maintainance overhead by like 10 fold, specifically the ones with heavy abstraction. Having zephyr native is pretty much always simpler to maintain, that's the main advantage. Not to mention removing coupling and dependency is always a positive thing in software engineering practice. This coupling can cause configuration management problems and "spaghetti" code due to needing to design an implementation around an existing API and having to pull from upstream projects.

Another difference between zephyr driver and shim driver, is when a user finds a problem, for the shim driver, we get bug reports. Because the HAL is much harder to deal with for them. For the zephyr native, they just make a PR.

I also think that as shown by the numbers in my original post, zephyr native always reduces code size by a mega amount, in this case we have more than halved throughout the last 2 months by this conversion. That's normal whenever this happens. The new zephyr driver is smaller than the Hal was alone. Partly maybe due to the fact that it can be tailored to the zephyr specific driver implementation rather than as general as possible like the HAL needs to achieve.

So it is generally desirable not to have shim drivers. The choice to use shim drivers was an early zephyr project choice when there was only a small amount of contributors and they needed to move fast to make the project viable with quick hardware support. Now at this point it's just not desirable to have these dependencies for those reasons I said above.

Raymond0225 · 2025-02-03T18:03:42Z

I spotted that a lot of modifications are from "CONFIG_SPI_MCUX_LPSPI" to "CONFIG_SPI_NXP_LPSPI".
why we need to rename it?

ehughes · 2025-02-03T18:11:13Z

@decsny FYI. I am tracking an obscure bug with the shim based driver for LPSPI on the MCXA156. Even though the LPSPI driver passes the test, I was able to get it to fail in interesting ways. I am tagging you as I saw you submitted the updated to the FRDM-MCXA156. I can try this driver.

Issue : Using the LPSPI driver on MCXA156 with the generic ""jedec,spi-nor" and a external flash memory only works with optimizations disabled.

Setup:

I have a FRDM-MCXA156 with a W25q64 SPI Flash memory wired to LPSPI1.

I am using the "SPI_FLASH" sample as a basis for the my test.

Note: As a controller, I also tested the same Flash chip wired to a FRDM-MCX947 (which also uses LPSPI) and a LPC55S69Xpresso (which has flexcomm). The spi_flash runs OK on both of them. I narrowed it down to some interaction w/ the MCXA156 / LPSPI

I created an overlay :

The driver works through boot and most of the test. It can read the the JEDEC flash ID, issue erase commands, etc. However, I noticed that writes would fail. After some debugging I observed this on my logic analyzer:

The specific transaction for the flash write had the chip select active >200mS, most of that idle. The transaction after that was a command to shut down the flash as the SPI transaction for the write returned a timeout from the driver.

I turned on logging and saw this message:

The flash write is successfull if all optimizations are disabled.

Looking through the LP SPI Flash driver, I found this:

If you follow the spi_transcieve_dt function:

I noticed when optimizations are on, This if statement never evaluates to true

With optimizations off, it will evaluate to true for the write transcations.

Even though this appears to be an issue with the SPI NOR driver for MCXA156. I could wire the same chip to other boards (MCXN947 and LPC55S69) and it worked with optimizations turned on.

This does appear to be timing related. If I turn on all the SPI Dbg Messages, the sample will work.

I noticed that MCXN driver isntance had a fifo size of 8 while MCXA had a size of 4. If I reduce this size, the MCXA156 version works

Values of 1,2,3 yield a successful test result with optimizations enabled. A value of 4 will result in the timeout error.

I am going to modify my west manifest to point to your new branch/driver and see if makes a difference.

decsny · 2025-02-03T18:19:02Z

I spotted that a lot of modifications are from "CONFIG_SPI_MCUX_LPSPI" to "CONFIG_SPI_NXP_LPSPI". why we need to rename it?

Anything in zephyr named mcux means it is a shim to the mcux ecosystem, and so this is no longer a shim we remove mcux .

decsny · 2025-02-03T18:21:21Z

@ehughes I also found issues with using LPSPI with the SPI nor driver, because I was trying to enable the QSPI flash on from mcxw71 board. So far the new driver doesn't have any difference from the shim version as far as trying to communicate to the flash, so I doubt if you'll find it better. But thanks for the information because I'm going to get back look into this too eventually. Can you create a GitHub issue with the information you have about it?

ehughes · 2025-02-03T18:23:29Z

@decsny Updated my manifest to point to nxp-upstream, commit 52f48c1

The spi_loopback test has a bus fault (as does the spi_flash sample) for the FRDM-MCXA156 board. I used Segger Ozone to trace:

ehughes · 2025-02-03T18:24:24Z

@ehughes I also found issues with using LPSPI with the SPI nor driver, because I was trying to enable the QSPI flash on from mcxw71 board, so far the new driver doesn't have any difference from the shim version as far as trying bti communicate to the flash, so I doubt if you'll find it better but thanks for the information because I'm going to get back look into this too eventually. Can you create a GitHub issue with the information you have about it?

Yes, I'll create a new issue with what I posted.

Raymond0225 · 2025-02-03T19:51:08Z

drivers/spi/spi_nxp_lpspi/spi_nxp_lpspi_common.c

+	uint8_t best_scaler = 0, scaler = 0;
+	uint32_t diff, min_diff = 0xFFFFFFFF;
+
+	while (scaler < 256 && min_diff != 0) {


scaler is uint8_t, it is definitely smaller than 256.

Raymond0225 · 2025-02-03T19:54:51Z

drivers/spi/spi_nxp_lpspi/spi_nxp_lpspi_common.c

+		uint32_t real_freq = 0;
+
+		/* maximum freq won't get better than what we got with previous prescaler */
+		if (clock_freq / (TWO_EXP(prescaler) * 2) < best_freq) {


How about to use a right shift operator? clock_freq >> prescaler

Raymond0225 · 2025-02-03T20:00:33Z

drivers/spi/spi_nxp_lpspi/spi_nxp_lpspi_common.c

+
+		while (div > 0) {
+			div = low + (high - low) / 2;
+			real_freq = (clock_freq / (TWO_EXP(prescaler) * (div + 2)));


(clock_freq >> prescaler) /(div + 2) is better?

Remove the shim to the SDK driver, and only use direct register writes in the zephyr driver instead. This will save flash space. Remove mcux branding from all code. While doing this part of the rewrite, I discovered also some issues that I went ahead and fixed: - The driver was hardcoded to always only use PCS 0 no matter what the user provided spi_config set slave to. - Active high CS configuration was not respected. Also fixes an issue where SDK driver performed an althgorithm on 2,048 possible clock configurations before every transfer. This was replaced with a binary search in the native version. And make it so that this configuration does not need to happen every transfer, if the same spi_cfg is used, to improve performance. Various optimizations also due to the fact that the use case is more clear than the SDK was written for, so we can be more straightforward and simpler. Signed-off-by: Declan Snyder <[email protected]>

ehughes · 2025-02-04T14:56:01Z

@ehughes I also found issues with using LPSPI with the SPI nor driver, because I was trying to enable the QSPI flash on from mcxw71 board. So far the new driver doesn't have any difference from the shim version as far as trying to communicate to the flash, so I doubt if you'll find it better. But thanks for the information because I'm going to get back look into this too eventually. Can you create a GitHub issue with the information you have about it?

@decsny FYI:

#85081

I did test w/ the W71 and was able to get the onboard SPI chip to work with te spi_flash sample. The onboard flash is It is similar to what I was testing with. (macronix vs winbond).

It did not have the timeout issue that was dependent on the compiler optimization setting or the tx/rx-fifo size. It appears to be the similar to the MCX N instances (which also worked).

However, I mention in that issue that default transfer-delay is problematic. It should probably default to a value that yeilds 50% SCLK duty cycle as it is a reasonable default for most SPI peripherals.

decsny · 2025-02-04T16:32:30Z

However, I mention in that issue that default transfer-delay is problematic. It should probably default to a value that yeilds 50% SCLK duty cycle as it is a reasonable default for most SPI peripherals.

I definitely agree with this, I was just talking to somebody about these properties on a separate issue (#69658 (comment)) and was thinking the same thing as you actually. Either to make a default that is not the minimum or to make it a required property. I am leaning towards the latter for the theoretical principle (these delays need to be known and decided by board designer). But in a practical sense maybe leaning toward the former, because for the upstream EVK enablement, maybe we don't know how the SPI will be used on many of the boards and not sure what a sensible default is anyways, so would be easier to just update the binding default in one place.

decsny · 2025-02-04T16:33:00Z

Rebased the PR since dependent one merged, will address more comments soon, still doing testing

ehughes · 2025-02-04T16:45:56Z

However, I mention in that issue that default transfer-delay is problematic. It should probably default to a value that yeilds 50% SCLK duty cycle as it is a reasonable default for most SPI peripherals.

I definitely agree with this, I was just talking to somebody about these properties on a separate issue (#69658 (comment)) and was thinking the same thing as you actually. Either to make a default that is not the minimum or to make it a required property. I am leaning towards the latter for the theoretical principle (these delays need to be known and decided by board designer). But in a practical sense maybe leaning toward the former, because for the upstream EVK enablement, maybe we don't know how the SPI will be used on many of the boards and not sure what a sensible default is anyways, so would be easier to just update the binding default in one place.

This is a feature particular to the NXP LPSPI IP. From a generic SPI /Driver / Interface point of view, I think the expectation from a Zephyr user will that a stream of contiguous bytes should have 50% as a API default. In my experience I have never need the the timing as showing in the picture (a gap between bytes that is less than the 1/2 the period of the SPI frequency). If there are special adjustments, always has been to make the gap between bytes wider for devices that need more time in between bytes.

I think you will generally always pass a loopback test regardless, but the signals are essentially synchronous in the test case. However, I think the real world implications here bad (i.e. it was broken for a very simple SPI flash device on the FRDM-W71)

From a board level timing / slack / setup/hold POV, analysis is always done assuming 50%. Then you apply allowed tolerances form specific devices. Many devices don't really care as long as setup/hold isn't violated, but the current approach is the least compatible with anything that could be in the wild.

Feel that is OK to be optional as someone who wants this feature would be looking at NXP specific docs (vs generic SPI API). It just needs to have a default that yields timings that are generic.

The bare metal MCUX driver example start use a 50% duty as a default.

decsny · 2025-02-04T17:06:01Z

However, I mention in that issue that default transfer-delay is problematic. It should probably default to a value that yeilds 50% SCLK duty cycle as it is a reasonable default for most SPI peripherals.

I definitely agree with this, I was just talking to somebody about these properties on a separate issue (#69658 (comment)) and was thinking the same thing as you actually. Either to make a default that is not the minimum or to make it a required property. I am leaning towards the latter for the theoretical principle (these delays need to be known and decided by board designer). But in a practical sense maybe leaning toward the former, because for the upstream EVK enablement, maybe we don't know how the SPI will be used on many of the boards and not sure what a sensible default is anyways, so would be easier to just update the binding default in one place.

This is a feature particular to the NXP LPSPI IP. From a generic SPI /Driver / Interface point of view, I think the expectation from a Zephyr user will that a stream of contiguous bytes should have 50% as a API default. In my experience I have never need the the timing as showing in the picture (a gap between bytes that is less than the 1/2 the period of the SPI frequency). If there are special adjustments, always has been to make the gap between bytes wider for devices that need more time in between bytes.

I think you will generally always pass a loopback test regardless, but the signals are essentially synchronous in the test case. However, I think the real world implications here bad (i.e. it was broken for a very simple SPI flash device on the FRDM-W71)

From a board level timing / slack / setup/hold POV, analysis is always done assuming 50%. Then you apply allowed tolerances form specific devices. Many devices don't really care as long as setup/hold isn't violated, but the current approach is the least compatible with anything that could be in the wild.

Feel that is OK to be optional as someone who wants this feature would be looking at NXP specific docs (vs generic SPI API). It just needs to have a default that yields timings that are generic.

The bare metal MCUX driver example start use a 50% duty as a default.

So it sounds like what you are suggesting is actually to programmatically decide the default instead of from DT. This I didn't consider and seems like a good idea to me. So what I could do as part of this PR is to remove the default value from DT, keep the property unrequired, and then if it is not present then to programmatically determine what is 50%, otherwise use DT value. Does this sound like a reasonable approach to you?

ehughes · 2025-02-04T17:40:13Z

However, I mention in that issue that default transfer-delay is problematic. It should probably default to a value that yeilds 50% SCLK duty cycle as it is a reasonable default for most SPI peripherals.

I definitely agree with this, I was just talking to somebody about these properties on a separate issue (#69658 (comment)) and was thinking the same thing as you actually. Either to make a default that is not the minimum or to make it a required property. I am leaning towards the latter for the theoretical principle (these delays need to be known and decided by board designer). But in a practical sense maybe leaning toward the former, because for the upstream EVK enablement, maybe we don't know how the SPI will be used on many of the boards and not sure what a sensible default is anyways, so would be easier to just update the binding default in one place.

This is a feature particular to the NXP LPSPI IP. From a generic SPI /Driver / Interface point of view, I think the expectation from a Zephyr user will that a stream of contiguous bytes should have 50% as a API default. In my experience I have never need the the timing as showing in the picture (a gap between bytes that is less than the 1/2 the period of the SPI frequency). If there are special adjustments, always has been to make the gap between bytes wider for devices that need more time in between bytes.
I think you will generally always pass a loopback test regardless, but the signals are essentially synchronous in the test case. However, I think the real world implications here bad (i.e. it was broken for a very simple SPI flash device on the FRDM-W71)
From a board level timing / slack / setup/hold POV, analysis is always done assuming 50%. Then you apply allowed tolerances form specific devices. Many devices don't really care as long as setup/hold isn't violated, but the current approach is the least compatible with anything that could be in the wild.
Feel that is OK to be optional as someone who wants this feature would be looking at NXP specific docs (vs generic SPI API). It just needs to have a default that yields timings that are generic.
The bare metal MCUX driver example start use a 50% duty as a default.

So it sounds like what you are suggesting is actually to programmatically decide the default instead of from DT. This I didn't consider and seems like a good idea to me. So what I could do as part of this PR is to remove the default value from DT, keep the property unrequired, and then if it is not present then to programmatically determine what is 50%, otherwise use DT value. Does this sound like a reasonable approach to you?

Yep, I think copying what the bare metal sample code does is a good idea as it has been test a lot and yields good values. If the there is a value in the DT / Overlay, then use that, else use the computation.

decsny · 2025-02-18T15:43:00Z

This PR was kind of poorly made, because there ended up being way too much in one commit, I'm gonna put DNM and spend the time to refactor the commit history

Raymond0225 · 2025-02-18T16:40:31Z

I think this file should also be updated:
\zephyr\tests\drivers\spi\spi_loopback\boards\mimxrt700_evk_mimxrt798s_cm33_cpu0.conf

ehughes · 2025-02-19T12:01:06Z

@decsny Let me know when it is good to retest this.

decsny · 2025-02-19T13:52:20Z

@decsny Let me know when it is good to retest this.

Okay, it's delayed for a while until after release , trying to stabilize things as they are for LPSPI first

decsny added the block: HW Test Testing on hardware required before merging label Feb 1, 2025

zephyrbot added platform: NXP NXP platform: NXP Drivers NXP Semiconductors, drivers area: Clock Control area: SPI SPI bus labels Feb 1, 2025

zephyrbot requested review from butok, dbaluta, DerekSnell, dleach02, EmilioCBen, manuargue, mmahadevan108, nordic-krch, Raymond0225, tbursztyka and teburd February 1, 2025 05:45

zephyrbot assigned tbursztyka Feb 1, 2025

zephyrbot added manifest manifest-hal_nxp DNM This PR should not be merged (Do Not Merge) labels Feb 1, 2025

decsny force-pushed the feature/lpspi_enhancements_4 branch from b25696f to eb1c476 Compare February 1, 2025 05:50

decsny requested a review from henrikbrixandersen February 1, 2025 05:56

decsny force-pushed the feature/lpspi_enhancements_4 branch from eb1c476 to 52f48c1 Compare February 1, 2025 06:00

decsny added this to the v4.1.0 milestone Feb 3, 2025

Raymond0225 reviewed Feb 3, 2025

View reviewed changes

decsny force-pushed the feature/lpspi_enhancements_4 branch from 52f48c1 to 75e9544 Compare February 4, 2025 13:48

fabiobaltieri added DNM (manifest) This PR should not be merged (controlled by action-manifest) and removed DNM This PR should not be merged (Do Not Merge) labels Feb 4, 2025

decsny modified the milestones: v4.1.0, v4.2.0 Feb 11, 2025

decsny added the DNM This PR should not be merged (Do Not Merge) label Feb 18, 2025

decsny marked this pull request as draft February 18, 2025 17:08

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

NXP LPSPI: Convert to native zephyr driver, remove MCUX shim #85010

NXP LPSPI: Convert to native zephyr driver, remove MCUX shim #85010

decsny commented Feb 1, 2025 •

edited

Loading

zephyrbot commented Feb 1, 2025 •

edited

Loading

butok commented Feb 3, 2025

decsny commented Feb 3, 2025 •

edited

Loading

Raymond0225 commented Feb 3, 2025

ehughes commented Feb 3, 2025

decsny commented Feb 3, 2025

decsny commented Feb 3, 2025 •

edited

Loading

ehughes commented Feb 3, 2025

ehughes commented Feb 3, 2025

Raymond0225 Feb 3, 2025

Raymond0225 Feb 3, 2025

Raymond0225 Feb 3, 2025

ehughes commented Feb 4, 2025

decsny commented Feb 4, 2025

decsny commented Feb 4, 2025

ehughes commented Feb 4, 2025 •

edited

Loading

decsny commented Feb 4, 2025

ehughes commented Feb 4, 2025

decsny commented Feb 18, 2025

Raymond0225 commented Feb 18, 2025

ehughes commented Feb 19, 2025

decsny commented Feb 19, 2025

NXP LPSPI: Convert to native zephyr driver, remove MCUX shim #85010

Are you sure you want to change the base?

NXP LPSPI: Convert to native zephyr driver, remove MCUX shim #85010

Conversation

decsny commented Feb 1, 2025 • edited Loading

Code Size Difference

zephyrbot commented Feb 1, 2025 • edited Loading

butok commented Feb 3, 2025

decsny commented Feb 3, 2025 • edited Loading

Raymond0225 commented Feb 3, 2025

ehughes commented Feb 3, 2025

decsny commented Feb 3, 2025

decsny commented Feb 3, 2025 • edited Loading

ehughes commented Feb 3, 2025

ehughes commented Feb 3, 2025

Raymond0225 Feb 3, 2025

Choose a reason for hiding this comment

Raymond0225 Feb 3, 2025

Choose a reason for hiding this comment

Raymond0225 Feb 3, 2025

Choose a reason for hiding this comment

ehughes commented Feb 4, 2025

decsny commented Feb 4, 2025

decsny commented Feb 4, 2025

ehughes commented Feb 4, 2025 • edited Loading

decsny commented Feb 4, 2025

ehughes commented Feb 4, 2025

decsny commented Feb 18, 2025

Raymond0225 commented Feb 18, 2025

ehughes commented Feb 19, 2025

decsny commented Feb 19, 2025

decsny commented Feb 1, 2025 •

edited

Loading

zephyrbot commented Feb 1, 2025 •

edited

Loading

decsny commented Feb 3, 2025 •

edited

Loading

decsny commented Feb 3, 2025 •

edited

Loading

ehughes commented Feb 4, 2025 •

edited

Loading