-
Notifications
You must be signed in to change notification settings - Fork 7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
NXP LPSPI: Convert to native zephyr driver, remove MCUX shim #85010
base: main
Are you sure you want to change the base?
NXP LPSPI: Convert to native zephyr driver, remove MCUX shim #85010
Conversation
The following west manifest projects have changed revision in this Pull Request:
⛔ DNM label due to: 1 project with PR revision Note: This message is automatically posted and updated by the Manifest GitHub Action. |
b25696f
to
eb1c476
Compare
eb1c476
to
52f48c1
Compare
Is it possible to add improvements to the original MCUX driver and avoid adding and maintaining new Zephyr specific drivers? |
It was internally requested for this driver to enhance the API for this SDK driver 5 years ago to be compatible with zephyr and rejected by SDK team due to having to rewrite the entire fsl_lpspi to achieve it basically. So why write a new fsl_lpspi specifically designed for the purpose of working with zephyr and then shim to it, when you can just write it properly in zephyr? In my experience, the shim drivers cause the most maintainance overhead by like 10 fold, specifically the ones with heavy abstraction. Having zephyr native is pretty much always simpler to maintain, that's the main advantage. Not to mention removing coupling and dependency is always a positive thing in software engineering practice. This coupling can cause configuration management problems and "spaghetti" code due to needing to design an implementation around an existing API and having to pull from upstream projects. Another difference between zephyr driver and shim driver, is when a user finds a problem, for the shim driver, we get bug reports. Because the HAL is much harder to deal with for them. For the zephyr native, they just make a PR. I also think that as shown by the numbers in my original post, zephyr native always reduces code size by a mega amount, in this case we have more than halved throughout the last 2 months by this conversion. That's normal whenever this happens. The new zephyr driver is smaller than the Hal was alone. Partly maybe due to the fact that it can be tailored to the zephyr specific driver implementation rather than as general as possible like the HAL needs to achieve. So it is generally desirable not to have shim drivers. The choice to use shim drivers was an early zephyr project choice when there was only a small amount of contributors and they needed to move fast to make the project viable with quick hardware support. Now at this point it's just not desirable to have these dependencies for those reasons I said above. |
I spotted that a lot of modifications are from "CONFIG_SPI_MCUX_LPSPI" to "CONFIG_SPI_NXP_LPSPI". |
@decsny FYI. I am tracking an obscure bug with the shim based driver for LPSPI on the MCXA156. Even though the LPSPI driver passes the test, I was able to get it to fail in interesting ways. I am tagging you as I saw you submitted the updated to the FRDM-MCXA156. I can try this driver. Issue : Using the LPSPI driver on MCXA156 with the generic ""jedec,spi-nor" and a external flash memory only works with optimizations disabled. Setup: I have a FRDM-MCXA156 with a W25q64 SPI Flash memory wired to LPSPI1. I am using the "SPI_FLASH" sample as a basis for the my test. Note: As a controller, I also tested the same Flash chip wired to a FRDM-MCX947 (which also uses LPSPI) and a LPC55S69Xpresso (which has flexcomm). The spi_flash runs OK on both of them. I narrowed it down to some interaction w/ the MCXA156 / LPSPI I created an overlay : The driver works through boot and most of the test. It can read the the JEDEC flash ID, issue erase commands, etc. However, I noticed that writes would fail. After some debugging I observed this on my logic analyzer: The specific transaction for the flash write had the chip select active >200mS, most of that idle. The transaction after that was a command to shut down the flash as the SPI transaction for the write returned a timeout from the driver. I turned on logging and saw this message: The flash write is successfull if all optimizations are disabled. Looking through the LP SPI Flash driver, I found this: If you follow the spi_transcieve_dt function: I noticed when optimizations are on, This if statement never evaluates to true With optimizations off, it will evaluate to true for the write transcations. Even though this appears to be an issue with the SPI NOR driver for MCXA156. I could wire the same chip to other boards (MCXN947 and LPC55S69) and it worked with optimizations turned on. This does appear to be timing related. If I turn on all the SPI Dbg Messages, the sample will work. I noticed that MCXN driver isntance had a fifo size of 8 while MCXA had a size of 4. If I reduce this size, the MCXA156 version works Values of 1,2,3 yield a successful test result with optimizations enabled. A value of 4 will result in the timeout error. I am going to modify my west manifest to point to your new branch/driver and see if makes a difference. |
Anything in zephyr named mcux means it is a shim to the mcux ecosystem, and so this is no longer a shim we remove mcux . |
@ehughes I also found issues with using LPSPI with the SPI nor driver, because I was trying to enable the QSPI flash on from mcxw71 board. So far the new driver doesn't have any difference from the shim version as far as trying to communicate to the flash, so I doubt if you'll find it better. But thanks for the information because I'm going to get back look into this too eventually. Can you create a GitHub issue with the information you have about it? |
@decsny Updated my manifest to point to nxp-upstream, commit 52f48c1 The spi_loopback test has a bus fault (as does the spi_flash sample) for the FRDM-MCXA156 board. I used Segger Ozone to trace: |
Yes, I'll create a new issue with what I posted. |
uint8_t best_scaler = 0, scaler = 0; | ||
uint32_t diff, min_diff = 0xFFFFFFFF; | ||
|
||
while (scaler < 256 && min_diff != 0) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
scaler is uint8_t, it is definitely smaller than 256.
uint32_t real_freq = 0; | ||
|
||
/* maximum freq won't get better than what we got with previous prescaler */ | ||
if (clock_freq / (TWO_EXP(prescaler) * 2) < best_freq) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about to use a right shift operator? clock_freq >> prescaler
|
||
while (div > 0) { | ||
div = low + (high - low) / 2; | ||
real_freq = (clock_freq / (TWO_EXP(prescaler) * (div + 2))); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(clock_freq >> prescaler) /(div + 2) is better?
Remove the shim to the SDK driver, and only use direct register writes in the zephyr driver instead. This will save flash space. Remove mcux branding from all code. While doing this part of the rewrite, I discovered also some issues that I went ahead and fixed: - The driver was hardcoded to always only use PCS 0 no matter what the user provided spi_config set slave to. - Active high CS configuration was not respected. Also fixes an issue where SDK driver performed an althgorithm on 2,048 possible clock configurations before every transfer. This was replaced with a binary search in the native version. And make it so that this configuration does not need to happen every transfer, if the same spi_cfg is used, to improve performance. Various optimizations also due to the fact that the use case is more clear than the SDK was written for, so we can be more straightforward and simpler. Signed-off-by: Declan Snyder <[email protected]>
52f48c1
to
75e9544
Compare
@decsny FYI: I did test w/ the W71 and was able to get the onboard SPI chip to work with te spi_flash sample. The onboard flash is It is similar to what I was testing with. (macronix vs winbond). It did not have the timeout issue that was dependent on the compiler optimization setting or the tx/rx-fifo size. It appears to be the similar to the MCX N instances (which also worked). However, I mention in that issue that default transfer-delay is problematic. It should probably default to a value that yeilds 50% SCLK duty cycle as it is a reasonable default for most SPI peripherals. |
I definitely agree with this, I was just talking to somebody about these properties on a separate issue (#69658 (comment)) and was thinking the same thing as you actually. Either to make a default that is not the minimum or to make it a required property. I am leaning towards the latter for the theoretical principle (these delays need to be known and decided by board designer). But in a practical sense maybe leaning toward the former, because for the upstream EVK enablement, maybe we don't know how the SPI will be used on many of the boards and not sure what a sensible default is anyways, so would be easier to just update the binding default in one place. |
Rebased the PR since dependent one merged, will address more comments soon, still doing testing |
This is a feature particular to the NXP LPSPI IP. From a generic SPI /Driver / Interface point of view, I think the expectation from a Zephyr user will that a stream of contiguous bytes should have 50% as a API default. In my experience I have never need the the timing as showing in the picture (a gap between bytes that is less than the 1/2 the period of the SPI frequency). If there are special adjustments, always has been to make the gap between bytes wider for devices that need more time in between bytes. I think you will generally always pass a loopback test regardless, but the signals are essentially synchronous in the test case. However, I think the real world implications here bad (i.e. it was broken for a very simple SPI flash device on the FRDM-W71) From a board level timing / slack / setup/hold POV, analysis is always done assuming 50%. Then you apply allowed tolerances form specific devices. Many devices don't really care as long as setup/hold isn't violated, but the current approach is the least compatible with anything that could be in the wild. Feel that is OK to be optional as someone who wants this feature would be looking at NXP specific docs (vs generic SPI API). It just needs to have a default that yields timings that are generic. The bare metal MCUX driver example start use a 50% duty as a default. |
So it sounds like what you are suggesting is actually to programmatically decide the default instead of from DT. This I didn't consider and seems like a good idea to me. So what I could do as part of this PR is to remove the default value from DT, keep the property unrequired, and then if it is not present then to programmatically determine what is 50%, otherwise use DT value. Does this sound like a reasonable approach to you? |
Yep, I think copying what the bare metal sample code does is a good idea as it has been test a lot and yields good values. If the there is a value in the DT / Overlay, then use that, else use the computation. |
This PR was kind of poorly made, because there ended up being way too much in one commit, I'm gonna put DNM and spend the time to refactor the commit history |
I think this file should also be updated: |
@decsny Let me know when it is good to retest this. |
Okay, it's delayed for a while until after release , trying to stabilize things as they are for LPSPI first |
Convert last pieces of the LPSPI driver completely to a native Zephyr driver since mostly it already is.
Main commit message:
Also, I have calculated the difference in the code size of the LPSPI driver in 3.7.0 (including HAL + shim driver) vs the version in this PR after all the recent changes, and have the following results:
Code Size Difference
Tested on MCXN, pending testing on RT and MCXW
Depends on #84931
Depends on zephyrproject-rtos/hal_nxp#504
Depends on hardware testing for MCXW, and RT