Performance Optimization: Using TCM and Wram #33

Open
AngelTomkins opened this issue Feb 29, 2024 · 0 comments

The current code does not use any of the TCM (tightly coupled memory). This means that the instruction cache and data cache are overwritten more often, which leads to more reads from main RAM. Putting the most commonly run functions into the ITCM, and putting data that is used only by the ARM9 CPU into the DTCM, would be a fairly free performance improvement.
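As a minimal sketch of what this could look like, assuming the project builds against libnds or BlocksDS (which provide the `ITCM_CODE` and `DTCM_DATA` section attributes): the function `apply_gain` and the table `gain_table` are hypothetical stand-ins for a hot function and ARM9-only data, not names from this codebase.

```c
#include <stdint.h>
#include <stddef.h>

#ifdef ARM9
#include <nds.h>   /* libnds / BlocksDS: defines ITCM_CODE and DTCM_DATA */
#else
/* Host fallback so the sketch also compiles off-target. */
#define ITCM_CODE
#define DTCM_DATA
#endif

/* Hypothetical table that only the ARM9 touches, placed in DTCM. */
DTCM_DATA static int16_t gain_table[16] = {
    0, 16, 32, 48, 64, 80, 96, 112,
    128, 144, 160, 176, 192, 208, 224, 240
};

/* Hypothetical hot function, placed in ITCM so it never needs the
 * instruction cache. Sums the samples after scaling by gain/256. */
ITCM_CODE int32_t apply_gain(const int16_t *samples, size_t count, unsigned level)
{
    int32_t sum = 0;
    int16_t gain = gain_table[level & 15];

    for (size_t i = 0; i < count; i++)
        sum += (samples[i] * gain) >> 8;

    return sum;
}
```

Only the two attribute macros change where the code and data live; the call sites stay the same.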

I have experimented with this, and filling the ITCM with commonly run functions is not a night-and-day improvement, likely due to cache coherency. There should therefore be a method of finding the best functions, based on how commonly they are run and on whether a function is run again before the instruction cache is overwritten.
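One possible starting point for such a method, sketched under assumptions: given per-function call counts and code sizes (gathered by whatever profiling is available), rank by call count and greedily fill a byte budget. The `FuncProfile` struct and `pick_itcm_candidates` are hypothetical names. The ARM9 ITCM is 32 KiB, but part of it may already be used by the runtime, so the budget is a parameter. This sketch only covers the "how commonly run" half; it does not model whether a function would have stayed in the instruction cache anyway.

```c
#include <stdint.h>
#include <stddef.h>
#include <stdlib.h>

#define ITCM_TOTAL_BYTES (32u * 1024u)  /* full ARM9 ITCM; usable space may be smaller */

/* Hypothetical profile record for one function. */
typedef struct {
    const char *name;
    uint32_t size_bytes;  /* code size of the function */
    uint32_t calls;       /* measured call count */
    int in_itcm;          /* set to 1 if selected for ITCM */
} FuncProfile;

/* qsort comparator: most frequently called first. */
static int by_calls_desc(const void *a, const void *b)
{
    const FuncProfile *fa = a;
    const FuncProfile *fb = b;

    if (fa->calls < fb->calls) return 1;
    if (fa->calls > fb->calls) return -1;
    return 0;
}

/* Sorts funcs by call count and marks those that fit into the budget,
 * taking the most-called first. Returns the number of bytes used. */
uint32_t pick_itcm_candidates(FuncProfile *funcs, size_t n, uint32_t budget)
{
    uint32_t used = 0;

    qsort(funcs, n, sizeof *funcs, by_calls_desc);

    for (size_t i = 0; i < n; i++) {
        funcs[i].in_itcm = 0;
        if (funcs[i].size_bytes <= budget - used) {
            funcs[i].in_itcm = 1;
            used += funcs[i].size_bytes;
        }
    }
    return used;
}
```

A refinement would be to weight each function by an estimate of its instruction cache misses rather than raw calls, since a small function that is called in a tight loop probably stays cached and gains little from the ITCM.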

The DSi has 800KiB of WRAM. Putting data and functions there would mean faster read times and less stalled execution caused by the ARM7 reading audio data on the main bus. The comparison of read speeds from the WRAM and main RAM can be seen here. The current ARM9 instructions are about 800KiB in ARM mode, and around 600KiB in Thumb mode, which could fit into the WRAM. There is a basic implementation of WRAM in BlocksDS.
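To make the size argument concrete, here is a small sketch of the arithmetic using the approximate figures above. `wram_headroom` is a hypothetical helper, and the code sizes are the rough estimates from this issue, not measured values.

```c
#include <stdint.h>

#define KIB            1024u
#define DSI_WRAM_BYTES (800u * KIB)  /* total DSi WRAM, per the figure above */

/* Bytes of WRAM left over after placing code_bytes of code there.
 * A negative result means the code does not fit. */
int32_t wram_headroom(uint32_t code_bytes)
{
    return (int32_t)DSI_WRAM_BYTES - (int32_t)code_bytes;
}
```

With about 800KiB of ARM-mode code, `wram_headroom(800 * KIB)` is 0, so nothing is left for data. With about 600KiB of Thumb-mode code, `wram_headroom(600 * KIB)` leaves 200KiB.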
