Add support for NAM files #143
@christoph-hart
Thanks for the input. Adding both engines is definitely an option, but I would love to avoid pulling in the big, fat Eigen library when RTNeural is already in there with what looks to me like 95% of the required feature set.
Hi All! I think it should be possible to construct a NAM-style model using RTNeural's layers. If I remember correctly, NAM uses a "Temporal Convolutional Network", and I have implemented a couple of those in the past using RTNeural's layers, although there are sometimes variations between those types of networks. Here's an example of a "micro-TCN" implementation that we use as part of RTNeural's test suite.

Probably the best route forward would be to use that implementation as a starting point, add whatever might be missing from the NAM model architecture, and adapt the mechanism for loading model weights to match whatever format NAM models use to store their weights. I'd be happy to help with this process as my time allows.

That said, I'm not sure it would make sense to add support for NAM models directly to RTNeural, since I think it falls a little bit outside the scope of what RTNeural does. I do have some future plans for a sort of "model library" which would have example implementations of several neural network architectures that are commonly used in real-time audio (and maybe other real-time domains as well), and I think having NAM models as part of that library would be great. However, there are some other changes I want to make to RTNeural before starting on that, so it may be a while before I get there.
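To make the idea concrete, here is a minimal sketch of composing dilated Conv1D layers with RTNeural's compile-time API. This is not the micro-TCN from RTNeural's test suite; the layer sizes, kernel widths, and dilation rates below are placeholders, not NAM's architecture:

```cpp
#include <RTNeural/RTNeural.h>

// Illustrative TCN-style block: dilated 1D convolutions with nonlinearities,
// chained together with RTNeural's compile-time ModelT. All sizes, kernel
// widths, and dilation rates are placeholders.
using TCNBlock = RTNeural::ModelT<float, 1, 1,
    RTNeural::Conv1DT<float, 1, 8, 3, 1>,  // in=1, out=8, kernel=3, dilation=1
    RTNeural::TanhActivationT<float, 8>,
    RTNeural::Conv1DT<float, 8, 8, 3, 2>,  // kernel=3, dilation=2
    RTNeural::TanhActivationT<float, 8>,
    RTNeural::DenseT<float, 8, 1>>;        // mix back down to one channel

int main()
{
    TCNBlock model;
    model.reset();

    float input[1] = { 0.0f };
    const float out = model.forward (input); // process a single sample
    (void) out;
    return 0;
}
```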
Also interested in this. sdatkinson/NeuralAmpModelerCore#49
maybe relevant? Chowdhury-DSP/BYOD#363
Just out of curiosity, I checked whether we could build NeuralAmpModelerCore against the Eigen library that comes with RTNeural, and yes, it works flawlessly. We could even share the json header.
My 2 cents: we want to avoid boilerplate code on the engine side, or even inside the plugin. That is, we don't really want RTNeural to have methods to parse .nam (or .aidax, or whatever) model files (torch weights); we want to adjust the model file that comes out of a training repo and port it to the format used by RTNeural.
Since the two systems use mostly the same model types, I think NAM would have a very big incentive to implement RTNeural. It would remove 90% of the complexity from https://github.com/sdatkinson/NeuralAmpModelerCore/tree/main/NAM. The only blocker seems to be that WaveNet is not implemented (yet) in RTNeural.
Coming back to this as the request from my users keeps popping up.
I've tried to naively port over the
Yes, that could also work; no ambitions here to bloat up the RTNeural project from my side. Whether it's a python script or a conditionally compilable C++ class shouldn't make a big difference in the end (I would prefer the latter, but that's something I can add to my project then).
Hi All, sorry for the long delay, I've been a bit busy the past few months.

I had a look at implementing one of the NAM architectures in RTNeural a little while ago, and was able to make some progress, but haven't gotten it fully working just yet. The main issue with re-exporting the weights of a NAM model into RTNeural's JSON format is that RTNeural's JSON format currently only supports "sequential" models, and I don't believe the WaveNet architecture is sequential. Hopefully I can finish up the WaveNet architecture sometime in the next week or two.

That said, in order to fully utilize RTNeural's performance capabilities, it would be preferable to be able to know the network architecture at compile-time, which could pose a problem if the intent is to create a "generic" NAM model loader. I have some ideas about this, but I'll worry about that after the basic implementation work is done.

Thanks,
The vast majority of NAM models use the "standard" architecture. There are also three other, less commonly used, official WaveNet presets. Very, very few models will use any other architecture. If both compile-time and dynamic architectures are supported, then compile-time architectures can be provided for the official presets, along with a dynamic fallback for less common architectures.
Aida-X could be the right approach; there is no need for thousands of model types, so the architectures can be hard-coded. LSTM is right for 99.5% of use cases. The rationale is that most users download their models from ToneHunt. A script could remap WaveNet NAM files and retrain them to LSTMs; if the system is optimized on a GPU, converting everything on ToneHunt in a reasonable amount of time could be possible. This way, RTNeural does not need to change anything, nor does NAM. We need a fat GPU for a month :) I just made a proof of concept: https://github.com/baptistejamin/nam-to-rtneural. I haven't tested the result yet, but that should be compatible.
Would that script preserve the expected sample rate of the model? NAM supports whatever sample rate your input and output pair is, whereas the Aida-X script forces 48,000 Hz. Also, does this shrink the model? In my testing NAM is more accurate than the Aida-X models.
I tested with two different models (clean and high gain), and it sounded exactly the same. Yes, a 48 kHz sampling rate is required; however, the way NAM handles this is by resampling. It's not actually the model that does this, but the plugin host. The script does not actually shrink the model. It just generates sound from the NAM core using a NAM model, and then retrains to an LSTM compatible with RTNeural (the Aida-X implementation). So at the end, you have an LSTM model that is compatible with NAM and with RTNeural.

RTNeural models run 10x faster, enabling models on lower-end CPUs or hardware such as a Raspberry Pi, with super low latency and a lot of extra CPU cycles available to run effects, cab simulation, etc. Most NAM models lately use WaveNets, while the RTNeural models will be LSTMs; this is the key difference. Originally NAM was LSTM-based only, and people were pretty happy with it ;)
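For reference, a minimal sketch of the kind of model the conversion targets, assuming the common Aida-X/GuitarML shape of a single LSTM followed by a dense layer (the hidden size of 16 and the file name are placeholders):

```cpp
#include <RTNeural/RTNeural.h>
#include <fstream>

// Sketch: LSTM -> Dense model in RTNeural's compile-time API, the general
// shape of an Aida-X / GuitarML style capture. Hidden size 16 is illustrative.
using LSTMCapture = RTNeural::ModelT<float, 1, 1,
    RTNeural::LSTMLayerT<float, 1, 16>,
    RTNeural::DenseT<float, 16, 1>>;

int main()
{
    LSTMCapture model;
    std::ifstream jsonStream ("converted_model.json", std::ifstream::binary); // hypothetical file name
    model.parseJson (jsonStream); // weights produced by the conversion/retraining step
    model.reset();

    float input[1] = { 0.0f };
    const float out = model.forward (input);
    (void) out;
    return 0;
}
```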
Ok, that's good to know. For a lot of my projects I'm cool with LSTM; however, I often want to stack models at various points of the DSP chain, and "oversampling" by feeding high-sample-rate models reduces aliasing. That isn't an issue with a single model, but it does add up when you start sequencing them.

Just a clarification on how NAM works currently: if you feed the trainer with, say, 96 kHz files, it trains the model based on this sample rate, notes the sample rate in the metadata, and then NAM resamples the incoming audio to that sample rate. The model itself requires the audio to be at the sample rate of the input/output files, whether that's 48, 96, 192, etc., because the weights are all based on that sample rate. Higher-sample-rate models show reduced aliasing, which, as I mentioned above, is a real reason for them to exist.

So your script actually generates a model of a model? That would introduce even more loss, wouldn't it? Now you are a further step away from the original capture files. Isn't this similar to converting an AAC into an MP3? Is there any reason we can't train LSTM models at 96 or 192 kHz and have RTNeural interpret those models?

Having tried both NAM and AIDA-X, I've got to say training NAM models is a lot easier, with many more options. This is what makes me think it is worth implementing NAM models rather than just converting them.
Anyway, I don't mean to sound ungrateful; it's actually nice to have a pretty hands-off way to re-train NAMs to Aida-X.
We could retrain to 96 kHz if it's something you are willing to explore. We can do this. Implementing NAM models in RTNeural, and RTNeural in NAM, seems out of scope, unless RTNeural implements WaveNets. RTNeural is just a core system for machine learning, while NAM is dedicated to guitar amps and re-implements models in pure C++, but in a less optimized way. The more optimized the inference engine is, the more powerful models we can get, allowing more layers, etc.

For instance, with RTNeural it could be possible (and Keith Bloemer already did this) to also capture knob effects, for instance to simulate a Fuzz pedal with Stab, Gain, etc. That's something that is not possible with NAM, and that will likely never be.
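As an illustration of the conditioning idea (a hypothetical sketch: the knob position is simply appended as an extra network input; the sizes are placeholders, not taken from any particular published model):

```cpp
#include <RTNeural/RTNeural.h>

// Hypothetical conditioned model: the input vector is [audio sample, knob position],
// so one trained network covers the whole range of the knob.
using ConditionedModel = RTNeural::ModelT<float, 2, 1,
    RTNeural::LSTMLayerT<float, 2, 20>,
    RTNeural::DenseT<float, 20, 1>>;

float processSample (ConditionedModel& model, float audioIn, float knob01)
{
    const float input[2] = { audioIn, knob01 }; // knob01 normalised to [0, 1]
    return model.forward (input);
}
```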
Is there a better place to continue this discussion that isn't clogging up the issue thread?
I'd be happy to open a channel/thread on the RTNeural Discord if people want to chat more on there? Also, I know I've been saying this for a while, but I think I should finally have time this weekend to finish my NAM-style WaveNet implementation in RTNeural... we'll see how it goes!
Sounds good, I'm already on the Discord.
👍 I'm happy to test integrating it as soon as you've got something functional.
@baptistejamin @rossbalch, for AIDA-X related topics in this thread you may be interested in moving here.
To all: once the RTNeural engine has support for WaveNet (and elaborations of it), I can expand my current script to generate a json model for RTNeural. I still think that's the best way to support it, but as @jatinchowdhury18 pointed out, support for this arch needs to be implemented in the engine first (I thought this was immediately possible since conv1d layers are already present in RTNeural; I was wrong). So the scenario could be a python script, with torch as its only major dependency, that handles that conversion.
For example, .aidax models on ToneHunt are just RTNeural-compatible json files with the extension changed from .json to .aidax and a metadata section added as requested by ToneHunt. I would love to see something like this happen; if you have other ideas, let's discuss on the RTNeural Discord. Thanks again @jatinchowdhury18 for bringing this outstanding engine to life!
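Concretely, that would mean such a file can be loaded with RTNeural's run-time json parser like any other RTNeural model. A minimal sketch, assuming the extra metadata section doesn't interfere with parsing (the file name is a placeholder):

```cpp
#include <RTNeural/RTNeural.h>
#include <fstream>
#include <memory>

int main()
{
    // Load an RTNeural-compatible json model at run time; the file name stands
    // in for an .aidax file downloaded from ToneHunt.
    std::ifstream jsonStream ("amp_capture.aidax", std::ifstream::binary);
    auto model = RTNeural::json_parser::parseJson<float> (jsonStream);
    model->reset();

    float input[1] = { 0.0f };
    const float out = model->forward (input);
    (void) out;
    return 0;
}
```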
ToneX would be an interesting one. I think their weights are encrypted, based on my poking around in the SQL library.
Okay, I finally have something useful to share on this. Thanks all for your patience.

I've put together a repo with a demo of a NAM-style WaveNet model implemented in RTNeural: https://github.com/jatinchowdhury18/RTNeural-NAM

At the moment there seem to be some discrepancies between NAM's convolution layer and RTNeural's, so I'll need to debug that. There are also some missing bits (e.g. gated activations), but I don't think those should be too hard to add now that the base implementation is in place.

The main "issue" I'm imagining for people wanting to use this is that the RTNeural model needs to be defined at compile-time, with parameters taken from the model configuration. For example:
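As a stand-in illustration of what "defined at compile-time" means here (this uses RTNeural's stock layers rather than the actual RTNeural-NAM types, and all sizes are placeholders): the values from the model configuration become template arguments, so each architecture is its own C++ type.

```cpp
#include <RTNeural/RTNeural.h>

// Stand-in illustration only (not the actual RTNeural-NAM types): the numbers
// that a .nam config stores -- channel count, kernel size, dilation -- show up
// here as template arguments and must be known when the code is compiled.
template <int Channels, int Kernel, int Dilation>
using DilatedBlock = RTNeural::ModelT<float, 1, 1,
    RTNeural::Conv1DT<float, 1, Channels, Kernel, Dilation>,
    RTNeural::TanhActivationT<float, Channels>,
    RTNeural::DenseT<float, Channels, 1>>;

using ConfigA = DilatedBlock<16, 3, 2>; // placeholder numbers
using ConfigB = DilatedBlock<8, 3, 1>;  // a different config is a different type
```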
That type definition could be auto-generated without too much trouble, but that doesn't help much if you're planning to load the model at run-time. The RTNeural-Variant repo shows one way to deal with this issue, but it may not work well in this instance given how many parameter configurations NAM's WaveNet supports.

On the bright side, in my test example the RTNeural implementation is running ~2x faster than the NAM implementation on my M1 MacBook. Of course this isn't a fair comparison since the RTNeural implementation isn't correct yet, but it's a good sign! So far I've been using RTNeural's Eigen backend, but I'd love to get the WaveNet working with the XSIMD backend as well to see if that might run a bit faster.

Anyway, if anyone's got some time and wants to help with testing/debugging the RTNeural WaveNet, feel free to jump into the other repo. I'm hoping to have time to get back to it later this week or next weekend.
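To make the RTNeural-Variant idea mentioned above concrete, here is a rough sketch of the dispatch pattern. The two stand-in LSTM architectures are placeholders purely to show the mechanism; in practice the alternatives would be the compile-time WaveNet instantiations for the official NAM presets:

```cpp
#include <RTNeural/RTNeural.h>
#include <variant>

// Two stand-in architectures; a real NAM loader would list the compile-time
// WaveNet instantiations for the official presets here instead.
using SmallModel = RTNeural::ModelT<float, 1, 1,
    RTNeural::LSTMLayerT<float, 1, 16>, RTNeural::DenseT<float, 16, 1>>;
using LargeModel = RTNeural::ModelT<float, 1, 1,
    RTNeural::LSTMLayerT<float, 1, 40>, RTNeural::DenseT<float, 40, 1>>;

using AnyModel = std::variant<SmallModel, LargeModel>;

// Pick the alternative that matches the loaded config, then dispatch per sample.
float processSample (AnyModel& model, float x)
{
    const float input[1] = { x };
    return std::visit ([&] (auto& m) { return m.forward (input); }, model);
}
```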
I think that would be a blast; at the same time, from my experience it seems this largely depends on the toolchain and target arch. For example, I get zipper noise with XSIMD on the Mod Dwarf, and I cannot compile at all on the Chaos Audio Stratus, yet I get XSIMD working fine building AIDA-X in Yocto. So I would be happy to see it running with Eigen first, and then of course the more the better!
That is definitely encouraging! What Tanh implementation are you using? When we switched to using a Tanh approximation in NAM it made a huge performance difference.
At the moment, the RTNeural implementation is using Eigen's built-in tanh implementation.
The NAM fast tanh is optional, but the official plugin has been enabling it since I added the option. At the time, applying the tanh activation function was the top hit in the hot path, and switching to the fast tanh approximation gave about a 40% performance improvement.
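For context, fast tanh functions of this kind are typically low-order rational approximations with a clamp. The sketch below shows the general technique only; it is not NAM's actual implementation:

```cpp
#include <algorithm>

// Generic fast tanh approximation (Pade-style rational polynomial, clamped so
// it saturates cleanly at +/-1); illustrative only, not NAM's exact function.
inline float fastTanh (float x)
{
    x = std::max (-3.0f, std::min (3.0f, x)); // outside +/-3, tanh is ~ +/-1 anyway
    const float x2 = x * x;
    return x * (27.0f + x2) / (27.0f + 9.0f * x2);
}
```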
Building on Windows (Visual Studio x64 Release) I get:
If I enable fast tanh for NAM, I get:
PR is here to fix the model weight loading: |
Hi Jatin,
how hard would it be to add support for parsing the NAM file format?
https://github.com/sdatkinson/NeuralAmpModelerCore
Just from a quick peek at both sources, the required layers are almost there (except for the WaveNet layer, which seems like a high-level abstraction of existing low-level layers).
I would love to avoid adding too many neural network engines to my project, so if you think it's doable I'll give it a shot.