Hello, and thanks for sharing the code; it has been a great help to me. However, I ran into a problem when using PPO. I'm a beginner, and during training I found that after connecting the continuous PPO algorithm to my custom environment, the reward of every episode is exactly the same. The actions output by the network do differ, but only by a very small amount. I can't figure out where things went wrong.
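A common cause of this symptom (actions that barely differ, identical episode rewards) is a collapsed or near-zero policy standard deviation. A minimal sketch, assuming a diagonal-Gaussian policy head with a `log_std` parameter (this is an illustration, not the repository's actual code):

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_actions(mean, log_std, n=1000):
    """Sample n actions from a diagonal Gaussian policy head:
    action = mean + exp(log_std) * noise."""
    std = np.exp(log_std)
    return mean + std * rng.standard_normal((n, mean.shape[-1]))

mean = np.zeros(2)

# Reasonable exploration noise: sampled actions vary noticeably.
healthy = sample_actions(mean, log_std=np.log(0.5))

# Collapsed (or badly initialized) log_std: actions are almost
# deterministic, so every episode plays out the same way.
collapsed = sample_actions(mean, log_std=np.log(1e-4))

print(healthy.std(), collapsed.std())
```

Printing the standard deviation of the sampled actions at the start of training is a quick check: if it is already tiny, the symptom described above follows directly.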
Have you solved this? I ran into the same problem.
I switched to the SAC algorithm, but while making the change I found some errors in my custom environment. You might want to check whether your environment has any problems.
Based on the suggestions everyone raised in the issues, I made some modifications to the PPO code; the modified code has been uploaded to GitHub. In my tests the code converges. Feel free to use it as a reference if you need it, thanks. https://github.com/iimxinyi/Lightweight-Reinforcement-Learning/tree/main/SADRL/PPO