
some question about code #19

Open
BingHan0458 opened this issue Feb 17, 2021 · 6 comments

Comments

@BingHan0458

Hello Professor,
Recently I have been studying your paper and reproducing your code, and I have some questions:

  1. After training stage 1 and stage 2 I obtained Figure 4 from your paper, but how can I reproduce the other figures, such as Figure 5? And how can I obtain the results in Table II, such as the success rate and the extra time, from your code or by additional calculations?
  2. Where can I find the code for the baselines, such as the SL-policy and NH-ORCA?
  3. How long did you train stage 1 and stage 2? What output in the terminal or in the GUI shows that the policy has been trained well?

I hope you can give me some advice, thank you very much!
@Acmece
Owner

Acmece commented Feb 21, 2021

@BingHan0458 Hi,
I did not benchmark the algorithm. The success rate and the extra time should be easy to obtain if you run many episodes and count the outcomes. I test the algorithm by monitoring the average reward: if the average reward has converged, the model has converged too. I may dig out the code for that part and post it.
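For example, a minimal post-processing sketch of what such counting could look like, assuming each finished episode is recorded with its result, step count, reward, and straight-line goal distance (the field names, STEP_TIME and MAX_SPEED are illustrative assumptions, not values from this repository):

    # Hypothetical post-processing of per-episode records collected while running ppo_stage1.py.
    STEP_TIME = 0.1      # assumed control period in seconds
    MAX_SPEED = 1.0      # assumed maximum linear velocity in m/s

    episodes = [
        {"result": "Reach Goal", "steps": 115, "reward": 35.1, "distance": 8.6},
        {"result": "Crashed",    "steps": 3,   "reward": -15.0, "distance": 9.2},
        # ... one record per finished episode
    ]

    successes = [e for e in episodes if e["result"] == "Reach Goal"]
    success_rate = len(successes) / float(len(episodes))

    # Extra time: time actually taken minus the time an ideal straight-line run would need.
    extra_times = [e["steps"] * STEP_TIME - e["distance"] / MAX_SPEED for e in successes]
    mean_extra_time = sum(extra_times) / len(extra_times) if extra_times else float("nan")

    # Convergence check: average episode reward over a recent window.
    window = 100
    recent = [e["reward"] for e in episodes[-window:]]
    avg_reward = sum(recent) / len(recent)

    print("success rate: %.2f, mean extra time: %.2f s, avg reward (last %d episodes): %.2f"
          % (success_rate, mean_extra_time, window, avg_reward))

The success rate is simply the fraction of episodes ending in "Reach Goal", and the extra time compares the actual travel time against an idealized straight-line run at maximum speed.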

@BingHan0458
Author

OK, thank you very much!
I have another question:
When I first ran your code with rosrun stage_ros_add_pose_and_crash stageros worlds/stage1.world and mpiexec -np 24 python ppo_stage1.py, the output was as follows:

####################################
############Loading Model###########
####################################
/home/.local/lib/python2.7/site-packages/torch/nn/functional.py:1351: UserWarning: nn.functional.sigmoid is deprecated. Use torch.sigmoid instead.
  warnings.warn("nn.functional.sigmoid is deprecated. Use torch.sigmoid instead.")
/home/.local/lib/python2.7/site-packages/torch/nn/functional.py:1340: UserWarning: nn.functional.tanh is deprecated. Use torch.tanh instead.
  warnings.warn("nn.functional.tanh is deprecated. Use torch.tanh instead.")
Env 10, Goal (002.8, -08.4), Episode 00001, setp 003, Reward -33.6, Distance 008.3, Crashed
Env 00, Goal (-02.2, -04.6), Episode 00001, setp 003, Reward -15.0, Distance 009.2, Crashed
Env 14, Goal (-02.8, -06.6), Episode 00001, setp 004, Reward -5.2 , Distance 008.8, Crashed
Env 10, Goal (002.9, -02.7), Episode 00002, setp 002, Reward -15.0, Distance 009.6, Crashed
Env 21, Goal (-05.6, 001.1), Episode 00001, setp 005, Reward -26.1, Distance 008.4, Crashed
Env 10, Goal (-04.6, -04.2), Episode 00003, setp 002, Reward -15.0, Distance 010.0, Crashed
Env 20, Goal (004.0, -04.7), Episode 00001, setp 019, Reward 37.4 , Distance 009.5, Reach Goal
Env 08, Goal (001.0, -07.9), Episode 00001, setp 020, Reward -28.6, Distance 008.3, Crashed
Env 20, Goal (-07.6, -01.4), Episode 00002, setp 002, Reward -15.0, Distance 008.5, Crashed
Env 01, Goal (007.2, 003.5), Episode 00001, setp 024, Reward 35.1 , Distance 008.6, Reach Goal
Env 13, Goal (-04.7, -02.9), Episode 00001, setp 026, Reward 35.0 , Distance 008.6, Reach Goal
Env 16, Goal (-02.9, 000.3), Episode 00001, setp 037, Reward -1.0 , Distance 009.3, Crashed
Env 19, Goal (-00.2, 004.2), Episode 00001, setp 048, Reward -12.9, Distance 008.5, Crashed
Env 17, Goal (005.4, 000.6), Episode 00001, setp 056, Reward -7.9 , Distance 009.4, Crashed
Env 07, Goal (-02.6, -03.0), Episode 00001, setp 056, Reward -11.6, Distance 008.5, Crashed
Env 09, Goal (-05.7, 000.3), Episode 00001, setp 065, Reward 38.5 , Distance 009.9, Reach Goal
Env 16, Goal (001.1, -07.3), Episode 00002, setp 031, Reward -13.2, Distance 009.4, Crashed
Env 06, Goal (-02.3, -00.2), Episode 00001, setp 104, Reward 35.1 , Distance 008.5, Reach Goal
Env 11, Goal (-06.4, 000.7), Episode 00001, setp 115, Reward 3.0  , Distance 008.0, Crashed
Env 10, Goal (001.5, -07.2), Episode 00004, setp 111, Reward 35.0 , Distance 008.5, Reach Goal
Env 14, Goal (002.0, -05.6), Episode 00002, setp 113, Reward 37.2 , Distance 009.5, Reach Goal
Env 18, Goal (004.1, -00.8), Episode 00001, setp 117, Reward 35.2 , Distance 008.6, Reach Goal
Env 02, Goal (-01.3, 001.5), Episode 00001, setp 119, Reward 34.5 , Distance 008.4, Reach Goal
Env 01, Goal (008.4, 000.9), Episode 00002, setp 097, Reward 35.6 , Distance 008.9, Reach Goal
Env 22, Goal (-00.2, 001.1), Episode 00001, setp 122, Reward 0.6  , Distance 008.4, Crashed
Env 03, Goal (-04.5, -05.8), Episode 00001, setp 123, Reward -14.7, Distance 008.2, Crashed
Env 23, Goal (002.0, 007.8), Episode 00001, setp 123, Reward 35.7 , Distance 008.9, Reach Goal
Env 01, Goal (003.4, 001.0), Episode 00003, setp 005, Reward -15.0, Distance 008.3, Crashed
Env 18, Goal (-03.2, 004.6), Episode 00002, setp 008, Reward -12.7, Distance 009.3, Crashed
Env 08, Goal (008.0, -02.0), Episode 00002, setp 105, Reward 34.9 , Distance 008.5, Reach Goal
Env 19, Goal (-07.4, 002.9), Episode 00002, setp 078, Reward 3.0  , Distance 008.2, Crashed
Env 01, Goal (003.1, 004.7), Episode 00004, setp 002, Reward -15.0, Distance 009.7, Crashed
Env 00, Goal (000.7, -07.8), Episode 00002, setp 123, Reward 36.5 , Distance 009.2, Reach Goal
Env 21, Goal (000.6, 008.9), Episode 00002, setp 121, Reward 35.3 , Distance 008.6, Reach Goal
Env 04, Goal (008.1, -03.2), Episode 00001, setp 126, Reward 36.6 , Distance 009.2, Reach Goal
Env 08, Goal (006.6, -01.9), Episode 00003, setp 004, Reward -14.5, Distance 009.3, Crashed
update
......

But when I run the same commands again, nothing is printed after ############Loading Model###########.
I am really confused. I do not know why this happens or how to fix it so that the previous output (for example Env 10, Goal (002.8, -08.4), Episode 00001, setp 003, Reward -33.6, Distance 008.3, Crashed) appears again. I would appreciate it if you could answer my question.
Thank you very much!

@BingHan0458
Author

The question above is about running the trained model. When I change the code so that the program reaches line 192 in ppo_stage1.py, nothing is printed after ############Start Training###########. Is this the right way to train the model? What should the console output be?

@Acmece
Owner

Acmece commented Feb 22, 2021

I have tested it and cannot reproduce the problem. Could you please provide more information? Alternatively, you can revert the repository to the original version.

@BingHan0458
Author

BingHan0458 commented Feb 22, 2021

Hello! I tried changing the command mpiexec -np 44 python ppo_stage1.py to mpiexec -np 22 python ppo_stage1.py, and the output is as follows:

####################################
############Loading Model###########
####################################
/home/.local/lib/python2.7/site-packages/torch/nn/functional.py:1351: UserWarning: nn.functional.sigmoid is deprecated. Use torch.sigmoid instead.
  warnings.warn("nn.functional.sigmoid is deprecated. Use torch.sigmoid instead.")
/home/.local/lib/python2.7/site-packages/torch/nn/functional.py:1340: UserWarning: nn.functional.tanh is deprecated. Use torch.tanh instead.
  warnings.warn("nn.functional.tanh is deprecated. Use torch.tanh instead.")
Env 15, Goal (-06.4, 002.8), Episode 00001, setp 017, Reward -9.2 , Distance 009.0, Crashed
Env 10, Goal (-06.1, 005.1), Episode 00001, setp 018, Reward 36.7 , Distance 009.4, Reach Goal
Env 17, Goal (-03.7, 004.6), Episode 00001, setp 020, Reward -17.5, Distance 008.5, Crashed
Env 08, Goal (-05.7, -04.4), Episode 00001, setp 021, Reward -13.5, Distance 009.7, Crashed
Env 09, Goal (003.3, -00.2), Episode 00001, setp 024, Reward -9.4 , Distance 009.3, Crashed
Env 12, Goal (-07.4, 002.7), Episode 00001, setp 026, Reward 37.2 , Distance 009.4, Reach Goal
Env 03, Goal (002.5, 007.1), Episode 00001, setp 029, Reward 36.6 , Distance 009.4, Reach Goal
Env 05, Goal (003.7, 003.9), Episode 00001, setp 039, Reward 35.2 , Distance 008.7, Reach Goal
Env 02, Goal (003.2, 007.6), Episode 00001, setp 042, Reward 33.7 , Distance 008.0, Reach Goal
Env 19, Goal (-01.7, -02.8), Episode 00001, setp 045, Reward 34.0 , Distance 008.3, Reach Goal
Env 14, Goal (000.0, -03.6), Episode 00001, setp 060, Reward 34.6 , Distance 008.5, Reach Goal
Env 00, Goal (008.9, -01.2), Episode 00001, setp 064, Reward 34.9 , Distance 008.5, Reach Goal
......

But if I run mpiexec -np 44 python ppo_stage1.py with the original number 44, nothing is printed after ############Loading Model###########. So I guess the problem comes from the number in the command, but I do not know why, or how this number should be set. Can I change it to an arbitrary number?

Thank you!
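(A likely explanation, hedged because it is not confirmed in this thread: each MPI process drives one robot in the Stage world, so the -np value has to match the number of robots defined in the .world file that stageros loads; with a mismatch, the extra processes wait on ROS topics that never publish and the run appears to hang after the Loading Model banner. A minimal sketch of that rank-per-robot pattern, with illustrative names such as NUM_ROBOTS and StageWorld, not code copied from ppo_stage1.py:)

    # Sketch of a one-MPI-rank-per-robot setup; names are illustrative.
    from mpi4py import MPI

    NUM_ROBOTS = 24              # must equal the number of robots spawned in the loaded .world file

    comm = MPI.COMM_WORLD
    rank = comm.Get_rank()
    size = comm.Get_size()       # this is the number passed to mpiexec -np

    if size != NUM_ROBOTS:
        # Ranks bound to robots that do not exist block waiting on their ROS topics,
        # so nothing is printed after the "Loading Model" banner.
        raise RuntimeError("mpiexec -np %d does not match the %d robots in the world file"
                           % (size, NUM_ROBOTS))

    # each rank then builds its own environment, e.g. env = StageWorld(index=rank)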

@Balajinatesan

Balajinatesan commented Jul 3, 2021

Env 03, Goal (-07.0, 009.5), Episode 00000, setp 097, Reward 12.6 , Reach Goal,
Env 04, Goal (-12.5, 004.0), Episode 00000, setp 052, Reward -33.4, Crashed,
Env 00, Goal (-18.0, 011.5), Episode 00000, setp 110, Reward 13.0 , Reach Goal,
Env 01, Goal (-18.0, 009.5), Episode 00000, setp 095, Reward 12.9 , Reach Goal,
Env 05, Goal (-12.5, 017.0), Episode 00000, setp 081, Reward -28.1, Crashed,
Env 02, Goal (-07.0, 011.5), Episode 00000, setp 044, Reward -30.0, Crashed,
Traceback (most recent call last):
  File "ppo_stage2.py", line 212, in <module>
    run(comm=comm, env=env, policy=policy, policy_path=policy_path, action_bound=action_bound, optimizer=opt)
  File "ppo_stage2.py", line 120, in run
    obs_size=OBS_SIZE, act_size=ACT_SIZE)
  File "/home/balaji/rover_ws/src/rl-collision-avoidance/model/ppo.py", line 204, in ppo_update_stage2
    obss = obss.reshape((num_step*num_env, frames, obs_size))
ValueError: cannot reshape array of size 1966080 into shape (5632,3,512)
Hi, I got this error when I trained the second stage with mpiexec -np 44 python ppo_stage2.py. Do I need to change anything to train the model?
I have also finished training the first model. Can you give me an idea of how to deploy the code on a real robot? If any of you have ideas, please share them here or by email ([email protected]). That would be a great help.
@Acmece @BingHan0458
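(A note on the traceback above, hedged because the exact constants in this setup are not shown: ppo_update_stage2 reshapes the collected observation buffer into (num_step*num_env, frames, obs_size). The array holds 1,966,080 values, i.e. 1280 rows of 3x512, while the target shape expects 5632 rows, so the amount of data actually collected does not match the num_step and num_env values used for the update; this usually means the -np count, the number of robots in the stage2 world file, or the horizon/environment constants in the script disagree with each other. A small, hypothetical sanity check that could be placed before the reshape:)

    # Hypothetical pre-check for the reshape in ppo_update_stage2; variable names follow the
    # traceback, but the check itself is an illustration, not code from the repository.
    def checked_reshape(obss, num_step, num_env, frames, obs_size):
        # obss is assumed to be a NumPy array of stacked observations.
        expected = num_step * num_env * frames * obs_size
        if obss.size != expected:
            raise ValueError(
                "collected %d values but num_step*num_env*frames*obs_size = %d; "
                "check that mpiexec -np matches the number of robots in the stage2 world "
                "file and the environment/horizon constants in ppo_stage2.py"
                % (obss.size, expected))
        return obss.reshape((num_step * num_env, frames, obs_size))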
