For experiments involving InstructGPT. Currently used for documenting open research questions.

CarperAI/InstructGPT


BigModelName

This repository collects open questions about RLHF and InstructGPT as they pertain to BigModelName.

Open Questions

  • What is the preference rate of PPO vs. PPO-ptx? Why was 27.8 chosen as the mixing coefficient between the pretraining gradients and the PPO gradients?
  • What do the gradient norms and gradient noise scales look like for the PPO gradients vs. the pretraining gradients?
  • How important is the SFT stage, i.e. supervised fine-tuning on human-written completions, before RL?
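
As a starting point for the first two questions, the quantities involved can be sketched in plain Python. This is a minimal illustration rather than the method used for InstructGPT or BigModelName. It assumes that per-example gradients are already available as flat lists of floats, that the combined update is the PPO gradient plus a coefficient times the pretraining gradient, and that "gradient noise scale" means the simple ratio tr(Σ) / |G|², estimated with a naive plug-in estimator. All function names here (`mean_gradient`, `l2_norm`, `simple_noise_scale`, `mix_gradients`) are hypothetical helpers, and the data in `demo` is synthetic.

```python
import math
import random

# Mixing coefficient quoted in the first open question.
GAMMA = 27.8


def mean_gradient(per_example_grads):
    """Average a list of per-example gradient vectors coordinate-wise."""
    n = len(per_example_grads)
    dim = len(per_example_grads[0])
    return [sum(g[i] for g in per_example_grads) / n for i in range(dim)]


def l2_norm(vec):
    """Euclidean norm of a flat gradient vector."""
    return math.sqrt(sum(v * v for v in vec))


def simple_noise_scale(per_example_grads):
    """Naive plug-in estimate of tr(Sigma) / |G|^2.

    Sigma is the per-example gradient covariance and G the mean gradient.
    Using the sample mean for G makes this estimate biased for small batches.
    """
    n = len(per_example_grads)
    g_bar = mean_gradient(per_example_grads)
    # Sum of unbiased per-coordinate variances = trace of the covariance.
    trace_cov = sum(
        sum((g[i] - g_bar[i]) ** 2 for g in per_example_grads) / (n - 1)
        for i in range(len(g_bar))
    )
    return trace_cov / sum(v * v for v in g_bar)


def mix_gradients(g_ppo, g_ptx, gamma=GAMMA):
    """Combine gradients as g_ppo + gamma * g_ptx (assumed mixing rule)."""
    return [a + gamma * b for a, b in zip(g_ppo, g_ptx)]


def demo(seed=0, n=32, dim=8):
    """Compare the two gradient sources on synthetic Gaussian data."""
    rng = random.Random(seed)
    # Synthetic stand-ins: noisier "PPO" gradients, cleaner "pretraining" ones.
    ppo = [[rng.gauss(0.1, 1.0) for _ in range(dim)] for _ in range(n)]
    ptx = [[rng.gauss(0.1, 0.2) for _ in range(dim)] for _ in range(n)]
    for name, grads in (("ppo", ppo), ("ptx", ptx)):
        g = mean_gradient(grads)
        print(name, "norm:", l2_norm(g), "noise scale:", simple_noise_scale(grads))
    mixed = mix_gradients(mean_gradient(ppo), mean_gradient(ptx))
    print("mixed norm:", l2_norm(mixed))


if __name__ == "__main__":
    demo()
```

With real models, the per-example gradients would come from the training framework instead of `demo`. Comparing `l2_norm` of the PPO gradient against `GAMMA` times the pretraining gradient shows which term dominates the mixed update at a given coefficient.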
