RLHF and RLAIF in GPT-NeoX
GPT-NeoX now supports post-training thanks to a collaboration with SynthLabs.

The Practitioner's Guide to the Maximal Update Parameterization
Exploring the implementation details of muTransfer.