RLHF and RLAIF in GPT-NeoX

GPT-NeoX now supports post-training thanks to a collaboration with SynthLabs.

October 10, 2024 · Dakota Mahan, Quentin Anthony, Louis Castricato, Nathan Lile, Stella Biderman

The Practitioner's Guide to the Maximal Update Parameterization

Exploring the implementation details of muTransfer