## Overview

Paper link

- LLMs are fine-tuned using RLHF for alignment.
- This has not been widely explored in text-to-image models.
- DPO was recently formulated as a simpler alternative to RLHF (see the objective sketched below).
- The policy
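For context, the standard DPO objective from the original language-model formulation (Rafailov et al., 2023) trains the policy directly on preference pairs, removing the explicit reward model and RL loop of RLHF. The sketch below uses the usual symbols, which are carried over from that paper rather than from this summary: $\pi_\theta$ is the policy, $\pi_{\mathrm{ref}}$ the frozen reference model, $(x, y_w, y_l)$ a prompt with preferred and dispreferred completions, and $\beta$ a temperature controlling deviation from the reference.

$$
\mathcal{L}_{\mathrm{DPO}}(\pi_\theta;\pi_{\mathrm{ref}})
= -\,\mathbb{E}_{(x,\,y_w,\,y_l)\sim\mathcal{D}}
\left[\log \sigma\!\left(
\beta \log \frac{\pi_\theta(y_w \mid x)}{\pi_{\mathrm{ref}}(y_w \mid x)}
- \beta \log \frac{\pi_\theta(y_l \mid x)}{\pi_{\mathrm{ref}}(y_l \mid x)}
\right)\right]
$$

Minimizing this loss raises the policy's likelihood of the preferred sample relative to the reference model and lowers it for the dispreferred one, which is why no separate reward model or on-policy sampling is needed.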