MPF: Aligning and Debiasing Language Models post Deployment via Multi Perspective Fusion

This paper examines how large language models change when repeatedly trained on their own synthetic data. The authors focus on political bias and show that models such as GPT-2 can gradually drift toward one side of the political spectrum, particularly the right, as training cycles continue. They test several methods to control this bias but find that it persists even when model collapse is prevented. Their analysis also shows that bias amplification and model collapse arise from different mechanisms, and therefore require different solutions.
