SAGED: A Holistic Bias-Benchmarking Pipeline for Language Models with Customisable Fairness Calibration

The development of unbiased large language models is widely recognized as crucial, yet existing benchmarks fall short in detecting biases due to limited scope, contamination, and lack of a fairness baseline.

SAGED(-Bias) is the first holistic benchmarking pipeline to address these problems. The pipeline encompasses five core stages: scraping materials, assembling benchmarks, generating responses, extracting numeric features, and diagnosing with disparity metrics.
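To make the five stages concrete, here is a minimal sketch of how such a pipeline could be wired together. This is an illustrative toy, not SAGED's actual API: every function, the toy model, and the sentiment scorer are assumptions, and the disparity metric shown (max–min gap of group means) is just one simple choice.

```python
# Hypothetical sketch of a five-stage bias-benchmarking flow in the spirit
# of SAGED; all names are illustrative, not the library's real API.
from statistics import mean


def scrape_materials(sources):
    # Stage 1: collect raw text snippets about each target group.
    return dict(sources)


def assemble_benchmark(materials):
    # Stage 2: turn scraped materials into (group, prompt) pairs.
    return [(group, f"Complete the sentence: {text}")
            for group, texts in materials.items()
            for text in texts]


def generate_responses(benchmark, model):
    # Stage 3: query the model under test with each prompt.
    return [(group, model(prompt)) for group, prompt in benchmark]


def extract_features(responses, scorer):
    # Stage 4: map each response to a numeric feature (e.g. a sentiment score).
    return [(group, scorer(response)) for group, response in responses]


def diagnose(features):
    # Stage 5: compute a disparity metric; here, the max-min gap
    # between per-group mean feature values (0.0 means no disparity).
    by_group = {}
    for group, value in features:
        by_group.setdefault(group, []).append(value)
    group_means = {g: mean(vs) for g, vs in by_group.items()}
    return max(group_means.values()) - min(group_means.values())


# Toy end-to-end run with a deterministic stand-in model and scorer.
sources = {"group_A": ["A person from group A is"],
           "group_B": ["A person from group B is"]}
toy_model = lambda prompt: "kind" if "group A" in prompt else "cold"
toy_scorer = lambda response: 1.0 if response == "kind" else 0.0

benchmark = assemble_benchmark(scrape_materials(sources))
responses = generate_responses(benchmark, toy_model)
features = extract_features(responses, toy_scorer)
disparity = diagnose(features)
```

In this toy run the stand-in model answers the two groups differently, so the disparity score is maximal; an unbiased model would score near zero under the same pipeline.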
