In this post, we review the recent paper “Who Evaluates AI’s Social Impacts? Mapping Coverage and Gaps in First and Third Party Evaluations” (Reuel et al., 2025). The authors examine how AI developers (first parties) and independent third parties assess the social impacts of AI systems, revealing gaps in both transparency and coverage.
Key Findings
- Sparse first‑party reporting – Developers provide limited data on bias, environmental costs, and labor practices.
- Third‑party strength – Academics, nonprofits, and NGOs offer more robust analyses of bias, harmful content, and performance disparities.
- Missing infrastructure – No common platform exists to aggregate and compare third‑party evaluations.
Implications for Policy
The study calls for:
- Mandatory disclosure of data provenance, moderation labor, and training costs.
- Independent evaluation ecosystems to provide consistent, transparent assessments.
- Shared infrastructure for aggregating evaluation results.
For a deeper dive, see the full paper on arXiv.