ProJudge

A Multi-Modal Multi-Discipline Benchmark and Instruction-Tuning Dataset for MLLM-based Process Judges

Jiaxin Ai, Pengfei Zhou, Zhaopan Xu, Ming Li, Fanrui Zhang, Zizhen Li, Jianwen Sun,
Yukang Feng, Baojin Huang, Zhongyuan Wang†, Kaipeng Zhang†

†Corresponding Author: zhangkaipeng@pjlab.org.cn

Paper | Code | 🤗 ProJudgeBench | 🤗 ProJudge-173k

🌈 We introduce ProJudgeBench, a comprehensive benchmark for assessing MLLMs' capabilities as process judges, and ProJudge-173k, a large-scale instruction-tuning dataset designed to enhance open-source MLLMs' process evaluation abilities.

🔔News

🚀 [20/10/2025] Our paper has been accepted to ICCV 2025!
✨ [03/11/2025] We have released our paper and project page. The data and code will be openly available soon!

BibTeX


@InProceedings{Ai_2025_ICCV,
    author    = {Ai, Jiaxin and Zhou, Pengfei and Xu, Zhaopan and Li, Ming and Zhang, Fanrui and Li, Zizhen and Sun, Jianwen and Feng, Yukang and Huang, Baojin and Wang, Zhongyuan and Zhang, Kaipeng},
    title     = {ProJudge: A Multi-Modal Multi-Discipline Benchmark and Instruction-Tuning Dataset for MLLM-based Process Judges},
    booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
    month     = {October},
    year      = {2025},
    pages     = {4681-4690}
}