Why Choose LexBench
Core Features
LexBench provides browser agent evaluation tools that help you assess AI agent performance on real web tasks.
01
Diverse Data
Supports LexBench-Browser, Online-Mind2Web, BrowseComp, and more, covering Chinese and English websites across a range of task types and difficulty levels.
02
Professional Evaluation
Uses GPT-4o as the evaluation model, combining multiple strategies (functional verification, UI comparison, semantic matching) for objective and accurate results.
03
Visual Analytics
Rich visualizations, including pass-rate trends, task distributions, and multi-dimensional radar charts, present evaluation results intuitively.
04
Open Leaderboard
A transparent public leaderboard with multi-dimensional filtering and comparison, so you can quickly see how agents perform.