Why Choose LexBench

Core Features

LexBench provides professional browser agent evaluation to help you comprehensively assess AI agent performance on real web tasks.

Diverse Data

Supports LexBench-Browser, Online-Mind2Web, BrowseComp, and more, covering Chinese and English websites across a range of task types and difficulty levels.

Professional Evaluation

Uses GPT-4o as the evaluation model, combining multiple strategies (functional verification, UI comparison, semantic matching) for objective and accurate results.

Visual Analytics

Rich visualizations, including pass rate trends, task distribution, and multi-dimensional radar charts, present evaluation results intuitively.

Open Leaderboard

A transparent public leaderboard with multi-dimensional filtering and comparison lets you quickly understand agent performance.
