

Today, Pinetree Research announces a new milestone: Pinetree Agent achieves 97% on WebVoyager, setting a new state of the art on one of the field’s leading benchmarks for browser-based computer use.
WebVoyager evaluates agents across more than 600 real-world tasks spanning 15 live websites. Unlike static benchmarks, it requires agents to interact with real interfaces, interpret changing page states, and complete open-ended objectives through sequential decision-making.
At 97%, Pinetree Agent substantially outperforms prior frontier systems and approaches the benchmark's ceiling. This level of accuracy suggests that browser-use agents are moving from experimental demos toward practical reliability.
Why WebVoyager Matters
WebVoyager is designed to test capabilities that conventional automation systems struggle with:
Dynamic navigation across real websites
Multi-step planning and memory
Error recovery under changing states
Interface understanding without scripts
End-to-end task completion
Strong performance requires more than language fluency. It requires robust action-taking under uncertainty.