Pinetree Agent Achieves Frontier Performance on WebVoyager

Today, Pinetree Research announces a new milestone: Pinetree Agent achieves 97% on WebVoyager, setting a new state of the art on one of the field’s leading benchmarks for browser-based computer use.

WebVoyager evaluates agents across more than 600 real-world tasks spanning 15 live websites. Unlike static benchmarks, it requires agents to interact with real interfaces, interpret changing page states, and complete open-ended objectives through sequential decision-making.

At 97%, Pinetree Agent substantially outperforms prior frontier systems and approaches the benchmark’s ceiling. This level of accuracy suggests that browser-use agents are moving from experimental demos toward practical reliability.

Why WebVoyager Matters

WebVoyager is designed to test capabilities that conventional automation systems struggle with:

  • Dynamic navigation across real websites

  • Multi-step planning and memory

  • Error recovery under changing states

  • Interface understanding without scripts

  • End-to-end task completion

Strong performance requires more than language fluency. It requires robust action-taking under uncertainty.