Webwright: A Terminal-Native Web Agent Framework That Scores 60.1% on Odysseys
Microsoft's Webwright is a terminal-native framework that lets an AI agent drive a web browser. It scores 60.1% on the Odysseys benchmark, nearly doubling base GPT-5.4's 33.5%. Here is how it works and how to use it.
Webwright Framework for Builders
Terminal-native
Runs entirely in the terminal with no GUI. Perfect for CI/CD and remote servers.
Modular action pipeline
Each web action is a plugin: click, type, scroll, extract. Easy to customize.
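To make the plugin idea concrete, here is a minimal sketch of an action pipeline in Python. It is not Webwright's actual API: the names `ActionPipeline`, `register`, `run`, and the dictionary-based `Page` state are all hypothetical, chosen only to show how named actions can be registered and chained.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

# Hypothetical types for illustration; none of these names come from Webwright.
Page = Dict[str, List[str]]              # stand-in for real browser/page state
Action = Callable[[Page, str], Page]     # an action takes state + one argument


@dataclass
class ActionPipeline:
    """Registry that maps action names to plugin functions."""
    actions: Dict[str, Action] = field(default_factory=dict)

    def register(self, name: str, action: Action) -> None:
        self.actions[name] = action

    def run(self, page: Page, steps: List[Tuple[str, str]]) -> Page:
        # Apply each (action name, argument) step in order.
        for name, arg in steps:
            if name not in self.actions:
                raise KeyError(f"no plugin registered for action {name!r}")
            page = self.actions[name](page, arg)
        return page


def click(page: Page, selector: str) -> Page:
    # Record the click instead of touching a real browser.
    return {**page, "clicked": page.get("clicked", []) + [selector]}


def type_text(page: Page, text: str) -> Page:
    return {**page, "typed": page.get("typed", []) + [text]}


pipeline = ActionPipeline()
pipeline.register("click", click)
pipeline.register("type", type_text)
result = pipeline.run({}, [("click", "#search"), ("type", "AI papers")])
print(result)
```

Because every action shares one signature, swapping or adding a plugin only changes the registry, not the code that runs the steps.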
Browser profiles
Persist cookies, sessions, and extensions across runs for authenticated workflows.
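As a rough illustration of what persisting state across runs involves, the sketch below saves cookies to a JSON file inside a profile directory and reloads them on the next run. The file name `cookies.json`, the directory layout, and both function names are assumptions made for this example, not Webwright's storage format.

```python
import json
import tempfile
from pathlib import Path
from typing import Dict, List

Cookie = Dict[str, str]


def save_profile(profile_dir: Path, cookies: List[Cookie]) -> None:
    # Hypothetical layout: one cookies.json file per profile directory.
    profile_dir.mkdir(parents=True, exist_ok=True)
    (profile_dir / "cookies.json").write_text(json.dumps(cookies, indent=2))


def load_profile(profile_dir: Path) -> List[Cookie]:
    # A profile that has never been saved starts with no cookies.
    path = profile_dir / "cookies.json"
    return json.loads(path.read_text()) if path.exists() else []


with tempfile.TemporaryDirectory() as tmp:
    profile = Path(tmp) / "work-profile"
    first_run = load_profile(profile)    # nothing saved yet
    save_profile(profile, [{"name": "session", "value": "abc123", "domain": "example.com"}])
    second_run = load_profile(profile)   # the session survives the "restart"
```

A real browser profile also holds local storage, extensions, and cache, so a production setup would point the browser at a persistent profile directory rather than serialize cookies by hand.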
Headless and headed mode
Toggle between invisible automation and visible debugging with one flag.
Odysseys score: 60.1%
Webwright scores 60.1% on the Odysseys benchmark, nearly doubling base GPT-5.4.
Base GPT-5.4: 33.5%
Standard GPT-5.4 without the framework achieves only 33.5% on the same benchmark.
Key improvements
Better navigation, error recovery, and multi-step reasoning drive the gains.
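The "nearly doubling" claim can be checked from the two scores quoted above. This short calculation uses only those two numbers. The variable names are just for illustration.

```python
webwright_score = 60.1   # Odysseys score with Webwright, in percent
base_score = 33.5        # Odysseys score for base GPT-5.4, in percent

absolute_gain = webwright_score - base_score   # percentage points
ratio = webwright_score / base_score           # multiplicative improvement
relative_gain = (ratio - 1) * 100              # percent improvement over base

print(f"{absolute_gain:.1f} points, {ratio:.2f}x, +{relative_gain:.1f}%")
# prints: 26.6 points, 1.79x, +79.4%
```

The framework adds 26.6 percentage points, a factor of about 1.79, which is close to but short of a full doubling.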
Install Webwright
Run npm install -g @microsoft/webwright to install the CLI tool.
Configure your agent
Set up a config file with browser profile, model, and action plugins.
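The article does not show the config file's format, so the following is only a sketch of what such a config might contain, built and validated in Python and serialized to JSON. Every field name (`model`, `browser_profile`, `headless`, `actions`) and the model identifier string are assumptions. Check the tool's own documentation for the real schema.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List


@dataclass
class AgentConfig:
    """Hypothetical config shape; not the actual Webwright schema."""
    model: str
    browser_profile: str
    headless: bool = True
    actions: List[str] = field(
        default_factory=lambda: ["click", "type", "scroll", "extract"]
    )

    def validate(self) -> None:
        # Fail early on settings that would make a run meaningless.
        if not self.model:
            raise ValueError("model must be set")
        if not self.actions:
            raise ValueError("at least one action plugin is required")


config = AgentConfig(model="gpt-5.4", browser_profile="work-profile")
config.validate()
config_json = json.dumps(asdict(config), indent=2)
print(config_json)
```

Validating before writing the file means a typo in the config surfaces immediately instead of partway through a browsing task.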
Run your first task
Use webwright run --task 'search for AI papers' to execute a web task.
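For scripted or CI use, the same command can be launched from Python. The sketch below relies only on the `run` subcommand and `--task` flag quoted above. The helper names `build_command` and `run_task` are hypothetical, and the exit-code behavior of the CLI is not documented in this article.

```python
import shutil
import subprocess
from typing import List, Optional


def build_command(task: str) -> List[str]:
    # Pass the task as its own list element so no shell quoting is needed.
    return ["webwright", "run", "--task", task]


def run_task(task: str) -> Optional[int]:
    # Return None when the CLI is not installed instead of crashing.
    if shutil.which("webwright") is None:
        return None
    completed = subprocess.run(build_command(task), check=False)
    return completed.returncode


command = build_command("search for AI papers")
print(command)
```

Calling `run_task("search for AI papers")` would then launch the CLI if it is on the `PATH`. Building the argument list separately keeps the command easy to log and test.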
Monitor and debug
View logs, screenshots, and action traces in real time.
Claude Code plugin
Use the Webwright plugin to let Claude browse the web during coding sessions.
Automated research
Claude can fetch docs, read API references, and scrape competitor sites.
State persistence
Share browser state between Claude turns for multi-step workflows.
Custom actions
Write your own action plugins to handle site-specific interactions.
Why Webwright beats raw GPT
- Raw GPT fails on multi-step web tasks due to context loss
- Webwright's pipeline keeps state and retries on errors
- Terminal-native means it works in any dev environment
- Modular design lets you swap models without rewriting tasks
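The "keeps state and retries on errors" point in the list above can be illustrated with a small retry loop. The article does not describe Webwright's internals, so this is a generic sketch of the pattern. The function `run_with_retries`, the dictionary-based state, and the simulated flaky step are all hypothetical.

```python
from typing import Callable, Dict, List

State = Dict[str, str]
Step = Callable[[State], State]


def run_with_retries(steps: List[Step], max_attempts: int = 3) -> State:
    """Run steps in order, carrying state forward and retrying failures."""
    state: State = {}
    for step in steps:
        for attempt in range(1, max_attempts + 1):
            try:
                # Pass a copy so a failed attempt cannot corrupt saved state.
                state = step(dict(state))
                break
            except RuntimeError:
                if attempt == max_attempts:
                    raise
    return state


calls = {"search": 0}


def open_site(state: State) -> State:
    return {**state, "url": "https://example.com"}


def flaky_search(state: State) -> State:
    # Simulate a step that fails once (e.g. element not yet rendered).
    calls["search"] += 1
    if calls["search"] < 2:
        raise RuntimeError("element not found")
    return {**state, "query": "AI papers"}


final_state = run_with_retries([open_site, flaky_search])
print(final_state)
```

The state from `open_site` survives the failed first attempt at `flaky_search`, so the second attempt does not restart the task from scratch.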
Key Takeaways
1. Webwright runs entirely in the terminal with no GUI dependencies
2. It achieves 60.1% on Odysseys vs 33.5% for base GPT-5.4
3. The framework uses a modular action pipeline for web navigation
4. You can extend Webwright with custom plugins and browser profiles
5. It supports headless and headed mode for debugging
6. Webwright integrates directly with Claude Code for agentic workflows