Webwright: A Terminal-Native Web Agent Framework That Scores 60.1% on Odysseys

Microsoft's Webwright is a terminal-native framework for running AI agents that operate a web browser. It scores 60.1% on Odysseys, about 1.8 times base GPT-5.4's 33.5%. Here is how it works and how to use it.

1. Core Architecture

Terminal-native

Runs entirely from the terminal and needs no GUI of its own, which suits CI/CD pipelines and remote servers.

Modular action pipeline

Each web action is a plugin: click, type, scroll, extract. Easy to customize.
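
As a rough illustration of the plugin idea, the sketch below registers actions by name and dispatches a list of steps to them. The `ACTIONS` registry, the `action` decorator, and the `Page` stand-in are hypothetical names invented for this example; they are not Webwright's actual API.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Page:
    """Hypothetical stand-in for a browser page; it only records what was done."""
    log: List[str] = field(default_factory=list)


# Hypothetical registry mapping an action name to its handler.
ACTIONS: Dict[str, Callable[..., None]] = {}


def action(name: str):
    """Register a handler under `name` so the pipeline can look it up."""
    def decorator(fn):
        ACTIONS[name] = fn
        return fn
    return decorator


@action("click")
def click(page: Page, selector: str) -> None:
    page.log.append(f"click {selector}")


@action("type")
def type_text(page: Page, selector: str, text: str) -> None:
    page.log.append(f"type {text!r} into {selector}")


def run_pipeline(page: Page, steps: List[dict]) -> None:
    """Dispatch each step to the plugin registered for its action name."""
    for step in steps:
        args = dict(step)  # copy so the caller's step is not mutated
        handler = ACTIONS[args.pop("action")]
        handler(page, **args)


page = Page()
run_pipeline(page, [
    {"action": "type", "selector": "#q", "text": "AI papers"},
    {"action": "click", "selector": "#search"},
])
print(page.log)
```

Adding a new action in this sketch means writing one decorated function; the pipeline itself does not change.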

Browser profiles

Persist cookies, sessions, and extensions across runs for authenticated workflows.
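
To make the persistence idea concrete, here is a minimal sketch that stores cookies as JSON in a per-profile directory and reloads them on the next run. The `cookies.json` file name, the directory layout, and both function names are assumptions for illustration; Webwright's real profile format is not described in this article.

```python
import json
import tempfile
from pathlib import Path


def save_profile(profile_dir: Path, cookies: list) -> None:
    """Write the cookies to a JSON file inside the profile directory."""
    profile_dir.mkdir(parents=True, exist_ok=True)
    (profile_dir / "cookies.json").write_text(json.dumps(cookies))


def load_profile(profile_dir: Path) -> list:
    """Return saved cookies, or an empty list for a fresh profile."""
    path = profile_dir / "cookies.json"
    return json.loads(path.read_text()) if path.exists() else []


with tempfile.TemporaryDirectory() as tmp:
    profile = Path(tmp) / "work-account"  # hypothetical profile name
    first_run = load_profile(profile)     # nothing saved yet
    save_profile(profile, [{"name": "session", "value": "abc123", "domain": "example.com"}])
    second_run = load_profile(profile)    # the session cookie is available again

print(first_run, second_run)
```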

πŸ‘οΈ

Headless and headed mode

Toggle between invisible automation and visible debugging with one flag.
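
A one-flag toggle of this kind could look like the sketch below, which defaults to headless and switches to a visible browser when a flag is passed. The `--headed` flag name and the `agent-demo` program name are hypothetical; the article does not say what Webwright's flag is called.

```python
import argparse


def parse_args(argv):
    """Turn command-line arguments into browser launch options."""
    parser = argparse.ArgumentParser(prog="agent-demo")
    # Hypothetical flag name; the real Webwright flag may differ.
    parser.add_argument("--headed", action="store_true", help="show the browser window")
    args = parser.parse_args(argv)
    return {"headless": not args.headed}


ci_options = parse_args([])               # default: invisible automation
debug_options = parse_args(["--headed"])  # visible window for debugging
print(ci_options, debug_options)
```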

2. Performance Benchmarks

Odysseys score: 60.1%

Webwright scores 60.1% on the Odysseys benchmark, about 1.8 times base GPT-5.4's score.

Base GPT-5.4: 33.5%

Standard GPT-5.4 without the framework achieves only 33.5% on the same benchmark.
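
The size of the gap can be checked from the two reported scores alone. The short calculation below uses only the 60.1% and 33.5% figures quoted above.

```python
webwright_score = 60.1  # Odysseys score with Webwright, in percent
baseline_score = 33.5   # Odysseys score for base GPT-5.4, in percent

# Absolute gain in percentage points.
gain_points = round(webwright_score - baseline_score, 1)

# Ratio of the two scores and the relative improvement over the baseline.
ratio = round(webwright_score / baseline_score, 2)
relative_gain_pct = round((webwright_score / baseline_score - 1) * 100, 1)

print(gain_points, ratio, relative_gain_pct)
```

The result is a gain of 26.6 percentage points, or roughly 1.8 times the baseline, which is the basis for the "nearly doubling" description.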

Key improvements

Better navigation, error recovery, and multi-step reasoning drive the gains.

3. Getting Started
Step 1: Install Webwright

Run npm install -g @microsoft/webwright to install the CLI tool.

Step 2: Configure your agent

Set up a config file with browser profile, model, and action plugins.
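
The article does not show the config format, so the following is only a guess at its shape: a JSON document with one entry each for the browser profile, the model, and the action plugins, plus a small loader that rejects incomplete files. The key names `profile`, `model`, and `actions`, and the values shown, are hypothetical.

```python
import json

# Hypothetical key names; the real config schema is not given in the article.
REQUIRED_KEYS = {"profile", "model", "actions"}

example_config = """
{
  "profile": "work-account",
  "model": "gpt-5.4",
  "actions": ["click", "type", "scroll", "extract"]
}
"""


def load_config(text: str) -> dict:
    """Parse the config and fail early if a required key is missing."""
    config = json.loads(text)
    missing = REQUIRED_KEYS - config.keys()
    if missing:
        raise ValueError(f"missing config keys: {sorted(missing)}")
    return config


config = load_config(example_config)
print(config["actions"])
```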

Step 3: Run your first task

Use webwright run --task 'search for AI papers' to execute a web task.
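
If you want to launch tasks from a script rather than by hand, a thin wrapper like the one below may be enough. It uses only the `webwright run --task` form quoted above and assumes nothing else about the CLI; `build_command` and `run_task` are helper names made up for this sketch.

```python
import shlex
import shutil
import subprocess


def build_command(task: str) -> list:
    """Build the argument list; only `run --task` comes from the article."""
    return ["webwright", "run", "--task", task]


def run_task(task: str) -> int:
    """Run the task if the CLI is installed and return its exit code."""
    cmd = build_command(task)
    if shutil.which(cmd[0]) is None:
        print("webwright not found on PATH; would run:", shlex.join(cmd))
        return 127  # conventional "command not found" exit code
    return subprocess.run(cmd, check=False).returncode


command = build_command("search for AI papers")
print(shlex.join(command))
```

Passing the task as a separate list element, rather than building one shell string, avoids quoting problems when the task text contains spaces or quotes.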

Step 4: Monitor and debug

View logs, screenshots, and action traces in real time.
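
An action trace is easiest to inspect when each step is written as one structured record. The sketch below writes JSON lines and then filters for failed steps; the field names and both helper functions are hypothetical and do not reflect Webwright's actual log format.

```python
import io
import json
import time


def record_action(stream, step: int, action: str, ok: bool, detail: str = "") -> None:
    """Append one trace entry as a single JSON line."""
    entry = {"ts": time.time(), "step": step, "action": action, "ok": ok, "detail": detail}
    stream.write(json.dumps(entry) + "\n")


def failed_steps(lines) -> list:
    """Return the step numbers of every entry that did not succeed."""
    return [entry["step"] for entry in map(json.loads, lines) if not entry["ok"]]


trace = io.StringIO()  # stands in for a log file
record_action(trace, 1, "click", True)
record_action(trace, 2, "extract", False, "selector not found")
failures = failed_steps(trace.getvalue().splitlines())
print(failures)
```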

4. Integration with Claude Code

Claude Code plugin

Use the Webwright plugin to let Claude browse the web during coding sessions.

Automated research

Claude can fetch docs, read API references, and scrape competitor sites.

State persistence

Share browser state between Claude turns for multi-step workflows.

Custom actions

Write your own action plugins to handle site-specific interactions.

Why Webwright beats raw GPT

  • Raw GPT fails on multi-step web tasks due to context loss
  • Webwright's pipeline keeps state and retries on errors
  • Terminal-native means it works in any dev environment
  • Modular design lets you swap models without rewriting tasks
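
The "keeps state and retries on errors" point can be sketched as a loop that passes one shared state dictionary through every step and retries a step when it fails. This is a generic retry pattern written for illustration, not Webwright's implementation; `run_with_retries` and the two demo steps are hypothetical.

```python
from typing import Callable, Dict, List


def run_with_retries(steps: List[Callable[[Dict], None]], state: Dict, max_attempts: int = 3) -> Dict:
    """Run each step against the shared state, retrying a step that raises."""
    for step in steps:
        for attempt in range(1, max_attempts + 1):
            try:
                step(state)
                break  # step succeeded, move on to the next one
            except RuntimeError as err:
                # Keep a record of the failure in the shared state.
                state.setdefault("errors", []).append(
                    f"{step.__name__} attempt {attempt}: {err}"
                )
                if attempt == max_attempts:
                    raise  # give up after the last allowed attempt
    return state


def open_results(state: Dict) -> None:
    state["url"] = "https://example.com/results"


def extract_title(state: Dict) -> None:
    # Simulate a flaky step that fails on its first call only.
    state["tries"] = state.get("tries", 0) + 1
    if state["tries"] < 2:
        raise RuntimeError("element not ready")
    state["title"] = f"title from {state['url']}"


final_state = run_with_retries([open_results, extract_title], {})
print(final_state)
```

Because the URL set by the first step is still in the state when the second step is retried, the retry does not have to start the task over.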

Key Takeaways

  1. Webwright runs entirely in the terminal with no GUI dependencies
  2. It achieves 60.1% on Odysseys vs 33.5% for base GPT-5.4
  3. The framework uses a modular action pipeline for web navigation
  4. You can extend Webwright with custom plugins and browser profiles
  5. It supports headless and headed mode for debugging
  6. Webwright integrates directly with Claude Code for agentic workflows