Merged
27 changes: 26 additions & 1 deletion scripts/eval_gemini.ts
@@ -16,6 +16,7 @@ import {TestServer} from '../build/tests/server.js';

 const ROOT_DIR = path.resolve(import.meta.dirname, '..');
 const SCENARIOS_DIR = path.join(import.meta.dirname, 'eval_scenarios');
+const SKILL_PATH = path.join(ROOT_DIR, 'skills', 'chrome-devtools', 'SKILL.md');

 // Define schema for our test scenarios
 export interface CapturedFunctionCall {
@@ -49,6 +50,7 @@ async function runSingleScenario(
   server: TestServer,
   modelId: string,
   debug: boolean,
+  includeSkill: boolean,
 ): Promise<void> {
   const debugLog = (...args: unknown[]) => {
     if (debug) {
@@ -67,6 +69,17 @@ async function runSingleScenario(
   const loadedScenario = await loadScenario(absolutePath);
   const scenario = {...loadedScenario};

+  // Prepend skill content if requested
+  if (includeSkill) {
+    if (!fs.existsSync(SKILL_PATH)) {
+      throw new Error(
+        `Skill file not found at ${SKILL_PATH}. Please ensure the skill file exists.`,
+      );
+    }
+    const skillContent = fs.readFileSync(SKILL_PATH, 'utf-8');
+    scenario.prompt = `${skillContent}\n\n---\n\n${scenario.prompt}`;
+  }
+
   // Append random queryid to avoid caching issues and test distinct runs
   const randomId = Math.floor(Math.random() * 1000000);
   scenario.prompt = `${scenario.prompt}\nqueryid=${randomId}`;
@@ -180,13 +193,18 @@ async function main() {
         type: 'boolean',
         default: false,
       },
+      'include-skill': {
+        type: 'boolean',
+        default: false,
+      },
     },
     allowPositionals: true,
   });

   const modelId = values.model;
   const debug = values.debug;
   const repeat = values.repeat;
+  const includeSkill = values['include-skill'];

   const scenarioFiles =
     positionals.length > 0
@@ -211,7 +229,14 @@ async function main() {
             `Running scenario: ${path.relative(ROOT_DIR, scenarioPath)} (Run ${i}/3)`,
           );
         }
-        await runSingleScenario(scenarioPath, apiKey, server, modelId, debug);
+        await runSingleScenario(
+          scenarioPath,
+          apiKey,
+          server,
+          modelId,
+          debug,
+          includeSkill,
+        );
         console.log(`✔ ${path.relative(ROOT_DIR, scenarioPath)} (Run ${i})`);
         successCount++;
       } catch (e) {
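
The prompt assembly that this diff adds to `runSingleScenario` can be read as a pure function. The following is a minimal sketch for illustration, not code from the PR: `composePrompt` and its parameters are hypothetical names, and the skill text is passed in as an argument instead of being read from `SKILL_PATH`.

```typescript
// Hypothetical helper (not part of the PR): mirrors how the diff builds
// scenario.prompt, but receives the skill text instead of reading a file.
function composePrompt(
  prompt: string,
  queryId: number,
  skillContent?: string,
): string {
  // Prepend the skill content, separated by a horizontal rule, only when given.
  const withSkill =
    skillContent === undefined
      ? prompt
      : `${skillContent}\n\n---\n\n${prompt}`;
  // Append the query id, which the script uses to avoid caching between runs.
  return `${withSkill}\nqueryid=${queryId}`;
}

// Same id range as the script: an integer in [0, 1000000).
const randomId = Math.floor(Math.random() * 1000000);
// The prompt and skill strings below are placeholders.
console.log(composePrompt('List the console errors.', randomId, '# Skill'));
```

Keeping the file read outside such a function would make the skill and no-skill variants easy to compare in isolation; whether that split is worthwhile for the eval script is a design choice left to the authors.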
44 changes: 44 additions & 0 deletions skills/chrome-devtools/SKILL.md
@@ -0,0 +1,44 @@
---
name: chrome-devtools
description: Uses Chrome DevTools via MCP for efficient debugging, troubleshooting and browser automation. Use when debugging web pages, automating browser interactions, analyzing performance, or inspecting network requests.
---

## Core Concepts

**Browser lifecycle**: The browser starts automatically on the first tool call and uses a persistent Chrome profile. Configure it with CLI args in the MCP server configuration; run `npx chrome-devtools-mcp@latest --help` to list the available options.

**Page selection**: Tools operate on the currently selected page. Use `list_pages` to see available pages, then `select_page` to switch context.

**Element interaction**: Use `take_snapshot` to get the page structure. Each element in the snapshot has a unique `uid` that you pass to interaction tools. If an element isn't found, take a fresh snapshot: the element may have been removed or the page may have changed.

## Workflow Patterns

### Before interacting with a page

1. Navigate: `navigate_page` or `new_page`
2. Wait: `wait_for` to ensure content is loaded, if you know what you are looking for
3. Snapshot: `take_snapshot` to understand page structure
4. Interact: Use element `uid`s from snapshot for `click`, `fill`, etc.
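
The four steps above can be sketched as an ordered list of tool calls. This is a minimal illustration, not the MCP client API: the tool names come from this skill, while `ToolCall`, `planFill`, the argument names (`url`, `text`, `uid`, `value`), and the sample `uid` are assumptions.

```typescript
// A tool call as a plain data object. Tool names come from the skill text;
// the argument names (url, text, uid, value) are assumptions for illustration.
interface ToolCall {
  name: string;
  args: Record<string, unknown>;
}

// Hypothetical helper: returns the four steps in the recommended order.
function planFill(
  url: string,
  waitText: string,
  uid: string,
  value: string,
): ToolCall[] {
  return [
    {name: 'navigate_page', args: {url}},
    {name: 'wait_for', args: {text: waitText}},
    {name: 'take_snapshot', args: {}},
    // In practice the uid is read from the snapshot returned by the previous step.
    {name: 'fill', args: {uid, value}},
  ];
}

// '1_5' is a placeholder uid, not a value taken from a real snapshot.
const plan = planFill('https://example.com/login', 'Sign in', '1_5', 'user@example.com');
console.log(plan.map(call => call.name).join(' -> '));
// prints: navigate_page -> wait_for -> take_snapshot -> fill
```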

### Efficient data retrieval

- Use `filePath` parameter for large outputs (screenshots, snapshots, traces)
- Use pagination (`pageIdx`, `pageSize`) and filtering (`types`) to minimize data
- Set `includeSnapshot: false` on input actions unless you need updated page state

### Tool selection

- **Automation/interaction**: `take_snapshot` (text-based, faster, better for automation)
- **Visual inspection**: `take_screenshot` (when user needs to see visual state)
- **Additional details**: `evaluate_script` for data not in accessibility tree

### Parallel execution

You can send independent tool calls in parallel, but dependent steps must keep their order: navigate → wait → snapshot → interact.
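
One way to picture this rule is to group calls into ordered batches: calls inside a batch may go out together, while batches run one after another. The sketch below is illustrative only; `StagedCall`, `toBatches`, and the assignment of each tool to a stage are assumptions, not part of `chrome-devtools-mcp`.

```typescript
// Stage names follow the order given in the skill text.
const STAGES = ['navigate', 'wait', 'snapshot', 'interact'] as const;
type Stage = (typeof STAGES)[number];

interface StagedCall {
  stage: Stage; // chosen by the caller; the mapping is an assumption
  tool: string;
}

// Hypothetical helper: groups calls into batches in stage order. Calls in one
// batch are treated as independent; batches must be sent sequentially.
function toBatches(calls: StagedCall[]): StagedCall[][] {
  return STAGES.map(stage => calls.filter(call => call.stage === stage)).filter(
    batch => batch.length > 0,
  );
}

const batches = toBatches([
  {stage: 'interact', tool: 'click'},
  {stage: 'snapshot', tool: 'take_snapshot'},
  {stage: 'navigate', tool: 'navigate_page'},
  {stage: 'snapshot', tool: 'take_screenshot'},
]);
console.log(
  batches.map(batch => batch.map(call => call.tool).join(' + ')).join(' -> '),
);
// prints: navigate_page -> take_snapshot + take_screenshot -> click
```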

## Troubleshooting

If `chrome-devtools-mcp` is insufficient, point users to the Chrome DevTools UI:

- https://developer.chrome.com/docs/devtools
- https://developer.chrome.com/docs/devtools/ai-assistance