32 changes: 32 additions & 0 deletions CONTRIBUTING.md
@@ -87,3 +87,35 @@ You can use the `DEBUG` environment variable as usual to control categories that
### Updating documentation

When adding a new tool or updating a tool name or description, make sure to run `npm run docs` to generate the tool reference documentation.

### Contributing to Evals

We use Gemini to evaluate the MCP server tools. Eval scenarios live in `scripts/eval_scenarios`; each scenario is a TypeScript file that exports a `scenario` object implementing `TestScenario`.

- **prompt**: The prompt to send to the model.
- **maxTurns**: Maximum number of conversation turns.
- **expectations**: A function that verifies the tool calls made by the model.
- **htmlRoute** (optional): Serves custom HTML content for the test at a specific path.
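Putting the fields together, a scenario has roughly this shape. This is a sketch only: the `ToolCall` type and the `htmlRoute` shape are assumptions here, so check the harness (`eval_gemini.ts`) for the authoritative definitions.

```ts
// Hypothetical sketch of the types implied by the field list above;
// the real interfaces are defined in the eval harness.
interface ToolCall {
  name: string;                     // tool name, e.g. 'browse_page'
  args: Record<string, unknown>;    // arguments the model passed
}

interface TestScenario {
  prompt: string;                   // prompt sent to the model
  maxTurns: number;                 // maximum conversation turns
  expectations: (calls: ToolCall[]) => void; // throws on failure
  htmlRoute?: {path: string; html: string}; // optional custom HTML (shape assumed)
}
```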

We aim to test that the tools are used correctly without overly rigid assertions. Avoid asserting exact argument values when they can vary (e.g., natural-language reasoning), but verify that the core parameters (such as URLs or selectors) are correct.

Example:

```ts
import {TestScenario} from '../eval_gemini.js';

export const scenario: TestScenario = {
prompt: 'Navigate to example.com',
maxTurns: 2,
expectations: calls => {
// Check that at least one call was 'browse_page'
const navigation = calls.find(c => c.name === 'browse_page');
if (!navigation) throw new Error('Model did not browse the page');
// Verify essential args
if (navigation.args.url !== 'http://example.com') {
throw new Error(`Wrong URL: ${navigation.args.url}`);
}
},
};
```
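When an argument can legitimately vary (for example, `http` vs `https`, or a trailing slash), a pattern check keeps the eval from being brittle. A hypothetical helper, `checkNavigationUrl`, that could replace the exact-match comparison in the expectation above:

```ts
// Hypothetical loosened check: accept http or https and an optional
// trailing slash, instead of requiring one exact string.
function checkNavigationUrl(url: string): boolean {
  return /^https?:\/\/example\.com\/?$/.test(url);
}
```

Inside `expectations`, you would then throw only when `checkNavigationUrl(String(navigation.args.url))` returns `false`.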