Evaluate MCP servers with the Model Context Protocol Benchmark Runner (MCPBR). Operations teams can benchmark server performance against real GitHub issues. The tool connects to MCP servers and GitHub and reports quantifiable metrics for optimization.
git clone https://github.com/greynewell/mcpbr.git
1. **Prepare Your Environment:** Ensure you have MCPBR installed (`pip install mcpbr`) and authenticated with both GitHub and your MCP server. Verify the server is running and accessible via the MCP Inspector or CLI.

2. **Select a Benchmark Issue:** Choose a representative GitHub issue from your repository that exercises the server's capabilities (e.g., a complex issue with multiple comments, code snippets, and labels). Copy the issue URL for the [GITHUB_REPO_ISSUE_URL] placeholder.

3. **Run the Benchmark:** Execute the MCPBR command with your server details:

   ```bash
   mcpbr benchmark --server github-repo-analyzer --issue https://github.com/owner/repo/issues/42 --baseline local-github-bridge
   ```

   For advanced configurations, use the `--config` flag to specify custom parameters like context window size or timeout values.

4. **Analyze the Report:** Review the generated report (JSON and visualizations) to identify performance bottlenecks. Focus on metrics like latency, error rates, and context utilization that deviate from your targets.

5. **Iterate and Optimize:** Apply the recommendations from the report to your server configuration. Re-run the benchmark after each optimization to validate improvements. Use the `--compare` flag to generate delta reports between runs:

   ```bash
   mcpbr benchmark --server github-repo-analyzer --issue https://github.com/owner/repo/issues/42 --compare /tmp/mcpbr/github-repo-analyzer-20240501-100000.json
   ```

**Pro Tips:**

- Run benchmarks during off-peak hours to minimize external API rate limiting (e.g., GitHub API throttling).
- Use the `--warmup` flag to pre-load the server with common queries before benchmarking for more consistent results.
- For teams managing multiple MCP servers, maintain a shared configuration file (e.g., `mcpbr-config.json`) to standardize benchmarking parameters across servers.
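When the same benchmark is re-run after each optimization, it can help to assemble the command line in one place so the baseline run and the comparison run stay consistent. The following is a minimal sketch, not part of MCPBR: `build_benchmark_command` is a hypothetical helper, and the subcommand and flags (`benchmark`, `--server`, `--issue`, `--baseline`, `--compare`) are copied from steps 3 and 5 above and have not been checked against the tool's actual interface. It only builds and prints the command; it does not run it.

```python
import shlex
from typing import List, Optional


def build_benchmark_command(
    server: str,
    issue_url: str,
    baseline: Optional[str] = None,
    compare: Optional[str] = None,
) -> List[str]:
    """Assemble the argument list for a benchmark run.

    Assumption: the flag names mirror the commands shown on this page;
    confirm them against `mcpbr --help` before relying on this.
    """
    argv = ["mcpbr", "benchmark", "--server", server, "--issue", issue_url]
    if baseline is not None:
        argv += ["--baseline", baseline]
    if compare is not None:
        argv += ["--compare", compare]
    return argv


issue = "https://github.com/owner/repo/issues/42"

# First run: measure against the baseline server (step 3).
first_run = build_benchmark_command(
    "github-repo-analyzer", issue, baseline="local-github-bridge"
)

# Later run: compare against a previously saved report (step 5).
rerun = build_benchmark_command(
    "github-repo-analyzer",
    issue,
    compare="/tmp/mcpbr/github-repo-analyzer-20240501-100000.json",
)

# shlex.join quotes arguments safely for display or copy-paste.
print(shlex.join(first_run))
print(shlex.join(rerun))
```

Keeping the arguments as a list (rather than a single string) means the same value can be passed directly to `subprocess.run` without shell quoting concerns.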
No install command available. Check the GitHub repository for manual installation instructions.
git clone https://github.com/greynewell/mcpbr
Copy the install command above and run it in your terminal.
Launch Claude Code, Cursor, or your preferred AI coding agent.
Use the prompt template or examples below to test the skill.
Adapt the skill to your specific use case and workflow.
Run the MCPBR (Model Context Protocol Benchmark Runner) against the [MCP_SERVER_NAME] server using the [GITHUB_REPO_ISSUE_URL] issue as a benchmark. Focus on the following metrics: response latency, context window utilization, and error rate. Generate a report comparing the results to the baseline performance of [BASELINE_SERVER_NAME]. Include specific recommendations for optimizing the server configuration based on the findings.
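The prompt template above has three bracketed placeholders. A minimal sketch of filling them programmatically is shown here; `fill_prompt` and the shortened `PROMPT_TEMPLATE` are illustrative names and text, not part of the skill, and the example values are the ones used elsewhere on this page. The helper raises an error when a placeholder has no value, so a half-filled prompt is never sent to the agent.

```python
import re
from typing import Dict

# Matches placeholders such as [MCP_SERVER_NAME].
PLACEHOLDER = re.compile(r"\[([A-Z][A-Z0-9_]*)\]")

# Shortened version of the prompt template, kept to the three placeholders.
PROMPT_TEMPLATE = (
    "Run the MCPBR (Model Context Protocol Benchmark Runner) against the "
    "[MCP_SERVER_NAME] server using the [GITHUB_REPO_ISSUE_URL] issue as a "
    "benchmark. Generate a report comparing the results to the baseline "
    "performance of [BASELINE_SERVER_NAME]."
)


def fill_prompt(template: str, values: Dict[str, str]) -> str:
    """Replace every [PLACEHOLDER] in the template with its value.

    Raises KeyError if the template contains a placeholder that has
    no entry in `values`.
    """
    def substitute(match: "re.Match[str]") -> str:
        name = match.group(1)
        if name not in values:
            raise KeyError(f"no value supplied for placeholder [{name}]")
        return values[name]

    return PLACEHOLDER.sub(substitute, template)


prompt = fill_prompt(
    PROMPT_TEMPLATE,
    {
        "MCP_SERVER_NAME": "github-repo-analyzer",
        "GITHUB_REPO_ISSUE_URL": "https://github.com/owner/repo/issues/42",
        "BASELINE_SERVER_NAME": "local-github-bridge",
    },
)
print(prompt)
```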
### MCPBR Benchmark Report: MCP Server Performance Evaluation

**Benchmark Run ID:** 20240515-143022
**MCP Server Evaluated:** `github-repo-analyzer` (v1.2.3)
**Benchmark Issue:** https://github.com/owner/repo/issues/42
**Baseline Server:** `local-github-bridge` (v1.0.1)
**Evaluation Period:** 2024-05-15 14:30:00 - 14:45:00 UTC

#### Performance Metrics Comparison

| Metric                     | github-repo-analyzer | local-github-bridge | Delta | Target |
|----------------------------|----------------------|---------------------|-------|--------|
| Avg. Response Latency      | 187ms                | 245ms               | -24%  | <200ms |
| Context Window Utilization | 89%                  | 67%                 | +33%  | >85%   |
| Error Rate                 | 0.8%                 | 2.1%                | -62%  | <1%    |
| Token Throughput           | 1,240 tokens/sec     | 980 tokens/sec      | +27%  | >1,000 |

#### Key Findings

1. **Response Latency:** The `github-repo-analyzer` server demonstrated a 24% faster response time compared to the baseline, primarily due to optimized caching of frequently accessed repository metadata. The average latency of 187ms meets the target of <200ms, with 95% of requests resolving in under 250ms.
2. **Context Window Utilization:** The server achieved 89% context window utilization, significantly outperforming the baseline (67%). This indicates more efficient handling of large GitHub issues with minimal truncation. The high utilization suggests the server is leveraging the full 128K token context window effectively.
3. **Error Rate:** The error rate of 0.8% is a 62% improvement over the baseline (2.1%). The most common errors were related to rate-limiting during peak GitHub API calls, which were mitigated by implementing exponential backoff in the latest version.

#### Optimization Recommendations

- **Caching Strategy:** Implement a Redis-based cache for repository metadata to further reduce latency. Current caching is in-memory only and resets on server restart.
- **Rate Limiting:** Add a configurable rate limiter to prevent GitHub API throttling during high-traffic periods. The current implementation lacks this safeguard.
- **Context Truncation:** While context utilization is high, consider adding a dynamic truncation policy for extremely large issues (>50K tokens) to prevent memory bloat.
- **Monitoring:** Deploy Prometheus metrics exporter to track these metrics in real-time. The current benchmarking is manual and lacks historical trend analysis.

#### Next Steps

1. Deploy the recommended caching improvements and re-run the benchmark to validate the 10-15% latency reduction.
2. Monitor the error rate over the next 7 days to ensure the rate-limiting changes resolve the remaining issues.
3. Schedule a quarterly benchmarking review to track performance against evolving GitHub issue structures and MCP server updates.

**Benchmark Artifacts:**

- Raw metrics: `/tmp/mcpbr/github-repo-analyzer-20240515-143022.json`
- Performance graphs: `/tmp/mcpbr/github-repo-analyzer-20240515-143022.png`
- Configuration diff: `/tmp/mcpbr/config-changes.patch`
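The Delta column in the sample report is the relative change of the evaluated server against the baseline, and the Target column is a simple threshold. As a rough sketch of how those two columns can be reproduced from the raw values in the table (the `METRICS` structure and function names here are illustrative assumptions, not MCPBR's report schema):

```python
from typing import Callable, List, Tuple


def percent_delta(candidate: float, baseline: float) -> float:
    """Relative change of candidate versus baseline, in percent."""
    return (candidate - baseline) / baseline * 100


# (metric name, evaluated server value, baseline value, target check)
# Values are taken from the sample report's comparison table.
METRICS: List[Tuple[str, float, float, Callable[[float], bool]]] = [
    ("Avg. Response Latency (ms)", 187, 245, lambda v: v < 200),
    ("Context Window Utilization (%)", 89, 67, lambda v: v > 85),
    ("Error Rate (%)", 0.8, 2.1, lambda v: v < 1),
    ("Token Throughput (tokens/sec)", 1240, 980, lambda v: v > 1000),
]

for name, candidate, baseline, meets_target in METRICS:
    delta = percent_delta(candidate, baseline)
    status = "meets target" if meets_target(candidate) else "misses target"
    # "+.0f" rounds to a whole percent and always shows the sign.
    print(f"{name}: {delta:+.0f}% vs baseline, {status}")
```

Note that a negative delta is an improvement for latency and error rate but would be a regression for utilization and throughput, so the sign alone does not say whether a change is good.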