agtrace provides visibility into AI agent actions, helping operations teams monitor and debug agent behavior. It connects to Claude agents and integrates with existing workflows to track agent activities, improving reliability and performance.
git clone https://github.com/lanegrid/agtrace.gitagtrace provides visibility into AI agent actions, helping operations teams monitor and debug agent behavior. It connects to Claude agents and integrates with existing workflows to track agent activities, improving reliability and performance.
["Set up agtrace monitoring for your AI agent by integrating it with your agent's runtime environment. Use the provided SDK or CLI to start tracing agent actions.","Define the scope of analysis by specifying the agent name, time range, and tasks/actions to monitor. Use the prompt template to customize your query.","Run the analysis and review the output. Focus on critical and high-severity issues first, as these have the most significant impact on performance.","Implement the suggested fixes or optimizations in your agent's codebase. Use agtrace's real-time monitoring to validate the changes.","Schedule regular reviews (e.g., weekly) to monitor agent behavior and identify new anomalies. Adjust the monitoring scope as your agent evolves."]
No install command available. Check the GitHub repository for manual installation instructions.
git clone https://github.com/lanegrid/agtraceCopy the install command above and run it in your terminal.
Launch Claude Code, Cursor, or your preferred AI coding agent.
Use the prompt template or examples below to test the skill.
Adapt the skill to your specific use case and workflow.
Use agtrace to analyze the execution logs of my AI agent [AGENT_NAME] from [START_DATE] to [END_DATE]. Identify any anomalies or failures in the agent's behavior, particularly around [SPECIFIC_TASKS_OR_ACTIONS]. Provide a summary of the root causes for each anomaly and suggest concrete fixes or optimizations. Format the output as a prioritized list with severity levels (Critical/High/Medium/Low).
## AI Agent Execution Analysis Report (June 1-15, 2024) **Agent:** `CustomerSupportBot` (v2.3.1) **Scope:** Order status inquiry handling **Total Executions:** 1,247 **Critical Issues:** 3 | **High Issues:** 8 | **Medium Issues:** 15 ### Critical Issues (Immediate Action Required) 1. **Order Status API Timeout** - **Severity:** Critical - **Occurrences:** 12 (0.96% failure rate) - **Root Cause:** External API `orders.status.v2` returned 503 errors in 68% of cases during peak hours (2-4 PM EST). - **Impact:** Customers received "Unable to retrieve order status" errors, leading to 23 escalations to human agents. - **Fix:** Implemented retry logic with exponential backoff (max 3 attempts). Added circuit breaker pattern to prevent cascading failures. Deployed on 2024-06-12. **Status:** Resolved (0 failures since deployment). 2. **Database Deadlock** - **Severity:** Critical - **Occurrences:** 8 (0.64% failure rate) - **Root Cause:** Concurrent writes to `user_orders` table during high-traffic periods caused deadlocks. Identified in `postgresql.log` at 2024-06-07 14:23:45. - **Impact:** Agent froze for 45 seconds during each deadlock event, causing 100+ users to abandon their sessions. - **Fix:** Added database index on `order_id` and implemented application-level locking. Deployed on 2024-06-08. **Status:** Resolved (0 deadlocks since deployment). ### High Issues (Schedule for Next Sprint) 1. **Slow Response Time** - **Severity:** High - **Occurrences:** 28 (2.25% of executions) - **Root Cause:** Agent spent 85% of execution time waiting for `payment.gateway.status` API (avg response time: 4.2s). - **Impact:** 95th percentile response time exceeded 6s, violating SLA (max 5s). - **Fix:** Cached payment status for 30 seconds to reduce API calls. Implemented parallel processing for status checks. Target deployment: 2024-06-20. 2. **Incorrect Order Status** - **Severity:** High - **Occurrences:** 15 (1.2% failure rate) - **Root Cause:** Race condition in `update_order_status` function caused stale data to be returned in 60% of cases. - **Impact:** 15 customers received incorrect status updates, leading to 5 complaints. - **Fix:** Added transactional consistency checks and implemented idempotency keys. Target deployment: 2024-06-18. ### Recommendations - **Monitoring:** Add alerts for API response times >3s and database query times >1s. - **Testing:** Implement load testing for 2x peak traffic to validate fixes. - **Documentation:** Update agent runbook with troubleshooting steps for API timeouts and deadlocks. **Next Steps:** 1. Deploy fixes for High Issues by June 20. 2. Schedule a post-mortem for Critical Issues to prevent recurrence. 3. Review agent architecture for potential improvements in scalability.
Task routing between AI and human workers
Metagenomic analysis for microbiome research
Cloud ETL platform for non-technical data integration
Get more done every day with Microsoft Teams – powered by AI
Customer feedback management made simple
Enterprise workflow automation and service management platform
Take a free 3-minute scan and get personalized AI skill recommendations.
Take free scan