agent-pr-replay


Agent PR Replay takes merged PRs from any repository, reverse-engineers the task prompt, runs Claude Code against it, and compares what the agent did with what humans actually shipped. The result is targeted, empirical guidance for improving agent prompts and workflows.


Intermediate · 30 min to implement · automation

Saves ~240 min per use

Quick Install

git clone https://github.com/sshh12/agent-pr-replay.git

Works with:

Claude

How to Use

1. **Identify Target PRs:** Select a recently merged PR from your repository that represents a non-trivial task (e.g., feature implementation, bug fix, or optimization). Focus on PRs with clear descriptions or commit messages.
2. **Run the Analysis:** Use the prompt template under Usage Examples, replacing [PR_NUMBER] and [REPO_URL] with the actual values. For best results, include the PR title and description in your input to help the AI infer the original prompt accurately.
3. **Review the Comparison:** Examine the AI-generated output against the human implementation. Pay attention to differences in approach, efficiency, and edge cases handled. Use tools like `git diff` to inspect the actual changes in the PR.
4. **Extract Insights:** Document the key differences and their implications. Focus on patterns (e.g., 'AI agents often propose simpler solutions that don't account for edge cases' or 'Humans over-engineer solutions when real-time updates are required').
5. **Refine Workflows:** Use the insights to improve your prompt engineering practices. For example, if AI agents consistently miss real-time requirements, add explicit constraints to your prompts. Share findings with your team to standardize prompt templates.
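As a rough illustration of the comparison in step 3, the sketch below lists which files the human PR and the agent run each touched, given two unified diffs as text. It is not part of agent-pr-replay; the function names `changed_files` and `compare_diffs` are hypothetical, and it assumes git-style diffs where each file's new path appears on a `+++ b/<path>` header line (deleted files, which use `+++ /dev/null`, are ignored).

```python
from __future__ import annotations

import re


def changed_files(diff_text: str) -> set[str]:
    """Return the paths touched by a git-style unified diff.

    Assumption: each file's new path appears as '+++ b/<path>'.
    Deleted files ('+++ /dev/null') are not reported.
    """
    files = set()
    for line in diff_text.splitlines():
        match = re.match(r"^\+\+\+ b/(.+)$", line)
        if match:
            files.add(match.group(1))
    return files


def compare_diffs(human_diff: str, agent_diff: str) -> dict[str, set[str]]:
    """Group touched files by who changed them (hypothetical helper)."""
    human = changed_files(human_diff)
    agent = changed_files(agent_diff)
    return {
        "both": human & agent,          # files both versions modified
        "human_only": human - agent,    # files the agent never touched
        "agent_only": agent - human,    # files only the agent touched
    }
```

Files that appear under `human_only` are often a useful starting point for step 4, since they may point at tests, migrations, or configuration the agent did not know it needed to change. The diff text itself can be captured from `git diff` output.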

Setup & Installation

Quick Install

No one-line install command is available. Clone the repository as shown below and check the GitHub repository for any further installation instructions.

Alternative Install (Git Clone)

git clone https://github.com/sshh12/agent-pr-replay

Requirements

Claude Code or compatible AI agent

Quick Start Guide

Install the Skill

Copy the install command above and run it in your terminal.

Open Your AI Agent

Launch Claude Code, Cursor, or your preferred AI coding agent.

Try It Out

Use the prompt template or examples below to test the skill.

Customize

Adapt the skill to your specific use case and workflow.

Usage Examples

Prompt Template

I need to analyze the merged PR [PR_NUMBER] from [REPO_URL]. Extract the original task prompt that triggered the PR (if not explicitly stated, infer it from the PR description and code changes). Then, simulate what an AI agent would have produced when given that prompt using Claude Code. Compare the AI-generated output with the actual human implementation in the PR. Highlight key differences in approach, quality, and efficiency. Finally, provide actionable recommendations for improving future AI agent prompts or workflows based on these insights.
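If the template is reused across many PRs, filling the placeholders by hand gets error-prone. The following is a minimal sketch of a helper that substitutes `[PR_NUMBER]` and `[REPO_URL]` and refuses to return a prompt that still contains an unfilled placeholder. The name `fill_template` and the choice to render the number as `#142` are assumptions for illustration, not part of the skill.

```python
import re


def fill_template(template: str, pr_number: int, repo_url: str) -> str:
    """Substitute the two placeholders used by the prompt template.

    Hypothetical helper: raises ValueError if any [UPPER_CASE]
    placeholder is still present after substitution.
    """
    filled = template.replace("[PR_NUMBER]", f"#{pr_number}")
    filled = filled.replace("[REPO_URL]", repo_url)

    # Catch typos such as [PR_NUM] or placeholders added to the template later.
    leftover = re.findall(r"\[[A-Z_]+\]", filled)
    if leftover:
        raise ValueError(f"unfilled placeholders: {leftover}")
    return filled
```

For example, `fill_template("Analyze PR [PR_NUMBER] from [REPO_URL].", 142, "https://github.com/acme/data-pipelines")` yields a prompt that names PR #142 and the repository URL from the example output.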

Example Output

Analyzing PR #142 from the `data-pipelines` repository (https://github.com/acme/data-pipelines/pull/142), which merged a new feature to optimize the ETL process for customer transaction data. The PR description stated: 'Add a new column to the transactions table that calculates the rolling 7-day average purchase amount per customer, partitioned by customer_id.'

**Inferred Original Prompt:**
'Write a SQL query to add a computed column to the transactions table that calculates the 7-day rolling average of purchase amounts, grouped by customer_id. The column should be named rolling_7day_avg and should be updated automatically as new transactions are inserted.'

**AI Agent Simulation (Claude Code Output):**
```sql
ALTER TABLE transactions
ADD COLUMN rolling_7day_avg DECIMAL(10,2);

-- Create a materialized view for the rolling average
CREATE MATERIALIZED VIEW customer_rolling_avg AS
SELECT 
    customer_id,
    AVG(amount) OVER (
        PARTITION BY customer_id 
        ORDER BY transaction_date 
        ROWS BETWEEN 6 PRECEDING AND CURRENT ROW
    ) AS rolling_7day_avg
FROM transactions;
```

**Human Implementation (Actual PR):**
```sql
-- Added a trigger-based solution
CREATE OR REPLACE FUNCTION update_rolling_avg()
RETURNS TRIGGER AS $$
BEGIN
    UPDATE transactions 
    SET rolling_7day_avg = (
        SELECT AVG(amount) 
        FROM transactions t2 
        WHERE t2.customer_id = NEW.customer_id 
        AND t2.transaction_date BETWEEN NEW.transaction_date - INTERVAL '7 days' AND NEW.transaction_date
    )
    WHERE id = NEW.id;
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER trg_update_rolling_avg
AFTER INSERT ON transactions
FOR EACH ROW EXECUTE FUNCTION update_rolling_avg();
```

**Key Differences:**
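1. **Approach:** The AI agent proposed a materialized view, which would require periodic refreshes, while the human implemented a row-level trigger that computes the average at insert time.
2. **Freshness:** The trigger sets the value for each new row at write time, while the materialized view is stale until it is refreshed. The agent's `ALTER TABLE` also adds a `rolling_7day_avg` column that nothing in its output populates.
3. **Window semantics:** The agent's `ROWS BETWEEN 6 PRECEDING AND CURRENT ROW` averages the last seven rows per customer, not the last seven days; the human's filter on `transaction_date` is time-based.
4. **Complexity:** The human solution is written in PL/pgSQL, a PostgreSQL-specific procedural language, whereas the agent stayed with plain SQL.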
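One further difference is easy to miss when reading the two SQL snippets side by side: the agent's window is defined with `ROWS BETWEEN 6 PRECEDING AND CURRENT ROW`, which counts rows, while the human trigger filters on a `transaction_date` interval, which counts days. The sketch below reproduces both rules in plain Python for a single customer so the divergence can be seen on sparse data. The function names and the sample transactions are invented for illustration, and it assumes at most one transaction per day.

```python
from datetime import date, timedelta


def rolling_avg_by_rows(txns, n_rows=7):
    """Average the current row and up to n_rows - 1 preceding rows.

    Mirrors a ROWS-based window frame: it ignores how far apart
    the rows are in time.
    """
    txns = sorted(txns)
    out = []
    for i, (day, _) in enumerate(txns):
        window = [amt for _, amt in txns[max(0, i - n_rows + 1): i + 1]]
        out.append((day, sum(window) / len(window)))
    return out


def rolling_avg_by_days(txns, days=7):
    """Average rows dated within [day - days, day], inclusive.

    Mirrors the trigger's BETWEEN filter on transaction_date.
    """
    txns = sorted(txns)
    out = []
    for day, _ in txns:
        start = day - timedelta(days=days)
        window = [amt for d, amt in txns if start <= d <= day]
        out.append((day, sum(window) / len(window)))
    return out


# Hypothetical purchases for one customer, with a gap of almost three weeks.
sample = [
    (date(2024, 1, 1), 100.0),
    (date(2024, 1, 20), 200.0),
    (date(2024, 1, 21), 300.0),
]
```

On `sample`, the row-based rule averages all three purchases for January 21 (200.0), while the day-based rule averages only the two purchases from the preceding week (250.0). A prompt that asks for a "7-day rolling average" may therefore need to state explicitly that the window is time-based.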

**Actionable Recommendations:**
1. **Prompt Refinement:** Future prompts should specify whether real-time updates are required (e.g., 'Ensure the rolling average is updated in real-time as new transactions are inserted').
2. **Database Constraints:** The prompt should include database-specific constraints (e.g., 'Use PostgreSQL triggers for real-time updates').
3. **Testing Requirements:** Add a requirement to test performance under high write loads, as triggers can impact insert performance.
4. **Documentation:** Include a note about the trade-offs between materialized views and triggers in the prompt to guide the agent's approach.