Automatically analyzes CSV files to generate summary statistics, detect missing data, and create quick visualizations. Operations teams benefit from faster data insights and improved data quality checks. Connects to Python and pandas for data processing.
git clone https://github.com/coffeefuelbump/csv-data-summarizer-claude-skill.git
1. **Prepare your data**: Save your CSV file locally or upload it to cloud storage (Google Drive, Dropbox). Ensure the file has a clear header row with column names.
2. **Run the analysis**: Copy the prompt template and replace [FILE_PATH_OR_URL] with your actual file path (e.g., 'C:/data/sales.csv' or 'https://example.com/data.csv'). Paste it into your AI assistant.
3. **Review the output**: The AI will generate a report with statistics, visualizations, and recommendations. Save this output for your records.
4. **Take action**: Use the identified anomalies and patterns to:
   - Clean your dataset (fix missing values, remove duplicates)
   - Prioritize data collection improvements
   - Generate follow-up reports on specific subsets
5. **Automate for regular checks**: For ongoing monitoring, save the prompt as a script in your preferred environment (VS Code, Jupyter) and run it weekly against updated CSV files. Add error handling for missing files and format issues.

**Pro Tips:**
- For large files (>100MB), pre-filter your data or sample it first
- Include [COLUMN_NAMES] in the prompt to focus on specific columns
- Use [DATA_QUALITY_THRESHOLD=0.1] to adjust missing data sensitivity (default 5%)
- Add 'export to PDF' to the prompt if you need a shareable report
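The automation step and the large-file and threshold tips can be sketched in plain pandas. This is a minimal illustration, not the skill's own code: `load_csv`, `columns_over_threshold`, `LARGE_FILE_BYTES`, and the sample file name are hypothetical names chosen for this example, and the loader only handles local paths.

```python
import os
import tempfile

import pandas as pd

# Assumption: mirrors the ">100MB" tip; adjust to your own limits.
LARGE_FILE_BYTES = 100 * 1024 * 1024


def load_csv(path, sample_rows=100_000):
    """Return a DataFrame, or None when the file is missing or unreadable."""
    try:
        # For large files, read only the first sample_rows rows
        # (a head sample, not a random one).
        nrows = sample_rows if os.path.getsize(path) > LARGE_FILE_BYTES else None
        return pd.read_csv(path, nrows=nrows)
    except FileNotFoundError:
        print(f"File not found: {path}")
    except (pd.errors.EmptyDataError, pd.errors.ParserError) as exc:
        print(f"Could not parse {path}: {exc}")
    return None


def columns_over_threshold(df, threshold=0.05):
    """Map column name -> missing fraction for columns above the threshold."""
    missing = df.isna().mean()
    return missing[missing > threshold].to_dict()


# Small demonstration with a temporary file standing in for a weekly export.
with tempfile.TemporaryDirectory() as tmp:
    csv_path = os.path.join(tmp, "weekly.csv")
    with open(csv_path, "w") as fh:
        fh.write("id,score\n1,4.0\n2,\n3,5.0\n4,3.0\n")
    weekly = load_csv(csv_path)
    missing_file = load_csv(os.path.join(tmp, "absent.csv"))

flagged = columns_over_threshold(weekly, threshold=0.1)
print(flagged)
```

Returning `None` instead of raising keeps a scheduled weekly run from crashing on a single bad file; a stricter pipeline might prefer to re-raise after logging.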
git clone https://github.com/coffeefuelbump/csv-data-summarizer-claude-skill
Copy the install command above and run it in your terminal.
Launch Claude Code, Cursor, or your preferred AI coding agent.
Use the prompt template or examples below to test the skill.
Adapt the skill to your specific use case and workflow.
Analyze the CSV file at [FILE_PATH_OR_URL]. Generate a summary report that includes: (1) key statistics (mean, median, min, max, quartiles) for all numeric columns, (2) missing data analysis with percentages and patterns, (3) top 5 most frequent values for categorical columns, and (4) a simple bar chart visualization for the top categorical column. Highlight any anomalies or data quality issues. Use Python with pandas and matplotlib/seaborn for the analysis.
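For orientation, the analysis this prompt requests maps onto a handful of pandas calls. The sketch below is one possible shape, not the code the assistant will necessarily produce: `summarize`, `plot_top_category`, and the inline sample data are hypothetical, and every non-numeric column is treated as categorical, which is a simplification.

```python
import io

import pandas as pd


def summarize(df, top_n=5):
    """Collect the statistics the prompt asks for into plain structures."""
    numeric = df.select_dtypes(include="number")
    # Simplification: everything non-numeric is treated as categorical.
    categorical = df.select_dtypes(exclude="number")

    # describe() yields count, mean, std, min, 25%, 50% (median), 75%, max.
    numeric_stats = numeric.describe().T if not numeric.empty else pd.DataFrame()

    # Share of missing cells per column, as a percentage.
    missing_pct = (df.isna().mean() * 100).round(2)

    top_values = {
        col: categorical[col].value_counts().head(top_n)
        for col in categorical.columns
    }
    return {
        "numeric_stats": numeric_stats,
        "missing_pct": missing_pct,
        "top_values": top_values,
    }


def plot_top_category(df, column, path="top_category.png", top_n=5):
    """Save a bar chart of the most frequent values in one column."""
    import matplotlib
    matplotlib.use("Agg")  # render to a file without needing a display
    import matplotlib.pyplot as plt

    ax = df[column].value_counts().head(top_n).plot(kind="bar")
    ax.set_xlabel(column)
    ax.set_ylabel("count")
    ax.figure.tight_layout()
    ax.figure.savefig(path)
    plt.close(ax.figure)


# Tiny inline dataset so the sketch runs without an external file.
SAMPLE_CSV = """order_id,revenue,region
1,100.0,West
2,200.0,West
3,,South
4,300.0,
"""
sample = pd.read_csv(io.StringIO(SAMPLE_CSV))
report = summarize(sample)
print(report["numeric_stats"])
print(report["missing_pct"])
```

Calling `plot_top_category(sample, "region")` would write the chart to `top_category.png`; matplotlib is imported inside the function so the statistics part works even where plotting is unavailable.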
### CSV Data Summary Report: sales_data_2023.csv

**Dataset Overview:**
- Total rows: 12,487
- Total columns: 14 (8 numeric, 6 categorical)
- Date range: 2023-01-01 to 2023-12-31

**Numeric Columns Summary:**
1. **revenue** (USD):
   - Mean: $12,450 | Median: $8,920 | Min: $0 | Max: $98,750
   - 78% of transactions below $15,000 (long tail distribution)
   - 3 outliers detected (>$50,000); IDs: 4582, 7891, 11245
2. **quantity_sold**:
   - Mean: 4.2 | Median: 3 | Min: 1 | Max: 200
   - 92% of orders ≤10 units (skewed right)
3. **customer_satisfaction_score**:
   - Mean: 4.1/5 | Median: 4.2 | Missing: 12% (1,498 rows)
   - Strong correlation with revenue (r=0.68)

**Categorical Columns Summary:**
1. **product_category**:
   - Top 5 categories: Electronics (32%), Clothing (28%), Home & Kitchen (15%), Books (12%), Sports (8%)
   - Electronics shows highest average revenue ($18,200 vs $9,800 overall)
2. **region**:
   - West Coast: 28% | Northeast: 25% | South: 22% | Midwest: 18% | International: 7%
   - Missing: 5% (624 rows)

**Missing Data Analysis:**
- **High Priority:** customer_satisfaction_score (12%), region (5%), product_weight (3%)
- **Pattern Detected:** Missing satisfaction scores correlate with low-value orders (<$100)
- **Recommendation:** Investigate data collection process for these fields

**Visualization:**
[Bar chart showing product_category distribution with Electronics leading at 32%]

**Anomalies Detected:**
1. 47 orders with $0 revenue (likely data entry errors)
2. 12 transactions with quantity_sold >100 (possible bulk order data entry)
3. 8 duplicate order IDs detected

**Recommendations:**
1. Clean $0 revenue records by verifying with source system
2. Review bulk order data entry process
3. Address missing satisfaction scores for low-value orders
4. Standardize region field (e.g., 'West' vs 'Western Region')
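Anomaly checks of the kind listed in this sample report (zero-revenue rows, duplicate order IDs, high outliers) could be reproduced along the following lines. The report does not say how its outliers were detected, so the 1.5 × IQR rule here is an assumption, as are the `find_anomalies` helper, the `order_id` column name, and the toy data.

```python
import pandas as pd


def find_anomalies(df, id_col="order_id", value_col="revenue"):
    """Return IDs of rows that look suspicious under three simple checks."""
    values = df[value_col]

    # Assumed outlier rule: anything above Q3 + 1.5 * IQR.
    q1, q3 = values.quantile(0.25), values.quantile(0.75)
    upper = q3 + 1.5 * (q3 - q1)

    return {
        "zero_value_ids": df.loc[values == 0, id_col].tolist(),
        # duplicated() marks the second and later occurrences of an ID.
        "duplicate_ids": sorted(df.loc[df[id_col].duplicated(), id_col].unique().tolist()),
        "high_outlier_ids": df.loc[values > upper, id_col].tolist(),
    }


# Toy data: one zero-revenue order, one repeated ID, one very large order.
orders = pd.DataFrame({
    "order_id": [1, 2, 3, 4, 5, 5, 6],
    "revenue": [100, 120, 0, 110, 130, 130, 5000],
})
anomalies = find_anomalies(orders)
print(anomalies)
```

Flagged IDs are candidates for review against the source system rather than rows to delete automatically, in line with the report's recommendations.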