Spidercreator automates the generation of web scraping spiders using Browser Use and LLMs, enabling enterprises to create Playwright-based spiders with minimal coding. Perfect for organizations with ongoing data extraction needs, it streamlines the scraping process significantly.
Typical use cases:

- Automating data collection from e-commerce sites
- Extracting competitor pricing data
- Gathering market research data from various sources
- Monitoring changes in web content
```bash
claude install carlosplanchon/spidercreator
# or clone the repository directly:
git clone https://github.com/carlosplanchon/spidercreator
```

1. Copy the install command above and run it in your terminal.
2. Launch Claude Code, Cursor, or your preferred AI coding agent.
3. Use the prompt template or examples below to test the skill.
4. Adapt the skill to your specific use case and workflow.
Create a Playwright-based web scraping spider for [COMPANY] in the [INDUSTRY] sector. The spider should extract [DATA] from [WEBSITE]. Provide the complete code and explain how to run it. Ensure the spider handles pagination and respects robots.txt.
# Web Scraping Spider for Retail Price Monitoring
## Overview
This spider is designed to extract product prices and availability from an e-commerce website for competitive analysis. It uses Playwright to automate browser interactions and handles pagination to ensure comprehensive data collection.
## Spider Code
```python
import asyncio
from playwright.async_api import async_playwright


async def scrape_products():
    async with async_playwright() as p:
        # Run headless for efficiency; set headless=False to debug visually
        browser = await p.chromium.launch(headless=True)
        page = await browser.new_page()
        await page.goto('https://example-retailer.com/products')
        # Wait for the products to load
        await page.wait_for_selector('.product-item')
        products = []
        while True:
            # Extract product data from the current page
            items = await page.query_selector_all('.product-item')
            for item in items:
                # ElementHandle.inner_text() takes no selector argument,
                # so look up each child element first
                name_el = await item.query_selector('.product-name')
                price_el = await item.query_selector('.product-price')
                avail_el = await item.query_selector('.availability')
                products.append({
                    'name': await name_el.inner_text() if name_el else None,
                    'price': await price_el.inner_text() if price_el else None,
                    'availability': await avail_el.inner_text() if avail_el else None,
                })
            # Check for next page
            next_button = await page.query_selector('.next-page')
            if not next_button:
                break
            await next_button.click()
            # Wait for the next page of products to render
            await page.wait_for_load_state('networkidle')
            await page.wait_for_selector('.product-item')
        await browser.close()
        return products


# Run the spider and print the results
if __name__ == '__main__':
    for product in asyncio.run(scrape_products()):
        print(product)
```
## How to Run the Spider
1. Install the required dependencies:
```bash
pip install playwright
playwright install
```
2. Save the code above in a file named `retail_spider.py`.
3. Run the spider using Python:
```bash
python retail_spider.py
```
4. The spider will output a list of products with their names, prices, and availability.
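If you want to persist the results rather than just print them, a minimal sketch writes the scraped list to a JSON file (the sample records below are hypothetical, shaped like the spider's output):

```python
import json

# Hypothetical records shaped like the spider's output
products = [
    {'name': 'Widget', 'price': '$9.99', 'availability': 'In stock'},
    {'name': 'Gadget', 'price': '$19.99', 'availability': 'Sold out'},
]

# Persist the results for downstream analysis
with open('products.json', 'w', encoding='utf-8') as f:
    json.dump(products, f, indent=2, ensure_ascii=False)
```

The same file can later be loaded with `json.load` or into a spreadsheet tool for comparison over time.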
## Notes
- Before deploying, confirm the pages you target are allowed by the website's `robots.txt`; this minimal example does not check it automatically, so add a check (for example with Python's `urllib.robotparser`) for production use.
- It handles pagination automatically to ensure all products are scraped.
- The spider is designed to run in headless mode for efficiency, but it can be configured to run in headed mode for debugging.
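A `robots.txt` check can be sketched with Python's standard `urllib.robotparser`. The rules below are parsed inline purely for demonstration; a real spider would instead point `set_url()` at the target site's `robots.txt` (the `example-retailer.com` URLs are placeholders):

```python
from urllib import robotparser

# Offline demo: parse robots.txt rules directly. In a real spider you would
# call rp.set_url('https://example-retailer.com/robots.txt') and rp.read().
rp = robotparser.RobotFileParser()
rp.parse([
    'User-agent: *',
    'Disallow: /private/',
])

print(rp.can_fetch('*', 'https://example-retailer.com/products'))   # True
print(rp.can_fetch('*', 'https://example-retailer.com/private/x'))  # False
```

Calling `can_fetch()` before each `page.goto()` keeps the spider from visiting disallowed paths.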