This skill provides tools and guidance for extracting text from PDFs in formats suitable for language model consumption. It is designed for developers and data scientists working with PDFs.
$ npx skills add https://github.com/letta-ai/skills --skill extracting-pdf-text
The PDF text extraction skill enables developers and data scientists to extract readable text from PDF documents in formats suitable for language model consumption. Located in the tools/pdf category of the Letta skills repository, it provides practical guidance and tools for handling PDF processing workflows. The skill addresses the common challenge of preparing PDF content for AI agents and LLMs by offering extraction methods that preserve document structure while removing formatting noise. This skill integrates with Letta Code and Claude Code frameworks through the skills system, allowing agents to dynamically load PDF processing capabilities on demand.
Install with npx (bundled with npm) by running `npx skills add https://github.com/letta-ai/skills --skill extracting-pdf-text`.
Extract simple text from PDF documents using PyMuPDF.
Retrieve tabular data from PDFs with pdfplumber.
Process scanned or image-based PDFs using OCR.
Implement a complete RAG pipeline using marker-pdf.
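As a rough illustration of the first three use cases, the sketch below reads a PDF's text layer with PyMuPDF, pulls tables with pdfplumber, and renders a table as Markdown for an LLM prompt. It is not taken from the skill itself: the function names, the `looks_scanned` heuristic, and its `min_chars` threshold are assumptions made for this example, and the library imports are deferred so the pure helpers work without either package installed.

```python
from typing import List, Optional

Table = List[List[Optional[str]]]


def extract_plain_text(pdf_path: str) -> str:
    """Concatenate the text layer of every page using PyMuPDF."""
    import fitz  # PyMuPDF; imported lazily so the other helpers work without it

    with fitz.open(pdf_path) as doc:
        return "\n\n".join(page.get_text() for page in doc)


def extract_tables(pdf_path: str) -> List[Table]:
    """Collect every table pdfplumber detects, page by page."""
    import pdfplumber  # imported lazily, see above

    tables: List[Table] = []
    with pdfplumber.open(pdf_path) as pdf:
        for page in pdf.pages:
            tables.extend(page.extract_tables())
    return tables


def table_to_markdown(rows: Table) -> str:
    """Render one extracted table as Markdown; None cells become empty."""
    if not rows:
        return ""
    clean = [[(cell or "").replace("\n", " ").strip() for cell in row] for row in rows]
    header, body = clean[0], clean[1:]
    lines = [
        "| " + " | ".join(header) + " |",
        "| " + " | ".join("---" for _ in header) + " |",
    ]
    lines += ["| " + " | ".join(row) + " |" for row in body]
    return "\n".join(lines)


def looks_scanned(text: str, min_chars: int = 20) -> bool:
    """Hypothetical heuristic: a near-empty text layer suggests OCR is needed."""
    return len(text.strip()) < min_chars
```

A caller might run `extract_plain_text` first and fall back to an OCR tool of its choice only when `looks_scanned` returns true, since OCR is typically slower than reading an existing text layer.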
git clone https://github.com/letta-ai/skills
Copy the install command above and run it in your terminal.
Launch Claude Code, Cursor, or your preferred AI coding agent.
Use the prompt template or examples below to test the skill.
Adapt the skill to your specific use case and workflow.
Check the GitHub repository or documentation for usage examples.