This skill uses Tesseract.js to extract readable text from images, returning the full text content along with word-level bounding boxes and confidence scores. It is designed for users who need to convert text in images into editable text.
$ npx skills add https://github.com/pascalorg/skills --skill image-to-text
Image to Text is an OCR skill powered by Tesseract.js that extracts all readable text from images and returns structured data including full text content, word-level bounding boxes, and confidence scores. It works by segmenting images into lines and words, making it ideal for reading screenshots, extracting UI copy from design mockups, and obtaining precise text positioning information. The skill supports multiple languages and provides confidence metrics to help assess extraction accuracy. Common use cases include converting design mockups to editable text without manual retyping, extracting UI labels and button text, and analyzing text layout and positioning in images.
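The confidence metrics mentioned above are typically used to discard unreliable words before the text is reused. The following is a minimal sketch of that idea, not the skill's actual output format: the `OcrWord` shape, its field names, the 0 to 100 confidence scale, and the sample data are all assumptions made for illustration.

```typescript
// Hypothetical shape of one recognized word. The field names and the
// 0-100 confidence scale are assumptions, not the skill's documented output.
interface OcrWord {
  text: string;
  confidence: number;
  bbox: { x0: number; y0: number; x1: number; y1: number };
}

// Keep only the words whose confidence is at or above the threshold.
function filterByConfidence(words: OcrWord[], minConfidence: number): OcrWord[] {
  return words.filter((word) => word.confidence >= minConfidence);
}

// Made-up sample data standing in for an OCR result.
const sampleWords: OcrWord[] = [
  { text: "Sign", confidence: 96, bbox: { x0: 10, y0: 10, x1: 50, y1: 30 } },
  { text: "in", confidence: 91, bbox: { x0: 56, y0: 10, x1: 72, y1: 30 } },
  { text: "~|", confidence: 23, bbox: { x0: 80, y0: 12, x1: 88, y1: 28 } },
];

const reliableWords = filterByConfidence(sampleWords, 80);
console.log(reliableWords.map((word) => word.text).join(" ")); // prints "Sign in"
```

The threshold of 80 is arbitrary; a suitable value depends on image quality and on how costly a misread word is in your workflow.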
Install using the command line with the provided install command.
Reading text content from screenshots or design mockups
Extracting UI copy including labels and buttons
Getting text positions and bounding boxes from design images
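For the positioning use case, word-level boxes usually need to be merged to locate a multi-word label such as a button caption. The sketch below shows one way to do that under stated assumptions: the `BBox` and `PositionedWord` types, the corner-based coordinate convention (`x0`, `y0`, `x1`, `y1` in pixels), and the sample values are hypothetical and not taken from the skill's documentation.

```typescript
// Hypothetical bounding box: top-left (x0, y0) and bottom-right (x1, y1) in pixels.
interface BBox {
  x0: number;
  y0: number;
  x1: number;
  y1: number;
}

// Hypothetical word record carrying only what this sketch needs.
interface PositionedWord {
  text: string;
  bbox: BBox;
}

// Compute the smallest box that encloses every word in the list.
function unionBox(words: PositionedWord[]): BBox {
  const [first, ...rest] = words;
  if (first === undefined) {
    throw new Error("unionBox needs at least one word");
  }
  return rest.reduce<BBox>(
    (acc, word) => ({
      x0: Math.min(acc.x0, word.bbox.x0),
      y0: Math.min(acc.y0, word.bbox.y0),
      x1: Math.max(acc.x1, word.bbox.x1),
      y1: Math.max(acc.y1, word.bbox.y1),
    }),
    { ...first.bbox },
  );
}

// Made-up sample: the two words of a "Create account" button label.
const labelWords: PositionedWord[] = [
  { text: "Create", bbox: { x0: 20, y0: 100, x1: 80, y1: 120 } },
  { text: "account", bbox: { x0: 86, y0: 101, x1: 160, y1: 121 } },
];

const labelBox = unionBox(labelWords);
console.log(`${labelBox.x1 - labelBox.x0}x${labelBox.y1 - labelBox.y0}`); // prints "140x21"
```

The merged box gives the label's position and size in the image, which can then be compared against the intended layout in a design mockup.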
$ npx skills add https://github.com/pascalorg/skills --skill image-to-text
$ git clone https://github.com/pascalorg/skills
Copy the install command above and run it in your terminal.
Launch Claude Code, Cursor, or your preferred AI coding agent.
Use the prompt template or examples below to test the skill.
Adapt the skill to your specific use case and workflow.
Check the GitHub repository or documentation for usage examples.