A Python library for extracting text from almost any file format, including sound (!).

Textract is a Python library for pulling raw machine-readable text from pretty much anything: Excel files, PDFs, images and, yes, even sound.
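To give a feel for what that looks like in practice, here is a minimal sketch. It assumes textract is installed (`pip install textract`) along with whatever system tools a given format needs, and it relies on `textract.process`, which takes a file path and returns the extracted text as bytes. The `extract_text` wrapper and its `extractor` parameter are hypothetical names added for illustration; they are not part of textract.

```python
from pathlib import Path


def extract_text(path, extractor=None, encoding="utf-8"):
    """Return the text content of `path` as a stripped str.

    `extractor` is a hypothetical hook (not part of textract) that lets you
    swap in another backend; by default textract.process is used.
    """
    if extractor is None:
        # Imported lazily so the helper can be loaded without textract.
        import textract  # third-party package
        extractor = textract.process
    # textract.process returns bytes, so decode before handing text back.
    raw = extractor(str(Path(path)))
    return raw.decode(encoding, errors="replace").strip()
```

Calling `extract_text("report.pdf")` or `extract_text("memo.docx")` would then go through the same code path, since textract chooses a parser from the file extension. The file names here are placeholders.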
