Introduction

TEXTfromPDF is a text extraction tool for Windows XP/2000 that automates the conversion of Adobe PDF documents to text files.

One of a company's greatest assets is its intellectual capital. One form of intellectual capital is the documentation created by employees. This documentation is saved in a variety of file formats on computers throughout the enterprise. Companies need to be able to retrieve this information for search and archiving purposes. One of the methods that companies use to access this information is to extract the text from the various documents.

Text extraction is fundamental for information retrieval because text can be easily searched, archived, repurposed, and managed. Without an automated tool, gathering the text from a PDF document requires a person to select the text manually and copy and paste it elsewhere. This is slow, laborious, costly, and error-prone.

TEXTfromPDF gives a company access to this text content without requiring any Adobe product. It can automatically extract the text from thousands of Adobe PDF documents in a matter of minutes. The extracted content is saved to text files, where it can be easily searched or archived. Developers can also import the contents of these text files into a variety of databases for content management purposes.
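
As an illustration of the database import mentioned above, the following is a minimal sketch in Python that loads a folder of extracted text files into a SQLite database and runs a simple search. It does not call TEXTfromPDF itself; it only assumes the text files already exist on disk. The function names, the "documents" table, and its column names are hypothetical choices made for this example.

```python
import sqlite3
from pathlib import Path


def import_text_files(text_dir, db_path):
    """Load every .txt file directly inside text_dir into SQLite; return files loaded."""
    conn = sqlite3.connect(db_path)
    try:
        # Hypothetical schema: one row per extracted text file.
        conn.execute(
            "CREATE TABLE IF NOT EXISTS documents ("
            "filename TEXT PRIMARY KEY, content TEXT NOT NULL)"
        )
        count = 0
        for path in sorted(Path(text_dir).glob("*.txt")):
            # errors="replace" keeps the import going if a file has unexpected bytes.
            content = path.read_text(encoding="utf-8", errors="replace")
            # INSERT OR REPLACE makes re-running the import safe for the same files.
            conn.execute(
                "INSERT OR REPLACE INTO documents (filename, content) VALUES (?, ?)",
                (path.name, content),
            )
            count += 1
        conn.commit()
        return count
    finally:
        conn.close()


def search_documents(db_path, term):
    """Return filenames whose content contains term (LIKE ignores ASCII case)."""
    conn = sqlite3.connect(db_path)
    try:
        rows = conn.execute(
            "SELECT filename FROM documents WHERE content LIKE ? ORDER BY filename",
            (f"%{term}%",),
        )
        return [row[0] for row in rows.fetchall()]
    finally:
        conn.close()
```

Under these assumptions, calling import_text_files with the output folder and a database path, followed by search_documents with a search term, would list the extracted files that mention that term. The text encoding of the extracted files is assumed to be UTF-8 here and may need adjusting. Note that the characters % and _ in a search term act as wildcards in a LIKE pattern.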