Document Conversion Tool for Media Service Company
PPS PrePress Systeme GmbH is a digital paper solutions and media service provider headquartered in Germany. The company specializes in transforming paper archives into digital form, allowing full-text retrieval over decades of newspaper issues and other data. PPS delivers high-quality images and recognized text of newspaper pages, which are usually scanned from large bound volumes or from microfilm. For knowledge retrieval, PPS offers flexible and powerful tools from Convera Corp.
PPS contacted ATAPY Software to create a custom document processing tool. The goal was to perform four main tasks automatically:
- to accept and route images for OCR processing
- to use ABBYY FineReader Scripting Edition to recognize the images
- to export recognition results to a variety of document formats
- to save resulting documents in user-defined output directories
The tool designed by ATAPY detects scanned documents in the input directories specified by the user, moves them to working directories, and submits them to FineReader for recognition. Recognized documents are stored in the output directories, while problematic images go to special ‘error directories’. An important feature of the application is its ability to export each recognized document to several formats, so the user gets multiple documents from a single processing pass.
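The routing flow described above can be illustrated with a minimal Python sketch. It is not ATAPY's implementation and does not call FineReader: the `recognize` callable is a hypothetical stand-in for the OCR step, and the function name, directory arguments, and exporter mapping are all assumptions made for illustration.

```python
import shutil
from pathlib import Path
from typing import Callable, Dict

# Hypothetical exporter: takes recognized text and returns the file contents.
Exporter = Callable[[str], str]


def process_inbox(
    inbox: Path,
    work: Path,
    out: Path,
    errors: Path,
    recognize: Callable[[Path], str],  # stand-in for the OCR engine call
    exporters: Dict[str, Exporter],    # file extension -> exporter
) -> Dict[str, int]:
    """Route every file in `inbox` through OCR and export it in each format."""
    for directory in (work, out, errors):
        directory.mkdir(parents=True, exist_ok=True)

    counts = {"ok": 0, "failed": 0}
    for image in sorted(p for p in inbox.iterdir() if p.is_file()):
        # Move the image to the working directory before processing it.
        staged = work / image.name
        shutil.move(str(image), str(staged))
        try:
            text = recognize(staged)
        except Exception:
            # Problematic images are set aside in the error directory.
            shutil.move(str(staged), str(errors / staged.name))
            counts["failed"] += 1
            continue
        # One recognition pass produces one document per configured format.
        for extension, export in exporters.items():
            target = out / f"{staged.stem}.{extension}"
            target.write_text(export(text), encoding="utf-8")
        staged.unlink()
        counts["ok"] += 1
    return counts
```

Passing the recognizer in as a parameter keeps the routing logic independent of any particular OCR engine, which is one plausible way to structure such a tool; the real application may be organized differently.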
Flexibility and configurability are the key strengths of the solution designed and implemented by ATAPY engineers. For each output format, the application lets the user set individual parameters such as page size for RTF/DOC, picture resolution for PDF, codepage for HTML, etc. State-of-the-art OCR technologies from ABBYY Software House, combined with ATAPY’s engineering expertise, made the application fast, highly usable, and cost-effective.
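As a rough illustration of per-format settings, the sketch below models an export profile in Python. The option names (`page_size`, `picture_resolution_dpi`, `codepage`), the example values, and the validation rule are assumptions chosen to mirror the parameters mentioned above; they are not the tool's actual configuration schema.

```python
from dataclasses import dataclass, field
from typing import Dict, List

# Hypothetical mapping of each output format to the options it accepts.
ALLOWED_OPTIONS = {
    "rtf": {"page_size"},
    "doc": {"page_size"},
    "pdf": {"picture_resolution_dpi"},
    "html": {"codepage"},
}


@dataclass(frozen=True)
class FormatSettings:
    """Settings for one output format in an export profile."""
    extension: str
    options: Dict[str, object] = field(default_factory=dict)


def validate(settings: FormatSettings) -> None:
    """Reject unknown formats and options that do not belong to the format."""
    if settings.extension not in ALLOWED_OPTIONS:
        raise ValueError(f"unsupported format: {settings.extension}")
    unknown = set(settings.options) - ALLOWED_OPTIONS[settings.extension]
    if unknown:
        raise ValueError(
            f"{settings.extension}: unknown options {sorted(unknown)}"
        )


def describe(profile: List[FormatSettings]) -> List[str]:
    """Return one human-readable summary line per configured format."""
    lines = []
    for settings in profile:
        validate(settings)
        opts = ", ".join(f"{k}={v}" for k, v in sorted(settings.options.items()))
        lines.append(f"{settings.extension}: {opts}")
    return lines


# Example profile with illustrative values only.
EXPORT_PROFILE = [
    FormatSettings("rtf", {"page_size": "A4"}),
    FormatSettings("pdf", {"picture_resolution_dpi": 300}),
    FormatSettings("html", {"codepage": "windows-1252"}),
]
```

Keeping each format's options in its own record means a single recognition result can be exported several times with independent settings, which matches the multi-format behavior described earlier.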
About PrePress Systeme GmbH:
PPS PrePress Systeme GmbH has provided the publishing industry with state-of-the-art software solutions since 1992 and has offered newspaper archive digitization services since 1999. The company also offers a line of innovative search solutions, including the enterprise-class intelligent search system ‘inter: gator’ and the semantic search engine ‘PPS Finder’.