OCR documents


OCR Conversion and Processing Solutions

OCR conversion of all documents to Searchable PDF, MS Word, Excel, CSV, XML, HTML and other formats.

OCR (Optical Character Recognition) is the process of converting paper documents into fully editable electronic files such as Microsoft Word files, Excel spreadsheets, XML, CSV and searchable PDF. The process begins by scanning the hard-copy material (e.g. books, newspapers, documents, magazines, journals, directories and invoices) to create digital images in formats such as TIFF, PDF or JPEG. OCR is then applied to convert the images into editable text.

Our OCR scanning and conversion approach

Our specialist OCR solutions include scanning documents of various types and sizes and converting them to the client's required output format. The accuracy of OCR depends on the quality of the source documents. For example, if the documents are good-quality prints that are clear and legible, recognition accuracy can be as high as 99.99%. However, if your documents are old or faint, or contain marks or scratches, the accuracy and quality of the recognised text will be reduced accordingly.
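
To put an accuracy percentage into perspective, the sketch below treats it as a character-level measure: the share of characters in a trusted reference text that the OCR output reproduces correctly, counted with edit distance. This is an illustrative assumption about how accuracy could be measured, not a description of our own tooling, and the function names are hypothetical.

```python
def edit_distance(reference: str, ocr_text: str) -> int:
    """Levenshtein distance: fewest single-character edits turning one text into the other."""
    previous = list(range(len(ocr_text) + 1))
    for i, ref_char in enumerate(reference, start=1):
        current = [i]
        for j, ocr_char in enumerate(ocr_text, start=1):
            cost = 0 if ref_char == ocr_char else 1
            current.append(min(
                previous[j] + 1,         # character missing from the OCR text
                current[j - 1] + 1,      # extra character in the OCR text
                previous[j - 1] + cost,  # match or substituted character
            ))
        previous = current
    return previous[-1]


def character_accuracy(reference: str, ocr_text: str) -> float:
    """Character accuracy in percent: 100 * (1 - errors / reference length)."""
    if not reference:
        raise ValueError("reference text must not be empty")
    errors = edit_distance(reference, ocr_text)
    return max(0.0, 100.0 * (1 - errors / len(reference)))


def expected_errors(characters: int, accuracy_percent: float) -> float:
    """Average number of character errors for a text of the given length."""
    return characters * (1 - accuracy_percent / 100)
```

Assuming a page of roughly 2,000 characters, `expected_errors(2000, 99.99)` works out at about 0.2, or around one wrong character every five pages, while the same page at 99% accuracy would average 20 errors.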

OCR Clean up Options
For these types of poor-quality documents, we provide the following further OCR services:

  • OCR clean up
  • OCR proofreading
  • OCR formatting (layout, tables, images, fonts and pagination etc.)

OCR Conversion to Excel
Our OCR to Excel conversion services can be applied to structured (tables), semi-structured (text, tables, images etc.) or unstructured (loosely formatted) documents. For example, if you have documents printed from an Excel spreadsheet or a CRM system, bank statements, or directories containing addresses and contact details, we can convert these to a fully formatted, accurate Excel spreadsheet.

We can further process the data and convert it to file formats such as CSV, XML, Text Searchable PDF and SharePoint import.
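
As a rough illustration of this further processing, the sketch below writes a handful of recognised records out as CSV and XML using only the Python standard library. The sample rows, field names and tag names are made up for the example and do not reflect any particular client format.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Made-up records, standing in for rows recognised from a scanned directory page.
SAMPLE_ROWS = [
    {"name": "Example Ltd", "city": "Manchester", "phone": "0000 000 0000"},
    {"name": "Sample & Co", "city": "London", "phone": "0000 000 0001"},
]


def to_csv(rows, fieldnames) -> str:
    """Serialise the rows as CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


def to_xml(rows, root_tag="records", row_tag="record") -> str:
    """Serialise the rows as XML, one child element per field."""
    root = ET.Element(root_tag)
    for row in rows:
        record = ET.SubElement(root, row_tag)
        for key, value in row.items():
            # ElementTree escapes characters such as "&" when serialising.
            ET.SubElement(record, key).text = value
    return ET.tostring(root, encoding="unicode")


csv_text = to_csv(SAMPLE_ROWS, ["name", "city", "phone"])
xml_text = to_xml(SAMPLE_ROWS)
```

Keeping the records as plain dictionaries until the final step means the same cleaned data can feed every output format without being recognised again.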

We typically OCR the following:

  • Books, newspapers, magazines and manuscripts
  • Books and documents to Microsoft Word
  • Catalogues to Microsoft Excel
  • Document conversion to XML, HTML, CSV and SPSS

Our OCR Conversion process

The first step in the OCR conversion process is to assess the quality of the original documents and determine their layout and formatting. Once we have assessed the documents, we configure the OCR processing rules, carry out OCR tests and create samples for approval. We offer three levels of Optical Character Recognition conversion, depending on what you require:

OCR level 1
Suitable for plain, simply formatted documents. The output can be converted to any required file format, such as a Word, text or XML file.

OCR level 2
This is for somewhat more complex layouts which have data in tables, flow charts, differing fonts and images. If you need to keep the original layout, formatting, fonts and page numbering then we recommend level 2.

OCR level 3
The most in-depth OCR recognition and conversion level, which includes manual proof-reading and correction of any errors that may occur during the OCR process. Specific areas are double-checked, corrected and cleansed as required, giving OCR accuracy of up to 99.99%.
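
The choice between the three levels can be summarised as a simple decision rule. The sketch below is only a reading of the descriptions above, with hypothetical field names; in practice the level is agreed after assessing the documents and reviewing samples.

```python
from dataclasses import dataclass


@dataclass
class DocumentProfile:
    """Hypothetical summary of what a set of documents needs from OCR."""
    has_complex_layout: bool = False         # tables, flow charts, mixed fonts, images
    keep_original_layout: bool = False       # layout, fonts and page numbering must survive
    needs_manual_proofreading: bool = False  # errors must be checked and corrected by hand


def recommend_ocr_level(profile: DocumentProfile) -> int:
    """Return 1, 2 or 3 following the level descriptions above."""
    if profile.needs_manual_proofreading:
        return 3  # in-depth conversion with manual proof-reading
    if profile.has_complex_layout or profile.keep_original_layout:
        return 2  # complex layouts or layout-preserving output
    return 1      # plain, simply formatted documents


# A plain typed report with no special requirements.
plain_report = DocumentProfile()
# A catalogue with tables whose layout must be kept.
catalogue = DocumentProfile(has_complex_layout=True, keep_original_layout=True)
```

Here `recommend_ocr_level(plain_report)` returns 1 and `recommend_ocr_level(catalogue)` returns 2; setting `needs_manual_proofreading=True` on either profile returns 3.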



Benefits of OCR documents

Our OCR services offer many benefits, for example:

Saving Time and Cost
If you have a book in hard-copy format that you need to edit or update, re-typing it would normally take a significant amount of time. With our OCR service, we can scan the book, apply OCR and output it to an editable format such as Microsoft Word in a very short time.

If you have directories or similar listings, we can scan these and convert them to Microsoft Excel, CSV or XML format to give you complete address or contact lists. We can also import the data into your CRM, Outlook or similar systems.

Fully text searchable documents
OCR produces machine-readable, searchable documents, unlike image-only formats such as scanned PDF, TIFF and JPEG files, whose text cannot be read or searched by a computer.

OCR Recognition of different languages
With our OCR conversion service, we can process multilingual documents in languages such as English, French, German, Portuguese, Italian, Spanish, Urdu, Arabic and Russian, plus all other major world languages, subject to sample testing.



Why choose Pearl Scan for your OCR Scanning, Recognition and Processing Requirements

Here are a few reasons why you should consider Pearl Scan OCR scanning and conversion services:

  • We have a decade of experience working with the most reputable companies in the UK.
  • We are a creative and friendly team of professionals who are easy to approach.
  • We have custom built OCR scanning and conversion engines.
  • We offer a one-stop service for all your OCR solutions.
  • Our OCR service can support small to large volume projects.
  • We have OCR services available in London, Manchester and Birmingham, and online throughout the UK and Europe.
  • Our OCR scanning services are accredited to ISO 9001 (Quality Management System), ISO 27001 (Information Security Management System) and ISO 14001 (Environmental Management System), and comply with the Data Protection Act 1998.

Fast and accurate OCR    |      Converted to Searchable PDF, Word, Excel, CSV and XML    |     Books, Newspapers, Invoices and all printed documents