Document Scanning Glossary

Like any other specialist industry, document scanning uses technical terms that may not be easy to understand. Here, we have created a glossary to explain some of these phrases. If you think we have missed anything, ask a member of our team, who will be happy to help!


A0 (Paper size measurement)
Millimetres: Height: 1189 – Width: 841
Inches: Height: 46.81 – Width: 33.11

A1 (Paper size measurement)
Millimetres: Height: 841 – Width: 594
Inches: Height: 33.11 – Width: 23.39

A2 (Paper size measurement)
Millimetres: Height: 594 – Width: 420
Inches: Height: 23.39 – Width: 16.54
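
The figures above can be checked with a little arithmetic. The sketch below is only an illustration: the function names are made up for this example, and it assumes the usual rule that each A size is the previous one cut in half across its longer side, rounded down to a whole millimetre.

```python
MM_PER_INCH = 25.4  # one inch is exactly 25.4 mm


def mm_to_inches(mm: float) -> float:
    """Convert millimetres to inches, rounded to two decimal places."""
    return round(mm / MM_PER_INCH, 2)


def halve(height_mm: int, width_mm: int) -> tuple[int, int]:
    """Return the next smaller A size.

    The old width becomes the new height, and half the old height
    (rounded down) becomes the new width.
    """
    return width_mm, height_mm // 2


a0 = (1189, 841)
a1 = halve(*a0)  # (841, 594)
a2 = halve(*a1)  # (594, 420)

for name, (height, width) in {"A0": a0, "A1": a1, "A2": a2}.items():
    print(name, mm_to_inches(height), "x", mm_to_inches(width), "inches")
```

Running this reproduces the millimetre and inch values listed for A0, A1 and A2.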

ASCII
American Standard Code for Information Interchange. A plain text format in which each character is stored as a number; an ASCII text document contains no formatting information.
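
As a rough illustration of what "no formatting information" means, the Python sketch below turns a short piece of text into its ASCII codes and back again. The sample text and the helper name `is_ascii` are invented for this example.

```python
text = "Scan 42 pages"

# Each ASCII character is stored as one number between 0 and 127.
data = text.encode("ascii")
print(list(data[:4]))        # [83, 99, 97, 110] -> "S", "c", "a", "n"
print(data.decode("ascii"))  # Scan 42 pages


def is_ascii(s: str) -> bool:
    """Return True if every character in s can be stored as ASCII."""
    try:
        s.encode("ascii")
        return True
    except UnicodeEncodeError:
        return False


print(is_ascii("plain text"))  # True
print(is_ascii("café"))        # False, because of the accented letter
```

Nothing in the stored numbers records fonts, colours or layout, which is why ASCII text opens in almost any program.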

AutoCAD
A popular computer-aided drafting software package developed by Autodesk.

Backup
The action of copying important data and storing it somewhere in case the original becomes lost, damaged or stolen.

Black and White
Most standard business documents can be captured perfectly in black and white, and this is generally recommended if you intend to scan them for archival purposes.

CAD Format
A file format used for storing blueprints, plans and technical drawings. This can be AutoCAD DWG, DXF or any other CAD file format you require.

CD / DVD Duplication
The act of making duplicate CDs or DVDs using a standard CD/DVD copier. Generally used for small-scale archival purposes.

CD / DVD Replication
Making duplicates of CDs or DVDs from a glass master disk using injection moulding. Used most often for creating large quantities of copies.

Computerised Files
These can be TIFF or JPEG files, which are types of computer file used to store pictures. We also use PDF (Portable Document Format) files, which are designed to be easily printed and shared.

Cross Index
To list an item under more than one heading or category.

CSV File
(Comma Separated Values) A file containing information separated by commas, such as a list of client details. This type of file can be used to import information into other applications.
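
To give a feel for how another application might import such a file, here is a minimal Python sketch using the standard `csv` module. The column names and client details are made up for illustration, and the data is held in a string rather than a real file so the example runs on its own.

```python
import csv
import io

# Made-up client list: the first line holds the column names,
# every following line holds one client, with commas between values.
sample = (
    "name,company,phone\n"
    "Jane Smith,Example Ltd,0161 000 0000\n"
    "John Doe,Sample plc,0161 000 0001\n"
)

# DictReader uses the first line as headings for each row.
reader = csv.DictReader(io.StringIO(sample))
clients = list(reader)

for client in clients:
    print(client["name"], "works at", client["company"])
```

With a real file, `io.StringIO(sample)` would be replaced by `open("clients.csv", newline="")`, where the file name is again only an example.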

Database
An organised collection of information stored in a structured way. Once in a database, information can be searched or statistically analysed very quickly.

DGN File
A file type used by the MicroStation architectural and engineering software package to store drawing information.

Digital Format
These computerised files can be TIFF or JPEG, which are computer files used to store pictures or graphics. We also use PDF (Portable Document Format) files, which are designed for easy printing and sharing.

DjVu (pronounced "déjà vu") is a computer file format designed primarily to store scanned images, especially those containing text and line drawings.

Duplex is a term meaning to scan or print on both sides of a sheet of paper.

DWG File
(Short for Drawing) A file type used by the popular computer-aided drafting program AutoCAD to store drawing information.

DXF File
(Drawing Exchange Format) A file format that uses plain text to store drawing information.

eBook
A book that is available in a computerised format such as EPUB, Adobe PDF or the Mobipocket .mobi format.

File Format
The form in which captured data is supplied, usually database or spreadsheet format, which can be integrated into a variety of systems.

Grayscale images are monochrome, but rather than capturing only the dark and light areas of an image, grayscale captures all the shades of grey in between. This suits faded text and black and white photography.

ICR (Intelligent Character Recognition) is an advanced version of OCR (Optical Character Recognition) that is able to learn fonts during processing to improve recognition levels.

Indexing
The process of converting a collection of data into a database suitable for easy search and retrieval.

JPEG Files
JPEG/JPG is a single-page-per-file image format that is popular among graphic designers and web designers. We advise using this format if you're scanning colour images for presentation purposes.

Laser Printer
The Laser Printer is a common type of computer printer that produces high quality printing and is able to produce both text and graphics.

Microsoft Word
Microsoft Word is a popular piece of word processing software sometimes abbreviated to MS Word.

Microsoft Excel
Microsoft Excel is a popular spreadsheet program.

Microsoft Access
A popular database application by Microsoft that uses Microsoft Database (MDB) files.

MicroStation
An architectural and engineering software package developed by Bentley Systems for generating drawings.

Master disk
A disk from which multiple copies are generated.

A Network is a collection of computers sometimes connected to servers (computers that offer specific services to other computers connected to them) to allow them to share resources.

OCR (Optical Character Recognition) involves the use of computer software to translate images of typewritten text into machine-editable text.

Office Network
A collection of computers, sometimes connected to a server (computer that offers special services to other computers attached to it) to allow them to share resources.

OMR (Optical Mark Recognition) is a method of computerising input, usually from paper forms. These forms usually contain ticks or crosses, such as those you might find on a multiple choice test or questionnaire.

OCR Processing Rule Set
A customised set of rules created for a specific batch of documents to improve recognition.

PDF File
PDF is a very popular multi-page file format that is used by many companies because it can be opened by almost anybody using free software that most people already have on their computers.

PDF Links
PDF Links allow you to create buttons in your PDF documents that link from one page to another. These buttons can be text, an image, or just an area of the scanned image that you want to make clickable.

PDF Annotations
PDF Annotations allow you to add small pieces of text to an existing PDF document such as notes or appendices.

PDF Bookmarks
PDF Bookmarks are a tool that allows you to navigate easily through a PDF document by placing links to sections in a panel that appears at the side of the PDF when opened.

PDF Comments
PDF Comments are used to write notes on a PDF document. These can be useful if you're collaborating on a project with many different people.

PDF Searchable
PDF Searchable is a type of PDF file that has been processed using OCR so that its contents can be searched.

Resolution describes the level of detail an image holds: the higher the resolution, the more detail the image has. Image resolution is measured in DPI (dots per inch) or PPI (pixels per inch).
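
To see what resolution means in practice, the sketch below works out how many pixels a scan would contain for a given paper size and DPI. The function name is made up for this example, and the result is an estimate: real scanners may crop or pad the image slightly.

```python
MM_PER_INCH = 25.4  # one inch is exactly 25.4 mm


def scan_pixels(height_mm: float, width_mm: float, dpi: int) -> tuple[int, int]:
    """Estimate the pixel height and width of a scan.

    Pixels along one side = length in inches x dots per inch.
    """
    height_px = round(height_mm / MM_PER_INCH * dpi)
    width_px = round(width_mm / MM_PER_INCH * dpi)
    return height_px, width_px


# An A2 sheet (594 mm x 420 mm) scanned at 300 DPI.
print(scan_pixels(594, 420, 300))  # (7016, 4961)
```

Doubling the DPI doubles the pixel count along each side, so the image holds roughly four times as many pixels overall.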

Raw Image
These are images that have been scanned but not treated or enhanced in any way.

Raster
A method of rendering an image using pixels (tiny dots which, when printed or viewed on a computer or television screen, make up an entire image).

Simplex
A term that describes printing or scanning on only one side of the paper.

Spreadsheet
A type of application program which manipulates numerical and text information in rows and columns of cells.

Scanning
The process of turning documents into images that can be manipulated on a computer.

Scanner
A device that connects to a computer, allowing the user to turn (scan) documents into images they can manipulate on the computer.

SQL
(Structured Query Language) A computer language for retrieving and updating information held in databases.
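
As a small illustration of retrieving and updating with SQL, the Python sketch below uses the built-in `sqlite3` module with a temporary in-memory database. The table name, column names and document titles are all invented for this example.

```python
import sqlite3

# A throwaway database that lives only in memory.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE documents (id INTEGER PRIMARY KEY, title TEXT, pages INTEGER)"
)

# Add three made-up scanned documents.
conn.executemany(
    "INSERT INTO documents (title, pages) VALUES (?, ?)",
    [("Invoice 001", 2), ("Site plan", 1), ("Staff handbook", 40)],
)

# Updating: correct the page count of one document.
conn.execute("UPDATE documents SET pages = ? WHERE title = ?", (3, "Invoice 001"))

# Retrieving: list every document with more than one page.
rows = conn.execute(
    "SELECT title, pages FROM documents WHERE pages > ? ORDER BY title", (1,)
).fetchall()
conn.close()

print(rows)  # [('Invoice 001', 3), ('Staff handbook', 40)]
```

The `SELECT` and `UPDATE` statements are the SQL itself; the surrounding Python only sends them to the database and prints the answer.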

TIFF Files
A type of computer file used for storing pictures. These can be full colour, grayscale or black and white. A multipage TIFF file can contain many pages in a single file.

UV lacquer
A treatment applied to copied CDs to give a high gloss, scratch-resistant finish.

Vector
A method of rendering an image using coordinates rather than individual dots of colour. Images rendered in this way can be scaled without degradation of the image quality.

XML
(eXtensible Markup Language) A markup language related to HTML (both derive from the older SGML standard) and used for web publishing and for exchanging data. Considered to be more general and uniform than HTML.
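
The sketch below shows, in a hedged way, what a small XML file might look like and how a program could read it using Python's standard `xml.etree.ElementTree` module. The tag names (`batch`, `document`, `title`, `pages`) and the contents are made up for illustration; XML lets you choose your own tags, which is what makes it more general than HTML.

```python
import xml.etree.ElementTree as ET

# A made-up description of a batch of scanned documents.
sample = """<batch client="Example Ltd">
  <document id="1"><title>Invoice 001</title><pages>2</pages></document>
  <document id="2"><title>Site plan</title><pages>1</pages></document>
</batch>"""

root = ET.fromstring(sample)

# Collect the title of every document in the batch.
titles = [doc.findtext("title") for doc in root.findall("document")]

# Add up the page counts.
total_pages = sum(int(doc.findtext("pages")) for doc in root.findall("document"))

print(root.get("client"))  # Example Ltd
print(titles)              # ['Invoice 001', 'Site plan']
print(total_pages)         # 3
```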

Why Choose Pearl Scan?

Our EN BS ISO 9001:2005, 27001 and 14001 standards, in conjunction with in-house quality, security and compliance procedures, allow us to deliver peace-of-mind scanning services to our clients. We are an approved document scanning and data capture service provider to many reputable organisations in health, education, manufacturing, finance, logistics and other sectors.

Founded in 2003, we have almost 15 years of valuable knowledge and expertise in delivering successful document scanning and data capture services throughout the UK to some of the most reputable and globally known organisations.

We operate from a custom-built document scanning and data capture centre, which is built around security, safety and confidentiality. The site is monitored 24 hours a day by security and CCTV systems.

The document scanning and data capture bureau is equipped with state-of-the-art dedicated scanning and capture technology for documents, microfilm media, books and large format plans. It caters for a wide range of document types and sizes, making us a one-stop service provider for scanning and digital conversion needs. We continually invest in staff training and the latest technology to ensure that we deliver quality and innovation at all times.

Pearl Scan Group has the infrastructure to handle everything from quick-turnaround urgent document scanning to large-volume scanning and conversion projects covering documents, microfilm media, books and more. Our document scanning and data capture service centre always runs at 80% of its capacity, leaving 20% of space and resources for on-demand, ad hoc projects.