Document Scanning Glossary
Like any other specialist industry, document scanning uses technical terms that may not be easy to understand. Here, we have created a glossary to explain some of these phrases. If you think we have missed anything off, ask a member of our team who will be happy to help!
A0 (Paper size measurement)
Millimetres: Height: 1189 - Width: 841
Inches: Height: 46.81 - Width: 33.11
A1 (Paper size measurement)
Millimetres: Height: 841 - Width: 594
Inches: Height: 33.11 - Width: 23.39
A2 (Paper size measurement)
Millimetres: Height: 594 - Width: 420
Inches: Height: 23.39 - Width: 16.54
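The inch figures above follow from the millimetre figures, since one inch is 25.4 mm. As a rough illustration, the conversion can be sketched in Python. The names `PAPER_SIZES_MM` and `mm_to_inches` are made up for this example.

```python
MM_PER_INCH = 25.4

# Paper sizes in millimetres (height, width), as listed in this glossary.
PAPER_SIZES_MM = {"A0": (1189, 841), "A1": (841, 594), "A2": (594, 420)}


def mm_to_inches(mm):
    """Convert millimetres to inches, rounded to two decimal places."""
    return round(mm / MM_PER_INCH, 2)


for name, (height, width) in PAPER_SIZES_MM.items():
    print(f"{name}: Height: {mm_to_inches(height)} - Width: {mm_to_inches(width)}")
```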
ASCII (American Standard Code for Information Interchange). An ASCII file is a plain text document containing no formatting information.
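As a small illustration of what "no formatting information" means, each character in such a file is stored as a single number and nothing else. The sketch below is only an example; the helper name `to_ascii_codes` and the sample text are assumptions made for illustration.

```python
def to_ascii_codes(text):
    """Return the numeric code of each character.

    Raises UnicodeEncodeError if the text contains characters
    outside the basic character set (for example accented letters).
    """
    return list(text.encode("ascii"))


sample = "Scan 42"
print(to_ascii_codes(sample))  # one number per character, including the space
print(sample.isascii())        # True when every character fits the basic set
```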
AutoCAD is a popular computer-aided drafting software package developed by Autodesk.
The action of copying important data and storing it somewhere in case the original becomes lost, damaged or stolen.
Black and White
Most standard business documents can be captured perfectly well in black and white, and this is generally recommended if you intend to scan them for archival purposes.
A file format used for storing blueprints, plans and technical drawings. These can be AutoCAD DWG, DXF or any CAD file format you require.
CD / DVD Duplication
The act of making duplicate CDs or DVDs using a standard CD/DVD copier. Generally used for small-scale archival purposes.
CD / DVD Replication
Making duplicates of CDs or DVDs from a glass master disk using injection moulding. Used most often for creating large quantities of copies.
These can be TIFF or JPEG files, which are types of computer file used to store pictures. We also use PDF (Portable Document Format) files, which are designed to be easily printed and shared.
To list an item under more than one heading or category.
A CSV (Comma Separated Values) file is a file containing information separated by commas, such as a list of client details. This type of file can be used to import information into other applications.
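As a rough sketch of how another application might import such a file, the Python example below reads comma-separated client details using the standard `csv` module. The client names, column headings and the `read_clients` helper are invented for illustration only.

```python
import csv
import io

# Hypothetical client list; the columns and values are illustrative only.
CSV_TEXT = (
    "name,company,city\n"
    "Jane Smith,Acme Ltd,Leeds\n"
    "Raj Patel,Northern Plans,York\n"
)


def read_clients(text):
    """Parse CSV text into a list of dictionaries keyed by the header row."""
    return list(csv.DictReader(io.StringIO(text)))


clients = read_clients(CSV_TEXT)
for client in clients:
    print(client["name"], "-", client["city"])
```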
An organised collection of information stored in a structured way. Once in a database, information can be searched or statistically analysed very quickly.
A file type used by the MicroStation architectural and engineering software package to store drawing information.
These computerised files can be TIFF or JPEG, which are types of computer file used to store pictures or graphics. We also use PDF (Portable Document Format) files, which are designed for easy printing and sharing.
DjVu (pronounced "déjà vu") is a computer file format designed primarily to store scanned images, especially those containing text and line drawings.
Duplex is a term meaning to scan or print on both sides of a sheet of paper.
DWG (pronounced "drawing") is a file type used by the popular computer-aided drafting program AutoCAD to store drawing information.
DXF (Drawing Exchange Format). A file format that uses plain text to store drawing information.
A book that is available in a computerised format such as EPUB, Adobe PDF or the Mobipocket .mobi format.
Usually in Database or Spreadsheet format which can be integrated into a variety of systems.
Grayscale images are monochrome, but rather than capturing only the dark and light areas of an image, grayscale captures all the shades of grey in between. This suits faded text or black and white photography.
ICR (Intelligent Character Recognition) is an advanced version of OCR (Optical Character Recognition) that is able to learn fonts during processing to improve recognition levels.
The process of converting a collection of data into a database suitable for easy search and retrieval.
JPEG is a single-page-per-file image format that is popular among graphic designers and web designers. It is advisable to use this format if you're scanning colour images for presentation purposes.
The Laser Printer is a common type of computer printer that produces high quality printing and is able to produce both text and graphics.
Microsoft Word is a popular piece of word processing software sometimes abbreviated to MS Word.
Microsoft Excel is a popular spreadsheet program.
Microsoft Access is a popular database application by Microsoft that uses Microsoft Database (MDB) files.
MicroStation is an architectural and engineering software package developed by Bentley Systems for generating drawings.
A disk from which multiple copies are generated.
A Network is a collection of computers sometimes connected to servers (computers that offer specific services to other computers connected to them) to allow them to share resources.
OCR (Optical Character Recognition) involves the use of computer software to translate images of typewritten text into machine-editable text.
A collection of computers, sometimes connected to a server (computer that offers special services to other computers attached to it) to allow them to share resources.
OMR (Optical Mark Recognition) is a method of computerising input, usually from paper forms. These usually contain ticks or crosses, such as those you might find on a multiple choice test or questionnaire.
OCR Processing Rule Set
A customised set of rules created for a specific batch of documents to improve recognition.
PDF is a very popular multi-page file format used by many companies because it can be opened by almost anybody using free software that most people already have on their computers.
PDF Links allow you to create buttons in your PDF documents that link from one page to another. These buttons can be text, an image, or just an area of the scanned image that you want to make clickable.
PDF Annotations allow you to add small pieces of text to an existing PDF document such as notes or appendices.
PDF Bookmarks are a tool that lets you navigate easily through a PDF document by placing links to sections in a panel that appears at the side of the PDF when opened (on the left-hand side in most PDF readers).
PDF Comments are used to write notes on a PDF document. These can be useful if you're collaborating on a project with many different people.
PDF Searchable is a type of PDF file that has been processed using OCR so that its contents can be searched.
Resolution describes the level of detail an image holds: the higher the resolution, the more detail an image has. Image resolution is measured in DPI (dots per inch) or PPI (pixels per inch).
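To give a feel for what these numbers mean in practice, the size of a scanned image in pixels is roughly the paper size in inches multiplied by the resolution. The Python sketch below works this out for an A2 sheet using the inch measurements from this glossary. The 300 and 600 DPI settings are example values only, not recommendations, and `pixel_dimensions` is a name made up for this illustration.

```python
def pixel_dimensions(height_in, width_in, dpi):
    """Return (height, width) in pixels for a page scanned at the given DPI."""
    return round(height_in * dpi), round(width_in * dpi)


# A2 sheet: 23.39 x 16.54 inches.
for dpi in (300, 600):
    height_px, width_px = pixel_dimensions(23.39, 16.54, dpi)
    print(f"A2 at {dpi} DPI: {height_px} x {width_px} pixels")
```

Doubling the resolution doubles both dimensions, so the image holds four times as many pixels.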
These are images that have been scanned but not treated or enhanced in any way.
A method of rendering an image using pixels (tiny dots on a computer or television screen which when printed or viewed make up an entire image).
A term that describes printing or scanning only on one side of the paper.
A type of application program which manipulates numerical and text information in rows and columns of cells.
The process of turning documents into images that can be manipulated on a computer.
A device that connects to a computer allowing the user to turn (scan) documents into images they can manipulate on the computer.
SQL (Structured Query Language) is a computer language for retrieving and updating information in databases.
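As a minimal sketch of retrieving and updating information with this language, the Python example below runs a few statements against a temporary in-memory database using the standard `sqlite3` module. The `documents` table and its contents are hypothetical and exist only for this illustration.

```python
import sqlite3

# Temporary database held in memory; nothing is written to disk.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE documents (id INTEGER PRIMARY KEY, title TEXT, pages INTEGER)"
)
conn.executemany(
    "INSERT INTO documents (title, pages) VALUES (?, ?)",
    [("Site plan", 2), ("Invoice", 1), ("Staff handbook", 48)],
)

# Updating: correct the page count of one record.
conn.execute("UPDATE documents SET pages = ? WHERE title = ?", (3, "Site plan"))

# Retrieving: find every document with more than one page.
rows = conn.execute(
    "SELECT title, pages FROM documents WHERE pages > ? ORDER BY title", (1,)
).fetchall()
print(rows)
```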
A type of computer file used for storing pictures. These can be full colour, grayscale or black and white. A multipage TIFF file can contain many pages in a single file.
A treatment applied to copied CDs to give a high-gloss, scratch-resistant finish.
A method of rendering an image using coordinates rather than singular dots of colour. Images rendered in this way can be scaled without degradation of the image quality.
XML (eXtensible Markup Language). A markup language that, like HTML, is derived from SGML, and is used for structured data and web publishing. Considered to be more general and uniform than HTML.
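As an illustration of how information held in this format can be read by software, the Python sketch below parses a small hand-written sample using the standard `xml.etree.ElementTree` module. The tag names (`documents`, `document`, `title`, `format`) and the values are invented for this example and do not describe any particular system.

```python
import xml.etree.ElementTree as ET

# Hypothetical index of scanned documents; the structure is illustrative only.
XML_TEXT = """<documents>
  <document id="1"><title>Site plan</title><format>PDF</format></document>
  <document id="2"><title>Staff handbook</title><format>TIFF</format></document>
</documents>"""

root = ET.fromstring(XML_TEXT)

# Collect the title of every document listed in the sample.
titles = [doc.findtext("title") for doc in root.findall("document")]
print(titles)
```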