Document scanning glossary

Like any other specialist industry, document scanning uses technical terms that may not be easy to understand. Here, we have created a glossary to explain some of these phrases. If you think we have missed anything, ask a member of our team, who will be happy to help!

 

 

A

A0 (Paper size measurement)
Millimetres: Height: 1189 - Width: 841
Inches: Height: 46.81 - Width: 33.11

A1 (Paper size measurement)
Millimetres: Height: 841 - Width: 594
Inches: Height: 33.11 - Width: 23.39

A2 (Paper size measurement)
Millimetres: Height: 594 - Width: 420
Inches: Height: 23.39 - Width: 16.54
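
The sizes above follow a simple pattern: each sheet is the previous one cut in half across its long side. As a rough illustration (the function names here are our own, chosen for this sketch), the figures listed can be reproduced in Python like this:

```python
MM_PER_INCH = 25.4

def a_series_mm(n):
    """Return (height_mm, width_mm) of sheet A<n>, starting from A0."""
    height, width = 1189, 841  # A0, as listed in this glossary
    for _ in range(n):
        # The old width becomes the new height; the old height is halved
        # (rounded down to a whole millimetre) to give the new width.
        height, width = width, height // 2
    return height, width

def mm_to_inches(mm):
    """Convert millimetres to inches, rounded to two decimal places."""
    return round(mm / MM_PER_INCH, 2)

for n in range(3):
    h, w = a_series_mm(n)
    print(f"A{n}: {h} x {w} mm = {mm_to_inches(h)} x {mm_to_inches(w)} in")
```

Running this prints the same millimetre and inch values given for A0, A1 and A2 above.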

ASCII
American Standard Code for Information Interchange. A standard way of representing text characters as numbers; an ASCII text document contains no formatting information.

AutoCAD
A popular computer-aided drafting software package developed by Autodesk.


B

Backup
The action of copying important data and storing it somewhere in case the original becomes lost, damaged or stolen.

Black and White
Most standard business documents can be captured perfectly well in black and white, and this is generally recommended if you intend to scan them for archival purposes.


C

CAD Format
A file format used for storing blueprints, plans and technical drawings. These can be AutoCAD DWG, DXF or any other CAD file format you require.

CD / DVD Duplication
The act of making duplicate CDs or DVDs using a standard CD/DVD copier. Generally used for small-scale archival purposes.

CD / DVD Replication
Making duplicates of CDs or DVDs from a glass master disk using injection moulding. Used most often for creating large quantities of copies.

Computerised Files
These can be TIFF or JPEG files, which are types of computer file used to store pictures. We also use PDF (Portable Document Format) files, which are designed to be easily printed and shared.

Cross Index
To list an item under more than one heading or category.

CSV File
(Comma-Separated Values) A file containing information separated by commas, such as a list of client details. This type of file can be used to import information into other applications.
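
As a small illustration of how another application might import a CSV file, the Python sketch below reads a made-up client list using the standard csv module. The client names, the column headings and the read_clients function are all invented for this example.

```python
import csv
import io

# Hypothetical client list, as it might appear inside a .csv file.
# The first row holds the column headings.
csv_text = "name,city\nAlice Smith,Leeds\nBob Jones,York\n"

def read_clients(text):
    """Parse CSV text into a list of dictionaries keyed by the heading row."""
    return list(csv.DictReader(io.StringIO(text)))

clients = read_clients(csv_text)
for client in clients:
    print(client["name"], "-", client["city"])
```

Each row of the file becomes one record, with the commas marking where one piece of information ends and the next begins.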

D

Database
An organised collection of information stored in a structured way. Once information is in a database, it can be searched, or statistical analysis can be performed on it, very quickly.

DGN File
A file type used by the MicroStation architectural and engineering software package to store drawing information.

Digital Format
These computerised files can be TIFF or JPEG, which are computer files used to store pictures or graphics. We also use PDF (Portable Document Format) files, which are designed for easy printing and sharing.

DJVU
DjVu (pronounced like the French "déjà vu") is a computer file format designed primarily to store scanned images, especially those containing text and line drawings.

Duplex
Duplex is a term meaning to scan or print on both sides of a sheet of paper.

DWG File
(Short for "drawing") A file type used by the popular computer-aided drafting program AutoCAD to store drawing information.

DXF File
(Drawing Exchange Format) A file format that uses plain text to store drawing information.


E

E-Book
A book that is available in a computerised format such as EPUB, Adobe PDF or the Mobipocket .mobi format.


F

File Format
The way information is organised inside a computer file. Captured data is usually supplied in database or spreadsheet format, which can be integrated into a variety of systems.


G

Grayscale
Grayscale images are monochrome, but rather than capturing only the dark and light areas of an image, grayscale captures all the shades of grey in between. This makes it suitable for faded text or black and white photography.
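
To illustrate the difference between grayscale and black and white, the short Python sketch below reduces a few grayscale values to pure black or pure white. The function name, the sample values and the threshold of 128 are assumptions chosen for illustration, not a description of how any particular scanner works.

```python
def to_black_and_white(gray_rows, threshold=128):
    """Map each grayscale value (0 = black, 255 = white) to 0 or 255."""
    return [
        [255 if value >= threshold else 0 for value in row]
        for row in gray_rows
    ]

# A tiny hypothetical 2 x 3 grayscale image containing faded mid-greys.
faded = [
    [30, 120, 200],
    [140, 90, 250],
]

bw = to_black_and_white(faded)
print(bw)
```

The mid-grey values (120, 140 and 90) are forced to one extreme or the other, which is why faded text can be lost in a black and white scan but preserved in grayscale.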


I

ICR
ICR (Intelligent Character Recognition) is an advanced version of OCR (Optical Character Recognition) that is able to learn fonts during processing to improve recognition levels.

Indexing
The process of converting a collection of data into a database suitable for easy search and retrieval.

J

JPG Files
JPEG (also written JPG) is a single-page-per-file image format that is popular among graphic designers and web designers. We advise using this format if you're scanning colour images for presentation purposes.

L

Laser Printer
The laser printer is a common type of computer printer that produces high-quality prints of both text and graphics.

M

Microsoft Word
Microsoft Word is a popular piece of word processing software sometimes abbreviated to MS Word.

Microsoft Excel
Microsoft Excel is a popular spreadsheet program.

Microsoft Access
A popular database application by Microsoft that uses Microsoft Database (MDB) files.

MicroStation
An architectural and engineering software package developed by Bentley Systems for generating drawings.

Master disk
A disk from which multiple copies are generated.

N

Network
A network is a collection of computers, sometimes connected to servers (computers that offer specific services to other computers connected to them), linked together so that they can share resources.

O

OCR
OCR (Optical Character Recognition) involves the use of computer software to translate images of typewritten text into machine-editable text.

Office Network
A collection of computers in an office, sometimes connected to a server (a computer that offers special services to other computers attached to it), linked together so that they can share resources.

OMR
OMR (Optical Mark Recognition) is a method of computerising input, usually from paper forms. These forms usually contain ticks or crosses, such as those you might find on a multiple choice test or questionnaire.

OCR Processing Rule Set
A customised set of rules created for a specific batch of documents to improve recognition.

P

PDF File
PDF is a very popular multi-page file format that is used by many companies because it can be opened by almost anybody using free software that most people already have on their computers.

PDF Links
PDF Links allow you to create buttons in your PDF documents that link from one page to another. These buttons can be text, an image, or just an area of the scanned image that you want to make clickable.

PDF Annotations
PDF Annotations allow you to add small pieces of text to an existing PDF document such as notes or appendices.

PDF Bookmarks
PDF Bookmarks are a tool that allows you to navigate easily through a PDF document by placing links to sections in a panel at the side of the PDF when it is opened.

PDF Comments
PDF Comments are used to write notes on a PDF document. These can be useful if you're collaborating on a project with many different people.

PDF Searchable
PDF Searchable is a type of PDF file that has been processed using OCR so that its contents can be searched.

R

Resolution
Resolution describes the level of detail an image holds: the higher the resolution, the more detail an image has. Image resolution is measured in DPI (dots per inch) or PPI (pixels per inch).
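
As a worked illustration, the Python sketch below estimates how many pixels a scan contains for a given paper size and resolution. It uses the A0 and A2 dimensions listed earlier in this glossary; the function name and the choice of 300 DPI are assumptions made for this example only.

```python
MM_PER_INCH = 25.4

def scan_pixels(height_mm, width_mm, dpi):
    """Return the approximate (height, width) of a scan in pixels."""
    height_px = round(height_mm / MM_PER_INCH * dpi)
    width_px = round(width_mm / MM_PER_INCH * dpi)
    return height_px, width_px

# An A0 sheet (1189 x 841 mm) scanned at 300 DPI.
a0_pixels = scan_pixels(1189, 841, 300)
print(a0_pixels)

# An A2 sheet (594 x 420 mm) scanned at the same resolution.
a2_pixels = scan_pixels(594, 420, 300)
print(a2_pixels)
```

Doubling the DPI doubles the number of pixels in each direction, so the image holds roughly four times as many pixels in total.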

Raw Image
These are images that have been scanned but not treated or enhanced in any way.

Raster
A method of rendering an image using pixels (tiny dots which, when printed or viewed on a computer or television screen, make up an entire image).

S

Simplex
A term that describes printing or scanning only on one side of the paper.

Spreadsheet
A type of application program which manipulates numerical and text information in rows and columns of cells.

Scan
The process of turning documents into images that can be manipulated on a computer.

Scanner
A device that connects to a computer allowing the user to turn (scan) documents into images they can manipulate on the computer.

SQL
(Structured Query Language) A computer language for retrieving and updating information held in databases.
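
To give a feel for what SQL looks like, the Python sketch below runs a few SQL statements against a temporary in-memory database using the standard sqlite3 module. The table name, column names and sample records are invented for this example.

```python
import sqlite3

# Create a temporary database that exists only in memory.
conn = sqlite3.connect(":memory:")

# Hypothetical table of scanned documents.
conn.execute(
    "CREATE TABLE documents (id INTEGER PRIMARY KEY, title TEXT, pages INTEGER)"
)
conn.executemany(
    "INSERT INTO documents (title, pages) VALUES (?, ?)",
    [("Invoice 001", 2), ("Site plan", 1)],
)

# Updating: correct the page count of one record.
conn.execute(
    "UPDATE documents SET pages = ? WHERE title = ?", (3, "Invoice 001")
)

# Retrieving: list every document in alphabetical order.
rows = conn.execute(
    "SELECT title, pages FROM documents ORDER BY title"
).fetchall()
print(rows)

conn.close()
```

The SELECT statement retrieves information and the UPDATE statement changes it, which are the two uses described in the definition above.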

T

TIFF File
A type of computer file used for storing pictures. These can be full colour, grayscale or black and white. A multipage TIFF file can contain many pages in a single file.

U

UV lacquer
A treatment applied to copied CDs to give a high-gloss, scratch-resistant finish.

V

Vector
A method of rendering an image using coordinates rather than singular dots of colour. Images rendered in this way can be scaled without degradation of the image quality.

X

XML
(eXtensible Markup Language) A markup language related to HTML and used for storing and exchanging structured information. Considered to be more general and uniform than HTML.
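
As a brief illustration, the Python sketch below reads a small piece of XML with the standard xml.etree.ElementTree module. The tag names and document titles are made up for this example and do not represent any particular system's format.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML describing two scanned documents.
xml_text = """
<documents>
  <document id="1"><title>Invoice 001</title></document>
  <document id="2"><title>Site plan</title></document>
</documents>
"""

root = ET.fromstring(xml_text)

# Collect the id attribute and title text of each document element.
records = [
    (doc.get("id"), doc.findtext("title"))
    for doc in root.findall("document")
]
print(records)
```

Unlike HTML, the tag names are not fixed in advance, so each system can define tags that describe its own information.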
