Note: OCR capability is only supported on devices that have the purchased OCR package.
The following instructions are for programming your scanner for optical character recognition (OCR). The scanner will read OCR-A, OCR-B, MICR E-13B, and SEMI Font, in a 6 to 60 point OCR typeface. You can either select a pre-defined OCR template, or create your own custom template for the type of OCR format you intend to read.
You can create a custom template to read OCR characters according to the specifications of your own application (see Custom OCR Template). You can read either this custom template only, or a combination of your custom template with other, pre-defined OCR templates.
The Passport Template may be used to read passports, visas and official travel documents based on the ICAO standard. This template reads both OCR-A and OCR-B fonts. Passports and Format-A visas each consist of two rows of 44 OCR-B characters.
Format-B visas and TD-2 travel documents each have two rows of 36 OCR-B characters, while TD-1 travel documents employ three rows of 30 OCR-B characters. Each row is read separately, so if some rows cannot be decoded, not all of the rows may be output.
P<UTOERIKSSON<<ANNA<MARIA<<<<<<<<<<<<<<<<<<<
L898902C<3UTO6908061F9406236ZE184226B<<<<<14
V<UTOERIKSSON<<ANNA<MARIA<<<<<<<<<<<<<<<<<<<
L898902C<3UTO6908061F9406236ZE184226B<<<<<<<
V<UTOERIKSSON<<ANNA<MARIA<<<<<<<<<<<
L898902C<3UTO6908061F9406236ZE184226
I<UTOD231458907<<<<<<<<<<<<<<<
3407127M9507122UTO<<<<<<<<<<<2
STEVENSON<<PETER<JOHN<<<<<<<<<
I<UTOSTEVENSON<<PETER<<<<<<<<<<<<<<<
D231458907UTO3407127M9507122<<<<<<<2
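Because rows are read separately, a host application may want to confirm that it received a complete set of rows. The following is a minimal sketch of such a check, using only the row counts and row lengths quoted above. The names `LAYOUTS` and `classify_rows` are hypothetical and are not part of any scanner interface.

```python
# Row layouts quoted in the text: (number of rows, characters per row).
LAYOUTS = {
    (2, 44): "Passport or Format-A visa",
    (2, 36): "Format-B visa or TD-2 travel document",
    (3, 30): "TD-1 travel document",
}


def classify_rows(rows):
    """Return the document family for a list of decoded rows, or None."""
    lengths = {len(row) for row in rows}
    if len(lengths) != 1:
        return None  # no rows, or rows of unequal length
    return LAYOUTS.get((len(rows), lengths.pop()))
```

A result of `None` suggests that one or more rows are missing or were only partially read.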
The ISBN Template is used to read an International Standard Book Number (ISBN) in either OCR-A or OCR-B font.
ISBN 0-8436-1072-7
This format consists of the 4 letters ISBN followed by 13 characters: 10 digits in groups that are separated by hyphens or spaces. The last character is a Mod 11 checksum, which is either a digit (0-9) or an “X.” All ISBN results are checked for a valid checksum.
ISBN 978-0-571-08989-5
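This format consists of the 4 letters ISBN followed by 13 digits in groups that are separated by hyphens or spaces. It differs from the 13 character format in that the checksum is a Mod 10 checksum, which can only be a digit (0-9).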
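The two checksum rules can be sketched on the host side as follows. The position weights (10 down to 1 for Mod 11, alternating 1 and 3 for Mod 10) come from the ISBN standard rather than from this document, and the function names are hypothetical. The scanner performs its own check, so this is only an illustration of the arithmetic.

```python
def isbn_chars(text):
    """Drop the 'ISBN' prefix, hyphens, and spaces; keep digits and X."""
    body = text.upper().replace("ISBN", "")
    return [c for c in body if c.isdigit() or c == "X"]


def isbn10_ok(text):
    """Mod 11 check for the 10 digit format (an X check character counts as 10)."""
    chars = isbn_chars(text)
    if len(chars) != 10 or "X" in chars[:9]:
        return False
    values = [10 if c == "X" else int(c) for c in chars]
    # Weights run 10, 9, ..., 1 from the first digit to the check character.
    return sum(w * v for w, v in zip(range(10, 0, -1), values)) % 11 == 0


def isbn13_ok(text):
    """Mod 10 check for the 13 digit format (digits only)."""
    chars = isbn_chars(text)
    if len(chars) != 13 or "X" in chars:
        return False
    # Weights alternate 1, 3, 1, 3, ... starting at the first digit.
    return sum((3 if i % 2 else 1) * int(c) for i, c in enumerate(chars)) % 10 == 0
```

Both sample strings shown above pass: `isbn10_ok("ISBN 0-8436-1072-7")` and `isbn13_ok("ISBN 978-0-571-08989-5")` return `True`.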
You can enable multiple OCR templates along with the ISBN template by clicking the button for the template(s) you want.
The Price Field is used in a number of applications including book pricing. The Price Field Template reads both OCR-A and OCR-B fonts. The format is as follows:
C1234 P5678E
The field begins with a 'C' and ends with an 'E.' The first part of the Price Field is a 'C' followed by four numeric digits. The second half begins with a currency character. The above example shows the letter 'P' but the Price Field template allows the following additional characters:
$€£¥
Following the currency character, a numeric grouping of 3, 4, 5, or 6 digits is followed by a terminating letter 'E.' The following examples can also be read when the Price Field Template is enabled:
C6712 $801E
C0217 €4399E
C0823 ¥31559E
C0331 £706213E
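The Price Field layout described above can be expressed as a regular expression for checking results on the host. This is a sketch under stated assumptions: it works on the printed text rather than on raw scanner output bytes, it assumes a single space between the two halves (as in the examples), and the names `PRICE_FIELD` and `parse_price_field` are hypothetical.

```python
import re

# 'C' + 4 digits, one space, a currency character, 3 to 6 digits, then 'E'.
PRICE_FIELD = re.compile(r"C(\d{4}) ([P$€£¥])(\d{3,6})E")


def parse_price_field(text):
    """Return (code, currency, amount) as strings, or None if the layout is wrong."""
    match = PRICE_FIELD.fullmatch(text)
    return match.groups() if match else None
```

For example, `parse_price_field("C1234 P5678E")` returns `("1234", "P", "5678")`, while a string with only two digits after the currency character returns `None`.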
You can enable multiple OCR templates along with the Price Field template by clicking the button for the template(s) you want.
MICR E-13B consists of 14 characters: the numbers 0-9 and 4 control characters. The 4 control characters are known as TOAD (Transit, On Us, Amount and Dash), and are output in the following manner:
MICR Character | Function | Output |
(Transit symbol) | Transit | A (ASCII 65) |
(Amount symbol) | Amount | B (ASCII 66) |
(On Us symbol) | On Us | C (ASCII 67) |
(Dash symbol) | Dash | D (ASCII 68) |
MICR E-13B is used in financial applications, such as checks, to encode bank account numbers, bank routing numbers, check numbers, and other information on a single row. There are standard guidelines that address how data must be represented on checks and other financial documents, but there is a great deal of flexibility left to the discretion of the document designer.
The MICR E-13B Template reads any MICR string whose length is between 4 and 40 characters. Only a single space is allowed between characters in a template. Because many checks have a MICR line whose fields are separated by more than one space, these fields are read and output as individual MICR strings. A broad range of strings produces MICR output, so you should check for partial reads, where only part of the targeted MICR string is actually in the image presented to the scanner.
The following examples can be read when the MICR E-13B Template is enabled:
A123456789A
C01235C A123456789A 193412454C
C98765C    A568123977A 67891788C70
Note that in the third example, there will be 2 separate output results because of the 4 space gap between the first and second fields.
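The splitting behavior described above can be modeled on the host to predict how many results a given MICR line should produce. This is an illustrative sketch, not the scanner's own logic, and `split_micr` is a hypothetical name. It splits on runs of two or more spaces and keeps strings within the 4 to 40 character range.

```python
import re


def split_micr(line):
    """Split a MICR line into the strings expected as separate outputs."""
    fields = re.split(r" {2,}", line.strip())
    # Keep only strings within the 4 to 40 character range quoted above.
    return [field for field in fields if 4 <= len(field) <= 40]
```

With a 4 space gap after the first field, `split_micr("C98765C    A568123977A 67891788C70")` returns two strings: `"C98765C"` and `"A568123977A 67891788C70"`.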
You can enable multiple OCR templates along with the MICR E-13B template by clicking the button for the template(s) you want.
One of the standard fields within MICR E-13B is the routing field. It begins with the Transit symbol (A), followed by 9 numeric digits and a terminating Transit symbol. On some checks, the routing field is separated on each end by at least one space and can be read as a standalone field. This would be done by creating the following template (see Custom OCR Templates):
1 4 A 5 1 4 9 A 0
If the routing field is part of a longer field (i.e., there is no space between either the leading or trailing transit character and other MICR data), then a custom template must be created to read those documents.
Click on the drop down list to program your scanner to read OCR in either Normal Video (black characters on a white background), Reverse Video (white characters on a black background), or Both Normal and Reverse Video. Select OCR Off to disable OCR reading.
Once OCR reading is enabled, you must select a Pre-Defined Template, shown under Template Selection, or create a custom OCR Template in order to read OCR characters.
You can create a custom template, a character string that defines the length and content of the OCR strings your scanner will read. These templates are entered in the OCR Template text box. The templates define the OCR font as well as the layout of the text in a row and column format. A template can have up to 18 rows of up to 50 characters each, with a maximum of 320 characters. Within each character position, the allowable characters can be specified through explicit ASCII values, groups of ASCII values, wildcard characters, or combinations of these types. To achieve better OCR results, limit each character position’s values to the specific values expected in your application.
OCR template strings must begin and end with double quotes ("). A template string cannot have any spaces or other punctuation in it.
Internal gaps longer than one space are not allowed in OCR text. For example, the OCR text
ONE SPACE
is valid because there is only one space between the E and S in the text. However, the following text is illegal given the two spaces between the O and S:
TWO  SPACES
An arbitrary number of spaces at the beginning and end of a line is acceptable. These spaces must be included in the template with the ASCII value of a space (decimal 32, hex 0x20), and not be included as part of a group or wildcard character.
The ideal height of an OCR character after sampling is about 20 pixels, but characters up to 50 pixels in height can be read. If OCR characters are consistently above 40 pixels in height, downsampling the image by a factor of 2 will achieve better results in both speed and decode rates.
7 bit ASCII values are used in the OCR template strings. However, there are no 7 bit ASCII representations for the euro, pound, or yen currency characters. 8 bit codes for these characters are:
Currency | Decimal | Hex |
Euro | 128 | x80 |
Pound | 163 | xA3 |
Yen | 165 | xA5 |
The hex character is output. For example, the euro output is [0x80]. Refer to the ASCII_Conversion_Chart.
Custom OCR Templates are strings made up of various control codes, along with standard ASCII values.
Control Codes Chart |
Control Code | Value | Argument |
End of Template | 0 | |
New Template | 1 | Font: 1 - OCR-A 2 - OCR-B 3 - Both A&B 4 - MICR 5 - Semi |
New Line | 2 | |
Define Group Start | 3 | ID [001-255] |
Define Group End | 4 | |
Wildcard: Numeric | 5 | [0-9] |
Wildcard: Alpha | 6 | [A-Z uppercase] |
Wildcard: Alphanumeric | 7 | [0-9] [A-Z uppercase] |
Wildcard: Any (including space) | 8 | |
Defined Group | A | ID [001-255] |
In Line Group Start | B | |
In Line Group End | C | |
Checksum | D | Weights, Type, MOD |
Fixed Character Repeat | E | [01-50] |
Variable Character Repeat | F | Range Low [01-50], Range High [01-50] |
ASCII Hex Value | x | |
Note: In all following examples, spaces are used in template strings for readability only.
All OCR templates begin with the New Template control code. The value immediately following this control code indicates the font(s) for which this template is designed.
Example: You need to read 8 numeric digits in either OCR-A or OCR-B:
12345678
The string would be: 1 3 5 5 5 5 5 5 5 5 0
The breakdown:
Control Code | Description |
1 | New Template Code |
3 | Both OCR-A and OCR-B font |
5 5 5 5 5 5 5 5 | Wildcard: Numeric - 8 times |
0 | End of Template |
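A breakdown like the one above can be produced mechanically. The following sketch splits a template string into control codes and their arguments using the Control Codes Chart. The argument widths in `ARG_WIDTH` are assumptions inferred from the chart and from the examples in this section (for instance, a 3 digit group ID and 2 hex digits after `x`), and `tokenize` is a hypothetical helper, not a scanner function.

```python
# Number of argument characters that follow each control code (assumed widths).
ARG_WIDTH = {"1": 1, "3": 3, "A": 3, "D": 2, "E": 2, "F": 4, "x": 2}

NAMES = {
    "0": "End of Template", "1": "New Template", "2": "New Line",
    "3": "Define Group Start", "4": "Define Group End",
    "5": "Wildcard: Numeric", "6": "Wildcard: Alpha",
    "7": "Wildcard: Alphanumeric", "8": "Wildcard: Any",
    "A": "Defined Group", "B": "In Line Group Start",
    "C": "In Line Group End", "D": "Checksum",
    "E": "Fixed Character Repeat", "F": "Variable Character Repeat",
    "x": "ASCII Hex Value",
}


def tokenize(template):
    """Split a template string into (code, argument, name) tuples."""
    s = template.replace(" ", "")  # spaces are for readability only
    tokens, i = [], 0
    while i < len(s):
        code = s[i]
        if code not in NAMES:
            raise ValueError(f"unknown control code {code!r} at position {i}")
        width = ARG_WIDTH.get(code, 0)
        arg = s[i + 1 : i + 1 + width]
        if len(arg) != width:
            raise ValueError(f"truncated argument for {code!r}")
        tokens.append((code, arg, NAMES[code]))
        i += 1 + width
    return tokens


for code, arg, name in tokenize("1 3 5 5 5 5 5 5 5 5 0"):
    print(code + arg, name)
```

The loop prints one line per control code, starting with `13 New Template` and ending with `0 End of Template`.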
A template may contain multiple distinct templates all within the same string. Begin each template with a New Template control code.
A new line within a multiple line template is indicated by the New Line control code.
Example: You need to read 2 lines of OCR-A characters. The first line has 4 numeric digits and the second line has 8 alphanumeric characters and spaces:
4321
A-3D FG9
The string would be: 1 1 5 5 5 5 2 8 8 8 8 8 8 8 8 0
The breakdown:
Control Code | Description |
1 | New Template Code |
1 | OCR-A font |
5 5 5 5 | Wildcard: Numeric - 4 times |
2 | New Line |
8 8 8 8 8 8 8 8 | Wildcard: Any (including space) - 8 times |
0 | End of Template |
To simplify the creation of user templates, the Fixed Character Repeat control code may be used to repeat a character a specified number of times. Any specific ASCII value, wildcard, or group can be repeated. Because each OCR line is limited to a maximum of 50 characters, you can shorten your string by using a fixed character repeat.
Example: Using the same example as used for New Template, you need to read 8 numeric digits in either OCR-A or OCR-B:
12345678
The string without repeating characters was: 1 3 5 5 5 5 5 5 5 5 0
Using Repeating Characters, it would be: 1 3 5 E 0 8 0
The breakdown:
Control Code | Description |
1 | New Template Code |
3 | Both OCR-A and OCR-B font |
5 | Wildcard: Numeric |
E 0 8 | Fixed Character Repeat - 8 times |
0 | End of Template |
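The relationship between the long and short forms can be sketched with two small helpers. Both names are hypothetical. The 2 digit count format follows the [01-50] argument shown in the Control Codes Chart.

```python
def fixed_repeat(char, count):
    """Return char followed by the E control code and a 2 digit count."""
    if not 1 <= count <= 50:
        raise ValueError("count must be in the range 01-50")
    return f"{char}E{count:02d}"


def expand_fixed(char, count):
    """Return the same fragment written out in full, without a repeat code."""
    return char * count


short_form = "13" + fixed_repeat("5", 8) + "0"   # "135E080"
long_form = "13" + expand_fixed("5", 8) + "0"    # "13555555550"
```

Both strings describe the same 8 digit template. The short form uses 7 characters where the long form uses 11.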
The Variable Character Repeat control code may be used to repeat a character a variable number of times. Any specific ASCII value, wildcard, or group can be repeated.
The control code requires 4 bytes that give the minimum and maximum number of times (2 bytes each) that the character may appear in the template. Because each OCR line is limited to a maximum of 50 characters, you can shorten your string by using a variable character repeat. The minimum and maximum counts must be in the range from 1 to 50, with the minimum count less than or equal to the maximum count.
Example: You need to read OCR-B characters where any line may contain 5, 6, or 7 numeric digits. The string, without repeating variable characters, would be:
1 2 5 5 5 5 5 1 2 5 5 5 5 5 5 1 2 5 5 5 5 5 5 5 0
Using repeating variable characters, the template would be: 1 2 5 F 0 5 0 7 0
The breakdown:
Control Code | Description |
1 | New Template Code |
2 | OCR-B font |
5 | Wildcard: Numeric |
F 05 07 | Variable Character Repeat - 5 min, 7 max |
0 | End of Template |
In a given character position, you must specify which values a text character may take. To reduce the overall size of templates, you may define common groups of ASCII characters and then use the defined group rather than repeating the same sequence over and over.
Groups can be made up of individual ASCII values or wildcard values. The wildcard values are Control Codes Numeric (5), Alpha (6), Alphanumeric (7), and Any(8).
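To define a group, specify the Define Group Start control code (3) followed by a 3 digit ID from 001 to 255, list the values that belong to the group, and close the definition with the Define Group End control code (4). (Up to 255 groups may be defined in a single template.) To use the group at a character position in any template you build, specify the Defined Group control code (A) followed by the group ID.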
Note: Groups may not be nested.
Example: You need to read a 4 character OCR-B text string made up of 3 numeric digits followed by one character that may be a numeric digit or the letter A, B, or C. The string would be:
1 2 3 0 0 1 x 4 1 x 4 2 x 4 3 5 4 5 5 5 A 0 0 1 0
Note: Spaces are used in this example only for ease of readability.
The breakdown (the rows from Define Group Start through Define Group End form the group definition):
Control Code | Description |
1 | New Template Code |
2 | OCR-B font |
3 | Define Group Start |
001 | Group ID |
x41 | ASCII hex character for A |
x42 | ASCII hex character for B |
x43 | ASCII hex character for C |
5 | Numeric Digit |
4 | Define Group End |
5 5 5 | 3 Numeric Digits |
A001 | Defined Group, ID 001 |
0 | End of Template |
Refer to the ASCII conversion chart for hex/character conversions.
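The group example can be assembled from its parts as a check on the notation. This is a sketch only, and the helper names `hex_value`, `define_group`, and `use_group` are hypothetical. It assumes 3 digit group IDs and uppercase hex digits, as shown in the example string.

```python
def hex_value(ch):
    """Return the ASCII Hex Value notation for one character, e.g. 'x41' for 'A'."""
    return f"x{ord(ch):02X}"


def define_group(group_id, members):
    """Build a group definition: 3, a 3 digit ID, the member values, then 4."""
    if not 1 <= group_id <= 255:
        raise ValueError("group ID must be in the range 001-255")
    return f"3{group_id:03d}{''.join(members)}4"


def use_group(group_id):
    """Reference a previously defined group: A followed by the 3 digit ID."""
    return f"A{group_id:03d}"


members = [hex_value("A"), hex_value("B"), hex_value("C"), "5"]
template = "12" + define_group(1, members) + "555" + use_group(1) + "0"
print(template)  # 123001x41x42x4354555A0010
```

With the readability spaces removed, the printed string matches the example template above.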
The In Line Group defines a one time instance of a group that occupies one character position in the template. Use this for unique groups of characters that occur only once.
A checksum reduces the probability of misreads. There are two types of checksums: row and block. For additional checksum protection, there are four different weighting schemes: 1, 12, 13, and 137. The checksum calculation is based on modulo arithmetic. The modulo factor may vary from 6 to 36.
The byte immediately following the Checksum control code (D) defines the type of checksum that will be used:
Bit Position(s) | Meaning |
7,6: Weight Scheme | 00: Weight Scheme: 1 |
 | 01: Weight Scheme: 12 |
 | 10: Weight Scheme: 13 |
 | 11: Weight Scheme: 137 |
5: Checksum Type | 0: Row |
 | 1: Block |
4-0: Modulo Value | Checksum Modulo - 5 |
Row Checksums (0) perform a checksum calculation on all characters preceding them up to the first character on the same row. Block Checksums (1) perform a checksum calculation on all characters up to the very first character in the template; they span multiple rows. The 5 bit Modulo Value stores the Checksum Modulo - 5. The stored number can range from 1, which is a Checksum Modulo value of 6, to 31, which describes a Checksum Modulo of 36. A Modulo value of 0 (Checksum Modulo of 5) is illegal. The characters within a checksum field have a numerical value that is used in the checksum calculation. Digits are converted to their numerical value (0-9), while uppercase letters range from 10 for an “A” to 36 for a “Z.” All punctuation characters have a value of 0 for checksum purposes. However, they do count as a spot for determining the weight values used in calculating the checksum.
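The bit layout of the checksum byte can be sketched as a pair of helpers that pack and unpack the three fields. The function names are hypothetical. The layout itself (weight scheme in bits 7 and 6, type in bit 5, Checksum Modulo minus 5 in bits 4 to 0) is taken from the table above.

```python
WEIGHT_SCHEMES = ["1", "12", "13", "137"]  # indexed by the value of bits 7-6


def encode_checksum_byte(weights, block, modulo):
    """Pack weight scheme, checksum type, and Checksum Modulo into one byte."""
    if not 6 <= modulo <= 36:
        raise ValueError("Checksum Modulo must be in the range 6-36")
    return (WEIGHT_SCHEMES.index(weights) << 6) | (int(block) << 5) | (modulo - 5)


def decode_checksum_byte(value):
    """Return (weight scheme, is_block, Checksum Modulo) for a checksum byte."""
    stored = value & 0x1F
    if stored == 0:
        raise ValueError("a Modulo Value of 0 is illegal")
    return WEIGHT_SCHEMES[value >> 6], bool(value & 0x20), stored + 5


print(f"{encode_checksum_byte('13', False, 10):02X}")   # 85
print(f"{encode_checksum_byte('137', True, 36):02X}")   # FF
```

In a template, the two hex digits follow the Checksum control code, so a mod 10 row checksum with the 13 weight scheme is written as D85.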
The Weight Scheme defines how the values described above can be changed based on their character position. The default weight scheme is 1. This means that the checksum is based only on the character value and is not dependent on its position. The other weight schemes multiply the character value by a repetitive weight value that helps in identifying characters that have had their column locations switched. The 4 weight schemes are:
Weight Scheme Table |
Weight Scheme | Multiplier Values |
1 | 1 1 1 1 1 ... |
12 | 1 2 1 2 1 2 ... |
13 | 1 3 1 3 1 3 ... |
137 | 1 3 7 1 3 7 1 3 7 ... |
The checksum character always starts with a weight of 1. As you move to the left of the checksum, the weight value is updated to the next member of the sequence. The sequences repeat until the first character in a row for a Row type checksum, and to the first character in the template for a Block type checksum. The resulting sum is then divided by the Checksum Modulo number of the checksum. The remainder of this division should be zero for a valid checksum.
ABCD6
EFG5X
The two lines of OCR-B text above both contain a row checksum. In addition, the last character of row 2 is a block checksum. The 2 row checksums are mod 10 with a 13 weight (decimal=133, hex=0x85), while the block checksum is a mod 36 with a 137 weight (decimal=255, hex=0xFF). The following template will read this text:
1 2 6 6 6 6 D 8 5 2 6 6 6 D 8 5 D F F 0
Note: D 8 5 and D F F in the template are the row and block checksum notations.
The breakdown of the row checksum:
D85 | Description |
1 0 | Weight Scheme: 13 (see Checksum Table) |
0 | Checksum Type: Row (see Checksum Table) |
0 0 1 0 1 | Modulo Value: binary 00101 = 5, so the Checksum Modulo is 5 + 5 = 10 |
The breakdown of the block checksum:
DFF | Description |
1 1 | Weight Scheme: 137 (see Checksum Table) |
1 | Checksum Type: Block (see Checksum Table) |
1 1 1 1 1 | Modulo Value: binary 11111 = 31, so the Checksum Modulo is 31 + 5 = 36 |
The top line checksum is the 6 at the end of the line. While this example shows the checksum at the end of the line, it may appear anywhere on the line and then protects all the characters to its left. The following sum is generated to verify a proper checksum on line 1:
6 D C B A
(1x6) + (3x13) + (1x12) + (3x11) + (1x10) = 100
Note that the 13 weight scheme starts with a 1 on the checksum digit, and then alternates between a 1 and 3 for all digits to the left of the checksum, up to the first character on the line. The numerical values of the alphabetic characters range from 10 for an 'A' to a 36 for a 'Z.' The sum of 100 is a multiple of 10, so the mod 10 checksum here has passed. On line 2, the row checksum is the 5 following the G. Verify its line by generating its sum:
5 G F E
(1x5) + (3x16) + (1x15) + (3x14) = 110
Again, a value is obtained that is a multiple of 10, validating this row checksum. The X at the end of the line is a mod 36 block checksum with 137 weighting. It protects all the characters in the template, including the first line. Calculating its sum working backwards from the block checksum and using the 137 weighting scheme:
X 5 G F E 6 D C B A
(1x34) + (3x5) + (7x16) + (1x15) + (3x14) + (7x6) + (1x13) + (3x12) + (7x11) + (1x10) = 396
The resulting sum is a multiple of 36, so the block checksum has been validated.
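The three sums above can be reproduced with a short sketch. The character values in `VALUES` are copied from the worked sums in this example and cover only the letters that appear in it; they are not a general letter mapping. The names `WEIGHTS`, `VALUES`, and `weighted_sum` are hypothetical.

```python
WEIGHTS = {"1": [1], "12": [1, 2], "13": [1, 3], "137": [1, 3, 7]}

# Character values copied from the worked sums above (not a general mapping).
VALUES = {"A": 10, "B": 11, "C": 12, "D": 13, "E": 14, "F": 15, "G": 16, "X": 34}


def weighted_sum(chars, scheme):
    """Sum value * weight, walking left from the checksum (last) character."""
    weights = WEIGHTS[scheme]
    total = 0
    for i, ch in enumerate(reversed(chars)):
        value = int(ch) if ch.isdigit() else VALUES[ch]
        total += weights[i % len(weights)] * value
    return total


row1 = weighted_sum("ABCD6", "13")                # 100, a multiple of 10
row2 = weighted_sum("EFG5", "13")                 # 110, a multiple of 10
block = weighted_sum("ABCD6" + "EFG5X", "137")    # 396, a multiple of 36
print(row1 % 10 == 0, row2 % 10 == 0, block % 36 == 0)  # True True True
```

The block checksum is computed over both rows joined in reading order, which is why the first row is placed in front of the second before the sum is taken.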