Reference

How to Use the Documentation

   Please read the Release Notes before installing OmniPage. The
notes include up-to-date lists of supported scanners, compatible
file formats, and any last-minute information concerning the
current release of OmniPage.

   Use this Reference manual to find specific information about
any OmniPage feature. It describes all the commands and settings,
how to use the editor, how to improve performance, and how to
troubleshoot common problems. This information is also available
in OmniPage's online Help system. Chapter 2 contains a variety of
tutorial exercises to help you learn OmniPage and see what it can
do to streamline your workload.

   OmniPage Professional contains Caere's 24-bit image-editing
program, Image Assistant. The Image Assistant Tutorial booklet
introduces the program's basic features. Refer to the Image
Assistant online Help system for detailed information about Image
Assistant's features.

   Some features described in this documentation are available
only in the OmniPage Professional version of the product. These
descriptions are marked "Professional version only."

Assumptions

   We assume that you know how to work in the Microsoft Windows
environment. If you have questions about how to use dialog boxes,
scroll bars, edit boxes, and so on, please refer to the Windows
User's Guide.

CAERE CORPORATION
100 Cooper Court
Los Gatos, California 95030
European Offices:
CAERE GmbH
Ismaninger Strasse 17-19
81675 Munich, Germany
OmniPage and OmniPage Professional Windows Version 5
Copyright © 1994 Caere Corporation. All rights reserved. Caere,
OmniPage, OmniPage Professional, Image Assistant, AnyPage,
AnyFax, 3D OCR, and True Page are trademarks of Caere
Corporation.
Many of the designations used by manufacturers and sellers to
distinguish their products are claimed as trademarks. Such
designations appearing in this manual have been printed in
initial caps.
Product Serial Number:
(from Disk #1 label)

Table of Contents

Chapter 1 Installation
      What's in the Package                                  1-2
      System Requirements                                    1-3
      Security Lock (International Versions Only)            1-4
      Saving Previous User Dictionaries Before Installation  1-5
      Installing the Software                                1-6
      Setting up a Windows Swap File (Virtual Memory)        1-8
      Starting OmniPage                                     1-10
      Selecting Your Scanner                                1-11
      Conserving Disk Space                                 1-12
Chapter 2 Tutorials
Before You Start                                             2-1
Tutorial 1 - Basic Text Recognition                          2-2
      The OCR Process                                        2-2
      Automatic OCR with the Default Settings                2-6
      Touring the Toolbar                                   2-13
      Touring the Settings Panel                            2-15
      Using the Process Buttons                             2-23
      True Page Recognition (Professional version only)     2-29
      Opening a Graphic in Image Assistant (Professional version only)  2-32
Tutorial 2 - Document Types and OCR Settings                2-35
      Setting a Zoning Method                               2-36
      Complex Layouts                                       2-38
      Standardized Forms                                    2-42
      Legal Documents and Spreadsheets                      2-49
      Documents with Specialized Characters (Professional version only)  2-50
      Foreign-Language and Multilingual Documents           2-56
Tutorial 3 - Streamlining the OCR Workflow                  2-58
      Saving a Settings File for Specific Documents         2-58
      Scanning Large Jobs                                   2-60
      Opening Multiple Image Files                          2-63
      Exporting Images                                      2-64
      Deferring Recognition (Professional version only)     2-66
Chapter 3 Commands and Settings
The Toolbar                                                  3-3
      Shortcut Command Buttons                               3-3
      Processing Buttons                                     3-5
      AUTO Button                                            3-5
      Image Button                                           3-7
      Zone Button                                           3-10
      OCR Button                                            3-12
The File Menu                                               3-14
      Open Document                                         3-14
      Close Document                                        3-16
      Mail                                                  3-16
      Save As                                               3-16
      Export Image                                          3-19
      Revert to Saved                                       3-21
      Get Accuracy Info                                     3-21
      Save Settings                                         3-24
      Load Settings                                         3-25
      Save Zone Template (Professional version only)        3-25
      Print                                                 3-27
      Publish to Envoy (Professional version only)          3-27
      Exit                                                  3-28
The Edit Menu                                               3-29
      Cut                                                   3-29
      Copy                                                  3-30
      Clear                                                 3-30
      Clear All Zones                                       3-31
      Select All in Page                                    3-31
      Check Recognition                                     3-31
      Delete Recognized Zone                                3-35
      Select Recognized Zones                               3-35
      Delete Current Page                                   3-36
      Go to Page                                            3-36
The Format Menu                                             3-37
      Character                                             3-37
      Paragraph                                             3-38
The Process Menu                                            3-41
      Auto                                                  3-41
      Stop                                                  3-42
      Scan Image                                            3-42
      Load Image                                            3-43
      Auto Zones                                            3-45
      Manual Zones                                          3-46
      Use Template (Professional version only)              3-50
      Perform OCR                                           3-51
      Defer OCR (Professional version only)                 3-51
      Train OCR (Professional version only)                 3-52
      Process Settings                                      3-55
      Finish Current Document                               3-55
      Finish Deferred Documents (Professional version only)  3-57
      Start Image Assistant (Professional version only)     3-59
The Settings Menu                                           3-60
      Settings Panel                                        3-60
      Select Scanner                                        3-62
      Select Languages                                      3-62
      Edit Training File (Professional version only)        3-63
      Edit Zone Contents File                               3-66
      Edit User Dictionary                                  3-68
The Window Menu                                             3-70
      Tile Horizontal                                       3-70
      Tile Vertical                                         3-70
      Cascade                                               3-70
      Arrange Icons                                         3-70
      Hide/Show Toolbar                                     3-71
      Hide/Show Status Bar                                  3-71
      Hide/Show Ruler                                       3-71
      Zone Window                                           3-71
      Text Window                                           3-71
      Zoom In                                               3-71
      Zoom Out                                              3-71
The Help Menu                                               3-72
      Contents                                              3-72
      Procedures                                            3-72
      Using Help                                             3-7
      About                                                  3-7

Chapter 4 The Settings Panel
Settings Panel Overview                                      4-2
      Selecting Settings Panel Options                       4-3
Scanner Options                                              4-4
      Page                                                   4-4
      ADF                                                    4-5
      Options                                                4-6
Zones Options                                                4-9
      Columns                                                4-9
      Single Column or Table                                4-10
      None                                                  4-10
OCR Options                                                 4-11
      Input Options                                         4-11
      Use Language Analyst                                  4-12
      Retain Graphics                                       4-13
      Output Options                                        4-14
Fonts Options                                               4-17
      Retained Font Formats                                 4-17
      Ignored Font Formats                                  4-18
Spelling Options                                            4-19
      Spell Checking Options                                4-20
Preferences Options                                         4-21
      Save Page Images in Caere Document                    4-21
      Prompt Before Deleting Pages                          4-22
      Save Settings on Quit                                 4-22
      Reject Character                                      4-22
Chapter 5 Editing Recognized Documents
Choices Before OCR                                           5-2
      OCR Output Options                                     5-2
      Font Options                                           5-6
      Retaining Graphics                                     5-7
      Language Analyst                                       5-8
      Languages and Dictionaries                             5-9
Editing Options After OCR                                   5-12
      Overview of the Text Window                           5-12
      Checking Recognition                                  5-13
      Verifying the Image                                   5-14
      Formatting the Page and Editing Text                  5-14
Saving Your Document                                        5-19

Chapter 6 Improving Performance
Improving Speed                                              6-2
      Manual Brightness                                      6-2
      Language Analyst                                       6-4
      Manual Zones                                           6-5
      Set Up a Permanent Windows Swap File                   6-5
Improving Accuracy                                           6-6
      Document Quality                                       6-6
      Scanner Options                                        6-7
      Scanning Angle                                         6-7
      Scanner Glass Clarity                                  6-8
      Paper Transparency                                     6-8
Chapter 7 Troubleshooting
Before You Begin                                             7-2
Installation                                                 7-3
      Installing OmniPage with the Norton Desktop            7-3
      Conflicts with Disk Cache Programs                     7-3
      Using EMM386.EXE                                       7-4
      SETUP repeatedly requests the same disk                7-4
      Testing OmniPage with a Simplified System              7-4
Scanners                                                     7-5
      The Scan Image commands are grayed out                 7-5
      "Can't Open Scanner" message displays                  7-5
      Microtek Scanners                                      7-5
      Testing OmniPage with the Sample Pages                 7-5
      Checking the Scanner Driver Name and Version           7-6
      Checking the Scanner Hardware                          7-9
      Changing your Scanner Installation                     7-9
      Scanning Causes System Crash                          7-10
Memory                                                      7-11
Operation                                                   7-12
Error Messages                                              7-15
Caere Product Support                                       7-24
      Dial-up Services                                      7-24
      Information We Need From You                          7-24
      International Support                                 7-24

Chapter 8 Understanding OCR
How OCR Works                                                8-1
Basic OmniPage OCR Technologies                              8-2
      AnyFont                                                8-3
      Page Analysis                                          8-3
      Character Experts                                      8-4
      Self-Learning OCR                                      8-5
      AnyPage                                                8-5
      Compound Neural System                                 8-6
      AnyFax                                                 8-7
      The Language Analyst                                   8-7
      Trigram Analysis                                       8-7
      3D OCR                                                 8-8
      True Page                                              8-8
Glossary
Appendix A How to Use True Page
When to Use True Page                                        A-2
True Page Considerations for Different Documents             A-4
      Business Documents                                     A-5
      Legal Documents                                        A-6
      Newspaper and Magazine Articles                        A-7
      Tables and Spreadsheets                                A-8
Settings Panel Options                                       A-9
      OCR Options                                            A-9
      Zones Options                                         A-10
      Font Options                                          A-11
Target Applications                                         A-12
      Working in a Target Application                       A-12
Appendix B Technical Information
TWAIN Scanners                                               B-2
      with the Canon CJ-10 Scanner                           B-2
      Use Relisys and Umax Scanners with TWAIN Only          B-3
      TWAIN Scanners Not Listed                              B-3
      Using Microtek Scanners with TWAIN                     B-3
Supported Output File Formats                                B-4
Formatting Information                                       B-6
      One-Page, True Page Output in WordPerfect for Windows 5.2  B-6
      True Page Output to WordPerfect for Windows 6.0 Format  B-6
      Recognizing Wide Text Zones May Cause Incorrect Margins  B-6
      Saving to Spreadsheet Applications                     B-7
      Incorrect Font Size Output                             B-7
      Retaining Graphics using AUTO and HP AccuPage          B-8
Other Important Information                                  B-9
      Required HP Printer Driver                             B-9
      Resolution for Orchid Fahrenheit 1280 Video Card       B-9
      Recognizing Legal, Landscape Pages with 3D OCR         B-9
      After Dark Star Trek Edition 1.0 and Image Assistant   B-9
      Calibrating a Printer for Image Assistant              B-9
Index

Chapter 1
Installation
Please read this section carefully! It includes:
   What's in the Package
   System Requirements
   Security Lock (International Versions Only)
   Saving Previous User Dictionaries Before Installation
   Installing the Software
   Setting up a Windows Swap File (Virtual Memory)
   Starting OmniPage
   Selecting Your Scanner
   Conserving Disk Space

What's in the Package

   Your OmniPage or OmniPage Professional 5.02 package includes:

     OmniPage installation disks

     OmniPage Reference manual

     Image Assistant Tutorial booklet

     Color Calibration Chart for Image Assistant (Professional
     version only)

     Warranty Registration Card

   If anything is missing, please contact your Caere dealer.

      Please write your warranty registration number (printed on
      the disk labels) in this manual. The number should be in the
      form nnnnX-X followed by further digits, where n is a digit
      and X is a letter.

System Requirements

To install and run OmniPage, you need the following setup:

Computer with an 80386 or higher processor.

Microsoft Windows version 3.1 or higher.

Windows-compatible mouse.

Total system memory of at least 8MB RAM. 12MB RAM is recommended
for Windows for Workgroups users to optimize speed.

4MB or larger permanent Windows swap file.

OmniPage requires hard disk space with at least 24MB available:
10MB for OmniPage files (international versions require 14MB), 10MB
for temporary storage while OmniPage is running, and 4MB for the
Windows swap file.

OmniPage Professional requires hard disk space with at least
27.5MB available: 13.5MB for OmniPage files (international
versions require 18.5MB), 10MB for temporary storage while
OmniPage is running, and 4MB for the Windows swap file.

Image Assistant (Professional version only) requires a Super-VGA
color monitor with 512K memory on the adapter card to view 256
colors. To view all 24 bits of color (millions of colors) in 24-bit
color images, you need a 24-bit video card.

      Image Assistant uses large amounts of free hard disk space
      when processing images. The more free disk space you have,
      the more you can edit and process large image files.

An OmniPage-compatible scanner. OmniPage supports most
Windows-compatible scanners; see TWAIN Scanners on page B-2. Your
scanner must be installed and tested according to the manufacturer's
instructions. OmniPage can also open TIFF image files produced by
other scanners. See Supported Input File Formats on page 3-8.
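
The hard disk figures above are simple sums of three parts: program files, temporary storage, and the swap file. The following Python sketch only restates that arithmetic; it is not part of OmniPage, and the table layout, the "standard" and "international" labels, and the function name are assumptions made for illustration.

```python
# Figures (in MB) are taken from the System Requirements list.
# The names below are illustrative and do not come from OmniPage itself.
PROGRAM_FILES_MB = {
    ("OmniPage", "standard"): 10.0,
    ("OmniPage", "international"): 14.0,
    ("OmniPage Professional", "standard"): 13.5,
    ("OmniPage Professional", "international"): 18.5,
}
TEMP_STORAGE_MB = 10.0  # temporary storage while OmniPage is running
SWAP_FILE_MB = 4.0      # minimum permanent Windows swap file


def required_disk_mb(product: str, version: str = "standard") -> float:
    """Return the total free hard disk space needed, in MB."""
    return PROGRAM_FILES_MB[(product, version)] + TEMP_STORAGE_MB + SWAP_FILE_MB


if __name__ == "__main__":
    for product, version in PROGRAM_FILES_MB:
        total = required_disk_mb(product, version)
        print(f"{product} ({version}): {total} MB")
```

Under these assumptions the standard versions come to 24MB and 27.5MB, matching the figures in the list, and the international versions need 4MB and 5MB more respectively.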

Security Lock (International Versions Only)

   International versions of OmniPage use a hardware key as a security
lock to prevent unauthorized use of Caere software. The security lock,
included in the OmniPage package, is a small device that fits between
your computer's parallel printer port (LPT) and a parallel printer cable
(if used). The security lock must be installed in order for OmniPage to
work.

To install the security lock:

1  Plug the security lock into your parallel printer port (LPT).

2  Plug the parallel printer cable (if used) into the security lock.

   The security lock will not affect printer use.

[Figure: the security lock plugged into the parallel port (LPT), with
the printer cable (if used) attached to the security lock.]

Security Lock with HP Scanjet Scanners

   The combination of the security lock, an HP Scanjet or Scanjet Plus
scanner with an HP 88920 scanner interface board, and a computer with an
on-board printer port may not work. The error message "Cannot write to
device HP Scan" may appear when you try to select a scanner or scan a page
with the security lock installed. If this occurs, add another parallel
port on an add-in card and attach the security lock there.

Saving Previous User Dictionaries Before Installation

   Save a user dictionary (*.ud) created in a previous version of
OmniPage as a text file before you install OmniPage or OmniPage
Professional 5.0. Otherwise, the dictionary is overwritten when the
later version is installed.

To save a previous user dictionary as a text file:

1  Open your older version of OmniPage.

2  Choose User Dictionary... in the Defaults menu.

3  Select the user dictionary file to save and click OK.

   The Edit User Dictionary dialog box appears.

4  Click Export...

5  Save the dictionary as a text file in a different directory.

6  Install and open your newer version of OmniPage or OmniPage
   Professional.

7  Choose Edit User Dictionary... in the Settings menu.

   The Select File dialog box appears.

8  Click New.

   The File to Save dialog box appears.

9  Enter a name for your new user dictionary and click OK.

   The Edit User Dictionary dialog box appears.

10 Click Import...

11 Select the user dictionary you saved as a text file and click OK.

   See Edit User Dictionary... on page 3-68 for more information on
   importing a text file.

Installing the Software

   To optimize installation speed, make sure SMARTDrive is loaded
before installing OmniPage. Do one of the following to load
SMARTDrive:

Type smartdrv at the MS-DOS prompt before you start Windows.

Or, add the SMARTDrive command line to your autoexec.bat file.
For more information, see the Optimizing Windows chapter in your
Windows User's Guide.

To install OmniPage software:

1  Start Windows and open the Program Manager window.

2  Insert the copy of OmniPage disk #1 in drive A: (or B:) of your
   computer.

3  Choose Run in the Program Manager File menu.

   The Run dialog box appears.

4  Type A:\SETUP (or B:\SETUP) in the Command Line edit box and
   click OK.

   A dialog box prompts you to choose where to install OmniPage.
   OMNIPAGE is the default directory for first-time OmniPage users.
   OMNIPRO is the default directory for first-time OmniPage
   Professional users. If you are installing an upgrade of OmniPage
   in the same directory as your current version, existing OmniPage
   files are automatically deleted.

5  Click Continue to start installation.

   A progress meter appears.

6  Insert the other installation disks as prompted.

7  Select your geographic location, North America or Not North
   America, in the dialog box that appears and click OK.

   Your selection determines the default dictionary.

8  Type your registration information in the dialog boxes that
   appear.

   You will be prompted to print out this information.

      Product support is only available to registered users.
      Please send the printed registration information to
      Caere in the supplied envelope. Outside the US or Canada,
      be sure to use the correct envelope. Or, fill out and
      return the supplied registration card.

9  Click OK in the notification box when OmniPage notifies you that
   installation is complete.

10 Restart Windows.

Setting up a Windows Swap File (Virtual Memory)

   OmniPage performs faster the more available memory you have.
12-16MB RAM is recommended for optimal performance. Set up a
permanent Windows swap file with a minimum of 4MB of free,
contiguous disk space to further improve disk speed.

   A swap file acts as virtual memory. Free disk space set aside
as a swap file is used as if it were additional memory. This lets
you run more programs than you could with memory alone, but it is
slower than using regular memory.

      The disk space used for a swap file is different from the
      disk space needed for temporary storage while you are
      working on a file. Be sure to allocate enough free disk
      space for both a swap file and temporary storage.

   Windows 3.1 automatically creates a swap file at setup. You
can change the size of the swap file through the Control Panel.
Before setting up or changing a swap file, you may need to
optimize your disk to maximize the amount of contiguous free disk
space (defragment the disk). Contiguous means that the free disk
space is literally one solid, empty block. Utility programs such
as Norton Utilities can defragment a hard disk. For more
information about swap files, see the Optimizing Windows chapter
in your Windows User's Guide.

To set up or change a Windows swap file (virtual memory):

1  Start Windows in Enhanced mode by typing win /3 at the MS-DOS
   prompt.

2  Double-click the Control Panel icon in the Main window of the
   Program Manager.

3  Double-click the 386 Enhanced icon to open the 386 Enhanced
   dialog box.

4  Click the Virtual Memory button to open the Virtual Memory
   dialog box.

   This dialog box displays the location, size, and type of swap
   file. The swap file should be at least 4096KB.

5  Click the Change button to expand the dialog box.

6  Select a new drive in the Drive list if you want to locate the
   swap file some place other than the default drive. For
   example, you can store the swap file on a second hard disk
   that is faster or larger than the default. If you can't find a
   drive with at least 4096KB of free space, try deleting some
   files and optimizing the disk again.

      A swap file must be located in an uncompressed drive. If
      you use DoubleSpace or another disk compression method,
      consult its documentation regarding swap files.

7 Select Permanent from the Type list.

8  Type 4096 or greater in the New Size edit box and select
    Use 32-Bit Disk Access if it is available.

9  Click OK in the Virtual Memory dialog box and click Yes to 
    verify changes to virtual memory.

10 Restart Windows.
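
The swap file conditions named in this section (permanent type, at least 4096KB, located on an uncompressed drive) can be summarized as a small checklist. The Python sketch below is only an illustration of those three conditions; the class, field, and function names are hypothetical, and nothing here reads real Windows settings.

```python
from dataclasses import dataclass

MIN_SWAP_KB = 4096  # the 4MB minimum given in this section


@dataclass
class SwapFileSettings:
    """Values as you would read them from the Virtual Memory dialog box."""
    size_kb: int
    permanent: bool         # Type list is set to Permanent
    drive_compressed: bool  # e.g. the drive is a DoubleSpace volume


def swap_file_problems(settings: SwapFileSettings) -> list[str]:
    """Return a list of problems; an empty list means all checks pass."""
    problems = []
    if settings.size_kb < MIN_SWAP_KB:
        problems.append(f"Swap file is smaller than {MIN_SWAP_KB}KB.")
    if not settings.permanent:
        problems.append("Swap file type is not Permanent.")
    if settings.drive_compressed:
        problems.append("Swap file is on a compressed drive.")
    return problems


if __name__ == "__main__":
    current = SwapFileSettings(size_kb=2048, permanent=False, drive_compressed=False)
    for problem in swap_file_problems(current):
        print(problem)
```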

Starting OmniPage

   To start OmniPage, do one of the following:

   Double-click the OmniPage or OmniPage Professional icon in the
   Caere Applications program group in the Windows Program Manager.

   Or, choose Run in the Windows Program Manager File menu. Type the
   drive and directory where OmniPage is installed, followed by the
   omnipage command. For example:

      C:\omnipage\omnipage for OmniPage, or

      C:\omnipro\omnipage for OmniPage Professional.

Selecting Your Scanner

   Your scanner must be installed according to the manufacturer's
instructions. Make sure that the scanning software supplied by
the manufacturer works on your system before you install
OmniPage.

   The first time you run OmniPage, the Select Scanner dialog box
appears for you to select your scanner. Depending on the make of
your scanner, a dialog box may appear prompting you for scanner
driver parameters such as I/O addresses. Please consult your
scanner's documentation for information about port or memory
addresses.

   The ScanMgr icon appears at the bottom of the Windows desktop
after you select a scanner. You can change your scanner selection
at any time. For a list of supported scanners, please see TWAIN
Scanners on page B-2.

      To select a TWAIN scanner, please see TWAIN Scanners
      on page B-2.

To change your scanner selection:
1  Install and test the scanner according to the manufacturer's
   instructions.
2 Start OmniPage.
3 Choose Select Scanner... in the Settings menu.
The Select Scanner dialog box appears.
4  Select the name of your scanner and any required scanner
   driver parameters.
5 Click OK.

      If you experience a system crash when you try to scan,
      add the following line under [386Enh] in your system.ini
      file and then restart Windows:

      EMMExclude=A000-EFFF
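
The system.ini change described above can be sketched as a text edit, assuming the intended setting is the standard Windows 3.1 entry EMMExclude=A000-EFFF. The Python function below is a hypothetical helper, not something supplied with OmniPage or Windows; it works on the text of the file so you can try it on a copy first. It edits lines directly because [386Enh] normally contains repeated device= keys, which stricter INI parsers may reject.

```python
EMM_LINE = "EMMExclude=A000-EFFF"  # assumed form of the setting


def add_emm_exclude(ini_text: str, setting: str = EMM_LINE) -> str:
    """Return ini_text with `setting` present under the [386Enh] header."""
    lines = ini_text.splitlines()
    start = None      # index of the [386Enh] header line
    end = len(lines)  # index just past the last line of that section
    for i, line in enumerate(lines):
        stripped = line.strip()
        if stripped.startswith("[") and stripped.endswith("]"):
            if start is None:
                if stripped.lower() == "[386enh]":
                    start = i
            else:
                end = i  # next section header closes [386Enh]
                break
    if start is None:
        # No [386Enh] section yet: append one with the setting.
        lines += ["[386Enh]", setting]
    else:
        section = lines[start + 1:end]
        present = any(l.strip().lower() == setting.lower() for l in section)
        if not present:
            lines.insert(start + 1, setting)
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    sample = "[boot]\nshell=progman.exe\n[386Enh]\ndevice=*vpd\n[standard]\n"
    print(add_emm_exclude(sample))
```

Running the function a second time on its own output leaves the text unchanged, so the line is never added twice.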

Conserving Disk Space

   OmniPage copies all file format conversion filters and ScanMgr
files to your hard disk during installation. The size of the
average filter file is 26KB. The size of the average ScanMgr file is
about 18KB.

   To save disk space, you can delete unused file format conversion
filters and ScanMgr files from your hard disk. ScanMgr files and
conversion filters for supported file formats are listed in Appendix B.
To reinstall a full set of conversion filters and ScanMgr files, you
must reinstall OmniPage.


Chapter 2

Tutorials

   This chapter contains three tutorials, each of which contains a
number of exercises. The tutorials take you through basic text
scanning and into more advanced concepts such as how to create OCR
training files, scan a large stack of documents, and use deferred
page recognition to maximize your efficiency.

   There are three tutorials:

     Basic Text Recognition
     Document Types and OCR Settings
     Streamlining the OCR Workflow

Before You Start

1  Be sure your scanner is attached, turned on, and working with your
   system.

2  Make sure you have the following page samples you need to work
   through the tutorials in this chapter:

     Multiple Column Page Sample

     Single Column Page Sample

     True Page Sample (Professional version only)

   These samples were included with your OmniPage package.

3  Save the files as directed during the exercises so you can use
   them in later exercises.

Tutorial 1 - Basic Text Recognition

   OmniPage lets you scan documents and recognize text with the click
of a single button in the toolbar. The toolbar also puts the most
common OCR options at your fingertips. OmniPage gives you efficient,
flexible control over your documents: you can stop, backtrack, and
restart at any stage without repeating the whole process.

   This tutorial takes you through basic scanning and text recognition
exercises with OmniPage. After completing the exercises in this
tutorial, you will know how to:

  Use the Auto feature to recognize text in standard pages.

  Use the toolbar and Settings Panel.

  Use the process buttons to scan, zone, and recognize text.

  Check and change OCR results.

  Work with the True Page option (Professional version only).

  Launch Image Assistant from OmniPage to work with graphics
  (Professional version only).

The OCR Process

  This exercise acquaints you with the OmniPage application window and
gives you a brief overview of the OCR process. There are two steps:

1 Open OmniPage.

2 Reset the defaults if necessary.

Open OmniPage

  Open OmniPage by double-clicking its icon in the Program Manager.
Caere Applications is the default program group.

The OmniPage Toolbar

A single click of the AUTO button processes your documents
automatically.

  OmniPage's toolbar contains an AUTO button, three large process
buttons, and several shortcut command buttons as pictured above

The process buttons outline the basic flow of OCR. The Image button
determines where the page images come from. The Zone button determines
whether recognition zones are automatically or manually set. The OCR
button determines how and when OCR is performed.

Basic OCR includes:

  Where the page images come from.
  OmniPage can scan documents or open image files.

  Whether recognition zones are automatically or manually set.
  OmniPage can automatically define and order the page areas to be
  recognized.

  How and when OCR is performed.
  OmniPage can perform optical character recognition on the page right
  away (Perform OCR). In the Professional version only, it also can
  perform OCR at a later time (Defer OCR) and learn special characters
  or symbols (Train OCR).

  The smaller buttons are shortcuts to menu commands such as Copy,
Save, Check Recognition..., and so forth.

Reset the Defaults (if necessary)

  The default settings are active the first time you open OmniPage.
If you have not changed any settings, proceed to the next section,
Automatic OCR with the Default Settings. Otherwise, follow these steps
to return to the default settings:

1  Click the drop-down list under each process button and select these
   options:

     Scan Image

     Auto Zones

     Perform OCR (only the Professional version has a drop-down list)

   Click the arrow to open the options list. Then click the option
   you want to set.

2  Click the Settings Panel button in the toolbar to open the Settings
   Panel.

3  Click Use Defaults to return to the default settings.

4  Click Yes in the dialog box that asks if you are sure.

5  Click Close to close the Settings Panel.

    You can leave the Settings Panel open if you have room on your
    screen. (This is useful if you need to change the settings
    frequently.)

The next exercise shows you how to use the OmniPage default settings to
scan a page and recognize its text.

Automatic OCR with the Default Settings

  OCR is easy with OmniPage even when the page itself is complex. Just
click the AUTO button and OmniPage goes to work: it determines scan
intensity and column structure, then performs OCR.

  In this exercise, you will use the Multiple Column Sample practice
sheet, scanning with the default settings. (If the defaults have been
changed, please reset them as described under Reset the Defaults.)

There are three steps:

1 Click AUTO to start the process.

2 Check the results in the text window.

3 Save the file.

Click AUTO to Start the Process

1 Place the Multiple Column Sample in your scanner.


For best text recognition, make sure the page is aligned and oriented
correctly with the text facing down. Most scanners have an arrow or
graphic that indicates proper page placement.

2 Click the AUTO button.

The AUTO button changes to a STOP button.

OmniPage highlights each process button in turn as its function is in
progress. The status bar at the bottom of the window also reports
progress.

The Image button is outlined in black as OmniPage creates a
recognition document from the scanned page.


  The scanned image opens in the zone window. The Zone button is
   highlighted.

OmniPage determines column flow for the text and divides it into
recognition zones, each surrounded by a rectangle. This shows how
OmniPage will order the text as it recognizes the image.

  The OCR button is outlined in black as character recognition takes
   place.

A character window opens with an enlarged view of the text during the
OCR process.

There are three passes over the text for the OCR when you use the
default settings: a cyan pass for initial recognition; a blue pass as
the recognition text is analyzed and corrected; and a dark blue pass for
the final recognition stage.

3 View the recognized text document in the text window.

  If you have OmniPage Professional, the OCR Settings Panel option
True Page - Retain All Page Formatting is the default, so your text
output matches the original document as closely as possible. See True
Page Recognition (Professional version only) on page 2-29 for more
information about scanning with this feature.

  

  If you are using the non-professional version of OmniPage, the OCR
Settings Panel option Retain Font and Paragraph Formatting is the
default. OmniPage matches the font and paragraph format to the
original but the text is displayed in one column in order of
recognition.

If the text is not ordered by either method described above, you may
have misaligned the page in your scanner. Realign the page and try
scanning again.

Check the Results in the Text Window

  After the image is converted to editable text, view the recognized
document in the text window and compare it to the scanned image in the
zone window, then check OCR results.

  OmniPage highlights words or characters in the text that were
changed by the Language Analyst or identified as questionable or
unrecognizable. (If there are no highlighted words, there were no
known errors.)

Blue: words the Language Analyst changed are highlighted in blue.

Green: "suspects," words that may not have been recognized correctly,
are highlighted in green.

Red: unrecognizable characters, or "rejects," are marked with a red
tilde (~).
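
As a rough illustration of how reject markers could be tallied outside
OmniPage, the following Python sketch counts tildes in a piece of
recognized text. It assumes the tilde marker survives in text saved to a
plain-text file, which this manual does not confirm; the function name
summarize_rejects and the sample sentence are hypothetical.

```python
import re


def summarize_rejects(text, marker="~"):
    """Count reject markers and list the words that contain them.

    Assumption: rejects appear as a literal tilde in the saved text.
    """
    words = re.findall(r"\S+", text)
    flagged = [word for word in words if marker in word]
    return {"reject_count": text.count(marker), "flagged_words": flagged}


# Hypothetical sample text with three unrecognized characters.
sample = "The qu~ck brown fox jum~s over the l~zy dog."
report = summarize_rejects(sample)
print(report["reject_count"])
print(report["flagged_words"])
```

A list like this only points to where to look; each flagged word still
needs to be compared with the original image.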

1 Choose Tile in the Window menu so that you can compare the text and
zone windows if the text window is maximized.

2 Click the maximize arrow at the top right of the window to return the
text window to full size for easier text editing.

3 Click the Check Recognition button to check for any potential OCR
errors. (This button is dimmed if the text window is not active.)

The Check Recognition window opens with the image and the text of the
first word that was replaced or questioned during OCR.

4 Correct any errors in the text.

If the word is misspelled, you can correct the spelling in the Change
To edit box and click Change.

OmniPage may list one or more suggested replacements in the Change To
drop-down list. The first word in the list is the word as OmniPage
recognized it. Click on a suggested word and click Change to replace
the word in the text. Alternatively, type the proper word in the Change
To edit box.

If the word is correct as presented, you have two choices:

Click Add to add the word to your User Dictionary. Words added to the
User Dictionary are considered acceptable spellings in future
documents.

Click Ignore to ignore the currently flagged word. Other instances of
the word in the document will be checked.

After you click a button, OmniPage automatically moves to the next word
it identified as misspelled or questionable. Continue correcting text
as necessary.

5 Double-click any word in the text window.

The Verification window opens to display the corresponding word in the
original scanned image. Verify that the recognized word is the same as
the word in the original. If it is not, retype the word correctly in
the text window.

6 Click anywhere in the text window to close the Verification window.

Save the File

Once a document's text is recognized, you can save it either as a file
in one of several word-processing formats or as a Caere Document.

Documents saved using a word-processing format, such as WordPerfect,
can be opened from within that application in the same way as any other
one of its files. This method saves the formatted text and any embedded
graphics in the text window, not the original scanned page image from
the zone window.

Saving the document as a Caere file (*.met) saves both the recognized
text in the text window and the original page image from the zone
window.

OmniPage can open only Caere Documents. If a scanned page is going to
be used more than once, or saved to several different word-processing
file formats, save it first as a Caere Document. This saves you the
trouble of rescanning the page. In this exercise, you will save the
file first as a Caere Document and then as a word-processing file.

1 Click in the text window to make it active.

2 Click the Save As... button.

The Save As dialog box opens.

3 Select Caere [*.met] in the Save File as Type edit box and select the
Data directory as its location.

4 Type lscan.met in the File Name edit box.

5 Click OK.

6 Choose Close Document in the File menu or use the Ctrl+W keyboard
shortcut.

7 Choose Open Document... in the File menu or use the Ctrl+O keyboard
shortcut to re-open the file lscan.met.

The file opens in the OmniPage window.

Note that both the zone and text windows open.

8 Click the Save As... button.

9 Select a word-processing application file type, such as Microsoft
Word for Windows, in the Save Files as Type list box and give the file
a new name.

10 Click OK.

11 Choose Close Document in the File menu or use the Ctrl+W keyboard
shortcut.

Touring the Toolbar

The OmniPage toolbar provides options for fast and efficient document
processing.

You can set the Image, Zone, and OCR options on the toolbar before
clicking the AUTO button so that scanning progresses without
interruption.

You can also select the Image, Zone, and OCR processes one at a time.
Click the Image button to scan or load images, and click the Zone
button later when you are ready to set the zones. Each process button
becomes available as soon as the preceding process is finished.

This exercise gives you an overview of the most commonly used options.
Note the location of the Settings Panel button in the toolbar. For more
information on the Settings Panel, see Touring the Settings Panel on
page 2-15.

The Image button determines where the page images come from.

The Zone button determines whether recognition zones are automatically
or manually set.

The OCR button determines how and when OCR is performed.

The Settings Panel button opens the Settings Panel.

There are three steps in this exercise:

1 Select an Image button option.

2 Select a Zone button option.

3 Locate the OCR button (select options in the Professional version
only).

Select an Image Button Option

Getting an image, either by scanning a document or loading an image
file, is the first step in the OCR sequence. The option you select in
the drop-down list takes effect when you click the AUTO button.

1 Click the drop-down list under the Image button and choose Scan Image.

Use this setting when scanning pages with your scanner.

2 Click the drop-down list in the Image button and select Load Image.

Use this setting when opening image files for recognition.

3 Reset the drop-down list to Scan Image.

Select a Zone Button Option

The Zone button lets you choose whether to have OmniPage draw zones
automatically or whether to draw the zones manually. The Zone button is
available after a page has been scanned or an image loaded.

1 Click the drop-down list in the Zone button and select Manual Zones.

If you select Manual Zones, then click the AUTO button, OmniPage stops
after acquiring the image. At this point you can draw the zones
manually. See Chapter 2, Document Types and OCR Settings, for more
information about when and how to draw your own zones.

OmniPage Professional users also can use the Zone button to open zone
templates they have created. See Standardized Forms on page 2-42 for
more information.

2 Reset the drop-down list to Auto Zones.

Locate the OCR Button

The OCR button is the last button in the process. The OCR button is
available after a page has been zoned, whether automatically or
manually. OmniPage Professional users can select the Train OCR and
Defer OCR options, detailed in later chapters.

Touring the Settings Panel

The Settings Panel lets you customize the OCR process: you can set
scanning, zoning, OCR, spell checking, and other parameters. Select
these options before scanning.

This exercise provides an overview of the most important Settings Panel
options as well as brief explanations of when to choose certain
options. For a more detailed explanation of each item in the Settings
Panel, please refer to Chapter 4, The Settings Panel.

There are eight steps:

1 Open the Settings Panel

2 View the Panel options

3 Select the Scanner options

4 Select the Zones options

5 Select the OCR options

6 Select the Fonts options

7 Select the Spelling options

8 Select the Preferences options

Open the Settings Panel

There are several ways to open the Settings Panel.

1 Click the Settings Panel button in the OmniPage toolbar.

The Settings Panel opens the way you last left it.

2 Click Close to close the Settings Panel.

3 Position the mouse pointer over the Image button in the toolbar and
click the right mouse button.

The Settings Panel opens to the Scanner options.

This method of opening the Settings Panel works with the Scanner, Zone,
and OCR settings when you click the corresponding process button.

Each task must be available or the button cannot be clicked. The Zones
and OCR process buttons are not available until a document has been
loaded or scanned.

5 Click Close to close the Settings Panel.

6 Choose Settings Panel... in the Settings menu or use the Ctrl+E
keyboard shortcut to re-open the Settings Panel.

View the Panel Options

The Settings Panel is composed of six different sets of options. Each
is represented by an icon in a scrollable list in the window.

Use one of the methods described in Step 1 to open the Settings Panel.

Click the Scanner icon in the left of the Settings Panel to view
scanning options.

Click each icon in turn to view its options.

Use the scroll box to access and select icons below the OCR icon.

Select the Scanner Options

Select the scanner icon again to view options available when using your
scanner.

The most important settings for recognition accuracy are the choices
under Options. These determine how a page is scanned and will vary
according to the type of document you want to recognize.

OmniPage Professional users can select the 3D OCR option. Combined with
AnyPage or HP AccuPage 2, 3D OCR achieves OmniPage's best possible
accuracy with difficult document types such as degraded documents or
pages with varying print intensity. See Chapter 2, Document Types and
OCR Settings, for more information.

Select the Auto Brightness with AnyPage/HP AccuPage 2 option.

Use this for pages with crisp text on varicolored backgrounds, such as
magazine pages in which some of the text appears on a colored
background. Any halftones on a page scanned with this setting will
appear as grayscale images.

Scanning with this setting is slower than using the manual brightness
control, but the OCR results are generally better.

This option uses Caere AnyPage or HP AccuPage 2 technology to adjust
the brightness levels for each section of the text automatically. If
your scanner supports HP AccuPage 2, the option will be named Auto
Brightness with HP AccuPage 2.

Select Manual Brightness.

Use this for pages with distinct, normal-sized text (8-12 point)
printed on white paper, and for all black-and-white scanners. Any
graphics on a page scanned with this setting will appear in black and
white. This is the fastest option because the page is scanned at a
uniform brightness level.

Move the scroll box for brightness control to the left and to the
right.

The brightness setting changes as the scroll box moves. The number of
settings you have available depends
upon your scanner model. An HP ScanJet Plus, for example, has 256
brightness settings.

The brightness control is only available with the Manual Brightness
option. It is dimmed if one of the other two options is selected.

Select the Zones Options

Click the Zones icon to view the Zoning Method options.

These options determine how OmniPage zones the areas for recognition
and orders the text.

Multiple Columns is the default zoning method. It detects column flow
in standard and multi-column documents and lets you save any graphic
images.

Single Column or Table is used for spreadsheets and correspondence.
Graphics will be discarded.

None is used in special circumstances. Refer to None on page 2-10 for
more information about this setting.

Select the OCR Options

Click the OCR icon to view OCR input and output options.

Select Use Language Analyst to check spelling and perform word and
character analysis.

Language analysis begins automatically during the recognition process.

Use Output Options to choose how OmniPage will handle output
formatting.

True Page - Retain All Page Formatting is the default for the
Professional version only; it is not an option in the regular version
of OmniPage. Select this to retain all the original page formatting in
a recognized document. Use this to duplicate a document as closely as
possible, especially when you will not need to do much editing or
reformatting after recognition.

Retain Font and Paragraph Formatting Only is the default setting in the
regular version of OmniPage. Select this to retain font and paragraph
formatting in a recognized document.

Ignore Fonts and All Formatting is an option for those who need only
unformatted text from a document.

The choice you make in Output Options will affect which selections are
available in the Fonts Options.

Select the Fonts Options

Click the Fonts icon to choose typefaces that will appear in the text
window or your word processor if you chose OCR options that retain font
formatting. If font formatting is ignored, choose one typeface for all
font formats.

Use the Retained Font Formats choices when you have selected the True
Page - Retain All Page Formatting or Retain Font and Paragraph
Formatting options in the OCR settings.

Use the Ignored Font Formats choices when you have selected Ignore
Fonts and All Formatting in the OCR settings.

The font characteristics will not be retained if you disable your
Windows TrueType fonts.
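
The relationship between the Output Options and the two groups of font
choices can be summarized in a small lookup. The Python sketch below is
only an illustration of the rule described above; the option names are
taken from this section, while the function font_choice_group is
hypothetical and not part of OmniPage.

```python
# Output options that keep font formatting, as named in this section.
RETAINING_OPTIONS = {
    "True Page - Retain All Page Formatting",
    "Retain Font and Paragraph Formatting",
}
IGNORING_OPTION = "Ignore Fonts and All Formatting"


def font_choice_group(output_option):
    """Return which group of Fonts choices applies to an output option."""
    if output_option in RETAINING_OPTIONS:
        return "Retained Font Formats"
    if output_option == IGNORING_OPTION:
        return "Ignored Font Formats"
    raise ValueError(f"unknown output option: {output_option}")


print(font_choice_group("Ignore Fonts and All Formatting"))
```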

Select the Spelling Options

Click the Spelling icon to select dictionary and spell checking
options.

OmniPage's main dictionary contains over 100,000 terms. (Order
dictionaries for additional languages by calling Caere at (800)
535-SCAN.) The User Dictionary is your personal editable dictionary.

Select the Preferences Options

Click the Preferences icon to customize general OmniPage operations.

Using the Process Buttons

Instead of using the AUTO button, you can click each process button in
the toolbar individually. Each button becomes available as soon as the
preceding button's process is finished.

Once an image is loaded or scanned, you can set Zones or OCR options in
the Settings Panel if necessary. Once set, you do not have to rescan
the image; just click the appropriate button to have the new settings
take effect.
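
The way each process button becomes available can be pictured as a
simple sequence. The following Python sketch is a hypothetical model of
that behavior, not a description of how OmniPage is implemented; the
class name RecognitionDocument and its methods are invented for
illustration.

```python
class RecognitionDocument:
    """Hypothetical model of the Image -> Zone -> OCR button sequence."""

    STAGES = ("Image", "Zone", "OCR")

    def __init__(self):
        self.done = []  # stages finished so far, in toolbar order

    def available(self):
        """Return the buttons that could be clicked right now."""
        # Finished stages can be redone; the first unfinished stage is next.
        return list(self.STAGES[: len(self.done) + 1])

    def click(self, stage):
        """Run one stage, refusing stages whose predecessor has not finished."""
        if stage not in self.available():
            raise RuntimeError(f"{stage} button is not available yet")
        if stage not in self.done:
            self.done.append(stage)


doc = RecognitionDocument()
print(doc.available())
for stage in ("Image", "Zone", "OCR"):
    doc.click(stage)

# After changing a setting, Zone and OCR can be redone without rescanning.
doc.click("Zone")
doc.click("OCR")
print(doc.available())
```

In this model the Image stage is never repeated when Zone and OCR are
redone, which mirrors the point that the page does not have to be
rescanned.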

You will use the Single Column Page Sample to practice using each
process button individually. After going through all three processes,
you will reset the Zoning Method option in the Settings Panel and then
click the Zone and OCR buttons to redo the OCR process.

There are six steps:

1 Check the toolbar settings.

2 Click each process button in turn.

3 Change the zoning method in the Settings Panel.

4 Click the Zone button to reset the zones.

5 Click the OCR button to finish the process.

6 Check the text and save the file.

Check the Toolbar Settings

1 Click the drop-down lists under the process buttons and select:

Scan Image

Auto Zones

Perform OCR

2 Click the Settings Panel button in the toolbar to open the Settings
Panel.

3 Click Use Defaults to return OmniPage to its default settings.

4 Click the Zones icon in the Settings Panel.

Note that the Multiple Columns zoning method is the default.

You are scanning the Single Column Page Sample, which means that the
wrong zoning method is selected. Leave this setting as it is for now--
you will change it later.

5 Professional users only--click the OCR icon.

6 Select the Retain Font and Paragraph Formatting Only option.

7 Click Close.

Click Each Process Button in Turn

1 Align the Single Column Page Sample properly in your scanner.

2 Click the Image button.

The scanned document appears in the zone window.

3 Click the Zone button.

OmniPage zones the document.

Note that OmniPage, because of the zoning method set in the Settings
Panel, mistakenly zones the numbers on the right of the table as a
separate column of text.

4 Click the OCR button.

OmniPage makes three passes over the document and displays the
recognized text in the text window.

5 Compare the formatting of the document in the text window to that of
the image in the zone window.

[Text window: the table's text column (Price, Sophistication of
features, Image handling capabilities, and so on) appears first,
followed by the numbers column.]

This exercise used the Multiple Columns method, which means that
OmniPage ordered the zones from left to right according to the columns
it detected.

OmniPage has separated the table into two distinct columns, placing the
numbers column below the text column.

Multiple Columns was the wrong option to choose for this document. You
must choose the Single Column or Table zoning method to maintain the
table's formatting.
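
To see why column-by-column ordering breaks a table apart, consider the
following Python sketch. It is a simplified, hypothetical model (the
cell labels, values, and function names are invented, and OmniPage's
actual zoning logic is not documented here): each cell carries a column
and row position, and the two functions order the same cells in the two
ways discussed above.

```python
# Each cell is (text, column index, row index); the values are made up.
cells = [
    ("Item A", 0, 0), ("10", 1, 0),
    ("Item B", 0, 1), ("20", 1, 1),
]


def column_first(cells):
    """Read every cell of column 0, then every cell of column 1, and so on."""
    ordered = sorted(cells, key=lambda cell: (cell[1], cell[2]))
    return [text for text, _, _ in ordered]


def row_first(cells):
    """Keep the cells of each row together on one line."""
    rows = {}
    for text, _, row in sorted(cells, key=lambda cell: (cell[2], cell[1])):
        rows.setdefault(row, []).append(text)
    return [" ".join(parts) for _, parts in sorted(rows.items())]


print(column_first(cells))  # numbers end up below the text column
print(row_first(cells))     # each number stays beside its text
```

The first ordering resembles the Multiple Columns result in this
exercise; the second resembles what a table needs.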

Change the Zoning Method in the Settings Panel

1 Reopen the Settings Panel.

2 Click the Zones icon.

3 Select the Single Column or Table option.

4 Click Close.

Click the Zone Button to Reset the Zones

1 Click the Zone button to reset the zones.

A dialog box asks if you want to replace the current zones.

2 Click Yes.

OmniPage draws new zones.

3 Verify that the new zones are drawn correctly.

The table is now preserved as a unit.

Click the OCR Button to Finish the Process

1 Click the OCR button to finish the process.

A dialog box asks if you want to replace the current text.

2 Click Yes.

When recognition is complete, the text window opens.

3 Compare the formatting of the document in the text window to that of
the image in the zone window.

Notice that the numbers in the table's second column now line up with
the corresponding lines of text in the first column. The table's format
has been preserved by using the proper zoning method.

Check the Text and Save the File

1 Click the Check Recognition button in the toolbar and make any changes
necessary.

See Automatic OCR with the Default Settings on page 26 for detailed
instructions on checking text.

2 Click the Save As... button.

3 The Save As dialog box opens.

4 Select a word-processing application file type, such as Microsoft Word
for Windows, in the Save File as Type drop-down list and save the file
with a new name.

5 Click OK.

6 Choose Close Document in the File menu or use the Ctrl+W keyboard
shortcut.

True Page Recognition (Professional version only)

OmniPage Professional includes True Page recognition so your OCR output
can retain the page layout and the images exactly as they were
displayed on the page. Skip this exercise if you do not own the
Professional version of OmniPage.

In this exercise you will scan the True Page Sample. There are three
steps:

1 Select True Page and other settings.

2 Click AUTO.

3 Check the results and save the file.

Select True Page and Other Settings

1 Click the drop-down lists under the process buttons in the toolbar and
select:

 Scan Image
 Auto Zones
 Perform OCR

2 Click the Settings Panel button in the toolbar.

3 Click Use Defaults.

4 Click the OCR icon.

Make sure the following options are selected:

True Page - Retain All Page Formatting

Retain Graphics

Click Close.

Click AUTO

1 Align the True Page sample in your scanner.

2 Click the AUTO button.

When the process is finished, the text window should open to show the
recognized document in its original format and the graphic in its
proper place.

Check the Results and Save the File

1 Click the Check Recognition button in the toolbar to verify the OCR
results.

2 Save the file as a Caere Document.

3 Type the name 1

4 Save the file again in a word-processing format.

You have a number of different options for saving the file. If your
word-processing application supports embedded graphics, you can save
the document in that application and the graphics will be displayed.

If you like, repeat Steps 1 and 2 but deselect the Retain Graphics
option in the OCR Settings Panel. The text should appear in the same
format you see, but will have an empty space where the graphic was
originally.

Opening a Graphic in Image Assistant (Professional version only)

You can work with Caere's full 24-bit image editing application, Image
Assistant, directly from OmniPage. This exercise uses the saved
document from the previous exercise. Skip this exercise if you do not
own the Professional version of OmniPage.

There are three steps:

1 Open the Caere Document file called layout.met.

2 Double-click the graphic.

3 Experiment with the Tool Palette.

Open the Caere Document File Named Layout.met

Open the Caere Document file you saved as layout.met. If you need to
scan the page, see the instructions in the preceding exercise.

Double-Click the Graphic

1 Click the text window to make it active.

2 Double-click the graphic in the text window.

Image Assistant launches and opens a document window containing the
graphic you just double-clicked.

Experiment with the Tool Palette

1 Experiment with the tools in Image Assistant's tool palette to see
what special image editing effects you can achieve.

Refer to the Image Assistant tutorials booklet for a guided tour of its
features.

2 Choose Save in the File menu if you want to save changes to the file.

The file's default name is OmniPage.tif. It is saved as a TIFF file.
You can choose Save As... in the File menu if you want to assign the
file a different name and file type.

3 Click OK.

4 Choose Exit in the File menu.

Image Assistant closes and you are returned to OmniPage Professional.

Refer to the Image Assistant tutorials booklet included with your
OmniPage package for detailed information. You can also use the online
Help system available in Image Assistant.
Tutorial 2--Document Types and OCR Settings

Document Types and OCR Settings

People encounter a variety of documents in an average workday: office
memos; legal documents; standardized forms; newspaper and magazine
pages; foreign-language reports; etc.

Before you scan any page, you must determine how you want OmniPage to
order the page information and in what format you want the pages'
recognized text and graphics.

This chapter examines some common document types and the OCR concepts
associated with each one:

Setting zoning options

You'll use the zone tools and learn which OCR settings to choose for
various types of documents.

Complex layouts

You'll learn when to use manual zoning and practice recognizing just a
portion of a scanned document.

Standardized forms

You'll specify zone contents, edit a zone contents file, create a zone
template, and export a graphic.

Legal documents and spreadsheets

General tips are listed for each.

Documents with specialized characters

You'll train OmniPage to recognize specialized characters and edit an
OCR Training file.

Foreign-language and multilingual documents

You'll learn how to use the Language Analyst, how to select
dictionaries, and how to select an appropriate character set.

Make sure you have the following page samples, which you need to work
through the tutorials in this chapter:

Manual Zoning Page Sample

Standardized Form Sample

Setting a Zoning Method

The zoning method selection in the Settings Panel tells OmniPage how it
should evaluate the column structure of text zones. These zones may be
drawn either automatically by OmniPage or manually by you.

Select Multiple Columns when recognizing several columns of text on a
page or any column with a graphic.

OmniPage separates text from graphics and looks for regular vertical
separations of text to define columns.

This is a good method to use on magazine or newspaper pages.

Select Single Column or Table when recognizing a table, chart,
spreadsheet or page-wide text with no graphics (memos and reports, for
example).

Select None when you want everything in the zone recognized as text.

OmniPage will not discern column layout or distinguish graphics from
text. This is the fastest option to use when you recognize manually
drawn, text-only zones. It is useful for documents with very small text
areas such as those found in pleading pages or telephone books.

With practice, you will learn which options to select for particular
documents. The examples in this chapter will illustrate some of those
choices.
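
The guidance above can be condensed into a small decision rule. The
Python sketch below is one hedged reading of it, not OmniPage's own
logic; the function suggest_zoning_method and its parameters are
hypothetical, and real documents may call for a different choice.

```python
def suggest_zoning_method(text_columns=1, has_graphics=False,
                          is_table=False, text_only_manual_zones=False):
    """Suggest a zoning method from a rough description of the page."""
    if text_only_manual_zones:
        # Everything in the zone is treated as text; no layout analysis.
        return "None"
    if is_table:
        # Tables, charts, and spreadsheets keep their rows together.
        return "Single Column or Table"
    if text_columns > 1 or has_graphics:
        # Several text columns, or any column that includes a graphic.
        return "Multiple Columns"
    # Page-wide text with no graphics, such as memos and reports.
    return "Single Column or Table"


print(suggest_zoning_method(text_columns=3, has_graphics=True))
print(suggest_zoning_method(is_table=True))
print(suggest_zoning_method(text_only_manual_zones=True))
```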

Refer to Touring the Settings Panel on page 2-15 and Chapter 4, The
Settings Panel, for more information on the Settings Panel options.

Complex Layouts

After you select options in the Settings Panel, you have a choice
between auto and manual zoning. With complex or unusually formatted
documents, manual zoning often returns better results than auto zoning.

In Chapter 1 of this tutorial you used auto zoning after scanning the
page samples. Manual zoning would have achieved virtually the same
recognition results, but with more effort on your part. In those
exercises, there was no point to manual zoning because the entire
document was recognized and no text was reordered.

Use manual zoning in the following circumstances:

  to select just a portion of a page for recognition

  to rearrange text order

  to specify the contents of a particular zone

You can practice drawing your own zones with the Manual Zoning Page
Sample. There are six steps:

1 Set Manual Zones and other options.

2 Click AUTO to start the process.

3 Practice using the zone tools.

4 Draw the appropriate zones.

5 Perform OCR.

6 Check the results and save the file.

Set Manual Zones and Other Options

1 Set these options in the toolbar:

Scan Image
Manual Zones
Perform OCR

2 Open the Settings Panel and click Use Defaults to return OmniPage to
its default settings.

3 Click the Zones icon.

4 Be sure that Multiple Columns is the selected zoning method.

5 Click the OCR icon.

6 Select Retain Font and Paragraph Formatting.

This option preserves the document's fonts and paragraph structure.

7 Click Close.

Click AUTO to Start the Process

1 Align the Manual Zoning Page Sample in your scanner.

2 Click AUTO to start the process.

OmniPage scans the page. The zone window opens with the zone tools
palette displayed. The process stops so you can draw recognition zones
manually.

The zone tools palette includes these tools:

Use the arrow buttons to rotate the image.

Zoom your view of the page in and out.

Draw zones around the text you want recognized.

Change the order of the recognition zones.

Erase a zone.

Practice Using the Zone Tools

When the zone window opens, the Draw Zones tool button with
a cross-hair appears. (If you had selected Auto Zones, this button
would show a cursor instead of a cross-hair.)

1 Click the Zoom tool.

The cursor becomes a magnifying glass.

2 Move the Zoom tool over any part of the image and click the left mouse
button to enlarge the image.

3 Move the Zoom tool over the enlarged image and click the right mouse
button to reduce the image.

4 Click the Draw Zones tool.

5 Hold down the mouse button and drag the cursor to draw a rectangle
around any section of text on the page.

Leave white space around the text if possible.

OmniPage tags this rectangle with a 1.

6 Draw a second zone anywhere on the page.

This rectangle is numbered 2. OmniPage numbers each new zone
sequentially.

7 Click the Order Zones tool.

The cursor becomes the # symbol and numbers in the two zones disappear.

8 Click the second zone you drew.

Now the zone is labeled 1. This zone will be recognized first and
placed at the beginning of the new document in the text window.

9 Click the first zone you drew.

It is now labeled 2 and will be recognized second.

10 Click the Erase Zones tool and click each zone.

The zones disappear.

11 Click the left arrow button to rotate the image 90 degrees
counterclockwise.

12 Click the right arrow button to rotate it back.

OmniPage rotates the page automatically when you use the AUTO feature
and you have Automatically Correct Page Orientation selected in the
Settings Panel OCR options.
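
The numbering behavior in this exercise can be pictured with a small model. The Python sketch below is illustrative only: the Zone class and the number_zones and reorder functions are hypothetical names, not part of OmniPage. It assumes that zones are numbered in the order you draw them and renumbered in the order you click them with the Order Zones tool.

```python
from dataclasses import dataclass


@dataclass
class Zone:
    """A rectangular recognition zone (hypothetical model, not OmniPage's format)."""
    left: int
    top: int
    right: int
    bottom: int


def number_zones(zones):
    """Number zones 1, 2, 3 and so on in the order they were drawn."""
    return {index + 1: zone for index, zone in enumerate(zones)}


def reorder(numbered, click_order):
    """Renumber zones in the order they were clicked.

    click_order lists the old zone numbers in the order the user clicked them.
    """
    return {new: numbered[old] for new, old in enumerate(click_order, start=1)}


first = Zone(10, 10, 200, 80)
second = Zone(10, 100, 200, 180)
numbered = number_zones([first, second])

# Clicking the second zone first makes it zone 1, so it is recognized first.
reordered = reorder(numbered, click_order=[2, 1])
```

In this model, reordered[1] is the zone that was drawn second, which matches the result of the exercise.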

Draw the Appropriate Zones

Suppose the only information you need from this page is the text about
the international awards OmniPage products have won. Shorten
recognition time by drawing zones around just the portions of text you
want to use.

1 Draw a zone around the February 1992 award listed in the Product
Highlights section of the text.

[Figure: the Manual Zoning Page Sample as it appears in the zone window.]

2 Draw another zone for the August 1992 award.

3 Draw a third zone for the October 1992 award.

Perform OCR

1 Click the AUTO button or the Perform OCR button to continue the process.

The recognized text appears in the text window.

Check the Results and Save the File

Click the Check Recognition button in the toolbar to check your OCR
results.

You can save the file in the format of your choice, choose Close
Document in the File menu, or use the Ctrl+W shortcut.

Standardized Forms

You can speed document processing and improve accuracy by manually
zoning a document and telling OmniPage what kind of characters it
should recognize in those zones. This is called "specifying zone
contents." It is particularly effective with standardized forms and
spreadsheets.

You can save the zones as a template file and use this template each
time you scan the same kind of document.

If you do not specify the zone contents, OmniPage looks for
Alphanumeric characters: letters of the alphabet, numbers, and standard
punctuation symbols. You can specify a zone as Numeric or Graphic as
well.

 The Numeric file is editable. Characters can be added to or deleted
from this file.

 Any zone recognized as a graphic can be exported separately as a
graphic file.

 You also can create your own zone contents files with the characters
you require.

Specifying Zone Contents

In this exercise you will create a new zone contents file and export a
graphic.

Use the Standardized Form Sample in this exercise.

There are five steps:

1 Set the toolbar options.

2 Scan and zone the image.

3 Create a new zone contents file.

4 Perform OCR.

5 Export the graphic as a TIFF file.

Set the Toolbar Options

1 Set these options in the toolbar:

Scan Image

Manual Zones

Perform OCR

2 Open the Settings Panel and click Use Defaults to return OmniPage to
its default settings.

Scan and Zone the Image

1 Click the AUTO or Image button to begin scanning.

OmniPage scans the page and stops so you can draw manual zones. The
image appears in the zone window.

2 Draw a zone around the logo and the words "Account Analysis" in the
top left of the page.

3 Click the Zone Contents drop-down list and select Graphic.

This tells OmniPage not to perform OCR on that zone because it contains
a picture.

For the purposes of this exercise, you are recognizing the entire
company logo as a graphic, even though it consists mainly of letters.

Note that OmniPage normally would recognize the logo as text, and skip
recognizing the icon in the logo entirely.

4 Draw a zone around the text under the logo, from the Account
information through the first paragraph under the Financial Information
header.

5 Select Alphanumeric in the Zone Contents drop-down list.

OmniPage will look for both letters and numbers when it recognizes this
portion of the image.

6 Draw a zone around the financial section of the page.

7 Select Numeric in the Zone Contents drop-down list.

The only characters in this section are numbers and the letters YTD. If
the Alphanumeric option were selected, a 5 could be mistaken for the
letter S and a 0 (zero) for the letter O. Selecting the Numeric option
reduces these common OCR errors.

A numeric zone contents file does not contain any alpha characters,
however, so in this case the Numeric designation is not sufficient for
optimal recognition. You will create a new zone contents file that
includes the characters YTD.
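
A rough way to see why a restricted zone contents file helps is to model it as a filter on the characters OmniPage may report. The Python sketch below is an illustration under stated assumptions, not OmniPage's recognition method: the LOOKALIKES table, the contents of the NUMERIC set, and the constrain function are all hypothetical.

```python
# Look-alike pairs named in the text: 5/S and 0 (zero)/O. Illustrative only.
LOOKALIKES = {"S": "5", "O": "0", "5": "S", "0": "O"}

# Assumed contents of a numeric zone contents file (not taken from OmniPage).
NUMERIC = set("0123456789.,-$%() ")

# A custom file: the numeric characters plus the letters Y, T, and D.
FINANCE = NUMERIC | set("YTD")


def constrain(raw, allowed):
    """Map each raw OCR guess onto the allowed character set.

    A guess outside the set is swapped for its look-alike when that look-alike
    is allowed; otherwise it is marked with a tilde as unrecognized.
    """
    result = []
    for ch in raw:
        if ch in allowed:
            result.append(ch)
        elif LOOKALIKES.get(ch) in allowed:
            result.append(LOOKALIKES[ch])
        else:
            result.append("~")
    return "".join(result)


raw_guess = "YTD 1,2S0.O5"          # S and O are misread digits
numeric_only = constrain(raw_guess, NUMERIC)
with_ytd = constrain(raw_guess, FINANCE)
```

With the numeric set alone, the digits are repaired but the letters YTD are lost. With the custom set, both the digits and the letters survive, which is the reason for creating a new zone contents file in the next exercise.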

Create a New Zone Contents File

Choose the Edit Zone Contents File... command in the Settings menu.

The Select File dialog box opens.

Click New.

The Edit Zone Content File dialog box opens with a string of
highlighted characters.

The highlighted characters are replaced with the ones you enter.

Click Save.

The File to Save dialog box opens.

Type the file name finance in the File to Save dialog box.

6 Click OK.

7 If the third zone you drew around the financial contents in the zone
window is not selected, click in it now to select it.

8 Select Finance in the Zone Contents drop-down list.

Perform OCR

1 Click the OCR button to continue the process.

OmniPage recognizes each of the zones according to the zone contents
you specified.

2 Click the Check Recognition button to verify the results.

You should find no errors in the form.

Export the Graphic as a TIFF File

1 Choose Export Image... in the File menu.

The Export Image dialog box opens.

2 Select the Save Each Graphic Zone to a File option in the Image Options
section.

3 Type the name graphic.tif in the File Name edit box.

The graphic format TIFF is already selected in the Save Files as Type
list box.

4 Click OK.

To save the file in the format of your choice, choose Close Document in
the File menu, or use the Ctrl+W shortcut.

For more information on exporting graphics and saving files, see
Tutorial 3--Streamlining the OCR Workflow on page 2-58.

Creating a Zone Template (Professional version only)

If you regularly scan a particular type of document, especially
standardized forms that require the same manual zoning on each page,
create and save a zone template. Instead of redrawing the zones each
time you scan that document type, simply open the zone template before
scanning.

Each zone template file designates zones exactly as they were drawn
along with their zone contents specifications. (Zones options from the
Settings Panel are not saved.) You can create up to 250 templates.

In this exercise you will create manual zones on a scanned document,
save the zones as a new zone template, and open the template to use on
the form again.
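
Conceptually, a zone template is a saved list of rectangles together with their zone contents. The Python sketch below shows that idea only. The TemplateZone class, the two helper functions, and the use of JSON are assumptions made for illustration; the actual template file format is not described in this manual.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class TemplateZone:
    """One zone as drawn, plus its zone contents (hypothetical model)."""
    left: int
    top: int
    right: int
    bottom: int
    contents: str  # for example "Alphanumeric", "Numeric", "Graphic", or a custom name


def template_to_text(zones):
    """Serialize zones and their contents; Settings Panel options are left out."""
    return json.dumps([asdict(zone) for zone in zones], indent=2)


def template_from_text(text):
    """Rebuild the zones exactly as they were drawn."""
    return [TemplateZone(**entry) for entry in json.loads(text)]


form_zones = [
    TemplateZone(40, 30, 300, 120, "Graphic"),
    TemplateZone(40, 140, 560, 400, "Alphanumeric"),
    TemplateZone(40, 420, 560, 700, "finance"),
]
restored = template_from_text(template_to_text(form_zones))
```

The restored list equals the original one, which is the property that lets you reuse a template on every page of the same form.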

You can use the Standardized Form Sample you used in the previous
exercise, or any page you choose.

1 Set these options in the toolbar:

Scan Image

Manual Zones

Perform OCR

2 Scan a document of your choice.

The image appears in the zone window and OCR stops so you can draw
manual zones.

3 Draw your manual zones.

4 Specify zone contents as appropriate.

5 Choose Save Zone Template... in the File menu.

The Save Zone Template File dialog box opens.

6 Enter a file name in the File Name edit box.

Normally you would open a zone template after scanning a document and
before it has any zones. For the purpose of this exercise, you will
remove the zones already set in the image, then open the zone template.

7 Click OK.

8 Choose Clear All Zones in the Edit menu or click the Clear All Zones
button in the toolbar. (You can also use the Erase Zones tool in the zone
window palette to erase zones one by one.)

9 Select the file name of your new zone template in the drop-down list
under the Zone button.

10 Click the Zone button.

OmniPage draws the template zones on the image.

11 Set options as needed in the Settings Panel before you perform OCR.

Legal Documents and Spreadsheets

The exercises in this chapter have used standard 8.5 x 11 inch
portrait-oriented pages. Many users, however, need to scan documents of
varying sizes, orientation, and complexity. This section lists some
general tips to keep in mind when scanning the following commonly used
documents.

Legal Documents

Keep these general tips in mind when scanning legal documents:

Select Legal size in the Scanner Options section of the Settings Panel
if the document is printed on legal size (14 inches in length) paper.

Many legal documents consist of page-wide text. If this is the case,
select Single Column or Table as the zoning method in the Settings
Panel.

Pleading Papers

Keep these general tips in mind when scanning pleading papers:

Generally you should select the None zoning method in the Settings
Panel.

If you would like to reproduce page layout without much editing or
reformatting, select the True Page Retain All Page Formatting
(Professional version only) OCR option in the Settings Panel. Users who
have the regular version of OmniPage should select Retain Font and
Paragraph Formatting Only.

You may want to draw manual zones in some circumstances. If numbers on
pleading papers are too close to the text, OmniPage will consider the
numbers to be part of the text body and place them on a text line. Try
drawing a zone around just the body text to omit the line and numbers
from the recognition process.

This is a good option if you are going to add text that will change the
line numbers. Use your word-processing application to number each line.

If you want a carriage return inserted at the end of each line, try
saving the scanned document as a standard ASCII file and open it in
your word-processing program. You would choose the Text Only method of
conversion in some programs. Consult your word-processing manual for a
more detailed description of importing documents not created by that
program.

You may have to experiment to find the best process for scanning and
saving each document.

Spreadsheets

Keep these general tips in mind when scanning spreadsheets (these tips
also work for charts, tables and memos with pagewide text and tabs):

Select Landscape as the orientation in the Scanner Options section of
the Settings Panel if the document is presented in landscape view.

Select Single Column or Table as the zoning method in the Settings
Panel to preserve the spreadsheet format. When OmniPage detects five or
more spaces, the Single Column or Table option converts the spaces to a
tab.

Draw your own zone around a table of numbers and identify its contents
as a Numeric zone to improve recognition results. You can create new
zone contents files for special characters that your spreadsheet may
contain. See Standardized Forms on page 2-42.
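
The space-to-tab rule described above is easy to reproduce if you need to post-process plain text yourself. The Python sketch below is a minimal illustration of that rule, not OmniPage's own code; the function name spaces_to_tabs and the sample rows are made up for the example.

```python
import re


def spaces_to_tabs(line, threshold=5):
    """Replace each run of `threshold` or more spaces with a single tab."""
    return re.sub(r" {%d,}" % threshold, "\t", line)


# Two wide gaps (six and five spaces) separate the three columns.
row = "Widgets" + " " * 6 + "1,200" + " " * 5 + "YTD"
converted = spaces_to_tabs(row)

# A gap shorter than five spaces is left alone.
short_gap = spaces_to_tabs("Net  income")
```

Splitting the converted row on the tab character gives back the three columns, which is what a spreadsheet or word-processing program expects when it imports tabbed text.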

Documents with Specialized Characters (Professional version only)

OmniPage automatically recognizes characters commonly found in most
documents. Other documents, such as mathematical papers, will contain
characters and symbols OmniPage has not yet learned to recognize. You
can train OmniPage Professional to recognize these characters.

Creating an OCR Training File

This tutorial shows you how to teach OmniPage Professional to recognize
characters not normally found in text by using the Train OCR Sample.

1 Set these options in the toolbar:

Scan Image

Auto Zones

Train OCR

2 Open the Settings Panel and click Use Defaults to return OmniPage to
its default settings.

3 Scan a document of your choice that contains symbols or other
specialized characters.

After recognition, the Train Characters window opens to display images
of recognized characters. Those OmniPage had trouble identifying are
displayed in the grid boxes at the top of the dialog box.

Beneath each image, in smaller type, is OmniPage's attempted
identification of that character. A tilde means that OmniPage couldn't
identify the character.

Characters OmniPage believes it identified correctly in the document
are listed alphabetically below the suspect characters. Check for
common errors, such as a zero being recognized as the letter O.

Occasionally, you will see common characters, such as c or e.
Generally, you will not want to train OmniPage to recognize these
letters unless they are in a very specialized font. The Language
Analyst corrects common OCR errors more efficiently.

4 Double-click a character, or select it and click Specify.

In this example, OmniPage must be taught to recognize the copyright
symbol (©).

The Specify Character dialog box opens with a close-up of the symbol as
it appeared in the scanned document.

The dialog box includes a list box of Extended ANSI characters and a
Character edit box.

5 If the symbol you seek appears in the list, click it.

It appears in the Character edit box.

If the symbol or character does not appear in the list, you must
type it in the Character edit box instead.

In the example below, the symbol for pi is not in the list, so the user
has chosen to type in the numbers 3.14159 to replace the symbol.

[Figure: the Specify Character dialog box, with the Extended ANSI
characters list showing codes 160 through 169.]

6 Click OK.

The specified character now appears under the suspect character in the
Train Characters dialog box.

The symbol has turned gray to indicate that you have specified a
character for it.

7 Click the Save button.

The Enter save file name dialog box opens.

8 Type a file name in the Filename edit box.

9 Click OK.

A dialog box asks if you want to recognize the image with the training
file you just created. At this point, you can continue recognition or
stop the exercise. The new file becomes the default in the OCR section
of the Settings Panel if you click Yes.

Editing an OCR Training File

You can edit a training file as needed when you scan a document with
previously unrecognized characters. Any training file can be appended
to another training file.

1 Choose Edit Training File... in the Settings menu.

The Select File dialog box opens.

2 Select a training file.

The Train Characters dialog box opens.

3 Use the buttons to add, delete or modify character identifications as
needed.

If you had created another training file previously, you could click
the Append button to add these characters to it.

4 Click Save to save your changes and close the Train Characters dialog
box.

If you have made no changes, click Cancel to close the dialog box.
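
One way to think about a training file is as a lookup table from character images to the characters you specified for them. The Python sketch below models appending one training file to another under that assumption. The glyph identifiers, the append_training function, and the rule that existing entries win a conflict are all hypothetical; the manual does not say how OmniPage resolves duplicates.

```python
def append_training(target, source):
    """Return target plus every entry of source it does not already contain.

    Keys stand in for character images; values are the characters the user
    specified for them. Existing entries in target are kept unchanged.
    """
    merged = dict(target)
    for image_id, character in source.items():
        merged.setdefault(image_id, character)
    return merged


# The pi symbol was specified as the typed digits 3.14159.
math_paper = {"glyph-017": "3.14159"}

# Another file specifies the copyright symbol and a different entry for pi.
legal_notice = {"glyph-042": "\u00a9", "glyph-017": "pi"}

combined = append_training(math_paper, legal_notice)
```

The combined table keeps the original entry for the pi image and gains the copyright symbol, and neither input table is modified.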

Foreign-Language and Multilingual Documents

For optimal recognition of documents in any language, you should
select:

the appropriate language in the Select Language(s) dialog box (choose
Select Languages... in the Settings menu).

English
German
French
Italian
Dutch
Spanish
Swedish
Portuguese
Danish
Norwegian

the appropriate main dictionary for the language of the text you are
recognizing in the Spelling section of the Settings Panel.

the Language Analyst in the OCR options of the Settings Panel.

Foreign-Language Documents

When you want to recognize a foreign-language document, double-check
that your settings are correct as described above. During recognition
of any document, the Language Analyst consults the main dictionary and
the user dictionary. This is why it is important that the currently
selected dictionary matches the language you are trying to recognize.

Speed recognition by deselecting the option Use Language Analyst in
the Spelling section of the Settings Panel if the right dictionary is
not available. The Language Analyst will try to match words to the
chosen dictionary when selected, then turn itself off anyway if it
perceives that dictionary entries are not improving recognition
results.

(If you recognize many documents in a language other than that of your
default main dictionary, you should order a dictionary for that
language from your Caere distributor or by calling Caere at
800-535-SCAN.)
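
To picture why a mismatched dictionary makes the Language Analyst give up, consider a simple hit-rate check. The Python sketch below is only a guess at the general idea, not Caere's algorithm: the analyst_stays_on function, the 0.5 threshold, and the tiny word lists are all invented for illustration.

```python
def analyst_stays_on(words, dictionary, minimum_hit_rate=0.5):
    """Return True if enough recognized words are found in the dictionary."""
    if not words:
        return True
    hits = sum(1 for word in words if word.lower() in dictionary)
    return hits / len(words) >= minimum_hit_rate


# A tiny stand-in for a French main dictionary.
french = {"le", "document", "est", "en", "français"}

french_text = ["Le", "document", "est", "en", "français"]
portuguese_text = ["o", "documento", "está", "em", "português"]

french_result = analyst_stays_on(french_text, french)
portuguese_result = analyst_stays_on(portuguese_text, french)
```

Every French word is found, so the check passes. None of the Portuguese words is found in the French list, so a check like this would switch the analysis off for that text.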

Multilingual Documents

You may want to recognize documents written in more than one language.
It is important to select both the proper language set and main
dictionary. You can zone multilingual documents automatically, but
manual zoning may return better results.

Suppose you have a document written largely in French, with a few
sections in Portuguese. If you use auto zoning, select both French and
Portuguese in the Select Language(s) dialog box.
Select French as the main dictionary selection.

The Language Analyst will assist in recognizing the French portions of
the document but shut itself off when it finds a text block in
Portuguese. Use the Check Recognition feature to correct recognition
mistakes. If recognition is poor, try turning off the Language Analyst
and recognizing the document again.

If you use manual zoning, the process is more time-consuming but the
results will be more accurate. Draw recognition zones around just the
French portions of the text. Leave the Language Analyst on.

After recognition of the French portions is complete, save the document
as a word-processing file.

Repeat the process for the Portuguese portion of the text, making the
appropriate dictionary and language selections. This replaces the
French text recognized first. (If you don't want to replace the first
recognized text, save the document as a Caere Document before
recognizing the second language.)

When recognition is done, save the second document as a word-processing
file. Use your word-processing program to open both documents and cut
and paste as needed.
Tutorial 3--Streamlining the OCR Workflow

Streamlining the OCR Workflow

OmniPage provides a number of time-saving features to help you
streamline your OCR workflow. This chapter shows you how to use some of
them. After completing the exercises in this chapter, you will know how
to:

Save and reload a settings file.

Determine the most efficient way to process a large group of documents.

Open multiple image files.

Use different options to export pages and graphics as individual image
files.

Use the Defer OCR option (Professional version only).

There are no text samples for the exercises in this chapter.

Saving a Settings File for Specific Documents

OmniPage lets you save Settings Panel selections as a settings file.
You can open and use this file when needed for similar document types
to save yourself time. Disk space is the only limit to the number of
settings files you can create.

Suppose you regularly receive double-sided customer response forms
printed in landscape mode with small (8-point Helvetica) type. This
type of form requires that you select specific options in the Settings
Panel. Rather than set them each time you scan the incoming forms, save
the settings as a file and open it as needed.

In this exercise you will set, save, and load settings for particular
documents.

1 Open the Settings Panel.

2 Set the following options:

Scanner: Landscape Orientation, Double-sided Pages, and Auto
Brightness with AnyPage.

Zones: Single Column or Table.

OCR: Ignore Fonts and all Formatting.

OmniPage will match the original fonts if you use the Retain Font
Formats option, but in this case we want the type to appear in a larger
size and a different font than the original.

Fonts: in the Ignored Font Formats group box, select Times New Roman in
the drop-down Font list, and type 12 in the Font Size edit box.

3 If necessary, choose Select Languages... in the Settings menu and
select a Language for your saved settings file.

4 Choose Save Settings... in the File menu.

The Save Settings dialog box opens with the Caere Settings file format
as the default.

5 Select a location for the file and type in a file name.

6 Click OK.

7 Open the Settings Panel and click Use Defaults to return OmniPage to
its default settings.

(In the normal course of your work, you would go on to scan documents
with your settings and later change the settings as you worked with
other documents.)

8 Choose Load Settings... in the File menu.

The Load Settings dialog box opens.

9 Select your file and click OK.

10 Browse through the settings you changed in the Settings Panel to
verify that they were restored.
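
The save, reset, and load cycle in this exercise works like any saved configuration. The Python sketch below mirrors it with a plain dictionary written to a file. It is an illustration only: the key names, the placeholder default values, the JSON format, and the file name forms.json are assumptions and do not describe the Caere Settings file format.

```python
import json
import os
import tempfile

# Placeholder defaults for illustration; OmniPage's real defaults may differ.
DEFAULTS = {
    "scanner": {"orientation": "Portrait", "double_sided": False},
    "zones": "Multiple Columns",
    "ocr": "Retain Font and Paragraph Formatting",
    "fonts": {"name": None, "size": None},
}

# The selections made for the customer response forms in this exercise.
response_forms = {
    "scanner": {"orientation": "Landscape", "double_sided": True},
    "zones": "Single Column or Table",
    "ocr": "Ignore Fonts and all Formatting",
    "fonts": {"name": "Times New Roman", "size": 12},
}


def save_settings(settings, path):
    """Write the settings to a file."""
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(settings, handle, indent=2)


def load_settings(path):
    """Read the settings back from a file."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "forms.json")  # hypothetical file name
    save_settings(response_forms, path)         # like Save Settings...
    current = dict(DEFAULTS)                    # like Use Defaults
    current = load_settings(path)               # like Load Settings...
```

After loading, the current settings match the saved ones again, which is what you verify by browsing the Settings Panel at the end of the exercise.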

Scanning Large Jobs

If you have an automatic document feeder (ADF), you can use the
OmniPage AUTO button to scan a large stack of documents, recognize them
as a group, and save the results later as a single file or as several
smaller files. For example, you may want to fill your ADF and click the
OmniPage AUTO button before you leave the office for the day. OmniPage
can scan, zone, and recognize all the documents and have them ready for
you to save the next morning.

To automatically scan a batch of documents unattended, you must select
Auto Zones or, with the Professional version, either Auto Zones or a
zone template. If Manual Zones is selected, OmniPage stops after each
page image so you can select zones.

OmniPage Professional users have the option of deferring OCR to a later
time. See Deferring Recognition (Professional version only) on page
2-66.

All OmniPage users have the option of saving scanned files in
word-processing or Caere Document file format. You can reopen Caere
Document files and make changes after the process is finished. To
protect your processing time investment, save the scanned documents in
Caere Document file format before you begin checking text recognition or
editing in the text window.

Preparing Documents for the ADF

Decide how you will save the scanned documents before you fill the ADF.
Suppose you wanted to scan 25 pages. How you
plan to save the pages affects how you will group them in the document
feeder. You have these options for saving scanned pages in a
word-processing format:

Create one file for all pages

The pages would be saved as one 25-page file.

Create one file per page

The pages would be saved as 25 one-page files.

Create new file at each blank page

You would insert blank pages as separators into a stack of one-sided
documents. All pages following a blank page would be saved as a
separate file with a unique document name.

Automatic file naming is discussed in the section Saving the File(s) on
page 2-62.
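
The three save options differ only in how the scanned pages are grouped into files. The Python sketch below shows that grouping. It is a simplified model with hypothetical names: an empty string stands for a blank separator page, and the sketch assumes the separator itself is not saved.

```python
def group_pages(pages, mode):
    """Group recognized pages into output files.

    pages is a list of page texts; an empty string stands for a blank
    separator page. mode is "all", "per_page", or "blank_page".
    """
    if mode == "all":
        return [list(pages)]
    if mode == "per_page":
        return [[page] for page in pages]
    if mode == "blank_page":
        files, current = [], []
        for page in pages:
            if page.strip() == "":
                # A blank page closes the current file and starts a new one.
                if current:
                    files.append(current)
                current = []
            else:
                current.append(page)
        if current:
            files.append(current)
        return files
    raise ValueError("unknown mode: " + mode)


stack = ["letter p1", "letter p2", "", "invoice p1", "", "memo p1", "memo p2"]
by_separator = group_pages(stack, "blank_page")
one_file = group_pages(stack, "all")
```

With two blank separators in the stack, the blank-page option produces three files: a two-page letter, a one-page invoice, and a two-page memo.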

Viewing Pages in the Zone and Text Windows

View your scanned document or loaded image in the zone or text window.
Move through the pages by clicking the arrows in the bottom left of the
OmniPage window. Click the right arrow to move to the next higher page
and click the left arrow to move to the next lower page.

Alternatively, choose Go To Page... in the Edit menu or use the Ctrl+G
keyboard shortcut to open the Go To Page dialog box.

Adding, Replacing, and Deleting Pages

Scanned pages or loaded images can be appended to any open Caere
Document or to the current loaded or scanned image in the zone window.
Pages also can be replaced and deleted.

If a one-page image file or Caere Document is open, any new image
loaded or new page scanned becomes page two of that
document.

When you click the Image button while viewing any page of a multi-page
document except for the last page, the Scan Image dialog box opens. (If
the Load Image option is set, the Load Image dialog box opens. It is
the same as the Scan Image dialog box.) Choose whether to replace the
current page with the new page(s), or whether to insert the new page(s)
before the current page or at the end of the document.
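
The three choices in the dialog box correspond to three simple list operations. The Python sketch below models a document as a list of pages. The function name add_pages and the choice labels are hypothetical and are used only to make the difference between the options concrete.

```python
def add_pages(document, new_pages, current_index, choice):
    """Return a new page list after scanning or loading new_pages.

    choice is "replace", "insert_before", or "append".
    current_index is the zero-based position of the page being viewed.
    """
    if choice == "replace":
        return document[:current_index] + new_pages + document[current_index + 1:]
    if choice == "insert_before":
        return document[:current_index] + new_pages + document[current_index:]
    if choice == "append":
        return document + new_pages
    raise ValueError("unknown choice: " + choice)


document = ["p1", "p2", "p3"]
new_pages = ["n1", "n2"]

# The user is viewing page two (index 1) when the new pages arrive.
replaced = add_pages(document, new_pages, 1, "replace")
inserted = add_pages(document, new_pages, 1, "insert_before")
appended = add_pages(document, new_pages, 1, "append")
```

Only the replace choice removes an existing page; the other two keep every original page and differ in where the new pages land.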

If you open a Caere Document while another Caere Document or an
image file is open, the currently open document will close first. A
dialog box gives you the option of saving the document before it
closes, if you have made changes to it.

To delete a page in a currently open document, move to the page you
want to delete and choose Delete Current Page in the Edit menu or use
the Ctrl+D keyboard shortcut.

Saving the File(s)

When recognition and any text editing you want to do are complete,
click the Save As... button in the toolbar. Choose either a
word-processing or a Caere Document file format.

If you choose a word-processing format, you have three options for
saving the scanned pages: as a single file, as one file per page, or as
one file for every blank separator that OmniPage locates. See Preparing
Documents for the ADF on page 2-61 for more information on these
choices.

 Create one file per page or Create new file at each blank page: enter
a name with five characters or less into the File Name edit box.
OmniPage adds three numbers to each file name to make it unique. For
example, if you typed file into the File Name box, the first page is
saved as file001, the second page as file002, and so on.

 Create one file for all pages: enter a standard eight-character file
name.
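
The automatic naming rule can be written as a short function. The Python sketch below follows the rule as stated above (a name of five characters or fewer plus a three-digit number); the function name numbered_names is hypothetical, and file extensions are left out.

```python
def numbered_names(base, count):
    """Build unique names from a base of five characters or fewer."""
    if not 1 <= len(base) <= 5:
        raise ValueError("base name must have one to five characters")
    return ["%s%03d" % (base, number) for number in range(1, count + 1)]


names = numbered_names("file", 3)
```

Because the base is limited to five characters and the number always has three digits, every generated name fits within eight characters.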

If you chose the Multiple Columns and Retain Graphics options in the
Settings Panel, the graphics will be saved with the text only if the
format in which the file is saved supports embedded graphics. ASCII
text, for example, does not support embedded graphics.

You can export graphics to a separate file independent of the text as
described in Exporting Images on page 2-64.

Opening Multiple Image Files

You can load any number of image files--such as a batch of faxes
received on a fax modem--for group recognition. You can load TIFF, PCX,
DCX, or BFX images.

To do this:

1 Set the OmniPage toolbar appropriately for the images you are
recognizing. For example:

 Load Image

 Auto Zones

 Perform OCR

2 Click the AUTO button.

The Load Image dialog box opens.

3 Select the type of image you want to recognize and click Add to add
it to the Selected Files list.

4 Click OK when you have added all the files you want.

Each file is opened and processed in order of appearance in the list.

When you load images with the AUTO button, the image files are added to
any currently open document as described in Adding, Replacing, and
Deleting Pages on page 2-61.

Choose Open Document... in the File menu to open Caere Document files.
When you use the Open Document... command to open a Caere Document
image file, OmniPage closes any open image file or Caere Document
first. A dialog box gives you the option of saving the document before
it closes.

Exporting Images

Any scanned page or pages can be exported as an image file. You can
export the image files in TIFF, BMP, or PCX format. OmniPage can export
an entire page as one image file, or it can find the individual graphic
zones on each page and export them as separate files.

Choose Export Image... in the File menu when you want to export either
an image or its graphic zones to a file. The Export Image dialog box
opens.

There are two choices under Save Options:

Save Current Page Only

Save All Pages

There are two choices under Image Options:

Save Each Graphic Zone to a File

Save Entire Page to a File

Choose one option in each section. How you match these two options
affects the length of the file name you can assign the image and how
OmniPage appends an extension.

Save Current Page Only and Save Entire Page to a File: the name you
choose can have eight characters. This creates one one-page image file.

Save All Pages and Save Entire Page to a File: the name you choose can
have five characters. 00n is appended, where n represents the page
number (001, 002, etc.). This creates multiple one-page image files.

Save Current Page Only and Save Each Graphic Zone to a File: the name
you choose can have seven characters. OmniPage appends a letter to
indicate the order of the graphic on the page.

A is the first graphic, B is second and so on. This creates one file
for each graphic on the current page. Up to 26 files can be created with
this method.

Save All Pages and Save Each Graphic Zone to a File: OmniPage appends
both a number and a letter to the name you choose.

The number (00n) indicates the page number and the letter indicates the
order of the graphic on the page. Thus the second graphic on the second
page would have a name ending in 002B. This creates one file for each
graphic on every page.
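
The four combinations of Save Options and Image Options can be summarized in one naming function. The Python sketch below is a model, not OmniPage's code, and the function name export_names is hypothetical. It assumes every limit follows from an eight-character file name: three characters are reserved for the page number and one for the graphic letter. That assumption reproduces the limits of eight, five, and seven characters stated above; the limit of four characters it implies for the last combination is an inference, not a figure taken from the manual.

```python
import string

MAX_NAME = 8  # eight-character limit on the file name, as described in the text


def export_names(base, graphics_per_page, all_pages, each_graphic, current_page=1):
    """List the file names (without extension) for one Export Image choice.

    graphics_per_page[i] is the number of graphic zones on page i + 1.
    """
    pages = range(1, len(graphics_per_page) + 1) if all_pages else [current_page]
    suffix_length = (3 if all_pages else 0) + (1 if each_graphic else 0)
    if len(base) > MAX_NAME - suffix_length:
        raise ValueError("base name too long for this combination")
    names = []
    for page in pages:
        page_part = "%03d" % page if all_pages else ""
        if each_graphic:
            count = graphics_per_page[page - 1]
            if count > 26:
                raise ValueError("at most 26 graphics per page (A to Z)")
            for letter in string.ascii_uppercase[:count]:
                names.append(base + page_part + letter)
        else:
            names.append(base + page_part)
    return names


# Two pages: one graphic on page one, two graphics on page two.
every_graphic = export_names("logo", [1, 2], all_pages=True, each_graphic=True)
```

In this model the second graphic on the second page is named logo002B, which agrees with the 002B example in the text.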

Deferring Recognition (Professional version only)

The typical OCR flow is to scan, zone, and OCR a page in the stack and
then repeat the process with the next page until every page in the
stack is done.

Compared to the time it takes to scan and zone a page, however, the
recognition process can be time-consuming. You might find it more
convenient to scan and zone all your pages at once but defer
recognition to a later time when it can take place unattended by you.

OmniPage Professional gives you the option of deferring the recognition
process. This means that you can scan and zone a number of documents or
open and zone a number of images and put off recognition until later.
You can even schedule OCR to commence at a specific time. This chapter
gives you general guidelines on deferring recognition when scanning a
stack of documents or loading a group of image files.

Set the OCR button in the toolbar to Defer OCR.

Set the other toolbar and Settings Panel options according to the
requirements of the documents or files you plan to scan.

Load the Automatic Document Feeder (ADF) with the documents to be
scanned or set the Image button to Load Image.

Click the AUTO button.

If you are scanning documents, the first page in the stack is scanned
and zoned, then the next page, etc.

If you are loading image files, the first image in the list is opened
and zoned, then the next image, and so on.

If you are using the Auto Zones feature, each page is zoned
automatically. If you are using the Manual Zones feature, the AUTO
processing stops each time a page is ready for zoning.

Once the page images have been zoned, you have two choices: finish
recognizing the current document or save the file and perform
recognition later.

Finish Current Document

If you want to finish the current open document:

Choose Finish Current Document... in the Process menu.

The Finish Current Document dialog box lets you choose to save the
document to a specific file format.

Select Convert Automatically to save the document immediately after
recognition. If you do not select Convert Automatically the file will
be saved as a Caere Document. You also have the option of deleting the
Caere Document after recognition.

Click Save Output To... to choose a file format and location for the
saved file.

The Save As dialog box opens. Choose a file type and a destination for
your file. Click OK to return to the Finish Current Document dialog
box.

Save a Document for Later Recognition

If you want to save the document for later processing:

1 Choose Save in the File menu after the page has been zoned.

The Save As dialog box opens.

2 Select the Caere Document file format (*.met) as the file type and
type a name into the File Name edit box.

3 Click OK.

4 When you want to finish the document, choose Finish Deferred
Documents... in the Process menu.

The Finish Deferred Documents dialog box offers options for recognizing
and saving the deferred files:

Click Add Files... to add Caere Documents to the list.

The Open dialog box opens. Double-click the files you want to open, and
click OK to return to the Finish Deferred Documents dialog box.

Select Convert Automatically to save the document immediately after
recognition. If you do not select Convert Automatically the file will
be saved as a Caere Document. You also have the option of deleting the
Caere Document after recognition.

Click Save Output To... to choose a file format and location for the
saved file.

The Save As dialog box opens. Choose a file type and a destination for
your file. Click OK to return to the Finish Deferred Documents dialog
box.

Choose a time to start OCR in the When to Recognize section.

5 Click OK.

At the set time, OmniPage opens each document in the order in which it
was added to the list. It performs recognition, saves, and closes each
document when recognition is complete.
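
The deferred workflow described above behaves like a simple queue with a
start time. The following Python sketch is only an illustration of that
behavior. The class and method names (DeferredQueue, add_file, run) and
the file names are hypothetical; they are not part of OmniPage.

```python
from datetime import datetime
from typing import List


class DeferredQueue:
    """Hypothetical model of the Finish Deferred Documents list."""

    def __init__(self, start_time: datetime) -> None:
        self.start_time = start_time   # the "When to Recognize" setting
        self.files: List[str] = []     # Caere Documents, in the order added

    def add_file(self, path: str) -> None:
        # Documents with deferred pages are saved as Caere Documents (*.met).
        if not path.lower().endswith(".met"):
            raise ValueError("deferred documents must be Caere Documents (*.met)")
        self.files.append(path)

    def run(self, now: datetime) -> List[str]:
        """Return the documents processed at `now`, in processing order."""
        if now < self.start_time:
            return []                  # not yet time to start OCR
        processed = list(self.files)   # same order in which files were added
        self.files.clear()
        return processed


queue = DeferredQueue(start_time=datetime(1994, 6, 1, 22, 0))
queue.add_file("invoices.met")
queue.add_file("letters.met")
print(queue.run(datetime(1994, 6, 1, 18, 0)))  # before the set time: []
print(queue.run(datetime(1994, 6, 1, 22, 0)))  # ['invoices.met', 'letters.met']
```

The sketch keeps only the two properties stated above: nothing happens
before the set time, and documents are handled in the order in which
they were added to the list.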
Chapter 3

Commands and Settings

This chapter explains how to use all of OmniPage's commands and
settings, which are located within seven menus and a convenient toolbar.

The OmniPage menus include the:

File menu
Edit menu
Format menu
Process menu
Settings menu
Window menu
Help menu

The toolbar provides shortcut command and processing buttons to perform
OmniPage operations.

The information in this chapter is organized hierarchically to describe
each toolbar button and menu command. For example, the description for
Save Options is listed at the end of the following series of
descriptions:

File menu description: Save As... description:
Save Options description
Some features are only available with the OmniPage Professional
version. These are noted in this chapter as "Professional version
only."

For practical ways to use OmniPage with step-by-step instructions, see
Chapter 2, Tutorials.
Use the toolbar to access the fundamental steps of the OCR process:

1 Getting the page image that you want to recognize.

2 Choosing what will be recognized in the image by creating zones.

3 Recognizing the image or, if you're an OmniPage Professional user,
performing other OCR options before recognition.

You can choose automatic processing so that OmniPage automatically
performs all these steps according to the commands that you select. Or,
you can work interactively with OmniPage each step of the way.

In addition to the OCR processing steps, the toolbar also provides
shortcuts for performing other important OmniPage commands.

[Figure: the toolbar, showing the processing buttons and the shortcut
command buttons]

See Touring the Settings Panel on page 2-15 for more information.

Shortcut Command Buttons

The toolbar's shortcut command buttons are for your convenience.

Use the Settings Panel button to open the Settings Panel.

Use the Save button to save the current document.
Use the Save As... button to save the current document with a different
name or in another file format.

Use the Print button to print recognized text in the current document.

Use the Help button to get help on OmniPage.

Use the Image Assistant button to launch the Image Assistant 24-bit
color and image-editing program (Professional version only).

Use the Cut button to cut text in a recognized document.

Use the Copy button to copy text in a recognized document.

Use the Paste button to paste text in a recognized document.

Use the Clear All Zones button to delete the currently drawn zones in
the zone window. Only the zone borders are deleted; the image itself
remains the same.

Use the Find/Replace button to find and replace words in a recognized
document.

Use the Check Recognition button to check for errors in a recognized
document.

The shortcut command buttons perform the same functions as the
corresponding commands in the File, Edit, Settings, and Help menus. For
more information about these commands, see their respective menu
entries further in this chapter.
The Toolbar

Processing Buttons

The toolbar's processing buttons perform the same operations as the
Process Settings commands in the Process menu.

[Figure: the processing buttons, labeled AUTO button, Image button,
Zone button, and OCR button]

You can use the:

 AUTO button to automatically process your document from start to
finish according to the currently selected processing commands.

 Image button to get an image for recognition by scanning a page or
loading an existing image.

 Zone button to specify what will be recognized in an image by
creating zones manually, automatically, or with a template
(Professional version only).

 OCR button to perform OCR, defer OCR (Professional version only), or
train OCR (Professional version only).

The status bar at the bottom of the screen reports the currently
activated operation and then the operation that you can select next.

AUTO Button

The AUTO button, located on the far left side of the toolbar, performs
the same operations as the Auto command in the Process menu.

Click AUTO to start and finish processing each page of a new document
automatically or to finish processing the current page of an open
document. This process is determined by the commands selected in the
Image, Zone, and OCR button dropdown list boxes.

For example, if you want OmniPage to automatically scan and process a
multi-page document, you can select Scan Image, Auto Zones, and Perform
OCR in the processing button dropdown list boxes. When you click AUTO,
the first page in the scanner will be scanned, automatically zoned, and
recognized. The same process is automatically repeated for the next
page. This continues until all of your pages are processed.

When a document is already open, you can click AUTO to finish
processing the current page. The resulting operation depends on the
state of the page and the selected Image, Zone, and OCR commands. For
example, if your page image already has zones, then OmniPage
immediately begins recognition processing according to the selected OCR
command.

The AUTO button changes to STOP as automatic processing begins. Click
STOP at any time if you want to discontinue processing.
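
The way AUTO picks up from the current state of a page can be sketched
in a few lines of Python. This is a simplified model written for
illustration, not OmniPage's actual logic: the function name
remaining_steps and its parameters are hypothetical, and the model
ignores pages that are already recognized.

```python
from typing import List


def remaining_steps(
    has_image: bool,
    has_zones: bool,
    image_command: str = "Scan Image",
    zone_command: str = "Auto Zones",
    ocr_command: str = "Perform OCR",
) -> List[str]:
    """List the operations AUTO still has to run for one page."""
    steps: List[str] = []
    if not has_image:
        steps.append(image_command)   # get the page image first
    if not has_zones:
        steps.append(zone_command)    # then create zones
    steps.append(ocr_command)         # finally perform (or defer) OCR
    return steps


# A new page: all three selected commands run in order.
print(remaining_steps(has_image=False, has_zones=False))
# A page that already has zones: recognition begins immediately.
print(remaining_steps(has_image=True, has_zones=True))
```

The three command arguments stand for the selections in the Image, Zone,
and OCR button drop-down list boxes.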
Image Button

Use the Image button to get an image for recognition by scanning a page
or loading an existing image. This button performs the same operations
as the Scan Image/Load Image Process Settings commands in the Process
menu.

[Figure: the Image button drop-down list box, showing Scan Image and
Load Image]

Select Scan Image or Load Image from the Image button dropdown list
box. Click the Image button to initiate the selected operation.

The selected Image command is also used when OmniPage performs
automatic processing.

Scan Image

Choose this to scan a page in your scanner. Before scanning, make sure
the appropriate Scanner options are selected in the Settings Panel.

You can use your right mouse button to click the Image button and
automatically open the Settings Panel to Scanner options.

While a page is being scanned, a progress meter appears and the status
bar reports the progress. The page image appears in the zone window
when scanning is complete.

Click the STOP button in the toolbar to cancel scanning at any time.
Load Image

Choose this to load a previously saved image file as a new document or
to add an image file to your open document.


An image file is a "picture" of text and/or graphics that is saved in
an image file format such as TIFF or PCX. When you load an image file
in OmniPage, it appears in the zone window. See the next section for a
list of supported input file formats.

Supported Input File Formats

OmniPage can open files with the following file formats.

Caere Format (*.met)

You can open Caere Document files (*.met) created in the 5.0 or later
version of OmniPage.

Image File Formats

PCX
TIFF Uncompressed
TIFF Compressed (Types II, III, IV, and PackBits)

TIFF files must be line art and 200, 300, 400, or 600 dpi; 300 dpi is
recommended.
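
If you prepare TIFF files in another program, the requirements above
are easy to check before loading. The following Python sketch is a
hypothetical helper, not an OmniPage feature; the function name
check_tiff and its return strings are assumptions made for this example.

```python
SUPPORTED_TIFF_DPI = (200, 300, 400, 600)
RECOMMENDED_TIFF_DPI = 300


def check_tiff(dpi: int, line_art: bool) -> str:
    """Classify a TIFF file against the requirements stated above."""
    if not line_art:
        return "unsupported: TIFF files must be line art"
    if dpi not in SUPPORTED_TIFF_DPI:
        return "unsupported: resolution must be 200, 300, 400, or 600 dpi"
    if dpi == RECOMMENDED_TIFF_DPI:
        return "supported (recommended resolution)"
    return "supported"


print(check_tiff(300, line_art=True))   # supported (recommended resolution)
print(check_tiff(150, line_art=True))   # unsupported: resolution ...
```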
Fax File Formats

OmniPage supports fax files saved in the .PCX format. Many fax boards
can receive or convert the .PCX format; please consult your fax
documentation for more information.

To load an image file:

In the Load Image dialog box, specify the path and directory where your
image files reside.

Select the type of file you wish to load from the List Files of Type
drop-down list box.

Files of that type in the specified directory appear in the File Name
list box.

Click the file you want to load and then click OK.

The file opens in the zone window. For a multi-page image file, you
must click the Image button to load each consecutive page in the file.

Click Cancel to exit without loading an image file.

You can load multiple image files when you have Load Image selected for
automatic processing. For example, you may have a number of TIFF files
that you want to process automatically. These files are loaded and
processed in the order that they are selected and combined into one
working document.

To load one or more image files for automatic processing:

1 Select Load Image and the desired zone and OCR commands and then
click the AUTO button to begin processing.

The Load Image dialog box appears.

2 Specify the path and directory where your image files reside.

3 Select the type of files you wish to load from the List Files of Type
drop-down list box.

Files of that type in the specified directory appear in the File Name
list box.

4 For each file you want to load, click the file and then click Add.
Click Add All to select all the files in the directory. The files
appear in the Selected Files list box.

5 To add image files from other directories, repeat steps two through
four. You can select up to 255 files.

To remove a file from the list box, click it and then click Remove.
Click Remove All to remove all files from the list box.

6 When you have selected all the files you want to load, click OK.

The images will be loaded into the zone window and processed one at a
time, in the order that they were selected.

Click Cancel to exit without loading any files.
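
The Selected Files list box described above behaves like an ordered
list with a 255-file limit. The Python sketch below models it for
illustration only. The class SelectedFiles and its methods are
hypothetical names, and whether OmniPage accepts the same file twice is
not stated here, so the sketch simply allows it.

```python
from typing import Iterable, List

MAX_SELECTED_FILES = 255  # limit stated in the steps above


class SelectedFiles:
    """Hypothetical model of the Selected Files list box."""

    def __init__(self) -> None:
        self.files: List[str] = []    # processing order = selection order

    def add(self, path: str) -> None:
        if len(self.files) >= MAX_SELECTED_FILES:
            raise ValueError("you can select up to 255 files")
        self.files.append(path)

    def add_all(self, directory_listing: Iterable[str]) -> None:
        for path in directory_listing:
            self.add(path)

    def remove(self, path: str) -> None:
        self.files.remove(path)       # removes the first matching entry

    def remove_all(self) -> None:
        self.files.clear()


selection = SelectedFiles()
selection.add_all(["page1.tif", "page2.tif", "fax.pcx"])
selection.remove("page2.tif")
print(selection.files)  # ['page1.tif', 'fax.pcx']
```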

Zone Button
Use the Zone button to create zones that determine what will be
recognized in the page image. This button performs the same operations
as the Auto Zones/Manual Zones/Use Template Process Settings commands
in the Process menu.

Select Auto Zones, Manual Zones, or a zone template file (Professional
version only) from the drop-down list box. If you select Auto Zones or
a zone template file, click the Zone button to initiate the operation.

The selected Zone command is also used when OmniPage performs automatic
processing.

Auto Zones

Select this in the drop-down list to have OmniPage automatically draw
and order zones in the current page image and determine the appropriate
text flow for recognition.
To create Auto Zones, OmniPage uses the selected Zones option in the
Settings Panel: Multiple Columns, Single Column or Table, or None. For
more information about each of these options, see Zones Options on page
2-9.

 You can use your right mouse button to click the Zone button and
automatically open the Settings Panel to Zones options.

If a page already has zones, you are prompted to delete the current
zones before auto zoning occurs. Click Yes to proceed. The zone window
is then updated so that you can review the zones that are drawn.

See Tutorial 2--Document Types and OCR Settings on page 2-35 to learn
more about using zones.

Manual Zones

Choose this to draw and order your own zones in the current page image
using the tool palette in the zone window.

When you create zones manually, OmniPage uses the selected Zones option
in the Settings Panel (Multiple Columns, Single Column or Table, or
None) to determine the text flow within each zone that you draw. For
more information about each of these options, see Zones Options on page
2-9.

If a page already has zones, you are prompted to delete the current
zones; click Yes to proceed. The zone window is then updated so that
you can draw your own zones.

For more detailed information on creating manual zones, see Manual
Zones on page 2-46.

Zone Templates (Professional version only)

Choose a zone template file directly from the drop-down list box to
apply zones to the current page image based on that template. This is a
quick and efficient means of processing similar documents with the same
zoning requirements.

A zone template file is comprised of various zone attributes such as
position, order, and zone contents. If you frequently process documents
with the same layout, such as business forms, create and save a zone
template and apply it to all such documents.
If a page already has zones, you are prompted to delete the current
zones before applying a zone template. Click Yes to proceed. The zone
window is updated so that you can review the zones that are drawn.

You can create zones manually and save them as a template using the
Save Zone Template... command in the File menu. For more information on
creating zones manually, see Manual Zones on page 2-46.

OCR Button

Use the OCR button to perform the selected OCR command on the page
image. This button performs the same operations as the Perform OCR,
Defer OCR, and Train OCR Process Settings commands in the Process menu.

Select Perform OCR, Defer OCR (Professional version only), or Train OCR
(Professional version only) from the drop-down list box. If you select
Perform OCR or Train OCR, click the OCR button to initiate the
operation.

The selected OCR command is also used when OmniPage performs automatic
processing.

Perform OCR

Choose this to recognize text on the current page.

Before performing OCR, make sure the appropriate OCR options are
selected in the Settings Panel.

You can use your right mouse button to click the OCR button and
automatically open the Settings Panel to OCR options.
If there are no zones on the page when you select Perform OCR and click
the OCR button, OmniPage automatically creates zones according to the
selected Zone command. If Manual Zones is currently selected, OmniPage
ignores this and draws zones automatically.

Defer OCR (Professional version only)

Choose this to delay text recognition of one or more pages of your
document.

For example, you can use the AUTO button to scan pages, create zones,
and defer OCR of your document. Then, at your convenience, you can set
OmniPage to recognize your entire document by choosing Finish Current
Document or Finish Deferred Documents in the Process menu.

You can also recognize individual pages of an unrecognized document.
For example, you can open a document to a particular page, choose
Perform OCR, and click the OCR button; only that page will be
recognized.

To save a document with deferred pages, you must save it in Caere
Document format (*.met).

For more information, see Deferring Recognition (Professional version
only) on page 2-66.

Train OCR (Professional version only)

Choose this to create a character training file (*.trn) that assists
OmniPage during text recognition and allows better recognition of
special characters.

A character training file is a set of pre-recognized text characters
that OmniPage compares with the characters in the page image during
recognition. Before recognizing an image, you can create a new training
file or choose an existing one in the Settings Panel OCR options.

For more information on creating a training file, see Train OCR
(Professional version only) on page 2-52. For step-by-step instructions
on training OCR for special characters, see Documents with Specialized
Characters (Professional version only) on page 2-50.
[Figure: the File menu. Keyboard shortcuts shown: Open Document...
Ctrl+O, Close Document Ctrl+W, Save Ctrl+S, Print... Ctrl+P]

The File menu lets you manage OmniPage file operations. File menu
commands include:

Open Document...
Close Document
Mail... (MAPI mail systems only)
Save
Save As...
Export Image...
Revert to Saved
Get Accuracy Info...
Save Settings...
Load Settings...
Save Zone Template...
Print...
Publish to Envoy...
Exit
Open Document...

Choose Open Document... to open a Caere Document (*.met) or an image
file.

Caere Document (*.met)

OmniPage creates a Caere Document the first time you scan or open an
image. A Caere Document can have up to 255 pages. Each page can vary to
include the original image, zones, and recognized text.

You can continue to reopen a Caere Document in OmniPage, make edits,
and save it in any other supported file format you wish. Additionally,
if a Caere Document is saved with its original page images, you can
retain graphic images, verify recognized text with the page image,
defer recognition, and rerecognize pages at any time.
Image file

An image file is a "picture" of text and/or graphics that is saved in
an image file format such as TIFF or PCX. Image files do not have OCR
or zone information. When you open an image file in OmniPage, it
appears in the zone window.

To open a Caere Document or image file:

Locate your Caere Documents or image files in the Open Document dialog
box.

Select the type of file you wish to open from the List Files of Type
drop-down list box.

Files of that type appear in the File Name list box.

Double-click a file or select it and click OK.

The image file opens in the zone window. A Caere Document opens with
recognized text in the text window and its original image (if saved) in
the zone window. In either case, the first page of your file is
displayed.

Click Cancel to exit without opening a file.

You can only have one working document open at a time. If you attempt
to open another file, you are prompted to close your current document.
You can add page images to your document by using the Load Image or
Scan Image command in the Process menu or Image button drop-down list
box.
Close Document
Choose Close Document to stop working on a document but leave OmniPage
running.

If the current document has not been saved or has changed since the
last save, a prompt appears asking if you want to save the document
before closing. Click Cancel to go back to the
open document.

Mail...

Choose Mail... to access your mail system and send each page of
recognized text from your currently open document. This
command is only available for MAPI mail systems such as
Microsoft Mail.

Save

Choose Save to write the contents of your current working document to
disk. This command is also available as a button in the toolbar.

The Save As dialog box appears if you are saving the file for the first
time. After saving, you can continue working on your document.

Save As...

Choose Save As... to choose a file format and save a document to disk.
This command is also available as a button in the toolbar.

Use this command to save Caere Documents and recognized documents to
other file formats. To save a recognized
document in more than one file format, you can:

Save the file as a Caere Document (*.met).

By saving your document as a Caere Document, you can continue to reopen
it in OmniPage, make edits, and save it in other supported file
formats. A Caere Document can have up to 255 pages. Each page can
include the original image, zones, and recognized text.

Save the initially recognized document in each desired format using
Save As... while it is open in the text window. Remember, only a Caere
Document can be reopened (and resaved in a different format) in
OmniPage.


Save Options

When you save your document to a file format other than a Caere
Document you can select one of three Save Options.

Create one file for all pages

Select this to save all the pages in your document as one file. (Blank
pages are not saved.) Save the file with a standard file name of eight
characters or less.

Create one file per page

Select this to create a separate file for each page in your document
and automatically increment file names. (Blank pages are not saved.)
The assigned file names are comprised of up to five characters and
appended numbers starting with 001.

For example, if you use "form" as a file name, the first file is named
form001, the second file form002, and so on. The file extension added
depends on your choice of file formats. A Word for Windows file would be
named form001.doc.
Create new file at each blank page

Select this to create a new file after each blank page in your
document. (Blank pages are not saved.)

For example, if you want to scan several batches of pages at once,
insert blank pages to separate each batch. OmniPage will save the first
batch of pages as a file, detect a blank page, save the next batch of
pages as a file, detect a blank page, and so on. The assigned file
names are comprised of up to five characters and appended numbers
starting with 001.

For example, if you use "form" as a file name, the first file is named
form001, the second file form002, and so on. The file extension added
depends on your choice of file formats. A Word for Windows file would
be named form001.doc.
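
The numbering and blank-page rules above can be sketched in Python as
follows. This is an illustration under stated assumptions, not
OmniPage's implementation: the function names are hypothetical, a blank
page is represented by None, and two blank pages in a row are assumed
not to produce an empty file.

```python
from typing import List, Optional


def numbered_names(base: str, count: int, extension: str) -> List[str]:
    """Names for `count` output files: base name plus 001, 002, and so on."""
    if not 1 <= len(base) <= 5:
        raise ValueError("the base name can have up to five characters")
    return [f"{base}{n:03d}.{extension}" for n in range(1, count + 1)]


def split_at_blank_pages(pages: List[Optional[str]]) -> List[List[str]]:
    """Group pages into batches, starting a new batch after each blank page."""
    batches: List[List[str]] = []
    current: List[str] = []
    for page in pages:
        if page is None:              # blank page: close the current batch
            if current:
                batches.append(current)
                current = []
        else:
            current.append(page)      # blank pages themselves are not saved
    if current:
        batches.append(current)
    return batches


pages = ["p1", "p2", None, "p3", None, "p4", "p5"]
batches = split_at_blank_pages(pages)
print(batches)                                      # [['p1', 'p2'], ['p3'], ['p4', 'p5']]
print(numbered_names("form", len(batches), "doc"))  # ['form001.doc', 'form002.doc', 'form003.doc']
```

Five characters plus a three-digit number gives an eight-character file
name, which matches the standard file name length mentioned for the
Create one file for all pages option.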

To save a file:

1 Select the path and directory to save your file in the Save As dialog
box.

The default directory is called Data; OmniPage creates this during
installation.

2 Type a name for your file in the File Name edit box.

3 Select the appropriate file format from the Save Files as Type
drop-down list.

See Supported Output File Formats on page 2-4 for a list of supported
file formats and a description of ASCII and ANSI options.

4 For a recognized document that you are saving in another file format,
select the appropriate Save Option as described in the preceding
section.

5 Click OK.

OmniPage automatically adds the appropriate file extension to the file
name and the current working file returns to the screen.

Click Cancel at any time to exit without saving.
Export Image...

Choose Export Image... to save an image to disk in an image file format
such as TIFF or PCX.

An image file is a "picture" of text and/or graphics. For example,
scanning a page results in an image that you can save in an image file
format. Image files do not have OCR or zone information. When you open
an image file in OmniPage, it appears in the zone window.

Save Options

You can select one of two Save Options.

Select Save Current Page Only if you want OmniPage to save only the
current page image as a file.

Select Save All Pages if you want OmniPage to create a separate file
for each page in your document and automatically increment file names
starting with 001.

Image Options

You can select one of two Image Options.

Select Save Each Graphic Zone to a File if you want OmniPage to save
only the graphics within your page image. You must create zones in the
page image and perform OCR before you can choose this option.

Choose the Multiple Columns zoning option in the Settings Panel
Zones options to have OmniPage automatically separate graphics from
text. Or, draw manual zones and identify the graphics as graphic zones.

Select Save Entire Page to a File if you want OmniPage to save the
entire page image. You do not need to create zones or
perform OCR unless you have graphic zones.

Graphic File Name

How you match the Save and Image Options affects the length of the file
name you can assign the image and how OmniPage appends an extension.

Save Current Page Only and Save Entire Page to a File: the name you
choose can have eight characters. This creates one one-page image file.

Save All Pages and Save Entire Page to a File: the name you choose can
have five characters. 00n is appended, where n represents the page
number (001, 002, etc.). This creates multiple one-page image files.

Save Current Page Only and Save Each Graphic Zone to a File: the name
you choose can have seven characters. OmniPage appends a letter to
indicate the order of the graphic on the page.

A is the first graphic, B is second and so on. This creates one file
for each graphic on the current page. Up to 26 files can be created in
one directory with this method.

Save All Pages and Save Each Graphic Zone to a File: the name you
choose can have four characters. OmniPage appends both a number and a
letter as an extension.

The number (00n) indicates the page number and the letter indicates the
order of the graphic on the page. Thus the file for the second graphic
on the second page would have a name ending in 002B. This creates one
file for each graphic on every page.
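
The four naming rules above all keep the finished name within eight
characters. The Python sketch below works through them. It is an
illustration only: the function names, the parameter names, and the
sample base names ("chart", "logo", and so on) are hypothetical and do
not come from OmniPage.

```python
from typing import Optional

MAX_GRAPHICS_PER_PAGE = 26  # one letter, A through Z, per graphic


def max_name_length(all_pages: bool, each_graphic: bool) -> int:
    """Longest name you can type for a combination of Save and Image Options."""
    if all_pages and each_graphic:
        return 4   # room for a three-digit page number and a letter
    if all_pages:
        return 5   # room for a three-digit page number
    if each_graphic:
        return 7   # room for one letter
    return 8


def export_name(base: str, page: int = 1, graphic: Optional[int] = None,
                all_pages: bool = False) -> str:
    """Build one exported image name (without the file extension).

    `page` and `graphic` are 1-based; `graphic` is None when the entire
    page is saved rather than an individual graphic zone.
    """
    each_graphic = graphic is not None
    limit = max_name_length(all_pages, each_graphic)
    if not 1 <= len(base) <= limit:
        raise ValueError(f"the name can have up to {limit} characters")
    name = base
    if all_pages:
        name += f"{page:03d}"                      # 001, 002, ...
    if graphic is not None:
        if not 1 <= graphic <= MAX_GRAPHICS_PER_PAGE:
            raise ValueError("up to 26 graphics per page")
        name += chr(ord("A") + graphic - 1)        # A, B, ...
    return name


print(export_name("chart", page=3, all_pages=True))            # chart003
print(export_name("diagram", graphic=1))                       # diagramA
print(export_name("logo", page=2, graphic=2, all_pages=True))  # logo002B
```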

To save an image file:

1 Select the path and directory to save the file in the Export Image
dialog box.

2 Type a name for your file in the File Name edit box.

3 Select the appropriate file format from the Save Files as Type
drop-down list.

4 Select the appropriate Save and Image Options.

5 Click OK.

OmniPage automatically adds the appropriate file extension to the file
name and the current working file returns to the screen.

Click Cancel at any time to exit without saving.

Revert to Saved

Choose Revert to Saved to undo edits made to a file and return to the
last-saved version of the file.

For example, if you have deleted important information or cut and
pasted text into unreadable gibberish, choose Revert to Saved and the
file will reappear as it was when you last saved it.

Get Accuracy Info...

Choose Get Accuracy Info... for a statistical report showing how well
OmniPage recognized a page.

Accuracy information is valuable for comparing the effect of different
settings on recognition accuracy. For example, if you are not sure
about which Scanner options to choose, you can compare the recognition
accuracy percentages of different options. You can also quickly tell if
a poor-quality document is worth scanning. If the recognition accuracy
rate is less than 97%, it might be quicker to rescan a better copy of
the page or to enter the text manually.
The Get Accuracy Info dialog box provides a statistical report for the
most recently recognized page.

[Figure: the Get Accuracy Info dialog box, showing Accuracy Information
for page 1: Number of Characters, Number of Words, Number of Rejects,
Number of Suspects, Number of Spelling Replacements, Recognition Time,
Words per Minute, Recognition Rate, and Accuracy Rate]

Number of Characters

This is the number of characters and spaces on the page.

Number of Words

This is the number of words on the page.

Number of Rejects

This is the number of unrecognizable characters. This does not count
improper substitutions or incorrectly recognized formatting commands.

Reject characters appear in red in the recognized document; by default,
rejects are represented by the tilde (~) character.

Number of Suspects

This is the number of questionable characters which OmniPage made an
attempt to recognize. These words appear in green in the recognized
document.

Number of Spelling Replacements

This is the number of words which were automatically corrected by
Caere's Language Analyst feature. These words appear in blue in the
recognized document.
Recognition Time

This is the time it took to break the page down into text and graphics
and perform recognition. This does not count scanning time, the time it
takes to create zones, or the time spent writing data to disk.

Words per Minute

This is the number of words per minute (wpm) that OmniPage recognized.
Assuming that the average word is five characters long, the formula is:

Recognition Rate

This rate is expressed in characters per second (cps). The formula is:

Accuracy Rate

This is the recognition accuracy given as a percentage. The formula for
Accuracy Rate is:

If the accuracy rate is less than 97%, it might be quicker to rescan a
better copy of the page or to enter the text manually.
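
The two speed figures can be estimated from the character count and the
recognition time. The Python sketch below is an assumption based only
on the units and the five-character word length stated above; it may
not match the formulas OmniPage uses, and all function names are
hypothetical. The accuracy rate is taken as an input because its
formula is not derived here.

```python
def recognition_rate_cps(characters: int, seconds: float) -> float:
    """Characters per second: characters divided by recognition time."""
    return characters / seconds


def words_per_minute(characters: int, seconds: float) -> float:
    """Words per minute, assuming an average word of five characters."""
    return characters / 5 * 60 / seconds


def below_accuracy_threshold(accuracy_percent: float,
                             threshold: float = 97.0) -> bool:
    """True if rescanning or retyping might be quicker than correcting."""
    return accuracy_percent < threshold


# Sample figures chosen for this example only.
print(recognition_rate_cps(1300, 13))   # 100.0
print(words_per_minute(1300, 13))       # 1200.0
print(below_accuracy_threshold(95.5))   # True
```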
Save Settings...

Choose Save Settings... to save the currently selected Settings Panel
options and language selection(s) to a settings file (*.set) for later
use.

Saving settings files is especially useful if you process different
types of documents. Since various documents may require different
settings, you can save different settings files and then load the
appropriate file for a particular document.

To save settings:

In the Save Settings dialog box, select the path and directory to which
you want the settings file saved.

You may want to create a special directory for your settings files.

Type a name for your settings file in the File Name edit box.

Select Caere Settings (*.set) as the type of file you are saving.

Click OK to save the settings file.

Click Cancel to exit without saving.

To load a settings file, use the Load Settings... command in the File
menu.
Load Settings...

Choose Load Settings... to load a previously saved settings file
(*.set).

A loaded settings file automatically sets Settings Panel options and
language selection(s) to preselected values. This is useful for quickly
restoring OmniPage to settings required by certain documents.

To load a settings file:

In the Load Settings dialog box, select the path and directory where
your settings files reside.

Select Caere Settings (*.set) as the type of file you are loading.

Double-click the settings file you want. Or, select the file and click
OK. The settings are loaded immediately.

Click Cancel to exit without loading a settings file.

To save a settings file, use the Save Settings... command in the File
menu.

Save Zone Template... (Professional version only)

Choose Save Zone Template... to save the zones that you manually create
on a page image as a template.

A zone template file (*.zon) is comprised of various zone attributes
such as position, order, and zone contents. For example, if you
frequently process documents with layouts and content that require the
same type of zoning, you can create and save a zone template and apply
it to all such documents.

To save a zone template:

After manually creating the zones you want to save, choose Save Zone
Template....

In the Save Zone Template File dialog box, select the path and
directory where you want the zone template saved.

The default directory created during installation is called Data.
OmniPage looks for zone template files in this directory.

Type a name for your zone template file in the File Name edit box.

Select Caere Zone (*.zon) as the type of file you are saving.

Click OK to save the zone template file.

Click Cancel to exit without saving the zone template.

To apply a zone template to a page image, choose the Use Template...
command in the Process menu or select a template directly from the Zone
button drop-down list box.
Print...

Choose Print... to print a recognized document. This command is also
available as a button in the toolbar.

The dialog box that appears depends on your printer.

A document is printed according to the selected print options such as
print range, print quality, and number of copies.

Select the desired print options and click OK to start the print job.
Click Cancel to exit without printing or saving the selected print
options.

Publish to Envoy... (Professional version only)

Choose Publish to Envoy... to save recognized text and any retained
graphics as a WordPerfect Envoy runtime file.

An Envoy file displays information as if it were printed, only it is
printed on the screen rather than on paper. Envoy preserves your
document's fonts and page layout. Text and graphics will appear exactly
as they did in the OmniPage text window.

You cannot change an Envoy file's contents as you would edit a file in
a word processor, but you can rearrange, combine, and delete whole
pages of the file. You can annotate an Envoy file on screen and print
out the entire file on paper. You can also copy selected items from an
Envoy file to the Windows Clipboard and paste them into other
applications.
Saving Recognized Text as a WordPerfect Envoy Runtime File:

1 Choose Publish to Envoy....

The Print dialog box appears with the Envoy driver automatically
selected.

2 Click OK.

The Save Envoy Runtime As dialog box appears.

3 Select the path and directory in which to save the file.

4 Enter a name for your file.

5 Click Setup... to change the default print options.

6 Click OK.

Your file will automatically be saved as an Envoy Runtime file with an
.exe extension.

Opening a WordPerfect Envoy Runtime File

An Envoy runtime file is self-opening: it includes a scaled-down version
of the WordPerfect Envoy application. This file can open itself on the
same kind of computer it was created on (Macintosh or PC) even without
Envoy installed.

To open your Envoy runtime file, double-click its file name (*.exe) in
the Windows File Manager. Your file will open in a scaled-down version
of the Envoy viewer with the title "Embedded Document."

The file will open in the regular Envoy viewer if you have the Envoy
application installed.

Descriptions appear on the Envoy title bar at the top of the screen and
on the status bar at the bottom of the screen to define what a selected
button or command does and actions you can do next as you perform a
task.

You cannot import or open any file other than the file that is attached
to the Envoy runtime viewer.

Exit

Choose Exit to quit the OmniPage program. If the current working
document has changed since the last save, a prompt appears asking if
you want to save the document. Click Cancel
The Edit Menu

Cut                    Ctrl+X
Copy                   Ctrl+C
Paste                  Ctrl+V
Clear                  Del
Select All in Page
Check Recognition...   Ctrl+K
Verify Image           Ctrl+Y
Find/Replace...        Ctrl+F

The Edit menu lets you revise text in the text window and work with
images in the zone window. Edit menu commands include:

Cut
Copy
Paste
Clear
Clear All Zones (for zone window only)
Select All in Page
Check Recognition...
Verify Image
Find/Replace...
Delete Recognized Zone
Select Recognized Zones
Delete Current Page
Go to Page...

Cut

Choose Cut to temporarily delete selected text from the recognized
document. This command is also available as a button in the toolbar.

Cut text is stored on the Windows clipboard and may be pasted anywhere
(except into a graphic) in the document. The text remains on the
clipboard until new text is cut or copied.

To cut text:

1 Position the text cursor at the start of the text, hold the mouse
button down, and drag the cursor across the text to highlight it.

2 Release the mouse button when you have selected the desired area of
text.

3 Choose Cut or click the Cut button.

The selected text disappears.

4 Place the cursor where you want to place the text and click the mouse
button.

5 Choose Paste in the Edit menu or click the Paste button.

The Verify Image feature cannot track text that is cut and pasted from
one page to another.

Copy

Choose Copy to duplicate selected text from the recognized document.
This command is also available as a button in the toolbar.

Copied text is stored on the Windows clipboard and may be pasted
anywhere (except into a graphic) in the document. The text remains on
the clipboard until new text is cut or copied.

To copy text:

Position the text cursor at the start of the text, hold the mouse
button down, and drag the cursor across the text to highlight it.

Release the mouse button when you have selected the desired area of
text.

Choose Copy or click the Copy button.

The selected text remains as is.

Place the cursor where you want to place the text and click the mouse
button.

Choose Paste in the Edit menu or click the Paste button.

Paste

Choose Paste to place cut or copied text in the recognized document.
This command is also available as a button in the toolbar.

Pasted text appears at the cursor location. A copy of the pasted text
remains on the Windows clipboard until new text is cut or copied.

The Verify Image feature cannot track text that is cut and pasted from
one page to another.

Clear

Choose Clear to delete selected text from the recognized document
permanently.

To clear text:

Position the text cursor at the start of the text, hold the mouse
button down, and drag the cursor across the text to highlight it.

Release the mouse button when you have selected the desired area of
text.

Choose Clear.

Cleared text is not stored on the Windows clipboard, so you cannot
paste it.
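
The clipboard behavior described for Cut, Copy, Paste, and Clear can be
summarized in a short Python sketch. This is only an illustrative model
and not OmniPage code; the class TextDocument and its method names are
hypothetical, and text positions are simplified to character offsets.

```python
class TextDocument:
    """Toy model of the text window and the Windows clipboard (hypothetical)."""

    def __init__(self, text):
        self.text = text
        self.clipboard = ""  # holds the most recently cut or copied text

    def cut(self, start, end):
        # Cut removes the selection and stores it on the clipboard.
        self.clipboard = self.text[start:end]
        self.text = self.text[:start] + self.text[end:]

    def copy(self, start, end):
        # Copy stores the selection; the document itself is unchanged.
        self.clipboard = self.text[start:end]

    def paste(self, position):
        # Paste inserts the clipboard text; a copy stays on the clipboard.
        self.text = self.text[:position] + self.clipboard + self.text[position:]

    def clear(self, start, end):
        # Clear deletes the selection without touching the clipboard,
        # so cleared text cannot be pasted back.
        self.text = self.text[:start] + self.text[end:]
```

In this model the clipboard changes only in cut and copy, which mirrors
the statement that text remains on the clipboard until new text is cut
or copied.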

Clear All Zones

Choose Clear All Zones to delete all of the zones in a page image in
the zone window. This command is also available as a button in the
toolbar.

Clear All Zones appears in the menu only when the zone window is
active. When you clear zones, only the zone borders are deleted; the
image itself remains the same. After the zones are cleared, you can
create new zones manually, automatically, or by using a zone template
(Professional version only).

Select All in Page

Choose Select All in Page to automatically select the entire contents
of a recognized page in the text window. This command is available only
when the text window is active.

To deselect a selected page, click anywhere within it or choose Select
All in Page again.

Check Recognition...

Choose Check Recognition... to check for errors in a recognized
document. This command is also available as a button in the toolbar.

OmniPage uses the currently selected main and user dictionaries to
check recognition. The Check Recognition operation will stop at:

Blue words: these were replaced by the Language Analyst.

Green words: these have questionable characters that OmniPage made an
attempt to recognize.

Red: unrecognizable characters in a word are replaced with a red reject
character (~ is the default).

Words not found in the dictionaries.
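
The four stop conditions listed above can be pictured as a simple check
applied to each word. The Python sketch below is an assumption-laden
illustration, not OmniPage's implementation: the Word class, the color
tags, and the case-insensitive dictionary lookup are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Word:
    text: str
    color: str = "black"  # hypothetical tag: "blue", "green", "red", or "black"


def needs_review(word, main_dictionary, user_dictionary):
    """Return why Check Recognition would stop at this word, or None."""
    if word.color == "blue":
        return "replaced by the Language Analyst"
    if word.color == "green":
        return "questionable characters"
    if word.color == "red":
        return "unrecognizable characters"
    # Assumption: dictionary lookups ignore upper- and lower-case.
    key = word.text.lower()
    if key not in main_dictionary and key not in user_dictionary:
        return "not found in the dictionaries"
    return None
```

Adding a word to the user dictionary set in this model makes later
calls return None for that word, which corresponds to the Add option
described below.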

When OmniPage finds a possible error, the Check Recognition dialog box
shows the original image of the word in the context of the original
page.

You can only see character bitmaps if the original page images are
saved in the Caere Document.

Choose one of the following options for a word flagged as a possible
error. After you choose an option for the word, OmniPage automatically
finds the next possible error.

Ignore

Click this to allow a word to remain as is and go on to find the next
error.

Add

Click this to add a word to your current User Dictionary and go on to
find the next error. Other occurrences of the word that are suspected
errors in the current document will be checked. However, OmniPage will
accept future occurrences of the word when you use the same user
dictionary for future documents.

Change

Click this to replace a word with the word in the Change To edit box.

To place a word in the Change To edit box, you can either type in a
word or select a word from the drop-down list box.

The original text of a word corrected by the Language Analyst appears
as the first word in the list box in case you want to change it back.

Done

Click this to exit the Check Recognition operation. Any changes made up
to that point will be retained.

Verify Image

Choose Verify Image to view the original image of recognized text in
the Verification Window. The Verification Window is an important
feature that you can use while you are viewing and editing recognized
text. It shows a clear close-up of the original image and surrounding
area of selected text.

In order to verify images, the original page images must be saved in
the Caere Document. To save page images, make sure Save Page Images in
Caere Document is selected in the Settings Panel Preferences options
before scanning or loading an image.

Saving the original page images slightly slows down processing and
takes up more disk space.

To verify an image:

Place the cursor in the area of recognized text that you want to
verify.

Choose Verify Image. Or, double-click the mouse button.

The Verification Window appears showing the original image of the
selected area of text.

You cannot verify the image of text that is cut and pasted from one
page to another.

Find/Replace...

Choose Find/Replace... to find a word or set of characters in the
recognized document and replace it with another word, if desired. This
command is also available as a button in the toolbar.

By default, when you search for a word, all occurrences of letter
combinations that match the word are found. For example, if "jelly" is
the search word, OmniPage would find the "jelly" in "jellyfish." You
can also select other more specific options for finding words,
including Match Whole Word Only and Match Case.

Match Whole Word Only

Select this to find only the words that exactly match the length of the
search word. Compound words that contain the search word within them
will not be found.

Match Case

Select this to find only the words that exactly match the upper- and
lower-case attributes of the search word.
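
The effect of the two search options can be sketched with Python's
standard re module. This is an illustration of the matching rules as
described here, not OmniPage's own search code; the function name
find_all and its parameters are hypothetical, and the sketch assumes
that the default search ignores case.

```python
import re


def find_all(text, search, whole_word=False, match_case=False):
    """Return the start offset of every match of `search` in `text`."""
    pattern = re.escape(search)  # treat the search word literally
    if whole_word:
        # Word-boundary anchors keep "jelly" from matching in "jellyfish".
        pattern = r"\b" + pattern + r"\b"
    flags = 0 if match_case else re.IGNORECASE
    return [match.start() for match in re.finditer(pattern, text, flags)]
```

With neither option selected, searching "Jelly and jellyfish like
jelly." for "jelly" matches all three occurrences, including the one
inside "jellyfish." Selecting both options leaves only the final
lower-case "jelly."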

To find a word:

Type the word or set of characters for which you are searching in the
Find What edit box.

Select specific search options, if desired.

You can select Match Whole Word Only and/or Match Case.

Click Find Next.

The first occurrence of the word is highlighted.

To continue searching, click Find Next again.

Click Cancel to exit.

To replace a word:

Type the word or set of characters that you want to replace in the Find
What edit box.

Select specific search options, if desired.

You can select Match Whole Word Only and/or Match Case.

Type the desired replacement word in the Replace With edit box.

Click Find Next.

The first occurrence of the word is highlighted.

Click Replace to insert the replacement word.

OmniPage then automatically looks for the next occurrence of the search
word.

Click Replace All to replace all instances of a word.

Click Cancel to exit.

Delete Recognized Zone

Choose Delete Recognized Zone to delete a selected text or graphic zone
in a recognized page. This command is available in the Edit menu when
the text window is active.

You can delete a text zone if your cursor is in it. To delete a graphic
zone, however, you must choose Select Recognized Zones in the menu and
then click in that graphic zone.

Select Recognized Zones

Choose Select Recognized Zones to select all of the text and graphic
zones in a recognized page. A check mark appears next to this command
when it is selected.

OmniPage produces various text and graphic zones in a recognized page
which you can resize or reposition to change the page layout. When you
select zones, handles appear on each zone. Use the handles to resize a
zone. To move a selected text or graphic zone to another area of the
recognized page, place the mouse pointer inside the zone, hold the
mouse button down, and drag it to the desired location.

To deselect the zones in the recognized document, choose Select
Recognized Zones from the Edit menu again. The check mark disappears
and the zones are deselected.

You can also select zones individually by placing the mouse pointer
inside a zone and doing an Alt-right mouse click.

Delete Current Page

Choose Delete Current Page to delete a page.

You may want to delete a page in your document that was poorly scanned
or recognized.

When you delete a page, everything is discarded including the page
image and recognized text.

Go to Page...

Choose Go To Page... to switch to another page in the current document.
Both the text window and zone window will change to reflect the
selected page.

You can also open the Go To Page dialog box by clicking the current
page number in the status bar.

In the Go To Page dialog box, you can select First Page, Last
Page, or type in a specific number in the Page Number edit box.

Click OK to go to the selected page. Click Cancel to exit and return to
the current page.
The Format Menu

The Format menu lets you format character and paragraph attributes
while you edit a recognized document in the text window. Format menu
commands include:

Character...

Paragraph...

Character...

Choose Character... to change the attributes of a selected character or
section of text in a recognized document.

Character formatting will not be retained if you save a file in ASCII
or ANSI format.

You can select multiple attributes in the Font dialog box including
font, font style, size, and effects. The Sample box illustrates the
attributes that you select.

Font

Select a font from the Font list box. You can type a letter in the Font
edit box to skip to the fonts beginning with that letter.

Font Style

Select Regular to return selected characters to an unformatted state.
Bold, italic, and underlined characteristics disappear.

Select Italic to change selected characters to an italicized format.

Select Bold to change selected characters to a boldfaced format.

Select Bold Italic to change selected characters to a boldfaced and
italicized format.

Size

Select from a range of font sizes in the Size list box.

Effects

Select Underline to change selected characters to an underlined format.

To apply character formatting:

1 Position the text cursor at the start of the text, hold the mouse
button down, and drag the cursor across the text to highlight it.

2 Release the mouse button when you have selected the desired area of
text.

3 Choose Character... in the Format menu.

4 Make the desired formatting selections in the Font dialog box.

5 Click OK to accept the formatting selections; the selected text changes
accordingly.

Click Cancel to exit without applying the formatting selections.

You can also use the Bold, Italics, and Underline buttons in the text
window for convenient formatting shortcuts.

Paragraph...

Choose Paragraph... to change the attributes of a selected paragraph in
a recognized document.

Paragraph formatting will not be retained if you save a file in ASCII
or ANSI format.

You can select line spacing and alignment attributes in the Paragraph
Format dialog box.

Line Spacing

Select Single for single-spaced lines.

Select Double for double-spaced lines.

Select Triple for triple-spaced lines.

Alignment

Select Left for left-aligned text.

Select Center for center-aligned text.

Select Right for right-aligned text.

Select Justify for justified text.

To apply paragraph formatting:

1 Place the cursor somewhere within the paragraph that you want to
format.

2 Choose Paragraph... in the Format menu.

3 Make the desired formatting selections in the Paragraph Format dialog
box.

4 Click OK to accept the formatting selections.

The selected text changes accordingly.

Click Cancel to exit without applying the formatting selections.

You can also use the buttons in the text window for convenient
formatting shortcuts.
The Process Menu

The Process menu lets you perform fundamental OmniPage operations,
including each step of the OCR process. Process menu commands include:

Auto/Stop
Scan Image/Load Image
Auto Zones/Manual Zones/Use Template...
Perform OCR/Train OCR/Defer OCR
Process Settings
Finish Current Document...
Finish Deferred Documents...
Start Image Assistant (Professional version only)

Some of the Process menu commands are available as buttons in the
toolbar. In particular, the Process Settings commands are available in
the Image, Zone, and OCR button drop-down list boxes. The Process
Settings commands change according to the currently selected button
commands and vice versa.

Auto

Choose Auto to automatically start and finish processing each page of a
new document or finish processing the current page of an open document.
This command performs the same function as the AUTO button in the
toolbar.

Automatic processing of a document is determined by the currently
selected Image, Zone, and OCR Process Settings commands. For example,
if Scan Image, Auto Zones, and Perform OCR are selected as the
processing commands, the following process occurs when you choose Auto:

The page in your scanner is scanned and the resulting image appears in
the zone window.

OmniPage automatically creates zones on the page image.

OmniPage performs OCR on the page image and the resulting recognized
page appears in the text window.

This same process repeats for every page in a multipage document.
Scanning, zoning, and OCR operations occur according to the currently
selected Settings Panel options.
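
The automatic processing sequence can be pictured as a loop that runs
the selected Image, Zone, and OCR commands on each page in turn. The
Python sketch below is purely illustrative; OmniPage is operated through
its menus and toolbar, and the function auto_process and its three
callable parameters are hypothetical stand-ins for the selected Process
Settings commands.

```python
def auto_process(pages, get_image, make_zones, run_ocr):
    """Process every page with the selected Image, Zone, and OCR commands.

    get_image, make_zones, and run_ocr are hypothetical callables that
    stand in for commands such as Scan Image, Auto Zones, and Perform OCR.
    """
    document = []
    for page in pages:
        image = get_image(page)       # the image appears in the zone window
        zones = make_zones(image)     # zones are created on the page image
        text = run_ocr(image, zones)  # the recognized page appears in the text window
        document.append(text)         # pages are combined into one working document
    return document
```

Passing a different callable for any of the three steps corresponds to
choosing a different command, for example Load Image instead of Scan
Image.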

When a document is already open to an unfinished page image, you can
choose Auto to finish processing that page according to the selected
processing commands. For example, if your document is open to an
unrecognized page image without zones, you can choose Auto to create
zones and recognize the page according to the selected zone and OCR
commands.

As automatic processing begins, the Auto command changes to Stop.
Choose Stop if you want to discontinue processing.

Stop

Choose Stop if you want to discontinue processing at any time. This
command performs the same function as the Stop button in the toolbar.

For example, you may want to stop scanning your page if you realize
that inappropriate scanning options were selected.

Scan Image

Choose Scan Image to scan a page in your scanner. This command performs
the same function as the Image button when Scan Image is selected from
the drop-down list box.

Before scanning, make sure the appropriate Scanner options are selected
in the Settings Panel.

You can use your right mouse button to click the Image button and
automatically open the Settings Panel to Scanner options.

A scanned image becomes your working document if a document is not
already open. When a document is already open, scanned images can be
added to it. An image is automatically appended to the end of the
document if the last page is currently open. If the document is not
open to the last page, a dialog box opens. You can replace the current
page, insert before the current page, or append to the end of the
document.
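
The rules for where a new image lands in the working document can be
sketched in Python as follows. This is a simplified model and not
OmniPage code; the function add_page, its parameters, and the choice
strings are hypothetical names for the options offered in the dialog
box.

```python
def add_page(pages, current_index, new_page, choice=None):
    """Add a scanned or loaded page image to the working document.

    pages          list of page images (empty if no document is open)
    current_index  index of the page that is currently open
    choice         "replace", "insert", or "append"; consulted only when
                   the document is not open to its last page
    """
    if not pages or current_index == len(pages) - 1:
        # No open document, or the last page is open: append automatically.
        pages.append(new_page)
    elif choice == "replace":
        pages[current_index] = new_page
    elif choice == "insert":
        pages.insert(current_index, new_page)  # goes before the current page
    elif choice == "append":
        pages.append(new_page)
    else:
        raise ValueError("choose 'replace', 'insert', or 'append'")
    return pages
```

The same rules apply whether the new image comes from Scan Image or
Load Image.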

While scanning an image, a progress meter appears and the status bar
displays progress. To cancel scanning at any time,
click the STOP button in the toolbar. When scanning is complete, the
image appears in the zone window.

You can scan multiple pages when you select Scan Image for automatic
processing. For example, you may have a multi-page document that you
want to process automatically. After selecting Scan Image and the
desired zone and OCR processing commands, you can choose Auto to begin
automatic processing. The pages are scanned and processed in the order
that they are placed in the scanner and combined into one working
document.

You can change the Scan Image command to Load Image in the Process
Settings cascading menu or the Image button drop-down list.

Load Image

Choose Load Image to open a previously saved image file. This command
performs the same function as the Image button when Load Image is
selected from the drop-down list box.

An image file is a "picture" of text and/or graphics that is saved in
an image file format such as TIFF or PCX. When you load an image file
in OmniPage, it appears in the zone window.

See Supported Input File Formats on page 2-8 for a list of supported
image file formats.

A loaded image file becomes your working document if a document is not
currently open. When a document is already open, image files can be
added to it. An image is automatically appended to the end of the
document if the last page is currently open. If the document is not
open to the last page, a dialog box opens. You can replace the current
page, insert before the current page, or append to the end of the
document.

To load an image file:

Locate your image files in the Open Document dialog box.

Select the type of file you wish to open from the List Files of Type
drop-down list.

Files of that type appear in the File Name list box.

Double-click a file or select it and click OK.

The file opens in the zone window. For a multi-page image file, you
must click the Image button to load each consecutive page in the file.

Click Cancel to exit without loading an image file.

You can load multiple image files when you have Load Image selected for
automatic processing. For example, you may have a number of TIFF files
that you want to process automatically. These files are loaded and
processed in the order that they are selected and combined into one
working document.

To load one or more image files for automatic processing:

1 Select Load Image and the desired zone and OCR commands.

2 Click the AUTO button to begin processing.

The Load Image dialog box appears.

3 Locate your image files.

4 Select the type of files you wish to load from the List Files of Type
drop-down list.

Files of that type appear in the File Name list box.

5 Click the file to load and then click Add.

The file appears in the Selected Files list box.

Click Add All to select all the files in the directory.

To add image files from other directories, repeat steps three through
five. You can select up to 255 files.

To remove a file from the list box, click it and then click Remove.
Click Remove All to remove all files from the list box.

6 Click OK when you have selected all the files to load.

The images are loaded into the zone window and processed one at a time,
in the order they were selected.

Click Cancel to exit without loading any files.
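
The behavior of the Selected Files list box can be modeled with a short
Python sketch. It is an illustration only; the class SelectedFiles and
its methods are hypothetical, and the sketch assumes that exceeding the
file limit is simply refused.

```python
MAX_FILES = 255  # limit stated in this manual


class SelectedFiles:
    """Toy model of the Selected Files list box (hypothetical class)."""

    def __init__(self):
        # Files are kept in selection order, which is the processing order.
        self.files = []

    def add(self, path):
        if len(self.files) >= MAX_FILES:
            raise ValueError("cannot select more than %d files" % MAX_FILES)
        self.files.append(path)

    def add_all(self, directory_listing):
        # Add All selects every listed file in the current directory.
        for path in directory_listing:
            self.add(path)

    def remove(self, path):
        self.files.remove(path)

    def remove_all(self):
        self.files.clear()
```

Because the list preserves insertion order, iterating over files yields
the images in the order they would be loaded and processed.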

You can change the Load Image command to Scan Image in the Process
Settings cascading menu or Image button drop-down list.

Auto Zones

Choose Auto Zones to have OmniPage automatically draw and order zones
that determine what will be recognized in the page image. This command
performs the same function as the Zone button when Auto Zones is
selected from the drop-down list.

To automatically create zones and determine the text flow for
recognition, OmniPage uses the selected Zones option in the Settings
Panel: Multiple Columns, Single Column or Table, or None. For more
information about each of these options, see Zones Options on page 2-9.

If a page already has zones, you are prompted to delete the current
zones before auto zoning occurs; click Yes to proceed. The zone window
is then updated so that you can review the zones that are drawn.

The automatically drawn zones appear in black and each zone has a
number indicating its recognition order. Using the zone window tools,
you can reorder zones for recognition and deselect zones that you do
not want to recognize.

To reorder zones for recognition:

1 Click the Order Zones tool.

The numbers in the zones will disappear.

2 Click within the zone you want to recognize first.

The number 1 will appear in the zone.

3 Click within the next zone you want recognized.

The number 2 will appear in the zone.

4 Continue until all the zones are appropriately ordered.
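
The reordering procedure above amounts to clearing every zone number
and then handing out 1, 2, 3, and so on in click order. The Python
sketch below models that idea; the class ZoneOrder is hypothetical, and
the sketch assumes that clicking a zone that already has a number does
nothing.

```python
class ZoneOrder:
    """Toy model of the Order Zones tool (hypothetical class)."""

    def __init__(self, zone_ids):
        # Zones start out numbered in the order they were created.
        self.order = {zone: n for n, zone in enumerate(zone_ids, start=1)}

    def start_ordering(self):
        # Selecting the Order Zones tool clears every number.
        self.order = {zone: None for zone in self.order}

    def click(self, zone):
        # Each click gives the next number to a zone that has none yet.
        if self.order[zone] is None:
            assigned = [n for n in self.order.values() if n is not None]
            self.order[zone] = len(assigned) + 1

    def recognition_sequence(self):
        numbered = [z for z in self.order if self.order[z] is not None]
        return sorted(numbered, key=lambda z: self.order[z])
```

Calling recognition_sequence after all zones have been clicked returns
the zones in the order they would be recognized.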

To deselect zones that you do not want to recognize:

1 Click the Select Zones tool.

2 Click within each zone you want to deselect.

A zone changes from black to white when it is deselected. To reselect a
zone, click it again with the Select Zones tool.

You can change the Auto Zones command to Manual Zones in the Process
Settings cascading menu or the Zone button drop-down list box.

If you use the OmniPage Professional version, you can also change this
command to Use Template... in the Process Settings cascading menu or
select a zone template directly from the Zone button drop-down list
box.

Manual Zones

Choose Manual Zones to draw, order, and specify your own zones that
determine what will be recognized in the page image.

When you create zones manually, OmniPage uses the selected Zones option
in the Settings Panel (Multiple Columns, Single Column or Table, or
None) to determine the text flow within each zone that you draw. For
more information about each of these options, see the Zones Options
entry in Chapter 2, The Settings Panel.

You can draw zones using the tool palette in the zone window.

Use the Zoom tool to zoom in or out. After selecting it, click the left
mouse button to zoom in (enlarge the image) and the right mouse button
to zoom out (reduce the image).

Use the Draw Zones tool to draw zones around areas of text.

Use the Order Zones tool to number zones in the order you want them
recognized.

Use the Erase Zones tool to delete existing zones.

Use the Arrow buttons to rotate the entire image 90 degrees left, 180
degrees, or 90 degrees right.

Use the Zone Contents drop-down list box to assign a zone contents file
to a selected zone.

To draw zones:

1 Click the Draw Zones tool.

2 Enclose an area you want as a zone by holding the mouse button down
and dragging the mouse.

3 When you have enclosed the desired area, release the mouse button.

4 Continue using the mouse to draw zones in the page image until you have
finished.

You can draw up to 64 separate zones of which 26 can be graphic zones.
Any area of the image that is not part of a zone will not be
recognized.
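
The two zone limits can be expressed as a simple check. The Python
sketch below is only an illustration of the stated limits; the function
can_add_zone and the way zones are represented (a list of true/false
values marking graphic zones) are hypothetical.

```python
MAX_ZONES = 64          # total zones allowed on one page image
MAX_GRAPHIC_ZONES = 26  # at most this many of them may be graphic zones


def can_add_zone(zones, is_graphic):
    """Return True if one more zone may be drawn.

    zones is a list of booleans: True for a graphic zone, False for a
    text zone.
    """
    if len(zones) >= MAX_ZONES:
        return False
    if is_graphic and sum(zones) >= MAX_GRAPHIC_ZONES:
        return False
    return True
```

For example, a page that already has 26 graphic zones can still take
another text zone, but not another graphic zone.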

A number appears in each zone indicating the order in which the zone
will be recognized. To reorder zones, use the Order Zones tool.

To resize zones:

1 Click the Draw Zones tool.

2 Click a zone to select it.

Handles appear on the zone border.

3 Select a handle, hold the mouse button down, and drag the mouse in the
direction that you want to enlarge or reduce the zone.

To order zones for recognition:

1 Click the Order Zones tool.

The numbers in the zones disappear.

2 Click within the zone you want to recognize first.

The number 1 appears in the zone.

3 Click within the next zone you want recognized.

The number 2 appears in the zone.

4 Continue until all the zones are appropriately ordered.

To reorder zones, click the Order Zones tool again.

To move zones:

1 Click the Draw Zones tool.

2 Place the mouse pointer inside a zone.

Your cursor changes to a four-way arrow.

3 Hold down the mouse button and drag the zone wherever you want it.

Only the zone borders can be moved; the contents of the page image
remain in the same place.

To erase zones:

1 Click the Erase Zones tool.

2 Click within each zone you want to delete.

Only the zone borders go away; the contents of the page image remain.

Assigning Zone Contents Files

For better recognition accuracy, you can assign zone contents files to
various zones that you draw in a page image.

For example, if your image has a paragraph of text followed by a table
of numbers, you can draw separate zones around each and assign an
alphanumeric zone contents file to the paragraph and a numeric zone
contents file to the table.

To assign zone contents files to zones:

1 After drawing zones manually, click within a zone to select it.

2 Select the appropriate zone contents file from the Zone Contents
drop-down list box.

3 Repeat steps one and two for any other zones you wish.

You can change a zone contents assignment at any time before
recognition.

You can change the Manual Zones command to Auto Zones in the Process
Settings cascading menu or the Zone button drop-down list box. If you
use the OmniPage Professional version, you can also change this command
to Use Template... in the Process Settings cascading menu or select a
zone template directly from the Zone button drop-down list box.

Use Template... (Professional version only)

Choose Use Template... to create zones that determine what will be
recognized in the page image by applying a zone template file (*.zon).
This is a quick and efficient means of zoning similar documents.

A zone template file is comprised of various zone attributes such as
position, order, and zone contents. If you frequently process documents
with layouts and content that require the same type of zoning, you can
save a zone template and apply it to all such documents.
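
One way to picture a zone template is as a list of zones, each carrying
a position, a recognition order, and a zone contents setting. The Python
sketch below is a conceptual model only: the actual layout of a .zon
file is not described in this manual, and the Zone class, its
rectangle fields, and the function apply_template are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Zone:
    left: int      # position, modeled here as a rectangle (an assumption)
    top: int
    right: int
    bottom: int
    order: int     # recognition order
    contents: str  # zone contents, for example "alphanumeric" or "numeric"


def apply_template(template, page_images):
    """Give every page image its own list of the template's zones."""
    ordered = sorted(template, key=lambda zone: zone.order)
    return {page: list(ordered) for page in page_images}
```

Applying one template to several page images yields the same zones, in
the same recognition order, on each of them.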

You can create zones manually and save them as a template using the
Save Zone Template... command in the File menu. For more information on
creating zones manually, see Manual Zones on page 2-46.

When you choose Use Template..., a dialog box appears listing all zone
template files in the Data directory.

To apply a zone template:

Click the zone template (*.zon) to use for the current page image.

The selected file is highlighted.

Click OK.

Zones are drawn on the page image according to the selected zone
template.

Click Cancel to exit without applying the zone template.

You can also select a zone template directly from the Zone button
drop-down list.

You can change the Use Template... command to Manual Zones or Auto
Zones in the Process Settings cascading menu or the Zone button
drop-down list.

Perform OCR

Choose Perform OCR to recognize text on the current page. This command
performs the same function as the OCR button when Perform OCR is
selected from the drop-down list.

Before performing OCR, make sure the appropriate OCR options are
selected in the Settings Panel.

Use your right mouse button to click the OCR button and automatically
open the Settings Panel to OCR options.

If there are no zones on the page when you select Perform OCR, OmniPage
automatically creates zones according to the selected Zone command. If
Manual Zones is currently selected, OmniPage ignores this and draws
zones automatically.
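
The zoning rule that applies when Perform OCR finds no zones can be
sketched in Python. This is an illustrative reading of the paragraph
above and not OmniPage code; the function zones_for_ocr and its
parameters are hypothetical, and falling back to automatic zoning when
no template is supplied is an assumption.

```python
def zones_for_ocr(existing_zones, zone_command, auto_zone, template_zones=None):
    """Return the zones that Perform OCR would use on the current page.

    auto_zone is a hypothetical callable that draws zones automatically.
    """
    if existing_zones:
        # Zones already on the page are used as they are.
        return existing_zones
    if zone_command == "Use Template" and template_zones:
        return list(template_zones)
    # "Auto Zones" draws zones automatically, and "Manual Zones" is
    # ignored in favor of automatic zoning when the page has no zones.
    return auto_zone()
```

In this model, selecting Manual Zones and then Perform OCR on an unzoned
page gives the same result as selecting Auto Zones.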

If you use the Professional version, you can change the Perform OCR
command to Defer OCR or Train OCR in the Process Settings cascading
menu or OCR button drop-down list.

Defer OCR (Professional version only)

Choose Defer OCR to delay text recognition of one or more pages of the
document you are processing. This command performs the same function as
the OCR button when Defer OCR is selected from the drop-down list box.

For example, you can select Scan Image, Auto Zones, and Defer OCR as
the Process Settings commands and then choose Auto to initiate
automatic processing. Each page of your document will be scanned and
zoned but recognition will be deferred.

At your convenience, you can choose Finish Current Document... in the
Process menu to finish processing your open document. Or, set OmniPage
to recognize the deferred document at a specified time by choosing
Finish Deferred Documents... in the Process menu.

You can change the Defer OCR command to Perform OCR or Train OCR in the
Process Settings cascading menu or OCR button drop-down list box.

Train OCR (Professional version only)

Choose Train OCR to create a character training file (*.trn) that
assists OmniPage during text recognition and allows better recognition
of special characters. This command performs the same function as the
OCR button when Train OCR is selected from the drop-down list.

A character training file is a set of pre-recognized text characters
that OmniPage compares with the characters in the page image during
recognition. Before recognizing an image, you can create a new training
file or choose an existing one in the Settings Panel OCR options. For
step-by-step instructions on training OCR, see Documents with Specialized
Characters (Professional version only) on page 2-50.

The Train Characters dialog box shows the original image and OmniPage's
interpretation of each character in the page image. Click any character
to select it for training.

[Figure: Train Characters dialog box, with one row labeled "Original
image" and one row labeled "OmniPage's interpretation"]

Specify

Select a character and click Specify (or double-click the character) to
open the Specify Character dialog box. This shows the selected
character in the context of the original page image.

[Figure: Specify Character dialog box]

You can associate one or more characters with the selected character bitmap. Type
the desired character(s) in the Character edit box or select a
character in the Extended ANSI list box and click OK.

Delete

To discard a previously specified character, select it and click
Delete.

Append

Click Append to add the current set of trained characters to another
training file. A dialog box appears displaying a list of existing
character training files. Click the file you wish to append and then
click OK.

Save

Click Save to save the trained characters to a file; a dialog box
appears.

Name the file and click OK. If you name an existing file, you will be
asked if you want to replace it with the current file. Training files
are saved to the Data directory; this is the default directory that
OmniPage creates during installation.

To create a character training file:

Open an image file or scan an image that includes the characters you
want to train.

Select the appropriate zones and choose Train OCR in the Process menu
or OCR button.

The Train Characters dialog box appears.

Specify characters in the dialog box and edit them as desired.

Click Save to name and save a character training file for the
characters you have trained.

Click Append to add the trained characters to an existing training
file.

Click Cancel to exit without saving the training file.

A dialog box gives you the option of recognizing your page image and
making this the current training file after saving or appending the
file.

Click Yes to recognize your page image and apply the training file you
just created.

Click No if you want to return to the OmniPage screen without
recognizing the image.
You can change the Train OCR command to Perform OCR or Defer OCR in the
Process Settings cascading menu or OCR button drop-down list box.

Process Settings

Choose Process Settings to access the image, zone, and OCR processing
commands. These commands are also accessible in the Image, Zone, and
OCR button drop-down list boxes.

[Figure: Process Settings cascading menu, with callouts for the
selected image command, the selected zone command, and the selected OCR
command]

In the Process Settings cascading menu, you can select:

Scan Image or Load Image

Auto Zones, Manual Zones, or Use Template... (Professional version
only)

Perform OCR, Train OCR (Professional version only), or Defer OCR
(Professional version only)

The currently selected Process Settings commands determine what image,
zone, and OCR operations can be performed. OmniPage also uses the
selected commands for automatic processing.

For more information on each Process Settings command, refer to its
respective Process menu entry in this chapter.

Finish Current Document...

Choose Finish Current Document... to automatically finish recognition
processing of an open document.

For example, you can scan pages and create zones in a multipage
document without taking the time to recognize it. Later, at your
convenience, you can choose Finish Current Document... to recognize
the entire document.

OmniPage uses the currently selected Settings Panel options to
finish processing your document.

Save Options

For your convenience, you can select Save Options in the Finish Current
Document dialog box. This way, OmniPage automatically finishes your
document and then saves it to your specifications.

Select Convert Automatically to save your document in a preselected
file format after recognition. Click Save Output to... to open the Save
As dialog box and select specific options for saving your document.

Select Delete Caere Document when Finished to automatically discard the
Caere Document after recognition. Your document will only be saved in
the file format that you select. Remember, only a Caere Document can be
reopened (and resaved in a different format) in OmniPage.

If the Caere Document is deleted, you can't reopen or
edit your recognized document in OmniPage.

To finish the current document:

Open the document you want to finish, if it is not already open in
OmniPage.

Choose Finish Current Document... in the Process menu.

Select the appropriate Save options.

Click OK to begin processing immediately.

Every page of the document will be processed. If a page does not have
zones, OmniPage automatically creates zones using the selected Settings
Panel Zones option.

Click Cancel to exit without processing the current document.

Finish Deferred Documents... (Professional version only)

Choose Finish Deferred Documents... to automatically finish recognition
processing of up to 255 documents at a specified time.

A document with one or more unrecognized pages can be saved as a Caere
Document. You can defer recognition of a page by choosing Defer OCR in
the Process menu or the OCR button drop-down list box. For more
information on this feature, see the Defer OCR entry in this chapter.

OmniPage uses the currently selected Settings Panel options to
process your documents.

Save Options

For your convenience, you can select Save Options in the Finish
Deferred Documents dialog box. This way, OmniPage automatically
finishes your deferred documents and then saves them to your
specifications.

Select Convert Automatically to save your document in a preselected
file format after recognition. Click Save Output to... to open the Save
As dialog box and select specific options for saving your document.

Select Delete Caere Document when finished to automatically discard the
Caere Document after recognition. Your document will only be saved in
the file format that you select. Remember, only a Caere Document can be
reopened (and resaved in a different format) in OmniPage.

If the Caere Document is deleted, you can't reopen or
edit your recognized document in OmniPage.

When to Recognize Options

Select Now to process the document(s) as soon as you click OK.

Select Later to process the document(s) at another specified time.
Select a time (hour and minute) from the drop-down list box.
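
To see what a Later setting could mean in practice, the sketch below computes the next moment at which a chosen hour and minute occurs. It is an illustration only: the function name is hypothetical, and the rule that a time already past today rolls over to tomorrow is our assumption, not documented OmniPage behavior.

```python
from datetime import datetime, timedelta


def next_run(now: datetime, hour: int, minute: int) -> datetime:
    """Return the next occurrence of hour:minute at or after 'now'.

    Hypothetical helper. Assumption: if the chosen time has already
    passed today, processing would start at that time tomorrow.
    """
    candidate = now.replace(hour=hour, minute=minute, second=0, microsecond=0)
    if candidate <= now:
        candidate += timedelta(days=1)
    return candidate
```

Under this assumption, choosing 23:00 at 5:30 in the afternoon starts processing the same evening, while choosing 23:00 at 23:30 waits until the following night.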

To finish deferred documents:

In the Finish Deferred Documents dialog box, click Add Files....

The Open dialog box appears.

Locate your files.

Caere Document files in the specified directory will appear in the list
box.

Select a file you want to open and click Add, or double-click the file.

The file appears in the Selected Files list box.
Continue to select the deferred files that you want to finish. You can
choose files from various directories.

If you change your mind about a file, select it and click Remove.

When you have selected the desired file(s), click OK.

The Finish Deferred Documents dialog box reappears.

Select the appropriate Save Options and When to Recognize Options.

Click OK to recognize the deferred documents as specified.

Each document is opened, processed, saved as specified, and then
closed. If you do not specify any save options, a document will be
saved to its original file name.

Click Cancel to exit without recognizing the deferred documents.

Start Image Assistant (Professional version only)

Choose Start Image Assistant to launch the Image Assistant 24-bit color
and image-editing program. This command is also available as a button
in the toolbar.

You can also launch Image Assistant by double-clicking a graphic
zone in your recognized document. The graphic will appear in a new
image window.

With Image Assistant, you can scan and edit color, grayscale, and
line-art images. Image Assistant provides a broad range of feature-rich
tools and capabilities for sophisticated image control by experienced
users. For casual users, the Assist Mode provides the most commonly
used features in a simplified format.

For more information about Image Assistant, see the Image Assistant
tutorials booklet and the on-line documentation in the Image Assistant
Help program.

The Settings Menu

The Settings menu lets you modify and set system-wide settings.
Settings menu commands include:

Settings Panel...
Select Scanner...
Select Languages...
Edit Training File...
Edit Zone Contents File...
Edit User Dictionary...

OmniPage retains the most recently selected system settings. For
example, if you select Spanish as the language, OmniPage will use the
Spanish character set for recognition until you change it.

Settings Panel...

Choose Settings Panel... to open the Settings Panel. This command is
also available as a button in the toolbar.

The Settings Panel is the central location for settings OmniPage uses
to process your documents.

Using the icons in the scroll box on the left side of the Settings
Panel, you can access six different sets of options.

Click the Scanner icon to select options that control how your scanner
scans a page.

Click the Zones icon to select the zoning option that determines the
flow of text during recognition.

Click the OCR icon to select input and output options that assist
OmniPage during recognition and determine the format of the recognized
document.

Click the Fonts icon to select retained or ignored font format options.

Click the Spelling icon to select dictionaries and spell checking
options.

Click the Preferences icon to select options that customize general
OmniPage operations.

The Settings Panel changes to reflect the options of the icon that you
click. For more information about these options, see Chapter 4, The
Settings Panel. For a guided tour of the Settings Panel, see Touring
the Settings Panel on page 2-15.

Select Scanner...

Choose Select Scanner... to set the current system scanner.

[Figure: Select Scanner dialog box listing the available scanners,
including Abaton, Apple, Agfa, Brother, and Canon models]

To select a scanner:

Scroll through the list of available scanners and click the preferred
scanner.

Click OK to set the chosen scanner.

Click Cancel to exit without setting the scanner.

Certain scanners require additional parameters such as port address and
speed. For these particular scanners, you will be prompted for the
appropriate information. See your scanner documentation for more
information.

Select Languages...

Choose Select Languages... to select one or more language character
sets for text recognition.

OmniPage can recognize additional characters (such as circumflexes,
umlauts, etc.) unique to a particular language; eleven languages are
available. You may select more than one language at a time, but for
faster recognition, use only the minimum number of languages that are
necessary.
Select the language that matches the language of your main dictionary
selection. Also, be sure to select only one language if you use the
Language Analyst or 3D OCR.

To select one or more languages:

Select the preferred language(s) from the list box by clicking once.

Selected languages are highlighted.

To deselect a language, click it again.

Click OK.

Click Cancel to exit without setting the selected language(s).

Edit Training File... (Professional version only)

Choose Edit Training File... to edit an existing character training
file.

A character training file (*.trn) is a set of pre-recognized text
characters that OmniPage compares with the characters in the page image
during recognition. Training files assist OmniPage during text
recognition and allow better recognition accuracy of special
characters. Before recognizing an image, you can create a new training
file or choose an existing one in the Settings Panel OCR options.

When you choose Edit Training File..., a dialog box appears listing all
training files in the Data directory. Click the file you want to edit
and then click OK.

The Train Character dialog box shows the existing characters in the
training file, including the original images and the associated
characters.

[Figure: Train Character dialog box showing original images and their
associated characters, such as the copyright, paragraph, and trademark
symbols]

See Documents with Specialized Characters (Professional version only)
on page 2-50 for detailed information on how to create and edit a
training file.

Specify

Select a character and click Specify (or double-click the character) to
open the Specify Character dialog box.

"at" is typed in so that OmniPage will

You can change the character(s) associated with the selected image
bitmap of the character. Type in the desired character(s) in the
Character edit box or select a character in the Extended ANSI list box.
Click OK to mark the character to be saved.

Delete

To discard a trained character from the training file, select it and
click Delete.

Append

Click Append to add the current set of trained characters to another
character training file. A dialog box appears displaying a list of
existing character training files.

Select the file you wish to append and click OK.

Save

Click Save to save edits to the trained character file.

To edit a character file:

Select the character file you want to edit from the dialog box and
click OK, or double-click the file.

Edit the characters in the Train Character dialog box as desired.

Click Save to save the edited training file and return to the OmniPage
screen.

Click Append to add the trained characters to another file.

Click Cancel to exit without saving the edits to the training file.

Edit Zone Contents File...

Choose Edit Zone Contents File... to create a new zone contents file or
edit an existing file.

A zone contents file (*.zcn) lets you identify the specific characters
that OmniPage looks for within specified zones during recognition.
OmniPage is shipped with numeric, graphic, and alphanumeric zone
contents files. All zone contents files appear in the Zone Contents
drop-down list in the zone window.

When you draw zones manually in a page image, you can improve the
quality and accuracy of recognition by identifying each zone's
contents. For example, if you have a paragraph of alphanumeric text
followed by a numeric table, you can draw separate zones and assign an
alphanumeric zone contents file to the paragraph and a numeric zone
contents file to the table.

When you choose Edit Zone Contents File..., a dialog box appears that
lists all zone contents files in the Data directory. You can select an
existing file to edit or create a new one.

To create or edit a zone contents file:

To edit an existing file, click the file name in the File list box and
then click OK.

To create a new file, click New.

A dialog box appears containing a list box of the extended ANSI
character set and an edit box containing the characters in the zone
contents file. If you selected New, the edit box contains the
94-character (typical keyboard) ASCII character set.

Edit the contents of the edit box by typing in characters you want to
add to the file and deleting (with your Backspace or Delete keys)
undesired characters.

To add a character from the extended ANSI character set, double-click
the character in the list box; the ANSI character appears in the edit
box.

Click Reset to replace the contents of the edit box with the ASCII
character set.

Click Save to save your changes.

A dialog box prompts you to name the file if it is new.

Click Cancel to exit without saving any changes.
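
The effect of a zone contents file can be pictured as a character whitelist. The sketch below is illustrative only: it assumes the 94-character set is the printable ASCII range without the space (codes 33 through 126), and the NUMERIC set and function name are hypothetical, not the contents of the files shipped with OmniPage.

```python
# Assumption: the 94 "typical keyboard" characters are ASCII codes 33-126.
DEFAULT_ZONE_CONTENTS = "".join(chr(code) for code in range(33, 127))

# Hypothetical numeric set; the shipped numeric file may differ.
NUMERIC = "0123456789.,-$%"


def allowed_in_zone(text: str, zone_contents: str) -> bool:
    """Return True if every non-whitespace character is in the zone's set."""
    allowed = set(zone_contents)
    return all(ch in allowed for ch in text if not ch.isspace())
```

With a numeric set, a reading such as "l,25O.OO" (letters mistaken for digits) is not allowed, while "1,250.00" is. This is the kind of confusion that restricting a zone's contents is meant to reduce.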

Edit User Dictionary...

Choose Edit User Dictionary... to create a new user dictionary (*.ud)
or edit an existing one.

A dialog box appears listing all user dictionary files in the Data
directory. To edit an existing dictionary, select it and click OK. To
create a new dictionary, click New; you are prompted to enter a name.

Whether you are creating a new dictionary or editing an existing one,
the Edit User Dictionary dialog box appears.

If you are editing an existing user dictionary, the dialog box lists
all the words currently in that dictionary. If you are creating a new
dictionary, no words are listed.

Use the buttons in the Edit User Dictionary dialog box to create or
edit your dictionary.

Add

Click this to add the word that you type in the User word edit box to
your dictionary. The word will appear in the list box.

Delete

Click this to delete a selected word from the dictionary.

Purge

Click this to delete all words from the dictionary.

Import...

Click this to add words from another application to your user
dictionary. For example, you may want to add technical terms from a
particular file.

The Import Text File dialog box prompts you to enter the file name and
directory of the file you want to import. An imported text file can be
any document or word list in ASCII format. Most word processors can
convert a file into ASCII format; see your program's documentation.

OmniPage will go through the selected text file, discard words already
in the main or other user dictionaries, and add the remaining words to
your current user dictionary.
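
The import step described above amounts to a filter over the words in the text file. The following sketch shows the idea under stated assumptions: the function name is hypothetical, words are split on letters and apostrophes, and comparison is case-insensitive. OmniPage's actual word-splitting and matching rules are not documented here.

```python
import re


def words_to_import(text, main_dictionary, other_user_dictionaries):
    """Return the words from 'text' that no existing dictionary contains.

    Hypothetical helper. Assumptions: words are runs of letters and
    apostrophes, and matching ignores case.
    """
    known = {word.lower() for word in main_dictionary}
    for dictionary in other_user_dictionaries:
        known.update(word.lower() for word in dictionary)

    added = []
    seen = set()
    for word in re.findall(r"[A-Za-z']+", text):
        key = word.lower()
        if key not in known and key not in seen:
            seen.add(key)       # add each new word only once
            added.append(word)
    return added
```

Only the words returned by such a filter would end up in the current user dictionary; everything already known is discarded.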

Export...

Click this to save your user dictionary as a text file. The Export To
Text File dialog box prompts you to enter a file name and destination
for your file.

Save As...

Click this to save an edited dictionary with a new name. User
dictionaries are automatically saved with a .ud file extension.

Done

Click this to save edits to your dictionary and then exit.

The Window Menu

The Window menu provides options for viewing the OmniPage screen and
your document. Window menu commands include:

Tile Horizontal
Tile Vertical
Cascade
Arrange Icons
Hide/Show Toolbar
Hide/Show Status Bar
Hide/Show Ruler
Zone Window
Text Window
Zoom In
Zoom Out

Tile Horizontal

Choose Tile Horizontal to resize the open zone and text windows so they
fit in the window area horizontally. To switch windows, click the
window that you want to activate.

Tile Vertical

Choose Tile Vertical to resize the open zone and text windows so they
fit in the window area vertically. To switch windows, click the window
that you want to activate.

Cascade

Choose Cascade to arrange the open zone and text windows one on top of
the other with title bars showing. To switch windows, click the title
bar of the window you want to activate.

Arrange Icons

Choose Arrange Icons to organize minimized window icons at the bottom
of the screen. Click the Minimize button in the upper-right corner of
the window to iconize the open zone and text windows.

Hide/Show Toolbar

Choose Hide Toolbar to hide the toolbar. Choose Show Toolbar to view
the toolbar again.

Hide/Show Status Bar

Choose Hide Status Bar to hide the status bar located at the bottom of
the window. Choose Show Status Bar to view the status bar again.

Hide/Show Ruler

Choose Hide Ruler to hide the text window ruler. Choose Show Ruler to
view the ruler again.

Zone Window

Choose Zone Window to bring the zone window into view.

Text Window

Choose Text Window to bring the text window into view.

Zoom In

Choose Zoom In to enlarge an area of an image in the zone window for a
close-up view.

When an image is opened, it is fit to the zone window. You can zoom in
three more levels.

Zoom Out

Choose Zoom Out to decrease an enlarged view of an image in the zone
window.

The Help Menu

The Help menu provides access to the OmniPage online Help program and
information about OmniPage. Help menu commands include:

Contents

Procedures

Using Help

About...

Contents

Choose Contents for a list of the topics available in the OmniPage Help
program. The Help program conforms to the Windows Help standard.

Procedures

Choose Procedures for a Help listing of procedures for different
OmniPage tasks.

Using Help

Choose Using Help for instructions on using the Help program.

About...

Choose About... to see information about the current OmniPage version
you are using, any copyrights in effect, the program's licensee,
company name, and serial number.

Chapter 4
The Settings Panel

This chapter explains how to use the Settings Panel: the central
location for settings OmniPage uses to process your documents.

The Settings Panel includes:

Scanner options

Zones options

OCR options

Fonts options

Spelling options

Preferences options

You should make sure that the Settings Panel options are set
appropriately for your document before you begin any OmniPage
operation. Some options are only available with the OmniPage
Professional version; these are marked as "Professional version only."

For a guided tour of the Settings Panel, see Touring the Settings Panel
on page 2-15.

Settings Panel Overview

To open the Settings Panel, choose Settings Panel... in the Settings
menu or click the Settings Panel button in the toolbar.

Using the icons in the scroll box on the left side of the Settings
Panel, you can access six different sets of options.

Click the Scanner icon to select options that control how your scanner
scans a page and the way an image file is loaded.

Click the Zones icon to select the zoning option that determines the
flow of text during recognition.

Click the OCR icon to select input and output options that assist
OmniPage during recognition and determine the format of the recognized
document.

Click the Fonts icon to select font format options for retaining or
ignoring the original font styles.

Click the Spelling icon to select dictionaries and spell checking
options.

Click the Preferences icon to select options that customize general
OmniPage operations.

The Settings Panel changes to reflect the options of the icon that you
click. You can select options and then click Close or leave the
Settings Panel open as a floating window. The options selected last are
retained until you select new ones.

Selecting Settings Panel Options

There are three ways to select Settings Panel options:

Manual selection

Click each Settings Panel icon and select options manually. You can
change your selections at any time.

Use Defaults button

Click the Use Defaults button in the Settings Panel to reset all the
Settings Panel options to the default values.

Load Settings command

Choose Load Settings... in the File menu to select a previously saved
settings file (*.set). A loaded settings file automatically sets the
Settings Panel options and language selection(s) to preselected values.

You can save Settings Panel selections to a settings file by choosing
Save Settings... from the File menu. Disk space is the only limit for
the number of settings files you can save.

Scanner Options

Click the Scanner icon in the Settings Panel to select options that
control the way your scanner scans a page.

Use your right mouse button to click the Image button in the
toolbar and automatically open the Settings Panel to Scanner options.

Select Page options to describe page size and orientation.

Size

The Size drop-down list box lets you select the dimensions of the pages
you are scanning.

Select Letter for 8.5" by 11" size pages.

Select Legal for 8.5" by 14" size pages.

Select A4 for 21 cm by 29.7 cm European-size pages.

Orientation

The Orientation drop-down list box lets you select the orientation of
the pages you are scanning or page images you are loading. If you are
scanning, be sure to position the pages correctly in the scanner.

Select Portrait for a vertically-oriented page.

Select Landscape for a horizontally-oriented page.

Select Flipped to automatically rotate a portrait page 180 degrees
during the scan.

Select Flipscape to automatically rotate a landscape page 180 degrees
during the scan.

Flipped and Flipscape options are useful if you are scanning pages in
a book and have trouble positioning the book in the scanner for certain
pages.

You can select Scan until Empty and Double-sided Pages for automatic
processing if you are using a scanner with an automatic document feeder
(ADF).

Scan until Empty

Select this to scan every page in the ADF when OmniPage performs
automatic processing.

For example, if you put multiple pages in the ADF and click the AUTO
button, the first page will be scanned and then processed according to
the selected zone and OCR commands. The next page will then be scanned
and processed in the same manner. This process will continue until the
ADF is empty.

If you do not select Scan until Empty, OmniPage will only scan the
first page in the ADF and you will need to click the AUTO button again
to process the next page.

Double-sided Pages

Select this when OmniPage performs automatic processing to scan pages
that are printed on both sides. OmniPage will process the batch of
pages in the ADF and then prompt you to turn the entire batch over to
process the reverse sides.

For example, if you have three double-sided pages numbered 1 through 6
(1 is on the front, 2 is on the back, and so on), OmniPage first
processes pages 1, 3, and 5 and then prompts you to turn the batch over
in the ADF. It then processes pages 6, 4, and 2. The resulting file
consists of pages 1, 2, 3, 4, 5, and 6 in the correct order.
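
The reordering in this example can be expressed as a short routine. This is a sketch of the page arithmetic only, with a hypothetical function name; it assumes the batch is turned over as a whole, so the second pass arrives in reverse order, as in the example above.

```python
def collate_double_sided(front_pass, back_pass):
    """Interleave two scanning passes into reading order.

    front_pass: pages in the order scanned on the first pass, e.g. [1, 3, 5]
    back_pass:  pages in the order scanned after turning the batch over,
                e.g. [6, 4, 2]
    """
    if len(front_pass) != len(back_pass):
        raise ValueError("both passes must contain the same number of pages")

    ordered = []
    # The turned-over batch comes out last sheet first, so reverse it
    # before pairing each front side with its back side.
    for front, back in zip(front_pass, reversed(back_pass)):
        ordered.extend([front, back])
    return ordered
```

Calling collate_double_sided([1, 3, 5], [6, 4, 2]) returns [1, 2, 3, 4, 5, 6], the order of the resulting file described above.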

You can divide a large batch of pages into several sections for
processing. For example, if you divide a large batch into two sections,
OmniPage would process one side of the first section and then the
reverse side. It repeats this procedure with the second batch of pages
and then appends them to pages of the first batch in the appropriate
order. If you want to later save each batch as a separate file, insert
blank pages as separators.

If you use a flatbed scanner without an ADF, do not select
Double-sided Pages. Place the pages in the scanner in the order that
you want them to be scanned.

Select a brightness setting for scanning your page. Use these options to
account for variations in paper and print quality in much the same way
you would adjust brightness on a copier. Depending on the quality of
your page, the option you choose greatly affects recognition accuracy.

You can select:

3D OCR with AnyPage/HP AccuPage 2

Auto Brightness with AnyPage/HP AccuPage 2

Manual Brightness.

The option that appears, AnyPage or HP AccuPage 2, depends on your
scanner. HP AccuPage 2 is available with HP IIp, IIc, and IIcx
scanners. AnyPage is available with all other supported grayscale
scanners.

3D OCR with AnyPage/HP AccuPage 2 (Professional version only)

Select this to combine 3D OCR and AnyPage/HP AccuPage 2 technologies to
get the best scanned image and highest recognition accuracy possible.
This option is only available with supported grayscale scanners.
AnyPage/HP AccuPage 2 technology automatically adjusts an image to get
the optimum brightness level for each area of text and graphics on a
page. 3D OCR uses the grayscale information of a page during
recognition to view characters clearly and completely. This combination
of technologies delivers OmniPage's greatest accuracy possible.

Use 3D OCR and AnyPage/HP AccuPage 2 for all kinds of pages whenever
you want the best possible recognition results. This setting is
especially useful when you scan poor quality pages, pages with very
small type, or pages with text on colored or shaded backgrounds.

3D OCR and AnyPage/HP AccuPage 2 is slower than other settings. If you
scan high-quality documents with crisp text on a white background,
select Manual Brightness for the fastest results. 3D OCR adds 150 to
250K per page to the size of the image file. If you are scanning many
pages, you may want to use the Auto Brightness with AnyPage/HP AccuPage
2 or Manual Brightness setting to save disk space.
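
To judge whether disk space is a concern, you can estimate the extra storage from the per-page figures above. The helper below is a simple illustration with a hypothetical name; it treats "K" as kilobytes and uses only the 150 to 250K range stated in this section.

```python
def extra_3d_ocr_kb(pages, low_kb=150, high_kb=250):
    """Return the (minimum, maximum) extra image size in KB for 'pages' pages."""
    if pages < 0:
        raise ValueError("pages must not be negative")
    return pages * low_kb, pages * high_kb
```

For a 40-page document, extra_3d_ocr_kb(40) returns (6000, 10000), that is, roughly 6 to 10 megabytes of additional image data.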

Auto Brightness with AnyPage/HP AccuPage 2

Select this to use AnyPage/HP AccuPage 2 technology to get high-quality
scanned images and high recognition accuracy. This option is only
available with supported grayscale scanners.

AnyPage/HP AccuPage 2 technology automatically adjusts an image to get
the optimum brightness level for each area of text and graphics on a
page. This setting works well for most pages and is especially useful
when you scan text on colored or shaded backgrounds.

Auto Brightness with AnyPage/HP AccuPage 2 is slower than a manual
setting. If you scan high-quality documents with crisp text on a white
background, select a Manual Brightness setting for the fastest results.

Manual Brightness

Select this to manually adjust (lighten or darken) the brightness
setting for scanning a page. The setting you choose is applied to the
entire page area.

Manual Brightness is the fastest setting if you scan high-quality
documents with crisp text on a white background. However, recognition
accuracy is highest using AnyPage/HP AccuPage 2 and 3D OCR
technologies.

To adjust brightness, select the square in the slide, hold the mouse
button down, and move the square to lighten or darken the setting. Or,
click the left or right arrow on the slide. The
number of settings available depends on the scanner you use.

Use a setting in the middle to scan high-quality documents with crisp
text on a white background.

Use a darker setting for a page that has thin, broken characters.

Use a lighter setting for a page that has thick, run-together
characters.

The number in the edit box to the right of the slide quantifies the
brightness level you select. Use this number as a reference for future
documents.

To evaluate the effectiveness of the brightness setting, watch the
Character Window as OmniPage performs text recognition. Look for clear,
legible text samples.

Small Text

If your scanner supports HP AccuPage 2, you can select the
Small Text option for better recognition of text with small point
sizes.

Select this option to increase recognition accuracy if the text in your
page image is between four and seven points.

The Small Text option slightly increases processing time.

Zones Options

Click the Zones icon in the Settings Panel to select the zoning
method that determines the flow of text during recognition.

Use your right mouse button to click the Zone button in the toolbar
and automatically open the Settings Panel to Zones options.

Regardless of how zones are created on a page image (manually,
automatically or with a template), OmniPage uses the selected Zones
option in the Settings Panel to determine the flow of text within each
zone. OmniPage also uses the selected Zones option to draw and order
zones on the page image when you choose the Auto Zones command in the
drop-down list under the Zone button in the toolbar.

For practical examples of choosing the best zoning method for various
types of documents, please see Chapter 2, Tutorials.

Multiple Columns

Select Multiple Columns if you want OmniPage to discern the column
layout, determine the order of text, and distinguish graphics from
text. This works well with most types of documents and is especially
useful for newspaper articles and magazine pages.

Using this method, OmniPage looks for regular vertical separations of
text to define columns and then recognizes column-wide text zones. It
starts at the top of the first column, moves to the bottom, then
continues to the top of the next column, "snaking" throughout the text.
Unless you have the True Page feature (Professional version only)
selected, the resulting recognized document displays the text in one
column from beginning to end with any retained graphics at the bottom.
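
The column-by-column reading order described above can be sketched in a few lines of Python. This is only an illustration, not OmniPage's actual zoning logic: the Zone type, its coordinate fields, and the assumption that zones in the same column share the same left edge are all hypothetical.

```python
from typing import NamedTuple


class Zone(NamedTuple):
    left: int   # x position of the zone's left edge (hypothetical units)
    top: int    # y position of the zone's top edge
    text: str


def snake_order(zones):
    """Order zones column by column, top to bottom within each column."""
    # Assumption: zones in the same column have the same left edge.
    return sorted(zones, key=lambda z: (z.left, z.top))


zones = [
    Zone(300, 10, "C"),
    Zone(0, 200, "B"),
    Zone(0, 10, "A"),
    Zone(300, 200, "D"),
]
print(" ".join(z.text for z in snake_order(zones)))
```

Sorting on the left edge first and the top edge second is what produces the "snaking" effect: the whole first column is read before the top of the second.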

Select Retain Graphics in the Settings Panel OCR options when you
select Multiple Columns; otherwise, graphics are discarded.

Single Column or Table

Select Single Column or Table if you want OmniPage to treat the entire
page area as one column. This works well with documents such as
spreadsheets, tables, financial forms, and memos.

Using this method, OmniPage starts at the top of the page and moves to
the bottom, outlining page-wide text zones. If OmniPage detects five or
more spaces between columns, it assumes the page is in a spreadsheet
format and inserts tabs as delimiters between the columns to preserve
the format.
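
As a rough sketch of the tab rule described above, the following Python treats any run of five or more spaces as a column boundary and replaces it with a single tab. The function name and the sample row are made up for illustration; OmniPage's own detection works on the page image and is certainly more involved.

```python
import re

# Assumption: a run of five or more spaces marks a column boundary,
# mirroring the threshold the manual describes.
COLUMN_GAP = re.compile(r" {5,}")


def spaces_to_tabs(line: str) -> str:
    """Replace each run of five or more spaces with a single tab."""
    return COLUMN_GAP.sub("\t", line)


# Hypothetical spreadsheet row with five spaces between columns.
row = "Widgets" + " " * 5 + "120" + " " * 5 + "4.50"
print(spaces_to_tabs(row).split("\t"))
```

Runs of four or fewer spaces are left alone, so ordinary word spacing inside a cell is not split.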

You must draw zones manually, identify graphics with the Graphic zone
contents file, and select Retain Graphics in the Settings Panel OCR
options when you select Single Column or Table; otherwise, graphics are
discarded.

None

Select None if you want OmniPage to recognize the entire page area as a
single text zone.

Using this method, OmniPage does not discern column layout or
distinguish graphics from text. It tries to recognize everything it
sees on the page as text elements.

None is the fastest option to use when you recognize manually drawn,
text-only zones. It can also be useful for documents with very small
text areas such as those found in pleading pages or telephone book
pages.

OCR Options

Click the OCR icon in the Settings Panel to select input and output
options that assist OmniPage during recognition and determine the
format of the recognized document.

Use your right mouse button to click the OCR button in the toolbar and
automatically open the Settings Panel to OCR options.

Input Options

Input options determine the way OmniPage looks at text elements during
recognition. You can specify a character type, select a training file
(Professional version only), and have the page orientation
automatically corrected.

Character Type

The Character Type drop-down list box lets you identify the printed
text characteristics in your document.

Select Automatic to have OmniPage automatically distinguish between
conventional and dot matrix printed text characters in the image you
are recognizing.

Select Normal if the image you are recognizing has conventionally
printed text characters.

Select Dot Matrix if the image you are recognizing has characters
printed in draft mode by a 9-pin dot-matrix printer. Do not select Dot
Matrix for pages printed in near-letter-quality mode or printed by a
24-pin dot-matrix printer.

Select OCR-A if all the characters in the image you are recognizing are
printed in OCR-A font. OCR-A is a special font used for items such as
part numbers and utility bills.

If your document contains a mixture of OCR-A and a conventional font,
select Normal or Automatic for faster recognition.

Training File (Professional version only)

The Training File drop-down list box lets you select a character
training file (*.trn) that assists OmniPage with text recognition of
special characters. Any training files that you create appear in this
list.

A character training file is a set of pre-recognized text characters
that OmniPage compares with the characters in the page image during
recognition. Before recognition, you can create a new training file or
choose an existing one to assist with OCR.

For more information on creating a character training file, please see
Train OCR (Professional version only) on page 2-52.

Automatically Correct Page Orientation

Select this to correct an improperly oriented image by 90, 180, or 270
degrees during text recognition.

For example, if you have a portrait page that was accidentally scanned
upside-down, OmniPage will try to rotate it 180 degrees during
recognition so that it is properly oriented in the text window.

Use Language Analyst

Select Use Language Analyst so that OmniPage automatically performs
word and character analysis during the recognition process to check
spelling and replace unknown words with words that are most likely to
be correct.

The Language Analyst uses information about language context and usage
rules to evaluate words, compute likely errors, and determine
replacement words. Replacement words appear in blue in the recognized
document.

Be sure to select the appropriate main and user dictionaries when
you use the Language Analyst. You should also make sure that the
appropriate language is selected.

If any words in your document such as company-specific terms are
replaced inappropriately during recognition, you can:

Make sure Ignore Acronyms, Ignore Abbreviations, and Ignore Proper
Nouns are selected in the Settings Panel Spelling options so that these
types of words will not be replaced. Then re-recognize the document.

Create your own user dictionary for special terms and select it as the
user dictionary in the Settings Panel Spelling options. Then
re-recognize the document.

Please see Edit User Dictionary... on page 2-68 for more information on
user dictionaries.

Deselect Use Language Analyst and re-recognize the document. Use the
Check Recognition command in the Edit menu to check for spelling errors
and unknown words.

Retain Graphics

Select Retain Graphics if you want OmniPage to retain original graphics
such as photographs or diagrams in the recognized document.

To retain graphics, make sure to also:

Select Save Page Images in Caere Document in the Settings Panel
Preferences options before you scan or load an image.

Select Multiple Columns as your Zones option so OmniPage will
automatically distinguish graphics from text.

Or, if you select Single Column or Table as your Zones option, create
zones manually and identify graphics with the Graphic zone contents
file.

Select the True Page - Retain All Page Formatting OCR output option
(Professional version only) to keep graphics in the same position in
the recognized page that they were in the original page. If you do not
use True Page, retained graphics are placed at the bottom of the
recognized page.

OmniPage Professional users can edit and save graphics in the Image
Assistant 24-bit color and image-editing program. Double-click the
graphic in your recognized document to launch Image Assistant; the
graphic will appear in a new image window.

You can scan and edit color, grayscale, and line-art images with Image
Assistant. For more information, see the Image Assistant Tutorial and
the online documentation in the Image Assistant Help program.

Output Options

Output options determine the way text, graphics, and formatting will
appear in the recognized document. You can select:

True Page - Retain All Page Formatting

Retain Font and Paragraph Formatting Only

Ignore Fonts and All Formatting.

Files saved in ASCII or ANSI format do not retain any formatting
other than spaces and carriage returns.

True Page - Retain All Page Formatting (Professional version only)

Select this to reproduce the original page formatting and layout as
closely as possible in the recognized document.

True Page technology retains the original paragraph and font
formatting. It also preserves the layout of your original page by
creating "frames" around areas of text and graphics. These frames are
exported intact when you save your document in an appropriate file
format and open it in another application that supports frame-based
layouts.

Use the True Page setting if you want to duplicate a document, such as
a resume, as closely as possible and do not plan to do a lot of editing
to it after recognition. If you do plan to modify your recognized
document, choose Retain Font and Paragraph Formatting Only or Ignore
Fonts and All Formatting.

True Page attempts to reproduce the following page formatting
attributes:

Relative text column positioning

 Relative graphic positioning (you must select Retain Graphics in the
Settings Panel OCR options)

Margins
 Tabs
 Line Spacing
 Indentation
 Justification
 Blank vertical space
 Centered lines
 Font styles (select specific fonts in the Settings Panel Fonts
options)
 Font sizes
 Character attributes (boldface, italics, underline)

Retain Font and Paragraph Formatting Only

Select this to retain font and paragraph formatting in the recognized
document.

This option does not retain the original page layout; it formats
recognized text in a single column. If you also selected Retain
Graphics, any graphics in your document appear at the bottom of the
page.

This option attempts to reproduce the following page formatting
attributes:

 Margins
 Tabs
 Line Spacing
 Indentation
 Justification
 Blank vertical space
 Centered lines
 Font styles (select specific fonts in the Settings Panel Fonts
options)
 Font sizes
 Character attributes (boldface, italics, underline)

Ignore Fonts and All Formatting

Select this to ignore fonts and all formatting in the recognized
document and use a universal font and font size instead. Choose a font
and font size for recognized text in the Ignored Font Formats section
of the Settings Panel Fonts options.

This option does not retain the original page layout; it formats
recognized text in a single column. If you also selected Retain
Graphics, any graphics in your document appear at the bottom of the
page.

Fonts Options

Click the Fonts icon in the Settings Panel to select font format
options for retaining or ignoring the original font styles.

Retained Font Formats

You can select fonts to map to the various font styles in your document
if you choose True Page - Retain All Page Formatting (Professional
version only) or Retain Font and Paragraph
Formatting in the Settings Panel OCR options.

OmniPage will detect the font styles of characters during recognition.
Characters with a particular font style will be
formatted in the recognized document according to the font selected for
that style. For example, if you assign Arial to Serif Proportional font
styles, characters with Times New Roman (a Serif Proportional style
font) would be formatted with Arial font in the recognized document.

Use the drop-down list boxes to assign fonts for the following font
styles:

Serif Proportional

Character spacing varies depending on each character; short lines
finish off the letter strokes.

Sans-Serif Proportional

Character spacing varies depending on each character; letter strokes do
not have finishing lines.

Serif and Monospaced

Character spacing is the same for each character; short lines finish
off the letter strokes.

Sans-Serif and Monospaced

Character spacing is the same for each character; letter strokes do not
have finishing lines.
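
The font mapping described above amounts to a lookup from a detected font style to the font assigned to it. The sketch below is illustrative only: the FontStyle type is hypothetical, and every assignment except the manual's own example (Arial for Serif Proportional) is an arbitrary placeholder.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FontStyle:
    serif: bool        # True if letter strokes have finishing lines
    monospaced: bool   # True if every character has the same width


# Hypothetical assignments, as they might be chosen in the drop-down lists.
assignments = {
    FontStyle(serif=True, monospaced=False): "Arial",         # Serif Proportional
    FontStyle(serif=False, monospaced=False): "Arial",        # Sans-Serif Proportional
    FontStyle(serif=True, monospaced=True): "Courier New",    # Serif and Monospaced
    FontStyle(serif=False, monospaced=True): "Courier New",   # Sans-Serif and Monospaced
}


def output_font(detected: FontStyle) -> str:
    """Return the font assigned to the detected font style."""
    return assignments[detected]


# Times New Roman text is detected as Serif Proportional, so it comes out in Arial.
times_new_roman = FontStyle(serif=True, monospaced=False)
print(output_font(times_new_roman))
```

Note that the lookup keys on the style, not on the original typeface, which is why any serif proportional face in the page image ends up in the same output font.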

Ignored Font Formats

You must select a universal font and font size for recognized text in
your document if you choose Ignore Fonts and All Formatting in the
Settings Panel OCR options.

OmniPage will ignore the font styles of characters during recognition.
Instead, all of the characters will be formatted in the recognized
document according to the font and font size you select. Select a font
in the Font drop-down list box and type a font size in the Font Size
edit box.

Spelling Options

Click the Spelling icon in the Settings Panel to select dictionaries
and spell checking options.

Dictionaries

You can select one main dictionary and one user (personal) dictionary.
OmniPage uses the selected dictionaries for checking recognition and
the Language Analyst; be sure to always select the appropriate
dictionaries for your document.

Main Dictionary

Select a main dictionary in the Main Dictionary drop-down list box.
Main dictionaries have the file extension .ndx.

OmniPage is delivered with the United States English main dictionary,
useng.ndx, and the United Kingdom main dictionary, ukeng.ndx.
International versions of OmniPage also include dictionaries for other
languages. To order dictionaries for additional languages, call your
local Caere distributor or call Caere at (800) 535-SCAN.

User Dictionary

Select a user (personal) dictionary from the User Dictionary drop-down
list box. User dictionaries have the file extension .ud.

To create a user dictionary or edit an existing user dictionary, choose
Edit User Dictionary... from the Settings menu. For more information on
creating and editing a user dictionary, please see Edit User
Dictionary... on page 2-68.

Spell Checking Options

You can select the following spell checking options to be used by the
Language Analyst and the check recognition process:

Ignore Acronyms
Ignore Proper Nouns
Ignore Abbreviations

Ignore Acronyms

OmniPage will ignore entirely capitalized words of four characters or
less (for example, HUD, USDA).

Be sure to deselect Ignore Acronyms if you want the acronyms in your
User Dictionary to be checked or if you want to add acronyms to your
user dictionary.

Ignore Proper Nouns

OmniPage will ignore a word not beginning a sentence that has a
capitalized first letter (for example, in He saw Jane throw... OmniPage
ignores the name Jane).

Be sure to deselect Ignore Proper Nouns if you want the proper nouns in
your User Dictionary to be checked or if you want to add proper nouns
to your user dictionary.

Ignore Abbreviations

OmniPage will ignore a capitalized letter followed by three or fewer
lowercase letters and a period (for example, Mrs., Dr., etc.).

Be sure to deselect Ignore Abbreviations if you want the abbreviations
in your User Dictionary to be checked or if you want to add
abbreviations to your user dictionary.
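
The three Ignore rules above can be approximated with simple pattern checks. The sketch below is one reading of the descriptions in this section, not OmniPage's implementation; the function name, the starts_sentence flag, and the choice to require at least one lowercase letter in an abbreviation are assumptions.

```python
import re

# Entirely capitalized words of four characters or less, such as HUD or USDA.
ACRONYM = re.compile(r"[A-Z]{1,4}")
# A capital letter, then one to three lowercase letters and a period, such as Mrs. or Dr.
# Assumption: at least one lowercase letter is required.
ABBREVIATION = re.compile(r"[A-Z][a-z]{1,3}\.")


def is_ignored(word, starts_sentence,
               ignore_acronyms=True,
               ignore_proper_nouns=True,
               ignore_abbreviations=True):
    """Return True if the spell checker would skip this word."""
    if ignore_acronyms and ACRONYM.fullmatch(word):
        return True
    if ignore_abbreviations and ABBREVIATION.fullmatch(word):
        return True
    # A capitalized word that does not begin a sentence is treated as a proper noun.
    if ignore_proper_nouns and not starts_sentence and word[:1].isupper():
        return True
    return False


for word, first in [("HUD", False), ("Mrs.", False), ("Jane", False), ("Jane", True)]:
    print(word, first, is_ignored(word, first))
```

Switching an option off, for example ignore_acronyms=False, makes the matching words eligible for checking again, which is why the manual asks you to deselect an option before checking or adding such words in a user dictionary.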
Preferences Options

Click the Preferences icon in the Settings Panel to customize general
OmniPage operations.

Save Page Images in Caere Document

Select this option to save original page images in Caere Documents. An
image is the "picture" of text and/or graphics that appears in the zone
window when you scan a page or open a TIFF image file.

To save the page image, make sure Save Page Images in Caere
Document is selected before you scan or load a page image.

You can reopen a Caere Document in OmniPage, make edits to recognized
text, and save it in any other supported file format. However, you must
save the original page images in a Caere Document in order to:

 Retain graphics.

 Verify recognized text with its original image.

 Re-recognize pages.

 Defer recognition (Professional version only).

Saving a Caere Document without page images allows quicker processing
and saves disk space but does not allow any of the above operations.

Prompt Before Deleting Pages

Select this if you want OmniPage to prompt you before carrying out the
Delete Page command. This gives you the option to cancel the operation
before deleting a page.

Save Settings on Quit

Select this if you want to automatically save the current OmniPage
settings when you exit the program. The Settings Panel options,
language selection(s), and scanner selection will be retained until you
select new settings.

Reject Character

Unrecognizable characters are represented by a red reject character in
the recognized document. In the Reject Character edit box, type in any
character that you want to be the reject character. The default
character is a tilde (~).

For example, if OmniPage could not recognize the J in REJECT, and the
tilde (~) was the reject character, the string RE~ECT would appear in
your recognized document.
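
The example above can be reproduced with a short Python sketch. Representing an unrecognizable character as None is purely an assumption made for illustration, as is the function name.

```python
def render_recognized(chars, reject_char="~"):
    """Join recognized characters, substituting the reject character
    wherever a character could not be recognized (given here as None)."""
    return "".join(reject_char if c is None else c for c in chars)


# The J in REJECT could not be recognized.
print(render_recognized(["R", "E", None, "E", "C", "T"]))  # RE~ECT
```

Passing a different reject_char, such as "#", corresponds to typing a new character in the Reject Character edit box.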
Chapter 5

Editing Recognized Documents

The OmniPage editor is designed for quick and efficient editing of any
errors in your recognized document. It also has text editing and page
formatting capabilities. Additionally, if you are using OmniPage
Professional, you can use Image Assistant to edit graphics in your
recognized document.

Remember that OmniPage is designed to be used in conjunction with
word-processing and desktop publishing applications, not to replace
them. Extensive editing of your recognized document is more efficient
in the applications designed for that purpose. For example, you can
recognize a document, correct any errors, make some text and formatting
changes, and then save your document to another application to continue
working with it.

This chapter discusses the factors that influence the output of your
document, including:

 Choices Before OCR

 Editing Options After OCR

 Saving a Recognized Document

Choices Before OCR

The choices you make before performing OCR on your document have a
significant impact on the resulting text format, page format, and
accuracy. In particular, the following factors are important:

 OCR Output Options

Font Options

Retaining Graphics

Language Analyst

Language and Dictionary Selections

Output Options

The OCR output option that you select in the Settings Panel determines
the way text and paragraph formatting will appear in your recognized
document. You can select True Page - Retain All Page Formatting
(Professional version only), Retain Fonts and Paragraph Formatting, or
Ignore Fonts and All Formatting.

Regardless of the OCR output option you select, to retain graphics in
your recognized document, you must also select Retain Graphics.

True Page - Retain All Page Formatting (Professional version only)

Select True Page - Retain All Page Formatting as the OCR output option
if you want the recognized document to match the original page layout
as closely as possible.

Select True Page formatting only if you want to preserve the original
page layout. If you plan to do extensive editing, such as adding
additional paragraphs, you should select another OCR output option.

True Page retains font characteristics, paragraph formatting, and the
relative positioning of columns to match the original page layout. If
you also selected Retain Graphics, graphics will appear in the same
position as they were in the original page.

Retain Fonts and Paragraph Formatting

Select Retain Fonts and Paragraph Formatting as the OCR output option
if you want your recognized document to retain the font characteristics
and paragraph formatting of the original image.

This option does not retain the original page layout; it formats
recognized text in a single column. If you also selected Retain
Graphics, any graphics in your document will appear at the bottom of
the recognized page.

Ignore Fonts and All Formatting

Select Ignore Fonts and All Formatting as the OCR output option if you
plan to do a lot of editing or reformatting of the text in your
recognized document. OmniPage will remove the font characteristics and
paragraph formatting. Recognized text will appear in the font that you
select for Ignored Font Formats in the Settings Panel Fonts options.

This option does not retain the original page layout; it formats
recognized text in a single column. If you also selected Retain
Graphics, any graphics in your document will appear at the bottom of
the recognized page.

Font Options

The font choices that you make in the Settings Panel determine the
appearance of text in your recognized document.

If you select True Page or Retain Fonts and Paragraph Formatting,
select fonts to map to the original page's font styles.

Depending on the OCR output option you select in the Settings Panel,
you will select either Retained Font Formats or Ignored Font Formats in
the Settings Panel Fonts options.

The drop-down menus display all of the fonts installed on your system.

If you select Ignore Fonts and All Formatting, select a universal font
and font size for all text.

Retaining Graphics

OmniPage can retain graphics in the original page image, such as photos
or diagrams, and display them in your recognized document. To do so,
select Retain Graphics in the Settings Panel OCR options before
recognition.

OmniPage Professional users can select True Page as the OCR output
option so that graphics appear in their original position. Otherwise,
graphics appear at the end of the recognized page.

To retain graphics, make sure to also:

Select Save Page Images in Caere Document in the Settings Panel
Preferences options before you scan or load an image.

Select Multiple Columns as your Zones option so OmniPage can
distinguish graphics from text. Or, if you select Single Column or
Table, create manual zones around graphics and identify their zone
contents with the Graphic zone contents file.

After recognition, OmniPage Professional users can edit retained
graphics by launching Image Assistant directly from OmniPage. Simply
double-click the graphic in your recognized document and Image
Assistant launches with the graphic in a new image window. For more
information about Image Assistant, please read the Image Assistant
tutorials booklet and refer to its online Help program.

You can save retained graphics individually or with the entire page
image. For more information about saving graphics, see Export Image...
on page 2-19.

Language Analyst

The Language Analyst uses information about language context and usage
rules to evaluate characters and words during the recognition process.
This method returns more accurate results than simply spell-checking
after recognition is complete.

Select Use Language Analyst in the Settings Panel OCR options.

Make sure to select the appropriate language, main dictionary, and user
dictionary for the document you are recognizing.

The Language Analyst uses the dictionaries to analyze and correct text
during recognition. Words corrected by the Language Analyst appear in
blue in your recognized document. The Language Analyst shuts itself off
automatically when it detects that the dictionary information is not
improving recognition results. For example, if the main dictionary does
not match the primary language of your document, language analysis will
terminate.

If your original is very clean with crisp text, you may want to
deselect the Language Analyst to increase recognition speed.

Languages and Dictionaries

For the best recognition results, be sure to select the appropriate
language for your document and select main and user dictionaries
specific to that language.

Languages

OmniPage supplies the appropriate characters (such as circumflexes,
umlauts, etc.) for recognizing the following languages:

English
German
French
Italian
Dutch
Spanish
Swedish/Finnish
Portuguese
Danish
Norwegian
Irish/Gaelic

Select one or more language character sets using the Select Language...
command in the Settings menu.

For fastest recognition, use only the minimum number of languages that
are necessary. You should select only one language if you use the
Language Analyst or 3D OCR (Professional version only).

See Foreign-Language and Multilingual Documents on page 256 for an
explanation of how to recognize foreign-language and multilingual
documents.

Dictionaries

Select the appropriate main and user dictionaries for your document in
the Settings Panel Spelling options.

The Worldwide English version of OmniPage is delivered with the United
States English main dictionary, useng.ndx, and the United Kingdom main
dictionary, ukeng.ndx. To order dictionaries for additional languages,
call your local Caere distributor or call Caere at (800) 535-SCAN.

You can create your own user dictionaries. To create a new user
dictionary, follow these steps:

1 Choose Edit User Dictionary... in the Settings menu.

The Select File dialog box opens.

2 Click New.

The File to Save dialog box opens.

3 Type in a name of eight characters or less for the new dictionary and
click OK.

For example, if you were creating a French user dictionary, you might
type fruser. OmniPage automatically appends a .ud extension.

The Edit User Dictionary dialog box for the new dictionary opens.

You can add words to the dictionary directly or import words from a
text file:

 Click Add to add the word that you type in the User word edit box to
your dictionary. The word will appear in the list box.

 Click Import... to add words from another application to your user
dictionary.

A dialog box prompts you to enter the file name and directory of the
file you want to import; it can be any document or word list in ASCII
format. OmniPage will go through the selected text file, discard words
already in the main or other user dictionaries, and add the remaining
words to the new dictionary.
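
The import behavior described above is essentially a filter over the words in the text file. The following Python sketch shows the idea under stated assumptions: the function name, the use of sets for dictionaries, the case-insensitive comparison, and the simple word pattern are all hypothetical choices, not details of OmniPage.

```python
import re


def import_words(text, main_words, other_user_words, new_dictionary):
    """Add to new_dictionary every word in text that is not already
    in the main dictionary or in another user dictionary."""
    # Assumption: words are compared without regard to case.
    known = {w.lower() for w in main_words} | {w.lower() for w in other_user_words}
    for word in re.findall(r"[A-Za-z']+", text):
        if word.lower() not in known:
            new_dictionary.add(word)
    return new_dictionary


main = {"the", "scanner", "reads", "pages"}
result = import_words("The Caere scanner reads pages", main, set(), set())
print(sorted(result))  # ['Caere']
```

Only the company-specific term survives the filter, which matches the purpose of a user dictionary: it holds the words the main dictionary does not know.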

You can also add words to your user dictionary interactively using
the Check Recognition command after recognition.

Editing Options After OCR

After you recognize your document, it appears in the text window. At
this point, you can check recognition, verify the image, do some page
formatting and text editing, and save the document in the desired file
format.

Overview of the Text Window

You can use various editing tools in the text window to edit your
recognized document.

The text window toolbar includes line-spacing buttons, alignment
buttons, and character formatting buttons (bold, italics, underline).

Checking Recognition

The appearance of the text in your recognized document indicates the
overall results of recognition. Look for:

 Blue text: words corrected by the Language Analyst.

Green text: suspects, or questionable characters, which OmniPage made
an attempt to recognize.

Red text: reject characters (~ is the default) representing
unrecognizable characters.

Use the Check Recognition button or the Check Recognition... command in
the Edit menu to identify possible OCR errors and misspellings in the
recognized document.

OmniPage uses the currently selected main and user dictionaries to
check recognition. When OmniPage finds a possible error, the Check
Recognition dialog box shows the image bitmap of the error in the
context of the original page image.

To see character bitmaps, be sure Save Page Images in Caere Document is
selected in the Settings Panel Preferences options before you scan or
load an image.

The Change To drop-down list box provides a list of suggested
replacements for a word flagged as an error. You can:

Click Ignore to ignore the word in future instances.

Click Change to change the word as suggested.

Type in or select another word and click Change.

Click Add to add the word to the user dictionary.

After you choose an editing option for a word, OmniPage automatically
continues to find the next possible error.

Verifying the Image

You can compare text in your recognized document with its original
image in the Verification Window.

In order to verify images, be sure Save Page Images in Caere
Document is selected in the Settings Panel Preferences options before
you scan or load an image.

To see its original image, simply double-click a word in the text
window or choose Verify Image in the Edit menu. The Verification Window
opens displaying a clear close-up of the selected word and surrounding
area of text.

Click in the text window again to close the verification window.

You cannot verify the image of text that is cut and pasted from
one page to another.

Formatting the Page and Editing Text

OmniPage provides several tools to help you edit the text and page
format in your recognized document. If you plan to make a lot of
changes to the recognized text, however, it is generally more efficient
to do so in your word-processing or desktop publishing application.

True Page Formatting (Professional version only)

If you are using the OmniPage Professional version and selected True
Page - Retain All Page Formatting before recognition, the font
characteristics and page layout of your recognized document should
closely match the original image.

With True Page, OmniPage produces various text and graphic zones in the
recognized page which you can resize or reposition to change the page
layout. Choose the Select Recognized Zones command in the Edit menu to
select all of the text and graphic zones in the page. Handles appear on
each zone.

You can also select zones individually by placing the mouse pointer
inside a zone and doing an Alt-right mouse click.

To resize a selected text or graphic zone, use your mouse to drag a
handle in the direction that you want to enlarge or reduce the zone. To
move a selected zone to another area of the recognized page, place the
mouse pointer inside the zone, hold the mouse button down, and drag it
to the desired location.

Choose Select Recognized Zones in the Edit menu again to deselect the
zones in the recognized document.

Click in a selected zone and choose Delete Recognized Zone in the Edit
menu to delete that zone.

Paragraph Formatting

You can change the line spacing and alignment attributes of a selected
paragraph in your recognized document.

Paragraph formatting will not be retained if you save a file in ASCII
or ANSI format.

To apply paragraph formatting, follow these steps:

1 Place the cursor somewhere within the paragraph that you want to
format.

2 Choose Paragraph... in the Format menu.

3 Make the desired formatting selections in the Paragraph Format dialog
box.

You can select Single, Double, or Triple line spacing. For paragraph
alignment, you can select Left, Center, Right, or Justify.

4 Click OK to accept the formatting selections; the selected paragraph
will change accordingly.

Click Cancel to exit without applying the formatting selections.

You can also use the line-spacing and alignment buttons in the text
window for formatting shortcuts.

Tab Formatting

Use the Tab-setting buttons in the text window to insert tabs in your
recognized document. There are four Tab-setting buttons:

 Left-aligned tab button

 Center-aligned tab button

 Right-aligned tab button

 Decimal-aligned tab button

To apply tab formatting, follow these steps:

1 Select the paragraph in which you want to set tab stops.

2 Click the appropriate Tab-setting button (left, center, right, or
decimal).

3 Click the area in the upper-half of the ruler where you want to place
the tab stop.

4 Repeat steps 1 through 3 to continue setting tabs.

Character Formatting

You can change the attributes of a selected character or section of
text in a recognized document.

Character formatting will not be retained if you save a file in ASCII
or ANSI format.

To apply character formatting, follow these steps:

1 Position the text cursor at the start of the text, hold the mouse
button down, and drag the cursor across the text to highlight it.

2 Release the mouse button when you have selected the desired area of
text.

3 Choose Character... in the Format menu.

4 Make the desired formatting selections in the Font dialog box.

You can select multiple attributes for text, including font, font
style, size, and effects. The Sample box illustrates the attributes
that you select.

5 Click OK to accept the formatting selections.

The selected text changes accordingly.

Click Cancel to exit without applying the formatting selections.

You can also use the Bold, Italics, and Underline buttons in the text
window for convenient formatting shortcuts.

Other Useful Editing Commands

Choose Select All in Page in the Edit menu to select all the text in the text
window. This is an easy way to apply formatting changes universally. To
deselect a selected page, click anywhere in the text window or choose
Select All in Page again.

Use the Cut button or the Cut command in the Edit menu to cut selected
text in a recognized document.

Use the Copy button or the Copy command in the Edit menu to copy
selected text in a recognized document.

Use the Paste button or the Paste command in the Edit menu to paste cut
or copied text in a recognized document.

Use the Find/Replace button or the Find/Replace command in the Edit
menu to find and replace words in a recognized document.

For more information about these commands, see their respective menu
entries in Chapter 1, Commands and Settings.

Saving Your Document

Use the Save As... button or Save As... command in the File menu to
save your recognized document to the desired file format.

To save your recognized document in more than one file format, you can:

Save the file as a Caere Document (*.met).

By saving your document as a Caere Document, you can continue to reopen
it in OmniPage, make edits, and save it in any other supported file
format you wish. A Caere Document can have up to 255 pages; each page
can include the original image, zones, and recognized text.

Save the initially recognized document in each desired format using
Save As... while it is open in the text window. Remember, only a Caere
Document can be reopened (and resaved in a different format) in
OmniPage.

The way text appears when you open your recognized document in another
application depends on the features of the chosen file format and
application.

For example, if you save a page with text and graphics in ASCII format,
only the text will be displayed when you open the file in a new
application because ASCII format does not retain graphics. Likewise,
graphics are only displayed in applications that support that
capability.

Normal differences in typeface sizes between applications can result in
differences in the page formatting and display of the text. The
settings within the application, such as margins, also affect the page
layout.

If you use the True Page option (Professional version only), OmniPage
exports text in frames. If your application doesn't accept frames, the
text frames are not maintained in their original positions and the text
within the frames is displayed in one vertical column.

Chapter 6
Improving Performance

You can make OmniPage run faster and recognize text more accurately by
learning how to use a few different settings.

Speed up OmniPage by selecting Manual Brightness, by turning the
Language Analyst feature off, and by manually selecting zones for text
recognition.

Computing power is what affects speed the most. A 486 computer is
dramatically faster than a 386. Also, 8MB of system RAM is a minimum;
as with most CPU-intensive programs, more memory is better.

With the Professional version, improve text-recognition accuracy by
selecting 3D OCR with AnyPage/HP AccuPage 2. With the non-Professional
version, select Auto Brightness with AnyPage/HP AccuPage 2.

The Language Analyst feature improves accuracy considerably, as does
taking into account document quality, scanning angle, and paper
transparency.

Improving Speed

OmniPage is designed to run automatically, making text recognition easy
and effortless. However, the automatic features can take longer to
work. Using the Manual Brightness setting and turning off the Language
Analyst can make OmniPage run faster.

The 3D OCR with AnyPage/HP AccuPage 2 (Professional version only) and
Auto Brightness with AnyPage/HP AccuPage 2 features improve accuracy
considerably with a variety of documents. However, these automatic
features sacrifice speed to provide better accuracy.

The brightness settings are available in the Settings Panel scanner
options. To access this panel, choose Settings Panel in the Settings
menu or click on the Settings Panel button in the toolbar. Then click
on the Scanner icon in the left of the panel.

Manual Brightness

Use the Manual Brightness control in the Settings Panel Scanner options
if you're scanning high-quality printed documents with crisp, black
text printed on a white background.

If text characters on your document tend to be thick and overlapping,
adjust the brightness slide towards Lighten. If characters appear thin
and broken, adjust the setting towards Darken. If characters appear at
an angle, reposition the document in the scanner and rescan.

The Character Window appears while OmniPage performs text recognition.
It shows samples of the scanned image as OmniPage sees them.

The following figure shows how well-formed characters appear in the
Character Window. No special brightness adjustment is needed.

The following figure shows how thin, broken characters appear in the
Character Window. Try adjusting the brightness control toward Darken
and rescan.

The following figure shows how thick, run-together characters appear in
the Character Window. Try adjusting the brightness control toward
Lighten and rescan.

Language Analyst

The Language Analyst feature uses information about language context
and usage rules to evaluate characters, compute likely errors, and
determine replacement words. It improves text recognition on difficult
documents considerably.

However, as with the auto brightness features, if you scan high-quality
documents with crisp, black letters printed on white paper, recognition
is faster with the Language Analyst turned off.

The Use Language Analyst setting is available in the Settings Panel OCR
options. Choose Settings Panel in the Settings menu or click the
Settings Panel button in the toolbar. Then click the OCR icon in the
left side of the panel.

Zones

The Manual Zones feature lets you draw selection boxes around just the
parts of a page you want recognized. Using this feature, you don't need
to wait while OmniPage recognizes unnecessary text.

See Complex Layouts on page 2-38 for detailed instructions on how to
use the Manual Zones feature.

Set Up a Permanent Windows Swap File

To increase OmniPage's speed, set up a permanent Windows swap file
(virtual memory) with at least 4MB of free, contiguous disk space. For
more information, please see Setting up a Windows Swap File (Virtual
Memory) on page 2-8.

Improving Accuracy

If you scan typeset, high-quality printed pages, you will probably find
that OmniPage recognizes text perfectly: the text that appears in your
word processor matches the text in the scanned page letter for letter.

With lesser-quality pages, text-recognition accuracy will be poorer.
These factors most affect text-recognition accuracy:

 Document Quality

 Scanner Options

 Scanning Angle

 Scanner Glass Clarity

 Paper Transparency

Document Quality

OmniPage recognizes characters in almost any font from 6 to 72 points
in size. However, keep the following in mind when using OmniPage:

 The print should be reasonably clean and crisp. Characters must be
distinct: separated from each other and not blotched together or
overlapping.

 The document should be free of notes, lines, or doodles; anything
that is not a printed character will slow OmniPage considerably, and
any character distorted by a mark will be unrecognizable.

 The document font should be non-stylized; for example, OmniPage won't
recognize the Zapf Chancery font accurately.

 It's hard to recognize underlined text accurately; the underline
changes the shape of descenders on the letters q, g, y, p, and j.

Scanner Options

The Scanner options are your most powerful means of improving
text-recognition accuracy. They are available in the Settings Panel
Scanner options.

The 3D OCR with AnyPage/HP AccuPage 2 feature (Professional version
only) recognizes text most accurately on the widest range of documents:
faxes, copies of copies, etc. This setting, when used with the Language
Analyst, provides the best recognition accuracy possible. This feature
is only available with grayscale scanners.

The Auto Brightness with AnyPage/HP AccuPage 2 feature uses Caere
AnyPage or HP AccuPage technology to improve accuracy considerably if
your page is dirty, if text is printed on a colored background, or if
the page has shading from a copy machine. It is slightly less accurate
and slightly faster than the 3D OCR with AnyPage feature. This feature
is only available with grayscale scanners.

Scanning Angle

Make sure that the document is positioned correctly in your scanner and
is not slanted. Even if you put a page in the scanner correctly, it is
still possible for the page to be turned slightly so that text will be
difficult to recognize. The final document may have missing characters,
split lines of text, or several words
recognized incorrectly if the page is not scanned correctly.

If you notice that the page is crooked in the Character Window, adjust
and rescan it. If you are scanning a multiple-page document and notice
poor recognition on certain pages, it may be that those pages were
crooked in the scanner. Try scanning them again.

Scanner Glass Clarity

The sheet of glass on the flatbed of the scanner must be clear. If it
gets dirty, wipe it gently with a soft, damp, lint-free cloth or
tissue. Be sure it is completely dry before you put pages on it.

Paper Transparency

Some paper is thin enough that the scanner sees text printed on the
opposite side of the scanned page. This is often the case with
telephone-book pages. To correct this problem, put a black piece of
paper behind the page, between the page and the lid of the scanner.

Chapter 7
Troubleshooting

Use this chapter if you have trouble getting the program or your
scanner to run properly. If the program runs but there are many errors
in your text files, or the program seems too slow, see Chapter 6,
Improving Performance.

There are seven sections in this chapter:

Before You Begin

Installation

Scanners

Memory

Operation

Error Messages

Caere Product Support

As a general rule for any software product, if you're having trouble
and you can't figure out what to do, it may help to reboot and restart
the program. Reinstalling the software often eliminates inexplicable
problems. If possible, be sure to save any open files in OmniPage or
other applications before you reboot.

Before You Begin

Whatever problem you are experiencing with OmniPage, first verify that
your computer, scanner, and other applications are functioning
properly.

 Make sure that your system meets all the requirements for hardware,
memory, and software as listed in Chapter 1, Installation.

 Verify that the scanner is plugged in, turned on, and that all cable
connections are secure.

 Check to see that the image-scanning software that came with your
scanner is installed and working properly.

Resolve any problems that occur with Windows or your image-scanning
software before you try using OmniPage again. You should also run
virus-checking software regularly to ensure that performance problems
are not caused by a virus.

Installation

Installation problems may result from:

 An inadequate or incompatible system configuration: double-check that
your system meets the system requirements listed in Chapter 1,
Installation.

 A bad disk or corrupted file.

Installing OmniPage with the Norton Desktop

Some versions of the Norton Desktop are incompatible with the OmniPage
installation program. If you run Windows under Norton Desktop and you
have difficulty installing OmniPage, follow these procedures:

1 Open the Windows system.ini file in a text editor and change the line
Shell=ndw.exe to Shell=progman.exe.

This will make the Windows Program Manager appear when you start
Windows rather than the Norton Desktop.

If you're not sure how to edit the system.ini file, consult your
Windows User's Guide.

2 Save the file.

3 Restart Windows.

4 Install OmniPage according to the instructions in Chapter 1,
Installation.

5 Re-open the Windows system.ini file and change the line
Shell=progman.exe back to Shell=ndw.exe.

This will make the Norton Desktop appear when you start Windows.

6 Save the file.

7 Restart Windows.
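
The edits in steps 1 and 5 are the same one-line change made in opposite directions. As a minimal sketch only, assuming system.ini is handled as plain text, the swap could be scripted in Python as follows. The helper name set_shell is hypothetical and is not part of OmniPage or Windows.

```python
def set_shell(ini_text, new_shell):
    """Return ini_text with the value of the first shell= line replaced.

    The key is matched case-insensitively, so both "Shell=" and "shell="
    are found. All other lines are returned unchanged.
    """
    out = []
    replaced = False
    for line in ini_text.splitlines():
        key, sep, _ = line.partition("=")
        if not replaced and sep and key.strip().lower() == "shell":
            out.append(key + "=" + new_shell)  # keep the original key spelling
            replaced = True
        else:
            out.append(line)
    return "\n".join(out)


# Illustrative file contents (not copied from a real system.ini).
sample = "[boot]\nShell=ndw.exe\n[386Enh]\ndevice=*vpd"
print(set_shell(sample, "progman.exe"))
```

Calling set_shell again with "ndw.exe" on the result restores the original line, which corresponds to step 5. Keep a backup copy of system.ini before changing it by any method.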

Conflicts with Disk Cache Programs

Some disk cache programs interfere with memory allocation and will
prevent you from installing successfully. If you are using a disk cache
program other than Windows smartdrv.sys, temporarily disable it and try
to install OmniPage again. Then re-enable the disk cache program and
verify that it works. Performance will not be acceptable if you do not
use a disk cache program; in most cases, you should use smartdrv.sys.

Using EMM386.EXE

You may not be able to use this memory manager with OmniPage. Try
running OmniPage with just the default Windows memory management
programs. Consult your memory manager's documentation for instructions
on how to de-install emm386.exe from your config.sys file.

SETUP repeatedly requests the same disk.

If the correct disk is in the disk drive, the disk is probably damaged.
To check the disk, exit the installation program and Windows. From the
DOS prompt, type dir B: (if you are installing from drive A:, type dir
A:). If you receive an error message from DOS, the floppy disk is
damaged.

If you are able to see the disk directory, try to copy a file from the
OmniPage disk to your hard disk. DOS may be unable to copy files from
the disk even if it can read the directory. If the disk is damaged,
contact Product Support for a replacement. See Caere Product Support on
page 7-25.

Testing OmniPage with a Simplified System

If OmniPage won't run correctly, try running it on a simplified setup:
a system with no network commands, other drivers, memory managers, etc.
This will eliminate the possibility of conflicts with any other devices
or drivers.

Use a text editor to comment out any memory-resident device drivers and
applications from your autoexec.bat and config.sys files not used by
Windows, OmniPage, your scanner, your hard drive, or your monitor and
reboot your system. See your DOS documentation for instructions on how
to edit your autoexec.bat and config.sys files.
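
To illustrate what "commenting out" means here: DOS ignores a line in config.sys or autoexec.bat that begins with REM. The following Python sketch is a minimal illustration under that assumption. The function name comment_out, the keep-list approach, and the sample driver names other than mscan.sys are hypothetical; you must decide for your own system which lines are needed.

```python
def comment_out(lines, keep_keywords):
    """Prefix with "REM " every line that mentions none of keep_keywords.

    Blank lines and lines that are already REM comments are left alone.
    Matching is case-insensitive.
    """
    result = []
    for line in lines:
        stripped = line.strip()
        lowered = stripped.lower()
        already_rem = lowered == "rem" or lowered.startswith("rem ")
        keep = any(word.lower() in lowered for word in keep_keywords)
        if not stripped or already_rem or keep:
            result.append(line)
        else:
            result.append("REM " + line)
    return result


# Illustrative config.sys contents; PROTMAN.SYS stands for any driver
# that is not needed for the test.
config = [
    r"DEVICE=C:\WINDOWS\HIMEM.SYS",
    r"DEVICE=C:\NET\PROTMAN.SYS",
    "FILES=40",
    r"DEVICE=C:\OMNIPAGE\MSCAN.SYS",
]
for line in comment_out(config, ["himem", "files", "mscan"]):
    print(line)
```

Only the line that matches no keyword is disabled. Removing the REM prefix later restores the entry, which is why commenting out is safer than deleting lines.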

If OmniPage then runs, you'll know that your problem is a conflict with
an item in your system's config.sys or autoexec.bat files. One by one,
you can add items back to the config.sys and autoexec.bat files,
reboot, and start OmniPage. When OmniPage no longer runs, you'll know
that the last item you added is incompatible with OmniPage. Do not run
emm386.exe in your config.sys file.
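
The one-by-one procedure above can be written down as a short loop. This is a sketch of the logic only: runs_ok stands for the manual step "reboot and start OmniPage", which a script cannot perform for you, and all names in the example are hypothetical.

```python
def find_conflict(base_items, disabled_items, runs_ok):
    """Restore disabled items one at a time and report the first failure.

    runs_ok(active_items) must return True if OmniPage runs with exactly
    those items loaded. Returns the item whose restoration made OmniPage
    stop running, or None if every item could be restored.
    """
    active = list(base_items)
    for item in disabled_items:
        active.append(item)
        if not runs_ok(active):
            return item
    return None


# Simulated check: pretend that a driver called badtsr.exe is the culprit.
def simulated_check(active):
    return "badtsr.exe" not in active


print(find_conflict(["himem.sys"], ["a.sys", "badtsr.exe", "c.sys"], simulated_check))
```

Adding items back singly costs one reboot per item, but it identifies the conflicting entry unambiguously, which matches the procedure described above.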

Scanners

Once a scanner is installed and working with its image scanning
software, most users can install and use OmniPage with no other changes
to their system.

To get up and running as quickly as possible, install your scanner
hardware and any software you received with it, including the scanner
driver, according to the manufacturer's instructions. Use the scanning
software supplied by the manufacturer to be sure that the scanner is
working on your system before scanning with OmniPage. Consult your
scanner documentation or the manufacturer's product support if your
scanner does not work with the manufacturer-supplied scanning software.
Resolve any problems before continuing.

If your scanner operates with the manufacturer's image scanning
software but not with OmniPage, use the following topics to pinpoint
and correct the problem.

The Scan Image commands are grayed out.

This usually happens when OmniPage defaults to the scanner setting No
Scanner. Make sure that your scanner is turned on and choose Select
Scanner in the Settings menu. Select the scanner that you are using and
click OK.

"Can't Open Scanner" message displays.

Make sure your scanner is turned on. If the scanner was turned off when
you started OmniPage, turn it on and choose Select Scanner in the
Settings menu. Select the scanner that you are using and click OK. If
this does not work, attempt to scan with the software that came with
your scanner to see if the problem is with your scanner hardware.

Microtek Scanners

Set the scanner speed to 2 for best accuracy. Consult your Microtek
scanner documentation for instructions on how to set scanner speed.

Testing OmniPage with the Sample Pages

If OmniPage successfully recognizes text from an image file, the
problem is probably scanner related. If OmniPage is unable to perform
recognition at all, review the installation and operation
troubleshooting sections and the corresponding error messages
in the Error Messages section of this chapter.

Use one of the OmniPage Sample Pages (such as the Multiple Column Page
Sample) to verify the functionality, recognition performance, and
accuracy of OmniPage. Successful completion of the test procedure does
the following:

 Verifies OmniPage's ability to perform text recognition.

 Provides a benchmark for recognition time on your system.

 Verifies recognition accuracy independently of your document or
scanner image quality.

If OmniPage seems to run slowly even with all other applications
closed, use the test file to find the typical recognition time for a
good-quality document and image on your system.

The test file will produce a text file with near-100% text recognition
accuracy. If you are unable to achieve a similar level of accuracy with
your documents, review Chapter 6, Improving Performance, for possible
problem areas and solutions.

To load the TIFF file:

1 Open the Settings panel by choosing Settings Panel... in the Settings
menu.

2 Click Use Defaults in the Settings Panel.

3 Click OK to confirm that you want to reset all settings.

4 Select Load Image in the drop-down list under the Image button in the
toolbar.

5 Click the AUTO button in the toolbar.

The Load Image dialog box will appear.

6 Open the file test.tif in your omnipage/data directory.

OmniPage should now determine text zones and perform recognition.

Checking the Scanner Driver Name and Version

At least two files allow OmniPage to communicate with the scanner
hardware:

One or more Device Driver files, supplied by the scanner manufacturer,
which tell your computer how to communicate with the scanner.

A Scmgr file, supplied by Caere, which tells OmniPage how to
communicate with your computer and the scanner. The scmgr file is named
in the format scmgrxx.exe where x is a number.

Device Drivers

Scanners require that a file called a device driver be installed on
your hard disk. This file tells the computer how to communicate with
the scanner. The name of the file is referenced in your config.sys or
autoexec.bat file.

When you turn your system on, DOS reads the config.sys and autoexec.bat
files. It sees the reference to the device driver file and then reads
that file.

The name of the file referenced in the config.sys or autoexec.bat file
must match the name of the file installed on your disk. Also, the
version of the file must be compatible with OmniPage.
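
As a rough aid for the name comparison, the following Python sketch lists the driver file names that a config.sys references so that they can be compared with the name in the Supported Scanners list. It assumes entries of the form DEVICE=path followed by optional parameters separated by a space, and it does not check driver versions. The function names are hypothetical, and the /X parameter in the sample is made up for illustration.

```python
import ntpath


def device_entries(config_text):
    """Return the driver paths named on DEVICE= and DEVICEHIGH= lines."""
    paths = []
    for line in config_text.splitlines():
        key, sep, value = line.partition("=")
        if sep and key.strip().lower() in ("device", "devicehigh"):
            parts = value.split()
            if parts:
                paths.append(parts[0])  # ignore parameters after the path
    return paths


def driver_names(config_text):
    """Return lower-cased driver file names without their directories."""
    return [ntpath.basename(path).lower() for path in device_entries(config_text)]


sample = "FILES=40\nDEVICE=C:\\WINDOWS\\HIMEM.SYS\nDEVICE=C:\\OMNIPAGE\\MSCAN.SYS /X\n"
expected = "mscan.sys"  # the name you looked up in the Release Notes
print(expected in driver_names(sample))
```

The ntpath module is used because it understands DOS-style backslash paths regardless of the system on which the script runs.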

Check the Supported Scanners list in the Release Notes to find the
device driver name and version for your scanner. Compare this
information with the contents of your config.sys or autoexec.bat file.
Look for the driver version in the message given by the driver as it
loads (when your system boots) or on the diskette label of your
scanning software. Sometimes the scanning software provided by the
scanner manufacturer is not the device driver OmniPage needs.

If the device driver name and version in your config.sys or
autoexec.bat file does not match the information listed for your
scanner, check the Supported Scanners list or the drivers in your
OmniPage directory to see if the correct driver is supplied with your
disk set. Then modify your config.sys or autoexec.bat to use the
Caere-supplied driver.

If the correct device driver is not supplied on your OmniPage disk set
(which will happen if Caere does not have a license to distribute the
driver), contact the scanner manufacturer and request the driver and
version specified in the Supported Scanners list in the Release Notes.

If you use an extended memory manager, avoid loading your scanner
driver high in memory. Consult your memory manager's documentation for
detailed information about loading drivers high.

For information on how to edit your config.sys or autoexec.bat file,
see your DOS Operations Manual. Be sure to reboot your computer when
you have finished editing. DOS will load the new device driver when the
system reboots.

Record any parameter information for the device driver. For a few
scanners, this information will be required when you select your
scanner for the first time or if you change your scanner installation.

Here are a few examples of device driver entries in a config.sys or
autoexec.bat file.

A typical config.sys entry for a Microtek scanner device driver would
look like this:

device=c:\omnipage\mscan.sys

This line in your config.sys file tells DOS to load the device driver
named mscan.sys, which is located in the omnipage (or omnipro for
Professional users) directory on the C: drive.

Some device drivers require one or more parameters designating port
address, interrupt (IRQ), or Direct Memory Access (DMA) channel. The
values for the parameters are determined by the switch settings and
addresses used in the scanner hardware installation. See your scanner's
documentation for information on these settings.

An entry in your config.sys for the Complete Page Scanner might look
like this:

OmniPage Professional users would have a directory named omnipro.

Other device drivers are loaded with a batch file, usually the
autoexec.bat file, instead of the config.sys.

An entry in your autoexec.bat for the Canon IX-12 scanner could look
like this:

\omnipage\ixhnd2/08

This line in your autoexec.bat file tells DOS to load the device driver
called IXHND2 and defines the port address and memory address for the
device driver to use when communicating with the scanner interface
card.

Load only one device driver for your scanner at a time. Multiple
device drivers for the same scanner may cause problems between OmniPage
and the scanner hardware.

Scmgr

The OmniPage Scmgr file lets OmniPage communicate with your system and
your scanner. There is a different Scmgr file for each scanner or
scanner family supported by OmniPage. The device driver entry in your
config.sys or autoexec.bat must be the device driver the Scmgr file
expects or an error will occur. Choose Select Scanner in the Settings
menu if you are unsure which scanner is currently selected. Then check
the Supported Scanners list in the Release Notes and make sure the
appropriate Scmgr file is located in your OmniPage directory. The scmgr
file is named in the format scmgrxx.exe where x is a number.
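
If you want to pick the Scmgr files out of a directory listing, the naming pattern can be expressed as a regular expression. This Python sketch assumes that "xx" means exactly two digits, which is an interpretation of the format given above. The file names in the example are invented for illustration.

```python
import re

# "scmgr", two digits, ".exe"; DOS file names are not case-sensitive.
SCMGR_PATTERN = re.compile(r"^scmgr\d{2}\.exe$", re.IGNORECASE)


def find_scmgr_files(filenames):
    """Return the names that match the scmgrxx.exe pattern."""
    return [name for name in filenames if SCMGR_PATTERN.match(name)]


listing = ["OMNIPAGE.EXE", "SCMGR01.EXE", "scmgr.exe", "SCMGR123.EXE"]
print(find_scmgr_files(listing))
```

A match only shows that a file with the right kind of name exists. Which numbered file belongs to your scanner is given in the Supported Scanners list.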

If you can't find the appropriate Scmgr file, reinstall the OmniPage
software.

Checking the Scanner Hardware

If you experience a problem between OmniPage and your scanner, make
sure the hardware used matches the hardware OmniPage supports. For
example, OmniPage supports a Canon IX-12 scanner, but only with the
Canon IF-3 interface card, not the older IF-2 or the JLaser interface
card for Canon scanners. Check the Supported Scanners list in the
Release Notes.

For a few scanners, you will also need to know the scanner interface
card switch settings and addresses and enter them in the Select Scanner
dialog box. Refer to your scanner's owner's manual for more detailed
information.

Changing your Scanner Installation

If you change scanners, you will need to register the new scanner with
OmniPage.

After you have installed and tested your scanner with the
manufacturer's scanning software, start OmniPage and choose Select
Scanner in the Settings menu. The Select Scanner dialog box will appear.
Your scanner's name should appear in the list box. Highlight the name of
your scanner and click OK.

With a few scanners, you may need to provide the I/O (port) address and
speed. Refer to your scanner's owner's manual for more detailed
information.

Scanning Causes System Crash

If you experience a system crash when you try to scan, add the line
EMMExclude=A000-EFFF under [386Enh] in your system.ini file and restart
Windows.
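
The following Python sketch shows one way the EMMExclude line could be added under the [386Enh] header. It is an illustration under stated assumptions, not a supplied utility: the function name add_emm_exclude is hypothetical, and the file is treated as plain text line by line because a system.ini section can contain the same key (such as device=) many times, which stricter INI parsers may reject.

```python
def add_emm_exclude(ini_text, setting="EMMExclude=A000-EFFF"):
    """Insert `setting` directly below the [386Enh] header.

    If the section already contains the key, the text is returned
    unchanged. If the section is missing, it is appended at the end.
    """
    key = setting.split("=", 1)[0].lower()
    lines = ini_text.splitlines()
    header = None
    in_section = False
    for index, line in enumerate(lines):
        stripped = line.strip()
        if stripped.startswith("["):
            in_section = stripped.lower() == "[386enh]"
            if in_section:
                header = index
        elif in_section and stripped.split("=", 1)[0].strip().lower() == key:
            return ini_text  # already present, leave the file alone
    if header is None:
        lines += ["[386Enh]", setting]
    else:
        lines.insert(header + 1, setting)
    return "\n".join(lines)


# Illustrative file contents.
sample = "[boot]\nshell=progman.exe\n[386Enh]\ndevice=*vpd"
print(add_emm_exclude(sample))
```

Running the function a second time on its own output changes nothing, so it is safe to apply more than once. Restart Windows after the real file has been changed.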

Memory

It may be difficult to determine that OmniPage is running poorly due to
a memory problem. However, you can optimize your system to reduce the
possibility. Here are a few tips:

Use a known compatible extended memory manager like himem.sys.

Add the EMMExclude=A000-EFFF line to the system.ini file [386Enh]
section. The characters after the A are zeroes.

If using Quarterdeck's QEMM386 as the memory manager, add a NOEMS
switch to the qemm386.sys statement in config.sys. You must also remove
the stealth feature from this command (ST:x). The command should be as
simple as possible:

device=\drive\QEMM\QEMM386.SYS RAM NOEMS

Use at least 8MB of physical RAM.

Keep the virtual memory swap file greater than 4MB and less than 8MB.

Keep at least 20MB of free disk space available on the drive where the
temp files are stored.

Use the Windows Smartdrive program for disk caching.

Verify that there is at least 7MB RAM available in Windows: choose
About... in the Program Manager Help menu and see the Memory entry.

Operation

The following topics cover some commonly seen problems.

OmniPage No Longer Works

If you have used OmniPage without difficulty in the past, you may have
altered your system configuration. This may be the case if you have
recently installed a new application. Make sure that your scanner still
works with other software. To determine the last time the autoexec.bat
and config.sys files were modified, check the date and time displayed
by the DOS DIR command. Edit the files if necessary.
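
The same date check can be done from a script. This Python sketch is an alternative to the DIR command, not something the manual requires; it uses only the standard library, the function name last_modified is hypothetical, and the two paths assume that the files are in the root of drive C:.

```python
import os
import time


def last_modified(paths):
    """Map each existing path to its modification time, "YYYY-MM-DD HH:MM".

    Paths that do not exist are skipped.
    """
    stamps = {}
    for path in paths:
        if os.path.exists(path):
            mtime = os.path.getmtime(path)
            stamps[path] = time.strftime("%Y-%m-%d %H:%M", time.localtime(mtime))
    return stamps


for path, stamp in last_modified([r"C:\AUTOEXEC.BAT", r"C:\CONFIG.SYS"]).items():
    print(path, stamp)
```

A modification date that coincides with the installation of another application suggests that its installer changed the file.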

OmniPage works but operates slowly and frequently accesses the hard
disk drive.

As a Windows 3.1 application, OmniPage is able to take advantage of
virtual memory when running low on memory. This may occur with a
minimally configured system (only 8MB of RAM) if memory has become
fragmented with use or if other applications are running in the
background. When low memory conditions occur, Windows will use disk
space to simulate the RAM it does not have available. Disk access time
is much longer than RAM access time, so the computer system will run
much more slowly when it has to use virtual memory.

A quick fix for memory fragmentation problems is to quit OmniPage and
Windows and reboot your system. This will clear any fragmentation of
memory that has occurred (until, of course, it happens again).

Try closing any other applications that are running in the background.
This will usually free enough memory for OmniPage to operate without
using the swap file.

If you regularly work with long, complex documents, adding more RAM to
your system is the best solution. For more information on optimizing
your system and application performance under Windows, refer to the
Windows User's Guide.

OmniPage is too slow.

If recognition accuracy is good and you have enough memory, you may
want to run OmniPage on a faster computer to improve its speed. OCR is
a very time- and memory-intensive process, so the processing power of
your computer will determine the speed of processes. For example, you
will notice a significant speed improvement if you upgrade from a 386SX
to a 386 25MHz or from a 386 25MHz to a 486 33MHz.

Faxes are not recognized accurately.

Recognizing faxes accurately can be difficult. A typical fax machine
produces documents at 200x100 dpi; a typical scanner scans at 300x300
dpi. Because of the lost resolution, OCR has less information to work
with, and the accuracy is not as good as it would be on the printed
original. In addition, if the fax is printed on thermal paper, it is
more difficult to scan than a document on regular white paper. There
are three things you can do to improve the accuracy of fax recognition:

If a fax modem is connected to your computer, you can receive a fax as
a file without printing it. You can then load the fax file as an image
in OmniPage and then recognize the text.

However, OmniPage must support the file format that your fax card
produces. These formats are listed in the Supported File Formats
section of the Release Notes. If your fax card's file format is not
supported, your fax software may be able to convert the fax to a PCX or
Uncompressed TIFF image that OmniPage can open and recognize.

Have senders select Fine Mode when they send you a fax. This improves
scan resolution to 200x200 dpi. It also helps if they choose a
sans-serif font that is 11 points or larger.

Request that senders send faxes to you directly from a fax card in
their computer. Transmissions directly from fax cards are much clearer
than fax-machine transmissions because they do not involve low-
resolution scanning.
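
The resolution figures quoted above can be compared with a little
arithmetic. The following Python sketch is only an illustration (the
function and dictionary names are invented for this example). It counts
the image samples per square inch at each resolution mentioned in this
section and expresses each figure as a fraction of a 300x300 dpi scan.

```python
# Resolutions quoted in this section, as (horizontal dpi, vertical dpi).
RESOLUTIONS = {
    "standard fax": (200, 100),
    "fine mode fax": (200, 200),
    "typical scanner": (300, 300),
}


def samples_per_square_inch(horizontal_dpi, vertical_dpi):
    """Number of image samples in one square inch of the page."""
    return horizontal_dpi * vertical_dpi


def fraction_of_scanner_detail(name):
    """Samples per square inch relative to the 300x300 dpi scanner."""
    reference = samples_per_square_inch(*RESOLUTIONS["typical scanner"])
    return samples_per_square_inch(*RESOLUTIONS[name]) / reference


for name in RESOLUTIONS:
    count = samples_per_square_inch(*RESOLUTIONS[name])
    share = fraction_of_scanner_detail(name)
    print(f"{name}: {count} samples per square inch ({share:.0%} of a scan)")
```

By this measure a standard fax carries roughly 22% of the samples of a
300x300 dpi scan, and Fine Mode doubles that to about 44%.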

The scanner will begin to scan and stop. The entire system locks up and
you have to reboot your computer.

You may have an interrupt conflict between your scanner and another
device. If you have a bus mouse and you usually do not use the mouse
and scanner at the same time, check the interrupt used by the scanner
and mouse for a possible conflict. The interrupt address typically used
by some network cards may cause the same problem.
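
Checking for an interrupt conflict amounts to listing the interrupt
each device uses and looking for duplicates. The Python sketch below
shows that check on a hand-written list. The device names and interrupt
numbers are hypothetical examples, not values read from any hardware;
take the real settings from each card's manual or jumper configuration.

```python
from collections import defaultdict


def find_interrupt_conflicts(assignments):
    """Return {interrupt: [devices]} for interrupts used by 2+ devices."""
    devices_by_interrupt = defaultdict(list)
    for device, interrupt in assignments.items():
        devices_by_interrupt[interrupt].append(device)
    return {
        interrupt: sorted(devices)
        for interrupt, devices in devices_by_interrupt.items()
        if len(devices) > 1
    }


# Hypothetical settings written down by hand for illustration only.
example_assignments = {
    "scanner interface card": 5,
    "bus mouse": 5,
    "network card": 10,
}

for interrupt, devices in find_interrupt_conflicts(example_assignments).items():
    print(f"Interrupt {interrupt} is shared by: {', '.join(devices)}")
```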

Operation

OmniPage hangs the system at the beginning of the recognition process.

Many computer systems provide a feature called shadow RAM to enhance
system performance. If OmniPage causes the system to hang, turn off the
shadow RAM function of your computer and try again. Refer to your
computer's operations manual for information on disabling shadow RAM.

Some computer systems do not allow you to turn shadow RAM off.
Incompatibilities with these systems are usually not related to
shadow RAM.

System hangs may be related to incompatibilities with memory-resident
applications or device drivers. Use a text editor to open your
autoexec.bat and config.sys files and comment out any memory-resident
device drivers and applications that are not used by Windows, OmniPage,
your scanner, or your hard drive. Then reboot your system.

Do not remove a device driver unless you are aware of its function and
know it may be safely removed. Hard disks often require special device
drivers that should not be removed. Video displays that require special
device drivers may need to be reconfigured instead of removed. Make a
backup boot disk with your current operating system version,
autoexec.bat, and config.sys to guard against potential mistakes.
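
Commenting out a line in autoexec.bat or config.sys means placing REM
at the start of that line so that it is skipped at startup. The Python
sketch below illustrates the idea on an in-memory list of lines. The
file contents and the driver name EXAMPLE.SYS are invented for this
example, and the function does not touch any real file; make the backup
boot disk described above before editing the real files.

```python
def comment_out(lines, driver_names):
    """Return a copy of lines with REM placed before matching entries."""
    edited = []
    for line in lines:
        text = line.strip().lower()
        is_comment = text.startswith("rem ")
        mentions_driver = any(name.lower() in text for name in driver_names)
        if text and not is_comment and mentions_driver:
            edited.append("REM " + line)
        else:
            edited.append(line)
    return edited


# Hypothetical config.sys contents; EXAMPLE.SYS is an invented driver name.
config_sys = [
    r"DEVICE=C:\DOS\HIMEM.SYS",
    r"DEVICE=C:\EXAMPLE\EXAMPLE.SYS",
    "FILES=30",
]

for line in comment_out(config_sys, ["example.sys"]):
    print(line)
```

Only the line that mentions the named driver gains a REM prefix. Lines
that are already commented out are left unchanged, so the edit can be
reversed later by deleting the REM.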

You receive garbage or nothing when you attempt text recognition AND
when you select manual zones, you see vertical lines running through
the document image or no image at all.

The memory address for your scanner interface card is probably
interfering with the memory address for your video display adaptor. Use
the instructions in your scanner owner's manual to move the scanner
interface card to a different memory address.

Error Messages

A maximum of 250 files can be selected.

You tried to load more than the maximum of 250 files.

A maximum of 256 pages can be saved in a Caere Document.

You tried to scan or load pages that would increase the size of the
file over the 256-page limit.
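
The two limits above can be checked with simple arithmetic before you
load or scan. The Python sketch below is only an illustration. The
constant and function names are invented for this example and are not
part of OmniPage.

```python
MAX_FILES_PER_LOAD = 250       # limit quoted in the first message above
MAX_PAGES_PER_DOCUMENT = 256   # limit quoted in the second message above


def files_over_limit(selected_files):
    """How many selected files exceed the 250-file limit (0 if none)."""
    return max(0, selected_files - MAX_FILES_PER_LOAD)


def pages_that_still_fit(current_pages, incoming_pages):
    """How many of the incoming pages fit under the 256-page limit."""
    remaining = max(0, MAX_PAGES_PER_DOCUMENT - current_pages)
    return min(incoming_pages, remaining)


print(files_over_limit(260))
print(pages_that_still_fit(250, 10))
```

For example, a document that already holds 250 pages has room for only
6 of 10 additional pages.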

Cannot find the requested scanner driver.

The scanner driver file has been deleted or moved from its proper
location in the boot drive or in the OmniPage directory. Be sure your
scanner works with the scanner manufacturer's scanning software and
reinstall OmniPage.

Cannot read file filename.ext.

The SETUP installation program presents this error message when it
cannot read a file on the OmniPage disk set. The file is probably
corrupted. Contact Product Support for a replacement disk.

Error accessing the user dictionary. Try freeing up hard disk space.
Dictionary size is limited to 32K.

You may have moved OmniPage to a different location on your hard disk,
or you may have renamed directories in the path where OmniPage is
located. Reinstall OmniPage.

Error adding word to the user dictionary.

You are low on free disk space or the user dictionary is full. The user
dictionary capacity is 32K or about 5300 words. Free up some disk space
or edit the user dictionary to remove unnecessary words.
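
The capacity figures above imply an average of about 6 bytes per word.
The Python sketch below uses that ratio to estimate how many more words
fit in a dictionary of a given size. It assumes that "32K" means 32,768
bytes and that words take a uniform average amount of space. The actual
file format is not described here, so treat the result as a rough
estimate only.

```python
DICTIONARY_CAPACITY_BYTES = 32 * 1024  # assumption: "32K" means 32,768 bytes
APPROXIMATE_WORD_CAPACITY = 5300       # figure quoted in the text


def average_bytes_per_word():
    """Average space per word implied by the two figures above."""
    return DICTIONARY_CAPACITY_BYTES / APPROXIMATE_WORD_CAPACITY


def estimated_words_remaining(current_size_bytes):
    """Rough count of additional words that fit in the dictionary."""
    free_bytes = max(0, DICTIONARY_CAPACITY_BYTES - current_size_bytes)
    # Integer arithmetic avoids floating-point rounding in the estimate.
    return free_bytes * APPROXIMATE_WORD_CAPACITY // DICTIONARY_CAPACITY_BYTES


print(f"{average_bytes_per_word():.1f} bytes per word on average")
print(estimated_words_remaining(16 * 1024), "more words fit in a half-full file")
```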

Error connecting the text editor. The file may be corrupt or moved; try
reinstalling OmniPage. If the error persists, please call product
support.

An internal program file may have been damaged or is no longer in the
OmniPage directory. Please reinstall OmniPage. If this doesn't work,
call product support.

Error converting to an image file. If the error persists, please call
product support.

An internal program file may not work with a particular file or may be
corrupted. If the problem happens consistently with just one file, you
may not be able to OCR that file. If the problem happens with different
files, please call product support.

Error creating a mail document. Be sure your mail application is
working correctly. Try freeing up hard disk space in your TEMP
directory.

Test your mail application in another application to be sure it is
working correctly. Be sure there is at least 1MB of free disk space in
your \temp directory. Your \temp directory is specified in your
config.sys file in the line: set temp=c:\name
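
The free-space check described above can be sketched in Python for a
present-day system that has Python installed. This is an illustration
only: it assumes that "1MB" means 1,048,576 bytes, and the function
name is invented for this example. The standard library function
tempfile.gettempdir() consults the TEMP environment variable, among
others, when choosing the directory to report.

```python
import shutil
import tempfile

ONE_MEGABYTE = 1024 * 1024  # assumption: "1MB" means 1,048,576 bytes


def has_enough_free_space(directory, required_bytes=ONE_MEGABYTE):
    """True if the drive holding directory has at least required_bytes free."""
    return shutil.disk_usage(directory).free >= required_bytes


temp_directory = tempfile.gettempdir()
print(temp_directory, has_enough_free_space(temp_directory))
```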

Error creating the zone window. Try freeing up hard disk space or try
closing other applications.

OmniPage requires 8MB of available RAM configured for use with Windows
running in Enhanced mode.

The 4MB Windows permanent swap file usually provides enough memory to
allow OmniPage to run at any time. However, if you have several
applications open, you may not have enough memory to run OmniPage.
Close one or more open applications to free enough memory to run
OmniPage.

You can check available memory by choosing About Program Manager in the
Program Manager Help menu. You should have at least 8MB available
memory.

If you do have enough memory, you may have run out of disk space. Try
deleting unnecessary files from your hard disk to free up space.

Error deleting page.

An internal program file may not work with a particular file or may be
corrupted. If the problem happens with different files, please call
product support.

Error during conversion of the file (%i). Try freeing up hard disk
space or closing other applications.

You may be short of either memory (RAM) or hard disk space. Try
running OmniPage as the only application, and delete all unnecessary
files from your hard disk to maximize free hard disk space.

Error during OCR. The page may be too complex. Try using Manual Zones
to recognize smaller areas of the page.

The page has a very complex layout or has very small text and requires
too much memory to recognize. Select Manual Zones in the drop-down list
under the Zone process button and try drawing fewer, smaller areas on
the page.

Error finding blocks on the page. The page may be too complex.

The page is very complex or has very small text and requires too much
memory to recognize. Select Manual Zones in the drop-down list under
the Zone process button and try drawing fewer, smaller areas on the
page.

Error finding or reading FFNNLOG.DAT file. Reinstall OmniPage.

This file has been deleted or moved from its proper location in the
OmniPage directory. Reinstall OmniPage.

Error finding zones on the page. The page may be too complex. Try using
Manual Zones to recognize smaller areas of the page.

Your page has a very complex layout or has very small text and requires
too much memory to recognize. Select Manual Zones in the drop-down list
under the Zone process button and try drawing fewer, smaller areas on
the page.

Error getting image from OCR.

An internal program file may not work with a particular file or may be
corrupted. If the problem happens with different files, please call
product support.

Error getting the image from the scanner. Please check your scanner
settings in the Settings Panel and try again.

You may have selected an inappropriate setting for your page. Open the
Settings Panel and make sure that the selected Scanner options, such as
page size and orientation, are correct.

Error initializing OCR. Try closing other applications.

OmniPage requires 8MB of available RAM configured for use with Windows
running in Enhanced mode.

The 4MB Windows permanent swap file usually provides enough memory to
allow OmniPage to run at any time. However, if you have several
applications open, you may not have enough memory to run OmniPage.
Close one or more open applications to free enough memory to run
OmniPage.

You can check available memory by choosing About Program Manager in the
Program Manager Help menu. You should have at least 8MB available
memory.

Error launching Notepad. Open a different text editor to read the
Release Notes.

You may have removed the Windows text editor program from its usual
location in the Windows directory. The release notes file is titled
readme.txt and should be located in your OmniPage directory. You can
open the file with any word processing program.

Error loading conversion code. The file may be corrupt or moved - try
reinstalling OmniPage. If the error persists, please call product
support.

An internal program file may have been damaged or is no longer in the
OmniPage directory. Please reinstall OmniPage. If this doesn't work,
call product support.

Error loading the training module. The file may be corrupt or moved -
try reinstalling OmniPage. If the error persists, please call product
support.

An internal program file may have been damaged or is no longer in the
OmniPage directory. Either restore the file (rtrain.dll) or reinstall
OmniPage. If this doesn't work, call product support.

Error loading the user interface module. The file may be corrupt or
moved - try reinstalling OmniPage. If the error persists, please call
product support.

An internal program file may have been damaged or is no longer in the
OmniPage directory. Please reinstall OmniPage. If this doesn't work,
call product support.

Error opening file. Try closing other applications and verify that you
are opening a valid file type.

OmniPage requires 8MB of available RAM configured for use with Windows
running in Enhanced mode.

The 4MB Windows permanent swap file usually provides enough memory to
allow OmniPage to run at any time. However, if you have several
applications open, you may not have enough memory to run OmniPage.
Close one or more open applications to free enough memory to run
OmniPage.

You can check available memory by choosing About Program Manager in the
Program Manager Help menu. You should have at least 8MB available
memory.

The file must also be in a format that OmniPage recognizes. See
Supported Input File Formats on page 1-8.

Error opening the main dictionary. The file may be corrupt or moved -
try reinstalling OmniPage. If the error persists, please call product
support.

You may have moved OmniPage to a different location on your hard disk,
or you may have renamed the directories where OmniPage is located.
Reinstall OmniPage.

Error opening the text editor.

You may have moved OmniPage to a different location on your hard disk,
or you may have renamed the directories where OmniPage is located.
Reinstall OmniPage.

Error opening the user dictionary. The file may be corrupt or moved -
try reinstalling OmniPage. If the error persists, please call product
support.

You may have moved OmniPage to a different location on your hard disk,
or you may have renamed the directories where OmniPage is located.
Reinstall OmniPage.

Error preparing file for conversion. Try freeing up disk space. If the
error persists, please call product support.

The file you are saving may be corrupted. Try recognizing your document
again with the Language Analyst deselected in the Settings Panel OCR
options. Do not make any edits before saving.
