Document Management Software - DocuLex

Zone OCR Questions

What kind of documents are good candidates for zonal OCR?

The zonal optical character recognition (OCR) process is not an exact science, even under the best of circumstances. For optimal results with zonal OCR, documents need to have the following characteristics:

  • The print is sharp and clear.
  • There is a lot of white (clear) space surrounding the information you want to capture.
  • The information to be captured is in the exact same position on each page.
  • All pages that use the same zonal recognition form are the same size.
    Note: If there are minor differences in paper size, such as paper that has been manually cut (even with a paper cutter) to create a size other than the original, the document is not a good candidate for zonal OCR.
  • The documents are created by a high-resolution printer.
    • The documents are not created on a dot matrix printer.
    • The documents are not created on a copy machine.
    • The documents are not faxed copies.
  • The information to be captured is not hand-written or hand-stamped.
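
The page-size criterion above can be checked before you build a form. The following is a minimal sketch, not part of the DocuLex software: it assumes you have already obtained each scanned page's width and height in pixels by some other means, and the function name pages_with_odd_size and the tolerance parameter are hypothetical.

```python
from collections import Counter


def pages_with_odd_size(page_sizes, tolerance=0):
    """Return the indexes of pages whose (width, height) differs from the
    most common size in the batch by more than `tolerance` pixels."""
    if not page_sizes:
        return []
    # Treat the most frequent size as the reference size for the batch.
    (ref_w, ref_h), _ = Counter(page_sizes).most_common(1)[0]
    return [
        i
        for i, (w, h) in enumerate(page_sizes)
        if abs(w - ref_w) > tolerance or abs(h - ref_h) > tolerance
    ]


# Hypothetical pixel sizes for four scanned pages; the third is slightly short.
sizes = [(2550, 3300), (2550, 3300), (2550, 3290), (2550, 3300)]
print(pages_with_odd_size(sizes))  # [2]
```

Any page this check reports is likely to shift the captured areas out of position, so it is worth reviewing before you rely on the form.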

What is the best way to figure out exactly where to place the zonal OCR regions on a recognition form?

The FormEdit utility program that is part of the Admin program contains the Underlay Image feature. This feature takes most of the guesswork out of drawing zonal regions. Drawing accurate zonal regions on a zonal recognition form is a multi-step process that includes creating, testing, and troubleshooting. These steps are explained below.

  1. Create a zonal recognition form.
    • See Creating Zonal Recognition Forms in the Admin User’s Guide or Admin Help for step-by-step instructions.
  2. Test the form. (You’ll find instructions for performing steps 2.a-2.d HERE.)
    a. In the Admin program, create a test batch.
    b. In the Professional Capture program, scan several pages into the test batch.
      Note: It’s a good idea to look through the stack of pages that you need to scan. Notice whether the pages are uniform in size and whether the areas you want to capture are in the same place on every page. Since the pages that you want to scan are probably not 100% identical (the areas that you want to capture may differ slightly in location from one page to the next), it’s best if you choose pages that are as different from each other as possible. That way you can design a form that will capture data accurately on the greatest number of pages.
    c. Tag the pages.
    d. Run zonal evaluation.
    e. Perform visual quality assurance to check the accuracy of the form.
      • Did you create the proper zonal regions?
      • Are they in the right location?
      • Are they the right size?
    f. If the results are acceptable, you are finished creating the zonal recognition form.
    g. Otherwise, continue with step 3.
  3. Troubleshoot.
    a. Modify the zonal recognition form.
      1. In the Admin program, from the Job menu, select Edit Forms.
      2. Select the form you want to edit.
      3. Select Open.
      4. Select Yes.
      5. Add more zonal regions if necessary.
      6. Relocate or resize the existing zonal regions if necessary.
    b. Update the job profile for the batch.
      1. In the Admin program, from the Batches menu, select Update Job Profile.
      2. Browse to your test batch.
      3. Select OK.
      4. Select OK.
      5. Select Yes.
    c. Repeat steps 2.d-2.g and 3 as necessary.
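
The advice to test with pages that differ from each other can be turned into a rough sizing rule: a zonal region should cover the field on every test page. The sketch below is only an illustration under assumptions. It supposes you have measured where the field falls on each test page as a (left, top, right, bottom) rectangle in pixels; the function covering_zone and its padding parameter are hypothetical and are not FormEdit features.

```python
def covering_zone(boxes, padding=0):
    """Return the smallest (left, top, right, bottom) rectangle that contains
    every box in `boxes`, grown by `padding` pixels on each side."""
    if not boxes:
        raise ValueError("at least one measured box is required")
    lefts, tops, rights, bottoms = zip(*boxes)
    return (
        max(min(lefts) - padding, 0),  # never extend past the page edge
        max(min(tops) - padding, 0),
        max(rights) + padding,
        max(bottoms) + padding,
    )


# Hypothetical positions of the same field on three test pages.
measured = [(100, 200, 400, 240), (110, 195, 410, 238), (96, 204, 398, 246)]
print(covering_zone(measured, padding=5))  # (91, 190, 415, 251)
```

A larger region tolerates more variation in position, but it also takes in more of the surrounding page, so stray marks are more likely to be read as characters. Keep the padding small and confirm the result during visual quality assurance.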

I have created a zonal recognition form. How do I use it to perform zonal OCR?

First, it’s important to understand that the Professional Capture program uses zonal recognition forms after you complete the scanning process. You don’t print zonal recognition forms to use at scan time. When you select the Run Zonal Evaluation command, the Professional Capture program evaluates the zones (regions) on the page that you defined with the zonal recognition form.

Note: Before you use the zonal recognition form, it’s important to test its accuracy. If you haven’t already done so, instructions can be found HERE.

To perform the zonal evaluation process, follow these basic steps:

  1. In the Admin program, create a batch from the same job you used to create the zonal recognition form.
  2. In the Professional Capture program:
    a. Open the batch you just created.
    b. Define document breaks and scan your documents.
    c. Tag the appropriate pages or documents to be evaluated for zonal optical character recognition.
    d. Run the zonal evaluation.
    e. Perform visual quality assurance.

Explanations of how to accomplish these steps follow.

Step 1. In the Admin program, create a batch from the same job you used to create the zonal recognition form.
To create a batch, see Creating a Batch in the Admin User’s Guide or Admin Help.

Step 2.a. Open the batch you just created.
To open a batch in the Professional Capture program:

  1. From the File menu, select Open Batch (or select the Select Batch/Set paths icon).
    The Browse for Folder dialog box is displayed.
  2. Select the batch you created in the Admin program, then select OK.
    The Open Batch dialog box is displayed.
  3. Select OK.
    The Professional Capture main window is displayed with the name of the batch (the batch label) in the title bar.

Step 2.b. Define document breaks and scan your documents.
If you want the pages you scan to be processed as separate documents, you need to define document breaks before you scan the pages. You can do this in many ways, including the following:

  • Create, print, and insert MULT bookmark barcode recognition forms at the beginning of each new document.
    Note: Use this method when you want to automatically scan a stack of documents that have differing numbers of pages.
  • Create, print, and insert a SING bookmark barcode recognition form at the beginning of a stack of pages that you intend to scan as single-page documents.
  • Scan one document at a time.
  • Use the fixed-page scanning option.
    Note: Use this method when the documents you want to scan are all the same number of pages, such as invoices or checks, but not necessarily just one page.

The instructions in this step cover only the fixed-page scanning option.

To use the fixed-page scanning option, make sure the fixed-page count is set to the correct number of pages. The default is 1. If you need to change the page count, do the following:

  1. In the Professional Capture program, from the Options menu, select Scan Settings.
    The Scan Settings dialog box is displayed.
  2. In the Fixed Page Count box, type the number of pages that your documents contain, then select OK.
    The Professional Capture main window is displayed.
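
Conceptually, fixed-page scanning starts a new document every N pages. The sketch below illustrates that grouping only; it is not how Professional Capture is implemented, and the function name split_fixed is hypothetical. In particular, how the program treats leftover pages at the end of a stack is not stated here, so the short final group in this sketch is an assumption.

```python
def split_fixed(pages, fixed_page_count=1):
    """Group a scanned stack into documents of `fixed_page_count` pages each."""
    if fixed_page_count < 1:
        raise ValueError("fixed_page_count must be at least 1")
    return [
        pages[i:i + fixed_page_count]
        for i in range(0, len(pages), fixed_page_count)
    ]


stack = ["p1", "p2", "p3", "p4", "p5", "p6", "p7"]
print(split_fixed(stack, 3))
# [['p1', 'p2', 'p3'], ['p4', 'p5', 'p6'], ['p7']]
```

The last group shows why the page count matters: if the stack is not an exact multiple of the fixed page count, or the count is set incorrectly, pages end up in the wrong document.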

To scan documents:

  1. Place your documents in the scanner.
  2. From the Scan menu, select New (or the Scan New Docs icon

What do I need to do to increase the accuracy of my Zonal OCR?

Consider the following issues when you want to use zonal optical character recognition (OCR):

  • The condition of the source documents (the documents you are going to scan).
    Before you decide to use the zonal OCR process to capture information from your scanned documents, you need to determine if the documents are good candidates for zonal OCR. You will find a list of criteria for optimal results by clicking HERE.
  • The location of the zonal regions on the zonal recognition form.
    You need to determine if the zonal regions are in the correct location on the zonal recognition form. You will find instructions for doing this by clicking HERE.
  • The fact that using zonal OCR requires a visual quality assurance step.
    The OCR engine reads any dots, marks, or stippling on the page and may interpret them as periods, hyphens, vertical bars, etc. It can also add extra spaces. In addition, if letters or numbers are blurry or if parts of characters are missing, the OCR engine can misinterpret them; for example, it may read an uppercase M as an uppercase H, or a 3 as a 5. In practice, this means 1 or 2 characters out of every 100 characters scanned are not captured correctly, even on pristine source documents. Therefore, zonal OCR requires a quality assurance step.

    After you have run zonal evaluation on your batch, you need to perform visual quality assurance. You will find instructions for doing this by clicking HERE, step 2.e.
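
To see why the quality assurance step matters, it helps to turn the per-character error rate into a per-field figure. The sketch below is a back-of-the-envelope calculation, not a measurement: it assumes each character is misread independently of the others, which real scans only approximate, and the function name field_ok_probability is hypothetical.

```python
def field_ok_probability(char_accuracy, field_length):
    """Chance that every character in a field is read correctly, assuming
    each character is misread independently of the others."""
    return char_accuracy ** field_length


# 1 or 2 misread characters per 100 corresponds to 99% or 98% accuracy.
for accuracy in (0.99, 0.98):
    for length in (10, 20):
        p = field_ok_probability(accuracy, length)
        print(f"{accuracy:.0%} per character, {length}-character field: {p:.1%} fully correct")
```

Under these assumptions, a 10-character field is captured without any error roughly 90% of the time at 99% character accuracy and roughly 82% of the time at 98%, so a noticeable share of fields needs correction even when the source documents are in good condition.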

The barcode on my zonal recognition form gets in the way of drawing a zonal region in the correct location. Does my zonal recognition form need to include the barcode?

No, a zonal recognition form does not need a Form ID, a barcode, or registration marks.
The only item that needs to be on your zonal recognition form is at least one zonal OCR region.

To create such a form, follow the steps as outlined in Creating Zonal Recognition Forms in the Admin User’s Guide or Admin Help, with the following exceptions:

  • At step 2, you need to clear the Include Form ID and Include Registration Marks check boxes.
  • At step 5, you need to name the form ZOCR.FRX.

What kind of image files can I use for the underlay image feature in the FormEdit utility program?

The only image file format that the FormEdit utility program in the Admin program supports for underlay images is .tif.

The instructions in Creating Zonal Recognition Forms in the Admin User’s Guide and Admin Help say you can use .tif or .bmp files. We will correct that statement in the next release of the documentation.
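
If you keep candidate underlay images in a folder, a quick check on the file name can catch unsupported files before you open FormEdit. This is a minimal sketch with assumptions: the function is_supported_underlay is hypothetical, the comparison ignores letter case, and only the .tif spelling named above is accepted, since nothing here says whether other spellings of the extension work.

```python
from pathlib import PurePath


def is_supported_underlay(filename):
    """Return True if the file name ends in .tif (letter case ignored)."""
    return PurePath(filename).suffix.lower() == ".tif"


for name in ("invoice_sample.tif", "INVOICE.TIF", "invoice_sample.bmp"):
    print(name, is_supported_underlay(name))
```

The check looks only at the file name, not at the file contents, so a file that passes may still fail to load if it is not a valid image.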
