Advanced OCR

Last updated on July 12, 2024


Use the Advanced OCR process node to refine optical character recognition (OCR) results and extract metadata by using zones. Zones define how the OCR engine should recognize elements on the page, such as text, forms, tables, and graphics. For example, to capture invoice numbers from incoming documents, you can create a zone in the area of the document where invoice numbers appear. In addition, zones can extract metadata automatically and associate it with the original document.

Notes:

  • Dispatcher Stratus uses the Tesseract OCR engine.

  • The recommended minimum resolution for scanned documents is 200 DPI. Expect better results from documents scanned at 300 DPI or higher.

  • OCR accuracy depends on the orientation of the incoming document matching the orientation of the OCR zones as configured. If the orientations differ, rotate the document at scan time or use the image rotation option in Advanced Settings.
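
To see why orientation matters, consider what happens to a zone rectangle when the page is turned a quarter turn. The following Python sketch is illustrative only: rotate_zone_cw is a hypothetical helper, not part of Dispatcher Stratus, and it assumes a zone is described by Left, Top, Width, and Height in pixels measured from the top-left corner of the page.

```python
def rotate_zone_cw(left, top, width, height, page_height):
    """Return (left, top, width, height) of the same page region after the
    page is rotated 90 degrees clockwise.

    page_height is the height of the page in pixels BEFORE rotation.
    Hypothetical helper for illustration; not a Dispatcher Stratus API.
    """
    new_left = page_height - (top + height)  # old bottom edge becomes distance from the new left edge
    new_top = left                           # old left offset becomes the new top offset
    return (new_left, new_top, height, width)  # width and height swap


# A zone near the top-left of a 2550 x 3300 pixel portrait page.
original = (100, 200, 600, 150)
rotated = rotate_zone_cw(*original, page_height=3300)
print(rotated)  # (2950, 100, 150, 600)
```

The same content ends up at entirely different coordinates after rotation, so a zone configured for the unrotated layout would no longer cover it.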

To open the Advanced OCR node’s configuration window, add the node to your workflow and double-click it.

Configuring the Advanced OCR Node

Advanced OCR

  • Enabled - To enable this node in the current workflow, check this box. If you leave the box blank, the workflow ignores the node and documents pass through as if the node were not present. Note that a disabled node does not check for logic or error conditions.

  • Node Name - This field defaults to the node name, which appears in the workflow below the node icon. Use this field to specify a meaningful name that indicates how the node is used in the workflow.

  • Node Description - Enter an optional description for this node. A description can help you remember the purpose of the node in the workflow or distinguish nodes from each other. If the description is long, you can hover the mouse over the field to read its entire contents.

Buttons

  • Advanced Settings - To access additional settings, including image rotation and language recognition, click this button.
  • Help - To access Online Help, click this button.
  • Cancel - To exit the window without saving any changes, click this button.
  • Save - To preserve your node configuration and exit the window, click this button.

Advanced OCR Node Properties

On the Advanced OCR properties window, you can fully define and customize zones for your OCR processing. The window consists of the following areas:

Preview Area

Use the Preview area to upload a sample document you can use to help define your zones. The document should resemble the documents you want to scan.

When you first open the Advanced OCR node properties window, the Preview area contains only the Upload your document window, and many options on the screen are inactive. Once you upload a document, the image appears in the Preview area and the options activate.

To upload a document, click on the icon in the Upload your document window or click on the Upload icon on the Toolbar. The Open Sample window appears from which you can choose a document. In addition, the application provides several sample documents of various sizes that you can also use. See the following illustration:


Note: Sample documents are stored in the following directory:

    C:\Users\Administrator\AppData\Roaming\Konica Minolta\DST\Resources\ToOcrNode\Images 

Select a document and click Open. The sample document appears in the Preview area.

Note: After you select a sample document, you can switch to a different document by clicking the Upload icon on the Toolbar. If you have already created zones, a window appears after you select the new document, asking whether to save or delete the existing zones.

Top Toolbar

Use the toolbar at the top of the window to further define the zone as well as customize the view of the node properties window. Note that many options on the Toolbar do not activate until you upload a sample document to the Preview area.

When using the drop-down palettes on the Toolbar, pressing the Enter key or clicking anywhere outside of the palette applies those changes to the zone.

Toolbar Icon - Description
Zone Coordinates - Click to define specific coordinates for the zone. You can change the size of the zone by entering values (in pixels) in the Width and Height fields. You can also move the position of the zone by entering values (in pixels) in the Left and Top fields.
Zone Type - Click on this icon to define settings for a selected zone.
Zone Page Range - Click to specify on which pages to apply the zone. Options include:
  • All pages within allowable range - Select this radio button to ensure that the zone applies to all pages within the range.
  • Only these pages from allowable range - Select this radio button to ensure that the zone applies only to pages within a specified range. Next, enter the page range in the empty field provided below.
Delete - Click to delete a selected zone.
Pages - Click on the arrows to navigate through multiple pages of the sample document (if necessary).
Upload Sample Document - Click to find and upload another sample document to use in the Preview area.
Actual Size - Click to revert the preview sample document to its original size.
Fit to Width - Click to stretch the sample document to fit the width of the Preview area.
Whole Page - Click to fit the sample document completely in the Preview area.
Zoom controls - Use either the magnifying glass icons or the sliding bar to zoom in and out of the Preview area.

Zones List

Use this area to create, edit, and delete detection zones. Zones define areas of an imaged document for use by the OCR engine, and they can output text from the document. For example, to capture invoice numbers from incoming documents, you can create a zone in the area of the document where invoice numbers appear.

Once you upload a document to the Preview area, the Zones List activates. All defined zones (if any) for the node appear in the list, as in the following illustration:


To access additional options for zones, open the More Actions menu by clicking the three-dot icon at the upper-right corner of the Zones List area, as shown below:


Clicking the three dots in the Zones area will open up the More Actions Menu that allows you to:

Menu Option (Keyboard Shortcut) - Menu Action
Show / Hide all zones (F6) - Toggle the visibility of all zones on the Canvas. A “hidden” icon appears next to each hidden zone in the list. If the current selection includes a mix of shown and hidden zones, clicking this option hides all zones.
Delete all zones (Ctrl+Shift+Del) - Delete all zones from the Zone Editor / Canvas.

Clicking the three dots next to a zone allows you to:

Menu Option (Keyboard Shortcut) - Menu Action
Show / Hide zone - See the next table.
Delete (Del) - Delete the selected zone from the Zone Editor / Canvas.
Rename (F2) - Rename the selected zone.


The Show / Hide zone option opens a second menu with more options:

Menu Option (Keyboard Shortcut) - Menu Action
Show / Hide all zones (F6) - Toggle the visibility of all zones on the Canvas. A “hidden” icon appears next to each hidden zone in the list. If the current selection includes a mix of shown and hidden zones, clicking this option hides all zones.
Show / Hide this zone (F7) - Toggle the visibility of the selected zone on the Canvas.
Hide all zones but this (F9) - Hide all zones on the Canvas except the selected zone.
Delete selected zones (Del) - Delete the selected zones from the Zone Editor / Canvas.

Note: You can also Rename, Delete, and/or Show / Hide the properties of individual zones by right-clicking them in either the Zones List area or the Preview area and selecting an option from the menu that appears.

You can select multiple zones in two ways:

  1. Click and drag the mouse in the Preview area to highlight multiple zones at once.
  2. Use Ctrl+Click to select multiple zones. This method works in both the Zones List area and the Preview area.

If you have multiple zones selected, you can open the More Actions menu from any of the selected zones to display the options for modifying multiple zones.

Creating Zones

To create zones, do the following:

  1. Add New Zone - Click this button to access a drop-down palette, as in the following illustration:

  2. On the Add New Zone drop-down palette, do the following:

    • Zone Name - Enter an identifying name for the zone (e.g., invoice or address). You can enter up to 15 characters.
    • Left and Top - Enter a value (in pixels) to position the zone from the left and top of the document.
    • Width - Enter a value (in pixels) to define an appropriate width for the zone.
    • Height - Enter a value (in pixels) to define an appropriate height for the zone.
    • Zone Page Range - Specify the pages on which the zone will be applied. Options include:
      • All pages within allowable range - Select this radio button to apply the zone to all pages within the specified range. For example, if the Page range to process is set to Every page, a zone configured on the first page of the document is automatically applied to all other pages in the document.
      • Only these pages from allowable range - Select this radio button to apply the zone to only a specific range of pages within the specified range. Then enter the page range in the empty field provided.
    • Save - Click this button when you are done. The zone appears in the specified location on the Preview area. See the illustration below:

    • Cancel - Click this button to exit the drop-down palette without saving any changes.
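
Because Left, Top, Width, and Height are entered in pixels, it can help to work out the values from physical measurements before typing them in. The following Python sketch shows the arithmetic. It is not part of the product: the Zone class and both helper functions are hypothetical names, and the sketch assumes that the zone's pixel grid matches the resolution (DPI) at which the sample document was scanned.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Zone:
    """Hypothetical stand-in for a zone definition (all values in pixels)."""
    name: str
    left: int
    top: int
    width: int
    height: int

    def __post_init__(self):
        # The Zone Name field accepts up to 15 characters.
        if len(self.name) > 15:
            raise ValueError("zone name is limited to 15 characters")

    def crop_box(self):
        """Return (left, upper, right, lower), the box order that imaging
        libraries such as Pillow expect when cropping."""
        return (self.left, self.top,
                self.left + self.width, self.top + self.height)


def inches_to_pixels(inches, dpi):
    """Convert a physical length to pixels at the given scan resolution."""
    return round(inches * dpi)


def zone_from_inches(name, left_in, top_in, width_in, height_in, dpi):
    """Build a Zone from measurements taken in inches on the paper original."""
    return Zone(name,
                inches_to_pixels(left_in, dpi),
                inches_to_pixels(top_in, dpi),
                inches_to_pixels(width_in, dpi),
                inches_to_pixels(height_in, dpi))


# An invoice-number box 5.5 in from the left and 1 in from the top,
# 2 in wide and 0.5 in tall, on a 300 DPI scan.
zone = zone_from_inches("invoice", 5.5, 1.0, 2.0, 0.5, dpi=300)
print(zone.left, zone.top, zone.width, zone.height)  # 1650 300 600 150
print(zone.crop_box())                               # (1650, 300, 2250, 450)
```

The same physical box at 200 DPI would be 1100, 200, 400, 100, which is why zones drawn on a sample at one resolution may not line up with documents scanned at another.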

Editing Zones

To edit a zone, click on it in the Zones List or the Preview area. You have the following options for editing a selected zone:

  • Preview area
    • Relocate - Click on the zone and drag it to a new area on the Preview area.
    • Resize - Click on one of the handles on the zone border and drag the edge to a new location on the Preview area. Note that the handles may not be available if you drastically change the size of the sample document. In such cases, click on the icon on the Toolbar to use the Zone Coordinates option to resize the zone.

Defining Type/Content for Zones

You can choose settings for each zone to match the specific format of your zone content. With the zone selected on the Preview area, click on the icon on the toolbar to display the Zone Type drop-down palette. Next, choose a type for the zone:

  • Text Zone - Zone contents will be treated as flowing text.
  • Table Zone - Zone contents will be treated as a table.
  • Graphic Zone - Zone contents will be treated as an embedded image, and not as recognized text (e.g., photos, logos, and drawings).

OCR Metadata

Once you define an OCR zone, other nodes in the workflow can reference it. For more information about OCR Metadata and Metadata syntax, see the Metadata Browsing page.

Additional Settings

You can specify which pages to include in the OCR process and the output format. These fields appear in the lower-left corner of the Advanced OCR Node properties window.

Specifying Page Ranges to Process

You can specify which pages to include in the OCR process. The Page range to process area appears at the left of the node properties window, below the Zones List. Click the drop-down to display the following options:


  • Every page - Process every page.

  • Every even page - Process even pages only.

  • Every odd page - Process odd pages only.

  • First page - Process the first page only.

  • Last page - Process the last page only.

  • Define your own page range - Process a custom page range. Once you choose this option, an empty field appears where you can enter the page range. You have the following options:

    • Specify a page range by using commas and/or dash signs counting from the start of the document. For example, enter 1, 2, 5-7 to process pages 1, 2, 5, 6, and 7.

    • Specify a specific sequence within a range of pages by using parentheses. For example, enter 1-10(3) to process every third page from pages 1 to 10.

    • Specify the last page by using the keyword end. To count backward from the last page, add a negative offset in parentheses. For example, enter end(-5) - end to process pages 15-20 of a 20-page document.

      Other examples include:

    • To process pages 1, 2, 5, 6, 7, and 19 of a 20-page document, enter: 1, 2, 5-7, end(-1).

    • To process pages 10-15 of a 20-page document, enter: 10-end(-5).

    • To process every other page from pages 10-15 of a 20-page document, enter: 10-end(-5)(2).

    • To process pages 15-20 of a 25-page document, enter: end(-10)-end(-5).

    • To process pages 10-20 of a 20-page document, enter: end(-10)-end.

    Note: If you specify a page range that does not correspond to the number of pages in the incoming document (e.g., processing pages 10-20 for a three-page document), the file will go out on error.
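
The custom page range syntax described above can be sketched in Python as follows. This is an illustration of the documented examples, not the product's actual parser. The function name expand_page_range is hypothetical, and two behaviors are assumptions: a step such as (3) is taken to start counting at the first page of the range, and an out-of-range request raises ValueError to mirror the error condition in the note above.

```python
import re

# One endpoint is either a page number or "end" with an optional negative offset.
_ENDPOINT = r"(?:\d+|end(?:\(-\d+\))?)"
# One item: endpoint, optional "-endpoint", optional "(step)".
_ITEM = re.compile(rf"^({_ENDPOINT})(?:-({_ENDPOINT}))?(?:\((\d+)\))?$")


def _resolve(token, total_pages):
    """Turn '7', 'end', or 'end(-5)' into an absolute page number."""
    if token.startswith("end"):
        offset = int(token[4:-1]) if "(" in token else 0
        return total_pages + offset
    return int(token)


def expand_page_range(spec, total_pages):
    """Expand a page range string into a list of 1-based page numbers.

    Hypothetical helper. Assumes a step starts at the first page of its
    range, and raises ValueError when the range does not fit the document.
    """
    pages = []
    for raw in spec.split(","):
        item = raw.replace(" ", "")  # spaces are allowed around commas and dashes
        match = _ITEM.match(item)
        if not match:
            raise ValueError(f"cannot parse page range item: {raw!r}")
        first, last, step = match.groups()
        start = _resolve(first, total_pages)
        stop = _resolve(last, total_pages) if last else start
        stride = int(step) if step else 1
        if stride < 1 or not (1 <= start <= stop <= total_pages):
            raise ValueError(f"page range {raw!r} does not fit a "
                             f"{total_pages}-page document")
        for page in range(start, stop + 1, stride):
            if page not in pages:  # keep order, drop duplicates
                pages.append(page)
    return pages


print(expand_page_range("1, 2, 5-7, end(-1)", 20))  # [1, 2, 5, 6, 7, 19]
print(expand_page_range("10-end(-5)(2)", 20))       # [10, 12, 14]
print(expand_page_range("end(-10)-end(-5)", 25))    # [15, 16, 17, 18, 19, 20]
```

Under these assumptions, calling expand_page_range("10-20", 3) raises ValueError, which corresponds to the three-page document case described in the note.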

Choosing an Output Format

Use the Output field to specify the format of the output file. This area appears at the left of the node properties window, below the Zones List.

At the Output field, if you click on the drop-down, a list of output options appears:

  • Original Document + Metadata - Outputs the original file along with metadata extracted from defined zones. This is the default setting and is necessary to use metadata in other nodes within the workflow, such as Metadata to File and Metadata Route, for further processing.
  • PDF Searchable - Outputs a PDF that retains the original image in the foreground, with the recognized text hidden behind it in the correct position. Recommended for archiving and indexing documents. With this format, the entire input document is included in the output.
  • Text - Outputs the document to plain text (*.TXT) that can be read by most text editors and word processors.
  • Comma Separated Text - Outputs the document to a comma-separated text file (*.CSV) that can be read by Excel.
  • Text with line breaks - Outputs the document to text with a line break after each line.
  • Unicode Text - Outputs the document to plain text, using two-byte Unicode characters.
  • Unicode Comma Separated Text - Outputs the document to a comma-separated text file using two-byte Unicode characters. The resulting file can be read by Excel.
  • Unicode Text with line breaks - Outputs the document to text with a line break after each line and uses two-byte Unicode characters.

Note: All processed output files include only the content captured in the user-defined zones, except for Original Document + Metadata and PDF Searchable. These output formats include the original file along with the content captured in the zones.