[Attribute Extraction Settings] Screen

This screen is displayed in the following cases.

It enables you to configure the method for extracting attributes from a form type.

To use AI extraction, select [AI Extraction]. To extract strings located above, below, to the left of, or to the right of a keyword as attributes, select [Extract by Keyword]. To extract attributes from a specific area on the form, select [Extract by Area Specification].

The displayed screen depends on the item selected in [Extraction Method].

  • [Extraction Method] cannot be set to [Extract by Area Specification] for a preset form.
  • Only the settings for the method selected in [Extraction Method] are saved. The settings for the methods that are not selected are discarded.
  • AI extraction can be used for a limited number of pages. You can increase the limit by purchasing an optional license.
  • If the page limit is reached while a job is being processed, processing continues even though the limit is exceeded. The number of pages over the limit is deducted from the number of pages available after you purchase an additional license.
  • If you run a job after the page limit has been reached, the job does not end in an error, but the AI extraction results are empty. In this case, enter the data manually.
  • AI extraction uses generative AI, but the uploaded files and extraction results are not used for AI learning.
    Also note that the extraction results may not be accurate. If necessary, check and correct the results on the [Data Validation] screen of the workspace.
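
The page-limit notes above can be illustrated with a small worked calculation. This is only a sketch of one reading of the rule (pages processed over the limit are charged against the additional license). The function name and the numbers are hypothetical and are not taken from the product.

```python
def pages_after_purchase(limit, used, purchased):
    """Return (overage, remaining pages) after buying an additional license.

    Assumed reading of the manual: pages processed beyond the limit are
    deducted from the pages added by the additional license.
    """
    overage = max(0, used - limit)
    remaining = (limit + purchased) - used
    return overage, remaining


# Hypothetical numbers: a 1,000-page limit, 1,020 pages actually processed
# (a job ran past the limit), then a 500-page additional license is bought.
overage, remaining = pages_after_purchase(limit=1000, used=1020, purchased=500)
print(overage, remaining)  # 20 480
```

Under this reading, 20 of the 500 newly purchased pages are already consumed, leaving 480.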

Screen when [Extraction Method] is set to [AI Extraction]

[Attribute Name]

Enter the attribute name.

  • A maximum of 100 characters can be entered for the attribute name.
  • You cannot set the same name as an existing attribute.
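
If attribute names are prepared outside the screen (for example, in a spreadsheet before they are entered), the two rules above can be checked ahead of time. The following is a minimal sketch. The function name is hypothetical, and it assumes that length is counted in characters and that names are compared exactly, neither of which the manual specifies.

```python
MAX_NAME_LENGTH = 100  # limit stated for [Attribute Name]


def validate_attribute_name(name, existing_names):
    """Return a list of problems with a proposed attribute name."""
    problems = []
    if len(name) > MAX_NAME_LENGTH:
        problems.append("name exceeds 100 characters")
    # Assumption: duplicates are detected by exact string comparison.
    if name in existing_names:
        problems.append("name already exists")
    return problems


print(validate_attribute_name("Total", {"Total", "Invoice No"}))
# ['name already exists']
```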

[Extraction Method]

Select [AI Extraction].

[Extracted Item]

Selects the preset attributes of the target form type.
If you select [Custom], custom attributes are used instead of preset attributes.

[Remove Space]

Configures whether to delete spaces from the extracted strings.

[Extract the string that matches the regular expression]

Enables you to specify the format of the string to extract using a regular expression. Enter the regular expression in the text box displayed by enabling this setting.

  • A regular expression is a single pattern string that can match many different strings.
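
As an illustration of what such a pattern does, the sketch below uses Python's `re` module to pull a date out of a longer extracted string. The pattern and the sample text are hypothetical, and the regular expression syntax accepted by the product may differ from Python's, so test any pattern on the screen itself.

```python
import re


def extract_matching(text, pattern):
    """Return the first part of text that matches pattern, or None."""
    match = re.search(pattern, text)
    return match.group(0) if match else None


# Hypothetical pattern: a date written as four digits, two digits, two digits.
DATE_PATTERN = r"\d{4}/\d{2}/\d{2}"

print(extract_matching("Issued: 2024/03/15 (Fri)", DATE_PATTERN))  # 2024/03/15
print(extract_matching("Issued: March 15", DATE_PATTERN))          # None
```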

[OK]

Saves the settings.

Screen when [Extraction Method] is set to [Extract by Keyword]

[Attribute Name]

Enter the attribute name.

  • A maximum of 100 characters can be entered for the attribute name.
  • You cannot set the same name as an existing attribute.

[Extraction Method]

Select [Extract by Keyword].

[Extracted Item]

Selects the preset attributes of the target form type.
You cannot select preset attributes for AI extraction.
If you select [Custom], custom attributes are used instead of preset attributes.

[Keyword]

Enter the keyword in the text box displayed by clicking [Add]. You can click [Add] to add multiple keywords.

Click [image] to delete the entry field.

  • When a keyword is searched for in the strings on a form, the following differences are ignored.
    • Spaces
    • Hyphens, long-vowel symbols, or minus symbols
    • Full-width and half-width characters
    • Upper case and lower case characters
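
The matching rules in the list above can be approximated by normalizing both the keyword and the form text before comparing them. The sketch below is one possible model, not the product's implementation. The set of hyphen-like characters and the function names are assumptions.

```python
import unicodedata

# Assumed set of hyphens, minus signs, and long-vowel marks treated as equal.
DASH_LIKE = "-\u2010\u2011\u2012\u2013\u2014\u2015\u2212\u30fc\uff70\uff0d"


def normalize(text):
    """Fold width, case, dash variants, and spaces for loose comparison."""
    text = unicodedata.normalize("NFKC", text)  # full-width to half-width forms
    text = text.casefold()                      # ignore upper/lower case
    text = "".join("-" if ch in DASH_LIKE else ch for ch in text)
    return "".join(ch for ch in text if not ch.isspace())


def contains_keyword(form_text, keyword):
    """Return True if keyword occurs in form_text after normalization."""
    return normalize(keyword) in normalize(form_text)


print(contains_keyword("TOTAL  AMOUNT 1,200", "total amount"))  # True
```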

[Keywords to Exclude]

Specifies a keyword to exclude, which helps prevent unintended strings (values) from being detected.

Enter the keyword to exclude in the text box displayed by clicking [Add]. You can click [Add] to add multiple keywords to exclude.

Click [image] to delete the entry field.

In [Keywords to Exclude], specify a string that includes a keyword.

When attributes are detected, both the keyword to exclude (string A) and the keyword to detect (string B) are searched for in the strings on the form. Any area where a match for string A overlaps a match for string B is excluded from the results.

Example 1
  • Keyword: sub
  • Keyword to Exclude: subsum
  • Description: Because "subsum" includes "sub", it would normally be detected. If "subsum" is specified as a keyword to exclude, "subsum" is not detected.

Example 2
  • Keyword: total
  • Keyword to Exclude: subtotal
  • Description: Because "subtotal" includes "total", it would normally be detected. If "subtotal" is specified as a keyword to exclude, "subtotal" is not detected.

Example 3
  • Keyword: total
  • Keyword to Exclude: tal
  • Description: "tal" does not include "total", so the "tal" keyword to exclude is ignored.

Example 4
  • Keyword: sub, grandtotal
  • Keyword to Exclude: subtotal
  • Description: "subtotal" includes "sub" but does not include "grandtotal". In this case, values such as "sub", "grandtotal", and "subsum" (which includes "sub") are detected, but "subtotal" is not.

Example 5
  • Keyword: costsubtotal
  • Keyword to Exclude: costsubtotal
  • Description: If the keyword is the same as the keyword to exclude, both searches return the same results, so no values with "costsubtotal" are detected. (The keyword to exclude disables the "costsubtotal" keyword.)
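
The overlap rule and the examples above can be modeled on a single line of text. The following sketch is a simplification: the product works on areas of a form, whereas this code works on character positions in one string and compares in plain lowercase. All function names are hypothetical.

```python
def find_spans(text, needle):
    """Return (start, end) positions of every occurrence of needle in text."""
    spans, start = [], text.find(needle)
    while start != -1:
        spans.append((start, start + len(needle)))
        start = text.find(needle, start + 1)
    return spans


def detect_keywords(text, keywords, excludes):
    """Return (keyword, position) hits that no exclude match overlaps."""
    text = text.lower()
    keywords = [k.lower() for k in keywords]
    # A keyword to exclude is ignored unless it includes at least one keyword.
    active = [e.lower() for e in excludes
              if any(k in e.lower() for k in keywords)]
    blocked = [span for e in active for span in find_spans(text, e)]
    hits = []
    for k in keywords:
        for start, end in find_spans(text, k):
            overlaps = any(start < b_end and b_start < end
                           for b_start, b_end in blocked)
            if not overlaps:
                hits.append((k, start))
    return hits


# Example 2: "total" inside "Subtotal" is suppressed, the standalone one is kept.
print(detect_keywords("Subtotal 100 Total 120", ["total"], ["subtotal"]))
# [('total', 13)]
```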

  • The following differences are ignored both when determining whether a keyword to exclude includes a keyword and when searching the strings on a form for keywords and keywords to exclude.
    • Spaces
    • Hyphens, long-vowel symbols, or minus symbols
    • Full-width and half-width characters
    • Upper case and lower case characters

[Extraction Position]

Enables you to select the position, relative to the keyword, from which the string is extracted: top, bottom, left, or right. You can also select [Right or bottom of keyword (prioritize right)] or [Right or bottom of keyword (prioritize bottom)].
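
To make the difference between the two "prioritize" options concrete, the sketch below models recognized strings as rectangles and looks for the nearest one in each direction. It is a hypothetical model. The `Box` type, the "nearest overlapping box" rule, and the assumption that the second direction is used only when the first finds nothing are not taken from the product.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Box:
    text: str
    x: float  # left edge
    y: float  # top edge (y grows downward)
    w: float
    h: float


def _overlap(a_start, a_len, b_start, b_len):
    """Return True if two 1-D ranges share any extent."""
    return a_start < b_start + b_len and b_start < a_start + a_len


def nearest(keyword: Box, boxes: list, direction: str) -> Optional[Box]:
    """Return the closest box in the given direction, or None."""
    if direction == "right":
        found = [b for b in boxes if b.x >= keyword.x + keyword.w
                 and _overlap(keyword.y, keyword.h, b.y, b.h)]
        return min(found, key=lambda b: b.x, default=None)
    if direction == "left":
        found = [b for b in boxes if b.x + b.w <= keyword.x
                 and _overlap(keyword.y, keyword.h, b.y, b.h)]
        return max(found, key=lambda b: b.x + b.w, default=None)
    if direction == "bottom":
        found = [b for b in boxes if b.y >= keyword.y + keyword.h
                 and _overlap(keyword.x, keyword.w, b.x, b.w)]
        return min(found, key=lambda b: b.y, default=None)
    if direction == "top":
        found = [b for b in boxes if b.y + b.h <= keyword.y
                 and _overlap(keyword.x, keyword.w, b.x, b.w)]
        return max(found, key=lambda b: b.y + b.h, default=None)
    raise ValueError(f"unknown direction: {direction}")


def extract(keyword: Box, boxes: list, order: tuple) -> Optional[str]:
    """Try each direction in order and return the first string found."""
    for direction in order:
        hit = nearest(keyword, boxes, direction)
        if hit is not None:
            return hit.text
    return None


keyword = Box("Total", x=100, y=100, w=80, h=20)
boxes = [Box("1,200", x=200, y=100, w=60, h=20),  # to the right
         Box("Tax", x=100, y=130, w=40, h=20)]    # below
print(extract(keyword, boxes, ("right", "bottom")))  # 1,200
print(extract(keyword, boxes, ("bottom", "right")))  # Tax
```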

[Remove Space]

Configures whether to delete spaces from the extracted strings.

[Extract the string that matches the regular expression]

Enables you to specify the format of the string to extract using a regular expression. Enter the regular expression in the text box displayed by enabling this setting.

  • A regular expression is a single pattern string that can match many different strings.

[OK]

Saves the settings.

Screen when [Extraction Method] is set to [Extract by Area Specification]

[Attribute Name]

Enter the attribute name.

[Extraction Method]

Select [Extract by Area Specification].

[Area Specification]

Select the file on the displayed screen.

  • The file selected here cannot be viewed or downloaded after it is uploaded.

You can select files in the following formats.

  • JPEG (*.jfif, *.pjpeg, *.jpeg, *.pjp, *.jpg)
  • TIFF (*.tiff, *.tif)
  • PDF (*.pdf)
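
When sample files are collected in advance, the extension list above can be used as a quick pre-check. This is a minimal sketch. The function name is hypothetical, and treating extensions as case-insensitive is an assumption.

```python
from pathlib import Path

# Extensions listed for [Area Specification].
SUPPORTED_EXTENSIONS = {
    ".jfif", ".pjpeg", ".jpeg", ".pjp", ".jpg",  # JPEG
    ".tiff", ".tif",                             # TIFF
    ".pdf",                                      # PDF
}


def is_supported_file(filename):
    """Return True if the file extension is in the supported list."""
    # Assumption: the extension check ignores upper/lower case.
    return Path(filename).suffix.lower() in SUPPORTED_EXTENSIONS


print(is_supported_file("invoice_sample.PDF"))  # True
print(is_supported_file("invoice_sample.png"))  # False
```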

[Reduce]/[Enlarge]

Reduces/enlarges the form preview.

[Start Area Specification]

Displays the area defined by the X coordinate, Y coordinate, width, and height in the form preview.

[Previous Page]/[Next Page]

Changes the page displayed in the form preview.

Form Preview

Displays a preview of the file selected in [Area Specification].

[X Coordinate]

Enter the X coordinate of the area as a value from 0 to 7,014. The area is displayed in the form preview when you click [Start Area Specification].

[Y Coordinate]

Enter the Y coordinate of the area as a value from 0 to 10,204. The area is displayed in the form preview when you click [Start Area Specification].

[Area Width]

Enter the width of the area as a value from 1 to 7,014. The area is displayed in the form preview when you click [Start Area Specification].

[Area Height]

Enter the height of the area as a value from 1 to 10,204. The area is displayed in the form preview when you click [Start Area Specification].
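
The four ranges above can be checked together before the values are entered. The sketch below only enforces the per-field ranges stated in this section. Whether the product also requires the area to fit within the page (for example, X coordinate plus width) is not stated, so that is not checked. The function name is hypothetical.

```python
# Allowed range for each field, as stated for the screen.
RANGES = {
    "x": (0, 7014),
    "y": (0, 10204),
    "width": (1, 7014),
    "height": (1, 10204),
}


def validate_area(x, y, width, height):
    """Return a message for every value that is outside its allowed range."""
    values = {"x": x, "y": y, "width": width, "height": height}
    return [
        f"{name} must be from {low} to {high}"
        for name, (low, high) in RANGES.items()
        if not low <= values[name] <= high
    ]


print(validate_area(x=120, y=300, width=800, height=150))  # []
print(validate_area(x=-1, y=0, width=0, height=10205))
# ['x must be from 0 to 7014', 'width must be from 1 to 7014', 'height must be from 1 to 10204']
```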

[Remove Space]

Configures whether to delete spaces from the extracted strings.

[OK]

Saves the settings.