Validation Rules

Validation rules help you maintain data quality by providing a customizable framework for enforcing consistency, accuracy, and compliance with predefined standards. This page explains how to set up custom validation rules in super.AI.


How to set up validation rules

Note: You can set up one or more validation rules per field (key-value pair) or table column.

  1. In the field or table column configuration, click "add validation rule".
  2. Choose ID: The validation rule ID is mandatory. You can enter a custom ID; if you leave it blank, an ID is added automatically.
    Note: This ID appears in the push and email notifications, so choose one that is intuitive and self-explanatory.
  3. Choose Title: The validation title is optional. However, it appears in the push and email notifications, so choose a title that is intuitive and self-explanatory.
  4. Choose Description: The validation description is optional. However, it appears in the push and email notifications, so make sure it is clear and self-explanatory.
  5. Choose Validation Type: See "Validation Types" below.
  6. Choose Action: See "Actions" below.

Validation Types

  1. min. Confidence: This validation rule checks whether the confidence score of an AI-generated prediction meets a defined minimum threshold. This helps ensure that the extracted data is reliable and within acceptable quality limits.

    This validation rule fails if

    • a field's confidence score is below the chosen threshold
    • at least one non-header row's confidence score is below the chosen threshold (meaning every row must have a confidence score at or above the threshold)

  2. Non-Empty: This validation rule ensures that a field contains data and is not left blank.
    This rule is commonly used to verify that required fields, such as names or IDs, have been filled in and are not null or empty.

    This validation rule fails if

    • the field's value is null
    • at least one non-header row has a null value in the target column (meaning every row must contain a value in this column)

  3. Expression: An expression is a formula or logic-based rule that defines how data should be processed, calculated, or validated. It allows you to create custom checks by combining the extracted value, operators, and constants to meet specific project requirements.

    • Use {value} to reference an extracted value (either the value of the field or a value in a cell).
    • Use constants of type integer, float, or string.
    • Use the following comparison operators: == (equal), != (not equal), > (greater than), < (less than), >= (greater than or equal to), <= (less than or equal to).

    Examples

    • {value} == "abc" (note that string values must be enclosed in double quotes)
    • {value} >= 120 (assuming that the extracted value is of type integer or float)

    This validation rule fails if

    • the expression itself is invalid
    • a field's value does not match the expression criteria
    • at least one non-header row does not match the expression criteria (meaning every row must match)

  4. RegEx: A regex, or regular expression, is a pattern used to search, match, or validate specific text or data formats. It allows you to define rules for identifying or filtering strings based on their structure, such as email addresses, phone numbers, or specific keywords.

    Examples

    • ^[\w\.-]+@[\w\.-]+\.\w+$ to validate email addresses, example match: user@example.com
    • ^\d{4}-\d{2}-\d{2}$ to ensure dates are in YYYY-MM-DD format for consistency in processing
    • ^\d{5}(-\d{4})?$ to validate U.S. ZIP codes with or without the 4-digit extension, example match: 12345 or 12345-6789
    • ^[A-Z0-9]{8,12}$ to ensure the extracted value consists of 8 to 12 uppercase letters or digits
    • ^.+$ to validate that the extracted value contains at least one character

    This validation rule fails if

    • a field's value does not match the RegEx criteria
    • at least one non-header row does not match the RegEx criteria (meaning every row must match)
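
To make the pass and fail conditions of the four validation types concrete, here is a minimal Python sketch. It is not super.AI's implementation: the function names, the expression grammar ("{value} <operator> <constant>"), and the use of Python's `re.search` semantics are all assumptions made for illustration. It only mirrors the rules as described above, including the rule that a table column passes only if every non-header cell passes.

```python
import ast
import operator
import re

# Comparison operators listed in the Expression section.
OPERATORS = {
    "==": operator.eq, "!=": operator.ne,
    ">=": operator.ge, "<=": operator.le,
    ">": operator.gt, "<": operator.lt,
}
# Assumed grammar (hypothetical): "{value} <operator> <constant>".
EXPRESSION = re.compile(r"^\{value\}\s*(==|!=|>=|<=|>|<)\s*(.+)$")


def check_min_confidence(confidence, threshold):
    # Fails when the confidence score is below the threshold.
    return confidence >= threshold


def check_non_empty(value):
    # Fails for null (None) or empty values.
    return value is not None and value != ""


def check_expression(value, expression):
    match = EXPRESSION.match(expression.strip())
    if match is None:
        return False  # an invalid expression counts as a failure
    try:
        constant = ast.literal_eval(match.group(2))
    except (ValueError, SyntaxError):
        return False  # e.g. a string constant without double quotes
    if not isinstance(constant, (int, float, str)):
        return False  # only integer, float, and string constants
    try:
        return bool(OPERATORS[match.group(1)](value, constant))
    except TypeError:
        return False  # e.g. ordering a string against a number


def check_regex(value, pattern):
    # Assumes Python regex semantics; the documented patterns are anchored.
    return re.search(pattern, str(value)) is not None


def check_column(cells, check, *args):
    # A table column passes only if every non-header cell passes.
    return all(check(cell, *args) for cell in cells)


print(check_expression(150, "{value} >= 120"))           # True
print(check_regex("12345-6789", r"^\d{5}(-\d{4})?$"))    # True
print(check_column(["2024-01-31", "31.01.2024"],
                   check_regex, r"^\d{4}-\d{2}-\d{2}$"))  # False
```

The last call returns False because one cell is not in YYYY-MM-DD format, which is the "at least one non-header row" failure case.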


Actions

Execute if succeeds toggle: By default, actions are executed when a validation fails. This toggle lets you also execute actions when a validation succeeds ("passed").

  1. Auto-tagging: Immediately adds a specified tag to the document.

  2. Email notification: Sends an email with the specified subject to the specified email addresses. Separate multiple email addresses with commas. You can choose between an hourly and a daily (8:00 am UTC) schedule.
    The email is sent from validations@prod-eu.super.ai and contains the validation rule details as well as a list of documents with direct links.

  3. Push notification: Immediately sends a webhook request to the specified URL.

  4. Remove value: Immediately removes the value (field) or all values in a column (table column).


📘

One validation rule can have multiple actions

You can fully customize actions. For example, add tag A when a validation rule fails and tag B when the same validation rule succeeds, or add a tag and send an email when a validation rule fails.
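
The interplay between multiple actions and the "Execute if succeeds" toggle can be sketched as follows. This is an illustration only: the `Action` class and `actions_to_run` function are hypothetical names, and the sketch assumes that an action with the toggle enabled runs only when the rule passes, which is one reading of the tag A / tag B example; the product may behave differently.

```python
from dataclasses import dataclass


@dataclass
class Action:
    name: str
    # Mirrors the "Execute if succeeds" toggle; off by default, so the
    # action runs on failed validations.
    execute_if_succeeds: bool = False


def actions_to_run(rule_passed, actions):
    # Assumption: an action runs on exactly one outcome, chosen by its toggle.
    return [a.name for a in actions if a.execute_if_succeeds == rule_passed]


actions = [
    Action("add tag A"),
    Action("send email"),
    Action("add tag B", execute_if_succeeds=True),
]
print(actions_to_run(False, actions))  # ['add tag A', 'send email']
print(actions_to_run(True, actions))   # ['add tag B']
```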


🚧

By default, failed validations lead to job state 'Needs Review'

Note that by default, if at least one validation rule fails, the job is automatically assigned the state 'Needs Review' (see Job States). This can be changed in the Review Settings.