By default, each document goes through the following workflow:
- Document rotation model (fully automated)
- Document processor model (fully automated)
- Form and signature extraction (fully automated)
- Fuzzy matching (fully automated)
- Document annotation (human-in-the-loop task)
You can see the status of those tasks by clicking on the data point in the work queue.
Completion of all five tasks is required to change the data point status to "completed"
By default, all five tasks are required. The human-in-the-loop task cannot be skipped automatically: it must be "completed" before the data point/document status changes to "completed" and downloadable outputs are created.
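The completion rule above can be sketched in a few lines of Python. This is only an illustration of the logic, not the platform's API: the task identifiers, the `data_point_status` function, and the "in_progress" status string are hypothetical names chosen for this example.

```python
from typing import Mapping

# Task names mirror the default workflow list; the identifiers themselves
# are hypothetical and not taken from the product.
DEFAULT_TASKS = (
    "document_rotation",
    "document_processor",
    "form_and_signature_extraction",
    "fuzzy_matching",
    "document_annotation",  # the human-in-the-loop task
)


def data_point_status(task_statuses: Mapping[str, str]) -> str:
    """Return "completed" only when every default task is "completed".

    "in_progress" is an assumed placeholder for any other state.
    """
    all_done = all(
        task_statuses.get(task) == "completed" for task in DEFAULT_TASKS
    )
    return "completed" if all_done else "in_progress"
```

With the four automated tasks finished but the annotation task still open, this sketch reports "in_progress", matching the rule that the human-in-the-loop task cannot be skipped.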
Click "Process myself" on the left side to review the processed documents. On the left side, you see the formatted document (with bounding boxes), and on the right side, you see the key-value pairs that have been automatically identified and classified.
Clicking a bounding box on the left side takes you to the corresponding key-value pair in the list. Clicking a value on the right side takes you to the place where it was found in the document.
Clicking "submit" finishes the human-in-the-loop (HITL) task and sets the data point/document status to "completed".
Data points are assigned to specific users
Once "Process myself" is clicked, the data point/task is assigned to the user who clicked it and cannot be reviewed by another user. If you want to assign the data point/task to a different user, you have two options: either you reprocess the data point and another user clicks "Process myself", or you submit the data point so that it can then be annotated by another user.
A confidence score, or classification threshold, indicates how confident the machine learning model is that the respective value has been correctly assigned to the respective key. The score can have a value between 0 (low) and 1 (high).
You can review the confidence scores at field level by clicking the three dots next to the value.
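As a rough illustration of how confidence scores can guide a review, the sketch below lists the fields whose score falls below a chosen cut-off, lowest first. The function name `fields_to_review`, the example field names, and the default cut-off of 0.8 are assumptions made for this example; they are not values or calls provided by the product.

```python
from typing import Dict, List


def fields_to_review(
    confidences: Dict[str, float], threshold: float = 0.8
) -> List[str]:
    """Return keys whose confidence is below `threshold`, lowest score first.

    Scores are expected to lie between 0 (low) and 1 (high).
    The default threshold of 0.8 is an arbitrary, illustrative choice.
    """
    for key, score in confidences.items():
        if not 0.0 <= score <= 1.0:
            raise ValueError(f"confidence for {key!r} is outside [0, 1]: {score}")
    low = [key for key, score in confidences.items() if score < threshold]
    return sorted(low, key=lambda key: confidences[key])


# Hypothetical scores for three extracted fields.
scores = {"invoice_number": 0.97, "total": 0.42, "date": 0.75}
print(fields_to_review(scores))
```

Sorting by score puts the least certain values at the top of the list, which is usually where a reviewer's attention is best spent.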
Clicking "submit" completes the document annotation task.
Once all five tasks are completed, the document/data point status is "completed" and the extracted information can be downloaded in .csv or .json format.
Values (tables) need to have keys (table-keys) assigned to them
Values of all data types (including tables) must have a key assigned in order to be extracted. You may find a list of values in the "to be labeled" section; any of those values that are not assigned to a key are lost once the task is submitted.
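The effect of submitting with unassigned values can be pictured with a small sketch. It assumes, purely for illustration, that each value is represented as a `(key, value)` pair in which an unassigned key is `None`; the function `extract_on_submit` and this data shape are hypothetical and do not describe the product's internal format.

```python
from typing import Dict, List, Optional, Tuple


def extract_on_submit(
    labeled_values: List[Tuple[Optional[str], str]]
) -> Dict[str, str]:
    """Keep only values that have a key; values without a key are dropped."""
    return {key: value for key, value in labeled_values if key is not None}


# Two values have keys; the third is still "to be labeled" (key is None).
values = [
    ("invoice_number", "INV-001"),
    ("total", "120.00"),
    (None, "Net 30"),
]
print(extract_on_submit(values))
```

In this sketch the value "Net 30" does not appear in the result, mirroring how an unassigned value is lost once the task is submitted.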
For a deep-dive into our labeling interface, please see the following How To guide