How to process a document using Document.Extract via the API

How to get the API Key

Super.AI uses API keys to authenticate API requests. You can view and manage your API keys in your dashboard by clicking on the profile icon in the lower left of the screen.

Figure: Access to API keys in the lower left of your project’s dashboard

Upload PDFs to a given project (Document.Extract)

First, submit a document to a specific project. The documentUrl must be either a public URL or a data URL.

curl "https://api.super.ai/v1/apps/<project_uuid>/jobs" \
-X POST \
-H "API-KEY: <your_api_key>" \
-H "Content-Type: application/json" \
-d '{ "inputs": [ {"documentUrl":"<public_url_or_data_url>"} ]}'
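
The same submission can be sketched in Python. This is a minimal sketch using only the standard library; the function name build_job_request and the API_BASE constant are illustrative names, not part of the Super.AI API. It builds the request without sending it, so the payload can be inspected first.

```python
import json
import urllib.request

API_BASE = "https://api.super.ai/v1"  # base URL taken from the curl example above


def build_job_request(project_uuid, api_key, document_url):
    """Build (but do not send) the POST request that submits one document."""
    payload = {"inputs": [{"documentUrl": document_url}]}
    return urllib.request.Request(
        url=f"{API_BASE}/apps/{project_uuid}/jobs",
        data=json.dumps(payload).encode("utf-8"),
        headers={"API-KEY": api_key, "Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    # Placeholder values; replace them with a real project UUID, key, and URL.
    req = build_job_request("my-project-uuid", "my-api-key", "https://example.com/invoice.pdf")
    print(req.get_method(), req.full_url)
    print(req.data.decode("utf-8"))
```

To actually send the request, pass it to urllib.request.urlopen and decode the JSON body of the reply.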

If you are using data URLs instead of public URLs, first generate the data URL for the document:

curl "https://api.super.ai/v1/data?path=<your_data_path>&mimeType=application%2Fpdf&uploadUrl=true" \
-X POST \
-H "API-KEY: <your_api_key>"

This returns a JSON object with a dataUrl key and an uploadUrl key. Use the uploadUrl to actually store the file; the dataUrl is what you then pass as documentUrl when submitting the job.

curl -v -H "Content-Type: application/pdf" --upload-file <your_file_path> <uploadUrl>
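
The two calls above can be sketched in Python as follows. The helper names are hypothetical, and the sketch assumes the upload is an HTTP PUT, which is what curl's --upload-file option sends over HTTP. Both functions only build the requests; nothing is sent until they are passed to urllib.request.urlopen.

```python
import urllib.parse
import urllib.request

API_BASE = "https://api.super.ai/v1"  # base URL taken from the curl examples above


def build_data_url_request(api_key, data_path, mime_type="application/pdf"):
    """Build the POST request that asks for a dataUrl and an uploadUrl."""
    query = urllib.parse.urlencode(
        {"path": data_path, "mimeType": mime_type, "uploadUrl": "true"}
    )
    return urllib.request.Request(
        url=f"{API_BASE}/data?{query}",
        headers={"API-KEY": api_key},
        method="POST",
    )


def build_upload_request(upload_url, file_bytes, mime_type="application/pdf"):
    """Build the PUT request that stores the file at the returned uploadUrl."""
    return urllib.request.Request(
        url=upload_url,
        data=file_bytes,
        headers={"Content-Type": mime_type},
        method="PUT",
    )


if __name__ == "__main__":
    req = build_data_url_request("my-api-key", "invoice.pdf")
    print(req.full_url)
```

After the first request is sent, the uploadUrl from its JSON reply goes into build_upload_request together with the file contents read in binary mode.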

Data Storage

More information about our data storage and URLs can be found in the data storage documentation.

Get notified when a job has finished

When submitting data points, you can add a callbackUrl to the payload. A POST request is sent to that URL when the job finishes.

curl "https://api.super.ai/v1/apps/<project_uuid>/jobs" \
-X POST \
-H "API-KEY: <your_api_key>" \
-H "Content-Type: application/json" \
-d '{ "inputs": [ {"documentUrl":"<public_url_or_data_url>"} ], "callbackUrl": "<your_callback_url>"}'

The callback request body has the following form:

{"id": <data_point_id>, "uuid": "<data_point_uuid>", "state": "<COMPLETED|CANCELED|FAILED>", "action": "<RESOLVED|EDITED>"}
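
On the receiving side, the callback body can be handled along these lines. This is a sketch only: parse_callback and the returned dictionary keys are illustrative names, and the set of states is simply the three values listed in the payload above.

```python
import json

FINAL_STATES = {"COMPLETED", "CANCELED", "FAILED"}  # states listed in the payload above


def parse_callback(body):
    """Parse the callback body and report whether the response is ready to fetch."""
    event = json.loads(body)
    state = event.get("state")
    if state not in FINAL_STATES:
        raise ValueError(f"unexpected state: {state!r}")
    return {
        "job_id": event["id"],
        "uuid": event["uuid"],
        "state": state,
        "action": event.get("action"),
        "ready": state == "COMPLETED",  # only completed jobs have extracted data
    }


if __name__ == "__main__":
    body = '{"id": 7, "uuid": "u-1", "state": "COMPLETED", "action": "RESOLVED"}'
    print(parse_callback(body))
```

A web framework or http.server handler would call this function with the raw POST body and fetch the job response only when ready is true.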

Get extracted information from a document

curl 'https://api.super.ai/v1/jobs/<data_point_id>/response' \
-H 'API-KEY: <your_api_key>'

This endpoint returns a JSON object with a response key. Its value contains a url key with the document URL and an annotations dictionary that maps each field name to a list of annotations. Each annotation has a bounding box (boundingBox), a page number (pageNumber), the extracted content, and a confidence score; see the example below.

Sample JSON output

{
  "url": "data://ai/9715e76c-c86c-40f9-9ef4-92d6df46ee9e/0/invoice-example.pdf",
  "annotations": {
    "invoiceId": [
      {
        "boundingBox": {
          "top": 123,
          "left": 465,
          "width": 64,
          "height": 7
        },
        "content": "NPP/PI/2122/107",
        "confidence": 0.981,
        "pageNumber": 1
      }
    ],
    "invoiceDate": [
      {
        "boundingBox": {
          "top": 138,
          "left": 466,
          "width": 46,
          "height": 8
        },
        "content": "2022-03-02",
        "confidence": 0.979,
        "pageNumber": 1
      }
    ],
    "deliveryDate": [
      {
        "boundingBox": {
          "top": 153,
          "left": 466,
          "width": 46,
          "height": 8
        },
        "content": "2022-03-31",
        "confidence": 0.8908,
        "pageNumber": 1
      }
    ]
  }
}
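
One common use of this output is to flag fields whose confidence is low enough to warrant manual review. The sketch below assumes the input is a dictionary shaped like the sample above; the function name and the 0.9 threshold are arbitrary choices for illustration, not values recommended by Super.AI.

```python
def low_confidence_fields(response, threshold=0.9):
    """Return (field, content, confidence) for annotations below the threshold."""
    flagged = []
    for field, items in response.get("annotations", {}).items():
        for item in items:
            if item["confidence"] < threshold:
                flagged.append((field, item["content"], item["confidence"]))
    return flagged


# Trimmed copy of the sample output above (bounding boxes omitted).
sample = {
    "url": "data://ai/9715e76c-c86c-40f9-9ef4-92d6df46ee9e/0/invoice-example.pdf",
    "annotations": {
        "invoiceId": [
            {"content": "NPP/PI/2122/107", "confidence": 0.981, "pageNumber": 1}
        ],
        "deliveryDate": [
            {"content": "2022-03-31", "confidence": 0.8908, "pageNumber": 1}
        ],
    },
}

if __name__ == "__main__":
    print(low_confidence_fields(sample))
    # [('deliveryDate', '2022-03-31', 0.8908)]
```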

Identify / Download data points filtered by a tag

curl 'https://api.super.ai/v1/apps/<project_id>/job_responses?tags=<your_tag_1>&tags=<your_tag_2>' \
-X 'POST' \
-H 'API-KEY: <your_api_key>'

This triggers an email to the user with a URL to download a zip file containing a JSON file with all jobs that match the filters.

Download finished data points

curl 'https://api.super.ai/v1/apps/<project_id>/job_responses?statusIn=COMPLETED' \
-X 'POST' \
-H 'API-KEY: <your_api_key>'
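
The two export calls above differ only in their query string, so one helper can build both. This is a sketch with hypothetical names. Whether the tags and statusIn filters can be combined in a single request is an assumption the examples above do not confirm, so the usage below keeps them separate.

```python
import urllib.parse
import urllib.request

API_BASE = "https://api.super.ai/v1"  # base URL taken from the curl examples above


def build_export_request(project_id, api_key, tags=(), status_in=None):
    """Build the POST request that triggers an export of matching job responses."""
    # Each tag becomes its own "tags=" parameter, as in the curl example.
    params = [("tags", tag) for tag in tags]
    if status_in:
        params.append(("statusIn", status_in))
    url = f"{API_BASE}/apps/{project_id}/job_responses"
    if params:
        url += "?" + urllib.parse.urlencode(params)
    return urllib.request.Request(url=url, headers={"API-KEY": api_key}, method="POST")


if __name__ == "__main__":
    by_tag = build_export_request("my-project-id", "my-api-key", tags=["invoices", "2022"])
    finished = build_export_request("my-project-id", "my-api-key", status_in="COMPLETED")
    print(by_tag.full_url)
    print(finished.full_url)
```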

Delete data points

curl 'https://api.super.ai/v1/jobs/<data_point_id>' \
-X 'DELETE' \
-H 'API-KEY: <your_api_key>'
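
To delete several data points, the same call can be repeated per ID. The sketch below uses a hypothetical helper name and only builds the DELETE requests; sending them is left to the caller because deletion cannot be undone from this snippet.

```python
import urllib.request

API_BASE = "https://api.super.ai/v1"  # base URL taken from the curl example above


def build_delete_requests(api_key, data_point_ids):
    """Build one DELETE request per data point ID, without sending any of them."""
    return [
        urllib.request.Request(
            url=f"{API_BASE}/jobs/{data_point_id}",
            headers={"API-KEY": api_key},
            method="DELETE",
        )
        for data_point_id in data_point_ids
    ]


if __name__ == "__main__":
    for req in build_delete_requests("my-api-key", [101, 102]):
        print(req.get_method(), req.full_url)
```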