How to use active learning when labeling data

If you’re training a machine learning model, it helps to prioritise your labeling so that the data points most useful to your model get labeled first. With active learning, we run your model over every data point you upload and generate a model confidence score for each one. You can use this score to decide which data points get processed first.

Beta feature

This feature is in beta mode. To enable it for your projects, please contact us.

Uploading data for active learning

We recommend that you submit your data points in the parked data point state, so you can review the model score before the data points are processed.

To submit data points as parked, follow the instructions on our How to add data points to your project page.

How to review data points with the lowest model confidence

Data points with the lowest model confidence are the strongest candidates to improve your model. Here’s how to get them to the top of your work queue:

  1. Go to your super.AI dashboard
  2. Open the relevant project
  3. Click Order above the work queue
  4. In the first dropdown, select Model confidence
  5. In the second, select Lowest first
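
To see what this ordering does, here is a minimal Python sketch that sorts a handful of data points by score, lowest first. It is an illustration only and does not use the super.AI API: the `DataPoint` class, its field names, and the example scores are hypothetical, and scores are assumed to lie between 0 and 1.

```python
from dataclasses import dataclass


@dataclass
class DataPoint:
    # Hypothetical record; the field names are assumptions for illustration.
    id: str
    model_confidence: float  # assumed to lie in [0, 1]


def lowest_confidence_first(points):
    """Return the data points sorted so the least confident come first."""
    return sorted(points, key=lambda p: p.model_confidence)


points = [
    DataPoint("a", 0.92),
    DataPoint("b", 0.31),
    DataPoint("c", 0.67),
]
ordered = lowest_confidence_first(points)
print([p.id for p in ordered])  # prints ['b', 'c', 'a']
```

Data point `b` comes first because the model is least sure about it, which makes it the strongest labeling candidate.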

How to choose what to label according to model confidence

You can choose which parked data points to queue for processing based on the model confidence score.

  1. Go to your super.AI dashboard
  2. Open the relevant project
  3. Click Filter at the top right
  4. In the first dropdown, select Model confidence
  5. In the second dropdown, choose is higher than, is lower than, or is between
  6. In the input box, enter a value (or a range of values if you chose is between)
  7. The work queue will update to reflect your settings. You can review the data points to confirm that they are the ones you’d like to label.
  8. Open the Other actions dropdown at the top right of the work queue
  9. Select Queue filtered
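
The filter logic in the steps above can be sketched in Python as follows. This is a rough illustration under stated assumptions, not the product's implementation: the function names and example scores are hypothetical, the operator strings simply mirror the dropdown labels, and treating the bounds of is between as inclusive is a guess.

```python
def matches(confidence, operator, value):
    """Mimic the three filter options; operator names mirror the dropdown labels."""
    if operator == "is higher than":
        return confidence > value
    if operator == "is lower than":
        return confidence < value
    if operator == "is between":
        low, high = value
        # Assumption: both bounds are inclusive.
        return low <= confidence <= high
    raise ValueError(f"unknown operator: {operator}")


def select_to_queue(scores, operator, value):
    """Return the IDs of parked data points whose score passes the filter."""
    return [dp_id for dp_id, conf in scores.items() if matches(conf, operator, value)]


# Hypothetical parked data points mapped to their model confidence scores.
parked_scores = {"a": 0.92, "b": 0.31, "c": 0.67, "d": 0.48}
to_queue = select_to_queue(parked_scores, "is lower than", 0.5)
print(to_queue)  # prints ['b', 'd']
```

Filtering with is lower than and a value of 0.5 keeps only the two data points the model is least confident about, which are the ones you would then queue for labeling.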