TABLE OF CONTENTS
- What is Generative Annotation?
- How do I use Generative Annotation?
- How do I use Cluster Suggestions?
- How do Cluster Suggestions help with model training?
- How do I use Assisted Labelling?
- How does Assisted Labelling help with model training?
What is Generative Annotation?
Generative Annotation uses Microsoft’s Azure OpenAI endpoint to generate AI-suggested labels. This accelerates taxonomy design and the early phases of model training, and reduces time-to-value for all Communications Mining use cases.
It includes:
- Cluster Suggestions: Suggested new or existing labels for clusters based on their identified theme(s)
- Assisted Labelling: Automatic predictions for labels based on the label names or descriptions
How do I use Generative Annotation?
Generative Annotation features will be automatically enabled on datasets – you don't need to do anything to start using them.
Once a dataset is created, cluster suggestions will automatically be generated within a short period of time. If a taxonomy has been uploaded (highly recommended), Communications Mining will suggest both existing and new labels for clusters.
Uploading a taxonomy to a dataset also automatically triggers the training of an initial model that uses no training data, only the label names and descriptions. This may take a few minutes after the taxonomy has been uploaded.
- For Cluster Suggestions: go to the Train tab and select a clusters batch or go to the Discover tab and select the Cluster mode to start labelling
- For Assisted Labelling: go to the Train tab and follow the recommended actions, or go to the Explore tab and select Shuffle or Teach Label mode to start labelling
Please Note: These features will not be available if your organization has chosen to disable Azure OpenAI services.
How do I use Cluster Suggestions?
Example of a Cluster Suggestion
Prerequisite: ‘Review and Label’ permission
Cluster Suggestions will appear at the top of each Cluster page (white shading with blue border). Each cluster can have one or more suggested labels.
If you have Label sentiment analysis enabled, Cluster Suggestions will have either positive or negative sentiment (white shading with green or red border).
You can tell it’s an AI-suggested label by the red sparkle icon next to the label name.
Example of an AI suggested label
Model trainers should review each Cluster Suggestion and:
- Accept it by clicking on it, or
- Assign a new label if you don’t agree with the given suggestion
How do Cluster Suggestions help with model training?
Cluster Suggestions can significantly speed up the first phase of the model training process by automatically generating suggested labels for each cluster.
They can also help with taxonomy design if users are struggling to define the concepts they want to train.
Cluster Suggestions are generated based on the identified theme shared across the comments within a cluster.
The creation of clusters and the generation of label suggestions are automatic and completely unsupervised processes, with no human input required.
Label suggestions on clusters will be generated with or without a pre-defined taxonomy, but suggestions will be influenced and typically made more helpful by leveraging imported / existing labels.
How do I use Assisted Labelling?
Example of Assisted Labelling
Prerequisite 1: ‘Review and Label’ permission
Prerequisite 2: Imported list of label names
Optional but highly recommended: Imported list of label descriptions
Once the initial model has automatically trained using label names and descriptions as its training input, predictions will appear for many of the comments in the dataset.
These predictions work in the exact same way as they have done previously – they are just generated with no training data.
If you have Label sentiment analysis enabled, initial predictions will have either positive or negative sentiment (different shades of green / red based on their confidence level).
Assisted Labelling works in any training batch or mode, but it is most effective in ‘Shuffle’ and ‘Teach Label’ (follow the regular labelling steps in each training batch in Train or Explore).
How does Assisted Labelling help with model training?
Assisted Labelling can significantly speed up the second phase of the model training process by automatically generating predictions for each label with sufficient context, with no training examples required.
Initial predictions will be driven by the quality of the label names and natural language descriptions (i.e. vague names might lead to vague or minimal predictions). Detailed label descriptions can boost the initial model’s performance.
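Because initial predictions depend on the quality of label names and descriptions, it can help to review a taxonomy for vague entries before uploading it. The following is a minimal sketch only, not part of Communications Mining: the `Label` class, the `review_label` and `review_taxonomy` helpers, and the word-count thresholds are all hypothetical and chosen purely for illustration.

```python
from dataclasses import dataclass

# Hypothetical thresholds, chosen only for illustration.
MIN_NAME_WORDS = 2
MIN_DESCRIPTION_WORDS = 8


@dataclass
class Label:
    """A label name with an optional natural language description."""
    name: str
    description: str = ""


def review_label(label: Label) -> list[str]:
    """Return warnings for a label that may be too vague to predict well."""
    warnings = []
    if len(label.name.split()) < MIN_NAME_WORDS:
        warnings.append("name is a single word; consider a more specific name")
    description_words = len(label.description.split())
    if description_words == 0:
        warnings.append("no description provided")
    elif description_words < MIN_DESCRIPTION_WORDS:
        warnings.append("description is short; consider adding detail")
    return warnings


def review_taxonomy(labels: list[Label]) -> dict[str, list[str]]:
    """Map each label name to its warnings, skipping labels with none."""
    report = {}
    for label in labels:
        warnings = review_label(label)
        if warnings:
            report[label.name] = warnings
    return report


# Example taxonomy: one vague label and one well-described label.
taxonomy = [
    Label("Other"),
    Label(
        "Refund request",
        "The customer asks for money back for an order, subscription, "
        "or duplicate charge.",
    ),
]

report = review_taxonomy(taxonomy)
for name, warnings in report.items():
    print(f"{name}: {'; '.join(warnings)}")
```

In this example only the "Other" label is flagged, since it has a one-word name and no description. Word counts are a crude proxy for clarity, so treat the output as a prompt for manual review rather than a quality score.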
As you train your dataset further, the platform will use both the label names and descriptions and your pinned examples to generate relevant label predictions.
These will keep improving with more training and ultimately rely only on annotated training examples when enough have been provided.
Assisted Labelling still requires supervised learning by accepting / rejecting the predictions, but it accelerates the most time-consuming part of model training by providing better predictions with zero or very few pinned examples.