User permissions required: ‘View Sources’ AND ‘Review and label’
Please note: 'Teach Label' is a training mode solely for labelling unreviewed verbatims, so the reviewed filter is disabled in this mode.
Introduction to using 'Teach Label'
'Teach' is the second step in the Explore phase. Its purpose is to show predictions for a label where the model is most unsure whether the label applies. As in previous steps, we confirm whether each prediction is correct or incorrect, and in doing so give the model strong training signals. It is the most important label-specific training mode.
Key steps
- Select Teach from the top-left dropdown menu
- Select the label you wish to train - the default selection in Teach mode is to show unreviewed verbatims
- You will be presented with a selection of verbatims where the model is most unsure whether the selected label applies - review the predictions and apply the label if they are correct, or apply other labels if they are incorrect
- Predictions will range outwards from ~50% confidence for data without sentiment enabled, and from 66% for data with sentiment enabled
- Remember to apply all other labels that apply as well as the specific label you are focusing on
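To make the ordering described in the steps above concrete, here is a minimal Python sketch that ranks predictions by how close their confidence is to a midpoint, so the most uncertain ones come first. It illustrates the idea only and is not the platform's actual implementation: the function name `rank_for_teach`, the `(verbatim_id, confidence)` tuple layout and the `midpoint` parameter are all hypothetical. The midpoints of 0.50 and 0.66 are taken from the ranges quoted above.

```python
from typing import List, Tuple

# Hypothetical representation: (verbatim_id, predicted confidence for the selected label).
Prediction = Tuple[str, float]


def rank_for_teach(predictions: List[Prediction], midpoint: float = 0.50) -> List[Prediction]:
    """Order predictions so that those closest to the midpoint come first.

    Assumption: "most confused" is modelled here as the smallest distance
    between the predicted confidence and the midpoint.
    """
    return sorted(predictions, key=lambda p: abs(p[1] - midpoint))


# Made-up example data, for illustration only.
predictions = [("v1", 0.95), ("v2", 0.52), ("v3", 0.10), ("v4", 0.45), ("v5", 0.70)]

no_sentiment = rank_for_teach(predictions, midpoint=0.50)
with_sentiment = rank_for_teach(predictions, midpoint=0.66)

print([vid for vid, _ in no_sentiment])    # ['v2', 'v4', 'v5', 'v3', 'v1']
print([vid for vid, _ in with_sentiment])  # ['v5', 'v2', 'v4', 'v1', 'v3']
```

In this sketch, the verbatims reviewed first are those whose confidence sits nearest the midpoint, which is where a confirmation or correction carries the most new information.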
You should use this training mode as required to boost the number of training examples for each label to above 25, at which point the platform can accurately estimate the performance of the label.
The number of examples required for each label to perform well will depend on a number of factors. In the 'Refine' phase we cover how to understand and improve the performance of each label.
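If you keep your own record of reviewed verbatims, the 25-example threshold mentioned above can be tracked with a short script. The following is a rough sketch under stated assumptions: each reviewed verbatim is represented as a list of label names, and the label names, the `labels_needing_training` helper and the `MIN_EXAMPLES` constant are hypothetical, not part of the platform.

```python
from collections import Counter
from typing import Dict, Iterable, List

MIN_EXAMPLES = 25  # threshold quoted in the text; labels need MORE than this


def labels_needing_training(
    reviewed_labels: Iterable[List[str]], taxonomy: List[str]
) -> Dict[str, int]:
    """Return each label that has MIN_EXAMPLES or fewer reviewed examples, with its count."""
    counts = Counter(label for labels in reviewed_labels for label in labels)
    return {label: counts[label] for label in taxonomy if counts[label] <= MIN_EXAMPLES}


# Made-up example data: one inner list per reviewed verbatim.
reviewed = (
    [["Billing"]] * 30
    + [["Billing", "Complaint"]] * 10
    + [["Cancellation"]] * 26
)
taxonomy = ["Billing", "Complaint", "Cancellation", "Refund"]

todo = labels_needing_training(reviewed, taxonomy)
print(todo)  # {'Complaint': 10, 'Refund': 0}
```

Labels returned by this helper are candidates for further training in 'Teach'. Reaching 25 examples is only the point at which performance can be estimated, not a guarantee that the label performs well.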
The platform will regularly recommend using 'Teach Label' as a means of improving the performance of specific labels by providing more varied training examples that it can use to identify other instances in your dataset where the label should apply.
What do we do when there are insufficient 'Teach' examples?
After Discover and Shuffle, we may find that some labels still have very few examples and that 'Teach Label' mode doesn't surface useful training examples for them. In this case, we suggest using the following training modes to give the platform more examples to learn from:
'Teach' not generating sufficient training examples
Option 1 - 'Search'
Searching for terms or phrases in Explore works the same as searching in Discover. One of the two key differences is that in Explore you must review and label search results individually, rather than in bulk. To search in Explore, type your search term into the search box at the top left of the page.
Accessing 'Search' in Explore
However, too much Search can bias your model, which is something we want to avoid. Add no more than 10-12 examples per label in this training mode to avoid labelling bias. It's also important to allow the platform time to retrain before going back to 'Teach' mode.
For more information on how to use Search in Explore, click here.
Option 2 - 'Label'
Although training using 'Label' is not one of the main steps outlined in the Explore phase, it can still be useful in this phase of training. In Label mode, the platform shows you verbatims where that label is predicted in descending order of confidence (i.e. with the most confident predictions first and least confident at the bottom).
Accessing 'Label' training mode in Explore
However, it's only useful to review predictions that are not high-confidence (90%+). When the model is already very confident (i.e. above 90%), confirming the prediction tells it nothing new, because it is already confident that the label applies. Look for less confident examples further down the page if needed. That said, if a high-confidence prediction is wrong, it's important to apply the correct label(s), thereby rejecting the incorrect prediction(s).
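As an illustration of the guidance above, the sketch below sorts predictions in descending order of confidence, as Label mode does, and then keeps only those below the 90% mark. It is a simplified model and not the platform's own code: the `Prediction` type, the `worth_reviewing` function and the `HIGH_CONFIDENCE` constant are hypothetical names chosen for this example.

```python
from typing import List, NamedTuple

HIGH_CONFIDENCE = 0.90  # "90%+" is treated as high confidence, per the text


class Prediction(NamedTuple):
    verbatim_id: str
    confidence: float  # predicted confidence for the selected label, 0.0 to 1.0


def worth_reviewing(predictions: List[Prediction]) -> List[Prediction]:
    """Sort by descending confidence, then drop high-confidence predictions."""
    ordered = sorted(predictions, key=lambda p: p.confidence, reverse=True)
    return [p for p in ordered if p.confidence < HIGH_CONFIDENCE]


# Made-up example data, for illustration only.
predictions = [
    Prediction("a", 0.97),
    Prediction("b", 0.62),
    Prediction("c", 0.91),
    Prediction("d", 0.88),
    Prediction("e", 0.35),
]

candidates = worth_reviewing(predictions)
print([p.verbatim_id for p in candidates])  # ['d', 'b', 'e']
```

A filter like this cannot tell whether a high-confidence prediction is wrong, so it does not replace a quick scan of the top of the list for incorrect predictions that need correcting.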
Useful tips
- If for a label there are multiple different ways of saying the same thing (e.g. A, B or C), make sure that you give the platform training examples for each way of saying it. If you give it 30 examples of A, and only a few of B and C, the model will struggle to pick up future examples of B or C for that label.
- Adding a new label to a mature taxonomy may mean it’s not been applied to previously reviewed verbatims. This then requires going back and teaching the model on new labels, using the 'Missed label' function – see here for how