PLEASE NOTE: UiPath Communications Mining's Knowledge Base has been fully migrated to UiPath Docs. Please navigate to equivalent articles in UiPath Docs (here) for up to date guidance, as this site will no longer be updated and maintained.


Training using 'Low confidence'

 User permissions required: 'View Sources' AND 'Review and label'

 

The final key step in Explore is training using 'Low confidence' mode, which shows you verbatims that are not well covered by informative label predictions. These verbatims will have either no predictions or very low confidence predictions for labels that the platform understands to be informative.


'Informative labels' are those labels that the platform understands to be useful as standalone labels, by looking at how frequently they're assigned with other labels.


This is a very important step for improving the overall coverage of your model. If you see verbatims which should have existing labels predicted for them, this is a sign that you need to complete more training for those labels. If you see relevant verbatims for which no current label is applicable, you may want to create new labels to capture them.

 

You can assign labels to verbatims in this mode in the same way as any other Explore mode.


To access this mode, use the dropdown in the top left-hand corner of the Explore page:

 

 

Dropdown menu to access ‘Low confidence’

 


How much training should I do for this step?

 

This mode presents you with 20 verbatims at a time. To help increase the model's coverage, you should complete a reasonable amount of training here, going through multiple pages of verbatims and applying the correct labels (see here for a detailed explanation of coverage).

 

The total amount of training you need to complete in 'Low confidence' will depend on a few different factors:

 

  • How much training you completed in Shuffle and Teach - the more training you do in those modes, the more representative your training set should be of the dataset as a whole, and the fewer relevant verbatims there should be in 'Low confidence'
  • The purpose of the dataset - if the dataset is intended to be used for automation and requires very high coverage, then you should complete a larger proportion of training in 'Low confidence' to identify the various edge cases for each label

 

At a minimum, you should aim to label 5 pages of verbatims in this mode. Later on in the Refine phase when you come to check your coverage, you may find that you need to complete more training in 'Low confidence' to improve your coverage further.
  


Previous: Training using 'Teach label' (Explore)  |  Next: Training using 'Search' (Explore)
