PLEASE NOTE: UiPath Communications Mining's Knowledge Base has been fully migrated to UiPath Docs. Please navigate to equivalent articles in UiPath Docs (here) for up to date guidance, as this site will no longer be updated and maintained.

Balance

'Balance' describes how well the training data for a model represents the dataset as a whole.

When the platform assesses how balanced a model is, it looks for labelling bias that causes the training data to differ from the dataset as a whole.

To do this, it uses a labelling bias model that compares the reviewed and unreviewed data to check whether the labelled data is representative of the whole dataset. If the data is not representative, model performance measures can be misleading and unreliable.
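
The idea of comparing reviewed and unreviewed data can be illustrated with a small sketch. This is not the platform's labelling bias model, whose internals are not described here. It is a simplified stand-in that assumes verbatims are plain strings and measures how similar the word distributions of the two sets are. The names `token_distribution` and `balance_score` are hypothetical.

```python
from collections import Counter


def token_distribution(texts):
    """Return the relative frequency of each lowercase token across texts."""
    counts = Counter(token for text in texts for token in text.lower().split())
    total = sum(counts.values())
    return {token: n / total for token, n in counts.items()} if total else {}


def balance_score(reviewed, unreviewed):
    """Hypothetical balance score in [0, 1].

    Computed as 1 minus the total variation distance between the token
    distributions of the reviewed and unreviewed sets. A score near 1 means
    the two sets use words in similar proportions; a score near 0 means
    they barely overlap. This is an illustrative assumption, not the
    platform's actual measure.
    """
    p = token_distribution(reviewed)
    q = token_distribution(unreviewed)
    if not p or not q:
        raise ValueError("both sets must contain at least one token")
    vocabulary = set(p) | set(q)
    distance = 0.5 * sum(abs(p.get(t, 0.0) - q.get(t, 0.0)) for t in vocabulary)
    return 1.0 - distance


if __name__ == "__main__":
    # Made-up verbatims: the reviewed set over-represents 'refund' messages.
    reviewed = ["refund request for my order", "refund not received"]
    unreviewed = ["delivery arrived late", "refund request", "change my address"]
    print(f"balance score: {balance_score(reviewed, unreviewed):.2f}")
```

In this sketch, a reviewed set built mostly from one search term scores lower than one sampled at random, because its vocabulary is concentrated on a narrow part of the dataset.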

 

Labelling bias is typically the result of an imbalance in the training modes used to assign labels, particularly if too much 'text search' is used and not enough 'Shuffle'.

The 'Rebalance' training mode shows verbatims that are under-represented in the reviewed set. Labelling examples in this mode helps to quickly address any imbalance between the reviewed set and the rest of the dataset.
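
As a rough illustration of surfacing under-represented verbatims, the sketch below ranks unreviewed verbatims by the share of their words that never appear in the reviewed set. It is an assumption made for illustration only and does not reflect how 'Rebalance' selects verbatims in the platform. The function name `rank_under_represented` and the sample data are hypothetical.

```python
from collections import Counter


def rank_under_represented(reviewed, unreviewed, top_n=3):
    """Return up to top_n unreviewed verbatims least covered by the reviewed set.

    Each verbatim is scored by the fraction of its tokens that do not occur
    anywhere in the reviewed set (a hypothetical 'novelty' measure).
    """
    reviewed_counts = Counter(
        token for text in reviewed for token in text.lower().split()
    )

    def novelty(text):
        tokens = text.lower().split()
        if not tokens:
            return 0.0
        unseen = sum(1 for token in tokens if reviewed_counts[token] == 0)
        return unseen / len(tokens)

    # sorted() is stable, so verbatims with equal novelty keep their input order.
    return sorted(unreviewed, key=novelty, reverse=True)[:top_n]


if __name__ == "__main__":
    reviewed = ["refund request for order", "refund not received"]
    unreviewed = ["refund for order", "delivery arrived late", "order refund late"]
    for verbatim in rank_under_represented(reviewed, unreviewed, top_n=2):
        print(verbatim)
```

Labelling the top-ranked verbatims first adds the examples that the reviewed set is most lacking, which is the intuition behind working in this mode.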

 

