User permissions required: 'View Sources' AND 'Review and label'
Identifying entity predictions
Predicted entities appear as colour-highlighted text, as in the first line of the verbatim below, with a different colour for each entity type. Once a user has confirmed an entity, either by applying it manually or by accepting a prediction, it appears as highlighted text with a bold, darker outline, as shown below.
If a paragraph has had entities assigned, dismissed, or applied, it will appear highlighted in grey, as shown in the body of the verbatim below.
Entity format example
How does the platform make entity predictions for trainable entities?
When reviewing trainable entities, it's important to remember that the platform learns from both the entity values that you assign and the context in which they appear within the communications, i.e. the other language that's used around the values themselves.
The platform considers the language in the same paragraph as the entity value, as well as the single paragraph directly before it and the single paragraph directly after it. Paragraphs are separated by a new line.
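The context window described above can be illustrated with a minimal sketch. It assumes the verbatim has already been split into a list of paragraphs; the function name context_window and the sample text are hypothetical and are not part of the platform.

```python
def context_window(paragraphs: list[str], index: int) -> list[str]:
    """Return the paragraph at `index` plus the one before and after it, where they exist."""
    if not 0 <= index < len(paragraphs):
        raise IndexError("paragraph index out of range")
    start = max(index - 1, 0)
    end = min(index + 1, len(paragraphs) - 1)
    return paragraphs[start:end + 1]


# Hypothetical verbatim, one list item per paragraph.
paragraphs = [
    "Hi team,",
    "Please update policy number ABC-123456.",
    "The renewal is due next month.",
    "Kind regards",
]

# The entity value sits in paragraph 1, so paragraphs 0 to 2 form its context.
print(context_window(paragraphs, 1))
```

Paragraphs at the start or end of a verbatim simply have a smaller window, as there is no neighbour on one side.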
Please Note: For entities that are not set to 'trainable', the platform's predictions are based entirely on the rules defined within the platform for that entity. This can be beneficial when an entity absolutely has to follow a set format for a downstream automation, where any incorrect value would cause a failure or exception.
Entity confidence scores
When the platform predicts which entities apply to a communication, it assigns each prediction a confidence score (%) to show how confident it is that the entity applies to the highlighted span of text. You can view an entity’s confidence score by hovering over the entity.
This confidence score is also made available via the API so that it can inform automated actions taken downstream.
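As an illustration of how a downstream process might use the confidence score, the sketch below separates predictions into those that meet a threshold and those that do not. The EntityPrediction structure, its field names, the 0.9 threshold and the use of a fraction rather than a percentage are all assumptions made for this example; they do not describe the actual API response format.

```python
from dataclasses import dataclass


@dataclass
class EntityPrediction:
    kind: str          # hypothetical field: the entity type, e.g. "Policy Number"
    text: str          # hypothetical field: the highlighted span of text
    confidence: float  # assumed to be a fraction between 0.0 and 1.0


def split_by_confidence(predictions, threshold=0.9):
    """Separate predictions that meet the threshold from those needing manual review."""
    automate = [p for p in predictions if p.confidence >= threshold]
    review = [p for p in predictions if p.confidence < threshold]
    return automate, review


predictions = [
    EntityPrediction("Policy Number", "ABC-123456", 0.97),
    EntityPrediction("Monetary Quantity", "250 GBP", 0.55),
]
automate, review = split_by_confidence(predictions)
print([p.kind for p in automate])
print([p.kind for p in review])
```

The right threshold depends on how costly a wrong value is for the automation in question, so it is best chosen per entity rather than fixed globally.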
Example of an Entity’s confidence score
Accepting and rejecting entity predictions
Once entities are enabled (see here), the platform will automatically start predicting them within the verbatims throughout your dataset. Users can then accept the predictions that are correct or reject them where they are incorrect. Each of these actions sends training signals that will be used to improve the platform’s understanding of that entity.
For the pre-trained entities that are trained offline (e.g. Monetary quantity, URL, etc.), it is more important from an improvement perspective for users to reject or correct wrong predictions than it is for them to accept correct predictions.
For the entities that train live in the platform, accepting correct predictions is as important as rejecting incorrect ones. You do not, however, need to keep accepting many correct examples of the same unique entity value (e.g. Example Bank Ltd. is a unique organisation entity) if you aren't finding incorrectly predicted ones.
The key caveat to this is that if you review any entity in a paragraph, you need to review all of the other entities in that paragraph.
To review an entity prediction, hover the mouse over the prediction and the entity review modal will appear, as shown in the example below. To accept it, click 'Confirm'; to reject it, click 'Dismiss'.
Entities and labels can be trained independently of each other. Reviewing labels for a verbatim does not mean you have to review the entities in that same verbatim. It is, however, good practice to do both at the same time, as this is the most efficient use of your time when training the model.
Please Note: It's very important when training entities to follow the best practices explained below - particularly regarding not partially labelling paragraphs.
To understand how well the platform is able to predict each entity enabled for a dataset (particularly the trainable ones), see here.
Example verbatim with both assigned and predicted entities
Please Note: It’s important to reject incorrect entity predictions. If the highlighted text was in fact a different entity (this is more common for date-related entities), apply the correct one afterwards (see below on how to apply entities).
Applying entities
To apply an entity to text where the platform has not predicted it, simply highlight the section of text as you would if you were going to copy it.
A dropdown menu will appear, as shown below, containing all of the entities that you have enabled for your dataset. Simply click the correct one to apply it, or press the corresponding keyboard shortcut.
The default keyboard shortcut for each entity is the letter it starts with. If more than one entity starts with the same letter, a shortcut will be assigned at random to the other.
An example verbatim showing entity application modal
Once an entity has been applied, it will be highlighted in colour with a bold outline (see below). Each entity type will have its own specific colour.
An example verbatim showing an applied ‘Policy Number’ entity
Please Note: A value for a given entity type cannot be split across multiple paragraphs. The full value must be contained within a paragraph for it to be extracted as one entity value.
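The single-paragraph rule can be expressed as a simple check. This is a minimal sketch, assuming that paragraphs are separated by newline characters and that a span is given as character offsets into the verbatim text; the function name and the sample text are hypothetical.

```python
def fits_in_one_paragraph(text: str, start: int, end: int) -> bool:
    """Return True if the span text[start:end] does not cross a paragraph break."""
    if not 0 <= start < end <= len(text):
        raise ValueError("span must be non-empty and lie inside the text")
    return "\n" not in text[start:end]


# Hypothetical verbatim in which an address runs over two paragraphs.
text = "Please send it to 12 Example Street\nLondon"

street_start = text.index("12")
street_end = text.index("\n")

# The street line alone sits inside one paragraph.
print(fits_in_one_paragraph(text, street_start, street_end))
# Extending the span to include "London" crosses the paragraph break.
print(fits_in_one_paragraph(text, street_start, len(text)))
```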
Best Practice
Please Note: There are two very important best practices to remember when accepting, rejecting or applying entities within verbatims:
1. Don't split words
It’s important not to split words – the highlighted entity should cover the entire word (or words) in question, not just part of it (see the incorrect example on the left below, and the correct application on the right).
Incorrect example of the ‘Address Line’ entity being applied
Correct example of the ‘Address Line’ entity being applied
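To make the idea of not splitting words concrete, the sketch below widens a selection until it no longer cuts through a word. It is an illustration only and does not describe how the platform handles selections; it treats a word as a run of letters and digits, which is an assumption, and the function name and sample text are hypothetical.

```python
def snap_to_words(text: str, start: int, end: int) -> tuple[int, int]:
    """Expand the span [start, end) outwards so that it does not split a word."""
    if not 0 <= start < end <= len(text):
        raise ValueError("span must be non-empty and lie inside the text")
    # Move the start left while it sits in the middle of a word.
    while start > 0 and text[start - 1].isalnum() and text[start].isalnum():
        start -= 1
    # Move the end right while it sits in the middle of a word.
    while end < len(text) and text[end - 1].isalnum() and text[end].isalnum():
        end += 1
    return start, end


text = "Send it to 12 Example Street, London"

# A careless selection that covers only "ample Str".
start = text.index("ample")
end = start + len("ample Str")

new_start, new_end = snap_to_words(text, start, end)
print(text[new_start:new_end])  # Example Street
```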
2. Don't partially label paragraphs
When labelling, if you assign one label to a verbatim, you should apply ALL labels that could apply to that verbatim; otherwise, you teach the model that those other labels should not apply. The same is true for entities, except that entities are reviewed or applied at the paragraph level rather than across the whole verbatim.
Paragraphs in a verbatim are separated by new lines. The subject line of an email verbatim is considered its own single paragraph.
Make sure to review or apply all of the entities within a paragraph across all entity kinds if you review or apply one of them. Applying, accepting or rejecting entities in a paragraph means that the paragraph is treated as ‘reviewed’ by the platform from an entity perspective. Therefore, it’s important to accept or reject ALL of the predictions in that paragraph.
The example below shows the different paragraphs that have been reviewed within the email verbatim.
Example email verbatim showing correctly reviewed entities across multiple paragraphs
The verbatim below shows the same example, but here the user has not accepted or rejected all of the entity predictions in a single paragraph. This is incorrect, as the model will falsely treat the monetary quantity entity as an incorrect prediction.
Example email verbatim that has not been properly reviewed
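The partial-labelling rule can be checked mechanically. The sketch below flags paragraphs in which some entities have been confirmed or dismissed while others are still only predictions. The data shape (pairs of paragraph index and status) and the status names are assumptions made for this example, not the platform's data model.

```python
from collections import defaultdict

# Hypothetical status names for entities a user has acted on.
REVIEWED = {"confirmed", "dismissed"}


def partially_reviewed(entities):
    """Return the indices of paragraphs that mix reviewed and unreviewed entities.

    `entities` is an iterable of (paragraph_index, status) pairs.
    """
    seen = defaultdict(set)
    for paragraph_index, status in entities:
        seen[paragraph_index].add("reviewed" if status in REVIEWED else "unreviewed")
    return sorted(i for i, states in seen.items() if states == {"reviewed", "unreviewed"})


entities = [
    (0, "confirmed"),
    (0, "dismissed"),   # paragraph 0: fully reviewed
    (1, "confirmed"),
    (1, "predicted"),   # paragraph 1: one prediction left unreviewed
    (2, "predicted"),   # paragraph 2: not reviewed at all, which is fine
]
print(partially_reviewed(entities))  # [1]
```

A paragraph with no reviewed entities is not flagged, since only paragraphs that have been touched are treated as reviewed.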