UiPath + Re:infer Reference Architecture

UPDATE: 'Streams' were previously known as 'Triggers' in the platform. They've been renamed to better reflect the functionality that they provide for our users. This name change won't affect existing API routes. The trigger routes still exist, and new streams routes have been added that are identical in functionality. We'll gradually migrate our customers to these new streams routes in the coming months.

When seeking out automatable processes, the mantra of high volume, low complexity and structured inputs has been drilled into automation teams since their inception. Many good processes have failed to be automated due to a lack of reliable input data to feed an automation. 

One such barrier is communication. In many organisations, most of people's time is spent communicating; clients we work with find that 70% of employee time is spent responding to queries received via email. Although many of these requests are high volume and low complexity, arriving as emails means that they do not have structured inputs. As a result, clients cannot unlock a large pipeline of automation opportunities because all of the processes involve emails. Enter Re:infer.

Re:infer is a conversational data intelligence platform that allows you to understand, quantify and automate your communications channels. By using Re:infer, automation teams can find new opportunities and structure communications data in real time. This allows tools such as UiPath to action requests automatically.

Core Concepts



Re:infer's architecture is highly scalable, and has a few components. We'll work 'bottom up' to understand these components in turn. The first concept is the comment - this is just a single unit of communication uploaded into the platform, via an integration or the API. A comment can be anything from an email, customer feedback submission or CRM note; any piece of text where someone is expressing themselves in natural language. 

The next concept we'll introduce is the source. A source is essentially a container where multiple comments are stored. There is normally a single source per channel of communication. For example, you might have a source for an IT help mailbox, another source for a customer services mailbox and a final one for your contact us forms.

Sources are then added into datasets, which is where Re:infer can be trained to recognise concepts and data points within the communications data. A dataset can contain multiple sources and sources can belong to multiple datasets. The list of concepts that Re:infer is trained to recognise in a dataset is called a taxonomy. There are two types of taxonomy: a label taxonomy and an entity taxonomy. A label taxonomy is used to describe comments as a whole. For example, by recognising request types, sentiment and contact drivers, you might have labels such as "Order > Missing", "Flight > Booking" or "Urgent". The entity taxonomy is used to extract specific data points from the communication, such as IDs, people's names and product names. For example, you might teach Re:infer to recognise entities such as "Order Number", "Travel Date", "Departing Airport" and "Destination Airport".

Every time that you annotate data in the platform, Re:infer trains and evaluates a new model, which is versioned. A model is basically your taxonomies, saved at a point in time. This is useful as you can reference a model version that you know performs well based on the platform's validation statistics, and be confident that any additional training that you do will not impact downstream systems.

The two final concepts are closely related: streams and thresholds. A stream is essentially a queue of structured communications which can be consumed by downstream systems. In the same way that you read unread emails from Outlook, downstream systems can read unread comments from a stream. Re:infer will have structured the communications with all the label and entity predictions in structured JSON (see here). You'll also note that each label has a corresponding probability. This indicates how confident Re:infer is that the particular label applies to the comment. It's important to understand that every communication that exists in your dataset is sent to every stream by default: you can only filter what is sent to your streams based on metadata, not based on the label / entity predictions that are being made. This means that most of the time you will have a single stream which handles all concepts.

Using the probability value, we can set a different threshold for each concept that we have trained Re:infer to recognise. This essentially tunes how sensitive the stream is to different topics and will determine whether a given label is predicted against a comment when it is at a specified probability. You can read more about thresholding here.
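As a rough illustration of how per-label thresholds act on probabilities, here is a minimal Python sketch. The label names come from the examples above, but the probabilities, the threshold values, the default of 0.5 and the "meets or exceeds" comparison are all assumptions made for the example; in practice the stream applies the thresholds for you.

```python
from typing import Dict, List

# Hypothetical predictions for one comment: label name -> probability.
predicted: Dict[str, float] = {
    "Order > Missing": 0.91,
    "Urgent": 0.42,
    "Flight > Booking": 0.03,
}

# Hypothetical thresholds, one per label (values are made up).
thresholds: Dict[str, float] = {
    "Order > Missing": 0.80,
    "Urgent": 0.30,
    "Flight > Booking": 0.50,
}


def labels_above_threshold(
    probabilities: Dict[str, float],
    label_thresholds: Dict[str, float],
    default: float = 0.5,
) -> List[str]:
    """Return, sorted by name, the labels whose probability meets their threshold."""
    return sorted(
        name
        for name, probability in probabilities.items()
        if probability >= label_thresholds.get(name, default)
    )


print(labels_above_threshold(predicted, thresholds))
# → ['Order > Missing', 'Urgent']
```

Lowering the threshold for "Urgent" to 0.30 is what lets it through at 0.42; with a threshold of 0.50 it would be dropped.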

That concludes the components that you need to know about. Comments are created by an integration and live in sources, sources are added to datasets, taxonomies are built in datasets and are used to train models. Streams reference models and provide a constant queue of structured communications which RPA can read from.
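The relationships between these components can be sketched as plain Python data classes. This is only a mental model: the class and field names are invented for illustration and do not correspond to Re:infer's API objects.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Comment:
    id: str
    text: str


@dataclass
class Source:
    # Usually one source per communication channel.
    name: str
    comments: List[Comment] = field(default_factory=list)


@dataclass
class Dataset:
    name: str
    sources: List[Source] = field(default_factory=list)
    label_taxonomy: List[str] = field(default_factory=list)
    entity_taxonomy: List[str] = field(default_factory=list)

    def comment_count(self) -> int:
        """Total number of comments across all sources in this dataset."""
        return sum(len(source.comments) for source in self.sources)


@dataclass
class Stream:
    # A stream is pinned to one saved model version of a dataset.
    name: str
    dataset: Dataset
    model_version: int


it_help = Source("it-help-mailbox", [Comment("c1", "My laptop will not start")])
contact_us = Source("contact-us-forms", [Comment("c2", "Where is my order 1234?")])

support = Dataset(
    "support", [it_help, contact_us], ["Order > Missing", "Urgent"], ["Order Number"]
)
it_only = Dataset("it-only", [it_help])  # the same source reused in a second dataset

stream = Stream("rpa-feed", support, model_version=3)
print(support.comment_count(), it_only.comment_count(), stream.model_version)
# → 2 1 3
```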

High level UiPath + Re:infer Architecture



Re:infer has a flexible API and can easily integrate with RPA (and other tools) in various different ways. Our recommended approach is detailed in the diagram above, and is heavily inspired by UiPath's Robotic Enterprise Framework template. The approach recommends that for n automations, n + 1 processes are configured. A single feeder process is introduced which is responsible for reading the structured communications out of Re:infer's stream and distributing them to the relevant RPA processes via queues. Any exceptions that may occur due to Re:infer's extraction can then be reported back to the platform for manual review. The processes that get items out of the queues will be standard automations which read their input data from the queue item’s data.
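The dispatching role of the feeder can be sketched in a few lines of Python. The queue names, the label-to-queue mapping and the simplified prediction shape (a comment id plus a list of label names) are assumptions for illustration; a real feeder would add items to Orchestrator queues rather than to an in-memory dictionary.

```python
from collections import defaultdict
from typing import Dict, List

# Hypothetical mapping: one queue per downstream automation.
LABEL_TO_QUEUE: Dict[str, str] = {
    "Account Change > Update Phone Number": "UpdatePhoneNumberQueue",
    "Order > Missing": "MissingOrderQueue",
}


def route(predictions: List[dict]) -> Dict[str, List[str]]:
    """Group comment ids by the queue that should process them."""
    queues: Dict[str, List[str]] = defaultdict(list)
    for prediction in predictions:
        for label in prediction["labels"]:
            queue = LABEL_TO_QUEUE.get(label)
            if queue is not None:  # labels without an automation are ignored
                queues[queue].append(prediction["comment_id"])
    return dict(queues)


batch = [
    {"comment_id": "c1", "labels": ["Order > Missing", "Urgent"]},
    {"comment_id": "c2", "labels": ["Account Change > Update Phone Number"]},
    {"comment_id": "c3", "labels": ["Urgent"]},
]
print(route(batch))
# → {'MissingOrderQueue': ['c1'], 'UpdatePhoneNumberQueue': ['c2']}
```

With two automations there are three processes in total: this feeder plus one consumer per queue.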

Feeder Process


The feeder process should contain the logic displayed above. There are two key things that need to be understood here: fetching and advancing. Fetching is the process of getting data out of a Re:infer stream. You can fetch up to 1024 comments at a time; however, it's recommended that you limit the batch size so that each fetch takes no longer than two seconds to process. Advancing is where you essentially mark the comments as read in the stream so that you do not return the same emails next time you fetch from a stream. If you fetch multiple times in a row without advancing a stream, you will get the same emails every time. This fetch / advance mechanism ensures that you never drop a communication during exception scenarios.

Fetch and Advance Loop


1) Every stream has a current comment.
2) You can fetch comments starting at this current comment. Here we're fetching 2.
3) Every comment returned from a stream will have a sequence_id.
4) We can use this sequence_id to advance the current comment to the next in the queue. Now when we fetch 2, we will return comments 2 and 3.
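The fetch and advance behaviour described above can be simulated with a small in-memory class. This is a toy stand-in, not the Re:infer API: the class name, its methods and the use of a 1-based position as the sequence_id are all assumptions (real sequence ids should be treated as opaque values).

```python
from typing import List


class InMemoryStream:
    """Toy stand-in for a stream, used only to illustrate fetch and advance."""

    def __init__(self, comments: List[str]) -> None:
        # In this sketch the sequence_id is simply the 1-based position.
        self._items = [
            {"sequence_id": index + 1, "text": text}
            for index, text in enumerate(comments)
        ]
        self._position = 0  # index of the current comment

    def fetch(self, size: int) -> List[dict]:
        """Return up to `size` comments from the current comment, without moving on."""
        return self._items[self._position:self._position + size]

    def advance(self, sequence_id: int) -> None:
        """Move the current comment to the one after `sequence_id`."""
        self._position = sequence_id


stream = InMemoryStream(["comment 1", "comment 2", "comment 3"])

first = stream.fetch(2)
again = stream.fetch(2)  # no advance yet, so the same comments come back
stream.advance(first[0]["sequence_id"])  # mark only comment 1 as read
after = stream.fetch(2)

print([c["text"] for c in first])  # → ['comment 1', 'comment 2']
print(first == again)              # → True
print([c["text"] for c in after])  # → ['comment 2', 'comment 3']
```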

UiPath Integration Guide


Integrating UiPath and Re:infer is a simple task. There are a few steps that we need to go through in order to build our integration:

- Install and Configure the Re:infer <> UiPath Library

- Create our stream

- Build our queue feeder

- Plan for exceptions

Installing and Configuring the UiPath Library

Re:infer has an out-of-the-box connector with UiPath. This connector is available on the UiPath marketplace here. Once you have added the connector to your UiPath package manager, there are two pieces of configuration that we need to complete in order to get everything up and running:

- Re:infer API Token - The library needs access to a valid API token in order to authenticate with Re:infer. To give it this, all you need to do is create a credential asset with the name ‘re:infer Api Token’. No username is required; however, the password should be your Re:infer API token. See here for information on how to get access to your API token.


- Re:infer API - The library also needs to know which Re:infer instance to make API calls against. It will read the endpoint from an Orchestrator asset with the name ‘re:infer Api Url’ that you need to create. This should be the base URL for your Re:infer API, for example 'https://<your_company_name>'.

Once you have configured these two assets, the library will be ready to use!

Creating a stream


As discussed previously, a stream references a particular model version. When creating a stream you will first want to look at your validation page and ensure that the current version of your Re:infer model is performing at a standard that is acceptable for your business requirements. If this is not the case, further model training is required. Once you're happy with your model, you must save it in the models page. This will create a saved version of the model that we can reference in the stream.


Once you have a saved model version of a well-performing model, you can create the stream in the streams page and configure the thresholds for each of the concepts that you have trained.



A key thing to consider when creating your stream is which threshold is right for your use case. A threshold lets you tune how sensitive Re:infer is when detecting a particular label, trading off precision and recall. You can read more about configuring thresholds here.
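To make the precision and recall trade-off concrete, here is a small worked example in Python. The scored comments are invented, and the functions are a generic calculation rather than the validation statistics that the platform reports.

```python
from typing import List, Tuple

# Hypothetical data for one label: (predicted probability, label truly applies).
scored: List[Tuple[float, bool]] = [
    (0.95, True),
    (0.85, True),
    (0.70, False),
    (0.60, True),
    (0.40, False),
    (0.20, True),
]


def precision_recall(
    examples: List[Tuple[float, bool]], threshold: float
) -> Tuple[float, float]:
    """Compute (precision, recall) when predicting the label at `threshold` or above."""
    predicted = [truth for probability, truth in examples if probability >= threshold]
    true_positives = sum(predicted)
    total_positives = sum(truth for _, truth in examples)
    precision = true_positives / len(predicted) if predicted else 1.0
    recall = true_positives / total_positives if total_positives else 1.0
    return precision, recall


print(precision_recall(scored, 0.8))  # → (1.0, 0.5)
print(precision_recall(scored, 0.5))  # → (0.75, 0.75)
```

A high threshold only fires on confident predictions (fewer false positives, more misses); a lower threshold catches more of the relevant comments at the cost of more incorrect ones.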

Building the queue feeder


Now that we have the stream available to read from, we can go ahead and start to build our feeder process. The objective of this process will be to read comments out of Re:infer and then send them to the right place within your RPA solution. This will usually be to an Orchestrator queue where another process will pick the comment up for actioning.

Within the UiPath library that we previously configured we can use the fetch from stream activity to read the comments out of our stream. We'll need to provide the following inputs to the activity:

- Dataset Owner - This is the name of the project that you created your dataset in. This is the text before the / in the datasets drop-down at the top of Re:infer. For example, in bayes-inc/integrations-tutorial, the dataset owner is bayes-inc.


- Dataset Name - This is the name of your dataset. This is the text after the / in the datasets drop-down at the top of Re:infer. For example, in bayes-inc/integrations-tutorial, the dataset name is integrations-tutorial.


- Stream Name - This is the API name of the stream that you created. You can find this by viewing the streams page.


- Size - This is the number of emails that you want to read from the stream.

You'll see that we get two outputs from this activity: results, which is an array of predictions (one for every email that has been returned), and sequence id, which relates to the final comment returned.

Once you have these results, you can loop over each one, read Re:infer's predictions, and then send that email to the relevant place for processing.

You need to make sure that a comment has not already been forwarded for processing previously. For example, a previous run might have sent the comment to the relevant queue, and then failed to advance the stream because the network connection dropped. Using the Re:infer message id as a key and enforcing unique keys in your queue is a good way to avoid this duplication of work.
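The de-duplication idea can be sketched with a toy queue that rejects any key it has already seen. The class below is an in-memory illustration only; in a real solution the uniqueness check would be enforced by the queue itself, and the message id format shown is made up.

```python
from typing import List, Set, Tuple


class UniqueReferenceQueue:
    """Toy queue that refuses items whose reference has already been added."""

    def __init__(self) -> None:
        self._references: Set[str] = set()
        self.items: List[Tuple[str, dict]] = []

    def add(self, reference: str, payload: dict) -> bool:
        """Add the item and return True, or return False if it is a duplicate."""
        if reference in self._references:
            return False
        self._references.add(reference)
        self.items.append((reference, payload))
        return True


queue = UniqueReferenceQueue()

# First run: the item is queued, but suppose the stream is never advanced.
print(queue.add("msg-001", {"label": "Order > Missing"}))  # → True

# Second run: the same comment is fetched again and safely rejected.
print(queue.add("msg-001", {"label": "Order > Missing"}))  # → False
print(len(queue.items))  # → 1
```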

Understanding predictions


When you have fetched your comment from the stream, there are two things that you are going to care about: Re:infer's label predictions and the entity predictions.

- Label Predictions - These are accessible in the labels field of the prediction. Because we have set our thresholds in the stream, we will only ever see labels that exceed their threshold so we don't normally need to worry about the probabilities in the RPA process. We just care if a particular label exists. For example, we might check if the label Account Change > Update Phone Number exists so that we know whether we need to send the comment to the RPA queue responsible for updating phone numbers.


- Entity Predictions - These are accessible in the entities field of the prediction. You will normally just care about reading from the formatted_value field, where the machine readable version of the entity value will be available to you. For example, the word "yesterday" will be presented in this field formatted as YYYY-MM-DD HH:MM.
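Reading these two fields might look like the following Python sketch. The labels, entities and formatted_value field names come from the text above; the exact shape of each item (a name key, a probability key, label names as single strings) and the sample values are assumptions, so check the JSON your stream actually returns.

```python
from typing import Optional

# Hypothetical prediction for one comment; the values are made up.
prediction = {
    "labels": [
        {"name": "Account Change > Update Phone Number", "probability": 0.97},
    ],
    "entities": [
        {"name": "Travel Date", "formatted_value": "2023-01-09 00:00"},
    ],
}


def has_label(pred: dict, label_name: str) -> bool:
    """True if the label is present (it has already passed its stream threshold)."""
    return any(label["name"] == label_name for label in pred.get("labels", []))


def entity_value(pred: dict, entity_name: str) -> Optional[str]:
    """Return the machine readable value of the first matching entity, if any."""
    for entity in pred.get("entities", []):
        if entity["name"] == entity_name:
            return entity["formatted_value"]
    return None


print(has_label(prediction, "Account Change > Update Phone Number"))  # → True
print(entity_value(prediction, "Travel Date"))   # → 2023-01-09 00:00
print(entity_value(prediction, "Order Number"))  # → None
```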

Once we have sent the comment to the relevant queue for processing we need to advance the stream so that we don't return that same comment next time that we fetch. This basically marks the email as read within Re:infer. To do this, we can use the advance a stream activity which takes the same inputs as fetch from a stream, but instead of needing size it requires the sequence id that you want to advance the stream to. Every prediction that is returned will have a sequence id field that you can use to advance the stream past that comment.

Note that the results should be processed sequentially to ensure that you don't advance past a comment that later fails.
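One way to honour this is to advance after each successfully handled result and stop at the first failure, as in the sketch below. The handle and advance callables are placeholders for your own queueing logic and the advance activity; their names and signatures are assumptions.

```python
from typing import Callable, List, Optional


def process_batch(
    results: List[dict],
    handle: Callable[[dict], None],
    advance: Callable[[int], None],
) -> Optional[int]:
    """Handle results in order, advancing after each success.

    Stops at the first failure so the stream is never advanced past a
    comment that was not processed. Returns the last sequence id that
    was advanced to, or None if nothing succeeded.
    """
    last_advanced: Optional[int] = None
    for result in results:
        try:
            handle(result)
        except Exception:
            break  # leave this comment and everything after it in the stream
        advance(result["sequence_id"])
        last_advanced = result["sequence_id"]
    return last_advanced


advanced_to: List[int] = []


def fail_on_second(result: dict) -> None:
    if result["sequence_id"] == 2:
        raise RuntimeError("queue unavailable")


results = [{"sequence_id": 1}, {"sequence_id": 2}, {"sequence_id": 3}]
print(process_batch(results, fail_on_second, advanced_to.append))  # → 1
print(advanced_to)  # → [1]
```

On the next fetch, comment 2 is returned again, so nothing is lost.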

Managing Exceptions

Exceptions are an inevitable part of any solution, and a great thing about Re:infer is that it can learn from these exceptions. If a downstream automation suspects that Re:infer has made an incorrect prediction (for example, it has extracted an account number that you don't recognise), we can tag the comment as an exception so that a model trainer can manually review it and correct Re:infer's training. To do this, we can use the tag exception activity, which lets us tag a comment with an exception that is viewable in the platform. The exception message can contain any text, for example "Wrong Account Number".
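The check that decides whether to tag an exception could be as simple as the Python sketch below. The set of known accounts, the "Account Number" entity name and the prediction shape are hypothetical; the returned message is what you would pass to the tag exception activity.

```python
from typing import Optional, Set

# Hypothetical list of accounts known to the downstream system.
KNOWN_ACCOUNTS: Set[str] = {"ACC-1001", "ACC-1002"}


def exception_message(prediction: dict) -> Optional[str]:
    """Return the exception text to tag the comment with, or None if it looks valid."""
    for entity in prediction.get("entities", []):
        if (
            entity["name"] == "Account Number"
            and entity["formatted_value"] not in KNOWN_ACCOUNTS
        ):
            return "Wrong Account Number"
    return None


good = {"entities": [{"name": "Account Number", "formatted_value": "ACC-1001"}]}
bad = {"entities": [{"name": "Account Number", "formatted_value": "ACC-9999"}]}

print(exception_message(good))  # → None
print(exception_message(bad))   # → Wrong Account Number
```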


Re:infer enables you to do more with your automation tools: you can use it to find new opportunities and automate them in real time. Our out-of-the-box connectors make this a really lightweight project, and our recommended approach ensures that you can create a scalable NLP automation project. For more information or help regarding anything you've read in the blog, don't hesitate to contact
