The Science Behind Joyous Actionable Themes

Organizations run Joyous campaigns to solve a known challenge or problem. A Joyous campaign is used to gather ideas, identify blockers, and determine required resources.

Organizations usually know what their challenges and problems are, but they rarely have a prioritised list of actions to address them.

Joyous is the first AI solution that aims to automatically identify top actionable themes in employee feedback.

The Joyous Campaign Lifecycle

The simplified lifecycle of a Joyous Campaign is:

Pick a challenge.
Gather expertise.
Identify actions.
Take action.

Actionable themes address step 3 of the lifecycle and include:

Identifying actionable phrases from the open text comments made by employees.
Grouping them into common themes.
Presenting the top actionable themes in a campaign report.

The Journey So Far

In the beginning, organizations would select questions from our pre-defined library on HR topics. Joyous themes were generic and broad. It was therefore easy to identify common themes such as 'Support', 'Recognition', 'Work-life Balance', 'Remuneration' and so on.

Over time organizations started asking questions on specific operational topics. Some even shifted away from general HR topics altogether. The questions were also unique to their organization and domain.

At that point our goal changed. We wanted our themes to be actionable. We wanted to make it easy to answer these two questions: 'What action should be taken?' and 'What should be acted on?'

For example, 'Cables' is not an actionable theme. It tells you what should be acted upon but not what action to take. 'More time to repair cables' is actionable. It tells you what action should be taken ('More time to repair') and what should be acted on ('cables').

We want our customers to know what actions to take - at a glance. This means they no longer require internal analysts to do further manual analysis. And it reduces the time to know what actions to take down to zero.

As we set about achieving this goal we discovered that identifying actionable themes is not easy. In fact it's so hard there is no other existing technology that can do this. So, we are building our own novel solution.

Why This Is a Hard Problem

Joyous AI factors in nuances relating to text structure, writing style, length and organizational context, particularly as they relate to people providing feedback in the workplace.

Text structure

Joyous conversations are unstructured. People are not selecting from a pre-defined list of options, nor being guided towards a particular format when writing. Therefore we do not know what structure the feedback will be provided in. People are also engaging in a back and forth conversation with another person. That person can take the conversation in any direction. The advantage of this is there is likely a lot of valuable insight within the messages. The disadvantage is that it is highly complex to extract it.

Writing style

People's brains are wired differently. At a basic language level, the number of different ways people can compose text to describe an action is vast. Fluency and text quality are affected by many factors such as gender, age, native language, and education. Even the device being used to respond plays a role in writing style. An unpredictable writing style makes it harder to identify an actionable phrase.

Length

At a simplistic level, Joyous is a messaging app. The experience of responding is similar to using SMS, Messenger, WhatsApp and MS Teams. This means that the number of words people use to respond can vary from one word to several paragraphs. Therefore we cannot rely on the text we are analyzing to be a predictable length.

Organizational context

Understanding what makes data actionable within your organization is a big challenge. The context of your organization is not known by any generic algorithm. All existing language algorithms use public domain information from sources like Wikipedia to learn about themes. Those sources don't include information that is relevant and unique to your organization. That's why using an existing algorithm won't work: it just doesn't understand your domain.

Our Novel Approach

While building our solution we are conscious that our customers can manually conduct analysis, and some will compare the accuracy of our automated solutions to the linguistic competence of the human brain. This means they have very high expectations for the accuracy Joyous should achieve versus the analysis they can conduct as a human familiar with their organization and domain.

We have taken a modular approach to solving this hard problem. Through extensive research and analysis of our existing customer datasets we have identified several self-contained features. We estimate that, combined, they can reach 80% coverage of all actionable phrases within Joyous conversation data.

Each module is contained in a stand-alone extractor model. We combine the extractor models to increase our overall coverage of actionable phrases. Considering the hard problems described above, and drawing on our knowledge of our datasets, we have prioritised the modules with the highest estimated coverage first.

How We Validate Our Solution

The way we validate our solution also considers the hard challenges mentioned earlier.

Step 1 - Develop solution

For each module we begin by developing our initial solution. For obvious reasons, we won't be disclosing the details of our algorithms in this blog. All we can comfortably disclose is that we have specific solutions targeting varying lengths of messages and text patterns.

Step 2 - Manual Tagging

This is unlike an image processing problem, where an image either has or doesn't have a desirable state, one that can be understood by any human who looks at the photo. When it comes to natural language processing, identifying an actionable phrase is incredibly subjective. It's what a human believes is actionable in a message based on their knowledge of the domain. In short: it 'depends'. So, our next step for each new module is to develop training data. Sometimes we undertake this step in parallel with developing our solution.

We need to understand our customers' business domains, and what makes data actionable within those domains. We need to develop this in-house because third parties don't have enough context about our customers' domains.

First, we select a dataset of around 1,000 comments. We deliberately select data that was not used to influence our solution. Our full stack data scientists, full stack product people, and data analysts work together to manually tag comments. At Joyous we expect everyone building our solution to be intimately familiar with our data. They begin by highlighting the actionable phrases within each comment. They then look at the whole and decide what makes a theme. Finally they group the phrases into common themes.

This documented process is partially automated by data engineers. Our goal is to be as productive and efficient as possible. We waste no time on processes that could be automated with code. It is also performed in precisely the same way - delivering output in the same format - each time.

Step 3 - Validate solution

Once ready, we run our solution against the selected dataset. The actionable phrases automatically identified by the solution are added to the dataset.

We then use code to compare the manual phrases to the automated ones. A True Positive (TP) result is one where the automated result is the same as the manual result, or where the automated result is as good as a manual result would have been in cases where we missed the phrase during manual tagging.

We also calculate a few other statistics that are important to know:

False Positive (FP). This is when the automated phrase doesn't match the manual phrase or one was found where none were expected.
True Negative (TN). This is when there was no automated phrase and no manual phrase identified.
False Negative (FN). This is when there is a manual phrase but no automated phrase.
Incomplete. The automated phrase only contains some of the manual phrase.
Total Possible Actionable Phrases. An estimate of all possible phrases from all comments within this dataset.
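
To make these outcome categories concrete, here is a minimal Python sketch of how a single comment might be labelled. It is an illustration only, not the Joyous implementation: the function names `normalise` and `classify`, the one-phrase-per-comment simplification, and the exact-match and substring rules are all assumptions. It also cannot detect the case where an automated phrase is as good as one a human missed, which would need human review.

```python
from collections import Counter
from typing import Optional


def normalise(phrase: Optional[str]) -> str:
    """Lower-case and collapse whitespace; None becomes an empty string."""
    return " ".join((phrase or "").lower().split())


def classify(manual: Optional[str], automated: Optional[str]) -> str:
    """Label one comment by comparing its manual and automated phrases.

    Assumed matching rules (hypothetical, for illustration):
    exact match after normalisation is a TP, and an automated phrase
    that is a strict substring of the manual phrase is Incomplete.
    """
    m, a = normalise(manual), normalise(automated)
    if not m and not a:
        return "TN"          # nothing tagged, nothing extracted
    if m and not a:
        return "FN"          # a manual phrase exists but none was extracted
    if a and not m:
        return "FP"          # a phrase was extracted where none was expected
    if m == a:
        return "TP"          # automated result equals the manual result
    if a in m:
        return "Incomplete"  # only part of the manual phrase was extracted
    return "FP"              # extracted phrase does not match the manual one


# Hypothetical (manual, automated) pairs, one per comment.
pairs = [
    ("More time to repair cables", "more time to repair cables"),
    ("More time to repair cables", "repair cables"),
    ("More training on new tools", None),
    (None, "thanks for asking"),
    (None, None),
]
counts = Counter(classify(m, a) for m, a in pairs)
```

Tallying the labels with a `Counter` gives the per-dataset totals that feed the measures calculated next.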

Using these statistics we then calculate the following measures:

Accuracy. The proportion of true results among the total number of cases examined: (TP + TN) / (TP + TN + FP + FN).
Precision. The proportion of predicted cases that are correct: TP / (TP + FP).
Recall. The ability of the model to find all the relevant cases within a dataset: TP / (TP + FN).
F1. A balanced measure across Precision and Recall: 2 * (Precision * Recall) / (Precision + Recall).
Impact. An estimate of how impactful a particular model is: TP / Total Possible Actionable Phrases.
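
As a worked illustration, the measures can be computed from the raw counts as in the sketch below. The function names, the zero-denominator handling, and the example counts are assumptions made for this example and are not real Joyous results. Incomplete matches are left out because the formulas do not say how they are counted.

```python
def safe_div(numerator: float, denominator: float) -> float:
    """Divide, returning 0.0 when the denominator is zero (an assumed convention)."""
    return numerator / denominator if denominator else 0.0


def measures(tp: int, fp: int, tn: int, fn: int, total_possible: int) -> dict:
    """Compute the five measures from the counts of each outcome."""
    precision = safe_div(tp, tp + fp)
    recall = safe_div(tp, tp + fn)
    return {
        "accuracy": safe_div(tp + tn, tp + tn + fp + fn),
        "precision": precision,
        "recall": recall,
        "f1": safe_div(2 * precision * recall, precision + recall),
        "impact": safe_div(tp, total_possible),
    }


# Hypothetical counts for one dataset, chosen only to show the arithmetic.
example = measures(tp=60, fp=10, tn=20, fn=10, total_possible=100)
# accuracy  = (60 + 20) / 100 = 0.80
# precision = 60 / 70, roughly 0.857
# recall    = 60 / 70, roughly 0.857
# impact    = 60 / 100 = 0.60
```

Note that accuracy needs the brackets around both sums: without them, operator precedence would divide only TN by TP.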

We aim to achieve greater than 80% accuracy, with greater than 70% recall before we consider a solution to be complete. Once we achieve these levels we will select and manually tag another dataset. We will run our solution against this new dataset and again examine the results. We then fine tune our solution until we achieve the desired accuracy across all datasets. We repeat this process until we achieve the desired level of overall accuracy after adding a new dataset without further fine tuning.
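
The completion criterion can be expressed as a simple check across every dataset tagged so far. This is a sketch under assumptions: the name `is_complete` and the dictionary layout are hypothetical, and only the two thresholds come from the text above.

```python
def is_complete(results: list, min_accuracy: float = 0.80, min_recall: float = 0.70) -> bool:
    """Return True only if every dataset beats both thresholds.

    `results` is an assumed structure: one dict per tagged dataset,
    each with "accuracy" and "recall" values between 0 and 1.
    An empty list is treated as not complete.
    """
    return bool(results) and all(
        r["accuracy"] > min_accuracy and r["recall"] > min_recall
        for r in results
    )


# Hypothetical results: the second dataset falls short on recall,
# so the solution would go back for further fine-tuning.
datasets = [
    {"accuracy": 0.84, "recall": 0.75},
    {"accuracy": 0.81, "recall": 0.69},
]
needs_more_work = not is_complete(datasets)
```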

Some solutions have required as few as five datasets to achieve this outcome, others have required significantly more. This is indicative of the complex nature of this problem.

Our Progress So Far

So far we have two extraction modules that are complete and two that are nearly complete. These get us well on the way to 80% coverage of actionable phrases, which is our near-term objective.

In the meantime our reporting analysts use an in-house tool for accelerated manual tagging to produce campaign reports. Our in-house tool uses our completed extraction modules. Customers can therefore receive a report shortly after a campaign concludes. The reports contain top actionable themes and most of the report generation is automated in code.

As an organization we are delighted with the progress we have made. To our knowledge nobody has achieved results like ours in this problem space.