A Semantics-Based Approach to Effective Email Management

A modern approach to an old problem


Billions of emails are sent every day. We spend a huge amount of time looking through them, figuring out what they are about and designating each a priority level. I was exploring this use case a few months ago when I realized it was a perfect scenario to apply AI and NLP.

Automating the handling of incoming email can drastically reduce the time spent processing messages one by one. It can help anyone, from the individual looking for an AI package to manage their inbox to the business looking to adopt technologies that collect value-added information from its emails.

Given the wide availability of RPA (Robotic Process Automation), choosing the proper NLP technology to start quickly with the implementation of your email management solution may be much easier than you think. The following explains how I developed a compact email automation solution relying on deep linguistics and NLU coming from expert.ai.

Project journey:

- Designing an Email Management Solution Based on Open Source NLU

- Semantics Welcome: A Conceptual Approach to Text Classification

- Extraction: Collect Valuable Information from Text

- Leveraging Sentiment in Email Automation

Tutorial and code at the bottom. Find further details on the methodologies on GitHub at this link (https://github.com/therealexpertai/email-management).

Expert.ai’s Edge NL API is an on-premise API capable of performing NLU tasks with no need for training or extra work. If customization is necessary, it can also run a project developed through the dedicated IDE expert.ai Studio.

I decided to use this API to build an NLU-based model anyone could hook up to an RPA or an email automation script. I employed my custom NLP model to classify text and discover the sender’s intention (e.g., whether the incoming message is a complaint, a support request, or a request for information).

The NLP model I designed also collects specific data such as the names of companies, people, and products, since I expected these three to be mentioned often in these types of email. And because the Edge NL API provides built-in sentiment analysis, I added it to the loop. I thought it would be interesting for anyone passionate about data or business analysis to cross-reference sentiment and intent with mentions of companies, people and products.

For the classification part, I developed an algorithm following a Topic Modeling approach as well as a heuristics-based approach that relies on language structure.

I built this algorithm using expert.ai Studio, which provides access to the core NLU analysis of expert.ai and applies linguistic customization to it in the form of very basic if-then statements. For example, if a specific linguistic condition occurs, then annotate or apply a specific category to text.

The Topic Modeling approach relies heavily on semantics. Expert.ai’s NLU analysis actively understands the meaning of words, clustering them and allowing me to treat them as concepts. All concepts belong to a knowledge graph that you can actively employ in your linguistic if-then statements for text classification and data mining.

For this purpose, I often use a specific feature of the proprietary language called ANCESTOR. ANCESTOR allows querying of expert.ai’s vast knowledge graph to collect an entire branch of concepts (e.g., “pets” will trigger “dogs”, “cats”, “birds”, etc. because the knowledge graph already knows what pets are), then use them to annotate words and phrases that best characterize my model’s classes.
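To make the idea of collecting a whole branch concrete, here is a minimal Python sketch. It is not expert.ai's rule language or API: the toy graph, the `descendants` and `mentions_branch` helpers, and the concept names are all hypothetical, and they only illustrate how naming one parent concept pulls in everything beneath it.

```python
from collections import deque

# Hypothetical toy graph: parent concept -> direct child concepts.
# expert.ai's real knowledge graph is far larger and is queried from rules,
# not through a Python dict.
TOY_GRAPH = {
    "pet": ["dog", "cat", "bird"],
    "dog": ["poodle", "beagle"],
    "bird": ["parrot"],
}


def descendants(graph, root):
    """Return every concept below `root`, breadth-first."""
    found = []
    seen = {root}
    queue = deque([root])
    while queue:
        concept = queue.popleft()
        for child in graph.get(concept, []):
            if child not in seen:
                seen.add(child)
                found.append(child)
                queue.append(child)
    return found


def mentions_branch(text, graph, root):
    """True if any word in `text` is `root` or one of its descendants."""
    branch = {root, *descendants(graph, root)}
    words = {w.strip(".,!?").lower() for w in text.split()}
    return bool(words & branch)


print(descendants(TOY_GRAPH, "pet"))
print(mentions_branch("My beagle chewed the cable.", TOY_GRAPH, "pet"))
```

Note that this sketch matches plain strings, so it would miss plurals and ambiguous words. The approach described above works on disambiguated concepts, which is exactly what the string version lacks.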

The advantage of this concept-based technique is that it made model development easier, faster and more effective. Selecting a few representative key concepts from the graph made my rules richer and more powerful with just a few adjustments. By relying on the graph, I can take full advantage of its NLU analysis, which makes it possible to distinguish identical words by their meaning. This has been a fundamental added value for my model. Recognizing and distinguishing complaints, support requests, and information requests by meaning instead of by words helped make the model smarter and more accurate.

This is a rule example for complaint emails.

This condition annotates text automatically, filing it under the “complaint” category every time one of the concepts in brackets is found in a sentence. Moreover, the condition navigates through the knowledge graph and leverages any synonym or similarly negative concept or word to expand its capability to draw a conclusion on the sender’s intention.

For instance, while working with Studio, I discovered that this knowledge graph contains a huge collection of verbs, adjectives and noun chains with dozens of commonly used words that reflect a negative opinion.

The heuristic approach, instead, works at a lower level, getting more into deep linguistics and language structure. Sometimes there are parts of text where being capable of addressing specific relationships between words, lemmas and other grammar elements makes a huge difference in collecting evidence to understand the sender’s intention, and therefore identifying and applying the most appropriate category to a portion of text. Heuristic rules help me to focus on these deep linguistics relationships as well as sequences of relevant terms that, when found, would immediately and effectively classify text under the most fitting category.

This is an example of the heuristics-based conditions I used to detect support requests.

Here, I tried to catch sentences like “I have a problem” and “has difficulties to…”. I specified the structure of the sentence I want to capture, along with the list of lemmas I am interested in. The “!SYNCON” commands prevent the category from being annotated when a negated verb is present, so the rule skips cases like “I don’t have any problems…” or “I have no problems with…”.
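The same logic can be approximated in plain Python. This is only a sketch of the behavior described above, not Studio syntax: the word lists, the token window and the `is_support_request` helper are assumptions made for illustration.

```python
import re

# Hypothetical word lists; a real rule would rely on lemmas and concepts
# supplied by the NLU engine rather than hand-written sets.
HAVE_FORMS = {"have", "has", "had", "having"}
PROBLEM_FORMS = {"problem", "problems", "difficulty", "difficulties",
                 "issue", "issues", "trouble"}
NEGATIONS = {"not", "no", "never", "don't", "doesn't", "didn't"}


def is_support_request(sentence, window=4):
    """Match '<have> ... <problem>' within `window` tokens, unless negated."""
    normalized = sentence.lower().replace("’", "'")
    tokens = re.findall(r"[a-z]+(?:'[a-z]+)?", normalized)
    for i, token in enumerate(tokens):
        if token not in HAVE_FORMS:
            continue
        # A negation right before the verb ("don't have") blocks the match.
        if i > 0 and tokens[i - 1] in NEGATIONS:
            continue
        following = tokens[i + 1 : i + 1 + window]
        for j, nxt in enumerate(following):
            if nxt in PROBLEM_FORMS:
                # A negation between verb and noun ("have no problems")
                # blocks the match as well.
                if not NEGATIONS & set(following[:j]):
                    return True
                break
    return False


for s in ["I have a problem with my order",
          "I don't have any problems",
          "I have no problems with it"]:
    print(s, "->", is_support_request(s))
```

Hand-written word lists like these break down quickly on real text, which is the gap that lemma- and concept-level conditions are meant to close.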

These are just the two approaches I chose for this first implementation. Concept-based and heuristics-based rules seemed like the best way to implement a solution quickly, but the tool is flexible enough to support many other approaches.

Emails are a form of communication. Therefore, they contain a lot of information. There are endless types of data you may want to retrieve from your emails that carry value. For instance, extracting the names of companies, people, and products from text, coupled with sentiment analysis and classification, allows you to build an NLP model that supports deeper analysis to match a company’s product to a complaint, while also measuring the sender’s sentiment and opinion towards that brand.

In this project, I employed the NER capabilities of expert.ai to actively extract people’s and companies’ proper nouns, while again using the knowledge graph to extract known and unknown products mentioned in the emails’ bodies.

Below you can find an example of how I again used ANCESTOR to pick an entire branch of the knowledge graph that focuses on product names and use it to extract those names, whether they are recognized by the technology or in the graph.

Here, the extraction is triggered when there is an ANCESTOR of “products” as “proper noun,” followed by the ANCESTOR of “products” as “noun.” An example could be: “I saw a Samsung Galaxy smartphone yesterday”.
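The pattern can be illustrated with a small Python sketch over pre-tagged tokens. Everything here is an assumption for illustration: the `(text, part_of_speech, is_under_products)` token layout, the tag names and the `extract_products` helper are hypothetical, and consuming a run of several proper nouns is my simplification rather than the behavior of the actual rule.

```python
# Each token is (text, part_of_speech, is_under_products).
# In the real pipeline the tags and the knowledge-graph lookup come from the
# NLU engine; here they are hard-coded, hypothetical inputs.
def extract_products(tagged_tokens):
    """Return 'ProperNoun(s) noun' spans where every part sits under 'products'."""
    products = []
    i = 0
    while i < len(tagged_tokens):
        start = i
        # Consume a run of product-related proper nouns ("Samsung Galaxy").
        while (i < len(tagged_tokens)
               and tagged_tokens[i][1] == "PROPN"
               and tagged_tokens[i][2]):
            i += 1
        has_name = i > start
        followed_by_product_noun = (
            i < len(tagged_tokens)
            and tagged_tokens[i][1] == "NOUN"
            and tagged_tokens[i][2]
        )
        if has_name and followed_by_product_noun:
            span = tagged_tokens[start : i + 1]
            products.append(" ".join(token[0] for token in span))
        i += 1
    return products


sentence = [
    ("I", "PRON", False), ("saw", "VERB", False), ("a", "DET", False),
    ("Samsung", "PROPN", True), ("Galaxy", "PROPN", True),
    ("smartphone", "NOUN", True), ("yesterday", "NOUN", False),
]
print(extract_products(sentence))
```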

Edge NL API can be set up for your NLP-driven project to provide access to a rich list of built-in natural language processing features such as POS tagging, key phrase extraction, pretrained classification and more. It is also easy to use. I decided to take advantage of the built-in sentiment analysis, and it took me just a few lines of code to add it to my text analysis pipeline (you can find both the Python and Java SDKs on expert.ai’s developer portal).

Sentiment analysis was a nice addition to the model. Being capable of addressing email sentiment along with its sender’s intentions (classification) allows you to provide a rich list of information that is suitable for additional cross-referencing and data analytics. For instance, one could use automatic sentiment analysis to infer the severity of an issue mentioned in an email and prioritize emails accordingly.
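As a rough sketch of that prioritization idea, the snippet below sorts already-analyzed emails so that the most negative ones come first. The `AnalyzedEmail` structure, the category weights and the sentiment scale (lower means more negative) are assumptions for illustration, not output formats of the API.

```python
from dataclasses import dataclass


@dataclass
class AnalyzedEmail:
    subject: str
    category: str     # e.g. "complaint", "support request", "information request"
    sentiment: float  # assumed scale: the lower the value, the more negative


# Hypothetical weights used only to break ties between equal sentiment scores.
CATEGORY_WEIGHT = {"complaint": 2, "support request": 1, "information request": 0}


def prioritize(emails):
    """Most negative sentiment first; ties go to the weightier category."""
    return sorted(
        emails,
        key=lambda e: (e.sentiment, -CATEGORY_WEIGHT.get(e.category, 0)),
    )


inbox = [
    AnalyzedEmail("Question about opening hours", "information request", 5.0),
    AnalyzedEmail("Blender falling apart", "complaint", -60.0),
    AnalyzedEmail("Cannot log in", "support request", -20.0),
]
for email in prioritize(inbox):
    print(email.sentiment, email.category, email.subject, sep="\t")
```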

This is an example of the results you can get from the customized Edge NL API I produced including sentiment analysis:

In a scenario where we collect key information from incoming emails and perform automatic sentiment analysis on text, cross referencing very negative or very positive sentiment with sender’s intentions and product mentions, it’s easy to see how this could support customer care practices by automatically providing enough information to identify issues or disappointed customers before a support request email becomes an actual complaint.

What I ended up with is an NLU-driven email automation package that collects the information needed to infer the sender’s intentions and sentiment, along with critical product- and company-specific data. This model allows for many applications: one could employ it to build an RPA that filters emails and routes them to the proper inboxes while prioritizing them, or use the model for further data analytics that cross-references information on products or brands subject to complaints.
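A routing step of this kind could look like the following sketch. The queue names, the escalation threshold and the `route` helper are hypothetical; the category labels simply mirror the three intentions discussed in this article.

```python
# Hypothetical routing table: category -> destination queue.
ROUTES = {
    "complaint": "complaints",
    "support request": "support",
    "information request": "info",
}
ESCALATION_THRESHOLD = -50.0  # assumed cut-off for "very negative" sentiment


def route(category, sentiment):
    """Pick a destination queue and flag emails that need urgent attention."""
    queue = ROUTES.get(category, "general")
    # A very negative support request may be about to become a complaint,
    # so it is flagged as urgent just like an actual complaint.
    urgent = category == "complaint" or sentiment <= ESCALATION_THRESHOLD
    return queue, urgent


print(route("support request", -80.0))
print(route("information request", 10.0))
```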

It is possible to further customize the model with expert.ai Studio and extract any other critical information that would bring further value to the solution. In addition to tailoring your own extraction and classification algorithms for specific industries and use cases (for instance, generating automated replies to your customers’ requests), you can also use built-in analysis features from Edge NL API to further enrich the results. Or you could quickly replicate the same process for a different language (five are supported).

The project I created with Studio is here. You can find the NLP model and its respective Edge NL API at this link here — all ready to launch! As the IDE and API are free to use, everyone’s very welcome to try these packages out and even contribute on GitHub.


First things first: download the Email Management package, unpack it, then head over to expert.ai’s developer portal to register your credentials and begin using the API.

From expert.ai’s developer portal, click on the Developer menu (left side of the dashboard), then on Studio, to download the IDE setup file for Windows (there is a Linux version too).

Beware that you will need Python 3.5 or above to launch the API. Let’s start!

These are four steps you can take to build your own Edge NL API.

Browse to your [Python folder]/scripts and run pip from a command line to install the library

pip install expertai-nlapi

Then set up your credentials as environment variables.

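  • Linux: export EAI_USERNAME=YOUR_USER and export EAI_PASSWORD=YOUR_PASSWORD
  • Windows: SET EAI_USERNAME=YOUR_USER and SET EAI_PASSWORD=YOUR_PASSWORD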

They can also be defined inside your code importing the OS library.

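import os
# Replace the placeholder strings with your own expert.ai credentials.
os.environ["EAI_USERNAME"] = "YOUR_USER"
os.environ["EAI_PASSWORD"] = "YOUR_PASSWORD"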

Then import expert.ai’s Edge NL API client

Create a “text” variable where you’re going to paste your sample texts to feed to the API.

text = """Dear Mr. Avery
I am writing today to complain about my new Blender 365HB. I purchased this blender at your Woodbury, NY store on 8/9/2020.
I purchased the blender only 3 weeks ago and it is falling apart. The insert for the blade has warped and now I must stand there and hold the blender when it is on. Also, one of the blades on the blender is bend after blending ice. Now I'm very disappointed and I hate this blender so much! This performance is unacceptable and your store would not take it back because it was past the 14-day return. This is ridiculous! How was I supposed to know that all these defects would happen within the 14 days? This is bonkers!
I have attached a pdf below of my receipt. In addition, I have attached before and after pictures of the blender. I look forward to your response and your proposed solution to this problem. I have had blenders from Universal Blenders Co before and never have had this problem. Please contact me at the email I am sending you this from or my cell phone 845*******. Thank you for your help in advanced.
Best Regards,
Jane Billings"""

I suggest using triple quotes so that no escaping is needed.

Create a for loop that invokes the API and retrieves all classification results in ranking order. I chose to print the category IDs and their respective descriptions separated by one tab to make the output easier to read.
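A minimal sketch of that loop follows. To keep it runnable without a live Edge NL API instance, it formats a stand-in object instead of calling the service. The attribute names `categories`, `id_` and `hierarchy` are assumptions about the SDK's response object and should be checked against the documentation; the IDs and labels in the sample are made up. With the real client, `document` would be whatever the classification call returns.

```python
from types import SimpleNamespace


def format_categories(document):
    """Return one 'id<TAB>description' line per category, in the order received."""
    lines = []
    for category in document.categories:
        # `id_` and `hierarchy` are assumed attribute names (see the note above).
        description = " > ".join(category.hierarchy)
        lines.append(f"{category.id_}\t{description}")
    return lines


# Stand-in for an API response, with hypothetical IDs and labels.
sample_classification = SimpleNamespace(categories=[
    SimpleNamespace(id_="01", hierarchy=["Complaint"]),
    SimpleNamespace(id_="02", hierarchy=["Support request"]),
])
for line in format_categories(sample_classification):
    print(line)
```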

Do the same for the data mining results: create a for loop that invokes the API and retrieves the extractions, then print the extraction classes, the field names and their respective values.
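The extraction loop can be sketched the same way. Again, this runs against a stand-in object rather than the live service, and the attribute names `extractions`, `template`, `fields`, `name` and `value` are assumptions to verify against the SDK documentation. The template and field names in the sample are hypothetical; the product name is taken from the sample email above.

```python
from types import SimpleNamespace


def format_extractions(document):
    """Return the extraction class, then each field as an indented 'name = value'."""
    lines = []
    for extraction in document.extractions:
        lines.append(f"Template: {extraction.template}")
        for field in extraction.fields:
            lines.append(f"\t{field.name} = {field.value}")
    return lines


# Stand-in for an API response, with hypothetical template and field names.
sample_extraction = SimpleNamespace(extractions=[
    SimpleNamespace(
        template="PRODUCTS",
        fields=[SimpleNamespace(name="product", value="Blender 365HB")],
    ),
])
for line in format_extractions(sample_extraction):
    print(line)
```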

Head over to Edge NL API’s documentation for more information or to customize your API calls with more features.

Head over to the Email Management folder (the one you downloaded from GitHub) and navigate to “package\edge” folder.

Launch the NLP model by clicking on the “runmeWindows.cmd” file, then go back to your Edge NL API script (make sure your expert.ai credentials are properly set). If you’re experiencing any issues at this stage, you can refer to this tutorial here. If you edit the model in expert.ai Studio and want to try the new version, press Ctrl+Alt+P in Studio and follow the wizard to create a new model, then head over to “package\edge” again and launch the “runmeWindows.cmd” file. Run the Python API script and enjoy!

A Philosopher of Science and Cognitive Scientist in love with the potential of Machine Learning and NLP.
