Artificial Intelligence

Cracking the challenge of unstructured medical text

Matanya Hatan

Published On

June 1, 2020

Natural Language Processing (NLP) is a technology built to help computers understand human language. Many advances have been made in recent years as artificial intelligence research has intersected with NLP. Today, NLP is used for a wide range of applications like translation, voice assistants, document classification, and more.

At DigitalOwl, we have brought NLP's capabilities to the world of medical insurance, helping Underwriters and Claim Analysts assess applicants’ and insureds' medical records. With advanced algorithms, we can identify all meaningful information in medical documents (medical conditions, dates, body parts, treatments, outcomes, etc.). Just as important, we can extract pertinent non-medical phrases that are critical to understanding the full context of the subject’s medical history specifically for insurance purposes (return to work, ADLs, restrictions and limitations, etc.).

As pioneers in applying NLP in the insurance industry, we face many unique challenges that arise from combining NLP with medical information, such as the variety of writing styles among physicians and the amount of information in each case.

This article focuses on the fascinating solution we’ve developed for understanding the context of words in a medical document: analyzing the position of words in the document.

The meaning of the position of words in a sentence:

The order of words in a sentence matters. The same words in a different order produce a different meaning. The set of words I / Like / Do / Not / Why / Trips can carry a positive or a negative meaning depending on how the words are ordered:

"Why do I not like trips?" vs. "I do like trips, why not?"

Imagine that you come home after a long working day, and your partner says "You seem to have gone through a hard day, you deserve a long rest," but the words are mixed, and instead you hear "You seem to have gone through a long day of rest, you deserve hard work."
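The trips example can be checked in a few lines of Python. This is only an illustrative sketch: the `tokenize` helper is a hypothetical, deliberately naive tokenizer, not part of any DigitalOwl model. It shows that a representation which ignores order (a bag of words) sees the two sentences as identical, while the ordered sequences differ.

```python
import re
from collections import Counter


def tokenize(sentence: str) -> list:
    """Lowercase the sentence and keep only alphabetic word tokens (naive, for illustration)."""
    return re.findall(r"[a-z]+", sentence.lower())


a = tokenize("Why do I not like trips?")
b = tokenize("I do like trips, why not?")

# Same words with the same counts: an order-blind view cannot tell the sentences apart.
same_bag = Counter(a) == Counter(b)

# The ordered sequences differ, and that difference is what carries the meaning.
same_sequence = a == b

print(same_bag, same_sequence)
```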

In NLP tasks, text is typically analyzed as a sequence, which means that every word is assigned a number. The computer goes through the text line by line without considering the page structure at all.

The meaning of the position of words on the page:

To understand the text, mainstream NLP models index each word using a simple sequence. For example, the top left word is “1”, the next word to the right is “2”, and so forth, line by line.
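A minimal sketch of this kind of sequential indexing is shown here. The function name `index_words` and the two-column sample page are assumptions made up for illustration; real pipelines tokenize far more carefully.

```python
def index_words(page_lines: list) -> list:
    """Number every word from 1, reading left to right, line by line."""
    indexed = []
    position = 1
    for line in page_lines:
        for word in line.split():
            indexed.append((position, word))
            position += 1
    return indexed


# Hypothetical page with two side-by-side lists (the column headers are invented).
page = [
    "Existing      Non-existent",
    "Hand          Anamnesis",
]

flattened = index_words(page)
print(flattened)
```

After flattening, the words from both columns are interleaved in a single sequence, and nothing in the numbering records which column a word came from.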

But this isn’t good enough. As humans, when we read a document, we not only scan the text from left to right, but our brain also directs us to "strategic" places on the page, searching for familiar patterns.

For example, in medical records, the date in one of the top corners of the page is usually the visit date (even if there are several dates in the text), and the name in the top right corner is often the hospital name.

That's why we've developed a unique model that is aware of the locations of the words on the page. Let's say you have a page with two lists of medical findings:

As we mentioned, one way to process the words is to index them by sequence from left to right.

With this processing method, the model receives the following input:

NLP models index each word using a simple sequence.

Given only this input, how can the model possibly know whether Anamnesis (12) is an existing or a non-existent condition?

Our solution is to feed all of the information into the model:

With DigitalOwl's AI, every word is coordinated in space.

In this way, every word is assigned coordinates on the page. The word “Hand” gets the coordinate (20, 14), and “Anamnesis” gets (28, 57). With these coordinates, the model gets the full structure of the page and can easily say that Anamnesis is a non-existent condition.
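To make the benefit of coordinates concrete, here is a small rule-based sketch. It is not the model described in this article, which learns from position rather than following a hand-written rule. Several things are assumptions: the coordinates are read as (row, column), the header labels and their positions are invented, and `Token` and `nearest_header` are hypothetical names.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Token:
    text: str
    row: int  # assumed: vertical position on the page
    col: int  # assumed: horizontal position on the page


def nearest_header(token: Token, headers: List[Token]) -> Optional[Token]:
    """Return the header at or above the token whose column is closest to the token's."""
    above = [h for h in headers if h.row <= token.row]
    if not above:
        return None
    return min(above, key=lambda h: abs(h.col - token.col))


# Hypothetical column headers; only the two word coordinates come from the text.
headers = [Token("Existing", 18, 14), Token("Non-existent", 18, 57)]
hand = Token("Hand", 20, 14)
anamnesis = Token("Anamnesis", 28, 57)

print(nearest_header(hand, headers).text)
print(nearest_header(anamnesis, headers).text)
```

With position available, even this crude rule can place each finding under the right list, which the flat sequence alone could not do.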

Sometimes it is not just the relationship between words that depends on location, but also the role of each word. A page can contain many dates, but each page has only one printed date. Often this date is written in the top right corner (as you can see in the following image).

These capabilities make our NLP model more precise and faster.

Left Picture: The focus on finding the visit date. Right Picture: The focus on finding medical conditions.

Of course, none of this means the model relies on location alone, but location certainly helps it assign a better meaning to each word.
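The printed-date idea can be sketched as a simple positional heuristic: among all date-like tokens, prefer the one closest to the top right corner. This is an illustration under stated assumptions, not the production model. The `pick_visit_date` function, the date format, the (x, y) convention with the origin at the top left, and the sample tokens are all hypothetical.

```python
import math
import re

# Assumed date format for the example: MM/DD/YYYY.
DATE_PATTERN = re.compile(r"^\d{1,2}/\d{1,2}/\d{4}$")


def pick_visit_date(tokens, page_width):
    """tokens: (text, x, y) tuples with the origin at the top left of the page.

    Returns the date-like token nearest the top right corner, or None if there is none.
    """
    dates = [(text, x, y) for text, x, y in tokens if DATE_PATTERN.match(text)]
    if not dates:
        return None
    corner_x, corner_y = page_width, 0
    closest = min(dates, key=lambda d: math.hypot(d[1] - corner_x, d[2] - corner_y))
    return closest[0]


# Hypothetical page that is 80 units wide and contains three dates.
tokens = [
    ("03/14/2019", 70, 2),   # near the top right corner
    ("01/02/2015", 10, 30),  # inside the body text
    ("11/20/2018", 40, 45),  # inside the body text
    ("Hand", 14, 20),
]

print(pick_visit_date(tokens, page_width=80))
```

In line with the point above, position here is one signal among several; a real system would combine it with the surrounding text rather than trust the corner blindly.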

Imagine if you were tasked with locating a doctor's name within a document. Would you instinctively begin at the top left corner and methodically scan each line? Probably not. Similarly, NLP models don’t have to rely on such rigid sequential processing.

Matanya Hatan

Data Science Team Lead

DigitalOwl

About the author

Matanya is the Data Science Team Lead at DigitalOwl, bringing over four years of invaluable experience to the role. His steadfast guidance ensures the delivery of impactful outcomes and propels the team's success.
