A Simplified Approach to AI Clinical Data Summarization

For medical affairs and medical information professionals, summarizing clinical trial data is a common exercise. Recently, several articles have been published on the use of artificial intelligence (AI) in medical information. These usually follow a “here is what is possible one day” and “here are the flaws and roadblocks” format. That perspective is useful, but how can we use AI to summarize clinical trial data today?

I have also been working on “what is possible one day,” seeking a way to lean on AI to make summarizing clinical trial data more efficient through my site, medinfoai.com. The first step requires either a programming-heavy software solution or a machine learning solution to pull key information from PDFs into an AI language model. Next, an AI model must be trained to turn that information into an accurate, clear, concise summary. Both steps take a long time to build. What if there were a way to use AI for clinical data summarization today?

The Idea

Recently, while reading through a clinical trial publication, a thought occurred to me. Manually selecting the key sections to summarize in response to a question is easy, and eventually I will figure out how to automate that part. So why not figure out the second half, AI summarization, first? There must be a simple way to highlight information in a PDF manually and then leverage an AI model to pull and summarize only that specific content. This would provide precise, custom-tailored responses and make the entire summarization process more efficient.

The Process

The approach I landed on involves Adobe’s highlighting feature. Here’s how it works:

1. Highlighting in Adobe Acrobat: Start by highlighting text in Acrobat (this requires a full version of Adobe Acrobat, not the free Acrobat Reader) with the “copy selected text into highlight comment pop-ups” option turned on. This copies each highlighted passage into the text of its comment.

2. Categorization: To categorize the content, begin each comment with one of the following labels:

Objective
Design
Inclusion Criteria
Exclusion Criteria
Treatment
Primary Endpoint
Secondary Endpoint
Safety Endpoints
Baseline Characteristics
Primary Results
Secondary Results
Safety Results
Limitations
Conclusions

3. Creating a Summary: After categorizing, generate a comment summary in Acrobat.

4. Integrating with ChatGPT: Transfer the comment summary into ChatGPT, providing a custom prompt to mold the summary to the desired specifications. See the prompt example below. You should create your own prompt that fits your needs and writing style.

MedInfo AI Prompt — Sample Prompt for AI Clinical Data Summarization
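
For readers who would rather script steps 2 through 4 than paste by hand, here is a minimal Python sketch. It assumes the comment summary has been copied out of Acrobat as plain text and that each comment begins with its label followed by a colon (for example, "Objective: ..."). The colon convention, the function names `group_comments` and `build_prompt`, and the prompt wording are all illustrative assumptions, not the MedInfo AI prompt referenced above.

```python
import re

# The category labels from step 2, in the order they should appear.
CATEGORIES = [
    "Objective", "Design", "Inclusion Criteria", "Exclusion Criteria",
    "Treatment", "Primary Endpoint", "Secondary Endpoint", "Safety Endpoints",
    "Baseline Characteristics", "Primary Results", "Secondary Results",
    "Safety Results", "Limitations", "Conclusions",
]

# Assumed convention: "<Label>: <highlighted text>" (case-insensitive label).
_LABEL_RE = re.compile(
    r"^\s*(" + "|".join(re.escape(c) for c in CATEGORIES) + r")\s*:\s*(.*)$",
    re.IGNORECASE | re.DOTALL,
)


def group_comments(comments):
    """Group labeled comment strings by category; collect unlabeled ones separately."""
    canonical = {c.lower(): c for c in CATEGORIES}
    grouped = {c: [] for c in CATEGORIES}
    unlabeled = []
    for comment in comments:
        match = _LABEL_RE.match(comment)
        if match:
            label = canonical[match.group(1).lower()]
            grouped[label].append(match.group(2).strip())
        else:
            unlabeled.append(comment.strip())
    return grouped, unlabeled


def build_prompt(grouped, question, word_limit=300):
    """Assemble a prompt from the grouped excerpts (hypothetical wording)."""
    lines = [
        "Summarize the clinical trial excerpts below.",
        f"Question to answer: {question}",
        f"Use only the excerpts provided and stay under {word_limit} words.",
        "",
    ]
    for category, snippets in grouped.items():
        if snippets:  # skip categories with no highlights
            lines.append(f"{category}:")
            lines.extend(f"- {snippet}" for snippet in snippets)
    return "\n".join(lines)


if __name__ == "__main__":
    # Placeholder comments, not data from a real trial.
    example = [
        "Objective: To compare Drug X with placebo in adults with condition Y.",
        "Primary Endpoint: Change from baseline at week 12.",
    ]
    grouped, unlabeled = group_comments(example)
    print(build_prompt(grouped, "What was the primary endpoint?"))
```

The resulting text can be pasted into ChatGPT as is. Comments that do not start with a recognized label are returned separately, so a mistyped label is easy to spot before anything is summarized.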

Why This Simple Approach is a Game-Changer

By controlling the content that is highlighted and summarized, this method offers several distinct advantages:

Customization: Given that we often need to tailor our summaries based on the questions they aim to answer, this approach offers the perfect balance of clinical AI and human insight. Only highlight the information you want summarized.
Efficiency in Data Checking: The original document is already highlighted, simplifying the reference check.
Set up for Success: The categorized comments can potentially train future AI models, automating the data extraction process further.
Start Today: This approach does not require training an AI model or programming anything. You only need a full version of Adobe Acrobat and access to an AI language model like ChatGPT.

What’s Next?

The system I tested with ChatGPT (GPT-4) worked well, but it doesn’t stop here. First, I need to keep tweaking the prompt. There are many opportunities to set further AI guardrails that lead the language model to a more precise summary that closely meets the user’s needs. Second, the goal is to automate this process further, using my manual comment categorizations to train an AI model to pull these key sections automatically. Ideally, a fully trained model could take a question, select the appropriate information, and summarize it in a way that answers that specific question. Last but not least, I aim to integrate this program with a more medically trained summarization model. ChatGPT is great, but there are models out there that are already trained on clinical trial data. Why not use those to summarize instead?

So, what’s next on this journey? I’m diving into programming within Adobe Acrobat to streamline the section tagging and feed that content into an advanced AI clinical data summarization model. Once that is complete, I will work on building a dataset that can be used to train an AI model for section tagging in Acrobat.
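
As a rough illustration of what such a dataset could look like, the sketch below stores each labeled highlight as one JSON object per line (the JSON Lines layout). The field names `source`, `label`, and `text`, and the helper names, are hypothetical choices for illustration. Nothing here reflects a finished MedInfo AI format.

```python
import json


def to_records(labeled_snippets, source_id):
    """Turn (label, text) pairs into dicts, dropping empty highlights."""
    return [
        {"source": source_id, "label": label, "text": text.strip()}
        for label, text in labeled_snippets
        if text.strip()
    ]


def write_jsonl(records, path):
    """Write one JSON object per line so files can be appended and merged easily."""
    with open(path, "w", encoding="utf-8") as handle:
        for record in records:
            handle.write(json.dumps(record, ensure_ascii=False) + "\n")


def read_jsonl(path):
    """Load the records back, skipping blank lines."""
    with open(path, encoding="utf-8") as handle:
        return [json.loads(line) for line in handle if line.strip()]
```

Keeping the source identifier next to each label and excerpt means every training example can be traced back to the highlighted PDF it came from, which preserves the reference-check benefit described earlier.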

Stay Connected!

For more insights, follow coremedcom.com and medinfoai.com and join us on LinkedIn (CoreMed Communications and MedInfo AI). Let’s embrace the future of AI in pharma medical information together.