Clinical studies and privacy: anonymize data with this Narrativa model

AI in Life Sciences
By Sofía Sánchez González
Privacy is one of the most talked-about topics these days, and keeping our data safe sometimes seems impossible. Clinical studies are not immune; to carry them out, participants’ data must be anonymized so that science can advance while respecting the rights of those involved. But did you know that you can easily comply with privacy regulations and anonymize data using artificial intelligence?
How does our AI model work?
At Narrativa we wanted to provide a solution to the privacy problem. It all started when our technical team was investigating medical files published by the Plan for the Promotion of Language Technologies (the TL Plan).
The TL Plan aims to promote the development of natural language processing, machine translation and conversational systems in Spanish and other languages.
For this reason, it often releases datasets. Among these datasets, our technical team found clinical notes on patients, and that gave them an idea.
https://github.com/PlanTL-GOB-ES/SPACCC_MEDDOCAN
In this dataset there is an extensive list of patient data:
- Name
- Surname
- Location data
- Phone number
- Date of birth
- Date of admission
- Treating doctor
But to carry out a clinical study and anonymize the data we have to accomplish several steps.
Two steps to anonymization
1. Identify the information in each record
The dataset contains a large amount of data, but it isn’t sorted. We start with the most difficult part: identifying the information and classifying it. (It’s crucial to know where the sensitive and personal information is.)
https://github.com/PlanTL-GOB-ES/SPACCC_MEDDOCAN/blob/master/corpus/train/brat/S0004-06142005000500011-1.txt
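The corpus files linked above sit in a brat directory, and brat's standoff format pairs each .txt file with an .ann file that lists labeled character spans. As a minimal sketch of what reading those labels could look like (this is not Narrativa's model), the Python below parses entity lines of the form ID, tab, label and offsets, tab, text. The `Entity` class, the `parse_brat_entities` helper, and the sample text and labels are all made up for illustration and are not copied from the corpus.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entity:
    ident: str   # brat identifier, e.g. "T1"
    label: str   # category of sensitive data (illustrative names here)
    start: int   # character offset where the span begins
    end: int     # character offset just past the end of the span
    text: str    # the annotated text itself


def parse_brat_entities(ann_text):
    """Return the text-bound annotations ("T" lines) sorted by position."""
    entities = []
    for line in ann_text.splitlines():
        if not line.startswith("T"):
            continue  # skip notes, relations and blank lines
        ident, span, text = line.split("\t", 2)
        label, offsets = span.split(" ", 1)
        if ";" in offsets:
            continue  # discontinuous spans are out of scope for this sketch
        start, end = (int(value) for value in offsets.split())
        entities.append(Entity(ident, label, start, end, text))
    return sorted(entities, key=lambda entity: entity.start)


# Hypothetical document and annotations, invented for this example.
sample_txt = "Patient Jane, admitted on 2020-01-15."
sample_ann = (
    "T1\tNAME 8 12\tJane\n"
    "T2\tDATE 26 36\t2020-01-15\n"
    "#1\tAnnotatorNotes T1\tfree-text note\n"
)

entities = parse_brat_entities(sample_ann)
for entity in entities:
    # Each offset pair should point at exactly the annotated text.
    print(entity.label, repr(sample_txt[entity.start:entity.end]))
```

Checking that every offset pair slices out the annotated text is a cheap way to catch encoding or line-ending mismatches before any masking is applied.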
2. Anonymization
https://github.com/PlanTL-GOB-ES/SPACCC_MEDDOCAN/blob/master/corpus/train/brat/S0004-06142005000500011-1.ann
Once we have identified all the patient data, we have to mask (or redact) their personal data to protect their privacy and comply with confidentiality protocols. The model created by Narrativa offers two masking options:
- To cover it up/black it out
- To use false/made-up information
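The two masking options can be illustrated with a small Python sketch. It assumes the sensitive spans have already been identified as (start, end, label) character offsets; the `mask_spans` function, the label names, and the replacement values are hypothetical and do not describe how the Narrativa model is implemented.

```python
def mask_spans(text, spans, mode="redact", fakes=None):
    """Mask labeled character spans in `text`.

    mode="redact"  covers each span with a run of "X" characters.
    mode="replace" swaps each span for a made-up value looked up by label,
                   falling back to a "[LABEL]" placeholder.
    """
    if mode not in ("redact", "replace"):
        raise ValueError(f"unknown mode: {mode!r}")
    fakes = fakes or {}
    pieces = []
    cursor = 0
    for start, end, label in sorted(spans):
        if start < cursor:
            raise ValueError("spans must not overlap")
        pieces.append(text[cursor:start])  # keep the untouched text
        if mode == "redact":
            pieces.append("X" * (end - start))
        else:
            pieces.append(fakes.get(label, f"[{label}]"))
        cursor = end
    pieces.append(text[cursor:])
    return "".join(pieces)


# Invented example: one name and one date.
note = "Patient Jane, admitted on 2020-01-15."
found = [(8, 12, "NAME"), (26, 36, "DATE")]

redacted = mask_spans(note, found, mode="redact")
replaced = mask_spans(
    note, found, mode="replace",
    fakes={"NAME": "Alex", "DATE": "1999-12-31"},
)
print(redacted)  # Patient XXXX, admitted on XXXXXXXXXX.
print(replaced)  # Patient Alex, admitted on 1999-12-31.
```

Blacking out keeps the original length but leaves visibly censored text, while made-up values keep the document readable for downstream analysis.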
Privacy and pharmaceutical companies
Data is the new oil, but we have to be careful when dealing with it—especially when it comes to information as sensitive and personal as medical data. Clinical studies must scrupulously comply with current regulations and Narrativa can help pharmaceutical companies to do this.
Anonymizing data is the most fundamental step; worldwide there has been a huge push for data protection regulation. Take a look at this study in Spanish that delves into the reasons.
Narrativa specializes in collecting, processing and analyzing data, so your company can focus on what’s truly important. With this model, pharmaceutical companies will be able to streamline these privacy processes with much more powerful technology. In fact, the model scores above 70% on the F1 metric.
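For readers unfamiliar with the F1 metric: it is the harmonic mean of precision (how many of the spans the model flagged are correct) and recall (how many of the truly sensitive spans it found). The sketch below computes it over exact-match spans; the `f1_score` helper and the example spans are invented for illustration and are not the evaluation that produced the figure above.

```python
def f1_score(predicted, gold):
    """F1 over exact-match spans given as sets of (start, end, label)."""
    true_positives = len(predicted & gold)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(gold)
    return 2 * precision * recall / (precision + recall)


# Hypothetical spans: the model finds two of three gold spans
# and raises one false alarm.
gold = {(8, 12, "NAME"), (26, 36, "DATE"), (40, 45, "LOCATION")}
predicted = {(8, 12, "NAME"), (26, 36, "DATE"), (50, 55, "PHONE")}

score = f1_score(predicted, gold)
print(f"F1 = {score:.3f}")  # precision 2/3, recall 2/3 -> F1 = 0.667
```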
A solution for all types of companies
Sure, pharmaceutical companies can benefit from this model, but so can businesses in other industries. It can also be decisive for companies that conduct research in other sectors, both social and economic.
In artificial intelligence, data has to be anonymized before it is used to train any model. As each company is different, Narrativa offers personalized solutions for your organization. If you want to anonymize any type of data at your company, don’t hesitate to contact us!
About Narrativa
Narrativa® is the global leader in generative AI content automation. Through the no-code Narrativa® Navigator platform and the collaborative writing assistant, Narrativa® Sidekick, organizations large and small are empowered to accelerate content creation at scale with greater speed, accuracy, and efficiency.
For companies in the life sciences industry, Narrativa® Navigator provides secure and specialized AI-powered automation features. It includes complementary user-friendly tools such as CSR Atlas, Narrative Pathway, TLF Voyager, and Redaction Scout, which operate cohesively to transform clinical data into submission-ready regulatory documents. From database to delivery, pharmaceutical sponsors, biotech firms, and contract research organizations (CROs) rely on Narrativa® to streamline workflows, decrease costs, and reduce time-to-market across the clinical lifecycle and, more broadly, throughout their entire businesses.
The dynamic Narrativa® Navigator platform also supports non-clinical industries such as finance, marketing, and media. It helps teams drive measurable impact by creating high-quality, scalable content on any topic. Available as a self-serve SaaS solution or a fully managed service, built-in AI agents enable the production, refinement, and iteration of large volumes of SEO-optimized news articles, engaging blog posts, insightful thought leadership pieces, in-depth financial reports, dynamic social media posts, compelling white papers, and much more.
Explore www.narrativa.com and follow on LinkedIn, Facebook, Instagram, and X. Accelerate the potential with Narrativa®.