How to build a web scraping AI Agent with Narrativa® Navigator

Web Scraper
By Cristina Blanco
Today, AI Agents have become one of the most powerful tools in the AI ecosystem. They allow you to delegate complex tasks to autonomous systems capable of interpreting, deciding, and executing actions on your behalf. For users, mastering agents means gaining flexibility, autonomy, and a much greater ability to build advanced workflows without relying on traditional programming.
In this context, Narrativa® Navigator introduces AI Agents designed specifically for extracting information from the web, enabling you to turn any public webpage into a structured dataset through a simple configuration process. This functionality provides a level of operational freedom that previously required code or external integrations.
Narrativa note: When we refer to AI Agents in Narrativa® Navigator, we are talking about operational entities designed to perform tasks according to your instructions. They do not act independently outside the secure environment of the platform. Their “behavior” is an operational narrative built to help you delegate complex processes in an easy and safe way.
1. What can you do with AI Agents?
Before getting started, it’s helpful to understand why this feature is so powerful:
✔ Total independence
No need for APIs, scripts, or complex technical setups. The agent navigates the web for you.
✔ Customized data extraction
You decide exactly which data points you want and how they should be stored.
✔ Full flexibility
Works with virtually any public webpage.
✔ Time-saving automation
Automates repetitive tasks that you previously had to do manually.
2. How to create a web scraping AI Agent
Below is a step-by-step guide on how to use this feature.
Step 1: Access the Data tab
From the platform sidebar:
- Click Data.
- Select Create new Data.
Step 2: Choose the data source type
In the creation wizard:
- Select AI Agent (Web Scraping).
- Enter one or more URLs from which you want to extract information.
Step 3: Configure the information the agent should extract
This is where you define how your dataset will be structured.
Add one column for each data point
For every column, specify:
- Column name
- Detailed description, including:
  - which value the agent should look for
  - where it is located on the webpage
  - any useful contextual clues (HTML structure, tags, attributes…)
- Data type, such as:
  - Text (default)
  - Numeric
  - Other formats as needed
The agent will analyze the webpage according to your instructions to locate the correct value.
Tip: The clearer and more precise your description is, the more accurate the extraction will be.
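It can help to draft the column definitions before entering them in the wizard. The following is a minimal Python sketch of such a draft, not Navigator's internal format or any Narrativa API: `ColumnSpec`, `vague_columns`, the example columns, and the eight-word threshold are all hypothetical and serve only to illustrate the difference between a precise description and a vague one.

```python
from dataclasses import dataclass


# Hypothetical planning structure; it is NOT part of Narrativa Navigator.
@dataclass(frozen=True)
class ColumnSpec:
    name: str
    description: str
    data_type: str = "Text"  # Text is the default type named in this guide


def vague_columns(columns, min_words=8):
    """Return the names of columns whose description is probably too short.

    The word-count threshold is an arbitrary heuristic chosen for this sketch.
    """
    return [c.name for c in columns if len(c.description.split()) < min_words]


columns = [
    ColumnSpec(
        name="product_name",
        description="Title of the product, shown in the <h1> tag at the top of the page",
    ),
    ColumnSpec(
        name="price",
        description="Price in euros, inside the span with class 'price' next to the title",
        data_type="Numeric",
    ),
    # Too vague: says neither what to look for nor where it is located.
    ColumnSpec(name="rating", description="The rating"),
]

print(vague_columns(columns))  # ['rating']
```

The first two descriptions state which value to find and where it sits on the page, which is the level of detail the tip above recommends. The third would likely need refinement.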
Step 4: Preview the extraction
Click Connect to generate a preview of the extracted data.
You will be able to validate:
- Whether the agent is retrieving the correct information
- Whether the format matches what you need
- Whether descriptions need refinement or more columns should be added
You can adjust the configuration at any time. It’s not a rigid or linear process.
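The checks listed above are done visually in the preview. As a rough illustration of what "correct information" and "correct format" mean in practice, here is a small Python sketch that flags empty values and values that do not parse as numbers in columns declared Numeric. It assumes the preview rows have been copied into a list of dictionaries by hand. The guide does not describe any export of the preview, and `check_preview` and the sample rows are hypothetical.

```python
def check_preview(rows, numeric_columns):
    """Return (row_index, column, value) for every suspicious value.

    A value is flagged when it is missing or blank, or when its column is
    declared numeric and the value cannot be parsed as a number.
    """
    problems = []
    for i, row in enumerate(rows):
        for col, value in row.items():
            if value is None or str(value).strip() == "":
                problems.append((i, col, value))
            elif col in numeric_columns:
                try:
                    float(value)
                except (TypeError, ValueError):
                    problems.append((i, col, value))
    return problems


# Hypothetical preview rows, typed in manually for this example.
preview = [
    {"product_name": "Desk lamp", "price": "24.90"},
    {"product_name": "Office chair", "price": "€129"},  # currency symbol left in
    {"product_name": "", "price": "15"},                # name not found
]

print(check_preview(preview, numeric_columns={"price"}))
# [(1, 'price', '€129'), (2, 'product_name', '')]
```

A flagged value usually points back to the column description: for instance, asking for "the price as a number, without the currency symbol" would address the second row.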
Step 5: Configure the dataset as usual
Once the extraction looks correct:
- Select the columns you want to use
- Define the Unique ID
- Set up the Scheduler, as you would for any other data source
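To see why the Unique ID matters once the dataset is scheduled, consider the sketch below. It shows one common way a unique identifier is used when the same source is read repeatedly: rows with a known ID are updated and rows with a new ID are added. This is an assumption for illustration only. The guide does not specify how Navigator applies the Unique ID internally, and `merge_run` and the sample data are hypothetical.

```python
def merge_run(existing, new_rows, unique_id):
    """Upsert new_rows into existing, keyed by the unique_id column.

    Rows whose ID is already present replace the stored row; other rows
    are added. The input mapping is left unmodified.
    """
    merged = dict(existing)
    for row in new_rows:
        merged[row[unique_id]] = row
    return merged


# Two hypothetical scheduled runs over the same pages.
run_1 = [{"url": "/a", "price": 10}, {"url": "/b", "price": 20}]
run_2 = [{"url": "/b", "price": 18}, {"url": "/c", "price": 30}]

dataset = merge_run({}, run_1, "url")
dataset = merge_run(dataset, run_2, "url")

print(sorted(dataset))         # ['/a', '/b', '/c']
print(dataset["/b"]["price"])  # 18
```

Under this assumption, a column that is stable and distinct for every row (here the page URL) is a better Unique ID than one that changes between runs, such as a price.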

Final result
You’ll obtain a dynamic dataset, fully tailored to your needs and capable of updating automatically. This dataset can then be used in:
- Workflows
- Narratives
- Content projects
- Analytical processes within Narrativa® Navigator
About Narrativa
Narrativa® Agentic AI solutions unlock a faster, smarter future for life sciences organizations, helping them to efficiently produce complex, high-volume documentation for regulatory and commercialization workflows. By automating content creation, Narrativa® delivers greater speed, accuracy, and consistency—while ensuring full compliance in highly regulated environments.
The Narrativa® Navigator platform provides secure and specialized Agentic AI-powered automation features. It includes complementary user-friendly tools such as Clinical Atlas for CSR and Protocol generation, Narrative Pathway, TLF Voyager, and Redaction Scout, which operate cohesively to transform clinical data into submission-ready documents for regulatory and commercialization workflows. From database to delivery, pharmaceutical sponsors, biotech firms, and contract research organizations (CROs) rely on Narrativa® to streamline workflows, decrease costs, and reduce time-to-market across the clinical lifecycle and, more broadly, throughout their entire businesses.
Explore www.narrativa.com and follow Narrativa on LinkedIn, Facebook, Instagram, and X.

