Agentic Data Stack: A New Era For Data Professionals

The right data stack opportunities for AI Agent workflow integrations, Challenges to navigate, and the contextual significance of the Human-in-the-loop

Jul 17, 2025

Before Diving in, The Modern Data Survey Report is Here!

Subscribers hear first 🧡

230+ Industry voices with 15+ years of experience on average and from across 48 Countries came together to participate in the 1st edition of the Modern Data Survey.

Back to this week’s Deep Dive!

This piece is a community contribution from Alejandro Aboy, a data engineer with nearly a decade of experience across analytics, engineering, and digital tracking. Currently at Workpath, he is the main data platform stack owner, building scalable pipelines, automating workflows, and prototyping AI-powered SaaS analytics using tools like Airflow, dbt, Aurora, and Metabase. With prior roles at AILY Labs and Ironhack, Alejandro has built a strong foundation in web and marketing analytics, shifting into modern data stack tools to deliver robust ELT workflows, visualisations, and governance systems. He also writes The Pipe & The Line, where he reflects on modern data engineering and the evolving stack. We’re thrilled to feature their unique insights on Modern Data 101!

We actively collaborate with data experts to bring the best resources to a 10,000+ strong community of data leaders and practitioners. If you have something to share, reach out!

🫴🏻 Share your ideas and work: community@moderndata101.com

*Note: Opinions expressed in contributions are not our own and are only curated by us for broader access and discussion. All submissions are vetted for quality & relevance. We keep it information-first and do not support any promotions, paid or otherwise!

TOC

Agentic (Data) Roles: Where Are They Going?
Agentic Horizons for Data Teams
Pipeline Implementation, Debugging & Monitoring
Documentation & Governance
Data-Model Design from Business Requirements
Self-Served Agentic Analytics
Do We Need To Learn For Agents To Learn?
Agentic Headaches (a.k.a Challenges)
Final Words

There's a lot of hype around AI, especially in web app prototyping, but what about our beloved data world?

You open LinkedIn and see the usual posts:

BREAKING: OpenAI releases new prompting guides
LATEST: Anthropic/DeepSeek/Google launches the greatest model ever
“I created this 892-step n8n workflow to read all my emails. Comment on this post so you can ignore yours too!”

You get the point: AI is everywhere, but I don't think we’re fully grasping where it's heading. We're automating both content creation and consumption. We're generating LinkedIn posts with AI and summarizing them using AI because there's simply too much content to process.

This trend isn't limited to creative writing, research, or coding. It’s transforming how data professionals interact with stakeholders.

Agentic (Data) Roles: Where Are They Going?

Depending on your role, this shift will look different.

Stakeholders using agents means data professionals must now translate business context into inputs that agents can understand and execute.

Here are some of the scenarios I imagine:

Data Analysts

Translating business use cases.
Preparing RAG inputs.
Adapting stakeholder metric definitions.

Data Engineers

Designing secure databases and agent tool scopes.
Setting up RAG pipelines.
Crafting agent backends (memory, sessions, etc.).

Data Scientists

Experimenting with AI models.
Enabling fine-tuned models.
Owning observability and LLM evaluation.

These role shifts are already happening in some companies, and the shift WILL accelerate.

The key message is:

Good data professionals already bridge technical and business knowledge. Now, they must also understand AI-native communication (i.e., English for LLMs, prompt engineering).
We’re entering a new level of cross-functionality.

Agentic Horizons for Data Teams

I foresee the biggest impact in the four areas below. For each one, I describe a use case and what you can do about it.

Agentic Roles in Data Teams — Agentic Horizons for Data Teams (for representation only)

Pipeline Implementation, Debugging & Monitoring

Use case

An agent detects a bug, finds prior incidents in Slack, proposes a fix, and opens a pull request.

What you can do
Design feedback loops while monitoring pipelines. Hold post-mortems and document everything. This improves your RAG system’s memory and knowledge retrieval.
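The idea that documented post-mortems feed an agent's memory can be sketched in a few lines. This is a minimal illustration, not a real RAG system: `Incident`, `tokenize`, `find_similar_incidents`, and the sample post-mortems are all hypothetical, and plain keyword overlap (Jaccard similarity) stands in for the embedding-based retrieval a production setup would more likely use.

```python
from __future__ import annotations

import re
from dataclasses import dataclass


@dataclass
class Incident:
    """One documented post-mortem (hypothetical structure)."""
    title: str
    root_cause: str
    fix: str


def tokenize(text: str) -> set[str]:
    """Lowercase the text and split it into a set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9_]+", text.lower()))


def find_similar_incidents(error: str, log: list[Incident], top_k: int = 2):
    """Rank documented incidents by word overlap with a new error message."""
    query = tokenize(error)
    scored = []
    for incident in log:
        doc = tokenize(f"{incident.title} {incident.root_cause}")
        union = query | doc
        score = len(query & doc) / len(union) if union else 0.0
        if score > 0:
            scored.append((score, incident))
    # Sort on the score only, so Incident objects are never compared.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]


# Assumed sample data: what a team might have written down after two incidents.
POSTMORTEMS = [
    Incident(
        title="orders dag failed on null customer_id",
        root_cause="upstream schema change dropped not null constraint",
        fix="add not_null test and coalesce in staging model",
    ),
    Incident(
        title="dashboard refresh timeout",
        root_cause="metabase query scanned full table",
        fix="add partition filter",
    ),
]

matches = find_similar_incidents(
    "orders dag failed: null value in customer_id", POSTMORTEMS
)
for score, incident in matches:
    print(f"{score:.2f} | {incident.title} -> {incident.fix}")
```

The point is less the scoring method than the dependency: the agent can only surface the earlier fix because someone wrote the post-mortem down in the first place.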

📝 Related Read(s)
How AI Agents & Data Products Work Together to Support Cross-Domain Queries & Decisions for Businesses, by Travis Thompson, Brij Mohan Singh, and Ritwika Chowdhury (Jan 16)
Data Quality: A Cultural Device in the Age of AI-Driven Adoption, by Animesh Kumar (May 29)

Documentation & Governance

Use case

The backend team scopes a feature. An agent searches Google Drive, finds an older failed attempt, and flags lessons learned to consider.

What you can do
Ensure documentation is clear, current, and non-contradictory. Use a custom GPT or a coded agent to auto-generate documentation in structured formats like Markdown or XML, which tend to produce higher-quality output.

📝 Related Read(s)
Data Governance 3.0: Harnessing the Partnership Between Governance and AI Innovation, by Amy Raygada (Jan 30)
Governance for AI Agents with Data Developer Platforms, by Brij Mohan Singh, Travis Thompson, and Ritwika Chowdhury (December 5, 2024)

Data-Model Design from Business Requirements

Use case

You need to model a new business definition. The agent retrieves contextual info and proposes an implementation strategy.

What you can do
Feed meeting notes, interviews, and documentation into agents via declarative assets (e.g., dbt YAML files). Strong data contracts will become essential for agent-driven environments.
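To make "data contract" concrete, here is a minimal sketch of a contract expressed as a declarative structure plus a check that an agent (or a CI job) could run before proposing a model change. Everything here is an assumption for illustration: the `CONTRACT` layout is loosely inspired by declarative YAML configs but is not dbt's actual schema, and `validate_rows` is a hypothetical helper.

```python
# Hypothetical contract for one model: column names, expected types, nullability.
CONTRACT = {
    "model": "fct_orders",
    "description": "One row per confirmed order.",
    "columns": {
        "order_id": {"type": int, "nullable": False},
        "customer_id": {"type": int, "nullable": False},
        "discount_code": {"type": str, "nullable": True},
    },
}


def validate_rows(rows, contract):
    """Return a list of human-readable contract violations (empty if none)."""
    violations = []
    for index, row in enumerate(rows):
        for name, spec in contract["columns"].items():
            if name not in row:
                violations.append(f"row {index}: missing column '{name}'")
                continue
            value = row[name]
            if value is None:
                if not spec["nullable"]:
                    violations.append(f"row {index}: '{name}' must not be null")
            elif not isinstance(value, spec["type"]):
                expected = spec["type"].__name__
                violations.append(f"row {index}: '{name}' expected {expected}")
    return violations


# Assumed sample rows, e.g. the output of a model an agent has just proposed.
rows = [
    {"order_id": 1, "customer_id": 10, "discount_code": None},
    {"order_id": 2, "customer_id": None, "discount_code": "SUMMER"},
    {"order_id": "3", "customer_id": 12},
]

violations = validate_rows(rows, CONTRACT)
for violation in violations:
    print(violation)
```

Because the contract is data rather than prose, the same structure can be handed to an agent as context and used to reject its output when it drifts from the agreed definition.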

📝 Related Read(s)
Right-to-left data modelling ⬇️
Where Exactly Data Becomes Product: Illustrated Guide to Data Products in Action, by Animesh Kumar and Travis Thompson (August 1, 2024)
The Power Combo of AI Agents and the Modular Data Stack: AI that Reasons, by Animesh Kumar, Brij Mohan Singh, and Ritwika Chowdhury (October 24, 2024)

Self-Served Agentic Analytics

Use case

Stakeholders can’t identify the best dashboard or even understand it. An agent guides them, offering context and support.

What you can do
Equip agents with well-documented API access to tools like Tableau or Metabase. This avoids "dashboard graveyards."
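As a rough sketch of what "well-documented access" buys you, the snippet below shows a tool an agent could call to point a stakeholder at the right dashboard. It is deliberately simplified and rests on assumptions: `DASHBOARD_CATALOG` is a hand-written, in-memory stand-in for metadata you would normally pull from your BI tool (no real Tableau or Metabase API call is shown), and `search_dashboards` is a hypothetical helper using naive keyword matching.

```python
# Assumed metadata; in practice this would be synced from the BI tool.
DASHBOARD_CATALOG = [
    {
        "name": "Revenue Overview",
        "owner": "finance",
        "description": "monthly recurring revenue churn and expansion by plan",
    },
    {
        "name": "Activation Funnel",
        "owner": "product",
        "description": "signup to first key action conversion by cohort",
    },
]

# Words too generic to count as evidence of a match.
STOPWORDS = {"which", "what", "the", "a", "by", "to", "and", "shows", "dashboard"}


def search_dashboards(question, catalog):
    """Return dashboards whose name or description share keywords with the question."""
    words = set(question.lower().replace("?", "").split()) - STOPWORDS
    results = []
    for dashboard in catalog:
        text = f"{dashboard['name']} {dashboard['description']}".lower()
        hits = sorted(words & set(text.split()))
        if hits:
            results.append(
                {"name": dashboard["name"], "owner": dashboard["owner"], "matched": hits}
            )
    # Most matched keywords first.
    results.sort(key=lambda item: len(item["matched"]), reverse=True)
    return results


answer = search_dashboards("Which dashboard shows churn by plan?", DASHBOARD_CATALOG)
for item in answer:
    print(f"{item['name']} (owner: {item['owner']}), matched on {item['matched']}")
```

Note how much the result depends on the `description` field: a dashboard nobody described is a dashboard the agent cannot recommend.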

📝 Related Read(s)
The Dashboard Doppelgänger: When GenAI Meets the Human Gaze, by Antonio Neto and Livia Fazolato (Jun 19)
AI Augmentation to Scale Data Products to a Data Product Ecosystem, by Brij Mohan Singh, Ritwika Chowdhury, and Rakesh Vishvakarma (August 9, 2024)

Soon, you’ll hear job titles like PromptOps Specialist or VibeOps AI Expert. What’s remarkable is that these roles are not far off; in some companies, they’re already here.

Caricature of Self-Served Agentic Analytics, generated by ChatGPT | Source: Author

Do We Need To Learn For Agents To Learn?

Agents don’t replace human learning: they require it.

Agents need tools and instructions (just like humans do) until they learn to operate independently.

This means data professionals will shift from executors to orchestrators. Knowing what you're doing becomes even more critical.

We still need to master:

Best practices
Technical fundamentals
New tools for business performance
How to spot agent hallucinations

A good approach: implement Human-in-the-Loop workflows EARLY.

Explain and Confirm: Agent drafts; human approves.
Override: Agent acts unless a human vetoes.
Sandbox and Deploy: Agent tests changes in isolated branches.
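The three Human-in-the-Loop patterns above can be sketched as a single gate that every agent action passes through. This is an illustrative sketch under stated assumptions: `Mode`, `ProposedAction`, and `run_with_oversight` are hypothetical names, and the "environments" are just strings rather than real branches or deployments.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, Optional


class Mode(Enum):
    EXPLAIN_AND_CONFIRM = "explain_and_confirm"  # agent drafts; human approves
    OVERRIDE = "override"                        # agent acts unless a human vetoes
    SANDBOX_AND_DEPLOY = "sandbox_and_deploy"    # agent tests in isolation first


@dataclass
class ProposedAction:
    description: str
    apply: Callable[[str], str]  # takes an environment name, returns a result message


def run_with_oversight(
    action: ProposedAction, mode: Mode, human_decision: Optional[bool] = None
) -> str:
    """Gate an action. human_decision: True = approve, False = reject, None = no answer yet."""
    if mode is Mode.EXPLAIN_AND_CONFIRM:
        # Nothing touches production without an explicit yes.
        if human_decision is True:
            return action.apply("production")
        return f"DRAFT (awaiting approval): {action.description}"
    if mode is Mode.OVERRIDE:
        # Silence counts as consent; only an explicit no stops the action.
        if human_decision is False:
            return f"VETOED: {action.description}"
        return action.apply("production")
    if mode is Mode.SANDBOX_AND_DEPLOY:
        # Always run in isolation first; promote only after approval.
        sandbox_result = action.apply("sandbox")
        if human_decision is True:
            return action.apply("production")
        return f"{sandbox_result} (not promoted)"
    raise ValueError(f"unknown mode: {mode}")


backfill = ProposedAction(
    description="backfill orders for last week",
    apply=lambda env: f"backfill ran in {env}",
)

print(run_with_oversight(backfill, Mode.EXPLAIN_AND_CONFIRM))
print(run_with_oversight(backfill, Mode.OVERRIDE))
print(run_with_oversight(backfill, Mode.OVERRIDE, human_decision=False))
print(run_with_oversight(backfill, Mode.SANDBOX_AND_DEPLOY))
```

The design choice worth noticing is the default when nobody answers: Explain and Confirm fails closed, Override fails open. Picking the mode per action type is a risk decision, not a technical one.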

You need to have a high-level understanding of what agents are doing. Otherwise, you are designing agentic technical debt.

Agentic Headaches (a.k.a Challenges)

Everything comes with trade-offs and new challenges. Ignoring these issues can create serious problems:

Observability: Know what agents are doing and why. Use those traces to iterate on versioned prompts and tools until the system behaves consistently.
Guardrails: Filter actions to prevent bad decisions, bad outputs, or insecure LLM interactions.
Security Design: Give agents only the tools, authentication scope, and access they need to do their job, and nothing more.
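The three concerns above can live in one thin wrapper around an agent's tools. The sketch below is illustrative only: `ScopedToolbox`, `run_query`, and `drop_table` are hypothetical, the tools return strings instead of touching a database, and the string-prefix check is a deliberately naive guardrail. In a real system, least privilege should also be enforced at the database role level, not only in application code.

```python
from datetime import datetime, timezone


class ScopedToolbox:
    """Wraps an agent's tools with an allowlist (security) and an audit trail (observability)."""

    def __init__(self, agent: str, tools: dict, allowed: set):
        self.agent = agent
        self.tools = tools
        self.allowed = allowed
        self.audit_log = []

    def call(self, tool_name: str, **kwargs):
        permitted = tool_name in self.allowed and tool_name in self.tools
        # Log every attempt, including the ones that get blocked.
        self.audit_log.append(
            {
                "at": datetime.now(timezone.utc).isoformat(),
                "agent": self.agent,
                "tool": tool_name,
                "args": kwargs,
                "permitted": permitted,
            }
        )
        if not permitted:
            raise PermissionError(f"{self.agent} may not call {tool_name}")
        return self.tools[tool_name](**kwargs)


def run_query(sql: str) -> str:
    """Stand-in for a query tool, with a naive read-only guardrail."""
    if not sql.strip().lower().startswith("select"):
        raise ValueError("guardrail: only SELECT statements are allowed")
    return f"executed: {sql}"


def drop_table(name: str) -> str:
    """Stand-in for a destructive tool the agent should never reach."""
    return f"dropped {name}"


toolbox = ScopedToolbox(
    agent="analytics-agent",
    tools={"run_query": run_query, "drop_table": drop_table},
    allowed={"run_query"},  # minimal scope: the destructive tool is not granted
)

result = toolbox.call("run_query", sql="SELECT count(*) FROM orders")

try:
    toolbox.call("drop_table", name="orders")
except PermissionError as error:
    blocked = str(error)

for entry in toolbox.audit_log:
    print(entry["tool"], "permitted" if entry["permitted"] else "BLOCKED")
```

The audit log is what makes a bad output traceable afterwards: it records what the agent tried, with which arguments, and whether it was allowed.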

Bad outputs are one thing. Worse is when no one knows how to trace them, how to put preventive mitigations around them, or how to protect the integrity of the company’s systems if the agents fail.

Final Words

All these topics come with a steep learning and adoption curve, and many companies may need years to adapt. Eventually, the hype will fade and agentic workflows will become the norm. By then, best practices will be clearer.

The bottom line: Whether you're an analyst, engineer, or scientist, you must observe how agents behave and keep refining them.

Don’t wait for a stakeholder complaint to act. Be agentic (see what I did there).

Value remains the most critical outcome. Its quality will rely on how we translate business intent into agent-ready instructions.

One last thing: knowing when NOT to use AI will bring even more value to businesses.

TL;DR

Agentic (Data) Roles: Where Are They Going?
Data roles will need to become the bridge between business, tech, and AI systems.
Agentic Horizons for Data Teams
Documentation, business context, and well-formatted input will be critical.
Do We Need To Learn For Agents To Learn?
Someone has to build the agentic architecture that makes sense. Learning best practices is more critical than ever.
Agentic Headaches (a.k.a Challenges)
Observability, Guardrails and Secure Design are critical topics to cover actively in any company working with LLMs.
Final Words
Use AI in your Data Stack wisely, or don’t use it.

Author Connect

Find me on LinkedIn 🙌🏻


A guest post by

Alejandro Aboy

Ex Web Analytics Specialist, currently working as Data Engineer. Building & growing my Data & AI career by sharing the tools and lessons I wish I had when I started.
