Agentic Data Stack: A New Era For Data Professionals
The right data stack opportunities for AI agent workflow integrations, challenges to navigate, and the contextual significance of the human-in-the-loop
Before Diving in, The Modern Data Survey Report is Here!
Subscribers hear first 🧡
230+ Industry voices with 15+ years of experience on average and from across 48 Countries came together to participate in the 1st edition of the Modern Data Survey.
Back to this week’s Deep Dive!
This piece is a community contribution from Alejandro Aboy, a data engineer with nearly a decade of experience across analytics, engineering, and digital tracking. Currently at Workpath, he is the main data platform stack owner, building scalable pipelines, automating workflows, and prototyping AI-powered SaaS analytics using tools like Airflow, dbt, Aurora, and Metabase. With prior roles at AILY Labs and Ironhack, Alejandro has built a strong foundation in web and marketing analytics, shifting into modern data stack tools to deliver robust ELT workflows, visualisations, and governance systems. He also writes The Pipe & The Line, where he reflects on modern data engineering and the evolving stack. We’re thrilled to feature their unique insights on Modern Data 101!
We actively collaborate with data experts to bring the best resources to a 10,000+ strong community of data leaders and practitioners. If you have something to share, reach out!
🫴🏻 Share your ideas and work: community@moderndata101.com
*Note: Opinions expressed in contributions are not our own and are only curated by us for broader access and discussion. All submissions are vetted for quality & relevance. We keep it information-first and do not support any promotions, paid or otherwise!
TOC
Agentic (Data) Roles: Where Are They Going?
Agentic Horizons for Data Teams
Pipeline Implementation, Debugging & Monitoring
Documentation & Governance
Data-Model Design from Business Requirements
Self-Served Agentic Analytics
Do We Need To Learn For Agents To Learn?
Agentic Headaches (a.k.a Challenges)
Final words
There's a lot of hype around AI, especially in web app prototyping, but what about our beloved data world?
You open LinkedIn and see the usual posts:
BREAKING: OpenAI releases new prompting guides
LATEST: Anthropic/DeepSeek/Google launches the greatest model ever
“I created this 892-step n8n workflow to read all my emails. Comment on this post so you can ignore yours too!”
You get the point: AI is everywhere, but I don't think we’re fully grasping where it's heading. We're automating both content creation and consumption. We're generating LinkedIn posts with AI and summarizing them using AI because there's simply too much content to process.
This trend isn't limited to creative writing, research, or coding. It’s transforming how data professionals interact with stakeholders.
Agentic (Data) Roles: Where Are They Going?
Depending on your role, this shift will look different.
Stakeholders using agents means data professionals must now translate business context into inputs that agents can understand and execute.
Here are some of the scenarios I imagine:
Data Analysts
Translating business use cases.
Preparing RAG inputs.
Adapting stakeholder metric definitions.
Data Engineers
Setting secure database designs & agent tool scopes.
Setting up RAG Pipelines.
Crafting agent backends (memory, sessions, etc.).
Data Scientists
Experimenting with AI models.
Enabling fine-tuned models.
Observability & LLM evaluation.
These role shifts are already happening in some companies, and the shift WILL accelerate.
The key message is:
Good data professionals already bridge technical and business knowledge. Now, they must also understand AI-native communication (i.e., English for LLMs, prompt engineering).
We’re entering a new level of cross-functionality.
Agentic Horizons for Data Teams
I foresee the biggest impact in these areas. For each one, I'll list a use case and some actionable steps.
Pipeline Implementation, Debugging & Monitoring
Use case
An agent detects a bug, finds prior incidents in Slack, proposes a fix, and opens a pull request.
What you can do
Design feedback loops while monitoring pipelines. Hold post-mortems and document everything. This improves your RAG system’s memory and knowledge retrieval.
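To make the "document everything" point concrete, here is a minimal sketch of how an agent could look up prior incidents before proposing a fix. Everything in it is an assumption for illustration: the `PostMortem` fields, the sample archive, and the `find_similar` helper are hypothetical, and the word-overlap score is only a stand-in for the embedding-based retrieval a real RAG system would likely use.

```python
from dataclasses import dataclass


@dataclass
class PostMortem:
    # Hypothetical shape of a documented incident.
    title: str
    root_cause: str
    fix: str


def tokenize(text: str) -> set[str]:
    # Lowercase words longer than two characters, with punctuation stripped.
    return {w.strip(".,:;()").lower() for w in text.split() if len(w) > 2}


def find_similar(error_message: str, archive: list[PostMortem], top_k: int = 1) -> list[PostMortem]:
    """Rank post-mortems by word overlap (Jaccard) with the error message."""
    query = tokenize(error_message)

    def score(pm: PostMortem) -> float:
        doc = tokenize(pm.title + " " + pm.root_cause)
        union = query | doc
        return len(query & doc) / len(union) if union else 0.0

    ranked = sorted(archive, key=score, reverse=True)
    # Drop entries that share no words with the error at all.
    return [pm for pm in ranked[:top_k] if score(pm) > 0]


# Illustrative archive; in practice this would come from your post-mortem docs.
ARCHIVE = [
    PostMortem(
        title="Orders DAG failed on null customer_id",
        root_cause="Upstream API started sending null customer_id values",
        fix="Add not_null test and quarantine bad rows",
    ),
    PostMortem(
        title="Dashboard refresh timeout",
        root_cause="Warehouse was undersized during month-end load",
        fix="Schedule refresh after the nightly load",
    ),
]

matches = find_similar("Task load_orders failed: null customer_id in orders", ARCHIVE)
for pm in matches:
    print(f"Seen before: {pm.title} -> suggested fix: {pm.fix}")
```

The retrieval is only as good as the archive: if the post-mortem was never written, there is nothing for the agent to find.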
Documentation & Governance
Use case
The backend team scopes a feature. An agent searches Google Drive, finds an older failed attempt, and flags lessons learned to consider.
What you can do
Ensure documentation is clear, current, and non-contradictory. Use a custom GPT or a coded agent to auto-generate documentation in structured formats like Markdown or XML, which tend to produce higher-quality output.
Data-Model Design from Business Requirements
Use case
You need to model a new business definition. The agent retrieves contextual info and proposes an implementation strategy.
What you can do
Feed meeting notes, interviews, and documentation into agents via declarative assets (e.g., dbt YAML files). Strong data contracts will become essential for agent-driven environments.
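As a rough illustration of what a "data contract an agent can read" might look like, the sketch below declares a contract as a plain Python dictionary and checks rows against it. The contract layout, the `fct_active_customers` model name, and the `validate_row` helper are all hypothetical; this is not dbt's schema format, just the same declarative idea in a form that runs without extra dependencies.

```python
# Hypothetical contract: the business definition lives next to the column rules,
# so both humans and agents read the same source of truth.
CONTRACT = {
    "model": "fct_active_customers",
    "description": "Customers with at least one paid order in the last 30 days",
    "columns": {
        "customer_id": {"type": int, "nullable": False},
        "last_paid_order_at": {"type": str, "nullable": False},
        "segment": {"type": str, "nullable": True},
    },
}


def validate_row(row: dict, contract: dict) -> list[str]:
    """Return a list of contract violations for one row (empty list means valid)."""
    errors = []
    columns = contract["columns"]
    for name, spec in columns.items():
        if name not in row:
            errors.append(f"missing column: {name}")
            continue
        value = row[name]
        if value is None:
            if not spec["nullable"]:
                errors.append(f"null not allowed: {name}")
        elif not isinstance(value, spec["type"]):
            errors.append(f"wrong type for {name}: expected {spec['type'].__name__}")
    # Columns the contract does not know about are flagged too.
    for name in row:
        if name not in columns:
            errors.append(f"unexpected column: {name}")
    return errors


good = {"customer_id": 42, "last_paid_order_at": "2024-05-01", "segment": None}
bad = {"customer_id": None, "last_paid_order_at": "2024-05-01", "extra": 1}
print(validate_row(good, CONTRACT))
print(validate_row(bad, CONTRACT))
```

An agent proposing a new model can be asked to emit a contract like this first, which gives reviewers something small and checkable before any SQL is written.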
📝 Related Read(s)
Right-to-left data modelling
Self-Served Agentic Analytics
Use case
Stakeholders can’t identify the best dashboard or even understand it. An agent guides them, offering context and support.
What you can do
Equip agents with well-documented API access to tools like Tableau or Metabase. This avoids "dashboard graveyards."
Soon, you’ll hear job titles like PromptOps Specialist or VibeOps AI Expert. What’s remarkable is that these roles are not far off: they’re already here.
Do We Need To Learn For Agents To Learn?
Agents don’t replace human learning: they require it.
Agents need tools and instructions (just like humans do) until they learn to operate independently.
This means data professionals will shift from executors to orchestrators. Knowing what you're doing becomes even more critical.
We still need to master:
Best practices
Technical fundamentals
New tools for business performance
How to spot agent hallucinations
A good approach: implement Human-in-the-Loop workflows EARLY.
Explain and Confirm: Agent drafts; human approves.
Override: Agent acts unless a human vetoes.
Sandbox and Deploy: Agent tests changes in isolated branches.
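The three patterns above can be sketched as a single decision function. This is a simplified illustration, not a reference implementation: the `Mode` enum, the `review` function, and the returned action names are hypothetical, and the rule that a sandboxed change is promoted only after explicit approval is my own assumption.

```python
from enum import Enum
from typing import Optional


class Mode(Enum):
    EXPLAIN_AND_CONFIRM = "explain_and_confirm"  # agent drafts; human approves
    OVERRIDE = "override"                        # agent acts unless a human vetoes
    SANDBOX_AND_DEPLOY = "sandbox_and_deploy"    # agent tests in an isolated branch


def review(mode: Mode, human_decision: Optional[bool]) -> str:
    """Decide what happens to an agent's proposed change.

    human_decision: True = approved, False = rejected/vetoed, None = no response.
    """
    if mode is Mode.EXPLAIN_AND_CONFIRM:
        # Nothing is applied without an explicit approval.
        return "apply" if human_decision is True else "hold"
    if mode is Mode.OVERRIDE:
        # Silence counts as consent; only an explicit veto stops the agent.
        return "hold" if human_decision is False else "apply"
    # SANDBOX_AND_DEPLOY: the change always runs in isolation first.
    # Assumption: promotion out of the sandbox needs explicit approval.
    return "promote" if human_decision is True else "sandbox_only"


for mode in Mode:
    print(mode.value, "->", review(mode, human_decision=None))
```

The interesting design question is what silence means. In Explain and Confirm, no answer blocks the change; in Override, no answer lets it through, so that mode only fits actions that are cheap to undo.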
You need to have a high-level understanding of what agents are doing. Otherwise, you are designing agentic technical debt.
Agentic Headaches (a.k.a Challenges)
Everything comes with trade-offs and new challenges. Ignoring these issues can create serious problems:
Observability: Know what agents are doing and why, and use that input to iterate on versioned changes until the system behaves consistently.
Guardrails: Filter actions to prevent bad decisions, outputs or insecure LLM interactions.
Security Design: Give agents only the necessary tools, authentication scope and minimal access to do what they have to do.
Bad outputs are one thing. Worse is when no one knows how to trace them, add preventive mitigations around them, or even protect the integrity of the company’s systems if the agents fail.
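Here is a small sketch of how the three concerns can meet in one place: a tool wrapper that only exposes allowed tools (security design), rejects non-read queries (guardrail), and records every attempt (observability). The `ScopedToolbox` class, the tool functions, and the agent name are hypothetical, and the `SELECT` prefix check is deliberately naive; a real guardrail would need proper SQL parsing and database-level permissions.

```python
from typing import Callable


class ScopedToolbox:
    """Expose only an allowlisted subset of tools and log every call attempt."""

    def __init__(self, tools: dict[str, Callable[..., object]], allowed: set[str]):
        self._tools = tools
        self._allowed = allowed
        self.audit_log: list[dict] = []

    def call(self, agent: str, tool: str, **kwargs):
        permitted = tool in self._allowed and tool in self._tools
        # Observability: record the attempt whether or not it is permitted.
        self.audit_log.append(
            {"agent": agent, "tool": tool, "args": kwargs, "permitted": permitted}
        )
        if not permitted:
            raise PermissionError(f"{agent} is not allowed to call {tool}")
        return self._tools[tool](**kwargs)


def run_select(query: str) -> str:
    # Guardrail (naive, for illustration only): allow read-only statements.
    if not query.strip().lower().startswith("select"):
        raise ValueError("only SELECT statements are allowed")
    return f"executed: {query}"


def drop_table(name: str) -> str:
    return f"dropped {name}"


# Security design: the analyst agent gets the read tool and nothing else.
toolbox = ScopedToolbox(
    tools={"run_select": run_select, "drop_table": drop_table},
    allowed={"run_select"},
)

print(toolbox.call("analyst_agent", "run_select", query="SELECT 1"))
try:
    toolbox.call("analyst_agent", "drop_table", name="orders")
except PermissionError as err:
    print(err)
print(toolbox.audit_log)
```

Because denied calls are logged too, the audit trail shows what the agent tried to do, not just what it did.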
Final Words
All these topics come with a steep learning and adoption curve that can take many companies years to climb. Eventually, the hype will fade and agents will become the norm. By then, best practices will be clearer.
The bottom line: Whether you're an analyst, engineer, or scientist, you must observe how agents behave and keep refining them.
Don’t wait for a stakeholder complaint to act. Be agentic (see what I did there).
Value remains the most critical outcome. Its quality will rely on how we translate business intent into agent-ready instructions.
One last thing: knowing when NOT to use AI will bring even more value to businesses.
TL;DR
Agentic (Data) Roles: Where Are They Going?
Roles will need to be the best bridge between business, tech and AI systems.
Agentic Horizons for Data Teams
Documentation, business context and relevant formatted input will be critical.
Do We Need To Learn For Agents To Learn?
Someone has to build the agentic architecture that makes sense. Learning best practices is more critical than ever.
Agentic Headaches (a.k.a Challenges)
Observability, Guardrails and Secure Design are critical topics to cover actively in any company working with LLMs.
Final words
Use AI in your Data Stack wisely, or don’t use it.
Author Connect
Find me on LinkedIn 🙌🏻
From MD101 🧡
The Modern Data Survey Report: By the Community, for the Community
Locating specific time sinkholes
Uncovering recurrent and unnecessary tooling gaps
Projecting the Desired Data Stack in Demand
And much more!