Are you a data engineer wondering how AI fits into your day-to-day work?
With everything happening in AI right now, it’s easy to feel like you’re either falling behind or unsure where to even start.
In this live session, I sat down with Alejandro Aboy, a Senior Data and AI Engineer, to talk about:
How data engineers are really using AI today
What has changed in the role
How you can get started, practically
and more…
If you’ve been asking yourself, “Where do I fit in this AI shift?”, this session is for you.
About Alejandro:
He is a Senior Data Engineer at Workpath and the founder of The Pipe & The Line. He writes hands-on tutorials on Modern Data Engineering and AI.
To access my previous live sessions, you can check out the series (https://pipeline2insights.substack.com/t/career-transition), where we discuss topics like transitioning from Data Analyst or Application Support roles into Data Engineering, and more.
Also, if you want to learn how to think like a senior engineer, I highly recommend checking out my live session with Yordan Ivanov.
Here’s a breakdown of the key ideas and practical insights we covered in this session:
AI engineering is inseparable from data engineering
Alejandro defines AI engineering in a simple, broad way: it’s “putting systems together for all the context for AI to work.” That means agents, MCP integrations, multi-agent workflows, orchestration, all of it. But here’s the catch:
"You cannot really do good AI engineering without the data engineering part."
The idea that AI engineering is a standalone discipline you can pick up in a bootcamp is, in Alejandro's view, misleading. Without understanding how to build and shape data systems, AI use cases won't hold up. Today, roughly 80–90% of his work touches AI, but he can't cleanly separate it from data engineering, because every AI use case pulls him back to the data layer.
The skills that transfer, and why data modelling is the big one
When asked which data engineering skills are most important for AI work, Alejandro was clear: data modelling accounts for around 80% of the impact.
The reason is conceptual. When you build a data model, you’re doing careful, long-term thinking: handling edge cases, encoding business logic, and making decisions about how information should be structured and described. That process maps almost directly onto what AI practitioners call context engineering.
“Context engineering is a cool way of saying data modelling.”
If your dbt project has well-written column and model descriptions, accurate data contracts, and clear use-case documentation, then AI will perform well against it. Skip that work, and you’re building on sand.
Orchestration skills transfer too. Data engineers naturally think in terms of inputs, outputs, dependencies, and sequencing: exactly the mental model needed to design AI workflows. Most AI use cases, Alejandro notes, don’t require agents at all; a well-designed workflow handles them fine.
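To make that mental model concrete, here is a minimal sketch of an agent-free AI workflow: fixed steps with explicit inputs, outputs, and dependencies, exactly like a small data pipeline. The step names and the classification rule are purely illustrative (a real version would put an LLM call behind `classify`); this is not from the session itself.

```python
# Minimal sketch of an AI "workflow" (no agents): fixed steps with
# explicit inputs, outputs, and dependencies -- the same mental model
# as a data pipeline. Step names and logic are illustrative only.

def extract_tickets(raw: list[str]) -> list[str]:
    # Input boundary: normalize incoming items, drop empties.
    return [t.strip() for t in raw if t.strip()]

def classify(ticket: str) -> str:
    # Stand-in for a single LLM call with a fixed prompt;
    # a keyword rule keeps the example self-contained.
    return "billing" if "invoice" in ticket.lower() else "other"

def summarize(by_label: dict[str, list[str]]) -> dict[str, int]:
    # Output boundary: aggregate results for downstream consumers.
    return {label: len(items) for label, items in by_label.items()}

def run_workflow(raw: list[str]) -> dict[str, int]:
    # Dependencies are explicit and sequential:
    # extract -> classify -> summarize.
    tickets = extract_tickets(raw)
    by_label: dict[str, list[str]] = {}
    for t in tickets:
        by_label.setdefault(classify(t), []).append(t)
    return summarize(by_label)

print(run_workflow(["Invoice is wrong", "App crashes", "  "]))
# -> {'billing': 1, 'other': 1}
```

The point is the shape, not the logic: because every step has a declared input and output, you can test, retry, and reorder steps the same way you would tasks in a DAG.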
Note: Alejandro and I recently shared a deep dive on agentic data modelling using OpenMetadata and MCP, where we show how metadata, lineage, and context come together to help AI understand impact across the entire data stack.
How he actually works with AI day-to-day
Alejandro’s primary tool is Claude Code, but he uses it well beyond writing code. He’s built MCP connections to most of the services he works with (databases, data catalogues, and orchestration tools), so he can interact with the full system without leaving his environment.
A lot of his time goes into what he calls “working on the machine”: iterating on rule files, refining reusable skills and commands, and improving prompts that run repeatedly. He reads every code change carefully, and his most common instruction to AI is: cut this, you don’t need it. Overgeneration is a consistent pattern, and accepting the first output is, in his words, exactly what makes people less relevant over time.
Here are a couple of great posts Alejandro wrote on Claude Code that I found really helpful:
If you’re thinking about how to incorporate AI into your work but don’t know where to begin, these are practical, high-impact starting points we discussed:
Find out what AI use cases your company is working on, and map what data you control that connects to them. Don’t wait to be asked.
Start writing column and model descriptions as if they’re prompts: precise, use-case oriented, and useful to a system that can’t ask for clarification.
Learn RAG. Alejandro calls it the ETL of AI: the most direct one-to-one equivalent for data engineers entering AI work.
Set up MCPs for the tools you use most: a database, a data catalogue, and a transformation tool. Build the connective tissue around your existing stack.
Don’t settle for the first output. Read the code, challenge it, ask AI to trim it. That critical layer is where your expertise actually lives.
The challenge AI can’t solve yet
One of the most candid moments in our conversation came when Alejandro described working on a debugging agent for Airflow. The agent correctly identified an error caused by a Pandas/Polars migration, but immediately suggested removing Pandas entirely, ignoring both the comment in the codebase explaining why the migration was paused and the broader production risk of touching Docker dependencies.
The agent didn’t lack information. It lacked instinct, the kind of common-sense judgment that tells an experienced engineer when not to act. That gap, Alejandro says, is one of the deepest challenges in AI engineering right now, and it’s precisely the gap that experienced data engineers are positioned to close.
“You cannot teach AI to have instinct right now!”
Using AI when requirements are a mess
One of the most practical questions from our audience was about the “real world” of data engineering: How do you use AI when requirements are ad-hoc, unstructured, or just plain confusing?
Alejandro replied:
“I’ve never seen a situation where requirements were well-structured and specific. You use AI to challenge whatever you got as input until you get enough substance to go back to the stakeholder.”
In other words, he doesn’t wait for perfect requirements to start using AI; he uses AI to build the requirements.
The Stress-Test: He feeds messy stakeholder requests into AI to find the “logic gaps” and edge cases he might be missing. It’s an iterative process of challenging the input until there’s enough substance to actually build.
Code-Base Inspection: Instead of guessing if a vague request is possible, he uses parallel agents to inspect his existing dbt models and Airflow DAGs at scale. This allows him to see exactly what a “small change” might break before he even starts coding.
The Warning for Senior Engineers
The real risk isn’t juniors being replaced. It’s seniors losing their edge if they stop thinking critically. The experience you’ve built over the years, especially knowing when AI is wrong, is valuable, and you need to keep using it.
Thank you
Thank you to everyone who joined the live session and shared thoughtful questions. I hope we were able to provide useful insights.
If you have any more questions or thoughts, feel free to drop them in the comments. I’ll do my best to respond.
Looking forward to seeing you in the next session.
Erfan