Modeling Semantics: How Data Models and Ontologies Connect to Build Your Semantic Foundations
Ontology vs. Conceptual Data Modeling, Implementation of Semantic Foundations, and Using Agents in the Right Places for Enrichment
About Our Contributing Expert
Juha Korpela | Consultant, Enterprise Data Management
This piece is a community contribution from Juha Korpela, Independent Consultant and Founder of Helsinki Data Week, a community-first data conference. With deep expertise in information architecture, data products, and modern operating models, Juha has spent his career helping organisations truly understand what their data means and how to use that semantic clarity to build better systems.
Formerly Chief Product Officer at Ellie Technologies and now the voice behind the “Common Sense Data” Substack, Juha is also a speaker, trainer, and advisor shaping the resurgence of conceptual modeling in the industry. We’re thrilled to feature his unique insights on Modern Data 101!
We actively collaborate with data experts to bring the best resources to a 15,000+ strong community of data leaders and practitioners. If you have something to share, reach out!
🫴🏻 Share your ideas and work: community@moderndata101.com
*Note: Opinions expressed in contributions are not our own and are only curated by us for broader access and discussion. All submissions are vetted for quality & relevance. We keep it information-first and do not support any promotions, paid or otherwise!
Let’s Dive In!
Knowledge Management Provides Context for AI
Knowledge Management and Information Architecture have had a rocket ride to the top of the data world’s consciousness due to Generative AI. The ability to organize, store, and serve structured semantics as context to various agents and chatbots is widely recognized as a winning ingredient in the GenAI race, reducing hallucinations and improving accuracy.
Terms like taxonomies, ontologies, and knowledge graphs are being thrown around as if they had just been invented, but veterans of the trade know better: there’s nothing new under the sun.
Knowledge Management and the Library Sciences, from which these subjects were born, are well-known disciplines, and the theory behind concepts like the Semantic Web is solid. It’s merely the utilization of these that has now changed with GenAI.
Data Modeling Foundations Return
But when it comes to organizing, storing, and serving semantics, there have always been two schools of thought, usually with very little cross-pollination between them. One is the world of ontologies and knowledge graphs; the other comes from the data modeling world.
Traditionally, data modeling has had different levels of abstraction to cover different needs at different levels of detail. Conceptual, Logical, and Physical modeling has been a well-recognized three-level layout for data modeling activities (you can check my views on these three levels on my Substack).
But sadly, at some point in the Big Data craze of yesteryear, many data experts reduced data modeling to the Physical level only, focusing almost exclusively on the technical structures of data storage.
Where Semantics Was Compromised
By forgoing Conceptual modeling to a large extent, data experts let go of a very practical method for doing exactly what is now required from taxonomies, ontologies, and knowledge graphs: describing structured semantics.
At the core of both ontologies and conceptual data models are things: real-life entities that exist in the real business, irrespective of the systems we have built. You might call these things “entities” or “objects” or “nodes” or whatever you like, but they are what you need to understand in order to describe (to a human or an agent) what goes on in your business.
Think of “Customer”, “Order”, “Product”, “Delivery”, and so on. These are what you have data about, no matter how the data is technically stored in database tables or files.
In addition to the list of things, to fully understand the business context, we need relationships between the things. How do the things in our business interact with each other? Think “Customer makes an Order”, “Product is added to Delivery”, and so on.
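As a minimal sketch, the things and relationships above can be written down as plain data. The class names, field names, and definitions below are illustrative assumptions, not part of any standard or of the author's own method:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Thing:
    """A real-life business entity, independent of any system."""
    name: str
    definition: str


@dataclass(frozen=True)
class Relationship:
    """A named interaction between two things."""
    source: str
    verb: str
    target: str

    def as_sentence(self) -> str:
        return f"{self.source} {self.verb} {self.target}"


# Placeholder definitions; real ones would come from business experts.
things = [
    Thing("Customer", "A party that buys from us."),
    Thing("Order", "A request to purchase products."),
    Thing("Product", "Something we sell."),
    Thing("Delivery", "A shipment of products."),
]

relationships = [
    Relationship("Customer", "makes", "Order"),
    Relationship("Product", "is added to", "Delivery"),
]


def undefined_things(things, relationships):
    """Return names used in relationships that have no Thing definition."""
    known = {t.name for t in things}
    used = {r.source for r in relationships} | {r.target for r in relationships}
    return sorted(used - known)


sentences = [r.as_sentence() for r in relationships]
```

The `undefined_things` check hints at why the list of things and the list of relationships belong together: a relationship that mentions something nobody has defined is a gap in the business context.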
Ontology vs. Conceptual Model
An ontology is, in simple terms, a list of things (and their definitions) with a list of the relationships between them. In the Knowledge Management world, this would be formalized according to, say, RDF standards.
A conceptual model is, in simple terms, also a list of things and their relationships. Data modelers traditionally produce an Entity-Relationship Diagram out of it, with a list of entity definitions (a Glossary) attached.
Now here’s the important thing to understand, regardless of which world you are coming from:
the semantical content you capture with both approaches is exactly the same!
Merely the method of capturing, organizing, and storing that information is different.
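To make this concrete, here is a rough sketch that takes a tiny glossary plus relationship list (the data modeler's view) and expresses it as RDF Schema style triples (the knowledge management view), then recovers the original from the triples. The `ex:` prefix, the function names, and the definitions are assumptions made for illustration. The sketch is also a simplification: it uses prefixed names instead of full IRIs, assumes each verb links exactly one pair of things, and leaves out the formal axioms a full ontology may carry.

```python
# Conceptual model as a data modeler might hold it: glossary plus relationships.
glossary = {
    "Customer": "A party that buys from us.",       # placeholder definition
    "Order": "A request to purchase products.",     # placeholder definition
}
relationships = [("Customer", "makes", "Order")]

PREFIX = "ex:"  # hypothetical namespace prefix for our business terms


def to_triples(glossary, relationships):
    """Express the model as (subject, predicate, object) triples."""
    triples = set()
    for name, definition in glossary.items():
        triples.add((PREFIX + name, "rdf:type", "rdfs:Class"))
        triples.add((PREFIX + name, "rdfs:comment", definition))
    for source, verb, target in relationships:
        triples.add((PREFIX + verb, "rdf:type", "rdf:Property"))
        triples.add((PREFIX + verb, "rdfs:domain", PREFIX + source))
        triples.add((PREFIX + verb, "rdfs:range", PREFIX + target))
    return triples


def local_name(term):
    """Drop the namespace prefix from a term such as 'ex:Customer'."""
    return term[len(PREFIX):]


def from_triples(triples):
    """Recover the glossary and relationships from the triples."""
    classes = {s for s, p, o in triples if p == "rdf:type" and o == "rdfs:Class"}
    glossary = {
        local_name(s): o
        for s, p, o in triples
        if p == "rdfs:comment" and s in classes
    }
    domains = {s: o for s, p, o in triples if p == "rdfs:domain"}
    ranges = {s: o for s, p, o in triples if p == "rdfs:range"}
    relationships = sorted(
        (local_name(domains[v]), local_name(v), local_name(ranges[v]))
        for v in domains
    )
    return glossary, relationships


triples = to_triples(glossary, relationships)
round_trip = from_triples(triples)
```

Under these assumptions nothing is lost in either direction: `round_trip` holds the same glossary and relationships that went in.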
For me personally, the method of conceptual modeling feels natural, as I’ve done data modeling for around 15 years now. I know what questions I need to ask people (or what documents to read) to capture information about the entities and their relationships, I know how to draw the diagram, I know how to create the glossary, and I know what tools I can use to help.
For someone coming from a semantic web background, building formalized ontologies according to the RDF standard feels natural, with all the methods and tools that come with it.
We’re both still working on semantics: in effect, we’re capturing the exact same ontology, thus storing information about business context to be used later.
Technical Implementation of Semantics
For us data modelers, the utilization of these models has traditionally focused on the technical implementation of data solutions, and we’ve thus followed a path from Conceptual to Logical to Physical. That is, if we have done conceptual modeling at all!
But especially now, in the age of context-hungry AI, we have to realize we’ve been sitting on a semantics gold mine: conceptual data modeling is an excellent method for figuring out what the entities and relationships should be.
Why is this important? Because the most valuable semantical information is that which is unique to the organization, and those semantics are the hardest to capture.
While AI tools can be used to find semantical concepts from unstructured data and various knowledge bases, a lot of this information is tacit knowledge in the business experts’ heads. Conceptual modeling is a known-good method for getting that tacit knowledge out.
Data Modeling as Semantic Discovery
I envision a world where we build the semantic foundation of an organization with a set of tools at our disposal:

We use industry standards and existing knowledge bases to cover the basic structures that are common to most organizations within an industry
We use conceptual modeling methods as a surgical knife to cut through tacit knowledge and unearth & document the valuable, unique semantics of the organization
We use AI agents as “semantic helpers” to trawl through tons of documentation and find details to add around the strong core that has been formed
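One way to picture these three sources working together is as layers merged into a single glossary, with each entry remembering where it came from. This is only a sketch under stated assumptions: the precedence order (conceptual model first, then industry standard, then agent findings), the function name `merge_layers`, and all of the definitions are hypothetical choices made for illustration.

```python
def merge_layers(*layers):
    """Merge (source, definitions) layers; earlier layers take precedence.

    Each entry records its source, so agent suggestions can be reviewed
    by a human before being trusted.
    """
    foundation = {}
    for source, definitions in layers:
        for name, definition in definitions.items():
            # Keep the first definition seen; later layers only fill gaps.
            foundation.setdefault(
                name, {"definition": definition, "source": source}
            )
    return foundation


# All content below is made-up illustration.
conceptual_model = {
    "Customer": "A household with an active service contract.",
}
industry_standard = {
    "Customer": "A party that purchases goods or services.",
    "Product": "An item offered for sale.",
}
agent_findings = {
    "Product": "SKU-level sellable item.",
    "Return": "Goods sent back by a customer.",
}

foundation = merge_layers(
    ("conceptual_model", conceptual_model),    # unique, expert-validated core
    ("industry_standard", industry_standard),  # common baseline
    ("agent", agent_findings),                 # enrichment around the core
)

# Entries that only an agent proposed are candidates for human review.
needs_review = sorted(
    name for name, entry in foundation.items() if entry["source"] == "agent"
)
```

Here the organization's own definition of “Customer” wins over the generic one, the standard supplies “Product”, and the agent's new “Return” concept is flagged for review rather than silently accepted.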
This semantic foundation will then act as the context provider for all your agents and chatbots, but also for humans! Context is king in today’s world. By looking at data modeling not only as a technical design method but also as a semantic discovery method, we gain a powerful tool for building this context.
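As a rough illustration of what “context provider” could mean in practice, the sketch below picks the glossary entries and relationships that a question touches and returns them as text that could be placed in front of an agent or chatbot. The function name `build_context`, the naive substring matching, and the definitions are all assumptions; a real system would likely use a more robust way of finding relevant concepts.

```python
def build_context(question, glossary, relationships):
    """Return glossary entries and relationships relevant to the question.

    Relevance here is naive, case-insensitive substring matching on
    entity names.
    """
    lowered = question.lower()
    mentioned = [name for name in glossary if name.lower() in lowered]
    lines = [f"{name}: {glossary[name]}" for name in mentioned]
    for source, verb, target in relationships:
        if source in mentioned or target in mentioned:
            lines.append(f"{source} {verb} {target}")
    return "\n".join(lines)


# Placeholder glossary and relationships for illustration only.
glossary = {
    "Customer": "A party that buys from us.",
    "Order": "A request to purchase products.",
    "Product": "Something we sell.",
    "Delivery": "A shipment of products.",
}
relationships = [
    ("Customer", "makes", "Order"),
    ("Product", "is added to", "Delivery"),
]

context = build_context(
    "How many orders did each customer make?", glossary, relationships
)
```

The same text is just as readable to a new colleague as it is to an agent, which is the point of treating the foundation as context for humans too.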

Hi Juha,
You state "An ontology is, in simple terms, a list of things (and their definitions) with a list of the relationships between them. In the Knowledge Management world, this would be formalized according to, say, RDF standards."
This is not correct and I think it is worth clarifying.
An ontology is a formal, logical reasoning model that defines concepts as classes, properties, attributes and relations. Ontologies support interoperability and machine readability according to standards.
Dr. Achim Reiz:
https://www.linkedin.com/posts/reiz_webinar-rag-reasoning-activity-7328062684124385281-OcuQ?utm_source=social_share_send&utm_medium=android_app&rcm=ACoAABQYawkBEHMgPby3nKk5ocfmxJ4QA-WzucE&utm_campaign=copy_link&lipi=urn%3Ali%3Apage%3Ad_flagship3_messaging_conversation_detail%3BsgcLDbzkT16LK7ywunFx1g%3D%3D
Totally agree (in fact I argued for basically the same thing in my own recent post on the subject, great minds etc etc 🤣)
Glad to see some deeper thinking on how semantics can play a very important role now and in the future for AI leverage.