High ROI Data Science

Redefining The Data Catalog. Vocabularies, Taxonomies, And Ontologies

Vin Vashishta
Jul 29, 2023

The blueprint for modern data engineering is Joe Reis and Matt Housley’s book Fundamentals of Data Engineering. One of their core tenets is building a forward-looking data architecture. It’s not enough to build for today’s needs. Decisions made today must take advantage of the technology waves to come and result in infrastructure that supports changing business needs.

The authors’ primary focus was architecture. I’m explaining a different level of architecture, one that manages knowledge rather than the storage and movement of data. Most of this framework is architecture- and tool-agnostic. There are parallels to mesh, fabric, graph DB, and other concepts, but this isn’t a pitch for any architecture, stack, or platform.

I’m diving into the implementation side of the series. This article explains the framework for analytics- and data science-centric data management. Let’s begin by defining a top-level vocabulary for the contents of a data catalog. I’ll use that as a starting point to explain how to transition from data catalogs and dictionaries to the new framework.
