High ROI AI

The 3 Data Science Lifecycles and Workflows

The Data, Research, and Model Development Lifecycles. This is a living document that I will update regularly as the field moves forward.

Vin Vashishta
Jan 05, 2022

This is a complete workflow description of the 3 main Data Science Lifecycles. Every business will have a customized implementation based on its needs, capabilities, and maturity. The business’s Data Science Value Stream will dictate which parts of the workflow need to be implemented.

Job descriptions should be created based on these workflows. At the highest level, the Data Lifecycle is managed by Data Engineers and Data Librarians. The Research Lifecycle is managed by Researchers and Applied ML Researchers. The Model Development Lifecycle is managed by ML Engineers and MLOps Engineers.

Individual elements of these workflows eventually build out their own roles. In early stages, workflows can be managed completely by the basic roles. Quality and Testing, Product and Requirements, Research Oversight and Review, Platform Architecture, and other roles must be added when workflow steps require specialized capabilities and/or when the level of effort becomes unsustainable for a cross-functional role.

Typically, the business implements the Data Lifecycle, then the Model Development Lifecycle, and finally the Research Lifecycle as Machine Learning Maturity increases.

Each lifecycle's workflow has basic and advanced elements. The advanced elements get filled in as business needs advance.

Each step adds complexity and cost, and both must be justified by returns. Bottom line: don’t implement a complex workflow unless it provides obvious value.
