Generic Data Science Is Unsustainable. To Save Our Field, We Must Change The Way We Teach It.
Students are taught generic modeling techniques, math, and software engineering. We need to be teaching research and hard science. Why?
Data Science must create a competitive advantage for businesses. It’s too expensive not to. The way we teach Data Science now focuses on the wrong capabilities to support that goal.
The Data Science we teach to students is indistinguishable from advanced analytics. Deep learning without research to support it results in expensive descriptive models. There is no advantage to be gained. In many cases, being first to market with a new Machine Learning-based product is a disadvantage.
First-to-market companies spend more than fast followers, those who wait to see what works and then quickly bring copies to market. The first-to-market company creates and proves out the customer demand. It goes through the product improvement iterations needed to reach fit and finish. It can take a year before the product is successful.
The cost of initial R&D plus improvements plus marketing is seen as worth it because it creates a long-term opportunity. What really happens is that fast followers see the opportunity for new revenue and enter the product space. Generic models built on generic data create products that are easy to copy.
It takes a lot less time to bring substitutes to market. The product is already well understood. Competitors have an existing customer base they can work with to discover unmet needs. They can bring a product to market with one or two improvements and take significant market share away from the business that got to market first.
There is no barrier to entry. Patents on Machine Learning models are easily circumvented and difficult to enforce. In most cases, it is cheaper to wait for others to prove out the market than to be first in.
From an ROI standpoint, generic Data Science is unsustainable.