The Pragmatic Data Scientist: Insights into Machine Learning and Data Science in the industry.
In the first post “Artificial Intelligence is a moving target”, we will discuss how the curiously loose definition of what constitutes an AI system has shaped the industry.
In the coming weeks, we will publish four subsequent posts with the titles:
- “Engineering is the bottleneck”,
- “Deduction is preferable to induction”,
- “Manipulate the problem-model-data-metric equation”, and
- “Engineering means making the right trade-offs”.
Who are these posts for?
The intended audiences for these posts are managers and data scientists*. However, other stakeholders who work with data science teams could also benefit from internalizing these insights.
* We use the term “data scientist” to refer to “machine learning practitioners” because this is the terminology that we believe is most commonly used in the industry.
Each of the insights will:
- Express an opinion that challenges some of the assumptions set by the academic community.
- Focus on commercially applicable data science, as opposed to academic exercises or Kaggle competitions.
- Help explain interesting parts of the field, such as how to become a better data scientist, or how to solve a concrete problem.
- Be pragmatic. Worse is better, because perfection is the enemy of good.
Our inspiration for these posts
Published in 1999, “The Pragmatic Programmer” takes an opinionated view on software engineering presented as a list of insights. We have benefited tremendously from the guidance provided by the book in our engineering work.
Our posts are an attempt to present such opinionated insights into the field of data science. Note that these posts are written from a computer scientist/hacker perspective rather than that of a mathematician or statistician. We encourage you to stay critical, to reflect on your own principles and insights, and to help us revise these posts. We will update these posts continuously as we gain new insights from your thoughts and comments.
The 5 insights
In the first insight “Artificial Intelligence is a moving target”, we take a look at how the AI industry is affected by the lack of a clear-cut definition of what AI is. We argue that because the definition of what constitutes an AI system is so loose, marketers rather than engineers have been the ones to decide what is branded “AI”.
In the second insight “Engineering is the bottleneck” we argue that the vast majority of companies need AI engineers, not AI scientists. This is because most companies are better off developing pragmatic solutions based on existing research than conducting their own research. We also discuss solid engineering principles that can make or break industrial AI systems.
In the third insight “Deduction is preferable to induction” we argue for the importance of creating simple solutions before bringing out the big guns. Simple solutions have a number of desirable properties, such as providing a good baseline and being easy to maintain. We argue that data scientists should favor a top-down approach over a bottom-up one, and strive to develop a data-independent solution when possible.
In the fourth insight “Manipulate the problem-model-data-metric equation” we go into detail about what distinguishes pragmatic data science in the industry from academic and competitive data science. We argue that because industrial engineers can change more of the variables of the equation, they must master a broad range of skills, from data collection techniques to stakeholder management.
In the fifth insight “Engineering means making the right trade-offs” we argue that pragmatic engineers must strike a balance between the competing desirable properties of any solution. We explain how the end goals of industrial data science teams differ from those of academic and competitive data science teams.
It is our hope that these insights will give you a broad understanding of the challenges in the field of AI, and make you reflect on how to develop pragmatic data science solutions.
Let’s take a look at the first insight: “Artificial Intelligence is a moving target”.