In part one of this blog post, I began trying to apply some of the wisdom of E. W. Dijkstra to the field of Machine Learning. We’ll continue with that discussion here and offer a few concluding thoughts.
Avoid involvement in projects so vague that their failure could remain invisible — such involvement tends to corrupt one’s scientific integrity.
As professionals trying to draw conclusions from data, we are scientists in a very important and concrete sense. Thus, we ought to bear, at least to some degree, the mantle foisted upon us by that title. This means in part that we have a responsibility to be skeptical of our own successes.
A big part of this skepticism in Machine Learning comes from one’s choice of evaluation metric. A few years ago, I wrote a paper about evaluation metrics for binary classification that showed empirically that you could often make one solution look better than another just by a clever choice of metric, even when all of your choices were well-accepted metrics.
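To make the point concrete, here is a toy sketch (hypothetical data, not drawn from that paper) of how metric choice can flip a comparison: on an imbalanced problem, a classifier that always predicts the majority class wins on accuracy, while a classifier that actually finds the positives wins on F1.

```python
# Hypothetical illustration: two classifiers on an imbalanced binary problem.
# Model A predicts the majority class every time; Model B catches both true
# positives at the cost of a few false alarms.
def accuracy(y_true, y_pred):
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def f1(y_true, y_pred):
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    if tp == 0:
        return 0.0
    precision, recall = tp / (tp + fp), tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

y_true  = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]  # 8 negatives, 2 positives
model_a = [0] * 10                         # always predict negative
model_b = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]  # both positives found, 3 false alarms

print(accuracy(y_true, model_a), f1(y_true, model_a))  # 0.8, 0.0
print(accuracy(y_true, model_b), f1(y_true, model_b))  # 0.7, ~0.57
```

Report accuracy and Model A looks better; report F1 and Model B does. Neither metric is "wrong" in the abstract, which is exactly why the choice must be justified by the deployment context rather than by whichever number flatters the result.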
Of course, as I mentioned above, the metric you choose for success should depend greatly on the context in which the model is to be deployed. What happens if you ignore that context? It's usually not hard to make a project look somewhat successful even when you know the results aren't robust enough to be deployed into a production environment. This is the definition of an invisible failure.
A great perspective on metric choice comes from Dr. Kiri Wagstaff, who argues that Machine Learning scientists would be better served by measuring success not by accuracy or AUC, but by real-world criteria like the number of people better served and lives improved. The opposite of "invisible failure" is "conspicuous success," and this should be our goal. Unless we choose projects and metrics that allow for that goal, we'll never achieve it.
Write as if your work is going to be studied by a thousand people.
I think Dijkstra meant this as an admonishment to maintain both pride in and quality of one’s work even in the absence of immediate notoriety, but I’m going to take this comment in a slightly different direction.
It’s easy to get caught up with the idea of supervised Machine Learning and process automation as the only goals of working with data. While those goals are very nice, sometimes the most powerful and transformative things that come out of the data are stories. I’ve interviewed for data engineering-type roles at many different businesses, including a Fortune 500 company. The department in which I interviewed was focused entirely on telling stories from the data: “The highest value users for the website have this behavior pattern. Here are some efforts to get more users to follow that pattern. How successful were those efforts? Why?” There might not be a predictive model anywhere in that list of steps, but those are stories that can only be told by data, and that give information that might dramatically change efforts throughout the company.
On a lighter note, I was once working in a situation where we needed to test an implementation of topic modeling. To do this, we went out behind the building to the company’s recycling dumpster, pulled out a few thousand documents, did OCR on them, and fed them to our topic modeling software, our idea being that this was a great example of a messy and diverse document collection. Among the topics found in the documents was one containing the words “position,” “salary,” “full-time,” and “contract.” It turned out that many of the documents were job postings printed out by employees who were “testing the waters,” so to speak; an interesting comment on employee morale at that moment.
Many datasets have a story to tell, if only we will listen. Part of a data engineer’s job is to write that story, and if done right it might indeed be read by a thousand people.
Raise your standards as high as you can live with, avoid wasting your time on routine problems, and always try to work as closely as possible at the boundary of your abilities. Do this because it is the only way of discovering how that boundary should be moved forward.
Readers of this blog post are just as likely as anyone to fall victim to the classic maxim, “When all you have is a hammer, everything is a nail.” I remember a job interview where my interrogator appeared uninterested in talking further after I wasn’t able to solve a certain optimization problem using Lagrange multipliers. The mindset isn’t uncommon: “I have my toolbox. It’s worked in the past, so everything else must be irrelevant.”
Even more broadly, Machine Learning practitioners tend to obsess over the details of model choice and parameter fine tuning, when the conventional wisdom in the field is that vastly more improvement comes from feature engineering than from fiddling with your model. Why the disconnect? Part of it is that feature engineering is usually comparatively boring and non-technical. Another part is that it’s hard, requiring the modeler to explore knowledge outside of their field and (heaven forbid) talk to people.
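A small synthetic sketch of why feature engineering pays off so disproportionately: for points labeled by whether they fall inside the unit circle, no threshold on the raw coordinates alone separates the classes, no matter how carefully you tune it, while a single engineered feature, r² = x² + y², separates them perfectly. (The data and the brute-force threshold search here are illustrative assumptions, not from any particular project.)

```python
# Synthetic illustration: points inside the unit circle are labeled 1.
# We brute-force the best single-threshold rule on each raw coordinate,
# then on the engineered feature r^2 = x^2 + y^2.
points = [(0.0, 0.0, 1), (0.5, 0.0, 1), (0.0, 0.5, 1), (-0.5, -0.5, 1),
          (2.0, 0.0, 0), (0.0, 2.0, 0), (-2.0, 1.0, 0), (1.5, 1.5, 0)]

def best_threshold_accuracy(values, labels):
    """Best accuracy of any rule 'value < t' or 'value >= t' over candidate t."""
    candidates = sorted(set(values)) + [max(values) + 1]
    best = 0.0
    for t in candidates:
        for rule in (lambda v: v < t, lambda v: v >= t):
            acc = sum(rule(v) == bool(l) for v, l in zip(values, labels)) / len(labels)
            best = max(best, acc)
    return best

labels = [l for _, _, l in points]
raw_x = best_threshold_accuracy([x for x, _, _ in points], labels)
raw_y = best_threshold_accuracy([y for _, y, _ in points], labels)
engineered = best_threshold_accuracy([x * x + y * y for x, y, _ in points], labels)

print(raw_x, raw_y, engineered)  # the engineered feature alone is perfect
```

No tuning of the threshold on x or y can rescue a feature that simply doesn't carry the class boundary; encoding a little domain knowledge into a new feature does in one line what model fiddling cannot.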
You should always be looking to make the effort that most improves your chances of success on your current or future projects. Often, that will require moving outside of your comfort zone or doing things that you normally wouldn’t. But the more often you do these things, the larger your toolbox gets, and the more likely success will follow.
This idea and many of the other ideas in these posts apply throughout life for everyone, but for those working on the frontier of a new field they are especially applicable. At this stage in the game, a win for one of us is really a win for all of us; every victory further legitimizes and expands the field for everyone. If we’re all careful about the work we choose, are creative in our approaches, and ensure that our successes are both numerous and meaningful in the real world, then the current promise surrounding Machine Learning is sure to become a reality.