At this year’s Strata Data Conference, many of us arrived eager to stay current with the cutting edge of big data systems, and we were not disappointed. Speakers touted the continued rise of the unified cloud-based data platform, the associated mainstreaming of deep learning, the ongoing shift toward real-time frameworks, and the growth of Docker-based model deployment. Despite these promising developments, a countervailing theme offset, to an extent, the allure of innovation: businesses often fall short in deploying machine learning systems to production. A recent Gartner survey found that while roughly three-quarters of businesses have invested or plan to invest in big data, only 15% have big data projects in production [1]. Why might this be the case?

A Skills Gap or a Communication Gap?

One commonly asserted reason is a skills gap, whether on the part of data science, engineering, business, or all of the above. One of the pitfalls of moving from prototype to production is the introduction of data that is inconsistent with the conditions under which the model was trained. David Talby, CTO of Pacific AI, noted in his Strata executive briefing that “the greatest model, trained on data inconsistent with the data it faces in the real world, will at best perform unreliably, and at worst fail catastrophically.” Some businesses have worked around this by limiting the scope of training data to views of production data, which requires heavy collaboration between data science and engineering. Even with this precaution, however, other inconsistencies remain: local trends in the training or validation data, seasonality, changes in the population or underlying schema, or even the endogenous effect of deploying the model itself (for instance, users adapting to the model’s decisions). From both a technical and a theoretical perspective, putting a model into production is always a delicate process.
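
To make this concrete, one lightweight safeguard is to capture a profile of the training data (schema plus summary statistics) and validate each production batch against it before scoring. The Python sketch below is a minimal illustration of this idea rather than any particular team’s implementation; the function names and the three-sigma threshold are hypothetical choices:

    import pandas as pd

    def training_profile(df: pd.DataFrame) -> dict:
        """Capture schema and per-column summary stats at training time."""
        numeric = df.select_dtypes("number")
        return {
            "columns": set(df.columns),
            "means": numeric.mean().to_dict(),
            "stds": numeric.std().to_dict(),
        }

    def check_batch(batch: pd.DataFrame, profile: dict, z_threshold: float = 3.0) -> list:
        """Compare a production batch to the training profile; return warnings."""
        warnings = []
        missing = profile["columns"] - set(batch.columns)
        extra = set(batch.columns) - profile["columns"]
        if missing:
            warnings.append(f"columns missing in production batch: {sorted(missing)}")
        if extra:
            warnings.append(f"unexpected new columns: {sorted(extra)}")
        for col, mu in profile["means"].items():
            std = profile["stds"].get(col) or 0.0
            if col in batch and std > 0:
                shift = abs(batch[col].mean() - mu) / std
                if shift > z_threshold:
                    warnings.append(f"{col}: batch mean shifted {shift:.1f} sigma from training")
        return warnings

Run in the serving pipeline ahead of the model, a check like this turns schema changes and distribution shifts into alerts rather than silently degraded predictions.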

In a Strata keynote, Cassie Kozyrkov, Chief Data Scientist at Google, noted that a company cannot “just hire some new PhDs, give them two to four years, let them take some knocks on the chin, and assume they’ll figure it out.” Yet this industry cliché remains a common approach. Businesses hire teams of specialists in models and algorithms and ask them to work cross-functionally to the utmost extent: with line-of-business stakeholders on one hand (product managers, business analysts, and senior leadership, among others), and with in-the-trenches technicians, such as upstream and downstream engineers, on the other. Line-of-business and engineering professionals face a similar challenge, as they work directly with one another, and with data science, to satisfy business needs.

The prevailing view that the problem is a skills gap has an element of truth to it; the skills needed to successfully deploy a model are indeed different from the ones needed for prototyping. The underlying issue, however, may really be a communication gap. Data scientists need their models to fit as seamlessly as possible within complex data processing systems while meeting tight SLAs. Especially for new models that do not operate within established precedent, this requires first assessing business needs, then deeply examining deployment options, and finally determining an appropriate theoretical and technical approach.

Often, these cross-functional deliberations unfold over a series of meetings in which the needs of the business and the feasibility of deployment are discussed in the abstract, supplemented by reviews of business and engineering documentation and by proof-of-concept models that establish a baseline. While these first steps have value, they do not go far enough in mitigating deployment risk. Even if everyone comes together and there is a handshake agreement to proceed with a certain model, members of each domain can hold unstated assumptions that, once surfaced, require revisiting the technical modeling approach, the deployment scheme, or both.

Instead of holding a series of relatively decontextualized meetings, it may be more effective for data scientists to conduct “ride-along” sessions with their business and engineering partners, observing or (where possible) assisting in hands-on work [2]. This deepens data scientists’ understanding of overarching business objectives and technical constraints, and helps them build models that meet customer needs while standing a high chance of successful deployment.

Likewise, business and engineering team members can take the initiative to understand and empathize with their cross-functional peers. For instance, engineers could periodically conduct their own end-to-end machine learning projects to better understand the theoretical constraints of data science. In a similar vein, business professionals could engage in an ongoing deep dive into the practical fundamentals of data science and engineering to better understand some of the technical aspects of extracting value from big data.

Learning in Production

When cross-functional collaboration results in a successful initial deployment, it is worth pausing to appreciate the milestone for what it is: the culmination of a shared effort among passionate professionals working on problems of exceptional complexity. But deployment is not the end of the project. To maximize value and minimize risk in production, the team must continuously assess the production model, validating both its performance and the underlying architecture.

Here as well, data scientists, engineers, and business professionals bring different but complementary perspectives. Data scientists can assess the health of a model by monitoring high-level model metrics or by digging deep into its inner workings; even when immediate value has been established, ongoing assessment is always necessary, given that models degrade over time [3]. Business stakeholders can assess a model’s performance via business metrics and share those views with data science, giving insight into the model’s downstream impact. Finally, engineering can verify the health of the model by continually evaluating the underlying system and input data, and by looking ahead to future schema or entity changes that may affect it, which requires a sophisticated understanding of all the models that depend on their systems.
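
One common statistic data scientists use to operationalize this kind of ongoing assessment is the population stability index (PSI), which measures how far the distribution of a model’s scores (or of an input feature) in production has drifted from the distribution observed at training or validation time. Below is a minimal NumPy sketch; the rule-of-thumb thresholds in the closing comment are conventional, but actual cutoffs are team-specific:

    import numpy as np

    def population_stability_index(baseline: np.ndarray, production: np.ndarray, bins: int = 10) -> float:
        """PSI between a baseline score distribution and a production one."""
        # Bin edges are derived from the baseline (e.g., validation-time) scores.
        edges = np.percentile(baseline, np.linspace(0, 100, bins + 1))
        edges[0], edges[-1] = -np.inf, np.inf  # catch outliers in the end bins
        base_pct = np.histogram(baseline, edges)[0] / len(baseline)
        prod_pct = np.histogram(production, edges)[0] / len(production)
        eps = 1e-6  # avoid log(0) and division by zero for empty bins
        base_pct = np.clip(base_pct, eps, None)
        prod_pct = np.clip(prod_pct, eps, None)
        return float(np.sum((prod_pct - base_pct) * np.log(prod_pct / base_pct)))

    # Common rule of thumb: PSI < 0.1 stable, 0.1-0.25 worth investigating,
    # > 0.25 a likely sign the model needs attention or retraining.

Tracked over time alongside business metrics and engineering health checks, a drift statistic like this gives all three groups a shared, early signal that a model is decaying.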

Big Data is a Team Sport

Even as technical possibilities expand in the era of cloud-based, self-serve systems, communication and coordination remain key to delivering on the promise of AI/ML systems at scale. The more data science, line-of-business, and engineering teams can think like and communicate with their functional counterparts, the greater the likelihood of successful deployment, and ultimately of shifting from allure to impact in big data.

References:

[1] Van der Meulen, R. (2016). “Gartner Survey Reveals Investment in Big Data Is Up but Fewer Organizations Plan to Invest.” https://www.gartner.com/newsroom/id/3466117

[2] At Intuit, we call these “follow-me-home” sessions: Colvin, G. (2017). “How Intuit Reinvents Itself.” http://fortune.com/2017/10/20/how-intuit-reinvents-itself/

[3] Talby, D. (2018). “Lessons learned turning machine learning models into real products and services.” https://www.oreilly.com/ideas/lessons-learned-turning-machine-learning-models-into-real-products-and-services

Bio:

Rod Albuyeh is a Data Scientist on the Money Movement team in Intuit’s Security, Risk, and Fraud (SRF) group, where he leverages machine learning to mitigate risk associated with credit card and ACH payments. Prior to that, he served as a Winifred Greenleaf Allen Doctoral Fellow at the University of Southern California where he researched political psychology using applied statistics and experimental methods. Outside of work he is a dad, Brazilian Jiu Jitsu practitioner, and musician.