Ziyou "Eugene" Yan has a wonderfully compact [Guide to Machine Learning in Production](https://eugeneyan.com/writing/practical-guide-to-maintaining-machine-learning/) that he based on Google's famous [Rules of ML](https://developers.google.com/machine-learning/guides/rules-of-ml).

## Must-haves (extended)

- Understand the concept of [defensive UX for Human-AI interactions](Guidelines%20for%20human-AI%20interactions.md).
- Validate your models on independent test sets before deployment.
- Have continuous input data validation.
  - Start with basic checks, like the number of rows, the presence of erroneous duplicate values, and the proportion of null values.
  - Then, check the ranges of numeric and date columns, and the unique values of categorical columns.
  - In NLP: check that the (Zipfian) distribution of (key)words is not shifting over time.
  - See [feature monitoring and alerting suggestions](Feature%20monitoring%20and%20alerting%20suggestions.md) for more suggestions.
- Make it easy to roll back deployments.
- Ensure security and privacy have been addressed.
  - Authentication and authorization (auth/auth): e.g., who can access which data/resources.
  - Personally Identifiable Information (PII) is always encrypted and can be removed from the model and data if needed.
  - Differential Privacy: Data about one person does not leak to others.

## Good-to-haves

- Your data scientists should own the whole project lifecycle from design to operations.
- Invest in data science tooling.
  - Have a template to develop and deploy machine learning models.
  - Set up a centralized tracker to share and review results.
  - Suggestions (for both points combined): [MLflow](https://www.mlflow.org/), [Weights & Biases](https://wandb.ai/), [Kedro](https://kedro.org/), [Pachyderm](https://pachyderm.io/), or even just good old [DVC (w/CML) + Streamlit](https://github.com/sicara/dvc-streamlit-example)
  - Have a workflow manager to schedule data jobs with dependencies.
    - Suggestions: [Airflow](https://airflow.apache.org/), [dbt](https://www.getdbt.com/), [Metaflow](https://metaflow.org/), [Argo Workflows](https://argoproj.github.io/argo-workflows/), or even just good old Python [`joblib`](https://joblib.readthedocs.io/en/stable/).
- Invest in customer service tooling.
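
The input data validation checks listed under the must-haves (row count, duplicate values, proportion of nulls, numeric ranges, and categorical values) could be sketched roughly as below. This is a minimal illustration rather than anything from Eugene Yan's guide: the function name `validate_batch`, its parameters, the thresholds, and the sample columns (`user_id`, `age`, `country`) are all hypothetical. It uses plain Python to stay dependency-free; in practice you would more likely run equivalent checks on a dataframe or with a dedicated validation library. The NLP word-distribution check is not covered here.

```python
from collections import Counter


def validate_batch(rows, *, min_rows, key, max_null_frac, ranges, allowed):
    """Return a list of human-readable problems found in a batch of dict rows.

    All parameter names and thresholds are illustrative assumptions.
    """
    problems = []

    # Basic check: did we receive roughly as many rows as expected?
    if len(rows) < min_rows:
        problems.append(f"expected at least {min_rows} rows, got {len(rows)}")

    # Basic check: erroneous duplicates in a column that should be unique.
    key_counts = Counter(row.get(key) for row in rows)
    duplicates = sorted(k for k, n in key_counts.items() if n > 1 and k is not None)
    if duplicates:
        problems.append(f"duplicate {key} values: {duplicates}")

    # Basic check: proportion of nulls per column (a missing key counts as null).
    columns = {col for row in rows for col in row}
    for col in sorted(columns):
        nulls = sum(1 for row in rows if row.get(col) is None)
        if rows and nulls / len(rows) > max_null_frac:
            problems.append(f"{col}: {nulls}/{len(rows)} null values")

    # Range check for numeric columns (also works for datetime.date bounds).
    for col, (low, high) in ranges.items():
        bad = [
            row[col]
            for row in rows
            if row.get(col) is not None and not low <= row[col] <= high
        ]
        if bad:
            problems.append(f"{col}: values outside [{low}, {high}]: {bad}")

    # Unique-value check for categorical columns.
    for col, expected in allowed.items():
        seen = {row[col] for row in rows if row.get(col) is not None}
        unexpected = sorted(seen - set(expected))
        if unexpected:
            problems.append(f"{col}: unexpected categories {unexpected}")

    return problems


# Hypothetical incoming batch with a duplicate ID, an implausible age,
# and an unknown country code.
batch = [
    {"user_id": 1, "age": 34, "country": "SG"},
    {"user_id": 2, "age": None, "country": "US"},
    {"user_id": 2, "age": 212, "country": "XX"},
]

problems = validate_batch(
    batch,
    min_rows=2,
    key="user_id",
    max_null_frac=0.5,
    ranges={"age": (0, 120)},
    allowed={"country": {"SG", "US"}},
)
for problem in problems:
    print(problem)
```

Returning a list of problems, instead of raising on the first failure, lets a scheduled job log or alert on everything that is wrong with a batch at once. Whether a failed check should block the pipeline or only raise an alert is a separate decision that depends on the use case.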