If you only want to run quick analyses every now and then, installing [JupyterLab Desktop](https://github.com/jupyterlab/jupyterlab-desktop) is probably sufficient.

```shell
brew install --cask jupyterlab
```

If not:

1. Set up a virtual environment with either Conda or Poetry:
    - [Install miniconda](Miniconda%20installation.md) from Anaconda and create a [Conda environment](Conda%20environments.md)
    - Or, do a [[Poetry setup]], possibly with the full-blown [project boilerplate](Tech/Python/Project%20Boilerplate/An%20Overview.md).
2. Create a working directory (unless you used Poetry) and gather the data for the experiments.
3. Install the production packages `jupyterlab pandas` and the development packages `dvc[s3,azure,ssh] nbdime papermill` (for data versioning, git diffing, and running notebooks, respectively), and optionally `wandb` ([Weights & Biases](https://wandb.ai/site))
    - via `conda install -c conda-forge`
    - or `poetry add`, with `--group dev` where needed, followed by `poetry install`
4. Initialize your git repository: `git init`
5. Add [nbdime](https://nbdime.readthedocs.io/en/latest/) to use an interactive difftool: [`poetry run` followed by] `nbdime config-git --enable`
    - To show git diffs in the Jupyter UI, also run `nbdime extensions --enable --user`
    - Clear the outputs of all cells before checking notebooks in with git, even with nbdime; only check in outputs if you need to track them.
    - Tim Staley has a nice [blog post](http://timstaley.co.uk/posts/making-git-and-jupyter-notebooks-play-nice/) on how to [strip Jupyter outputs as a `git` filter with `jq`](Strip%20Jupyter%20outputs%20as%20Git%20filter%20with%20jq.md).
6. [Version-control your data with DVC](Version-control%20your%20data%20with%20DVC.md) (unless you are using Weights & Biases for that).
7. Start [`poetry shell`, and then] [`wandb login` to get Weights & Biases started, and then] run `jupyter lab` to create your first notebook.
8. To compare notebooks in detail, use `git difftool --tool nbdime -- *.ipynb` (for plain `git diff` comparisons, nbstripout will be used).
9. [Papermill](https://github.com/nteract/papermill) lets you run notebooks with dvc: `papermill input.ipynb output.ipynb -p param1 val1 ...`

Note that there is no good way to sync the nbdime setup via git, because it modifies the `.git/config` file, which is not versioned. Therefore, every user checking out this project who wants to use that tool needs to set it up on their own (`nbdime config-git --enable`).

Refer to the [Jupyter notebook preambles](Jupyter%20notebook%20preambles.md) to get started with a new notebook.
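
The Poetry route through the install, `git init`, and nbdime steps above can be collected into a small script. This is only a sketch under a few assumptions: it expects to run inside an existing Poetry project, the `run` helper and the `DRY_RUN` variable are hypothetical names introduced here (they are not part of Poetry or nbdime), and it defaults to printing the commands so that nothing is installed by accident.

```shell
#!/usr/bin/env sh
set -eu

# Hypothetical switch: defaults to 1 (print only). Set DRY_RUN=0 to execute.
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical helper: print the command in dry-run mode, otherwise run it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Production and development packages, as listed in the install step
run poetry add jupyterlab pandas
run poetry add --group dev "dvc[s3,azure,ssh]" nbdime papermill
run poetry install

# Git repository plus the per-clone nbdime configuration
run git init
run poetry run nbdime config-git --enable
run poetry run nbdime extensions --enable --user
```

Because the nbdime configuration lives in `.git/config`, each collaborator would need to run the last two commands (or a script like this one) after cloning. Saved as, say, `setup.sh` (a hypothetical file name), it would be executed for real with `DRY_RUN=0 sh setup.sh`.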