# Favorite #Python commandline packages
Extract CSV data from Excel files:
```shell
pipx install xlsx2csv
```
However, this fails after a few uses. Better to use headless LibreOffice:
```shell
/Applications/LibreOffice.app/Contents/MacOS/soffice --headless --convert-to csv "$file"
```
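To call the same conversion from a Python script, a minimal sketch could look like the following. The helper names `soffice_command` and `convert_to_csv` are made up for illustration, and the `PATH` fallback for non-macOS systems is an assumption about your setup.

```python
import shutil
import subprocess
from pathlib import Path

# Path taken from the shell command above (macOS install location).
MACOS_SOFFICE = "/Applications/LibreOffice.app/Contents/MacOS/soffice"


def soffice_command(source: Path, outdir: Path, soffice: str = MACOS_SOFFICE) -> list[str]:
    """Build the headless conversion command without running it."""
    return [soffice, "--headless", "--convert-to", "csv", "--outdir", str(outdir), str(source)]


def convert_to_csv(source: Path, outdir: Path) -> Path:
    """Run the conversion and return the path where the CSV is expected."""
    # Assumption: on Linux, `soffice` is available on PATH instead.
    soffice = MACOS_SOFFICE if Path(MACOS_SOFFICE).exists() else shutil.which("soffice")
    if soffice is None:
        raise FileNotFoundError("LibreOffice (soffice) not found")
    subprocess.run(soffice_command(source, outdir, soffice), check=True)
    return outdir / f"{source.stem}.csv"
```

Building the command as a list (rather than a shell string) avoids quoting problems with file names that contain spaces.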
For assertionquery.py (Cascade):
```shell
pip install --user --upgrade beautifulsoup4
```
Essential developer packages:
```shell
poetry add --group dev pytest mypy ruff black
```
Essential web developer packages:
```shell
brew install httpie
```
Or, on Linux:
```shell
sudo apt install httpie
```
- HTTPie desktop app: https://httpie.io/desktop
Essential global Python development packages (without Poetry, or for publishing to PyPI):
```shell
pip install --upgrade twine build pipreqs ipython
```
## Recommendations
See [[PyEnv usage]], [[Conda usage]], and [[Poetry usage]] to set up packages. Never use the system Python or pip directly (the packages above are *possibly* the only exceptions)!
The [Project Boilerplate](Tech/Python/Project%20Boilerplate/An%20Overview.md) explains how to set up a state-of-the-art baseline for developing in Python.
And [Jupyter environment](Jupyter%20environment.md) explains how to set up a Data Analytics and Machine Learning environment (typically, on top of that baseline).
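To verify that you are not accidentally working with the system interpreter, a best-effort check might look like this. The function name `in_isolated_environment` is hypothetical, and the check only covers venv/virtualenv-style environments and activated Conda environments; a bare pyenv-managed interpreter without a virtual environment is not detected.

```python
import os
import sys


def in_isolated_environment() -> bool:
    """Best-effort check for a virtual environment or an activated Conda environment."""
    # Inside a venv/virtualenv, sys.prefix points at the environment,
    # while sys.base_prefix still points at the interpreter it was created from.
    in_venv = sys.prefix != sys.base_prefix
    # `conda activate` exports CONDA_PREFIX for the active environment.
    in_conda = bool(os.environ.get("CONDA_PREFIX"))
    return in_venv or in_conda


if __name__ == "__main__":
    print(f"interpreter: {sys.executable}")
    print(f"isolated environment: {in_isolated_environment()}")
```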
### Advanced Python
- `numpy cython` # high-performance Python packages that can put Python on par with C
- `classes returns` # *dry-python* [type classes](https://classes.readthedocs.io/en/latest/pages/why.html) for [ad-hoc polymorphism with late binding](https://en.wikipedia.org/wiki/Ad_hoc_polymorphism) and [railway-oriented programming](https://returns.readthedocs.io/en/latest/pages/railway.html) to handle [None](https://returns.readthedocs.io/en/latest/pages/maybe.html) & [exceptions](https://returns.readthedocs.io/en/latest/pages/result.html), [context-based dependency injection](https://returns.readthedocs.io/en/latest/pages/context.html), [data pipelines](https://returns.readthedocs.io/en/latest/pages/pipeline.html), futures, and serialization, allowing you to implement [higher kinded types](https://returns.readthedocs.io/en/latest/pages/hkt.html) (a.k.a., [Container types](https://returns.readthedocs.io/en/latest/pages/container.html)).
    - **Do not use this unless you agree that your project & team will use the functional style and you are willing to RTFM.** (You were warned not to make a mess of your shiny new Python project.)
- `tox tox-pyenv tox-venv behave hypothesis` # advanced testing libraries
    - The [tox](https://tox.wiki) ecosystem allows you to test libraries or frameworks that must support multiple versions of Python
    - [behave](https://behave.readthedocs.io/en/latest/) enables [Gherkin-style](https://behave.readthedocs.io/en/latest/philosophy/?highlight=gherkin#the-gherkin-language) behavior-driven development
    - [hypothesis](https://hypothesis.readthedocs.io/en/latest/index.html) supports [advanced testing techniques](https://hypothesis.readthedocs.io/en/latest/manifesto.html), such as property-based testing or fuzzing
- `pipreqs` # requirements.txt generator that lists only the packages your project actually imports, rather than *all* packages your current Python environment can see (as `pip freeze` does)
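The idea behind import-based requirements generation, as done by `pipreqs`, can be sketched with the standard library alone. This is only an illustration of the principle, not how `pipreqs` is implemented; the function name `imported_top_level_modules` is made up.

```python
import ast


def imported_top_level_modules(source: str) -> set[str]:
    """Collect the top-level module names imported by a piece of Python source."""
    modules: set[str] = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            modules.update(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            # level > 0 marks a relative import, i.e. the project's own code
            modules.add(node.module.split(".")[0])
    return modules


example = "import numpy as np\nfrom pandas.io import json\nfrom . import helpers\nimport os.path\n"
print(sorted(imported_top_level_modules(example)))  # ['numpy', 'os', 'pandas']
```

A real tool additionally has to drop standard-library modules (`os` here) and map import names to distribution names, which do not always match.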
### Data Science
- `pandas openpyxl statsmodels scipy` # statistics and data analytics
- `matplotlib seaborn` # data visualization
- `dvc[s3,azure,ssh]` # data versioning
- `streamlit` # data science UI
- `jupyterlab ipywidgets widgetsnbextension jupyter_contrib_nbextensions` # Jupyter
- `jupytext` # handle Jupyter notebooks as regular plain-text files
- `nbdime` # make Jupyter notebook git[-diff/merge]-friendly
### Developer tooling
- `black ruff mypy` # type-safe, clean development 101
- `pytest` # testing
- `py-spy line_profiler pyre-check` # profiling (`py-spy`, `line_profiler`) and static type checking (`pyre-check`)
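As a small illustration of how these tools fit together, the sketch below shows an annotated function and a pytest-style test. The names `average` and `test_average` are made up for the example.

```python
from statistics import fmean
from typing import Sequence


def average(values: Sequence[float]) -> float:
    """Return the arithmetic mean; the annotations let mypy reject e.g. average("abc")."""
    if not values:
        raise ValueError("average() needs at least one value")
    return fmean(values)


def test_average() -> None:
    # pytest collects functions named test_* and reports failing plain asserts
    assert average([1.0, 2.0, 3.0]) == 2.0
```

Saved in a file such as `test_average.py`, running `pytest` should pick up the test, while `mypy`, `ruff`, and `black` check the same file for type errors, lint issues, and formatting.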
### Machine Learning
- `torch torchvision pytorch-ignite tensorboard transformers` # deep learning stack (`torch` is the pip package name; conda calls it `pytorch`)
- `dvc[s3,azure,ssh] wandb mlflow` # experiment, model, and data tracking
- `jupyterlab jupytext ipywidgets widgetsnbextension jupyter_contrib_nbextensions` # Jupyter
- `scikit-learn sklearn-pandas tensorflow` # machine learning
- `gensim spacy sentence-transformers transformers` # NLP
For **PyTorch**, check your CUDA installation and [go to the PyTorch website](https://pytorch.org/get-started/locally/) to find out what to download.
For **Jupyter Extended**, call:
`jupyter contrib nbextension install # optionally: --user`
### Web development
- `fastapi` # asynchronous REST API framework
    - [Real Python tutorial](https://realpython.com/courses/python-rest-apis-with-fastapi/)
- `django` # solid, but opinionated web framework (recommended for newcomers)
- `flask connexion[swagger-ui] SQLAlchemy` # advanced web framework backend
    - [Real Python tutorial](https://realpython.com/flask-connexion-rest-api/)
- `pydantic` # data validation and serialization
- `beautifulsoup4` # HTML and XML parsing
If you are planning to build [Hypermedia-Driven Applications (HDAs)](https://htmx.org/essays/hypermedia-driven-applications/) with [HTMX](https://htmx.org/), also take a look at [PyHAT](https://github.com/PyHAT-stack/awesome-python-htmx).
### Utilities
- `ipython` # better REPL than the default interpreter prompt
- `graphtage` # diffing tree-like data (HTML, YAML, JSON, etc.)
- `beautifulsoup4 textract goose3` # text extraction
To install **Graphviz**, use either of the following, depending on whether you are using pip or conda:
- `brew install graphviz` followed by `pip install graphviz`
- `conda install graphviz` followed by `conda install python-graphviz`