# Imports
Using Matplotlib for plotting, and scikit-learn to create the predictions:
```python
import matplotlib.pyplot as plt
from sklearn.model_selection import cross_val_predict
from sklearn.metrics import precision_recall_curve
```
# Setup
Create a classifier, set or learn a cutoff for it (the default is zero), and run (cross-validated) predictions with it. *Importantly*, ask for scores, not classes, as the returned values by setting *`method="decision_function"`*.
```python
my_clf = ... # some classifier
my_threshold = ... # some cutoff value
y_scores = cross_val_predict(
    my_clf, X_train, y_train, cv=3, method="decision_function"
)
```
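A runnable version of this sketch, with `SGDClassifier` and a synthetic dataset standing in (as assumptions) for `my_clf` and your actual training data:

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import SGDClassifier
from sklearn.model_selection import cross_val_predict

# Synthetic stand-in for X_train / y_train
X_train, y_train = make_classification(n_samples=200, random_state=42)

my_clf = SGDClassifier(random_state=42)
my_threshold = 0.0  # the default decision_function cutoff

# Scores (signed distances to the decision boundary), not class labels
y_scores = cross_val_predict(
    my_clf, X_train, y_train, cv=3, method="decision_function"
)
```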
Some classifiers (e.g., random forests) expose only a `predict_proba` method; in those cases, set `method="predict_proba"` instead and keep the positive-class column of the result as your scores.
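A minimal sketch of that adaptation, again on synthetic stand-in data (the `RandomForestClassifier` and dataset here are illustrative assumptions):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_predict

# Synthetic stand-in for X_train / y_train
X_train, y_train = make_classification(n_samples=200, random_state=42)

forest_clf = RandomForestClassifier(random_state=42)

# predict_proba returns one column per class; keep the
# positive-class column to get a single score per instance.
y_probas = cross_val_predict(
    forest_clf, X_train, y_train, cv=3, method="predict_proba"
)
y_scores = y_probas[:, 1]  # probability of the positive class
```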
# Calculate Score Curves
Use the true binary class labels and the classifier scores for those instances as input to scikit-learn's **`precision_recall_curve`** function from the `metrics` package.
```python
precisions, recalls, thresholds = precision_recall_curve(y_train, y_scores)
```
If you want to mark the chosen threshold, you also need to locate it on the precision and recall axes:
```python
idx = (thresholds >= my_threshold).argmax()  # first index at or above the cutoff
```
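Putting those two steps together end to end (the labels, scores, and 0.5 cutoff below are illustrative assumptions):

```python
import numpy as np
from sklearn.metrics import precision_recall_curve

# Illustrative ground truth and classifier scores
y_train = np.array([0, 0, 1, 0, 1, 1, 0, 1])
y_scores = np.array([0.1, 0.4, 0.35, 0.8, 0.65, 0.9, 0.2, 0.7])
my_threshold = 0.5  # assumed cutoff

precisions, recalls, thresholds = precision_recall_curve(y_train, y_scores)

# precisions and recalls each have one more entry than thresholds
idx = (thresholds >= my_threshold).argmax()
```

Note that element `i` of `precisions` is the precision of predictions with `score >= thresholds[i]`, which is why the same `idx` works for all three arrays.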
# Precision and Recall Plots
With this in place, you can plot the two curves for precision and recall separately:
```python
plt.plot(thresholds, precisions[:-1], "b--", label="Precision", linewidth=2)
plt.plot(thresholds, recalls[:-1], "g-", label="Recall", linewidth=2)
plt.vlines(my_threshold, 0, 1.0, "k", "dotted", label="Threshold")
plt.plot(thresholds[idx], precisions[idx], "bo")
plt.plot(thresholds[idx], recalls[idx], "go")
plt.grid()
plt.xlabel("Classifier Score")
plt.legend(loc="center right")
plt.show()
```
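The whole plot can also be produced headlessly, which is handy for scripts and CI; the data below is illustrative, and the `Agg` backend avoids needing a display:

```python
import matplotlib
matplotlib.use("Agg")  # headless backend, no display required
import matplotlib.pyplot as plt
import numpy as np
from sklearn.metrics import precision_recall_curve

# Illustrative ground truth, scores, and cutoff
y_train = np.array([0, 0, 1, 0, 1, 1, 0, 1])
y_scores = np.array([0.1, 0.4, 0.35, 0.8, 0.65, 0.9, 0.2, 0.7])
my_threshold = 0.5

precisions, recalls, thresholds = precision_recall_curve(y_train, y_scores)
idx = (thresholds >= my_threshold).argmax()

fig, ax = plt.subplots()
ax.plot(thresholds, precisions[:-1], "b--", label="Precision", linewidth=2)
ax.plot(thresholds, recalls[:-1], "g-", label="Recall", linewidth=2)
ax.vlines(my_threshold, 0, 1.0, "k", "dotted", label="Threshold")
ax.plot(thresholds[idx], precisions[idx], "bo")
ax.plot(thresholds[idx], recalls[idx], "go")
ax.grid()
ax.set_xlabel("Classifier Score")
ax.legend(loc="center right")
fig.savefig("precision_recall_plot.png")
```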
![[Precision-Recall Plot.png]]
# Precision/Recall Curve
If you would rather plot the precision/recall curve itself, you can do that with the same data:
```python
plt.plot(recalls, precisions, linewidth=2, label="Precision/Recall curve")
plt.plot([recalls[idx], recalls[idx]], [0., precisions[idx]], "k:")
plt.plot([0., recalls[idx]], [precisions[idx], precisions[idx]], "k:")
plt.plot([recalls[idx]], [precisions[idx]], "ko", label="Threshold")
plt.xlabel("Recall")
plt.ylabel("Precision")
plt.axis([0, 1, 0, 1])
plt.grid()
plt.legend(loc="lower left")
plt.show()
```
![[Precision-Recall Curve.png]]
# Analysis: AUC PR
To calculate the exact area under the precision-recall curve, use the `auc` function from the `metrics` package:
```python
from sklearn.metrics import auc
auc(recalls, precisions)
```
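A self-contained sketch of the AUC-PR computation, using the same illustrative labels and scores as assumptions:

```python
import numpy as np
from sklearn.metrics import auc, precision_recall_curve

# Illustrative ground truth and classifier scores
y_train = np.array([0, 0, 1, 0, 1, 1, 0, 1])
y_scores = np.array([0.1, 0.4, 0.35, 0.8, 0.65, 0.9, 0.2, 0.7])

precisions, recalls, thresholds = precision_recall_curve(y_train, y_scores)

# auc integrates precision over recall with the trapezoidal rule;
# recalls is monotonically decreasing, which auc accepts.
auc_pr = auc(recalls, precisions)
```

scikit-learn's `average_precision_score` is a related single-number summary that avoids trapezoidal interpolation, which can be optimistic on PR curves.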
To get the exact precision and recall at the threshold:
```python
print("Precision@Threshold =", precisions[idx])
print("Recall@Threshold =", recalls[idx])
```