Simple steps to find a good learning rate:
1. Identify a lower bound rate just before the loss stops decreasing.
2. Identify an upper bound rate just before training becomes unstable.
4. Generate an exponentially increasing list of learning rates from the lower to the upper bound.
4. Train one batch with each rate, starting from the lowest, and measure the loss after each rate increase.
5. Plot the exponent of the rate against the loss to find the optimal learning rate with the lowest loss.
As an example, if you found `0.001 = 1e-3` and `1.0 = 1e0` as the lower and upper bounds, you could search for the optimal learning rate over 1000 batch steps:
```python
import torch
import matplotlib.pyplot as plt

n_batches = 1000
lr_exp = torch.linspace(-3, 0, n_batches)  # exponents spanning 1e-3 .. 1e0
lr_s = 10 ** lr_exp                        # one learning rate per batch
losses = train_batched(X, Y, n_batches, lr_s)
plt.plot(lr_exp, losses)                   # x: exponent, y: loss
```
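The `train_batched` helper is assumed above; a minimal sketch of what it might do, using a hypothetical linear model and MSE loss (one optimizer step per batch, each at the next learning rate in the schedule), could look like:

```python
import torch

def train_batched(X, Y, n_batches, lr_s):
    """Run one training step per learning rate; return the loss after each step.

    This is an illustrative sketch: the model, loss, and data handling here
    are placeholders, not the document's actual training setup.
    """
    model = torch.nn.Linear(X.shape[1], Y.shape[1])
    losses = []
    for i in range(n_batches):
        # A fresh SGD optimizer per step makes it easy to change the rate
        opt = torch.optim.SGD(model.parameters(), lr=lr_s[i].item())
        opt.zero_grad()
        loss = torch.nn.functional.mse_loss(model(X), Y)
        loss.backward()
        opt.step()
        losses.append(loss.item())
    return losses
```

In practice each step would draw a different mini-batch rather than reusing the full `X`, `Y`.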
Example plot (x is the exponent of the learning rate; y is the loss):

Here, the optimal learning rate would be around `10**-0.75 ≈ 0.178`.
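Instead of reading the minimum off the plot, the best exponent can be picked programmatically by smoothing the loss curve (so a single noisy batch doesn't win) and taking the argmin. This sketch uses a synthetic loss curve with a dip near exponent -0.75 for illustration:

```python
import torch

n_batches = 1000
lr_exp = torch.linspace(-3, 0, n_batches)

# Synthetic stand-in for the measured losses: a dip near exponent -0.75
losses = (lr_exp + 0.75) ** 2 + 0.05 * torch.rand(n_batches)

# Moving-average smoothing via a 1D convolution (window of 25 steps)
kernel = torch.ones(1, 1, 25) / 25
smoothed = torch.nn.functional.conv1d(
    losses.view(1, 1, -1), kernel, padding=12
).view(-1)

best = smoothed.argmin()
best_lr = 10 ** lr_exp[best].item()
```

With real measurements, `losses` would come from the range-test run above rather than the synthetic curve.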