Repeatedly testing on the test set and tuning models against it.
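
A minimal sketch of why this is a problem, assuming the line refers to the usual pitfall of selecting or tuning a model by its test-set score. Everything here is hypothetical and synthetic: the labels are coin flips, and each "model" is a random lookup table with no real skill. Picking the candidate that scores best on the test set still produces a flattering test score, which does not carry over to fresh data.

```python
import random

NUM_FEATURE_VALUES = 20  # hypothetical: features are integers in [0, 20)


def make_data(n, rng):
    # Labels are coin flips drawn independently of the feature,
    # so no model can genuinely beat 50% accuracy in expectation.
    return [(rng.randrange(NUM_FEATURE_VALUES), rng.randrange(2)) for _ in range(n)]


def make_model(rng):
    # A "model" is a random lookup table: feature value -> predicted label.
    return [rng.randrange(2) for _ in range(NUM_FEATURE_VALUES)]


def accuracy(model, data):
    # Fraction of examples where the looked-up prediction matches the label.
    return sum(model[x] == y for x, y in data) / len(data)


rng = random.Random(0)
test_set = make_data(100, rng)      # the set that gets reused for selection
fresh_set = make_data(10_000, rng)  # data never consulted during selection

# "Tuning against the test set": try many candidates, keep the best test scorer.
candidates = [make_model(rng) for _ in range(200)]
best = max(candidates, key=lambda m: accuracy(m, test_set))

test_score = accuracy(best, test_set)
fresh_score = accuracy(best, fresh_set)
print(f"score on the reused test set: {test_score:.3f}")
print(f"score on fresh data:          {fresh_score:.3f}")
```

The gap between the two printed scores comes purely from selection, since none of the candidates has any predictive power. A common remedy, not stated in the original line, is to tune against a separate validation split or cross-validation folds and to consult the test set only once, at the end.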