Building a good classifier in Python can be a tedious process. There are multiple models to try, and each one has its own set of hyperparameters that can affect the results. The most common approach is an exhaustive search over every combination in a predefined grid of hyperparameter values, using something like scikit-learn's GridSearchCV. Because every combination is trained and scored with cross-validation, this process often takes minutes to hours, even on a machine with lots of cores and memory.
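To get a feel for why an exhaustive search gets slow, here is a minimal sketch that counts the work involved. The grid below is a made-up example (the parameter names and values are assumptions for illustration, not a recommendation), and it assumes 5-fold cross-validation, where each combination is fit once per fold.

```python
from itertools import product

# Hypothetical hyperparameter grid; names and values are illustrative only.
param_grid = {
    "n_estimators": [10, 50, 100, 200],
    "max_depth": [3, 5, 10, None],
    "min_samples_split": [2, 5, 10],
    "criterion": ["gini", "entropy"],
}
cv_folds = 5  # assumed number of cross-validation folds


def expand_grid(grid):
    """Return every combination of values in the grid as a list of dicts."""
    keys = list(grid)
    return [dict(zip(keys, values)) for values in product(*(grid[k] for k in keys))]


combinations = expand_grid(param_grid)
total_fits = len(combinations) * cv_folds

print(f"{len(combinations)} combinations x {cv_folds} folds = {total_fits} model fits")
# prints: 96 combinations x 5 folds = 480 model fits
```

Adding one more parameter with three candidate values would triple that count, which is why even a modest grid can keep a single machine busy for a long time.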
In this Free Code Friday, we will look at a way you can use Spark to “turn it up to 11” and try more combinations in less time, ultimately leading to a better classifier.