**************************************************
Pump it Up: Data Prep
**************************************************

In this submodule, we'll do a little EDA and data prep for the ``pumpitup``
project. This isn't meant to be an exhaustive example, since our real focus in
Module 2 is classification modeling. Nevertheless, there are some useful tips
in here, including:

* automated EDA tools for Python,
* doing factor lumping with a Python port of the R package ``forcats``,
* creating a data prep script,
* getting your data ready for use with sklearn classification models.

You'll be working in your newly created ``pumpitup`` project folder. Start by
opening the ``data_prep.ipynb`` notebook in Jupyter Lab.

Here is a screencast to help guide you through the notebook:

* `SCREENCAST: Pump it Up data prep `_ (25:27)

Move on to the last submodule, :doc:`mod2d_pumpitup_modeling`.
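
For readers who haven't met factor lumping before, here is a minimal
plain-Python sketch of the idea mentioned in the list above: keep the most
frequent levels of a categorical variable and collapse the rest into a single
catch-all level. It is not the API of the ``forcats`` port used in the
notebook. The function name ``lump_top_n``, the ``"Other"`` label, and the toy
``funder`` values are all hypothetical and chosen only for illustration.

```python
from collections import Counter


def lump_top_n(values, n, other_label="Other"):
    """Keep the n most frequent levels; relabel everything else as other_label."""
    # Counter.most_common(n) returns the n (level, count) pairs with the highest counts.
    top_levels = {level for level, _ in Counter(values).most_common(n)}
    return [v if v in top_levels else other_label for v in values]


# Hypothetical toy column, not taken from the pumpitup data.
funder = ["Gov", "Gov", "Gov", "NGO", "NGO", "Private", "Church"]
lumped = lump_top_n(funder, n=2)
print(Counter(lumped))
```

Lumping rare levels this way keeps the number of dummy columns manageable when
a high-cardinality categorical field is later encoded for a classifier.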