Datasets

Datasets and machine learning models are the basic ingredients of interpretability. This section describes every dataset we refer to in this book and use in our coding examples. All of them are open-source datasets that anyone can access, and they are available in our git repository.

The datasets were not picked merely because they look good or are open source; they were chosen carefully to replicate industrial use cases. They are used regularly in industry, where data scientists rely on them to test the models they build. They range from marketing data to medical data, covering a wide variety of problems.

Datasets

  1. Medical Cost Personal Dataset

  2. Telecom Churn Dataset

  3. Sales Opportunity Size Dataset

  4. Pima Indians Diabetes Dataset

A few of the datasets were acquired from the Squark website; the others are from Kaggle.
