Datasets

Datasets and machine learning models are the basic ingredients of interpretability. This section describes every dataset that we refer to in this book and use in our coding examples. All of them are open-source datasets that anyone can access, and they are available in our git repository.

The datasets were not picked just because they look good or are open source; they were chosen carefully to replicate industrial use cases. They are used regularly in industry, where data scientists rely on them to test the models they build. They range from marketing data to medical data and so cover a wide variety of problems.

Datasets

Medical Cost Personal Dataset
Telecom Churn Dataset
Sales Opportunity Size Dataset
Pima Indians Diabetes Dataset

A few of the datasets were acquired from the Squark website; the others are from Kaggle.


