GBM - PDP
Gradient Boosting Machine (GBM) is a machine learning algorithm that builds an ensemble of weak prediction models, typically decision trees.
It constructs a forward stage-wise additive model by performing gradient descent in function space.
It is also known as MART (Multiple Additive Regression Trees) and GBRT (Gradient Boosted Regression Trees).
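To make the "forward stage-wise additive model" idea concrete, here is a minimal sketch for the squared-error case, where the negative gradient of the loss is simply the residual. It is an illustration only and not the code behind scikit-learn's implementation; the function names `fit_boosted_trees` and `predict_boosted_trees` and all default values are assumptions made for this example.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor


def fit_boosted_trees(X, y, n_stages=50, learning_rate=0.1, max_depth=1):
    """Forward stage-wise boosting for squared-error loss (illustrative only)."""
    init = float(np.mean(y))              # F_0: the best constant prediction
    pred = np.full(len(y), init)
    trees = []
    for _ in range(n_stages):
        # Negative gradient of 0.5 * (y - F)^2 with respect to F is the residual.
        residual = y - pred
        tree = DecisionTreeRegressor(max_depth=max_depth).fit(X, residual)
        # Additive update: F_m = F_{m-1} + learning_rate * h_m
        pred = pred + learning_rate * tree.predict(X)
        trees.append(tree)
    return {"init": init, "trees": trees, "learning_rate": learning_rate}


def predict_boosted_trees(model, X):
    """Sum the constant start value and the scaled output of every stage."""
    stage_sum = sum(tree.predict(X) for tree in model["trees"])
    return model["init"] + model["learning_rate"] * stage_sum
```

Each stage fits a shallow tree to what the current ensemble still gets wrong, which is the sense in which every tree is a weak learner and the ensemble takes a gradient step in function space.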
Dataset: Pima Indians Diabetes; Target: Outcome
The model is trained using the GradientBoostingClassifier class from the scikit-learn library.
Accuracy:
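A minimal sketch of how the model could be trained and its accuracy measured. The train/test split, the default hyperparameters, the helper name `train_gbm`, and the CSV file name in the usage comment are all assumptions for illustration; the original settings are not stated here.

```python
import pandas as pd
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split


def train_gbm(df: pd.DataFrame, target="Outcome", test_size=0.25, random_state=0):
    """Fit a GradientBoostingClassifier and report accuracy on a held-out split."""
    X = df.drop(columns=[target])
    y = df[target]
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=test_size, random_state=random_state, stratify=y
    )
    model = GradientBoostingClassifier(random_state=random_state)
    model.fit(X_train, y_train)
    accuracy = accuracy_score(y_test, model.predict(X_test))
    return model, X_train, accuracy


# Usage (assumes a local CSV copy of the dataset; the file name is hypothetical):
# df = pd.read_csv("diabetes.csv")
# model, X_train, accuracy = train_gbm(df)
# print(f"Accuracy: {accuracy:.3f}")
```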
We will interpret this model with Partial Dependence Plots (PDPs).
With just a few lines of code, we can plot the PDPs for any dataset using scikit-learn's partial dependence utilities.
PDP for every feature
The plots above show how the model's predicted output changes as each feature value varies. Some key points for interpretation from these plots:
As Pregnancies increases, the person's predicted chances of becoming diabetic go up
The higher the Glucose, the higher the predicted chances of a person becoming diabetic
A BMI above 25 increases an individual's predicted chances of becoming diabetic
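To make explicit what each curve represents, the sketch below computes a one-way partial dependence by brute force: the chosen feature is set to a fixed value for every row, and the model's predictions are averaged. The function name `manual_partial_dependence`, the 5th to 95th percentile grid, and the use of predicted probability as the response are assumptions for this illustration; scikit-learn's own plots may use a different response scale and computation method.

```python
import numpy as np


def manual_partial_dependence(model, X, feature, grid_resolution=20):
    """Average predicted probability of the positive class over a grid of feature values."""
    grid = np.linspace(
        X[feature].quantile(0.05), X[feature].quantile(0.95), grid_resolution
    )
    averages = []
    for value in grid:
        X_mod = X.copy()
        X_mod[feature] = value  # force every row to the same feature value
        averages.append(model.predict_proba(X_mod)[:, 1].mean())
    return grid, np.array(averages)


# Usage, given a fitted classifier and its feature DataFrame:
# grid, averages = manual_partial_dependence(model, X_train, "Glucose")
```

An upward-sloping `averages` array for a feature corresponds to the kind of statement listed above, for example that higher Glucose raises the predicted chance of diabetes.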
3-D PDPs
These plots show the combined effect of two features on the change in predicted output. As seen above, a reduction in both Insulin and DiabetesPedigreeFunction results in a negative change in the prediction of a person being diabetic (moving the prediction toward non-diabetic).
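One way such a two-feature surface could be produced is sketched below: the average predicted probability is computed over a grid of value pairs and drawn with matplotlib's 3-D surface plot. This is not necessarily the tool used for the plots above; the function names, the percentile-based grid, and the probability response are assumptions for illustration.

```python
import matplotlib.pyplot as plt
import numpy as np


def two_way_partial_dependence(model, X, feat_a, feat_b, grid_resolution=10):
    """Average predicted probability for every pair of grid values of two features."""
    grid_a = np.linspace(X[feat_a].quantile(0.05), X[feat_a].quantile(0.95), grid_resolution)
    grid_b = np.linspace(X[feat_b].quantile(0.05), X[feat_b].quantile(0.95), grid_resolution)
    surface = np.empty((grid_resolution, grid_resolution))
    for i, a in enumerate(grid_a):
        for j, b in enumerate(grid_b):
            X_mod = X.copy()
            X_mod[feat_a] = a  # fix both features for every row
            X_mod[feat_b] = b
            surface[i, j] = model.predict_proba(X_mod)[:, 1].mean()
    return grid_a, grid_b, surface


def plot_surface_pdp(grid_a, grid_b, surface, feat_a, feat_b):
    """Draw the two-way partial dependence as a 3-D surface."""
    A, B = np.meshgrid(grid_a, grid_b, indexing="ij")  # matches surface[i, j]
    fig = plt.figure()
    ax = fig.add_subplot(projection="3d")
    ax.plot_surface(A, B, surface, cmap="viridis")
    ax.set_xlabel(feat_a)
    ax.set_ylabel(feat_b)
    ax.set_zlabel("Average predicted probability")
    return fig


# Usage, given a fitted classifier and its feature DataFrame:
# ga, gb, surf = two_way_partial_dependence(model, X_train, "Insulin", "DiabetesPedigreeFunction")
# plot_surface_pdp(ga, gb, surf, "Insulin", "DiabetesPedigreeFunction")
# plt.show()
```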
PDP interact plot
The plot below shows the change in predicted output (the value inside each square) for every combination of values of the features Insulin and DiabetesPedigreeFunction (whose values are given by the scales).
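A grid of this kind can be approximated with an annotated heatmap. The sketch below assumes the interaction values are already available as a 2-D array in which `surface[i, j]` is the average prediction with the first feature fixed at `grid_a[i]` and the second at `grid_b[j]`; the function name `plot_interaction_grid` and the formatting choices are hypothetical and not taken from the plotting library used for the original figure.

```python
import matplotlib.pyplot as plt
import numpy as np


def plot_interaction_grid(grid_a, grid_b, surface, feat_a, feat_b):
    """Heatmap of a two-feature partial dependence grid with the value printed in each cell."""
    surface = np.asarray(surface)
    fig, ax = plt.subplots()
    image = ax.imshow(surface, origin="lower", cmap="viridis")
    # Rows of the array follow feat_a (y axis); columns follow feat_b (x axis).
    ax.set_yticks(range(len(grid_a)))
    ax.set_yticklabels([f"{v:.2f}" for v in grid_a])
    ax.set_xticks(range(len(grid_b)))
    ax.set_xticklabels([f"{v:.2f}" for v in grid_b])
    ax.set_ylabel(feat_a)
    ax.set_xlabel(feat_b)
    for i in range(surface.shape[0]):
        for j in range(surface.shape[1]):
            ax.text(j, i, f"{surface[i, j]:.2f}", ha="center", va="center", color="white")
    fig.colorbar(image, ax=ax, label="Average prediction")
    return ax
```

Printing the value inside each cell makes it easy to read off how the prediction moves when both features change together, rather than relying on the colour scale alone.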