Tuning Parameters in GBM for Best Modeling
The Gradient Boosted Machine (GBM) is one of the most widely used Machine Learning models. This article provides several useful methods to tune your GBM model.
If you're trying to build a GBM, here are practical guideline ranges for the following parameters.
With GBMs, a good tree depth starting point is around 6. It’s unusual to have deeper than 10. More than about 20 would be incredibly problem-specific. Some problems do well with a large number of shallow trees.
I usually start by trying 5, 7, and 11. If 5 is the best, then I'll try lower numbers. If 11 is the best, then I'll keep increasing (17, 23, etc.) until it stops improving. However, I'd say that about 95% of the time, you won't need to go past a depth of 13. Training time is also a consideration: deeper trees take longer to train.
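The search pattern described above can be sketched in a few lines of Python. This is only an illustration, not any library's API: `score` is a hypothetical callable that trains a model at the given depth and returns a validation metric where higher is better, and the shallower candidates (4, 3, 2) are an assumption, since the text only says "lower numbers".

```python
from typing import Callable, Dict, Sequence


def search_tree_depth(
    score: Callable[[int], float],
    start: Sequence[int] = (5, 7, 11),
    deeper: Sequence[int] = (17, 23),      # "17, 23, etc." from the text
    shallower: Sequence[int] = (4, 3, 2),  # assumed values for "lower numbers"
) -> int:
    """Return the best depth found; `score` is hypothetical, higher is better."""
    results: Dict[int, float] = {depth: score(depth) for depth in start}
    best = max(results, key=results.get)

    if best == max(start):
        extra = deeper       # deepest start value won: keep increasing
    elif best == min(start):
        extra = shallower    # shallowest start value won: try lower numbers
    else:
        extra = ()           # the middle value won: stop here

    for depth in extra:
        results[depth] = score(depth)
        if results[depth] <= results[best]:
            break            # stopped improving
        best = depth
    return best


if __name__ == "__main__":
    # Toy stand-in for a real validation metric, peaking at depth 17.
    print(search_tree_depth(lambda depth: -(depth - 17) ** 2))
```

In practice `score` would wrap a cross-validated training run, so each call is expensive; that is the reason for testing only a handful of depths.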
The learning rate is related to the number of trees. The more trees, the lower you can make the learning rate. A starting rule of thumb is (1/number of trees). I would consider 5000 trees to be on the high side. 100 is not unusual.
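As a quick worked example of the 1/(number of trees) rule of thumb, the helper below (a hypothetical name, not part of any GBM library) prints the starting learning rate for a few tree counts mentioned above.

```python
def rule_of_thumb_learning_rate(n_trees: int) -> float:
    """Starting learning rate from the 1 / (number of trees) rule of thumb."""
    if n_trees <= 0:
        raise ValueError("n_trees must be positive")
    return 1.0 / n_trees


if __name__ == "__main__":
    for n_trees in (100, 1000, 5000):
        print(n_trees, rule_of_thumb_learning_rate(n_trees))
```

This is only a starting point; the value would still need to be validated on the problem at hand.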
I always use early stopping, so I don't really take the number of trees into consideration when choosing my learning rate. If you're not using early stopping, then the 1/(number of trees) rule of thumb is a good one. However, I would highly recommend using early stopping along with cross-validation if you aren't using it already. I usually start with a learning rate of 0.05 or 0.10 (depending on the size of the dataset and how long I'm willing to train). This is just for tuning the other parameters, i.e. getting them in the vicinity of what is optimal. Things like the optimum tree depth are about the same whether the learning rate is 0.10 or 0.01, so there's no sense in training for longer periods of time with a lower learning rate.
Once I have my parameters set, I will then lower the learning rate to 0.01 or 0.001 (again, this depends on how long you're willing to train and how much improvement you'll get from a lower learning rate, which is problem dependent). At that point, I might do some more fine-tuning with the lower rate (seeing if tree depth should be 9, 10, or 11, for example).
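For readers unfamiliar with early stopping, here is a minimal sketch of the idea in plain Python. It assumes you already have one validation loss per boosting round; the `patience` and `tol` parameters are illustrative assumptions, and real GBM libraries generally ship their own built-in version of this logic.

```python
from typing import Sequence


def early_stopping_round(
    val_losses: Sequence[float],
    patience: int = 5,   # assumed: rounds to wait without improvement
    tol: float = 0.0,    # assumed: minimum decrease that counts as improvement
) -> int:
    """Return the 1-based number of trees to keep (0 if no losses are given)."""
    best_loss = float("inf")
    best_round = 0
    for round_number, loss in enumerate(val_losses, start=1):
        if loss < best_loss - tol:
            best_loss = loss
            best_round = round_number
        elif round_number - best_round >= patience:
            break  # no improvement for `patience` rounds: stop training
    return best_round


if __name__ == "__main__":
    losses = [0.90, 0.70, 0.60, 0.61, 0.62, 0.63, 0.50]
    print(early_stopping_round(losses, patience=3))
    print(early_stopping_round(losses, patience=5))
```

With early stopping in place, the number of trees no longer needs to be chosen by hand: a lower learning rate simply results in more rounds before the validation loss stops improving.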
Start with a col_sample_rate of around 0.7.
I start with 0.7, then try 0.3. If 0.3 is worse, then try 1.0. For most problems I've encountered, that's usually good enough. If you find 0.3 is best, then you can, of course, try 0.2 and 0.4, but I tend not to get any more precise than the nearest tenth, i.e. I rarely use numbers like 0.35 or 0.73 — tuning too granularly tends to lead to overfitting.
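The coarse search over the column sample rate described above could look roughly like the sketch below. `score` is again a hypothetical callable (train with the given rate, return a validation metric where higher is better), and treating a tie between 0.3 and 0.7 as "0.3 is best" is an assumption of this sketch.

```python
from typing import Callable, Dict


def search_col_sample_rate(score: Callable[[float], float]) -> float:
    """Coarse search over rates rounded to the nearest tenth; higher is better."""
    results: Dict[float, float] = {0.7: score(0.7), 0.3: score(0.3)}

    if results[0.3] < results[0.7]:
        # 0.3 is worse than 0.7, so check the other end of the range.
        results[1.0] = score(1.0)
    else:
        # 0.3 looks best, so try its immediate neighbours.
        for rate in (0.2, 0.4):
            results[rate] = score(rate)

    return max(results, key=results.get)


if __name__ == "__main__":
    # Toy stand-in for a real validation metric, peaking near 0.9.
    print(search_col_sample_rate(lambda rate: -abs(rate - 0.9)))
```

The candidates deliberately stay on a grid of tenths, in line with the advice not to tune more finely than that.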
For the minimum improvement required to make a split, start with the default (0.00001) and stay under 0.0005.
I haven't used this on a large number of problems, but I find the default (0.00001) or just setting it to 0 generally works well. If I see that my training loss is much lower than my validation loss, then I will sometimes increase this parameter as it will help keep the model from overfitting. I also believe it will be more effective when tree depth is large; deeper trees are more likely to make splits with small improvements. All of the splits in shallow trees tend to have high improvement, so the minimum will always be exceeded. I don't think I've ever set this value higher than 0.0005.
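One way to turn the advice above into a repeatable check is sketched below. Only the default (0.00001) and the ceiling (0.0005) come from the text; the function name, the `gap_ratio` threshold for "much lower", and the doubling step are assumptions made for illustration, and the losses are assumed to be positive.

```python
DEFAULT_MIN_SPLIT_IMPROVEMENT = 1e-5  # default quoted in the text
MAX_MIN_SPLIT_IMPROVEMENT = 5e-4      # highest value the author reports using


def next_min_split_improvement(
    current: float,
    train_loss: float,
    valid_loss: float,
    gap_ratio: float = 0.8,  # assumed: train < 0.8 * valid counts as "much lower"
    factor: float = 2.0,     # assumed: double the value on each adjustment
) -> float:
    """Suggest the next minimum split improvement to try (hypothetical helper)."""
    overfitting = train_loss < gap_ratio * valid_loss
    if not overfitting:
        return current  # no large train/validation gap: leave it alone
    if current <= 0:
        return DEFAULT_MIN_SPLIT_IMPROVEMENT  # move from 0 up to the default
    return min(current * factor, MAX_MIN_SPLIT_IMPROVEMENT)


if __name__ == "__main__":
    print(next_min_split_improvement(1e-5, train_loss=0.10, valid_loss=0.30))
    print(next_min_split_improvement(1e-5, train_loss=0.29, valid_loss=0.30))
```

Each suggested value would still need to be confirmed by retraining and comparing validation loss.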
Here is a pointer to the detailed parameter documentation for GBM:
Published at DZone with permission of Avkash Chauhan, DZone MVB. See the original article here.