Hyperparameter optimization methods in machine learning

Pozdniakovych O.; Позднякович О.Є.

doi:10.33111/mise.104.12

Modeling and Information System in Economics

ISSN 2708-9746

Oleksandr Pozdniakovych


Abstract: The article examines modern approaches to hyperparameter optimization in machine learning, which are critical to the accuracy and generalization ability of deep learning models. It considers how model performance depends on the choice of hyperparameters such as the learning rate, regularization parameters, and network architecture. The strengths and limitations of the main optimization methods are analyzed: basic methods (grid search, random search), Bayesian optimization, evolutionary algorithms, and variable-fidelity methods (successive halving, Hyperband). Special attention is given to Bayesian optimization with surrogate models based on Gaussian processes, which predict both the value of the objective function and the associated level of uncertainty. Acquisition functions such as Expected Improvement, Probability of Improvement, and Upper Confidence Bound are described as the mechanisms that guide the selection of new hyperparameters. The article shows that efficient use of computational resources, particularly through variable-fidelity methods, is a key factor in large-scale model tuning. Evolutionary algorithms are found to offer flexibility in exploring the hyperparameter space, although at high computational cost. The conclusions emphasize the importance of automating hyperparameter tuning and integrating these methods into the model life cycle, and point to scalable and distributed computing as a promising direction for further research. The results are of practical value for building high-performing models in resource-constrained environments with strict requirements for predictive accuracy.
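The three acquisition functions named in the abstract can be illustrated with a minimal sketch. It is not the author's implementation: it assumes a maximization problem and a surrogate (for example a Gaussian process) that has already supplied a predictive mean `mu` and standard deviation `sigma` for a candidate point. The function names and the parameters `xi` (exploration margin) and `kappa` (confidence width) are hypothetical choices made for illustration.

```python
import math


def _normal_pdf(z):
    """Density of the standard normal distribution at z."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def _normal_cdf(z):
    """Cumulative distribution function of the standard normal at z."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def probability_of_improvement(mu, sigma, best, xi=0.0):
    """Probability that the candidate exceeds the best observed value by xi."""
    if sigma <= 0.0:
        # No predictive uncertainty: improvement is either certain or impossible.
        return 1.0 if mu > best + xi else 0.0
    return _normal_cdf((mu - best - xi) / sigma)


def expected_improvement(mu, sigma, best, xi=0.0):
    """Expected amount by which the candidate exceeds the best observed value."""
    if sigma <= 0.0:
        return max(mu - best - xi, 0.0)
    z = (mu - best - xi) / sigma
    return (mu - best - xi) * _normal_cdf(z) + sigma * _normal_pdf(z)


def upper_confidence_bound(mu, sigma, kappa=2.0):
    """Optimistic estimate: predictive mean plus kappa standard deviations."""
    return mu + kappa * sigma


if __name__ == "__main__":
    # Hypothetical surrogate predictions (mean, std) for three candidates.
    candidates = {"a": (0.80, 0.01), "b": (0.78, 0.10), "c": (0.70, 0.30)}
    best_so_far = 0.79
    for name, (mu, sigma) in candidates.items():
        print(
            name,
            round(probability_of_improvement(mu, sigma, best_so_far), 3),
            round(expected_improvement(mu, sigma, best_so_far), 3),
            round(upper_confidence_bound(mu, sigma), 3),
        )
```

In a Bayesian optimization loop, the candidate with the highest acquisition value would typically be evaluated next. Larger `xi` or `kappa` shifts the choice toward uncertain regions of the hyperparameter space.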

Keywords: hyperparameter optimization, machine learning, deep neural networks.

UDC: 004.8:004.21

To cite paper

In APA style

Pozdniakovych, O. (2024). Hyperparameter optimization methods in machine learning. Modeling and Information System in Economics, 104, 135-143. http://doi.org/10.33111/mise.104.12

In MON style

Позднякович О.Є. Методи оптимізації гіперпараметрів у машинному навчанні. Моделювання та інформаційні системи в економіці. 2024. № 104. С. 135-143. http://doi.org/10.33111/mise.104.12 (дата звернення: 04.10.2025).

With transliteration

Pozdniakovych, O. (2024) Metody optymizatsii hiperparametriv u mashynnomu navchanni [Hyperparameter optimization methods in machine learning]. Modeling and Information System in Economics, no. 104. pp. 135-143. http://doi.org/10.33111/mise.104.12 [in Ukrainian] (accessed 04 Oct 2025).

# 104 / 2024


