Theoretical Aspects in Penalty Hyperparameters Optimization
Esposito F.; Selicato L.
2023-01-01
Abstract
Learning processes play an important role in understanding and analyzing real phenomena. Most learning methodologies revolve around solving penalized optimization problems. A significant challenge is the choice of the penalty hyperparameter, which is typically user-specified or determined through grid search. Automated tuning procedures for estimating these hyperparameters are lacking, particularly in unsupervised learning scenarios. In this paper, we focus on the unsupervised context and propose a bi-level strategy for tuning the penalty hyperparameter. We establish suitable conditions for the existence of a minimizer in an infinite-dimensional Hilbert space and present some related theoretical considerations. These results can be applied in situations where obtaining an exact minimizer is infeasible. For hyperparameter estimation with gradient-based methods, we also introduce a modified version of Ekeland’s principle as a stopping criterion. Our approach differs from conventional techniques by reducing reliance on random or black-box strategies, resulting in stronger mathematical generalization.
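For orientation, the bi-level setting and the classical result behind a gradient-based stopping criterion can be sketched as follows. This is a generic textbook-style formulation, not the paper's own statement: the symbols \mathcal{H}, \Lambda, f, \mathcal{R}, \mathcal{L}, F, u and \varepsilon are assumptions introduced here for illustration, and the paper's modified version of Ekeland's principle may differ from the classical corollary shown.

```latex
% Generic bi-level problem (illustrative notation, not taken from the paper).
% Inner problem: fit the model w for a fixed penalty weight \lambda.
% Outer problem: choose \lambda by minimizing an outer criterion \mathcal{L}.
\begin{align*}
  \min_{\lambda \in \Lambda} \quad & \mathcal{L}\bigl(w^{*}(\lambda)\bigr) \\
  \text{s.t.} \quad & w^{*}(\lambda) \in \operatorname*{arg\,min}_{w \in \mathcal{H}}
      \; f(w) + \lambda\, \mathcal{R}(w)
\end{align*}

% Classical corollary of Ekeland's variational principle (assumed setting):
% let F : \mathcal{H} \to \mathbb{R} be lower semicontinuous, bounded below,
% and G\^ateaux differentiable on a Hilbert space \mathcal{H}.
% Then for every \varepsilon > 0 there exists u_{\varepsilon} \in \mathcal{H} with
\begin{align*}
  F(u_{\varepsilon}) \le \inf_{\mathcal{H}} F + \varepsilon,
  \qquad
  \|\nabla F(u_{\varepsilon})\| \le \sqrt{\varepsilon}.
\end{align*}
```

Under these assumptions, approximate minimizers with small gradient norm exist even when no exact minimizer does, which is why a test such as \|\nabla F(u_k)\| \le \sqrt{\varepsilon} is a natural candidate for stopping a gradient-based iteration.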