Reoptimization Nearly Solves Weakly Coupled Markov Decision Processes - Systèmes Répartis, Calcul Parallèle et Réseaux
Preprint / Working Paper, Year: 2024

Reoptimization Nearly Solves Weakly Coupled Markov Decision Processes

Abstract

We propose a new policy, called the LP-update policy, to solve finite-horizon weakly coupled Markov decision processes. The latter can be seen as multi-constraint, multi-action bandits, and generalize classical restless bandit problems. Our solution is based on periodically re-solving a relaxed version of the original problem, which can be cast as a linear program (LP). When the problem consists of $N$ statistically identical sub-components, we show that the LP-update policy becomes asymptotically optimal at rate $O(T^2/\sqrt{N})$. This rate improves to $O(T/\sqrt{N})$ if the problem satisfies an ergodicity property, and to $O(1/N)$ if the problem is non-degenerate. Our definition of non-degeneracy extends the corresponding notion for restless bandits. Using this property, we also improve the computational efficiency of the LP-update policy. We illustrate the performance of our policy on randomly generated examples as well as on a generalized applicant screening problem, and show that it outperforms existing heuristics.
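The relaxation at the heart of the policy can be sketched as an occupation-measure LP over a representative arm. The following is a minimal illustrative sketch, not the authors' implementation: the function name `solve_lp_relaxation`, the toy two-state transition matrices, rewards, budget fraction `alpha`, and horizon `T` are all assumptions. The full LP-update policy would re-solve this program from the current empirical state distribution at each decision epoch and round the resulting fractions into per-arm actions.

```python
import numpy as np
from scipy.optimize import linprog

def solve_lp_relaxation(m0, P, r, alpha, T):
    """LP relaxation of a weakly coupled MDP with N statistically identical
    arms (illustrative sketch). Variable y[t, s, a] is the fraction of arms
    in state s taking action a at time t; the coupling constraint caps the
    fraction of arms playing the "active" action a = 1 at alpha per step."""
    S, A = r.shape
    n = T * S * A
    idx = lambda t, s, a: (t * S + s) * A + a

    # Maximize total expected reward, i.e. minimize its negation.
    c = np.zeros(n)
    for t in range(T):
        for s in range(S):
            for a in range(A):
                c[idx(t, s, a)] = -r[s, a]

    # Equality constraints: initial distribution and flow conservation.
    A_eq, b_eq = [], []
    for s in range(S):
        row = np.zeros(n)
        for a in range(A):
            row[idx(0, s, a)] = 1.0
        A_eq.append(row); b_eq.append(m0[s])
    for t in range(T - 1):
        for s2 in range(S):
            row = np.zeros(n)
            for a in range(A):
                row[idx(t + 1, s2, a)] = 1.0
            for s in range(S):
                for a in range(A):
                    row[idx(t, s, a)] -= P[a][s, s2]
            A_eq.append(row); b_eq.append(0.0)

    # Coupling (budget) constraint at each time step.
    A_ub, b_ub = [], []
    for t in range(T):
        row = np.zeros(n)
        for s in range(S):
            row[idx(t, s, 1)] = 1.0
        A_ub.append(row); b_ub.append(alpha)

    return linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq,
                   bounds=[(0.0, 1.0)] * n)

# Toy two-state, two-action instance (all numbers are illustrative).
P = [np.array([[0.9, 0.1], [0.6, 0.4]]),   # passive transitions (a = 0)
     np.array([[0.7, 0.3], [0.2, 0.8]])]   # active transitions (a = 1)
r = np.array([[0.0, 0.5], [0.0, 1.0]])     # reward for (state, action)
res = solve_lp_relaxation(m0=np.array([1.0, 0.0]), P=P, r=r, alpha=0.4, T=5)
```

The optimal value of this LP upper-bounds the achievable reward of the original $N$-arm problem; the asymptotic-optimality rates above quantify how closely the LP-update policy attains it.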
Main file: LP_update_for_weakly_coupled_MDP.pdf (946.02 KB)
Origin: Files produced by the author(s)

Dates and versions

hal-04570177, version 1 (06-05-2024)

License

Attribution

Identifiers

  • HAL Id: hal-04570177, version 1

Cite

Nicolas Gast, Bruno Gaujal, Chen Yan. Reoptimization Nearly Solves Weakly Coupled Markov Decision Processes. 2024. ⟨hal-04570177⟩