Multi-objective Contextual Bandit Problem with Similarity Information

Turgay, Eralp; Oner, Doruk; Tekin, Cem

doi:10.81043/aperta.36243

Published January 1, 2018 | Version v1

Conference paper | Open access


1. Bilkent Univ, Elect & Elect Engn Dept, Ankara, Turkey

In this paper, we propose the multi-objective contextual bandit problem with similarity information. This problem extends the classical contextual bandit problem with similarity information by introducing multiple and possibly conflicting objectives. Since the best arm in each objective can be different given the context, learning the best arm based on a single objective can jeopardize the rewards obtained from the other objectives. In order to evaluate the performance of the learner in this setup, we use a performance metric called the contextual Pareto regret. Essentially, the contextual Pareto regret is the sum of the distances of the arms chosen by the learner to the context-dependent Pareto front. For this problem, we develop a new online learning algorithm called Pareto Contextual Zooming (PCZ), which exploits the idea of contextual zooming to learn the arms that are close to the Pareto front for each observed context by adaptively partitioning the joint context-arm set according to the observed rewards and locations of the context-arm pairs selected in the past. Then, we prove that PCZ achieves $\tilde{O}(T^{(1+d_p)/(2+d_p)})$ Pareto regret, where $d_p$ is the Pareto zooming dimension, which depends on the size of the set of near-optimal context-arm pairs. Moreover, we show that this regret bound is nearly optimal by providing an almost matching $\Omega(T^{(1+d_p)/(2+d_p)})$ lower bound.
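To make the notion of "distance to the Pareto front" concrete, the following is a minimal Python sketch for a single context with a finite set of arms. It is not the PCZ algorithm and is not taken from the paper. It assumes one common definition of the distance: the smallest amount that must be added to every objective of an arm's expected reward vector so that no other arm dominates it. The function names (`dominates`, `pareto_front`, `pareto_gap`, `pareto_regret`) and the reward values are hypothetical, chosen only for illustration.

```python
from typing import List, Sequence

Vector = Sequence[float]


def dominates(u: Vector, v: Vector) -> bool:
    """True if u is at least as good as v in every objective and strictly better in one."""
    return all(a >= b for a, b in zip(u, v)) and any(a > b for a, b in zip(u, v))


def pareto_front(means: Sequence[Vector]) -> List[int]:
    """Indices of the arms whose expected reward vector is not dominated by any other arm."""
    return [i for i, u in enumerate(means) if not any(dominates(v, u) for v in means)]


def pareto_gap(means: Sequence[Vector], arm: int) -> float:
    """Assumed distance to the front: the smallest eps >= 0 such that adding eps to
    every objective of the chosen arm leaves it weakly undominated by all arms."""
    u = means[arm]
    return max(0.0, max(min(b - a for a, b in zip(u, v)) for v in means))


def pareto_regret(means_per_round: Sequence[Sequence[Vector]], chosen: Sequence[int]) -> float:
    """Sum of per-round gaps; each round has its own (context-dependent) reward vectors."""
    return sum(pareto_gap(m, a) for m, a in zip(means_per_round, chosen))


# Hypothetical expected rewards of four arms in two objectives for one fixed context.
example_means = [(0.9, 0.2), (0.2, 0.9), (0.5, 0.5), (0.4, 0.4)]
example_front = pareto_front(example_means)
example_regret = pareto_regret([example_means, example_means], [3, 0])
```

In this example, arms 0, 1, and 2 form the Pareto front, while arm 3 is dominated by arm 2 and has a gap of about 0.1. Choosing arm 3 in one round and arm 0 in the next therefore accumulates a Pareto regret of about 0.1, since arms on the front contribute nothing.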

Files

Files (186 Bytes)

Name	Size
bib-98d03afa-45f5-4015-b64e-948c1a730ab9.txt md5:a4030790b8222bde78db5fb2ab93b262	186 Bytes

	All versions	This version
Views	58	58
Downloads	13	13
Data volume	2.4 kB	2.4 kB


TÜBİTAK ULAKBİM
