Published January 1, 2021 | Version v1
Journal article | Open Access

Leveraging auxiliary image descriptions for dense video captioning

  • 1. Hacettepe University, Department of Computer Engineering, Ankara, Turkey
  • 2. Koç University, Department of Computer Engineering, Istanbul, Turkey
  • 3. Imperial College London, Department of Computing, London, England

Description

Collecting textual descriptions is an especially costly task for dense video captioning, since each event in the video needs to be annotated separately and a long descriptive paragraph needs to be provided. In this paper, we investigate a way to mitigate this heavy burden and propose to leverage captions of visually similar images as auxiliary context. Our model fetches visually relevant images and combines noun and verb phrases from their captions to generate coherent descriptions. To this end, we use a generator and discriminator design, together with an attention-based fusion technique, to incorporate image captions as context in the video caption generation process. Experiments on the challenging ActivityNet Captions dataset demonstrate that our approach produces more accurate and more diverse video descriptions than a strong baseline, as measured by METEOR, BLEU and CIDEr-D metrics and by qualitative evaluations.
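The description mentions an attention-based fusion step that injects features of retrieved image captions into the video caption decoder. The record itself includes no code, so the following PyTorch sketch is only a minimal illustration of that general idea; the class name, layer choices, and dimensions are assumptions for illustration, not the authors' implementation.

import torch
import torch.nn as nn

class CaptionFusion(nn.Module):
    # Illustrative sketch only: the decoder state attends over features of
    # captions retrieved for visually similar images, and the attended
    # context is fused with the video feature before word prediction.
    # All names and dimensions here are hypothetical.
    def __init__(self, hidden_dim, caption_dim, video_dim):
        super().__init__()
        self.query = nn.Linear(hidden_dim, caption_dim)
        self.fuse = nn.Linear(hidden_dim + caption_dim + video_dim, hidden_dim)

    def forward(self, decoder_state, caption_feats, video_feat):
        # decoder_state: (B, H), caption_feats: (B, N, C), video_feat: (B, V)
        q = self.query(decoder_state).unsqueeze(2)                    # (B, C, 1)
        scores = torch.bmm(caption_feats, q).squeeze(2)               # (B, N)
        weights = torch.softmax(scores, dim=1)                        # one weight per retrieved caption
        context = (weights.unsqueeze(2) * caption_feats).sum(dim=1)   # (B, C)
        fused = torch.tanh(self.fuse(
            torch.cat([decoder_state, context, video_feat], dim=1)))  # (B, H)
        return fused, weights

# Example usage with toy tensors:
fusion = CaptionFusion(hidden_dim=512, caption_dim=300, video_dim=1024)
state = torch.randn(2, 512)       # decoder hidden state
caps = torch.randn(2, 5, 300)     # features of 5 retrieved image captions
video = torch.randn(2, 1024)      # clip-level video feature
fused, weights = fusion(state, caps, video)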
