Published January 1, 2022 | Version v1
Conference paper (Open Access)

Auxiliary Classifier based Residual RNN for Image Captioning

  • 1. Izmir Katip Celebi University, Electrical & Electronics Engineering Graduate Program, Izmir, Turkey
  • 2. Izmir Katip Celebi University, Department of Computer Engineering, Izmir, Turkey
  • 3. University of Surrey, Centre for Vision, Speech and Signal Processing (CVSSP), Guildford, Surrey, England

Description

Image captioning aims to automatically generate natural-language descriptions of visual content. This is useful in several potential applications, such as image understanding and virtual assistants. With recent advances in deep neural networks, the naturalness and semantic quality of generated text in image captioning has improved. However, maintaining the gradient flow between neurons in consecutive layers becomes challenging as the network gets deeper. In this paper, we propose to integrate an auxiliary classifier into the residual recurrent neural network, which enables the gradient flow to reach the bottom layers for enhanced caption generation. Experiments on the MSCOCO and VizWiz datasets demonstrate the advantage of our proposed approach over state-of-the-art approaches on several performance metrics.
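The two ingredients named in the abstract can be sketched in a few lines: a residual recurrent step, whose skip connection h + f(h, x) preserves a direct gradient path through depth, and an auxiliary classifier attached to an intermediate hidden state, which gives bottom layers a shorter route to a loss signal. The following is a minimal NumPy sketch under assumed dimensions; all names, sizes, and initializations are illustrative and do not reproduce the authors' architecture.

```python
import numpy as np

rng = np.random.default_rng(0)
hidden, vocab = 8, 5  # assumed toy sizes, not the paper's settings

def residual_rnn_step(h, x, W, U, b):
    # Residual update: the identity term h keeps a direct gradient path.
    return h + np.tanh(x @ W + h @ U + b)

def aux_classifier(h, V, c):
    # Auxiliary softmax head on an intermediate hidden state, providing
    # extra supervision closer to the bottom layers.
    z = h @ V + c
    e = np.exp(z - z.max())
    return e / e.sum()

# Illustrative parameters for one cell and one auxiliary head.
W = rng.normal(scale=0.1, size=(hidden, hidden))
U = rng.normal(scale=0.1, size=(hidden, hidden))
b = np.zeros(hidden)
V = rng.normal(scale=0.1, size=(hidden, vocab))
c = np.zeros(vocab)

h = np.zeros(hidden)          # initial hidden state
x = rng.normal(size=hidden)   # one input feature vector

h_next = residual_rnn_step(h, x, W, U, b)
aux_probs = aux_classifier(h_next, V, c)  # intermediate word distribution
```

In training, the auxiliary head's loss would be added (typically with a small weight) to the final captioning loss, so gradients reach early layers both through the residual path and through the auxiliary branch.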
