Published January 1, 2022 | Version v1 | Conference paper | Open Access
Auxiliary Classifier based Residual RNN for Image Captioning
- 1. Izmir Katip Celebi University, Electrical & Electronics Engineering Graduate Program, Izmir, Turkey
- 2. Izmir Katip Celebi University, Department of Computer Engineering, Izmir, Turkey
- 3. University of Surrey, Centre for Vision, Speech and Signal Processing (CVSSP), Guildford, Surrey, England
Description
Image captioning aims to automatically generate natural-language descriptions of visual content. This is useful in several potential applications, such as image understanding and virtual assistants. Recent advances in deep neural networks have improved the naturalness and semantic quality of generated captions. However, maintaining the gradient flow between neurons in consecutive layers becomes challenging as the network gets deeper. In this paper, we propose to integrate an auxiliary classifier into the residual recurrent neural network, which enables the gradient flow to reach the bottom layers for enhanced caption generation. Experiments on the MSCOCO and VizWiz datasets demonstrate the advantage of our proposed approach over state-of-the-art approaches on several performance metrics.
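The abstract's core idea, a residual recurrence combined with an auxiliary classifier head that injects extra gradient signal near the lower layers, can be illustrated with a minimal forward-pass sketch. This is not the authors' implementation; all sizes, weight names, and the plain-tanh cell are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

# Hypothetical sizes: hidden dim d, vocabulary V, caption length T.
d, V, T = 32, 100, 5
Wx = rng.normal(scale=0.1, size=(d, d))  # input projection (assumed)
Wh = rng.normal(scale=0.1, size=(d, d))  # recurrent weights (assumed)
Wo = rng.normal(scale=0.1, size=(V, d))  # main word classifier (assumed)
Wa = rng.normal(scale=0.1, size=(V, d))  # auxiliary classifier head (assumed)

h = np.zeros(d)
main_logits, aux_logits = [], []
for t in range(T):
    x = rng.normal(size=d)  # stand-in for a word/image-feature embedding
    # Residual recurrence: the skip connection h + f(x, h) gives
    # gradients a direct path through the unrolled deep network.
    h = h + np.tanh(Wx @ x + Wh @ h)
    # The auxiliary head taps the hidden state mid-network; during
    # training its loss adds gradient signal closer to the bottom layers
    # (total loss would be main_loss + lambda * aux_loss).
    aux_logits.append(Wa @ h)
    main_logits.append(Wo @ h)

main_probs = softmax(np.stack(main_logits))  # per-step word distributions
aux_probs = softmax(np.stack(aux_logits))
print(main_probs.shape, aux_probs.shape)
```

During training, the auxiliary classifier's cross-entropy loss would be weighted and added to the main captioning loss, then discarded at inference time, so the extra head costs nothing when generating captions.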