Published January 1, 2008 | Version v1
Conference paper Open

Identification of Coreferential Chains in Video Texts for Semantic Annotation of News Videos

  • 1. TUBITAK Uzay Inst, Power Elect Grp, TR-06531 Ankara, Turkey
  • 2. Middle E Tech Univ, Dept Comp Engn, TR-06531 Ankara, Turkey

Description

People can benefit from today's huge video archives only through appropriate and effective ways of querying the video data. To query video data through high-level semantic entities such as objects, events, and relations, these entities must be properly extracted and the corresponding video shots annotated accordingly. Video texts, which comprise the caption texts on the frames as well as transcription texts obtained through automatic speech recognition techniques, are valuable sources of information for semantic modeling of videos. In this paper, we present an approach for extracting semantic objects from videos by utilizing lexical resources along with the identification of coreference chains in the corresponding video texts. Coreference is a phenomenon in natural language texts where a number of expressions in the text refer to the same real-world entity. Therefore, while the domain-specific lexical resources aid in determining the salient entities in the video text, the identification of coreference chains prevents the same underlying entity from being extracted more than once because of its different surface forms in the video texts. The proposed approach is significant as the first attempt to address the importance of the coreference phenomenon in video texts for precise entity extraction during the semantic modeling of news videos, together with a hands-on application. The approach has been evaluated on Turkish political news texts from the METU Turkish corpus. The evaluation problems encountered, such as the sparseness of annotated evaluation data for Turkish, are pointed out together with directions for further research.
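The role of coreference chains described above can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes the coreference links between mentions are already given, and the names `build_chains`, `extract_entities`, the toy lexicon, and the choice of the longest surface form as a chain's representative are all hypothetical. The mentions are in English purely for readability.

```python
from collections import defaultdict


def build_chains(mentions, coref_pairs):
    """Group mentions into coreference chains using a simple union-find."""
    parent = {m: m for m in mentions}

    def find(m):
        # Follow parent links to the root, shortening the path on the way.
        while parent[m] != m:
            parent[m] = parent[parent[m]]
            m = parent[m]
        return m

    for a, b in coref_pairs:
        parent[find(a)] = find(b)

    chains = defaultdict(list)
    for m in mentions:
        chains[find(m)].append(m)
    return list(chains.values())


def extract_entities(mentions, coref_pairs, lexicon):
    """Return one entity per chain that contains a lexicon term.

    Assumption: the longest surface form stands in for the whole chain.
    """
    entities = []
    for chain in build_chains(mentions, coref_pairs):
        if any(m.lower() in lexicon for m in chain):
            entities.append(max(chain, key=len))
    return entities


# Hypothetical toy input: mentions found in one news text.
mentions = [
    "the Turkish Prime Minister",
    "the Prime Minister",
    "he",
    "the parliament",
    "the assembly",
    "yesterday",
]
# Hypothetical coreference links (assumed to be supplied by a resolver).
coref_pairs = [
    ("the Turkish Prime Minister", "the Prime Minister"),
    ("the Prime Minister", "he"),
    ("the parliament", "the assembly"),
]
# Hypothetical domain-specific lexicon of salient terms (lowercased).
lexicon = {"the prime minister", "the parliament"}

entities = extract_entities(mentions, coref_pairs, lexicon)
print(entities)
```

Without the chains, the six mentions would yield up to five separate candidate entities; with them, the three mentions of the same person collapse into a single entity, and the mention outside the lexicon is dropped.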
