Published January 1, 2014 | Version v1
Journal article Open

Sentiment-Focused Web Crawling

  • 1. Middle East Technical University, TR-06531 Ankara, Turkey
  • 2. Yahoo Labs, Barcelona, Spain

Description

Sentiments and opinions expressed on Web pages towards objects, entities, and products constitute an important portion of the textual content available on the Web. In the last decade, the analysis of such content has gained importance due to its high potential for monetization. Despite the vast interest in sentiment analysis, somewhat surprisingly, the discovery of sentimental or opinionated Web content has been mostly ignored. This work aims to fill this gap and addresses the problem of quickly discovering and fetching the sentimental content present on the Web. To this end, we design a sentiment-focused Web crawling framework. In particular, we propose different sentiment-focused Web crawling strategies that prioritize discovered URLs based on their predicted sentiment scores. Through simulations, these strategies are shown to achieve considerable performance improvement over general-purpose Web crawling strategies in the discovery of sentimental Web content.
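
The idea of prioritizing discovered URLs by predicted sentiment score can be illustrated with a small best-first crawl simulation. This is only a sketch under stated assumptions and not the strategies proposed in the article: the function name, the toy link graph, and the scoring rule (an unfetched URL inherits the highest sentiment score among the fetched pages that link to it) are all hypothetical choices made for illustration.

```python
import heapq
from itertools import count
from typing import Dict, List


def sentiment_focused_crawl(
    seeds: List[str],
    links: Dict[str, List[str]],
    page_score: Dict[str, float],
    budget: int,
) -> List[str]:
    """Return the fetch order of a best-first crawl over a toy link graph.

    Assumption (hypothetical, for illustration): the predicted score of an
    unfetched URL is the highest sentiment score among the already fetched
    pages that link to it.
    """
    tie = count()  # insertion counter, breaks priority ties in FIFO order
    # Frontier entries are (negated predicted score, tie-breaker, url).
    frontier = [(0.0, next(tie), url) for url in seeds]
    heapq.heapify(frontier)
    fetched: List[str] = []
    seen = set()

    while frontier and len(fetched) < budget:
        _, _, url = heapq.heappop(frontier)
        if url in seen:
            continue  # stale duplicate entry with a lower priority
        seen.add(url)
        fetched.append(url)
        score = page_score.get(url, 0.0)  # simulate fetching and scoring
        for target in links.get(url, []):
            if target not in seen:
                # heapq is a min-heap, so the prediction is negated.
                heapq.heappush(frontier, (-score, next(tie), target))
    return fetched


# Toy graph: a neutral hub and a review hub, each linking to two pages.
toy_links = {
    "seed": ["neutral_hub", "review_hub"],
    "neutral_hub": ["n1", "n2"],
    "review_hub": ["r1", "r2"],
}
toy_scores = {
    "seed": 0.5, "neutral_hub": 0.1, "review_hub": 0.9,
    "r1": 0.8, "r2": 0.7, "n1": 0.0, "n2": 0.0,
}
order = sentiment_focused_crawl(["seed"], toy_links, toy_scores, budget=7)
```

In this toy run the pages linked from the high-scoring hub (`r1`, `r2`) are fetched before those linked from the neutral hub (`n1`, `n2`), even though the latter were discovered first. A breadth-first crawler would fetch them in discovery order instead.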

Files

bib-2c9c4c2c-8c0a-49f5-88c1-9c4e1dea0f01.txt (114 Bytes, md5:e677b1bdc74070eb42ff320fe7bcd60c)