Technical Reports

HPL-2010-50

Identifying Themes in Social Media and Detecting Sentiments

Pal, Jayanta Kumar; Saha, Abhisek
HP Laboratories

Keyword(s): Social Media, Text mining

Abstract: The recent surge in social media has significantly shaped people's perceptions of technology domains. These perceptions are captured in blogs and forums whose themes relate to the products of several companies. A company may want to track these themes as a source of customer perceptions and to detect user sentiment. Keyword-based approaches to identifying such themes fail to deliver a satisfactory level of accuracy. Here, we address these problems using statistical text mining of blog entries. The crux of the analysis lies in mining quantitative information from textual entries. Once the blog entries relevant to the company and its competitors have been extracted, theme identification is performed using a highly accurate, novel technique termed the 'Best Separators Algorithm'. Logistic regression coupled with a dimension-reduction technique (singular value decomposition) is used to identify the tonality of those blogs. The final analysis shows a significant improvement in accuracy over popular approaches.
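
The tonality step named in the abstract (logistic regression on SVD-reduced text features) can be illustrated with a minimal sketch. This is not the authors' implementation: the toy blog entries, labels, function names, the choice of two components, and the learning-rate settings are all assumptions made for illustration, and the 'Best Separators Algorithm' is not sketched because the abstract does not describe it. The reduction is computed with plain power iteration so that the example needs only the Python standard library.

```python
import math
import random

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def build_vocab(docs):
    # Map every distinct word to a column index.
    words = sorted({w for d in docs for w in d.split()})
    return {w: j for j, w in enumerate(words)}

def vectorize(doc, vocab):
    # Raw term counts; words outside the vocabulary are ignored.
    row = [0.0] * len(vocab)
    for w in doc.split():
        if w in vocab:
            row[vocab[w]] += 1.0
    return row

def top_singular_directions(X, k, iters=300, seed=0):
    # Power iteration with deflation on X^T X: approximates the top-k
    # right singular vectors of the document-term matrix X.
    rng = random.Random(seed)
    found = []
    for _ in range(k):
        v = [rng.random() for _ in X[0]]
        for _ in range(iters):
            for u in found:  # remove components along earlier directions
                c = dot(v, u)
                v = [a - c * b for a, b in zip(v, u)]
            xv = [dot(row, v) for row in X]
            w = [sum(X[i][j] * xv[i] for i in range(len(X)))
                 for j in range(len(v))]
            norm = math.sqrt(dot(w, w))
            v = [a / norm for a in w]
        found.append(v)
    return found

def sigmoid(t):
    return 1.0 / (1.0 + math.exp(-t))

def fit_logistic(Z, y, lr=0.5, epochs=2000):
    # Batch gradient descent on the mean logistic loss.
    w, b, n = [0.0] * len(Z[0]), 0.0, len(Z)
    for _ in range(epochs):
        gw, gb = [0.0] * len(w), 0.0
        for z, t in zip(Z, y):
            err = sigmoid(dot(w, z) + b) - t
            gw = [g + err * a for g, a in zip(gw, z)]
            gb += err
        w = [wj - lr * g / n for wj, g in zip(w, gw)]
        b -= lr * gb / n
    return w, b

# Hypothetical toy corpus: 1 = positive tone, 0 = negative tone.
docs = [
    "great fast reliable printer",
    "great reliable laptop",
    "great fast screen",
    "terrible slow broken printer",
    "terrible broken laptop",
    "terrible slow screen",
]
labels = [1, 1, 1, 0, 0, 0]

vocab = build_vocab(docs)
X = [vectorize(d, vocab) for d in docs]
directions = top_singular_directions(X, k=2)
Z = [[dot(row, v) for v in directions] for row in X]  # reduced features
w, b = fit_logistic(Z, labels)

def predict_positive(doc):
    # Probability that a new entry has positive tone.
    z = [dot(vectorize(doc, vocab), v) for v in directions]
    return sigmoid(dot(w, z) + b)
```

Calling `predict_positive("great reliable screen")` scores a new entry in the reduced space. On a real corpus one would likely use a library SVD routine and weighted term features rather than raw counts; those choices are outside what the abstract states.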

6 Pages

External Posting Date: April 6, 2010. Approved for External Publication
Internal Posting Date: April 6, 2010
