Pragmatic Text Mining: Minimizing Human Effort to Quantify Many Issues in Call Logs
Forman, George; Kirshenbaum, Evan; Suermondt, Jaap
HPL-2006-60R1
Keyword(s): text mining; log processing; supervised machine learning; quantification; text classification; applications; pattern recognition
Abstract: We discuss our experiences in analyzing customer-support issues from the unstructured free-text fields of technical-support call logs. Identifying frequent issues and quantifying them accurately are essential to track aggregate costs broken down by issue type, to target engineering resources appropriately, and to provide the best diagnosis, support, and documentation for the most common issues. We present a new set of techniques for doing this efficiently on an industrial scale, without requiring manual coding of calls in the call center. Our approach involves (1) a new text clustering method to identify common and emerging issues; (2) a method to rapidly train large numbers of categorizers in a practical, interactive manner; and (3) a method to accurately quantify categories, even in the face of inaccurate classifications and training sets that necessarily cannot match the class distribution of each new month's data. We present our methodology and a tool, developed and deployed at HP, that uses these methods to track ongoing support issues and discover emerging ones.
10 Pages