Technical Reports



HP Labs


DRR Research beyond COTS OCR Software: A Survey

Lin, Xiaofan

HPL-2004-167

Keyword(s): document recognition and retrieval; COTS software; OCR

Abstract: After decades of research, Optical Character Recognition (OCR) has reached a relatively mature stage. Commercial off-the-shelf (COTS) OCR software packages have become powerful tools in Document Recognition and Retrieval (DRR) applications. One question naturally arises: what areas are left for new DRR research beyond COTS OCR software? There has been much discussion of this question at recent conferences. This paper attempts to address it through a systematic survey of recently reported DRR projects as well as our own Digital Content Re-Mastering (DCRM) research at HP Labs. The survey shows that custom DRR research is still much needed for better accuracy and reliability, complementary content, or downstream information retrieval. Several concrete observations are also made on the basis of this survey. First, basic character/word recognition is mostly handled by COTS software, with a few exceptions. Second, system-level research on reliability and guaranteed accuracy can seldom be replaced by COTS software. Third, document-level structure understanding still has much room to expand. Fourth, post-OCR information retrieval also poses many challenging research topics.

Notes: Copyright SPIE. To be presented at and published in the SPIE Conference on Document Recognition and Retrieval XII, 16-20 January 2005, San Jose, CA

9 Pages

© 2009 Hewlett-Packard Development Company, L.P.