
HP Labs India

Research - Intuitive Multimodal and Gestural Interaction project


Linguistic Resources for Handwriting Recognition

Test Dataset hpl-tamil-iwfhr06-test for IWFHR 2006 Online Tamil Handwritten Character Recognition Competition.

Description

The dataset contains approximately 170 isolated samples of each of 156 Tamil “characters” (26926 samples in total), written by native Tamil writers, including school children, university graduates, and adults, from the cities of Bangalore, Karnataka, and Salem, Tamil Nadu, India. The data was collected using HP TabletPCs and is in standard UNIPEN format.
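As an illustration of reading the online data, here is a minimal Python sketch that collects pen-down strokes from UNIPEN-style text. It assumes each stroke is introduced by a `.PEN_DOWN` keyword, followed by one "x y" coordinate pair per line, and closed by the next keyword line (such as `.PEN_UP`). The function name `parse_unipen_strokes` and the sample text are hypothetical; the exact header fields and coordinate layout of the files in this dataset are not described on this page and should be checked against the data itself.

```python
from typing import List, Optional, Tuple

Point = Tuple[int, int]
Stroke = List[Point]


def parse_unipen_strokes(text: str) -> List[Stroke]:
    """Collect pen-down strokes from UNIPEN-style text (illustrative only)."""
    strokes: List[Stroke] = []
    current: Optional[Stroke] = None  # None while the pen is up

    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("."):
            # Any keyword line ends the stroke being read.
            if current:
                strokes.append(current)
            current = [] if line.split()[0] == ".PEN_DOWN" else None
            continue
        if current is not None:
            fields = line.split()
            if len(fields) >= 2:
                try:
                    # Assumption: the first two fields are x and y.
                    current.append((int(float(fields[0])), int(float(fields[1]))))
                except ValueError:
                    pass  # skip lines that are not coordinates

    if current:
        strokes.append(current)
    return strokes


# Hypothetical sample, not taken from the dataset.
SAMPLE = """.VERSION 1.0
.PEN_DOWN
10 20
11 22
.PEN_UP
.PEN_DOWN
30 40
.PEN_UP
"""

strokes = parse_unipen_strokes(SAMPLE)
```

Each returned stroke is a list of integer (x, y) points in the order they were written.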

The samples have been randomised across writers and classes, and are serially numbered from 00000 to 26925.

The ground truth for the samples can be downloaded from here.

An offline version of the data is also available in the form of bi-level TIFF images, generated from the online data using simple piecewise linear interpolation with a constant thickening factor applied.
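The conversion from online strokes to a bi-level image can be sketched roughly as follows. This is not the code used to produce the released TIFF files: the function name `render_strokes`, the image size, and the square-stamp thickening are assumptions chosen for illustration, and the actual thickening factor is not stated on this page. The sketch joins consecutive pen points with straight segments (piecewise linear interpolation) and marks a fixed-size square around every interpolated point (constant thickening).

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def render_strokes(
    strokes: Sequence[Sequence[Point]],
    width: int,
    height: int,
    thickness: int = 1,
) -> List[List[int]]:
    """Rasterise strokes into a bi-level grid (1 = ink, 0 = background)."""
    image = [[0] * width for _ in range(height)]

    def stamp(x: float, y: float) -> None:
        # Constant thickening: fill a (2 * thickness + 1)-pixel square.
        cx, cy = int(round(x)), int(round(y))
        for yy in range(cy - thickness, cy + thickness + 1):
            for xx in range(cx - thickness, cx + thickness + 1):
                if 0 <= xx < width and 0 <= yy < height:
                    image[yy][xx] = 1

    for stroke in strokes:
        if len(stroke) == 1:
            stamp(*stroke[0])  # a single dot has no segment to interpolate
        for (x0, y0), (x1, y1) in zip(stroke, stroke[1:]):
            # One sample per pixel along the longer axis of the segment.
            steps = max(1, math.ceil(max(abs(x1 - x0), abs(y1 - y0))))
            for i in range(steps + 1):
                t = i / steps
                stamp(x0 + t * (x1 - x0), y0 + t * (y1 - y0))

    return image


# Hypothetical stroke: a horizontal line from (1, 1) to (5, 1).
thin = render_strokes([[(1, 1), (5, 1)]], width=7, height=3, thickness=0)
thick = render_strokes([[(1, 1), (5, 1)]], width=7, height=3, thickness=1)
```

Real tablet coordinates would first need to be scaled and translated into the image frame, and writing the grid out as a TIFF file would require an imaging library; both steps are left out here.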

The data is available only for research use.

Note

To register for the competition, please go to the competition home page.
Training data (collected as part of the same data collection effort) was made available earlier, as per the competition schedule.
The combined dataset (training + test) will be made available on conclusion of the competition.

Related Links

Competition Home Page
Training Dataset hpl-tamil-iwfhr06-train
Ground Truth for the Test Data (new)

Downloads

Downloading the dataset implies that you have understood and accepted the terms of the license agreement.

Online data, UNIPEN format, tar.gz file
Version 1.0, Released May 04, 2006, 15.4 MB
Offline (image) data, Bi-level TIFF, tar.gz file
Version 1.0, Released May 04, 2006, 12.6 MB

Note: When these files are downloaded with Internet Explorer on Windows XP, the filename extension is incorrectly changed to ".tar.tar". Rename the file back to ".tar.gz" after downloading.
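For anyone scripting the download, the rename can be done along these lines. This is a small sketch using only the Python standard library; the helper name `restore_tar_gz_name` is hypothetical, and the archive filename in the usage note is a placeholder, since the actual download filenames are not given on this page.

```python
from pathlib import Path


def restore_tar_gz_name(path: Path) -> Path:
    """Rename '<name>.tar.tar' to '<name>.tar.gz'; leave other names alone."""
    suffix = ".tar.tar"
    if path.name.endswith(suffix):
        fixed = path.with_name(path.name[: -len(suffix)] + ".tar.gz")
        path.rename(fixed)
        return fixed
    return path
```

After renaming, the archive can be unpacked with the standard `tarfile` module, for example `tarfile.open(restore_tar_gz_name(Path("dataset.tar.tar")), "r:gz").extractall("dataset")`, where `dataset.tar.tar` stands in for whatever name the browser saved.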


This page was last updated on January 22, 2010
