HP Labs India
| Training Dataset hpl-tamil-iwfhr06-train for IWFHR 2006 Online Tamil Handwritten Character Recognition Competition. |
| Description |
The dataset contains approximately 300 isolated samples of each of 156 Tamil “characters” (details), written by native Tamil writers, including school children, university graduates, and adults, from the cities of Bangalore, Karnataka, India and Salem, Tamil Nadu, India. The data was collected using HP TabletPCs and is stored in the standard UNIPEN format.
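As a rough illustration of reading pen trajectories from UNIPEN text, the sketch below collects the points between the `.PEN_DOWN` and `.PEN_UP` keywords. It assumes each coordinate line starts with an X and a Y value; the exact file layout of this dataset is not described here, so the function name `parse_unipen_strokes` and the sample text are hypothetical.

```python
def parse_unipen_strokes(text):
    """Return a list of strokes; each stroke is a list of (x, y) tuples."""
    strokes = []
    current = None  # points of the stroke being read, or None while the pen is up
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line:
            continue
        if line.startswith("."):
            keyword = line.split()[0]
            if keyword == ".PEN_DOWN":
                current = []
            elif keyword == ".PEN_UP":
                if current:
                    strokes.append(current)
                current = None
            # All other keywords (.VERSION, .COORD, ...) are ignored in this sketch.
            continue
        if current is not None:
            fields = line.split()
            try:
                # Assumption: the first two fields are the X and Y coordinates.
                x, y = float(fields[0]), float(fields[1])
            except (IndexError, ValueError):
                continue  # skip lines that are not coordinate pairs
            current.append((x, y))
    if current:
        strokes.append(current)  # input ended while the pen was still down
    return strokes


# Hypothetical sample, not taken from the dataset.
sample = """.VERSION 1.0
.COORD X Y
.PEN_DOWN
10 20
15 25
.PEN_UP
.PEN_DOWN
30 40
.PEN_UP
"""
strokes = parse_unipen_strokes(sample)
print(len(strokes), strokes[0])
```

Coordinate lines that appear while the pen is up are dropped, which is usually what a character recognizer wants.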
An offline version of the data is also available in the form of
bi-level TIFF images, generated from the online data using simple
piecewise linear interpolation with a constant thickening factor
applied.
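The conversion from online strokes to a bi-level image can be sketched as follows. This is a minimal illustration of piecewise linear interpolation with a constant thickening factor, not the tool used to produce the released images: the function name `rasterize`, the square "stamp" used for thickening, and the assumption that stroke coordinates are already scaled to pixel units are all hypothetical. Writing the grid out as TIFF is omitted.

```python
import math


def rasterize(strokes, width, height, thickness=1):
    """Render strokes (lists of (x, y) pixel coordinates) into a 0/1 grid."""
    image = [[0] * width for _ in range(height)]

    def stamp(cx, cy):
        # Constant thickening: mark a square of side 2*thickness + 1 around the point.
        for y in range(cy - thickness, cy + thickness + 1):
            for x in range(cx - thickness, cx + thickness + 1):
                if 0 <= x < width and 0 <= y < height:
                    image[y][x] = 1

    for stroke in strokes:
        if len(stroke) == 1:
            stamp(round(stroke[0][0]), round(stroke[0][1]))
        # Join consecutive samples with straight segments (piecewise linear).
        for (x0, y0), (x1, y1) in zip(stroke, stroke[1:]):
            # At least one interpolation step per pixel along the longer axis.
            steps = max(1, math.ceil(max(abs(x1 - x0), abs(y1 - y0))))
            for i in range(steps + 1):
                t = i / steps
                stamp(round(x0 + t * (x1 - x0)), round(y0 + t * (y1 - y0)))
    return image


# Hypothetical two-segment stroke drawn on a small canvas.
grid = rasterize([[(1, 1), (8, 1), (8, 5)]], width=12, height=8, thickness=0)
for row in grid:
    print("".join("#" if pixel else "." for pixel in row))
```

A larger `thickness` widens every stroke by the same amount, which is one simple reading of a constant thickening factor.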
The data is available only for research use.
| Note |
- To register for the competition, please go to the competition home page.
- Test data (collected as part of the same data collection effort) will be made available later, as per the competition schedule.
- The combined dataset (training + test) will be made available on conclusion of the competition.

| Downloads |
Downloading the dataset implies that you have understood and accepted the terms of the license agreement.
- Online data, UNIPEN format, tar.gz file
  Version 0.2, Released Feb 1, 2006, 30 MB
- Offline (image) data, Bi-level TIFF, tar.gz file
  Released Feb 1, 2006, 25 MB
Note: When these files are downloaded with Internet Explorer on Windows XP, the filename extension is changed to ".tar.tar", which is incorrect. Rename the file so that it ends in ".tar.gz" again after downloading.
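The rename can also be scripted. The sketch below restores the extension with Python's `pathlib`; the helper name `restore_tar_gz` and the filename `dataset.tar.tar` are hypothetical and stand in for whatever name the downloaded file actually has.

```python
import tempfile
from pathlib import Path


def restore_tar_gz(path):
    """Rename 'name.tar.tar' to 'name.tar.gz' and return the resulting path."""
    path = Path(path)
    if not path.name.endswith(".tar.tar"):
        return path  # nothing to fix
    fixed = path.with_name(path.name[: -len(".tar.tar")] + ".tar.gz")
    path.rename(fixed)
    return fixed


# Demonstration on an empty placeholder file in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    broken = Path(tmp) / "dataset.tar.tar"  # hypothetical filename
    broken.write_bytes(b"")
    print(restore_tar_gz(broken).name)  # prints dataset.tar.gz
```

Files that already have a different extension are returned unchanged, so the helper is safe to run on any download.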
Report an issue with this dataset