Handwriting Recognition

  • by user1
  • 28 February, 2022

Transcriptions of 400,000 handwritten names

License: CC0: Public Domain

Tags: music, image data, text data, nlp, deep learning

Overview

This dataset consists of more than four hundred thousand handwritten names collected through charity projects.

Character recognition uses image processing to convert characters in scanned documents into digital text. It typically performs well on machine-printed fonts. Handwritten characters, however, remain difficult for machines to recognize because writing styles vary so widely between individuals.

There are 206,799 first names and 207,024 surnames in total. The data was divided into a training set (331,059), a testing set (41,382), and a validation set (41,382).
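As a quick sanity check, the counts quoted above are mutually consistent: the two name types and the three splits add up to the same total, which works out to roughly an 80/10/10 split. A minimal sketch of that arithmetic in Python (the variable names are illustrative and not part of the dataset):

```python
# Counts as quoted in the overview.
first_names = 206_799
surnames = 207_024
splits = {"training": 331_059, "testing": 41_382, "validation": 41_382}

# The name counts and the split counts should describe the same images.
total = first_names + surnames
assert total == sum(splits.values())

# Share of the whole dataset held by each split.
fractions = {name: count / total for name, count in splits.items()}
for name, fraction in fractions.items():
    print(f"{name}: {fraction:.1%}")
```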

Content

The input data are hundreds of thousands of images of handwritten names. In the data, you’ll find the transcribed images split into test, training, and validation sets.

Image labels follow the naming format below, which lets you extend the dataset with your own data.

ImageURL
D2M150010079F00021first name.jpg
D2M150010079F00021surname.jpg
D2M150010079F00032surname.jpg
D2M150010079F00043first name.jpg
D2M150010079F00043surname.jpg
D2M150010079F00054first name.jpg
D2M150010079F00065first name.jpg
D2M150010079F00065surname.jpg
D2M150010079F00076first name.jpg
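Each file name listed above ends in either "first name" or "surname" before the .jpg extension. The following is a minimal Python sketch, assuming that suffix is the only part of the name with a documented meaning, which separates it from the rest of the name. The helper `split_label` is hypothetical, and the leading part of the name is treated as an opaque identifier because its structure is not described here.

```python
from pathlib import PurePosixPath

# The two field types that appear at the end of the listed file names.
FIELD_SUFFIXES = ("first name", "surname")


def split_label(filename):
    """Return (identifier, field) for a name such as 'XYZfirst name.jpg'.

    The identifier is whatever precedes the field suffix; its internal
    layout is not documented, so it is returned unchanged.
    """
    stem = PurePosixPath(filename).stem  # drop the '.jpg' extension
    for field in FIELD_SUFFIXES:
        if stem.endswith(field):
            return stem[: -len(field)], field
    raise ValueError(f"unrecognised field type in {filename!r}")


identifier, field = split_label("D2M150010079F00021first name.jpg")
print(identifier, "|", field)
```

When adding your own images, keeping the same suffix convention lets a helper like this sort new files into first names and surnames without a separate lookup table.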

Inspiration

The inspiration for this dataset is to explore the task of classifying handwritten text and to convert handwritten text into digital format using the various approaches available.

Size: 1321360 KB Price: Free Author: landlord Data source: kaggle.com