Fingerspelling Recognition Using Letter-to-Letter Transitions

Susanna Ricco, Carlo Tomasi

To communicate a proper noun or other uncommon word in American Sign Language (ASL), a signer spells the word by forming a distinct gesture for each letter, a process called fingerspelling. Existing systems that translate fingerspelling sequences into text have achieved impressive recognition rates using a specific representation of the shape of the signer's hand: a skeletal model defined by the bending angle at each joint of the hand. Because these bending angles are difficult to compute from video, such systems rely on sensors like Cybergloves to measure the angle at each joint directly.

We wish to recognize and translate fingerspelling sequences from native signers using only video from a single camera. Our approach focuses on recognizing the gestures corresponding to transitions between letters rather than the letters themselves. Our goal is to describe these transitional gestures using semantically meaningful labels (e.g., extend finger) and then use these descriptions to recognize the starting and ending letters of the transition. Motion during the transitions can be used to distinguish letters whose static handshapes look similar (and would otherwise require a reconstructed skeletal model of the hand to recognize correctly). Additionally, focusing on recognizing transitional motions rather than individual static letters will allow us to generalize to native signers who fingerspell at high speeds without pausing at each letter.
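As a rough illustration of how transition descriptions could constrain the letters on either side, the toy sketch below may help. It is not the method used in this project. It assumes a hypothetical, very coarse handshape encoding (only which fingers are extended, for six letters), and the names `HANDSHAPES`, `transition_label`, and `decode` are invented for the example. A real system would have to estimate the transition labels from video.

```python
from itertools import product

# ASSUMPTION: a coarse, hypothetical encoding of six letters as the set of
# extended fingers. Real ASL handshapes carry far more detail than this.
HANDSHAPES = {
    "B": frozenset({"index", "middle", "ring", "pinky"}),
    "D": frozenset({"index"}),
    "I": frozenset({"pinky"}),
    "L": frozenset({"thumb", "index"}),
    "S": frozenset(),
    "Y": frozenset({"thumb", "pinky"}),
}


def transition_label(start, end):
    """Describe a transition as a set of (action, finger) pairs."""
    before, after = HANDSHAPES[start], HANDSHAPES[end]
    extended = {("extend", finger) for finger in after - before}
    folded = {("fold", finger) for finger in before - after}
    return frozenset(extended | folded)


# Invert the mapping: each transition label lists the letter pairs that
# could have produced it.
TRANSITIONS = {}
for start, end in product(HANDSHAPES, repeat=2):
    if start != end:
        label = transition_label(start, end)
        TRANSITIONS.setdefault(label, []).append((start, end))


def decode(labels):
    """Return every letter string consistent with a sequence of labels."""
    paths = None
    for label in labels:
        pairs = TRANSITIONS.get(label, [])
        if paths is None:
            # First transition: every matching pair starts a candidate.
            paths = [start + end for start, end in pairs]
        else:
            # Later transitions must begin where the candidate ended.
            paths = [
                path + end
                for path in paths
                for start, end in pairs
                if path[-1] == start
            ]
    return paths or []


if __name__ == "__main__":
    single = [transition_label("I", "D")]
    print(decode(single))  # two candidates: I-to-D and Y-to-L

    chained = [transition_label("L", "I"), transition_label("I", "D")]
    print(decode(chained))  # the preceding transition leaves only "LID"
```

In this toy encoding, the label "extend index, fold pinky" alone is ambiguous between I-to-D and Y-to-L, but chaining it with the preceding transition leaves a single consistent spelling. Under these simplifying assumptions, that is the sense in which a sequence of transitions can pin down letters that one observation cannot.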

View our poster for some preliminary results.