ASL Motion-Capture Corpus: First Release

Linguistic and Assistive Technologies Laboratory (LATLab)

Overview

The ASL Motion-Capture Corpus is the result of a multi-year project to collect, annotate, and analyze motion-capture recordings of multi-sentential American Sign Language (ASL) discourse. We are now releasing to the research community the first portion of the corpus that has been checked for quality. The corpus consists of unscripted, single-signer, multi-sentence ASL passages, elicited with prompting strategies designed to encourage signers to use pronominal spatial reference while minimizing the use of classifier predicates. The annotation includes a gloss for each sign, an English translation of each passage, and details about the establishment and use of pronominal reference points in the signing space. Using this data, we are seeking computational models of the referential use of signing space and of spatially inflected verb forms for use in ASL animations, which have accessibility applications for deaf users.

How to Obtain the Files

Please send email to matt at cs.qc.cuny.edu to inquire about accessing the corpus.

What file formats do we release?

For each story that we have recorded, the corpus includes the following types of files:

Gloss text file, with start and end keyframe number for each gloss
Gloss text file for the non-dominant hand (only available for some videos), with start and end keyframe number for each gloss
Referents list text file that identifies the entities that have been established in the signing space during this story
English translation text file
BVH file - motion capture data in a commonly distributed file format
FBX file - motion capture data in MotionBuilder format, as originally recorded at our lab
Videos in MOV format: front, side, face views
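
As an illustration of how the gloss keyframe numbers might be related to time in the motion-capture data, the following minimal Python sketch reads the frame count and frame duration from the MOTION section of a BVH file (the standard "Frames:" and "Frame Time:" header lines) and converts a keyframe number to seconds. It rests on assumptions that are not confirmed by this page: that gloss keyframe numbers index frames of the BVH recording, and that numbering starts at the value given by the hypothetical first_frame parameter. The function names and the small embedded sample are also made up for illustration; the layout of the gloss text files is not specified here, so it is not parsed.

```python
import io


def read_bvh_timing(lines):
    """Return (frame_count, frame_time_seconds) from a BVH file's MOTION header.

    `lines` is any iterable of text lines, such as an open file object.
    """
    frame_count = None
    frame_time = None
    in_motion = False
    for raw in lines:
        line = raw.strip()
        if line == "MOTION":
            in_motion = True
            continue
        if not in_motion:
            continue  # skip the HIERARCHY (skeleton) section
        if line.startswith("Frames:"):
            frame_count = int(line.split(":", 1)[1])
        elif line.startswith("Frame Time:"):
            frame_time = float(line.split(":", 1)[1])
        if frame_count is not None and frame_time is not None:
            return frame_count, frame_time
    raise ValueError("no complete MOTION header found")


def keyframe_to_seconds(keyframe, frame_time, first_frame=0):
    """Convert a keyframe number to seconds from the start of the recording.

    ASSUMPTION: keyframe numbers index BVH frames, counting from `first_frame`.
    """
    return (keyframe - first_frame) * frame_time


# Tiny made-up BVH sample, for illustration only (not corpus data).
SAMPLE_BVH = """HIERARCHY
ROOT Hips
{
  OFFSET 0.0 0.0 0.0
  CHANNELS 3 Xposition Yposition Zposition
  End Site
  {
    OFFSET 0.0 1.0 0.0
  }
}
MOTION
Frames: 2
Frame Time: 0.25
0.0 0.0 0.0
0.1 0.0 0.0
"""

frame_count, frame_time = read_bvh_timing(io.StringIO(SAMPLE_BVH))
print(frame_count, frame_time, keyframe_to_seconds(4, frame_time))
```

With a real file, the same function could be called on an open file object, for example read_bvh_timing(open(path)) with a path of your choosing. Whether keyframe numbering starts at 0 or 1 should be checked against the corpus files before relying on the computed times.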

How many stories and signers are included in this release?

This first release of the corpus contains a total of 98 stories collected from 3 signers. Each story is generally between 30 seconds and 4 minutes long.

Citations and More Information

If you make use of this corpus, please cite the following publication:

Pengfei Lu, Matt Huenerfauth. 2012. "CUNY American Sign Language Motion-Capture Corpus: First Release." Proceedings of the 5th Workshop on the Representation and Processing of Sign Languages: Interactions between Corpus and Lexicon, The 8th International Conference on Language Resources and Evaluation (LREC 2012), Istanbul, Turkey.

Examples of the Data

Sample excerpts of the data contained in the corpus may be available upon request. Please send email to matt at cs.qc.cuny.edu to request access.

Funding Support

This material is based upon work supported in part by the National Science Foundation under award number 0746556.