American Sign Language-English Machine Translation
I worked on a government-funded research and development project that built a bidirectional communication system between signers and speakers. I led a team of linguists and engineers through the five-year project to completion, exploring both end-to-end Transformer and “segment, classify, decode” pipeline approaches. The team developed a deep neural network-based sign recognition model that recognizes American Sign Language (ASL), covering both sign glosses and fingerspelling, and an ASL-English MT model that translates the recognized glosses and fingerspelling into English. The demo also translated spoken English into sign glosses and fingerspelling, which were animated by an avatar.
Here are links to the research paper and related open-source code from the early stage of the project:
- Improving American Sign Language Recognition with Synthetic Data, MT Summit 2019
- https://github.com/dragonfly-asl/ASLRFeatureExtractor
- https://github.com/dragonfly-asl/SyntheticDataGenerator
The final demo system is not publicly available (yet?); the project sponsor is continuing to refine and maintain it.
Neural Machine Translation
- Denoising objectives on monolingual/multilingual text combined with MT objectives for training NMT (Raffel et al., 2019; Ma et al., 2021; Chi et al., 2021), with multi-task, multi-language-pair learning
- Document-level LLM Prompt-based MT
- Tied Transformers (Xia et al., 2019)
- Retrieval Augmented NMT (Hoang et al., 2023)
- Position Encoders for LLMs: ALiBi (Press et al., 2022), RoPE (Su et al., 2021), XPOS (Sun et al., 2022)
- Segment merger and splitter for Dialogue MT
- Multimodal (Video, Audio, Text) Transformers
- CTC Decoding with alignment
- SVD-Softmax (Shim et al., 2017), in Lua Torch and CUDA
Cross-Language Information Retrieval
- Learning Semantics with Deep Belief Networks (COLING 2012)
- Translating Queries with Pseudo-Contextual Information (Unpublished, PDF)
- Cross-lingual Link Discovery
- Patent CLIR
Multilingual Natural Language Processing
- Evaluating Multilanguage-Comparability of Subjectivity Analysis Systems (ACL 2010)
- Discovering the Discriminative Views: Measuring Term Weights for Sentiment Analysis (ACL-IJCNLP 2009)
- Conveying Subjectivity of a Lexicon of One Language into Another Using a Bilingual Dictionary and a Link Analysis Algorithm (ICCPOL 2009)