Bridging English to ASL in real time for inclusive communication.
Talk2Sign is a Progressive Web App that instantly translates text, audio, and video into American Sign Language, empowering Deaf and Hard-of-Hearing users to access multimedia seamlessly.
Explored state-of-the-art ASL translation systems; pivoted from avatar animation to real clips from the ASLLVD (American Sign Language Lexicon Video Dataset), driven by English-to-gloss translation.
Fine-tuned T5-Base on a custom English-to-gloss dataset, applying token filtering, normalization, and manual gloss corrections.
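A minimal sketch of that fine-tuning step, assuming the Hugging Face transformers and datasets libraries; the file name english_gloss.csv, its english/gloss columns, the task prefix, and all hyperparameters are illustrative placeholders, not the project's actual values.

```python
# Sketch of fine-tuning T5-Base on English -> ASL gloss pairs.
# "english_gloss.csv" and its column names are hypothetical.
from datasets import load_dataset
from transformers import (T5TokenizerFast, T5ForConditionalGeneration,
                          Seq2SeqTrainer, Seq2SeqTrainingArguments,
                          DataCollatorForSeq2Seq)

tokenizer = T5TokenizerFast.from_pretrained("t5-base")
model = T5ForConditionalGeneration.from_pretrained("t5-base")

# One English sentence and its ASL gloss per row.
data = load_dataset("csv", data_files="english_gloss.csv")["train"]

def preprocess(batch):
    # A task prefix tells T5 what to do; inputs and targets are tokenized separately.
    inputs = tokenizer(
        ["translate English to ASL gloss: " + s for s in batch["english"]],
        max_length=64, truncation=True)
    targets = tokenizer(text_target=batch["gloss"], max_length=64, truncation=True)
    inputs["labels"] = targets["input_ids"]
    return inputs

tokenized = data.map(preprocess, batched=True, remove_columns=data.column_names)

args = Seq2SeqTrainingArguments(
    output_dir="t5-asl-gloss",
    per_device_train_batch_size=16,
    learning_rate=3e-4,            # illustrative hyperparameters
    num_train_epochs=5,
)
trainer = Seq2SeqTrainer(
    model=model, args=args, train_dataset=tokenized,
    data_collator=DataCollatorForSeq2Seq(tokenizer, model=model),
)
trainer.train()
```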
React PWA frontend → Flask backend → T5 inference → gloss mapped to video segments → stitched with FFmpeg.
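The backend glue could look roughly like the sketch below. The checkpoint directory t5-asl-gloss, the /translate route, the task prefix, and the toy CLIP_INDEX mapping are all assumptions for illustration; the real gloss-to-clip index would cover the ASLLVD vocabulary.

```python
# Sketch of the Flask layer between the React PWA and the model.
from flask import Flask, request, jsonify
from transformers import T5TokenizerFast, T5ForConditionalGeneration

app = Flask(__name__)
# "t5-asl-gloss" is the hypothetical fine-tuned checkpoint directory.
tokenizer = T5TokenizerFast.from_pretrained("t5-asl-gloss")
model = T5ForConditionalGeneration.from_pretrained("t5-asl-gloss")

# Toy gloss-to-clip index; the real one maps the full ASLLVD vocabulary.
CLIP_INDEX = {"HELLO": "clips/hello.mp4", "THANK-YOU": "clips/thank_you.mp4"}

def translate_to_gloss(text: str) -> list[str]:
    ids = tokenizer("translate English to ASL gloss: " + text,
                    return_tensors="pt").input_ids
    out = model.generate(ids, max_length=64)
    return tokenizer.decode(out[0], skip_special_tokens=True).split()

@app.route("/translate", methods=["POST"])
def translate():
    glosses = translate_to_gloss(request.get_json()["text"])
    # Drop glosses without a matching clip, then hand the list to FFmpeg.
    clips = [CLIP_INDEX[g] for g in glosses if g in CLIP_INDEX]
    return jsonify({"gloss": glosses, "clips": clips})
```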
Low BLEU scores, incorrect token mapping, and FFmpeg merge glitches; solved via better preprocessing, caching, and segment filtering.
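A minimal sketch of the caching side of those fixes, reusing the hypothetical translate_to_gloss helper from the sketch above; the cache size and file-naming scheme are illustrative.

```python
# Cache translations and stitched outputs so repeated requests skip
# T5 inference and FFmpeg entirely.
from functools import lru_cache
import hashlib

@lru_cache(maxsize=512)
def cached_gloss(text: str) -> tuple[str, ...]:
    # Tuples are hashable, so identical sentences hit the cache.
    return tuple(translate_to_gloss(text))  # helper from the sketch above

def stitched_filename(glosses: tuple[str, ...]) -> str:
    # Deterministic name lets identical gloss sequences reuse one stitched video.
    return hashlib.sha1(" ".join(glosses).encode()).hexdigest() + ".mp4"
```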
Live demo supports text, audio, image, and YouTube input with Firebase Auth, AssemblyAI ASR, and Google Vision OCR.
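Each input type is normalized to plain English text before translation. A sketch under the assumption that the demo uses the assemblyai and google-cloud-vision Python SDKs (the services are named above; the exact calls here are my assumption):

```python
# Normalize audio and image inputs to English text for the translator.
import assemblyai as aai
from google.cloud import vision

aai.settings.api_key = "YOUR_ASSEMBLYAI_KEY"    # placeholder credential

def text_from_audio(path: str) -> str:
    # AssemblyAI handles upload and transcription in one call.
    transcript = aai.Transcriber().transcribe(path)
    return transcript.text or ""

def text_from_image(path: str) -> str:
    # Requires GOOGLE_APPLICATION_CREDENTIALS to be configured.
    client = vision.ImageAnnotatorClient()
    with open(path, "rb") as f:
        image = vision.Image(content=f.read())
    response = client.text_detection(image=image)
    annotations = response.text_annotations
    return annotations[0].description if annotations else ""
```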
Translated 400+ phrases across 4 input types with ~4s latency. Fully modular and accessible from any modern browser.
Brainstormed concept; literature survey on ASL translation methods.
Created wireframes and mockups; validated design with user feedback.
Cleaned and tokenized ASLLVD dataset; normalized gloss tokens and split data for training.
Explored LSTM, CNN, and 3D-CNN approaches; evaluated performance and feasibility.
Switched to fine-tuning T5 (small → base); optimized hyperparameters.
Built React PWA; integrated Firebase Auth, Google Vision OCR, and AssemblyAI ASR.
Connected the Flask API to the model; set up YouTube subtitle ingestion and the full pipeline (see the subtitle sketch after the timeline).
Implemented FFmpeg logic to stitch ASLLVD clips into continuous ASL videos (see the stitching sketch after the timeline).
Final integration of all modules.
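For the YouTube input path, one possible way to fetch subtitles is sketched below, assuming the youtube-transcript-api package; the project's actual "YouTube Subtitles API" wiring may differ.

```python
# Pull a video's subtitles as plain text for translation.
from youtube_transcript_api import YouTubeTranscriptApi

def text_from_youtube(video_id: str) -> str:
    # Each entry has "text", "start", and "duration" keys.
    entries = YouTubeTranscriptApi.get_transcript(video_id)
    return " ".join(e["text"] for e in entries)
```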
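And a minimal sketch of the FFmpeg stitching step using the concat demuxer; clip paths and the output name are illustrative.

```python
# Stitch per-gloss ASLLVD clips into one continuous ASL video.
import os
import subprocess
import tempfile

def stitch_clips(clip_paths: list[str], out_path: str = "asl_output.mp4") -> str:
    # The concat demuxer reads a list file with one "file 'path'" line per clip.
    with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as f:
        for clip in clip_paths:
            f.write(f"file '{os.path.abspath(clip)}'\n")
        list_file = f.name
    # "-c copy" assumes uniform codecs across clips; re-encode
    # (e.g. "-c:v", "libx264") if the segments differ.
    subprocess.run(
        ["ffmpeg", "-y", "-f", "concat", "-safe", "0",
         "-i", list_file, "-c", "copy", out_path],
        check=True,
    )
    os.unlink(list_file)
    return out_path
```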
Expand into medical, legal, and other specialized vocabularies.
Enable ASL translation without a network connection, ideal for low-connectivity scenarios.
Curate larger, more diverse gloss-to-video corpora and apply transfer learning to boost accuracy.
💬 Got questions, feedback, or want to collaborate? Drop us a line at talk2sign@gmail.com