Talk2Sign

Project Summary

Research & Idea

Explored state-of-the-art ASL systems; pivoted from avatar animation to real ASLLVD videos using gloss translation.

Model Training

Fine-tuned T5-Base on custom English-gloss dataset. Applied token filtering, normalization, and manual gloss corrections.
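The normalization and token-filtering step might look roughly like the sketch below. The vocabulary `KNOWN_GLOSSES` and the helper names `normalize_gloss` and `filter_tokens` are hypothetical, chosen for illustration; the project's actual rules are not described here.

```python
import re

# Hypothetical vocabulary: gloss tokens assumed to have a matching video clip.
KNOWN_GLOSSES = {"I", "GO", "STORE", "TOMORROW"}


def normalize_gloss(raw: str) -> list[str]:
    """Replace punctuation with spaces, split on whitespace, and uppercase each token."""
    cleaned = re.sub(r"[^A-Za-z0-9\- ]+", " ", raw)
    return [tok.upper() for tok in cleaned.split()]


def filter_tokens(tokens: list[str], vocab: set[str]) -> list[str]:
    """Keep only tokens present in the vocabulary, preserving order."""
    return [tok for tok in tokens if tok in vocab]
```

Calling `filter_tokens(normalize_gloss("tomorrow, i go store!"), KNOWN_GLOSSES)` keeps all four tokens in their original order.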

System Design

React PWA frontend → Flask backend → T5 inference → gloss mapped to video segments → stitched with FFmpeg.
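As a minimal sketch of the "gloss mapped to video segments" stage, assuming a simple dictionary lookup from gloss token to clip path: `CLIP_INDEX`, the file paths, and `map_gloss_to_clips` are illustrative names, not the project's actual code.

```python
from pathlib import Path

# Hypothetical index from gloss token to a pre-cut clip file.
CLIP_INDEX = {
    "HELLO": Path("clips/hello.mp4"),
    "YOU": Path("clips/you.mp4"),
    "NAME": Path("clips/name.mp4"),
}


def map_gloss_to_clips(
    tokens: list[str], index: dict[str, Path]
) -> tuple[list[Path], list[str]]:
    """Return the clips for known tokens (in order) and the tokens with no clip."""
    clips: list[Path] = []
    missing: list[str] = []
    for tok in tokens:
        path = index.get(tok)
        if path is None:
            missing.append(tok)  # no video available for this gloss
        else:
            clips.append(path)
    return clips, missing
```

Returning the unmatched tokens separately lets the caller decide how to handle them (skip, fingerspell, or report), which this sketch leaves open.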

Challenges

Low BLEU scores, incorrect token mapping, and FFmpeg merge glitches. Solved via preprocessing, caching, and segment filtering.
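One common way to merge clips is FFmpeg's concat demuxer, sketched below with a filter that drops segments whose files are missing. Whether the project uses the concat demuxer or another method is an assumption, and the function names are hypothetical.

```python
import os
from typing import Callable, Iterable


def build_concat_list(
    clips: Iterable[str], exists: Callable[[str], bool] = os.path.exists
) -> str:
    """Build the text of a concat-demuxer list file, skipping clips that do not exist.

    Paths containing single quotes would need escaping; this sketch ignores that case.
    """
    return "".join(f"file '{clip}'\n" for clip in clips if exists(clip))


def build_ffmpeg_command(list_path: str, output_path: str) -> list[str]:
    """Return an ffmpeg argument list that concatenates the listed clips without re-encoding."""
    return [
        "ffmpeg", "-y",
        "-f", "concat", "-safe", "0",
        "-i", list_path,
        "-c", "copy",
        output_path,
    ]
```

Stream copy (`-c copy`) only produces clean output when all clips share the same codec and encoding parameters; clips that differ would need re-encoding instead, which can be one source of merge glitches. The command list can be passed to `subprocess.run` once the list file has been written.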

Deployment

Live demo supports text/audio/image/YouTube input with Firebase Auth, AssemblyAI ASR, Google Vision OCR.

Impact

Translated 400+ phrases across 4 input types with ~4s latency. Fully modular and accessible from any modern browser.

Project Timeline

Jul '24

🧠

Idea & Research

Brainstormed concept; literature survey on ASL translation methods.

Oct '24

🎨

UI/UX Design

Created wireframes and mockups; validated design with user feedback.

Nov '24

🔍

Data Preprocessing

Cleaned and tokenized ASLLVD dataset; normalized gloss tokens and split data for training.

Dec '24

⚙️

Model Prototyping & Fine-Tuning; Frontend & API Implementation

Explored LSTM, CNN, and 3D-CNN approaches; evaluated performance and feasibility. Switched to fine-tuning T5 (small→base); optimized hyperparameters. Built React PWA, integrated Firebase Auth, Google Vision OCR, and AssemblyAI ASR.

Jan '25

🔗

Backend Integration

Connected Flask API to model, set up YouTube Subtitles API and full pipeline.

Mar '25

🎞️

Video Stitching

Implemented FFmpeg logic to stitch ASLLVD clips into continuous ASL videos.

Apr '25

⚙️

Integration

Final integration of all modules.

Future Work

Domain-specific datasets

Expand into medical, legal & other specialized vocabularies.

Offline PWA support

Enable ASL translation without a network connection, suited to low-connectivity scenarios.

Advanced Dataset & Model Optimization

Curate larger, more diverse gloss-video corpora and apply transfer learning to boost accuracy.

Team & Contributions