Talk2Sign

Bridging English to ASL in real time for inclusive communication.

Talk2Sign is a Progressive Web App that translates English text, audio, images, and YouTube videos into American Sign Language (ASL) video within a few seconds, giving Deaf and Hard-of-Hearing users access to multimedia content.


Project Summary

Research & Idea

Surveyed state-of-the-art ASL translation systems, then pivoted from avatar animation to real signer videos from the ASLLVD dataset, selected through English-to-gloss translation.

Model Training

Fine-tuned T5-Base on a custom English-to-gloss dataset, prepared with token filtering, normalization, and manual gloss corrections.
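
As a rough illustration of what gloss normalization and token filtering can look like, here is a minimal Python sketch. The specific rules (uppercasing, stripping punctuation, dropping a small stop list) and the names normalize_gloss and DROP_TOKENS are assumptions made for this example, not the project's actual preprocessing.

```python
import re
from typing import List

# Hypothetical filter list; the project's real token filter is not documented here.
DROP_TOKENS = {"A", "AN", "THE"}

def normalize_gloss(raw: str) -> List[str]:
    """Uppercase a gloss string, strip punctuation, and drop filtered tokens."""
    # Keep letters, digits, hyphens, and whitespace; turn everything else into a space.
    cleaned = re.sub(r"[^A-Za-z0-9\-\s]", " ", raw).upper()
    # split() with no arguments also collapses runs of whitespace.
    tokens = cleaned.split()
    return [t for t in tokens if t not in DROP_TOKENS]
```

For example, normalize_gloss("the  Dog, run fast!") yields the tokens DOG, RUN, FAST under these assumed rules.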

System Design

React PWA frontend → Flask backend → T5 inference → gloss mapped to video segments → stitched with FFmpeg.
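
The gloss-to-video step in this pipeline can be sketched as a dictionary lookup. This is only an illustration: the function name map_gloss_to_clips, the clip_index structure, and the choice to skip tokens that have no clip are assumptions, not the project's actual code.

```python
from typing import Dict, List, Tuple

def map_gloss_to_clips(
    tokens: List[str], clip_index: Dict[str, str]
) -> Tuple[List[str], List[str]]:
    """Return (clip paths in signing order, tokens that had no clip)."""
    clips: List[str] = []
    missing: List[str] = []
    for token in tokens:
        path = clip_index.get(token)
        if path is None:
            # Assumed policy: skip unknown glosses and report them to the caller.
            missing.append(token)
        else:
            clips.append(path)
    return clips, missing
```

Returning the unmatched tokens separately lets the caller decide whether to fall back (for instance to fingerspelling) or simply log the gap.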

Challenges

Low BLEU scores, incorrect token mapping, and FFmpeg merge glitches, addressed through preprocessing, caching, and segment filtering.
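
One way caching can help here is to reuse a stitched video when the same gloss sequence comes up again. The sketch below is an assumption about how such a cache might look; StitchCache, cache_key, and the render callback are hypothetical names, and the real system may cache at a different layer.

```python
import hashlib
from typing import Callable, Dict, List

def cache_key(tokens: List[str]) -> str:
    """Hash a gloss sequence; tokens are assumed to contain no whitespace."""
    return hashlib.sha256(" ".join(tokens).encode("utf-8")).hexdigest()

class StitchCache:
    """In-memory cache mapping a gloss sequence to its rendered video path."""

    def __init__(self, render: Callable[[List[str]], str]) -> None:
        self._render = render            # hypothetical function that stitches clips
        self._store: Dict[str, str] = {}

    def get(self, tokens: List[str]) -> str:
        key = cache_key(tokens)
        if key not in self._store:
            # Only render on a cache miss.
            self._store[key] = self._render(tokens)
        return self._store[key]
```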

Deployment

Live demo supports text, audio, image, and YouTube input, with Firebase Auth for sign-in, AssemblyAI for speech recognition (ASR), and Google Vision for OCR.

Impact

Translated 400+ phrases across 4 input types with ~4s latency. Fully modular and accessible from any modern browser.

Live Demo & Source

Tech Stack

React
Flask
PyTorch
FFmpeg
Firebase
Vision API
AssemblyAI

Project Timeline

Jul '24
🧠

Idea & Research

Brainstormed concept; literature survey on ASL translation methods.

Oct '24
🎨

UI/UX Design

Created wireframes and mockups; validated design with user feedback.

Nov '24
🔍

Data Preprocessing

Cleaned and tokenized the ASLLVD dataset; normalized gloss tokens and split the data for training.
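
A reproducible split of the kind described above might look like the following sketch. The 80/10/10 proportions, the fixed seed, and the name split_dataset are assumptions for illustration; the split the team actually used is not stated.

```python
import random
from typing import List, Sequence, Tuple, TypeVar

T = TypeVar("T")

def split_dataset(
    pairs: Sequence[T],
    val_frac: float = 0.1,   # assumed validation share
    test_frac: float = 0.1,  # assumed test share
    seed: int = 42,
) -> Tuple[List[T], List[T], List[T]]:
    """Shuffle deterministically and return (train, val, test)."""
    items = list(pairs)
    # A seeded Random instance keeps the split stable across runs.
    random.Random(seed).shuffle(items)
    n_val = int(len(items) * val_frac)
    n_test = int(len(items) * test_frac)
    val = items[:n_val]
    test = items[n_val:n_val + n_test]
    train = items[n_val + n_test:]
    return train, val, test
```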

Dec '24
⚙️

Model Prototyping & Fine-Tuning + Frontend & APIs Implementation

Explored LSTM, CNN, and 3D-CNN approaches and evaluated their performance and feasibility, then switched to fine-tuning T5 (small, then base) and tuned hyperparameters. Also built the React PWA and integrated Firebase Auth, Google Vision OCR, and AssemblyAI ASR.

Jan '25
🔗

Backend Integration

Connected the Flask API to the model; set up the YouTube subtitles API and the full pipeline.

Mar '25
🎞️

Video Stitching

Implemented FFmpeg logic to stitch ASLLVD clips into continuous ASL videos.
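
For readers curious how clip stitching can be driven from Python, here is a hedged sketch using FFmpeg's concat demuxer. It only builds the list file contents and the command line; the helper names are hypothetical, and stream copy (-c copy) assumes all clips share the same codec and encoding parameters, which may not hold for the project's clips (re-encoding would be needed otherwise).

```python
from typing import List

def build_concat_list(clip_paths: List[str]) -> str:
    """Build the text of a concat-demuxer list file, one `file '...'` line per clip."""
    lines = []
    for path in clip_paths:
        # Inside single quotes, a literal quote is written as '\'' in the list file.
        escaped = path.replace("'", "'\\''")
        lines.append(f"file '{escaped}'")
    return "\n".join(lines) + "\n"

def build_ffmpeg_command(list_file: str, output_path: str) -> List[str]:
    """Return an argv list suitable for subprocess.run; nothing is executed here."""
    return [
        "ffmpeg", "-y",          # overwrite the output file if it exists
        "-f", "concat",          # use the concat demuxer
        "-safe", "0",            # allow absolute or unusual paths in the list
        "-i", list_file,
        "-c", "copy",            # stream copy, no re-encode (see assumption above)
        output_path,
    ]
```

The list text would be written to a temporary file and the command passed to subprocess.run(cmd, check=True).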

Apr '25
⚙️

Integration

Final integration of all modules.

Future Work

Domain-specific datasets

Expand into medical, legal & other specialized vocabularies.

Offline PWA support

Enable ASL translation without a network connection, for use in low-connectivity scenarios.

Advanced Dataset & Model Optimization

Curate larger, more diverse gloss-video corpora and apply transfer learning to boost accuracy.

Team & Contributions

NW

Nifra Wahaj

Model training, video stitching, and documentation

RN

Rubina Noor

Model training, frontend, UX design, and PWA

AZ

Aman Zeeshan

Documentation and APIs

💬 Got questions, feedback, or want to collaborate? Drop us a line at talk2sign@gmail.com