Gestural AI – Real-Time ASL Interpreter

A real-time American Sign Language recognition system achieving 94% accuracy

Overview

Developed a real-time American Sign Language (ASL) recognition system that achieves 94% accuracy on a dataset of 20,000+ videos and 166,000 images. The project combines video-based gesture classification with inference optimizations aimed at live, low-latency use.

Technical Implementation

The system combines several deep learning architectures (a minimal sketch of how they can be wired together follows the list):

  • I3D (Inflated 3D ConvNet) for temporal feature extraction from video sequences
  • ResNet for robust spatial feature learning
  • MobileNet for optimized real-time inference on edge devices
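The README does not show how these models are wired together, so the following is a minimal sketch of one common pattern: a frozen MobileNetV2 backbone applied per frame, with a recurrent layer standing in for I3D-style temporal modeling. NUM_FRAMES, NUM_CLASSES, and the layer sizes are illustrative assumptions, not values from the project.

```python
import tensorflow as tf

NUM_FRAMES = 16      # assumed clip length; the real pipeline may differ
FRAME_SIZE = 224     # MobileNetV2's default input resolution
NUM_CLASSES = 100    # placeholder for the ASL vocabulary size

# Per-frame spatial backbone (stands in for the ResNet/MobileNet branch).
backbone = tf.keras.applications.MobileNetV2(
    include_top=False, pooling="avg",
    input_shape=(FRAME_SIZE, FRAME_SIZE, 3), weights="imagenet")
backbone.trainable = False  # freeze for faster fine-tuning

# Run the backbone on every frame of a clip, then aggregate over time
# (a lightweight substitute for I3D's 3D convolutions).
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(NUM_FRAMES, FRAME_SIZE, FRAME_SIZE, 3)),
    tf.keras.layers.TimeDistributed(backbone),
    tf.keras.layers.GRU(256),                      # temporal aggregation
    tf.keras.layers.Dense(NUM_CLASSES, activation="softmax"),
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
```

Replacing the GRU with 3D convolutions (as in I3D proper) trades latency for richer spatio-temporal features; keeping a MobileNet backbone holds per-frame cost low enough for edge inference.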

Key Features

  • Real-time Processing: Optimized for low-latency inference suitable for live communication (see the capture-loop sketch after this list)
  • High Accuracy: Achieved 94% recognition accuracy on diverse ASL gestures
  • Scalable Architecture: Deployed via Docker on Linux servers for production use
  • Multi-modal Input: Supports both video and image input formats
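For the real-time path, a typical low-latency loop keeps a rolling buffer of preprocessed webcam frames and classifies the most recent clip on every iteration. The sketch below assumes the clip model from the previous section; sign_model.h5 and labels.txt are hypothetical artifact names, not files documented in this project.

```python
import collections
import cv2
import numpy as np
import tensorflow as tf

# Placeholder artifact names; the project's actual files are not shown here.
model = tf.keras.models.load_model("sign_model.h5")
labels = [line.strip() for line in open("labels.txt")]

NUM_FRAMES, FRAME_SIZE = 16, 224
buffer = collections.deque(maxlen=NUM_FRAMES)

cap = cv2.VideoCapture(0)  # default webcam
while True:
    ok, frame = cap.read()
    if not ok:
        break
    # Convert to RGB, resize, and normalize, then append to the clip buffer.
    rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
    buffer.append(cv2.resize(rgb, (FRAME_SIZE, FRAME_SIZE)) / 255.0)

    if len(buffer) == NUM_FRAMES:
        clip = np.expand_dims(np.stack(buffer), axis=0)  # (1, T, H, W, 3)
        probs = model.predict(clip, verbose=0)[0]
        cv2.putText(frame, labels[int(np.argmax(probs))], (10, 30),
                    cv2.FONT_HERSHEY_SIMPLEX, 1.0, (0, 255, 0), 2)

    cv2.imshow("Gestural AI", frame)
    if cv2.waitKey(1) & 0xFF == ord("q"):
        break

cap.release()
cv2.destroyAllWindows()
```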

Technologies Used

  • Python for core implementation
  • OpenCV for video processing and computer vision
  • MediaPipe for hand tracking and pose estimation (see the landmark-extraction sketch after this list)
  • TensorFlow for deep learning model training and inference
  • Streamlit for user interface development
  • Docker for containerized deployment
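As an example of the MediaPipe piece, the sketch below extracts normalized hand landmarks from a single frame using the standard Hands solution. The confidence thresholds and the sign_frame.jpg input are illustrative assumptions.

```python
import cv2
import mediapipe as mp

# Thresholds below are common defaults, not values taken from the project.
hands = mp.solutions.hands.Hands(
    static_image_mode=True,
    max_num_hands=2,
    min_detection_confidence=0.5,
    min_tracking_confidence=0.5)

frame = cv2.imread("sign_frame.jpg")  # placeholder input image
if frame is None:
    raise SystemExit("no input frame found")

# MediaPipe expects RGB input.
results = hands.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))

if results.multi_hand_landmarks:
    for hand in results.multi_hand_landmarks:
        # Each detected hand yields 21 (x, y, z) landmarks normalized to the
        # frame; flattened, they form a 63-dim feature vector per hand.
        coords = [(lm.x, lm.y, lm.z) for lm in hand.landmark]
        print(len(coords), "landmarks:", coords[:2], "...")

hands.close()
```

Landmark vectors like these can either feed a lightweight classifier directly or supplement the raw-pixel models above.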

Impact

This project targets accessibility technology, enabling real-time communication between Deaf and hearing individuals. Its accuracy and real-time performance make it suitable for practical use in education, healthcare, and everyday conversation.

Future Enhancements

  • Integration with mobile applications for on-the-go accessibility
  • Support for additional sign languages beyond ASL
  • Real-time translation to spoken language
  • Integration with video conferencing platforms