GuideDog is a 100% voice-controlled, AI/ML-based mobile app designed to assist people who are visually impaired

Last update: Nov 24, 2021

Guidedog

Authors: Kyuhee Jo, Steven Gunarso, Jacky Wang, Raghav Sharma

GuideDog is an AI/ML-based mobile app, controlled entirely by voice, that is designed to assist people who are visually impaired. As the name suggests, you can think of it as a "speaking guide dog." It offers three key features, all based on the scene captured by your phone's camera:

Reads text upon command
Describes the scene around you upon command
Warns you if there is an obstacle in front of you

Check out this demo video to learn more about our app!

Android App

UI/UX
- Simple and responsive
- Voice-assistant architecture designed for the target audience
Libraries / APIs
- Google Cloud Speech-to-Text and Text-to-Speech
- Android SDK, AndroidX
- ML Kit Object Detection and Tracking API
- TensorFlow Lite MobileNet image classification model
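
To illustrate the voice-assistant idea, here is a minimal sketch of how a recognized utterance could be mapped to one of the on-command features. It is written in Python for brevity and is not the app's Android code. The trigger words and the `route_command` helper are hypothetical, since the app's actual command vocabulary is not listed here.

```python
import re
from typing import Optional

# Hypothetical trigger words; the app's real command vocabulary is not documented here.
COMMANDS = {
    "read": "ocr",          # "Reads text upon command"
    "describe": "caption",  # "Describes the scene around you upon command"
}


def route_command(transcript: str) -> Optional[str]:
    """Return the feature name for a speech-to-text transcript, or None if no trigger word is found."""
    # Lowercase and keep alphabetic words only, so "Read." still matches "read".
    words = re.findall(r"[a-z]+", transcript.lower())
    for keyword, feature in COMMANDS.items():
        if keyword in words:
            return feature
    return None
```

For example, `route_command("Please read this sign.")` returns `"ocr"`, while an utterance with no trigger word returns `None`, so the app could ask the user to repeat the command.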

Backend

Flask API
- Image Captioning
- Optical Character Recognition
Deployment
- Google App Engine
- Fast central API with a separate endpoint for each task
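
A central Flask API with one endpoint per task might look roughly like the sketch below. The route names (`/caption`, `/ocr`), the `image` form field, and the two placeholder functions are assumptions made for illustration. They stand in for the real captioning model and OCR step, which are not shown in this document.

```python
from flask import Flask, jsonify, request

app = Flask(__name__)


def caption_image(image_bytes: bytes) -> str:
    """Placeholder for the captioning model (assumption); returns a fixed string."""
    return "placeholder caption"


def read_text(image_bytes: bytes) -> str:
    """Placeholder for the OCR step (assumption); returns a fixed string."""
    return "placeholder text"


def uploaded_image():
    """Return the bytes of the uploaded 'image' file, or None if it is missing."""
    upload = request.files.get("image")
    return upload.read() if upload is not None else None


@app.route("/caption", methods=["POST"])
def caption():
    image = uploaded_image()
    if image is None:
        return jsonify(error="missing 'image' file field"), 400
    return jsonify(caption=caption_image(image))


@app.route("/ocr", methods=["POST"])
def ocr():
    image = uploaded_image()
    if image is None:
        return jsonify(error="missing 'image' file field"), 400
    return jsonify(text=read_text(image))
```

In this sketch the mobile client would send a multipart POST with the camera frame in the `image` field and speak the returned string through text-to-speech.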

Image Captioning

We used TensorFlow to build and train an image captioning model on MS-COCO 2014, based on the paper Show, Attend and Tell: Neural Image Caption Generation with Visual Attention. The model uses a standard convolutional network as an encoder to extract features from images (we use Inception V3) and feeds those features into an attention-based decoder that generates sentences. While the paper used an LSTM as the decoder, we use a simpler RNN instead.
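
The attention step at the heart of such a decoder can be sketched in a few lines. The example below is plain Python rather than the project's TensorFlow code, and it assumes an additive scoring function, which is a common choice for this kind of model. The weight names (`w_feat`, `w_hid`, `v`) and the toy sizes are illustrative assumptions. At each decoding step, every image location gets a score from its feature vector and the previous decoder state, the scores are normalized with a softmax, and the context vector is the weighted sum of the location features.

```python
import math
import random


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def soft_attention(features, hidden, w_feat, w_hid, v):
    """Return (context, weights) for one decoding step.

    features: L encoder vectors of size D, one per image location
    hidden:   previous decoder state of size H
    w_feat:   A x D projection; w_hid: A x H projection; v: scoring vector of size A
    """
    proj_hidden = matvec(w_hid, hidden)
    scores = []
    for feat in features:
        proj_feat = matvec(w_feat, feat)
        # Additive score: v . tanh(W_feat a_i + W_hid h)
        scores.append(sum(vk * math.tanh(pf + ph)
                          for vk, pf, ph in zip(v, proj_feat, proj_hidden)))
    weights = softmax(scores)
    # Context vector: attention-weighted sum of the location features.
    context = [sum(w * feat[d] for w, feat in zip(weights, features))
               for d in range(len(features[0]))]
    return context, weights


def rand_matrix(rng, rows, cols):
    """Random matrix with entries in [-1, 1], standing in for trained weights."""
    return [[rng.uniform(-1, 1) for _ in range(cols)] for _ in range(rows)]


# Toy sizes (assumed): 4 image locations, D=3, H=2, A=5.
rng = random.Random(0)
features = rand_matrix(rng, 4, 3)
hidden = [rng.uniform(-1, 1) for _ in range(2)]
context, weights = soft_attention(
    features, hidden,
    rand_matrix(rng, 5, 3), rand_matrix(rng, 5, 2),
    [rng.uniform(-1, 1) for _ in range(5)],
)
```

The decoder would then combine `context` with the embedding of the previously generated word to update its state and predict the next word. The `weights` list sums to 1 and indicates which image locations the model attends to for that word.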

Get more insights: Devpost

Owner

Kyuhee Jo
