Image Captioning using CNN, LSTM and Attention

Last update: Dec 16, 2021

Related tags

Deep Learning imagecaptioningproject

Overview

Image Captioning using CNN, LSTM and Attention

This is a deep learning model that tries to summarize an image as a text caption.
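To make the "attention" part of the title concrete, here is a minimal sketch of one soft-attention step in plain Python. It is not the code from this repository: the function names `softmax` and `attend`, the dot-product scoring, and the tiny two-dimensional feature vectors are all assumptions chosen for illustration. The idea is that the CNN produces one feature vector per image region, and at each decoding step the LSTM state is used to weight those regions before the next word is predicted.

```python
import math


def softmax(scores):
    """Turn raw scores into weights that are positive and sum to 1."""
    highest = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(region_features, decoder_state):
    """One soft-attention step (hypothetical helper, dot-product scoring).

    region_features: list of feature vectors, one per image region (from the CNN).
    decoder_state:   the current LSTM hidden state, same length as each feature vector.
    Returns the attention weights and the weighted-average context vector.
    """
    # Score each region by its dot product with the decoder state.
    scores = [
        sum(f * h for f, h in zip(region, decoder_state))
        for region in region_features
    ]
    weights = softmax(scores)
    dim = len(region_features[0])
    # The context vector is the attention-weighted sum of the region features.
    context = [
        sum(w * region[d] for w, region in zip(weights, region_features))
        for d in range(dim)
    ]
    return weights, context


# Two toy image regions and a decoder state that points toward the first one.
weights, context = attend([[1.0, 0.0], [0.0, 1.0]], [2.0, 0.0])
print(weights, context)
```

In a real model the scoring function is usually learned and the vectors have hundreds of dimensions; the weighting-then-summing structure is the part this sketch is meant to show.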

Installation

Install this project with pip3. Use Python version 3.7.

  pip3 install -r requirements.txt
  python3 app.py

These commands apply if you want to try the website on localhost.

You can also install Docker, build an image from the Dockerfile, and run it.

  docker build -f Dockerfile -t imagecaptioning:api .
  docker run -p 8080:8080 -ti imagecaptioning:api

Deployment

To deploy this project on Google Cloud App Engine, first create a project in App Engine. Install the Google Cloud SDK on your local machine so you can push projects, then run the following commands.

  gcloud init
  gcloud app deploy

Choose the right project and then push the application to the cloud. This is a monolithic application, so a single Docker image is built on App Engine.

Demo

Link to demo: https://lucky-dahlia-333406.el.r.appspot.com/index

FAQ

Why is this project implemented in TensorFlow?

TensorFlow is actively maintained by Google and is very convenient to deploy on a server. It automatically uses a GPU for training if it finds one.
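If you want to check which device TensorFlow will pick on your machine, a small sketch like the following may help. It assumes TensorFlow 2.x, where `tf.config.list_physical_devices` is available; the helper name `describe_training_device` is made up for this example and is not part of the project.

```python
def describe_training_device():
    """Return a short message saying which device TensorFlow would use."""
    try:
        import tensorflow as tf  # assumption: TensorFlow 2.x is installed
    except ImportError:
        return "TensorFlow is not installed"

    # Ask TensorFlow which GPUs it can see on this machine.
    gpus = tf.config.list_physical_devices("GPU")
    if gpus:
        return f"TensorFlow found {len(gpus)} GPU(s) and will place supported ops on them"
    return "No GPU found; TensorFlow will run on the CPU"


print(describe_training_device())
```

No extra code is needed to switch devices: when a GPU is visible, TensorFlow places supported operations on it by default.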

What is the BLEU score?

BLEU, or the Bilingual Evaluation Understudy, is a score for comparing a candidate translation of text to one or more reference translations. Although developed for translation, it can be used to evaluate text generated for a suite of natural language processing tasks.

In this project, the BLEU score is used to evaluate and score candidate text (here, generated captions) using the NLTK library in Python.
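NLTK provides this calculation as `sentence_bleu` in `nltk.translate.bleu_score`. To show what happens underneath, the sketch below computes an unsmoothed sentence-level BLEU in plain Python: clipped n-gram precisions for n = 1 to 4, combined by a geometric mean and multiplied by a brevity penalty. It is an illustration, not the project's evaluation code; the function names and the example captions are assumptions, and NLTK's version adds options such as smoothing that are left out here.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """All contiguous n-token tuples in a token list."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def modified_precision(references, candidate, n):
    """Share of candidate n-grams found in a reference, with clipped counts."""
    cand_counts = Counter(ngrams(candidate, n))
    if not cand_counts:
        return 0.0
    # For each n-gram, the most times it appears in any single reference.
    max_ref = Counter()
    for ref in references:
        for gram, count in Counter(ngrams(ref, n)).items():
            max_ref[gram] = max(max_ref[gram], count)
    clipped = sum(min(count, max_ref[gram]) for gram, count in cand_counts.items())
    return clipped / sum(cand_counts.values())


def bleu(references, candidate, max_n=4):
    """Unsmoothed sentence-level BLEU with uniform n-gram weights."""
    precisions = [modified_precision(references, candidate, n)
                  for n in range(1, max_n + 1)]
    if min(precisions) == 0:
        return 0.0  # without smoothing, any zero precision gives a zero score
    log_mean = sum(math.log(p) for p in precisions) / max_n
    # Brevity penalty: compare against the reference closest in length.
    c = len(candidate)
    r = min((len(ref) for ref in references), key=lambda length: (abs(length - c), length))
    penalty = 1.0 if c > r else math.exp(1 - r / c)
    return penalty * math.exp(log_mean)


# Hypothetical captions, tokenized by whitespace.
references = ["a dog runs on the green grass".split()]
candidate = "a dog runs on the grass".split()
print(round(bleu(references, candidate), 3))
```

Scores range from 0 to 1, and a caption identical to a reference scores 1. Short captions often have no matching 4-grams, which is why a smoothing function is commonly used in practice.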

Authors

License

MIT

Owner

ASUTOSH GHANTO
