
An exercise of Image Captioning using RNN. This is a project of Udacity Computer Vision Nanodegree


waterwheel31/P2_Image_Captioning2


Image Captioning

This is an exercise in image captioning, completed as part of the Udacity Computer Vision Nanodegree program.

What is Image Captioning?

Image captioning attaches a short descriptive sentence to an image. This project generates that sentence automatically from the image itself.

Dataset Used

This project uses the COCO dataset (http://cocodataset.org).

Network Structure

The network, shown below, has an encoder-decoder structure. The encoder is a pre-trained CNN (ResNet) that produces an embedded vector describing the features of an image. The decoder is an RNN (LSTM) that transforms those features into word vectors, one word at a time.

image
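To make the data flow of the encoder-decoder structure concrete, here is a minimal, framework-free sketch of how a decoder could turn an image feature vector into a caption one word at a time. It is an illustration only, not the code in this repository: `greedy_caption`, `toy_step`, the `<start>`/`<end>` tokens, and the two-element feature vector are all hypothetical names and values. `toy_step` is a hand-written lookup table standing in for a trained LSTM step, and picking the highest-scoring word at each step (greedy decoding) is an assumed choice, since the README does not say how captions are sampled.

```python
from typing import Callable, Dict, List, Sequence, Tuple

START, END = "<start>", "<end>"  # hypothetical special tokens

# A step function maps (previous word, decoder state) to (word scores, new state).
StepFn = Callable[[str, Sequence[float]], Tuple[Dict[str, float], Sequence[float]]]


def greedy_caption(features: Sequence[float], step: StepFn, max_len: int = 20) -> List[str]:
    """Generate a caption by repeatedly picking the highest-scoring next word."""
    words: List[str] = []
    state = features  # the decoder starts from the encoder's feature vector
    prev = START
    for _ in range(max_len):
        scores, state = step(prev, state)
        prev = max(scores, key=scores.get)  # greedy choice
        if prev == END:
            break
        words.append(prev)
    return words


def toy_step(prev_word: str, state: Sequence[float]) -> Tuple[Dict[str, float], Sequence[float]]:
    """Stand-in for a trained LSTM step: a fixed table conditioned on the features."""
    dog_score, cat_score = state  # pretend the encoder produced these two features
    table = {
        START: {"a": 1.0},
        "a": {"dog": dog_score, "cat": cat_score},
        "dog": {"runs": 1.0},
        "cat": {"sleeps": 1.0},
        "runs": {END: 1.0},
        "sleeps": {END: 1.0},
    }
    return table[prev_word], state


if __name__ == "__main__":
    print(" ".join(greedy_caption([0.9, 0.1], toy_step)))  # a dog runs
    print(" ".join(greedy_caption([0.2, 0.8], toy_step)))  # a cat sleeps
```

In a real model the step function would embed the previous word, run one LSTM step, and project the hidden state onto the vocabulary. The loop around it stays the same.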

Some Results

This simple network works surprisingly well.

image

image

image
