Generate captions for images using an encoder-decoder model.
Using the Flickr8k dataset from the University of Illinois, a neural network was trained to generate captions for images. The model is built with Keras on TensorFlow 2.0 beta.
The task of image captioning can be divided logically into two modules: an image-based model, which extracts the features and nuances from the image, and a language-based model, which translates the features and objects produced by the image-based model into a natural-language sentence.
For the image-based model (the encoder), we typically rely on a convolutional neural network (CNN); for the language-based model (the decoder), we rely on a recurrent neural network (RNN). The image below summarizes this approach.
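The encoder-decoder split can be sketched as a pair of functions. This is a minimal NumPy stand-in, not the actual Keras model: the 2048-dimensional feature size, the 5,000-word vocabulary, and the placeholder bodies are all illustrative assumptions.

```python
import numpy as np

FEAT_DIM = 2048   # e.g. the pooled output of a pretrained CNN (assumption)
VOCAB = 5000      # illustrative vocabulary size (assumption)

def encoder(image: np.ndarray) -> np.ndarray:
    """Image-based model: map an image to a fixed-length feature vector.
    Stand-in for a CNN; here we just pool and resize the pixels."""
    return np.resize(image.mean(axis=-1).ravel(), FEAT_DIM)

def decoder(features: np.ndarray, prefix: list) -> np.ndarray:
    """Language-based model: given the image features and the caption so far,
    score every word in the vocabulary as the next word.
    Stand-in for an RNN; returns deterministic pseudo-random scores."""
    rng = np.random.default_rng(len(prefix))
    return rng.random(VOCAB)

img = np.zeros((224, 224, 3))                 # a dummy RGB image
feats = encoder(img)                          # shape (2048,)
next_word_scores = decoder(feats, prefix=[1, 42])  # shape (5000,)
print(feats.shape, next_word_scores.shape)
```

Decoding a full caption would repeatedly feed the growing prefix back into `decoder` until an end-of-sentence token is produced.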
Usually, a pretrained CNN extracts the features from the input image. The feature vector is then linearly transformed to match the input dimension of the RNN/LSTM network, which is trained as a language model on top of these features.
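The linear transformation is just a learned matrix-vector product. A minimal NumPy sketch, assuming a 2048-dimensional CNN feature (as produced by, e.g., InceptionV3's pooled output) and a 256-dimensional LSTM input size; both sizes and the random weights are illustrative:

```python
import numpy as np

CNN_DIM = 2048    # pretrained-CNN feature size (assumption)
EMBED_DIM = 256   # LSTM input/embedding size (assumption)

rng = np.random.default_rng(0)
features = rng.random(CNN_DIM)                  # CNN output for one image
W = rng.normal(0.0, 0.01, (EMBED_DIM, CNN_DIM)) # learned projection weights
b = np.zeros(EMBED_DIM)                         # learned projection bias

# Project the feature vector down to the LSTM's input dimension.
projected = W @ features + b
print(projected.shape)
```

In the Keras model this projection would simply be a `Dense(EMBED_DIM)` layer applied to the CNN features.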
To train the LSTM model, we predefine the input and target sequences for each caption.