remote sensing image classification and image caption by PyTorch


TalentBoy2333/remote-sensing-image-caption


This project works with remote sensing images and covers two tasks:

  • Classification of remote sensing image
  • Remote sensing image caption

I'm using torch 0.4, opencv-python, numpy, and matplotlib in python 3.6.

Data

I used the NWPU-RESISC45 dataset, which you can download (http://www.escience.cn/people/JunweiHan/NWPU-RESISC45.html) to train your model. You can also use another dataset, but pay attention to how the data is loaded, since your dataset may need a different loading procedure.
In short, check dataset.py.

Model

You can choose between two pre-trained models, resnet_v101 and mobilenet_v2.

classifier = Classifier(model_name, class_number, True)

Then I just add a fully connected layer and a softmax on top to classify.
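As a rough illustration of what that classification head computes, here is a framework-free sketch. It is not the repository's Classifier: the feature vector, weights, and bias are made-up toy values, and in the real model the features would come from the pre-trained backbone.

```python
import math

def fully_connected(features, weights, bias):
    """One logit per class: logits[c] = sum_i weights[c][i] * features[i] + bias[c]."""
    return [sum(w * x for w, x in zip(row, features)) + b
            for row, b in zip(weights, bias)]

def softmax(logits):
    """Turn logits into probabilities that sum to 1 (shifted by the max for stability)."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical 4-dimensional backbone feature and 3 classes (toy values).
features = [0.5, -1.0, 2.0, 0.0]
weights = [[0.1, 0.2, 0.3, 0.4],
           [0.0, -0.5, 0.1, 0.2],
           [0.3, 0.3, -0.2, 0.1]]
bias = [0.0, 0.1, -0.1]

probs = softmax(fully_connected(features, weights, bias))
# The predicted class is the index with the highest probability.
predicted_class = max(range(len(probs)), key=probs.__getitem__)
```

The softmax only rescales the logits, so the predicted class is the same as the index of the largest logit. The probabilities are useful when you want a confidence value next to the label.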

How to use

If you want to train your own model, you just need to prepare the dataset and then run train.py.

python train.py

You can also modify train.py to set the pre-trained model, batch size, number of epochs, and learning rate, or to continue training from the model saved by the previous run.
The trained model will be saved in ./models/train/.

If you want to predict the class of a new remote sensing image, first modify predict.py to set the image name and load the model parameters. You can also display the result.

image_name = 'test.jpg'
classifier = get_classifier('mobilenet', './models/train/classifier_50.pkl')
predict(classifier, image_name, True)

Then just run predict.py.

python predict.py

Image Caption

Data

I used the RSICD dataset
Lu X, Wang B, Zheng X, et al. Exploring Models and Data for Remote Sensing Image Caption Generation[J]. IEEE Transactions on Geoscience and Remote Sensing, 2017.
to train my model. You can download this dataset at https://github.com/201528014227051/RSICD_optimal
Or you can use your own dataset. As with Classification, modify data.py and dataloader.py for your dataset.

Model

I used the Show, Attend and Tell model. You can read the paper: Xu, Kelvin, et al. “Show, attend and tell: Neural image caption generation with visual attention.” arXiv preprint arXiv:1502.03044 (2015), or refer to https://github.com/sgrvinod/a-PyTorch-Tutorial-to-Image-Captioning
My model includes an encoder, a decoder, and an attention module.
Because the project runs on ARM, the network has to be simplified. The encoder is mobilenet_v2 with the fully connected classification layer removed, so that mobilenet_v2 outputs the feature map of the image (size 7*7).

The attention part is composed of several fully connected layers. Its input is the output of the decoder's hidden layer, and its output is a tensor (size 1*49), which is reshaped to size 7*7.
The feature vector is then computed from the attention tensor (7*7) and the feature map (7*7), and this feature vector is the input of the decoder.
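One common way to combine the two, following the soft attention of Show, Attend and Tell, is a weighted sum of the feature map over its 49 locations. The sketch below assumes that form and assumes a softmax over the 49 scores. The function names and toy values are illustrative and are not taken from model.py.

```python
import math

def softmax(scores):
    """Normalize raw scores into weights that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attend(feature_map, scores):
    """feature_map: 49 locations, each a list of C channel values.
    scores: 49 raw attention scores (the 1*49 tensor).
    Returns the weighted feature vector (length C) and the 7*7 attention grid."""
    weights = softmax(scores)
    channels = len(feature_map[0])
    context = [sum(w * loc[c] for w, loc in zip(weights, feature_map))
               for c in range(channels)]
    # Reshape the 49 weights row by row into a 7*7 grid for display.
    grid = [weights[r * 7:(r + 1) * 7] for r in range(7)]
    return context, grid

# Toy input: 49 locations with 2 channels; location 10 gets a much higher score.
feature_map = [[float(i), 1.0] for i in range(49)]
scores = [0.0] * 49
scores[10] = 20.0
context, grid = attend(feature_map, scores)
```

With nearly all of the weight on location 10, the feature vector is almost exactly the features of that location, and the 7*7 grid shows a single bright cell at row 1, column 3.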
Attention image:


The decoder is based on an LSTM. Its input is the embedding of words in the dictionary (every word in the dictionary is a one-hot code, which the embedding layer transforms into a feature vector). The hidden layer is connected to a fully connected layer and a softmax, and the output is the probability of the next word.
The first input is a start signal, <start>, and the last output is an end signal, '.'.
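To make the one-hot and embedding step concrete: multiplying a one-hot vector by the embedding matrix simply selects one row of that matrix. The sketch below uses a hypothetical four-word dictionary and made-up embedding values, where the real model learns the matrix during training.

```python
def one_hot(index, size):
    """Vector of zeros with a single 1.0 at the word's index."""
    return [1.0 if i == index else 0.0 for i in range(size)]

def embed(one_hot_vector, embedding_matrix):
    """Multiply a one-hot row vector by a (vocab_size x embed_dim) matrix."""
    dim = len(embedding_matrix[0])
    return [sum(h * row[d] for h, row in zip(one_hot_vector, embedding_matrix))
            for d in range(dim)]

# Hypothetical dictionary and 3-dimensional embeddings (toy values).
vocab = ["<start>", "a", "plane", "."]
embedding_matrix = [[0.0, 0.1, 0.2],
                    [1.0, 1.1, 1.2],
                    [2.0, 2.1, 2.2],
                    [3.0, 3.1, 3.2]]

index = vocab.index("plane")
vector = embed(one_hot(index, len(vocab)), embedding_matrix)
```

Because the product equals a row lookup, embedding layers are normally implemented as an index into the matrix and never build the one-hot vector explicitly.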

Data Augmentation

I used an approach similar to SSD (https://arxiv.org/abs/1512.02325). In every iteration, I change the values of random pixels in the mini-batch, add random lighting noise, randomly swap image channels, and randomly adjust the image contrast. Then I randomly crop a part of each image sample in the mini-batch and randomly mirror the image after cropping.
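A simplified sketch of part of this pipeline is shown below, on an image stored as nested lists of [r, g, b] pixels. It covers only contrast, channel swap, crop, and mirror. The 0.5 probabilities, the contrast range, and the minimum crop scale are assumptions for illustration, not values from the repository, and the pixel-value change and lighting noise are left out.

```python
import random

def adjust_contrast(image, alpha):
    """Scale every pixel value by alpha and clip to the range 0..255."""
    return [[[min(255, max(0, int(round(v * alpha)))) for v in px] for px in row]
            for row in image]

def swap_channels(image, order):
    """Reorder the channels of every pixel, e.g. order=(2, 1, 0)."""
    return [[[px[i] for i in order] for px in row] for row in image]

def mirror(image):
    """Flip the image horizontally."""
    return [row[::-1] for row in image]

def random_crop(image, rng, min_scale=0.5):
    """Cut out a random window covering at least min_scale of each side."""
    height, width = len(image), len(image[0])
    crop_h = rng.randint(max(1, int(height * min_scale)), height)
    crop_w = rng.randint(max(1, int(width * min_scale)), width)
    top = rng.randint(0, height - crop_h)
    left = rng.randint(0, width - crop_w)
    return [row[left:left + crop_w] for row in image[top:top + crop_h]]

def augment(image, rng):
    """Apply each step with probability 0.5: photometric steps, then crop, then mirror."""
    if rng.random() < 0.5:
        image = adjust_contrast(image, rng.uniform(0.5, 1.5))
    if rng.random() < 0.5:
        order = [0, 1, 2]
        rng.shuffle(order)
        image = swap_channels(image, order)
    if rng.random() < 0.5:
        image = random_crop(image, rng)
    if rng.random() < 0.5:
        image = mirror(image)
    return image

# Toy 8*8 image; a seeded generator makes the run repeatable.
rng = random.Random(0)
image = [[[x * 10, y * 10, 128] for x in range(8)] for y in range(8)]
augmented = augment(image, rng)
```

Cropping changes the image size, so a real data loader would resize the result back to the network's input size before batching.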

Train

If you want to train your model, make sure that you have the RSICD dataset. If your dataset is different from RSICD, you should modify data.py and dataloader.py for your data.
If you want to change the details of the model, modify model.py and config.py.
In config.py, you can also modify the learning rate, batch size, number of epochs, and so on.
Then start training by running train.py. You can modify the function train() to choose between training from scratch and resuming from the last saved model parameters.

python train.py

Predict

I used beam search to find the best caption sentence, because beam search considers more candidate sequences than always picking the single most probable next word.

Beam search (assuming the dictionary is [a, b, c] and the beam size is 2):

Step 1: When generating the first word, choose the two words with the highest probability. The current sequences are then a and b.

Step 2: When generating the second word, combine the current sequences a and b with every word in the dictionary to get six new sequences aa, ab, ac, ba, bb, bc, then keep the two with the highest probability as the current sequences, for example ab and bb.

Step 3: Repeat this process until the terminator ('.') is encountered. The final output is the two sequences with the highest probability.
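The three steps above can be sketched in plain Python. This is a minimal illustration, not the code in eval.py: toy_model and its probabilities are made up so that the run reproduces the a/b and ab/bb example, and a real decoder would compute the next-word distribution from the LSTM state and the encoder output.

```python
import math

def beam_search(next_word_probs, beam_size=2, end_token=".", max_len=10):
    """Return up to beam_size (sequence, log_probability) pairs, best first."""
    beams = [((), 0.0)]   # sequences still being extended
    finished = []         # sequences that reached end_token
    for _ in range(max_len):
        candidates = []
        for seq, score in beams:
            for word, prob in next_word_probs(seq).items():
                if prob > 0.0:
                    candidates.append((seq + (word,), score + math.log(prob)))
        # Keep only the beam_size most probable extensions.
        candidates.sort(key=lambda item: item[1], reverse=True)
        beams = []
        for seq, score in candidates[:beam_size]:
            if seq[-1] == end_token:
                finished.append((seq, score))
            else:
                beams.append((seq, score))
        if not beams:
            break
    finished.extend(beams)  # sequences cut off at max_len
    finished.sort(key=lambda item: item[1], reverse=True)
    return finished[:beam_size]

def toy_model(seq):
    """Hypothetical next-word distribution that depends only on length and last word."""
    if len(seq) == 0:
        return {"a": 0.5, "b": 0.4, "c": 0.1, ".": 0.0}
    if len(seq) == 1:
        if seq[-1] == "a":
            return {"a": 0.1, "b": 0.5, "c": 0.4, ".": 0.0}
        return {"a": 0.1, "b": 0.6, "c": 0.3, ".": 0.0}
    return {"a": 0.05, "b": 0.05, "c": 0.1, ".": 0.8}

results = beam_search(toy_model, beam_size=2)
sequences = [seq for seq, _ in results]
```

Scores are summed as log-probabilities so that long sequences do not underflow. In this variant a finished sequence leaves the beam, so fewer sequences are extended in later steps.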

I set the beam size parameter to 3. You can change it in eval.py.

def beam_search(data, decoder, encoder_output, parameter_B=3):

If you want to predict the caption of an image, modify predict.py to set the test image name and the paths of the model parameters.

predict('test.jpg', ['./models/train/encoder_mobilenet_20000.pkl', './models/train/decoder_20000.pkl'])

Then run predict.py, and you will see the image, the caption sentence, and the attention distribution image of the attention module for every word.

python predict.py

Example:
image:

caption:

Evaluation

I use BLEU-4 to evaluate the quality of the generated sentences.
BLEU-4: https://www.aclweb.org/anthology/P02-1040
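For readers unfamiliar with the metric, here is a minimal sentence-level sketch of BLEU-4 with uniform weights and no smoothing. It is an illustration of the definition, not the evaluation code of this repository, and the example sentence is made up. The paper defines BLEU over a whole corpus, so scores from this per-sentence version are not directly comparable.

```python
import math
from collections import Counter

def ngrams(tokens, n):
    """Count all n-grams of length n in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def bleu4(candidate, references):
    """Sentence-level BLEU-4: geometric mean of 1- to 4-gram precisions times a brevity penalty."""
    log_precisions = []
    for n in range(1, 5):
        cand_counts = ngrams(candidate, n)
        total = sum(cand_counts.values())
        # Clip each n-gram count by its maximum count in any single reference.
        max_ref = Counter()
        for ref in references:
            for gram, count in ngrams(ref, n).items():
                max_ref[gram] = max(max_ref[gram], count)
        matched = sum(min(count, max_ref[gram]) for gram, count in cand_counts.items())
        if total == 0 or matched == 0:
            return 0.0  # without smoothing, one empty precision makes the score zero
        log_precisions.append(math.log(matched / total))
    # The brevity penalty uses the reference length closest to the candidate length.
    c = len(candidate)
    r = min((abs(len(ref) - c), len(ref)) for ref in references)[1]
    brevity = 1.0 if c > r else math.exp(1.0 - r / c)
    return brevity * math.exp(sum(log_precisions) / 4.0)

reference = "many planes are parked at the airport".split()
perfect = bleu4(reference, [reference])
shortened = bleu4("many planes are parked".split(), [reference])
```

The shortened candidate matches every n-gram it contains, so its score is lowered only by the brevity penalty.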

My training BLEU:
