
Adaptation of the WaSR Segmentation Network for Unmanned Surface Vehicles v0.01


WaSR - A water-obstacle separation and refinement network for unmanned surface vehicles

https://arxiv.org/abs/2001.01921 (ICRA 2020)

Obstacle detection by semantic segmentation shows great promise for autonomous navigation in unmanned surface vehicles (USV). However, existing methods suffer from poor estimation of the water edge in the presence of visual ambiguities, poor detection of small obstacles and a high false-positive rate on water reflections and wakes. We propose a new deep encoder-decoder architecture, a water-obstacle separation and refinement network (WaSR), to address these issues. Detection and water-edge accuracy are improved by a novel decoder that gradually fuses inertial information from the IMU with the visual features from the encoder. In addition, a novel loss function is designed to increase the separation between water and obstacle features early on in the network. Subsequently, the capacity of the remaining layers in the decoder is better utilised, leading to a significant reduction in false positives and an increase in true positives. Experimental results show that WaSR outperforms the current state-of-the-art by a large margin, yielding a 14% increase in F-measure over the second-best method.

Updates:

  • [March 2020] Thomas Clunie ported WaSR to Python 3 and TensorFlow 1.15.2
  • [February 2020] Initial commit

To-Do:

  • Port the IMU variation fully to Python
  • Upload requirements.txt file for quick installation
  • Re-upload weights
  • Update the read-me file

1. Installation

With Dockerfile

Requirements

  • The Dockerfile is created for tensorflow-gpu 1.8.0 with Python 2.7, based on the tensorflow/tensorflow:1.8.0-gpu Docker image
  • You need an NVIDIA graphics card compatible with at least CUDA 8.0
  • The NVIDIA Container Toolkit must be installed, following NVIDIA's instructions
  • If you do not have an NVIDIA graphics card, edit the Dockerfile to remove all the GPU requirements

Instructions

  • Download the Dockerfile and change into the folder where it is stored
  • Build the image; this installs all the requirements, downloads the pre-trained weights and runs the test script:
docker build -t wasr_docker .
  • Display the result:
docker run --gpus all -it --rm -e DISPLAY=${DISPLAY} -v /tmp/.X11-unix:/tmp/.X11-unix -v $HOME:/home/$USER wasr_docker bash -c "display /home/docker/wasr_network/test.jpg & display /home/docker/wasr_network/output/output_mask.png"

Requirements

To successfully run WaSR you will need Python 2.7, OpenCV and the Python packages listed in requirements.txt.

Execute the following sequence of commands to download and install the required packages and libraries (Ubuntu):

$ sudo apt-get update
$ sudo apt-get install python2.7
$ sudo apt-get install python-opencv
$ pip install -r requirements.txt
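
The requirements.txt file is still to be uploaded (see the To-Do list above). A minimal file consistent with the versions mentioned in this document might look like the following; the package set and versions are assumptions, not taken from the repository:

# hypothetical requirements.txt (package set and versions are assumptions)
tensorflow-gpu==1.8.0
numpy
Pillow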

2. Architecture overview

The WaSR architecture consists of a contracting path (encoder) and an expansive path (decoder). The purpose of the encoder is the construction of deep, rich features, while the primary task of the decoder is the fusion of inertial and visual information, increasing the spatial resolution and producing the segmentation output.

Encoder

Following the recent analysis [1] of deep networks on a maritime segmentation task, we base our encoder on the low-to-mid level backbone parts of DeepLab v2 [2], i.e., a ResNet-101 [3] backbone with atrous convolutions. In particular, the model is composed of four residual convolutional blocks (denoted as res2, res3, res4 and res5) combined with max-pooling layers. Hybrid atrous convolutions are added to the last two blocks to increase the receptive field and encode local context information into deep features.
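
As an illustration, an atrous (dilated) convolution of the kind used in these blocks can be written in TensorFlow 1.x as follows; the layer name, shapes and rate below are illustrative, not taken from the repository:

import tensorflow as tf  # TensorFlow 1.x

# Features at the res4 level; shapes are illustrative.
x = tf.placeholder(tf.float32, [None, 48, 64, 1024])
w = tf.get_variable('w_atrous', [3, 3, 1024, 1024])  # 3x3 kernel
# rate=2 samples the input on a dilated grid, enlarging the receptive
# field without extra parameters or loss of spatial resolution.
y = tf.nn.atrous_conv2d(x, w, rate=2, padding='SAME')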

Decoder

The primary task of the decoder is the fusion of visual and inertial information. We introduce the inertial information by constructing an IMU feature channel that encodes the location of the horizon at the pixel level. In particular, the camera-IMU projection [4] is used to estimate the horizon line, and a binary mask with all pixels below the horizon set to one is constructed. This IMU mask serves as a prior probability of the water location and improves the estimated location of the water edge in the output segmentation.
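
For illustration, a minimal sketch of constructing such a binary mask, assuming the camera-IMU projection has already produced the horizon as a line y = k*x + n in image coordinates (the actual horizon estimation of [4] is not shown):

import numpy as np

def imu_mask(height, width, k, n):
    xs = np.arange(width)
    horizon_y = k * xs + n                    # horizon row for each column
    rows = np.arange(height)[:, None]         # (H, 1) row indices
    # Pixels below the horizon (larger row index) get value 1: prior "water".
    return (rows > horizon_y[None, :]).astype(np.float32)

mask = imu_mask(384, 512, k=0.05, n=180.0)    # matches the default 384x512 input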

The IMU mask is treated as an externally generated feature channel, which is fused with the encoder features at multiple levels of the decoder. However, the values in the IMU channel and the encoder features are at different scales. To avoid manually adjusting the fusion weights, we apply the Attention Refinement Module (ARM) and the Feature Fusion Module (FFM) proposed in [5] to learn an optimal fusion strategy.
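
A rough TensorFlow 1.x sketch of an ARM-style block, following the general design of [5]; the exact configuration used in this repository may differ:

import tensorflow as tf  # TensorFlow 1.x

def attention_refinement(features, name):
    with tf.variable_scope(name):
        channels = features.get_shape().as_list()[-1]
        # Global context: average over the spatial dimensions.
        context = tf.reduce_mean(features, axis=[1, 2], keepdims=True)
        attention = tf.layers.conv2d(context, channels, 1)
        attention = tf.layers.batch_normalization(attention)
        attention = tf.sigmoid(attention)     # per-channel attention weights
        return features * attention           # re-weight the input features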

The final block of the decoder is an Atrous Spatial Pyramid Pooling (ASPP) module [2], followed by a softmax, which improves the segmentation of small structures (such as small buoys) and produces the final segmentation mask.
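
An illustrative ASPP head in the style of DeepLab v2 [2], with parallel atrous branches summed into per-class scores; the rates, shapes and class count below are assumptions:

import tensorflow as tf  # TensorFlow 1.x

def aspp(features, num_classes, rates=(6, 12, 18, 24)):
    branches = [
        tf.layers.conv2d(features, num_classes, 3, padding='same',
                         dilation_rate=r, name='aspp_r%d' % r)
        for r in rates
    ]
    return tf.add_n(branches)                 # fuse the pyramid branches

decoder_features = tf.placeholder(tf.float32, [None, 96, 128, 256])
logits = aspp(decoder_features, num_classes=3)   # e.g. water / sky / obstacles
probs = tf.nn.softmax(logits)                    # final segmentation scores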

Semantic separation loss

Since we would like to enforce clustering of water features, we approximate their distribution by a Gaussian with per-channel means and variances, where we assume channel independence for computational tractability. The similarity of all other pixels, corresponding to obstacles, can then be measured as a joint probability under this Gaussian, i.e.,
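
$$p = \prod_{i \in Q_o} \prod_{c=1}^{C} \exp\left(-\frac{(y_{ci} - \mu_c)^2}{2\sigma_c^2}\right),$$

where $Q_o$ is the set of obstacle pixels, $y_{ci}$ is channel $c$ of the feature at pixel $i$, and $(\mu_c, \sigma_c^2)$ are the per-channel mean and variance of the water features (normalization constants are omitted).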

We would like to enforce learning of features that minimize this probability. By expanding the equation for water per-channel standard deviations, taking the log of the above equation, flipping the sign and inverting, we arrive at the following equivalent obstacle-water separation loss
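
$$\mathcal{L}_{WS} = \left(\sum_{c=1}^{C}\sum_{i \in Q_o} \frac{(y_{ci}-\mu_c)^2}{\sigma_c^2}\right)^{-1},$$

where the water variance is expanded as $\sigma_c^2 = \frac{1}{|Q_w|}\sum_{j \in Q_w}(y_{cj}-\mu_c)^2$ over the water pixel set $Q_w$, and constant factors are dropped. Minimizing this loss pushes obstacle features away from the water distribution while keeping the water features compact.

A minimal NumPy sketch of this loss, reconstructed from the derivation above rather than taken from the repository code:

import numpy as np

def separation_loss(features, water_mask, obstacle_mask, eps=1e-6):
    # features: (N, C) per-pixel feature vectors; masks: (N,) booleans.
    water = features[water_mask]
    obstacles = features[obstacle_mask]
    mu = water.mean(axis=0)                   # per-channel water means
    var = water.var(axis=0) + eps             # per-channel water variances
    # Negative log-probability of obstacle features under the water Gaussian.
    neg_log_p = np.sum((obstacles - mu) ** 2 / (2.0 * var))
    return 1.0 / (neg_log_p + eps)            # invert: lower loss = better separation

feats = np.random.randn(1000, 32).astype(np.float32)
water = np.arange(1000) < 500                 # first half labelled as water
loss = separation_loss(feats, water, ~water)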

3. Running WaSR

Training

To train the network from scratch (or from pretrained weights) use the script wasr_train_noimu.py for the NO-IMU variant or wasr_train_imu.py for the IMU variant. Both scripts expect the same input arguments. When fine-tuning the network, make sure to freeze the pretrained parameters for the initial n iterations and train only the last layer. An example command is given after the argument list below.

Input Arguments

  • batch-size - number of images sent to the network in one step
  • data-dir - path to the directory containing the MODD2 dataset
  • data-list - path to the file listing the images in the dataset
  • grad-update-every - number of steps after which gradient update is applied
  • ignore-label - the value of the label to ignore during the training
  • input-size - comma-separated string with height and width of images (default: 384,512)
  • is-training - whether to update the running means and variances during the training
  • learning-rate - base learning rate for training with polynomial decay
  • momentum - moment component of the optimiser
  • not-restore-last - whether to not restore the last layers (when using weights from a pretrained encoder network)
  • num-classes - number of classes to predict
  • num-steps - number of training steps (note: these are steps, not epochs)
  • power - decay parameter to compute the learning rate
  • restore-from - path from which to restore the model parameters
  • snapshot-dir - where to save snapshots of the model
  • weight-decay - regularisation parameter for L2-loss
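
For example, a fine-tuning run of the NO-IMU variant from pretrained encoder weights might look like this; all paths and hyperparameter values are illustrative, not recommended defaults:

python wasr_train_noimu.py --data-dir /path/to/MODD2 --data-list train.txt --batch-size 2 --grad-update-every 8 --input-size 384,512 --learning-rate 1e-4 --num-classes 3 --num-steps 80000 --restore-from deeplab_resnet.ckpt --not-restore-last --snapshot-dir snapshots/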

Pretrained Weights

  • WaSR NO-IMU variant - weights are available for download here
  • WaSR IMU variant - To-Do

Inference

To perform inference on a single image use the script wasr_inference_noimu_general.py for the WaSR NO-IMU variant or wasr_inference_imu_general.py for the WaSR IMU variant. Both scripts expect the same input arguments and can be run on images from an arbitrary maritime dataset.

Input Arguments (General Inference)

  • dataset-path - path to MODD2 dataset files on which inference is performed
  • model-weights - path to the file with model weights
  • num-classes - number of classes to predict
  • save-dir - where to save the predicted mask
  • img-path - path to the image on which we want to run inference

Example usage:

python wasr_inference_noimu_general.py --img-path example_1.jpg

The above command will take the image example_1.jpg from the folder test_images/ and segment it. The segmentation result will be saved in the output/ folder by default.

Example input image and corresponding example segmentation output.

To run inference on the MODD2 dataset use the provided bash scripts wasr_inferences_noimu.sh for the WaSR NO-IMU variant or wasr_inferences_imu.sh for the WaSR IMU variant. The bash scripts run the corresponding Python scripts (wasr_inference_noimu.py and wasr_inference_imu.py); an example invocation is given after the argument list below.

Input Arguments (Python MODD2 inference script)

  • dataset-path - path to MODD2 dataset files on which inference is performed
  • model-weights - path to the file with model weights
  • num-classes - number of classes to predict
  • save-dir - where to save the predicted mask
  • seq - sequence number to evaluate
  • seq-txt - path to the file listing the images in the sequence
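
For example, evaluating a single sequence with the NO-IMU variant might look like this; the paths and sequence number are illustrative:

python wasr_inference_noimu.py --dataset-path /path/to/MODD2 --model-weights /path/to/weights.ckpt --num-classes 3 --save-dir output/ --seq 1 --seq-txt /path/to/seq01.txt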

4. References

[1] Bovcon et al., The MaSTr1325 Dataset for Training Deep USV Obstacle Detection Models, IROS 2019
[2] Chen et al., DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution, and Fully Connected CRFs, TPAMI 2018
[3] He et al., Deep Residual Learning for Image Recognition, CVPR 2016
[4] Bovcon et al., Stereo Obstacle Detection for Unmanned Surface Vehicles by IMU-assisted Semantic Segmentation, RAS 2018
[5] Yu et al., BiSeNet: Bilateral Segmentation Network for Real-time Semantic Segmentation, ECCV 2018
