Semantic Segmentation by Improved Generative Adversarial Networks

ZengShun Zhao (a,*), Yulong Wang (a), Ke Liu (a), Haoran Yang (a), Qian Sun (a),
Heng Qiao (b)
(a) College of Electronic and Information Engineering,
Shandong University of Science and Technology, Qingdao, 266590, P.R. China
(b) Department of Electrical & Computer Engineering, University of Florida,
Gainesville, FL 32611, USA
* Correspondence Author: zhaozs@sdust.edu.cn; zhaozengshun@163.com
* Co-Correspondence Author: 17854252591@163.com, sunqian940411@163.com

Abstract
Most existing segmentation methods combine the powerful feature extraction
capabilities of CNNs with Conditional Random Fields (CRFs) post-processing, so
their results are limited by the shortcomings of CRFs. Because CRFs are notoriously
slow and inefficient, CRF post-processing has been gradually abandoned in recent
years. In this paper, an improved Generative Adversarial Network (GAN) for the
image semantic segmentation task (semantic segmentation by GANs, Seg-GAN) is
proposed to facilitate further segmentation research. In addition, we introduce
Convolutional CRFs (ConvCRFs) as an effective improvement for the image semantic
segmentation task. To differentiate the segmentation results from the ground truth
distribution and to improve the details of the output images, the proposed
discriminator network is designed in a fully convolutional manner combined with
cascaded ConvCRFs. Besides, the adversarial loss encourages the output image to be
close to the distribution of the ground truth. Our method not only learns an
end-to-end mapping from the input image to the corresponding output image, but also
learns a loss function to train this mapping. The experiments show that our method
achieves better performance than state-of-the-art methods.

* Corresponding author. E-mail address: zhaozengshun@163.com (ZengShun Zhao)
Key words: Semantic segmentation; GANs; CNNs; ConvCRFs; CRFs
1 Introduction
Image semantic segmentation has become one of the most active research areas in
the fields of computer vision and computer graphics, and has attracted increasing
attention in many applications of daily life, such as autonomous driving [1],
medical image analysis [2], geographic information systems and smart dressing
systems. The core task of semantic segmentation is to assign a semantic label,
e.g., table, bus, plane, person, or horse, to each pixel in an image. Although many
approaches [] have been proposed in recent years to tackle this task, and many
popular datasets have been built for semantic segmentation research, image semantic
segmentation still faces challenges.
Although deep learning based semantic segmentation is far more accurate than
traditional methods at extracting local features and making good predictions from a
small field of view, deep neural networks lack the capability to utilize global
context information and are slow to train. In addition, the long training times of
the current generation of CRFs used in post-processing make more in-depth research
and experiments with such structured models impractical.
For typical CNN-based segmentation networks, the original images in the training
dataset and their corresponding ground truths are used to train the segmentation
network, and training is guided by directly comparing the segmented results with
the ground truths.
In this paper, we apply a deep learning based approach to image semantic
segmentation. More specifically, we propose Seg-GANs for this task, inspired by
GANs and ConvCRFs. Similar to the standard GAN, Seg-GAN consists of two
feed-forward convolutional neural networks (CNNs), the segmentation network S and
the discriminator network D. The segmentation network plays a role similar to that
of the generative network in the original GAN: its goal is to generate segmentation
results from the input images by assigning a label to every pixel. The
discriminative network D aims to discover the discrepancy between the segmented
results and the corresponding ground-truth image. Besides, the discriminative
network adopts four cascaded ConvCRF [4] layers, which give the network a modest
improvement in accuracy and inference speed. In addition, a focal loss function is
introduced to calculate the confidence map generated by the discriminative network.
The loss function, which includes a two-class cross-entropy loss and an adversarial
loss, guides the training of the discriminative network and improves the accuracy.
In this paper, we demonstrate that the proposed networks are effective in the
segmentation task. Experiments show that our method outperforms current
state-of-the-art methods both perceptually and quantitatively.
Our proposed method differs from the existing traditional [5, 6] or other deep
learning based approaches [7-12]. The traditional approaches need to extract the
features of the images manually. The deep learning based approaches are usually
based on CNNs.
There are three main innovations in our work:
(1) We propose Seg-GANs, an end-to-end generative adversarial network for image
semantic segmentation based on the residual network framework. In our
algorithm, the generative network relies on a strong baseline segmentation
network to generate the class prediction image. Normally, a CRF-based
segmentation network can considerably improve the accuracy [9-11], but at the
risk of decreasing the computational efficiency and hindering the
construction of an end-to-end framework. We abandon the CRF post-processing
module in the segmentation network and put CRFs into the end-to-end training
to improve the accuracy.
(2) Based on the structure of GANs, Seg-GANs include a two-class cross-entropy
loss in the loss function. The discriminative network we designed consists of
a fully convolutional network with four cascaded ConvCRF layers and gives up
the multi-scale fusion method. The output of the discriminative network is a
confidence map rather than a single loss value, so the discriminator can
accept input images of any size. Each value in the confidence map is sampled
from a different region of the input and represents the confidence values of
all the segmentation targets in the image.
(3) ConvCRFs have been shown to speed up inference and training [4]. We are the
first to introduce ConvCRFs in place of CRFs to address this problem, fusing
ConvCRF layers with the discriminative network to improve computational
efficiency. We further demonstrate that Generative Adversarial Networks are
useful in the image semantic segmentation task and can achieve better
accuracy than other deep learning based methods. Our method directly learns
an end-to-end mapping that can effectively estimate reasonable results from
input images and makes the computation more efficient.
2 Related Works
In early studies, various methods were proposed for image semantic segmentation,
such as threshold-based, region-based [5], edge-based [6] and cluster-based
techniques. While these traditional methods simply separate objects from the
background, they require a large number of manually designed features. The quality
of these features directly determines the quality of the segmentation results,
which makes the process time-consuming, labor-intensive and impractical.
In recent years, with the growing research on convolutional neural networks and
deep learning frameworks for image semantic segmentation [13,14], more and more
studies on deep convolutional neural networks have been carried out to improve
semantic segmentation methods and, in turn, the accuracy of recognition and
prediction. Long et al. [7] were the first to apply deep convolutional neural
networks to the task of image semantic segmentation, replacing the fully connected
layers of classification networks with convolutional layers. This fully
convolutional network (FCN), as one of the most popular prototypes of the
encoder-decoder framework, is adopted for pixel-level image classification. By
up-sampling with transposed convolution, a full-size segmented image can be
restored with classified pixels. With the improvement in GPU performance and
optimization algorithms, researchers started to train larger and deeper neural
networks.
Noh et al. proposed DeconvNet [15] with a more extensive decoder than the
original FCN. The decoder is symmetric to the encoder in the number and sizes of
its feature maps. Aside from deconvolution, the DeconvNet decoder also uses
unpooling layers. Since DeconvNet uses two fully connected layers in its encoder,
it consumes relatively more memory than the original FCN.
Motivated by the need to reduce the number of parameters and the amount of
memory required by segmentation networks, Badrinarayanan et al. proposed SegNet
[1], whose encoder is topologically identical to the 13 convolutional layers of
VGG-16 [16]. In contrast to the original FCN and DeconvNet, its decoder contains
only up-sampling (unpooling) operations and convolutions, eliminating deconvolution
altogether. Architectures that store and reuse encoder feature maps during
classification outperform SegNet but require more memory during inference.
The second widely used approach in semantic segmentation is the dilated
convolutional structure [17]. The DeepLab method has been developed through four
versions. The main contributions of the first two versions, DeepLab-v1 [9] and
DeepLab-v2 [10], are the combination of convolutional neural networks with fully
connected CRFs and the application of dilated (atrous) convolution to convolutional
neural network models. The biggest difference between DeepLab-v3 [11] and
DeepLab-v3+ [12] and the previous two versions is that the CRF post-processing
module is abandoned in favor of a modified atrous spatial pyramid pooling (ASPP)
[10] structure, but the cascaded network structure inevitably puts tremendous
pressure on GPU memory.
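As a small illustration of why dilated convolution enlarges the receptive field
without adding parameters, the sketch below computes the span covered by a dilated
kernel using the standard relation k + (k - 1)(d - 1). The function name
effective_kernel_size is ours and purely illustrative; it is not taken from any of
the cited implementations.

```python
def effective_kernel_size(kernel_size: int, dilation: int) -> int:
    """Span (in pixels) covered by a dilated kernel along one axis.

    A k-tap kernel with dilation d inserts (d - 1) gaps between adjacent taps,
    so it covers k + (k - 1) * (d - 1) pixels while still using only k weights.
    """
    return kernel_size + (kernel_size - 1) * (dilation - 1)


# A 3x3 kernel at a few dilation rates (rate 1 is an ordinary convolution).
for rate in (1, 2, 4):
    print(f"dilation {rate}: covers {effective_kernel_size(3, rate)} pixels per axis")
```

The number of weights stays at 3 × 3 in every case; only the spacing between the
sampled positions changes.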
Recently, there has been a large body of successful applications based on
generative adversarial networks (GANs) (e.g., SRGAN [18], DCGAN [19], Pix2Pix
[20]) since Goodfellow et al. [3] first proposed GANs in 2014. GANs perform an
adversarial process alternating between identifying and faking, and the generative
adversarial loss is formulated to evaluate the discrepancy between the generated
distribution and the real data distribution. Much research shows that the
generative adversarial loss is beneficial for improving network performance.
Inspired by the success of GANs on image-to-image translation [20], we designed an
efficient GAN for image semantic segmentation. The work closest in scope to ours is
that of Luc et al. [8], where an adversarial network is used to aid the training
for semantic segmentation. However, it does not show substantial improvement over
the baseline.
To further introduce global information into CNNs, DeepLab uses fully connected
CRFs (FullCRFs) as an independent post-processing step. FullCRFs [21] utilize two
Gaussian kernels with hand-crafted features. In the original publication [21],
Krähenbühl and Koltun optimized the remaining parameters with a combination of grid
search and expectation maximization. In follow-up work [22], they used gradient
descent, exploiting the fact that the identity $(k_G * Q)' = k_G * Q'$ holds for
the message passing. However, because back-propagation is performed without
computing gradients with respect to the Gaussian kernel $k_G$, the features of the
Gaussian kernel cannot be learned. The subsequently proposed CRFasRNN [23] adopts
the same ideas to implement joint CRF and CNN training and, like [22], also
requires hand-crafted Gaussian features.
However, the long training times of the current generation of CRFs make more
in-depth research and experiments with such structured models impractical. To
circumvent the notoriously slow training and long inference times of CRFs,
Teichmann and Cipolla [4] developed Convolutional CRFs (ConvCRFs), a novel CRF
design that adds the strong and valid assumption of conditional independence and
thereby removes the need for the permutohedral lattice [24] approximation.
According to their validation experiments, this approach increases training and
inference speed by two orders of magnitude. Besides, the ConvCRF implementation
that uses a learnable compatibility transformation as well as learnable Gaussian
features performs best, and a large proportion of the inference can be reformulated
as convolutions, which can be implemented highly efficiently on GPUs.
3 Generative Adversarial Networks for image semantic segmentation
In this section, we introduce the proposed structure of Generative Adversarial
Networks for image semantic segmentation. The main framework of the proposed
algorithm is shown in Fig. 1. As in the original GAN model, the architecture of the
proposed Seg-GAN is based on two separate deep convolutional neural networks,
namely the segmentation network S and the discriminator network D, whose combined
efforts aim at obtaining a reasonable result for a given input image.

Figure 1: Architecture of the proposed Seg-GAN. (Diagram labels: Input Image,
Segmentation Network, Class predictions, Label Map / Ground truth, Discriminator
Network with Conv1 to Conv5 and LeakyReLU, ConvCRF, Confidence Map, Results;
losses Lce, Ladv, LD.)

3.1 Segmentation network
The segmentation network S is designed to generate a reasonable result by
segmenting the given input image. Its structure follows the DeepLab-v2 [10]
framework with a ResNet-101 [25] model pre-trained on the MS COCO dataset, without
CRF post-processing; this is our segmentation baseline network. To limit GPU memory
consumption, we do not adopt the multi-scale fusion proposed by Chen et al. [10].
Following recent work on semantic segmentation, we drop the last classification
layer and change the stride of the last two convolution layers from 2 to 1, making
the resolution of the output feature maps effectively 1/8 of the input image size.
To enlarge the receptive fields, we adopt dilated convolution in the conv4 and
conv5 layers with dilation rates of 2 and 4, respectively. After the last layer, we
use the Atrous Spatial Pyramid Pooling (ASPP) proposed by Chen et al. [10] as the
final classifier. Finally, we apply an up-sampling layer along with the softmax
output to match the size of the input image. The architecture of the segmentation
network S is shown in Fig. 2.

Figure 2: The architecture of the segmentation network S. (Diagram labels:
convolution blocks 64, 1×1; 64, 3×3; 256, 1×1 and 128, 1×1; 128, 3×3; 512, 1×1.)
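The 1/8 output resolution mentioned in Section 3.1 can be checked with a short
calculation. The sketch below assumes the usual per-stage strides of a standard
ResNet (conv1, max-pooling, then conv2 to conv5); the dictionary RESNET_STRIDES and
the function output_stride are illustrative names of ours, not part of the original
implementation.

```python
from math import prod

# Assumed per-stage strides of a standard ResNet backbone.
RESNET_STRIDES = {
    "conv1": 2,
    "pool1": 2,
    "conv2": 1,
    "conv3": 2,
    "conv4": 2,
    "conv5": 2,
}


def output_stride(strides: dict, unstrided: tuple = ()) -> int:
    """Total down-sampling factor; stages listed in `unstrided` use stride 1."""
    return prod(1 if name in unstrided else s for name, s in strides.items())


print(output_stride(RESNET_STRIDES))                      # unmodified backbone
print(output_stride(RESNET_STRIDES, ("conv4", "conv5")))  # last two stages at stride 1
```

With the last two stages set to stride 1, the down-sampling factor drops from 32 to
8, which is why dilation is then used in those stages to keep the receptive field
large.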

3.2 Discriminative network
The discriminative network D is proposed to compute the discrepancy between
the data distribution of the ground-truth labels and that of the predicted label
images generated by the generative network. It is designed in a fully convolutional
manner combined with cascaded ConvCRFs to differentiate the segmentation results
from the ground truth distribution and improve the details of the output images. It
consists of 5 convolution layers with 3 × 3 kernels, channel numbers
{64, 128, 256, 512, 1} and a stride of 2. Each convolution layer except the last is
followed by a Leaky-ReLU parameterized by 0.2. Four ConvCRF modules are cascaded
with the fully convolutional network.
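Because the discriminator is fully convolutional, the size of its confidence map
follows from the input size. The sketch below traces the spatial size through five
stride-2 convolutions with 3 × 3 kernels. The padding of 1 is our assumption (the
text does not state it), and both function names are illustrative only.

```python
def conv_output_size(size: int, kernel: int = 3, stride: int = 2, padding: int = 1) -> int:
    """Standard convolution output-size formula along one axis.

    padding=1 is an assumption; the paper does not specify the padding.
    """
    return (size + 2 * padding - kernel) // stride + 1


def confidence_map_size(height: int, width: int, num_layers: int = 5) -> tuple:
    """Spatial size after `num_layers` identical stride-2 convolutions."""
    for _ in range(num_layers):
        height = conv_output_size(height)
        width = conv_output_size(width)
    return height, width


print(confidence_map_size(319, 319))  # training crop size used in Section 4.2
print(confidence_map_size(320, 480))  # a non-square input also works
```

Under these assumptions the map is roughly 1/32 of the input in each dimension, so
it would need to be up-sampled if a per-pixel confidence at input resolution is
required.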
3.3 Generative adversarial loss
GAN-based models have been widely used for learning generative models due to
their success in image generation. GANs were proposed to overcome the disadvantages
of other generative models. Instead of maximizing the likelihood, GANs introduce
adversarial learning between the generator and the discriminator. This adversarial
process gives GANs clear advantages over other generative models. Moreover, GANs
can sample the generated data in a simple way, unlike other models in which
sampling is notoriously slow and inaccurate. These advantages are what led us to
adopt the GAN framework. We therefore adapt the GAN learning strategy to tackle the
problem of image semantic segmentation. More specifically, the proposed Seg-GAN
consists of two feed-forward convolutional neural networks (CNNs): the segmentation
network S and the discriminator network D. We use CNNs because they can greatly
stabilize GAN training. Seg-GAN follows an architecture guideline in which the
segmentation network is a CNN and the discriminator is a fully convolutional
network combined with cascaded ConvCRFs [4]. Batch normalization and the ReLU and
LeakyReLU activation functions are used in the segmentation network and the
discriminator to help stabilize GAN training.
The purpose of the segmentation network S is to generate labeled segmentation
results $S(x)$ from an input image $x$. Each input image $x$ has a corresponding
ground-truth image $y$, and $S(x)$ is encouraged to have the same data distribution
as $y$. The goal of the discriminator network D is to discover the discrepancy
between the data distribution of the segmentation results and that of the
corresponding ground-truth image. S and D compete with each other to achieve their
respective purposes, hence the term adversarial. To train the discriminator
network, we minimize the cross-entropy loss $L_D$ with respect to two classes. The
loss can be expressed as:

$$L_D = -\sum_{h,w} \left[ (1 - y_n) \log\left(1 - D(S(X_n))^{(h,w)}\right) + y_n \log\left(D(Y_n)^{(h,w)}\right) \right] \qquad (1)$$

In Eq. 1, $X_n$ is the input image of size $h \times w \times 3$. We denote the
segmentation network as $S(\cdot)$, with corresponding output $S(X_n)$, and our
fully convolutional discriminator as $D(\cdot)$. $Y_n$ is the corresponding
ground-truth label. $y_n = 1$ when the image is drawn from the ground-truth label,
and $y_n = 0$ when the image is generated by the segmentation network.
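To make Eq. 1 concrete, the following sketch evaluates the discriminator loss for
a single confidence map given as a nested list of values in (0, 1). It is a
minimal pure-Python illustration, not the training code: the function name
discriminator_loss and the small constant eps (added for numerical safety) are our
own assumptions.

```python
import math


def discriminator_loss(conf_map, y_n, eps=1e-12):
    """Two-class cross-entropy of Eq. 1 summed over all positions (h, w).

    conf_map: 2-D list holding D(.) at each position, with values in (0, 1).
    y_n:      1 if the discriminator input was a ground-truth label map,
              0 if it was produced by the segmentation network.
    eps:      assumed small constant guarding against log(0).
    """
    total = 0.0
    for row in conf_map:
        for d in row:
            total += (1 - y_n) * math.log(1 - d + eps) + y_n * math.log(d + eps)
    return -total


# An undecided discriminator (0.5 everywhere) pays log(2) per position.
print(discriminator_loss([[0.5, 0.5]], y_n=1))
# A confident, correct discriminator on a generated input pays much less.
print(discriminator_loss([[0.1, 0.1]], y_n=0))
```

Only one of the two terms is active for a given input, depending on whether it
comes from the ground truth or from the segmentation network.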

We propose to train the segmentation network by minimizing a multi-task loss
function:

$$L_{seg} = L_{ce} + \lambda L_{adv} \qquad (2)$$

where $L_{ce}$ and $L_{adv}$ denote the multi-class cross-entropy loss and the
adversarial loss, respectively, and $\lambda$ is a hyper-parameter that balances
the two terms of the multi-task loss function. $L_{ce}$ and $L_{adv}$ are given by:

$$L_{ce} = -\sum_{h,w} \sum_{c \in C} Y_n^{(h,w,c)} \log\left(S(X_n)^{(h,w,c)}\right) \qquad (3)$$

$$L_{adv} = -\sum_{h,w} \log\left(D(S(X_n))^{(h,w)}\right) \qquad (4)$$

where $c$ indexes the categories $C$ in the dataset. With this adversarial loss,
we train the segmentation network to fool the discriminator by maximizing the
probability that the segmentation prediction is considered to come from the ground
truth distribution.
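The multi-task loss of Eqs. 2 to 4 can be sketched in a few lines for a toy input.
The code below is an illustrative pure-Python version under our own conventions:
labels are given as class indices (equivalent to a one-hot $Y_n$), the function
names are hypothetical, and eps is an assumed constant for numerical safety.

```python
import math


def cross_entropy_loss(probs, labels, eps=1e-12):
    """Eq. 3: probs[h][w][c] are softmax outputs, labels[h][w] are class indices.

    With a one-hot Y_n only the true-class term survives the sum over c.
    """
    return -sum(
        math.log(probs[h][w][labels[h][w]] + eps)
        for h in range(len(labels))
        for w in range(len(labels[h]))
    )


def adversarial_loss(conf_map, eps=1e-12):
    """Eq. 4: conf_map[h][w] = D(S(X_n)) at each position."""
    return -sum(math.log(d + eps) for row in conf_map for d in row)


def segmentation_loss(probs, labels, conf_map, lam=0.01):
    """Eq. 2: L_seg = L_ce + lambda * L_adv (lam=0.01 is the best value in Table 2)."""
    return cross_entropy_loss(probs, labels) + lam * adversarial_loss(conf_map)


# One pixel, two classes, undecided segmentation and undecided discriminator.
print(segmentation_loss([[[0.5, 0.5]]], [[0]], [[0.5]], lam=0.01))
```

Minimizing the adversarial term pushes $D(S(X_n))$ toward 1, which is the
"fooling" behavior described above, while the small weight keeps the
cross-entropy term dominant.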
4 Experiments
4.1 Dataset
We now detail the setup of our preliminary experiments on the PASCAL VOC2012
[26] segmentation benchmark, a commonly used evaluation benchmark for semantic
segmentation. It contains 20 object classes plus the background, with annotations
on everyday images. As is common practice, we use the extra annotation set in SBD
[27] for training, which provides a total of 10582 training images. We evaluate our
models on the standard validation set of 1449 images.
4.2 Training settings
During training, we apply random scaling and cropping with size 319 × 319 to
each image. The weights of the networks are initialized from the ResNet-101 model
pre-trained on the MS COCO dataset. We use Stochastic Gradient Descent (SGD) with
Nesterov acceleration as the optimizer, with a momentum of 0.9 and a weight decay
of $5 \times 10^{-4}$. The initial learning rate is set to $2.5 \times 10^{-4}$ and
is decreased with polynomial decay with a power of 0.9, following Chen et al. [9].
To train the discriminator, we use an Adam solver with a learning rate of
$10^{-4}$ and the same polynomial decay as the segmentation network. The momentum
parameters are set to 0.9 and 0.999. With each update of the segmentation network
S, the generative network G is also updated once. We trained each model for 50000
iterations with a batch size of 11 on an Nvidia GeForce GTX 1080 Ti GPU using
PyTorch [28].
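The polynomial learning-rate decay mentioned above can be written as a one-line
schedule. The sketch below uses the commonly used "poly" form, base_lr × (1 −
iter / max_iter)^power, with the values stated in this section; the function name
poly_lr is ours, and the exact form used in the original code is an assumption.

```python
def poly_lr(base_lr: float, iteration: int, max_iter: int = 50000, power: float = 0.9) -> float:
    """Polynomial ("poly") decay: base_lr * (1 - iteration / max_iter) ** power."""
    return base_lr * (1 - iteration / max_iter) ** power


# Segmentation network (base 2.5e-4) and discriminator (base 1e-4) schedules.
for it in (0, 10000, 25000, 49999):
    print(it, poly_lr(2.5e-4, it), poly_lr(1e-4, it))
```

The rate falls almost linearly for most of training and reaches zero only at the
final iteration.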
To show the capabilities of Seg-GANs, we compare our method with several
state-of-the-art algorithms, including Luc et al., SegNet, FCN, DeepLab-v2 and
DeepLab-v3. We use the PASCAL VOC2012 dataset as a basis, but augment the
ground truth with the goal of simulating prediction errors.
MIoU is the most commonly used evaluation metric for semantic segmentation. It
computes the ratio of the intersection to the union of two sets, here the predicted
and the ground-truth pixels of a class. IoU is first calculated for each class and
then averaged over all classes to obtain MIoU. MIoU can be defined as:
$$MIoU = \frac{1}{t+1} \sum_{i=0}^{t} \frac{p_{ii}}{\sum_{j=0}^{t} p_{ij} + \sum_{j=0}^{t} p_{ji} - p_{ii}} \qquad (5)$$

where $t+1$ is the number of categories, and $p_{ii}$, $p_{ij}$ and $p_{ji}$
denote true positives, false positives and false negatives, respectively.
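Eq. 5 is straightforward to evaluate from a confusion matrix. The sketch below
assumes confusion[i][j] counts the pixels of class i predicted as class j; the
function name mean_iou is ours, and skipping classes that never occur (instead of
dividing by zero) is our own convention rather than something stated in the text.

```python
def mean_iou(confusion):
    """Mean IoU of Eq. 5 from a square confusion matrix.

    confusion[i][j] is assumed to be the number of pixels whose true class is i
    and whose predicted class is j. Classes with an empty union are skipped,
    which is an assumption made here to avoid division by zero.
    """
    n = len(confusion)
    ious = []
    for i in range(n):
        p_ii = confusion[i][i]
        row_sum = sum(confusion[i])                       # sum_j p_ij
        col_sum = sum(confusion[j][i] for j in range(n))  # sum_j p_ji
        union = row_sum + col_sum - p_ii
        if union > 0:
            ious.append(p_ii / union)
    return sum(ious) / len(ious) if ious else 0.0


# Two classes, 8 pixels, one mistake in each direction: IoU = 3 / 5 per class.
print(mean_iou([[3, 1], [1, 3]]))
```

When every class appears at least once, this is exactly the average over the
$t+1$ classes in Eq. 5.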

To obtain the best performance of the model, we tried several values of the
hyper-parameter λ (see Equation (2)): 0.01, 0.02, 0.05, and 0.005. Each setting was
trained for 50000 iterations, which takes about 12 hours. The loss curves for
λ = 0.01 are shown in Fig. 3.

Figure 3. The loss function curve under λ = 0.01 of the segmentation network
and discriminative network respectively.
Due to the use of a pre-trained model, the segmentation network converges after
approximately 20,000 iterations, and its loss fluctuates between 0.1 and 0.3. At
this point, the loss of the discriminator network fluctuates around 0.2. To select
the best model, we evaluated the MIoU of the models saved between 10,000 and 50,000
iterations on a test dataset, with a step size of 1000, as shown in Fig. 4.

Figure 4. The curve of MIoU, from 10,000 iterations to 50,000 iterations , with a step
size of 1000.

4.2 Intuitive visual comparison

To show the capabilities of Seg-GANs, in this section we compare our method
with the state-of-the-art algorithm DeepLab-v2 [10]. In the generative network, we
use DeepLab-v2 without CRF post-processing as our base segmentation network.

Figure 5 illustrates the comparative results of our method and DeepLab-v2. As
shown in Fig. 5, the results of our method are more detailed and complete than
those of the DeepLab-v2 base model. When segmenting targets with more complex
shapes, our method maintains the details without the large splits seen with
DeepLab-v2. Moreover, the algorithm proposed in this paper does not use a
post-processing step, which makes it more efficient than DeepLab-v2.

Figure 5. The results of DeepLab-v2 and Seg-GANs: (a) Input, (b) DeepLab-v2,
(c) Seg-GANs.
4.3 Quantitative comparisons
To demonstrate the effectiveness of our proposed method, we compare Seg-GAN
with several methods: SegNet, FCN, Luc et al., DeepLab-v2 and DeepLab-v3. Since the
dataset is exactly the same, we directly quote the results of SegNet and FCN from
the original papers. We trained and tested DeepLab-v2 and DeepLab-v3 according to
their papers. As shown in Table 1, our method yields the highest MIoU (80.14%),
ahead of the best competing result, DeepLab-v2 with post-processing (78.27%).
Table 1 The average results of MIoU (%) on the VOC2012 dataset [26]

Number  Methods                                      MIoU (%)
1       SegNet [1]                                   60.50
2       FCN [7]                                      67.20
3       Luc et al. [8]                               72.0
4       DeepLab-v2 (without post-processing) [10]    75.94
5       DeepLab-v2 (with post-processing) [10]       78.27
6       DeepLab-v3 [11]                              77.21
7       Ours (λ = 0.01)                              80.14

The performance of the proposed model is sensitive to the hyper-parameter λ,
whose value largely affects the accuracy of the segmentation. We therefore tested
different values: 0.01, 0.02, 0.05, and 0.005. Table 2 gives the effect of these
values on the performance of the proposed model.
Table 2 Hyper-parametric analysis

λ       MIoU (%)
0.01    80.14
0.02    79.78
0.05    78.34
0.005   78.17

5 Conclusion
In this paper, a new end-to-end semantic segmentation model called Seg-GANs is
proposed. In the Seg-GANs algorithm, cascaded ConvCRFs, which add the strong and
valid assumption of conditional independence, are combined with the GAN in the
discriminative network, and the cross-entropy loss and adversarial loss are used to
guide the training process through back-propagation. Our generative network builds
on a strong existing baseline segmentation network to estimate the segmentation of
the original image. The discriminative network differentiates the segmentation
results from the ground truth distribution and improves the details of the output
images. The results show that the proposed Seg-GANs considerably improve the
accuracy of the segmentation results. Our future work will mainly focus on
exploring the potential of ConvCRFs in other structured applications such as
instance segmentation, on further improving the accuracy on small samples, and on
reducing training time.

Declarations:
1. Availability of data and material: The dataset used during the current study,
the VOC2012 dataset [26], is available online or from the corresponding author on
reasonable request.
2. Competing interests: The authors declare that they have no competing interests.
3. Funding: This work was supported in part by the National Natural Science
Foundation of China (Grant No. 61403281), the Natural Science Foundation of
Shandong Province (ZR2014FM002), and the China Postdoctoral Science Special
Foundation Funded Project (2015T80717).
4. Authors' contributions: ZZ was a major contributor in writing the manuscript,
and he analyzed and interpreted the entire framework of GANs with the help of ZW.
Q.S. and H.Y. performed the coding and experiments. All authors read and approved
the final manuscript.
5. Acknowledgments: Not applicable
References
1. Badrinarayanan V, Kendall A, Cipolla R. SegNet: A Deep Convolutional
Encoder-Decoder Architecture for Image Segmentation[J]. IEEE Transactions on
Pattern Analysis and Machine Intelligence, 2017:1-1.
2. Ronneberger O, Fischer P, Brox T. U-Net: Convolutional Networks for
Biomedical Image Segmentation[M]// Medical Image Computing and
Computer-Assisted Intervention — MICCAI 2015. Springer International
Publishing, 2015:234-241.
3. Goodfellow I J, Pouget-Abadie J, Mirza M, et al. Generative Adversarial
Networks[J]. Advances in Neural Information Processing Systems, 2014,
3:2672-2680.
4. Teichmann M T T, Cipolla R. Convolutional CRFs for Semantic Segmentation[J].
arXiv preprint arXiv:1805.04777, 2018.
5. Shih F Y, Cheng S. Automatic seeded region growing for color image
segmentation[J]. Image and Vision Computing, 2005, 23(10):877-886.
6. Beare R. A Locally Constrained Watershed Transform[J]. IEEE Transactions on
Pattern Analysis & Machine Intelligence, 2006, 28(7):1063-1074.
7. Long J, Shelhamer E, Darrell T. Fully Convolutional Networks for Semantic
Segmentation[J]. IEEE Transactions on Pattern Analysis & Machine Intelligence,
2014, 39(4):640-651.
8. Luc P, Couprie C, Chintala S, et al. Semantic Segmentation using Adversarial
Networks[J]. arXiv preprint arXiv:1611.08408, 2016.
9. Chen L C, Papandreou G, Kokkinos I, et al. Semantic Image Segmentation with
Deep Convolutional Nets and Fully Connected CRFs[J]. Computer Science,
2014(4):357-361.
10. Chen L C, Papandreou G, Kokkinos I, et al. DeepLab: Semantic Image
Segmentation with Deep Convolutional Nets, Atrous Convolution, and Fully
Connected CRFs[J]. IEEE Transactions on Pattern Analysis & Machine
Intelligence, 2016, 40(4):834-848.
11. Chen L C, Papandreou G, Schroff F, et al. Rethinking Atrous Convolution for
Semantic Image Segmentation[J]. arXiv preprint arXiv:1706.05587, 2017.
12. Liang-Chieh Chen, Yukun Zhu, George Papandreou, Florian Schroff, Hartwig
Adam. Encoder-Decoder with Atrous Separable Convolution for Semantic Image
Segmentation. The European Conference on Computer Vision (ECCV), 2018, pp.
801-818.

13. Zhao Z, Sun Q, Yang H, et al. Compression artifacts reduction by improved
generative adversarial networks[J]. EURASIP Journal on Image and Video
Processing, 2019, 2019(1):62.
14. Chenggang Yan, Liang Li, Chunjie Zhang, et al. Cross-modality Bridging and
Knowledge Transferring for Image Understanding[J]. IEEE Transactions on
Multimedia, 2019, PP(99):1-1.
15. H. Noh, S. Hong and B. Han, "Learning Deconvolution Network for Semantic
Segmentation," 2015 IEEE International Conference on Computer Vision (ICCV),
Santiago, 2015, pp. 1520-1528.
16. Simonyan K, Zisserman A. Very Deep Convolutional Networks for Large-Scale
Image Recognition[J]. Computer Science, 2014.
17. Yu F, Koltun V. Multi-Scale Context Aggregation by Dilated Convolutions[J].
arXiv preprint arXiv:1511.07122, 2015.
18. Ledig C, Wang Z, Shi W, et al. Photo-Realistic Single Image Super-Resolution
Using a Generative Adversarial Network. arXiv.org, Sept. 2016.
19. Radford A, Metz L, Chintala S. Unsupervised Representation Learning with Deep
Convolutional Generative Adversarial Networks[J]. Computer Science, 2015.
20. P. Isola, J. Zhu, T. Zhou and A. A. Efros, "Image-to-Image Translation with
Conditional Adversarial Networks," 2017 IEEE Conference on Computer Vision
and Pattern Recognition (CVPR), Honolulu, Hawaii, USA, 2017, pp. 5967-5976.
21. Philipp Krähenbühl and Vladlen Koltun. Efficient inference in fully connected
crfs with Gaussian edge potentials. In Advances in neural information processing
systems. 2011, pp. 109–117.
22. Philipp Krähenbühl and Vladlen Koltun. Parameter learning and convergent
inference for dense random fields. In International Conference on Machine
Learning. 2013, pp. 513–521.
23. Shuai Zheng, Sadeep Jayasumana, Bernardino Romera-Paredes, Vibhav Vineet,
Zhizhong Su, Dalong Du, Chang Huang, and Philip HS Torr. Conditional random
fields as recurrent neural networks. In Proceedings of the IEEE International
Conference on Computer Vision. 2015, pp. 1529–1537.
24. Andrew Adams, Jongmin Baek, and Myers Abraham Davis. Fast
high-dimensional filtering using the permutohedral lattice. In Computer Graphics
Forum, volume 29, pages 753–762. Wiley Online Library, 2010.
25. He K, Zhang X, Ren S, et al. Deep Residual Learning for Image Recognition[C]//
IEEE Conference on Computer Vision & Pattern Recognition. 2016.
26. Everingham M, Gool L V, Williams C K I, et al. The Pascal Visual Object Classes
(VOC) Challenge[J]. International Journal of Computer Vision, 2010,
88(2):303-338.
27. Hariharan B, Arbelaez P, Bourdev L D, et al. Semantic contours from inverse
detectors[C]// IEEE International Conference on Computer Vision, ICCV 2011,
Barcelona, Spain, November 6-13, 2011. IEEE, 2011.
28. Adam Paszke, Sam Gross, Soumith Chintala, Gregory Chanan, Edward Yang,
Zachary DeVito, Zeming Lin, Alban Desmaison, Luca Antiga, and Adam Lerer.
Automatic differentiation in PyTorch. In NIPS-W, 2017.
