StructGAN: Image Restoration Maintaining Structural Consistency Using A Two-Step Generative Adversarial Network

Show simple item record

dc.contributor.author Zahin, Nahian Muhtasim
dc.contributor.author Rahman, Md. Mushfiqur
dc.contributor.author Mahmud, Kazi Raiyan
dc.date.accessioned 2022-04-17T17:23:17Z
dc.date.available 2022-04-17T17:23:17Z
dc.date.issued 2021-03-30
dc.identifier.uri http://hdl.handle.net/123456789/1350
dc.description Supervised by Prof. Md. Hasanul Kabir, Department of Computer Science and Engineering (CSE), Islamic University of Technology (IUT), Board Bazar, Gazipur-1704, Bangladesh en_US
dc.description.abstract Image restoration deals with removing noise, blur, missing patches, and other distortions from degraded images. Traditional reconstruction and restoration approaches suffer from various limitations. In our work, we improve upon those models by introducing a novel structure loss that emphasizes the overall image structure rather than individual pixels. Our proposed model, StructGAN, achieves a higher SSIM (Structural Similarity Index Measure) score without substantially compromising other quality metrics. StructGAN is a generative adversarial network built from a two-step generator, a dual discriminator, and a coherent semantic attention (CSA) layer. The two-step generator refines the output, the dual discriminator ensures both local and global correctness, and the CSA layer ensures semantic consistency. On top of these components, the model incorporates the novel structure loss: a Laplacian filter computes a structure map of the image, and the loss encourages the generator to replicate that structure map in the generated output. The results obtained by our model are qualitatively comparable to those of state-of-the-art models, and for certain metrics, e.g. SSIM, StructGAN quantitatively outperforms them. en_US
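The Laplacian-based structure loss described in the abstract can be sketched as follows. This is a minimal illustration, not the thesis implementation: the 4-neighbour Laplacian kernel, the edge-replicating boundary handling, and the L1 mean over the structure-map difference are assumptions chosen for clarity.

```python
import numpy as np

def structure_map(img):
    """Structure map of a 2-D grayscale image via a 4-neighbour Laplacian.

    Equivalent to convolving with [[0,1,0],[1,-4,1],[0,1,0]];
    boundaries are handled by replicating edge pixels.
    """
    p = np.pad(np.asarray(img, dtype=np.float64), 1, mode="edge")
    return (p[:-2, 1:-1] + p[2:, 1:-1] +      # up + down neighbours
            p[1:-1, :-2] + p[1:-1, 2:] -      # left + right neighbours
            4.0 * p[1:-1, 1:-1])              # minus 4x centre pixel

def structure_loss(generated, target):
    """Mean absolute difference between the two images' structure maps."""
    return np.abs(structure_map(generated) - structure_map(target)).mean()
```

In training, a term like this would be weighted and added to the adversarial and reconstruction losses, so the generator is penalized for deviating from the target's edge structure even when per-pixel intensities are close.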
dc.language.iso en en_US
dc.publisher Department of Computer Science and Engineering (CSE), Islamic University of Technology (IUT), Board Bazar, Gazipur, Bangladesh en_US
dc.title StructGAN: Image Restoration Maintaining Structural Consistency Using A Two-Step Generative Adversarial Network en_US
dc.type Thesis en_US

