A Study about Learning Graph Representation on Farmhouse Apple Quality Images with Graph Transformer

Ji Hun Bae; Ju Hwan Lee; Gwang Hyun Yu; Gyeong Ju Kwon; Jin Young Kim

doi:10.30693/SMJ.2023.12.1.9

Smart Media Journal (스마트미디어저널)

Vol. 12, No. 1 / pp. 9-16 / 2023 / pISSN 2287-1322 / eISSN 2288-9671

The Korean Institute of Smart Media (한국스마트미디어학회)

Korean title: 그래프 트랜스포머 기반 농가 사과 품질 이미지의 그래프 표현 학습 연구

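Ji Hun Bae (Department of ICT Convergence System Engineering, Chonnam National University);
Ju Hwan Lee (Department of ICT Convergence System Engineering, Chonnam National University);
Gwang Hyun Yu (Department of ICT Convergence System Engineering, Chonnam National University);
Gyeong Ju Kwon (Linux IT Co., Ltd.);
Jin Young Kim (Department of ICT Convergence System Engineering, Chonnam National University)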

Received: 2022.11.30
Reviewed: 2023.01.06
Published: 2023.02.28

Abstract

Convolutional neural network (CNN) based systems are being developed to overcome the limits of human labor in apple quality sorting at farmhouses. However, because a CNN accepts only images of a fixed size, preprocessing such as resampling may be required, and excessive resampling causes information loss in the original image, such as degraded image quality and blurring. To minimize this problem, this paper generates a patch-based graph from the original image and proposes a random walk-based positional encoding method for the graph transformer model. The method continuously learns positional embeddings, based on the random walk algorithm, for patches that carry no positional information, and finds the optimal graph structure by aggregating informative node features through the self-attention mechanism of the existing graph transformer. As a result, it is robust and performs well even on new graph structures with random node order and on arbitrary graph structures that depend on the location of the object in the image. In experiments on five apple quality datasets, the learning accuracy was higher than that of other GNN models by 1.3% to 4.7%, and the model has 3.59M parameters, which is about 15% of the 23.52M of the ResNet18 model (roughly 85% fewer). The reduced computation yields fast inference, which demonstrates the effectiveness of the method.
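The abstract does not spell out how the patch graph or the positional encoding is computed, so the following is only a minimal sketch under stated assumptions, not the authors' implementation. It assumes non-overlapping square patches as nodes, an 8-neighbour adjacency on the patch grid, and the return-probability form of random walk positional encoding that is common in the graph transformer literature. All function names (`image_to_patches`, `patch_adjacency`, `random_walk_pe`) are hypothetical.

```python
import numpy as np


def image_to_patches(image, patch):
    """Split an H x W x C image into non-overlapping patch x patch tiles.

    Assumption: rows/columns that do not fill a whole patch are dropped.
    Returns flattened patch features and the (row, col) grid coordinates.
    """
    h, w = image.shape[:2]
    rows, cols = h // patch, w // patch
    feats, coords = [], []
    for r in range(rows):
        for c in range(cols):
            tile = image[r * patch:(r + 1) * patch, c * patch:(c + 1) * patch]
            feats.append(tile.reshape(-1))
            coords.append((r, c))
    return np.asarray(feats, dtype=np.float64), np.asarray(coords)


def patch_adjacency(coords):
    """Connect patches that touch on the grid (8-neighbourhood, assumed)."""
    # Chebyshev distance between every pair of grid coordinates.
    cheb = np.abs(coords[:, None, :] - coords[None, :, :]).max(axis=-1)
    return (cheb == 1).astype(np.float64)  # no self-loops


def random_walk_pe(adj, k):
    """Return-probability encoding: column s holds diag((D^-1 A)^(s+1))."""
    n = adj.shape[0]
    deg = adj.sum(axis=1)
    deg[deg == 0] = 1.0            # guard isolated nodes against division by zero
    rw = adj / deg[:, None]        # row-stochastic transition matrix
    pe = np.empty((n, k))
    power = np.eye(n)
    for step in range(k):
        power = power @ rw
        pe[:, step] = np.diag(power)
    return pe


# Images of different sizes become graphs with different node counts,
# while the per-node feature size stays fixed, so no resizing is needed.
for shape in [(64, 48, 3), (96, 96, 3)]:
    feats, coords = image_to_patches(np.zeros(shape), patch=16)
    pe = random_walk_pe(patch_adjacency(coords), k=4)
    print(shape, feats.shape, pe.shape)
```

Because the encoding depends only on the graph structure, reordering the nodes simply reorders the rows of the encoding, which is in line with the robustness to random node order described in the abstract. In a learned variant, these vectors would typically be passed through a trainable projection before being added to the node features; that step is omitted here.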

Project Information

This research was supported by the 2022 Technology Development Program of the Ministry of SMEs and Startups (No. S3268440).
