• Title/Summary/Keyword: learning through the image


Land Use Feature Extraction and Sprawl Development Prediction from Quickbird Satellite Imagery Using Dempster-Shafer and Land Transformation Model

  • Saharkhiz, Maryam Adel;Pradhan, Biswajeet;Rizeei, Hossein Mojaddadi;Jung, Hyung-Sup
    • Korean Journal of Remote Sensing
    • /
    • v.36 no.1
    • /
    • pp.15-27
    • /
    • 2020
  • Accurate knowledge of land use/land cover (LULC) features and their changes over time is essential for sustainable urban management. Urban sprawl has always been a worldwide concern that needs to be carefully monitored, particularly in developing countries where unplanned building construction has been expanding at a high rate. Recently, remotely sensed imagery with very high spatial/spectral resolution and state-of-the-art machine learning approaches have taken urban classification and growth monitoring to a higher level. In this research, we classified Quickbird satellite imagery using object-based image analysis with Dempster-Shafer theory (OBIA-DS) for the years 2002 and 2015 over Karbala, Iraq. The actual LULC changes between these years, including residential sprawl expansion, were identified through a change detection procedure. Based on the extracted LULC features and the detected urban growth trend, future LULC dynamics were simulated using the land transformation model (LTM) on a geospatial information system (GIS) platform. Both the classification and prediction stages were validated against ground control points (GCPs) using the Kappa coefficient, which was 0.87 and 0.91 for the 2002 and 2015 classifications, respectively, and 0.79 for the prediction. Detailed results revealed substantial growth in built-up areas over the fifteen years, mostly replacing agricultural and orchard fields. The predicted LULC sprawl scenario for 2030 showed a substantial decline in green and agricultural land as well as an extensive increase in built-up area, especially on the outskirts of the city and without following residential planning standards. The proposed method helps urban decision-makers identify the detailed spatio-temporal growth pattern of highly populated cities like Karbala. Additionally, the results of this study can be regarded as a probable future map for planning adequate social services and amenities for local inhabitants.
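
To make the evidence-fusion step more concrete, here is a minimal sketch of Dempster's rule of combination for two mass functions over a set of LULC classes; the class names and mass values are illustrative placeholders, not data or code from the paper.

```python
# A minimal sketch of Dempster's rule of combination for two evidence
# sources over LULC classes; all class names and masses are illustrative.

def combine_dempster(m1, m2):
    """Combine two mass functions (dicts: frozenset of classes -> mass)."""
    combined = {}
    conflict = 0.0
    for a, ma in m1.items():
        for b, mb in m2.items():
            inter = a & b
            if inter:
                combined[inter] = combined.get(inter, 0.0) + ma * mb
            else:
                conflict += ma * mb
    if conflict >= 1.0:
        raise ValueError("Total conflict: evidence cannot be combined")
    # Normalize by (1 - K), where K is the total conflicting mass.
    return {k: v / (1.0 - conflict) for k, v in combined.items()}

# Example: spectral evidence vs. texture evidence for one image object.
spectral = {frozenset({"built-up"}): 0.6, frozenset({"agriculture"}): 0.3,
            frozenset({"built-up", "agriculture"}): 0.1}
texture = {frozenset({"built-up"}): 0.5, frozenset({"orchard"}): 0.2,
           frozenset({"built-up", "agriculture", "orchard"}): 0.3}
print(combine_dempster(spectral, texture))
```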

2D and 3D Hand Pose Estimation Based on Skip Connection Form (스킵 연결 형태 기반의 손 관절 2D 및 3D 검출 기법)

  • Ku, Jong-Hoe;Kim, Mi-Kyung;Cha, Eui-Young
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.24 no.12
    • /
    • pp.1574-1580
    • /
    • 2020
  • Traditional pose estimation methods rely either on special devices or on images processed with image-processing techniques. The disadvantage of using a device is that the environments in which it can be used are limited and the cost is high. Using cameras and image processing reduces environmental constraints and cost, but the performance is lower. Convolutional neural networks (CNNs) have been studied for pose estimation using only a camera, without these disadvantages, and various techniques have been proposed to increase recognition performance. In this paper, the effect of skip connections on the network was examined by applying various skip connection forms to hand joint recognition. Experiments confirmed that adding skip connections beyond the basic ones improves performance, and that the network with downward skip connections performs best.
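
As an illustration of the general idea only (not the specific skip-connection forms evaluated in the paper), the following minimal PyTorch sketch shows a convolutional block with an additive skip connection; the channel count and layer sizes are assumptions.

```python
# A minimal PyTorch sketch of a conv block with an additive skip connection.
import torch
import torch.nn as nn

class SkipBlock(nn.Module):
    def __init__(self, channels):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, 3, padding=1)
        self.conv2 = nn.Conv2d(channels, channels, 3, padding=1)
        self.act = nn.ReLU(inplace=True)

    def forward(self, x):
        identity = x                      # tensor carried over the skip path
        out = self.act(self.conv1(x))
        out = self.conv2(out)
        return self.act(out + identity)   # skip connection: add the input back

x = torch.randn(1, 64, 32, 32)
print(SkipBlock(64)(x).shape)             # torch.Size([1, 64, 32, 32])
```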

Performance Evaluation of Efficient Vision Transformers on Embedded Edge Platforms (임베디드 엣지 플랫폼에서의 경량 비전 트랜스포머 성능 평가)

  • Minha Lee;Seongjae Lee;Taehyoun Kim
    • IEMEK Journal of Embedded Systems and Applications
    • /
    • v.18 no.3
    • /
    • pp.89-100
    • /
    • 2023
  • Recently, on-device artificial intelligence (AI) solutions using mobile devices and embedded edge devices have emerged in various fields, such as computer vision, to address network traffic burdens, low-energy operation, and security problems. Although vision transformer deep learning models have outperformed conventional convolutional neural network (CNN) models in computer vision, they require more computation and parameters than CNN models. Thus, they are not directly applicable to embedded edge devices with limited hardware resources. Many researchers have proposed various model compression methods and lightweight architectures for vision transformers; however, only a few studies evaluate how model compression techniques for vision transformers affect performance. To address this gap, this paper presents a performance evaluation of vision transformers on embedded platforms. We investigated the behaviors of three vision transformers: DeiT, LeViT, and MobileViT. Each model's performance was evaluated by accuracy and inference time on edge devices using the ImageNet dataset. We assessed the effects of the quantization method applied to the models on latency improvement and accuracy degradation by profiling the proportion of response time occupied by major operations. In addition, we evaluated the performance of each model on GPU- and EdgeTPU-based edge devices. In our experiments, LeViT showed the best performance on CPU-based edge devices, and DeiT-small showed the highest performance improvement on GPU-based edge devices. In addition, only the MobileViT models showed performance improvement on EdgeTPU. Summarizing the profiling results, the degree of performance improvement of each vision transformer model was highly dependent on the proportion of operations that could be optimized on the target edge device. In summary, to apply vision transformers to on-device AI solutions, both a proper operation composition and optimizations specific to the target edge device must be considered.
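
As a rough illustration of the quantization-versus-latency comparison described above, here is a minimal PyTorch sketch that applies post-training dynamic quantization to a toy model and times it; the toy model stands in for DeiT/LeViT/MobileViT, and the measurements are not comparable to the paper's results.

```python
# A minimal sketch of post-training dynamic quantization plus a simple
# latency measurement; the toy MLP is a placeholder for a vision transformer.
import time
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(384, 1536), nn.GELU(), nn.Linear(1536, 384))
model.eval()

# Quantize Linear-layer weights to int8 (dynamic quantization).
quantized = torch.quantization.quantize_dynamic(
    model, {nn.Linear}, dtype=torch.qint8)

def avg_latency_ms(m, runs=100):
    x = torch.randn(1, 384)
    with torch.no_grad():
        start = time.perf_counter()
        for _ in range(runs):
            m(x)
    return (time.perf_counter() - start) / runs * 1e3

print(f"fp32: {avg_latency_ms(model):.3f} ms")
print(f"int8: {avg_latency_ms(quantized):.3f} ms")
```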

Semantic Object Detection based on LiDAR Distance-based Clustering Techniques for Lightweight Embedded Processors (경량형 임베디드 프로세서를 위한 라이다 거리 기반 클러스터링 기법을 활용한 의미론적 물체 인식)

  • Jung, Dongkyu;Park, Daejin
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.26 no.10
    • /
    • pp.1453-1461
    • /
    • 2022
  • The accuracy of surrounding-object recognition algorithms that use 3D sensors such as LiDAR in autonomous vehicles has been improved through many studies, but these algorithms require high-performance hardware and complex structures. Such object recognition algorithms place a heavy load on the main processor of an autonomous vehicle, which must run and manage many processes while driving. To reduce this load while still exploiting the advantages of 3D sensor data, we propose 2D data-based recognition using ROIs generated by extracting physical properties from the 3D sensor data. In an environment where the brightness of the base image was reduced by 50%, the proposed method showed 5.3% higher accuracy and 28.57% shorter execution time than the existing 2D-based model. Although its accuracy on the base image is 2.46% lower than that of the 3D-based model, its execution time is 6.25% shorter.
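
The clustering-to-ROI step can be sketched as follows; DBSCAN is used here as a stand-in distance-based clustering method, and the random point cloud and parameter values are purely illustrative, not the paper's implementation.

```python
# A minimal sketch of distance-based clustering of LiDAR points into object
# ROIs; the point cloud, eps, and min_samples are illustrative placeholders.
import numpy as np
from sklearn.cluster import DBSCAN

points = np.random.rand(500, 3) * 20.0            # placeholder x, y, z points (m)
labels = DBSCAN(eps=0.8, min_samples=5).fit_predict(points)

for cid in set(labels) - {-1}:                    # -1 marks noise points
    cluster = points[labels == cid]
    mins, maxs = cluster.min(axis=0), cluster.max(axis=0)
    # The 3D bounding box could then be projected into the image to form a 2D ROI.
    print(f"cluster {cid}: {len(cluster)} pts, bbox min {mins}, max {maxs}")
```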

Development of an Automatic Classification Model for Construction Site Photos with Semantic Analysis based on Korean Construction Specification (표준시방서 기반의 의미론적 분석을 반영한 건설 현장 사진 자동 분류 모델 개발)

  • Park, Min-Geon;Kim, Kyung-Hwan
    • Korean Journal of Construction Engineering and Management
    • /
    • v.25 no.3
    • /
    • pp.58-67
    • /
    • 2024
  • In the era of the fourth industrial revolution, data plays a vital role in enhancing the productivity of industries. To advance digitalization in the construction industry, which suffers from a lack of available data, this study proposes a model that classifies construction site photos by work types. Unlike traditional image classification models that solely rely on visual data, the model in this study includes semantic analysis of construction work types. This is achieved by extracting the significance of relationships between objects and work types from the standard construction specification. These relationships are then used to enhance the classification process by correlating them with objects detected in photos. This model improves the interpretability and reliability of classification results, offering convenience to field operators in photo categorization tasks. Additionally, the model's practical utility has been validated through integration into a classification program. As a result, this study is expected to contribute to the digitalization of the construction industry.
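
A minimal sketch of correlating detected objects with work types through specification-derived relation weights might look like the following; the object names, work types, weights, and blending rule are hypothetical, not taken from the Korean construction specification or the paper's model.

```python
# Hedged sketch: re-score work-type probabilities from an image classifier
# using hypothetical object-to-work-type relation weights derived from a
# specification document. All names and numbers are illustrative.
relation = {                        # relation[object][work_type] = weight
    "rebar":    {"reinforcement work": 0.9, "concrete work": 0.4},
    "formwork": {"concrete work": 0.8, "reinforcement work": 0.3},
}

def rescore(class_probs, detected_objects, alpha=0.5):
    """Blend visual class probabilities with semantic relation scores."""
    scores = dict(class_probs)
    for obj in detected_objects:
        for work_type, w in relation.get(obj, {}).items():
            scores[work_type] = scores.get(work_type, 0.0) + alpha * w
    total = sum(scores.values())
    return {k: v / total for k, v in scores.items()}   # renormalize

probs = {"concrete work": 0.5, "reinforcement work": 0.3, "earth work": 0.2}
print(rescore(probs, ["rebar", "formwork"]))
```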

A study on counting number of passengers by moving object detection (이동 객체 검출을 통한 승객 인원 개수에 대한 연구)

  • Yoo, Sang-Hyun
    • Journal of Internet Computing and Services
    • /
    • v.21 no.2
    • /
    • pp.9-18
    • /
    • 2020
  • In the field of image processing, methods for detecting and counting passengers as moving objects while they get on and off a bus have been studied. Among these technologies, deep learning, one of the artificial intelligence techniques, is commonly used; another approach detects objects with a stereo vision camera. However, these techniques require expensive hardware because of the computational complexity involved in detecting objects. Most video devices have far less computational power, so detecting bus passengers calls for an image processing technique with relatively low computational cost that is suitable for a wide range of equipment. Therefore, in this paper, we propose a technique that efficiently counts bus passengers by detecting object contours through background subtraction, which is suitable for low-cost equipment. Experiments showed that passengers were counted with approximately 70% accuracy on machines of lower specification than those equipped with a stereo vision camera.
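
A minimal OpenCV sketch of the background-subtraction-and-contour idea is shown below; the video file name, area threshold, and the omitted entry/exit line-crossing logic are simplifying assumptions, not the paper's implementation.

```python
# Hedged sketch: foreground extraction by background subtraction and
# counting of large moving contours per frame.
import cv2

cap = cv2.VideoCapture("bus_door.mp4")    # hypothetical input video
subtractor = cv2.createBackgroundSubtractorMOG2(history=200, varThreshold=25)
kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (5, 5))

while True:
    ok, frame = cap.read()
    if not ok:
        break
    mask = subtractor.apply(frame)                          # foreground mask
    mask = cv2.morphologyEx(mask, cv2.MORPH_OPEN, kernel)   # suppress small noise
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_SIMPLE)
    moving = [c for c in contours if cv2.contourArea(c) > 1500]
    print("moving objects in this frame:", len(moving))

cap.release()
```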

Super Resolution Technique Through Improved Neighbor Embedding (개선된 네이버 임베딩에 의한 초해상도 기법)

  • Eum, Kyoung-Bae
    • Journal of Digital Contents Society
    • /
    • v.15 no.6
    • /
    • pp.737-743
    • /
    • 2014
  • For single-image super resolution (SR), interpolation-based and example-based algorithms are widely used. Interpolation algorithms have the strength of theoretical simplicity, but they tend to produce high-resolution images with jagged edges because they cannot exploit additional prior information. Example-based algorithms have been studied in recent years, and among them nearest-neighbor-based approaches are widely considered. In particular, neighbor embedding (NE) is inspired by manifold learning, especially locally linear embedding. However, the local training sets are often too small, so the NE algorithm performs poorly in both visual quality and quantitative measures because of the poor generalization of nearest-neighbor estimation. An improved NE algorithm with support vector regression (SVR) is proposed to solve this problem. Given a low-resolution image, the pixel values of its high-resolution version are estimated by the improved NE. Compared with bicubic interpolation and NE, PSNR improvements of 1.25 dB and 2.33 dB are achieved, respectively. Experimental results show that the proposed method is quantitatively and visually more effective than prior work using bicubic interpolation and NE.
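
For readers unfamiliar with the NE step, the following NumPy sketch shows LLE-style reconstruction weights computed from the k nearest low-resolution patches and transferred to the paired high-resolution patches; the patch dictionaries are random placeholders, and the paper's SVR extension is not included.

```python
# A minimal sketch of the neighbor-embedding idea: estimate an HR patch by
# applying LLE-style reconstruction weights from LR neighbors to HR patches.
import numpy as np

def ne_reconstruct(lr_patch, lr_dict, hr_dict, k=5, reg=1e-4):
    d = np.linalg.norm(lr_dict - lr_patch, axis=1)
    idx = np.argsort(d)[:k]                  # k nearest LR training patches
    Z = lr_dict[idx] - lr_patch              # centered neighbors
    G = Z @ Z.T + reg * np.eye(k)            # regularized local Gram matrix
    w = np.linalg.solve(G, np.ones(k))
    w /= w.sum()                             # reconstruction weights sum to one
    return w @ hr_dict[idx]                  # apply same weights to HR patches

rng = np.random.default_rng(0)
lr_dict = rng.standard_normal((200, 9))      # 3x3 LR patches (flattened)
hr_dict = rng.standard_normal((200, 36))     # 6x6 HR patches (flattened)
print(ne_reconstruct(rng.standard_normal(9), lr_dict, hr_dict).shape)  # (36,)
```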

Super Resolution Algorithm using TV-G Decomposition (TV-G 분해를 이용한 초해상도 알고리즘)

  • Eum, Kyoung-Bae;Beom, Dong-Kyu
    • Journal of Digital Contents Society
    • /
    • v.18 no.8
    • /
    • pp.1517-1522
    • /
    • 2017
  • Among single-image SR techniques, the TV-based SR approach appears most successful in terms of edge preservation and freedom from artifacts, but it achieves insufficient SR for the texture component. In this paper, we propose a new TV-G decomposition based SR method to solve this problem. SVR-based up-sampling is used to obtain better edge preservation in the structure component, while an NNE-based learning method, which improves NE by relaxing its constraint, is used to improve the resolution of the texture component. Experimental results confirm, both quantitatively and qualitatively, the improvement of the proposed SR method over the conventional interpolation method, ScSR, TV, and NNE.
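
As a rough sketch of the structure-texture separation underlying this approach, the snippet below splits an image into a piecewise-smooth part and an oscillatory remainder; scikit-image's TV-Chambolle denoiser is used as a stand-in for the paper's TV-G decomposition, and the weight value is an assumption.

```python
# A minimal sketch of structure/texture splitting with TV denoising.
import numpy as np
from skimage import data, img_as_float
from skimage.restoration import denoise_tv_chambolle

image = img_as_float(data.camera())                  # sample grayscale image
structure = denoise_tv_chambolle(image, weight=0.1)  # piecewise-smooth part
texture = image - structure                          # oscillatory remainder

# Each component could then be up-sampled separately (e.g., SVR for the
# structure part, a learning-based method for the texture part) and recombined.
print(structure.shape, texture.shape, float(np.abs(texture).mean()))
```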

S-FDS : a Smart Fire Detection System based on the Integration of Fuzzy Logic and Deep Learning (S-FDS : 퍼지로직과 딥러닝 통합 기반의 스마트 화재감지 시스템)

  • Jang, Jun-Yeong;Lee, Kang-Woon;Kim, Young-Jin;Kim, Won-Tae
    • Journal of the Institute of Electronics and Information Engineers
    • /
    • v.54 no.4
    • /
    • pp.50-58
    • /
    • 2017
  • Recently, several methods that fuse heterogeneous fire sensor data have been proposed for effective fire detection, but rule-based methods have low adaptability and accuracy, and fuzzy inference methods suffer in detection speed and accuracy because they do not consider images. In addition, a few image-based deep learning methods have been studied, but in practical situations they cannot rapidly recognize a fire when cameras are absent or the fire is outside a camera's field of view. In this paper, we propose a novel fire detection system that combines a CNN-based deep learning algorithm with a fuzzy inference engine driven by heterogeneous fire sensor data, including temperature, humidity, gas, and smoke density. We show that the proposed system can rapidly detect fire by utilizing images and decide on a fire reliably by utilizing multi-sensor data. We also apply a distributed computing architecture to the fire detection algorithm to avoid concentrating computing power on a single server and thereby enhance scalability. Finally, we demonstrate the performance of the system in two experiments using NIST's Fire Dynamics Simulator, covering both an explosively spreading fire and a gradually growing fire.
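
A minimal sketch of how a CNN image score and fuzzy memberships over sensor readings might be combined is given below; the membership breakpoints, weights, and fusion rule are illustrative assumptions, not the S-FDS design.

```python
# Hedged sketch: fuse a CNN fire probability with simple fuzzy memberships
# over sensor readings. All breakpoints and weights are illustrative.
def ramp(x, low, high):
    """Piecewise-linear membership rising from 0 at `low` to 1 at `high`."""
    if x <= low:
        return 0.0
    if x >= high:
        return 1.0
    return (x - low) / (high - low)

def fire_decision(cnn_prob, temp_c, smoke_density, gas_ppm):
    mu_temp = ramp(temp_c, 40.0, 80.0)
    mu_smoke = ramp(smoke_density, 0.1, 0.5)
    mu_gas = ramp(gas_ppm, 50.0, 200.0)
    sensor_score = max(min(mu_temp, mu_smoke), mu_gas)   # fuzzy AND/OR mix
    score = 0.6 * cnn_prob + 0.4 * sensor_score          # weighted fusion
    return score, score > 0.5

print(fire_decision(cnn_prob=0.8, temp_c=65.0, smoke_density=0.3, gas_ppm=30.0))
```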

Timeline Tag Cloud Generation for Broadcasting Contents using Blog Postings (블로그 포스팅을 이용한 방송 콘텐츠 영상의 타임라인 단위 태그 클라우드 생성)

  • Son, Jeong-Woo;Kim, Hwa-Suk;Kim, Sun-Joong;Cho, Keeseong
    • Journal of KIISE
    • /
    • v.42 no.5
    • /
    • pp.637-641
    • /
    • 2015
  • Due to the recent increase in user-created content such as SNS posts and blog posts, broadcast content is actively re-constructed by its users. In particular, for genres such as dramas and movies, various information in a piece of content, from cars and filming sites to clothes and watches, is spread to other users through blog postings. Since such information serves as additional information about the content, it can be used to provide high-quality broadcast services. For this purpose, we propose a timeline tag cloud generation method for broadcast content. In the proposed method, blog postings about the target content are first gathered; then images and the words around them are extracted from each blog post as a tag set. Each extracted tag set is attached to a specific position on the timeline of the target content. In experiments, to demonstrate the effectiveness of the proposed method, we evaluated the performance of the proposed image matching and tag cloud generation methods.
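
A minimal sketch of aggregating per-timeline tag counts from blog-post tag sets is shown below; the data structure, sample tags, and bucket size are hypothetical, and the image-matching step that assigns a tag set to a timeline position is assumed to have already run.

```python
# Hedged sketch: build a per-timeline-bucket tag cloud (tag -> count) from
# tag sets that have already been matched to positions on the timeline.
from collections import Counter, defaultdict

# (timeline position in seconds, tags extracted around a matched image)
matched_tag_sets = [
    (120, ["watch", "brand", "actor"]),
    (125, ["watch", "scene"]),
    (600, ["car", "location"]),
]

def build_timeline_clouds(tag_sets, bucket_sec=60):
    clouds = defaultdict(Counter)
    for t, tags in tag_sets:
        clouds[t // bucket_sec].update(tags)   # group tags by timeline bucket
    return clouds

for bucket, cloud in sorted(build_timeline_clouds(matched_tag_sets).items()):
    print(f"{bucket * 60}s-{(bucket + 1) * 60}s:", cloud.most_common(3))
```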