• Title/Summary/Keyword: ART Algorithm


Road Extraction from Images Using Semantic Segmentation Algorithm (영상 기반 Semantic Segmentation 알고리즘을 이용한 도로 추출)

  • Oh, Haeng Yeol;Jeon, Seung Bae;Kim, Geon;Jeong, Myeong-Hun
    • Journal of the Korean Society of Surveying, Geodesy, Photogrammetry and Cartography, v.40 no.3, pp.239-247, 2022
  • Cities are becoming more complex due to rapid industrialization and population growth, and urban areas in particular are changing rapidly through housing site development, reconstruction, and demolition. Accurate road information is therefore necessary for various purposes, such as High Definition Maps for autonomous driving. In the Republic of Korea, accurate spatial information can be generated through the existing map production process, but covering a large area this way is limited by time and cost. Roads, one of the basic map elements, are hubs and essential means of transportation that support many human activities, so it is essential to update road information accurately and quickly. This study uses Semantic Segmentation algorithms such as LinkNet, D-LinkNet, and NL-LinkNet to extract roads from drone images and then applies hyperparameter optimization to the best-performing model. As a result, the LinkNet model using a pre-trained ResNet-34 as the encoder achieved an mIoU of 85.125. Subsequent studies should compare these results with those of state-of-the-art object detection algorithms or semi-supervised Semantic Segmentation techniques. The results of this study can be applied to speed up the existing map update process.
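
The mIoU reported above is the standard overlap metric for segmentation. As a rough illustration (not the authors' evaluation code), a minimal per-class IoU computation over predicted and ground-truth road masks might look like this:

```python
import numpy as np

def mean_iou(pred, target, num_classes=2):
    """Mean Intersection-over-Union for a pair of segmentation masks.

    pred, target: integer class maps of identical shape
    (e.g. 0 = background, 1 = road).
    """
    ious = []
    for c in range(num_classes):
        inter = np.logical_and(pred == c, target == c).sum()
        union = np.logical_or(pred == c, target == c).sum()
        if union > 0:                      # skip classes absent from both masks
            ious.append(inter / union)
    return float(np.mean(ious))

# Example: two 4x4 masks with a partially matching "road" region
pred   = np.array([[0, 0, 1, 1]] * 4)
target = np.array([[0, 1, 1, 1]] * 4)
print(mean_iou(pred, target))  # IoU averaged over background and road
```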

A Study on AI Algorithm that can be used to Arts Exhibition : Focusing on the Development and Evaluation of the Chatbot Model (예술 전시에 활용 가능한 AI 알고리즘 연구 : 챗봇 모델 개발 및 평가를 중심으로)

  • Choi, Hak-Hyeon;Yoon, Mi-Ra
    • Journal of Korea Entertainment Industry Association, v.15 no.4, pp.369-381, 2021
  • Artificial Intelligence (AI) technology can be used throughout arts exhibitions, from planning through on-site operation to evaluation, and its scope has expanded from exhibition planning and guidance services to tools for creating art. This paper focuses on chatbots, which combine exhibitions with AI technology to provide information and services. Specifically, we developed a chatbot for exhibition services using the Naver Clova chatbot tool and information from the National Museum of Modern and Contemporary Art (MMCA), Korea. The information was limited to viewing and exhibitions rather than all information about the MMCA, and the chatbot provides both a scenario type, in which users obtain the answers they want through buttons, and a text question-and-answer (Q&A) type, in which users type questions directly (a toy sketch of this routing appears below). Evaluating the chatbot on six items of ELIZA's chatbot evaluation scale yielded a score of 4.2 out of 5, completing the development of a chatbot that can deliver viewing and exhibition information. Future work is to build a chatbot model usable in an actual arts exhibition space by connecting continuous scenario answers, resolving failures and errors in text Q&A answers, and expanding additional services.
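
As an illustration of the two answer modes described above, a toy router might dispatch button presses to scenario answers and free text to keyword Q&A. All intents and replies below are invented placeholders, not MMCA content or the Naver Clova API:

```python
# Hypothetical sketch of the two answer modes: a button-driven
# "scenario" path and a free-text Q&A path. Contents are placeholders.
SCENARIOS = {
    "Opening hours": "The museum is open 10:00-18:00.",
    "Admission": "General admission information ...",
}
QA_KEYWORDS = {
    "parking": "Parking guidance ...",
    "exhibition": "Current exhibition information ...",
}

def answer(user_input: str) -> str:
    if user_input in SCENARIOS:            # scenario type: exact button label
        return SCENARIOS[user_input]
    for kw, reply in QA_KEYWORDS.items():  # text Q&A type: keyword match
        if kw in user_input.lower():
            return reply
    return "Sorry, I could not find an answer."

print(answer("Opening hours"))
print(answer("Where is the parking lot?"))
```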

Training of a Siamese Network to Build a Tracker without Using Tracking Labels (샴 네트워크를 사용하여 추적 레이블을 사용하지 않는 다중 객체 검출 및 추적기 학습에 관한 연구)

  • Kang, Jungyu;Song, Yoo-Seung;Min, Kyoung-Wook;Choi, Jeong Dan
    • The Journal of The Korea Institute of Intelligent Transport Systems, v.21 no.5, pp.274-286, 2022
  • Multi-object tracking has been studied for a long time in computer vision and plays a critical role in applications such as autonomous driving and driver assistance. Multi-object tracking techniques generally consist of a detector that detects objects and a tracker that tracks the detected objects. Various publicly available datasets allow us to train a detector model without much effort. However, there are relatively few public datasets for training a tracker model, and building one's own tracking datasets takes far longer than building detection datasets. Hence, the detector is often developed separately from the tracker module, but a separately built tracker must be adjusted whenever the detector model changes. This study proposes a system that trains a model performing detection and tracking simultaneously, using only detector training datasets. In particular, a Siamese network with augmentation is used to compose the detector and tracker. Experiments on public datasets verify that the proposed algorithm yields a real-time multi-object tracker comparable to state-of-the-art tracker models.
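
One common way a Siamese network supports tracking is by embedding each detection and associating detections across frames by embedding similarity. The sketch below shows only that association step, under assumed 128-dimensional L2-normalized features; it is not the paper's architecture:

```python
import numpy as np

def associate(prev_feats, curr_feats, threshold=0.7):
    """Greedy association of detections by cosine similarity of
    Siamese embeddings. Rows are L2-normalised feature vectors,
    one per detected object; a sketch, not the paper's tracker."""
    sim = prev_feats @ curr_feats.T           # cosine similarity matrix
    matches, used = [], set()
    for i in np.argsort(-sim.max(axis=1)):    # most confident tracks first
        j = int(np.argmax(sim[i]))
        if sim[i, j] >= threshold and j not in used:
            matches.append((int(i), j))       # track i continues as detection j
            used.add(j)
    return matches

prev = np.random.randn(3, 128)
prev /= np.linalg.norm(prev, axis=1, keepdims=True)
curr = np.vstack([prev[1], prev[0]])          # frame t+1 re-detects two objects
print(associate(prev, curr))                  # expect [(1, 0), (0, 1)]
```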

Obstacle Avoidance of Unmanned Surface Vehicle based on 3D Lidar for VFH Algorithm (무인수상정의 장애물 회피를 위한 3차원 라이다 기반 VFH 알고리즘 연구)

  • Weon, Ihn-Sik;Lee, Soon-Geul;Ryu, Jae-Kwan
    • Asia-pacific Journal of Multimedia Services Convergent with Art, Humanities, and Sociology, v.8 no.3, pp.945-953, 2018
  • In this paper, we use a 3-D LIDAR for obstacle detection and avoidance maneuvers in autonomous unmanned operation, aiming to achieve obstacle avoidance for an unmanned surface vehicle under marine conditions using only a single sensor. The 3D lidar, Quanergy's M8 sensor, collects surrounding obstacle data that include layer and intensity information. The collected data are converted into a three-dimensional Cartesian coordinate system and then mapped onto a two-dimensional coordinate system. The 2-D obstacle data include noise from the water surface, so regularly occurring noise is first handled by defining a virtual region of interest based on assumptions about the water around the vehicle. Noise arising afterwards is removed, in proportion to the amount of noise, by applying a threshold to the histogram computed with the Vector Field Histogram (VFH) method. Using the cleaned data, objects are tracked relative to the vehicle's motion, and a density map of the data is accumulated cell by cell on a virtual grid map. A polar histogram is then generated from the resulting obstacle map, and the avoidance direction is selected using its boundary values.
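
A minimal sketch of the VFH idea described above, assuming the obstacle points have already been projected to 2D with the vehicle at the origin; the sector count and noise threshold are invented for illustration, not the paper's tuned values:

```python
import numpy as np

def polar_histogram(points_xy, num_sectors=72, noise_threshold=3):
    """Vector Field Histogram sketch: bin 2-D obstacle points (vehicle at
    the origin) into angular sectors and drop sparsely populated sectors
    as water-surface noise. Illustrative assumptions throughout."""
    angles = np.arctan2(points_xy[:, 1], points_xy[:, 0])      # [-pi, pi]
    sectors = ((angles + np.pi) / (2 * np.pi) * num_sectors).astype(int) % num_sectors
    hist = np.bincount(sectors, minlength=num_sectors)
    hist[hist < noise_threshold] = 0                           # remove noise returns
    return hist

def free_heading(hist, num_sectors=72):
    s = int(np.argmin(hist))                                   # emptiest sector
    return (s + 0.5) / num_sectors * 360.0 - 180.0             # heading in degrees

pts = np.random.uniform(-10, 10, size=(200, 2))                # fake lidar returns
h = polar_histogram(pts)
print("suggested heading (deg):", free_heading(h))
```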

Parameter search methodology of support vector machines for improving performance (속도 향상을 위한 서포트 벡터 머신의 파라미터 탐색 방법론)

  • Lee, Sung-Bo;Kim, Jae-young;Kim, Cheol-Hong;Kim, Jong-Myon
    • Asia-pacific Journal of Multimedia Services Convergent with Art, Humanities, and Sociology, v.7 no.3, pp.329-337, 2017
  • This paper proposes a search method that explores the C and σ parameters of support vector machines (SVMs) to reduce search time while maintaining accuracy. A traditional grid search requires tremendous computation because it evaluates all available combinations of C and σ to find the combination giving the best SVM performance. To address this issue, this paper proposes a deep search method that reduces computational time. In the first stage, it divides the C-σ accuracy surface into four regions, evaluates the median point of each region, and selects the point with the highest accuracy as the start point. In the second stage, the region around the selected start point is re-divided into four regions, and the most accurate point becomes the new search point. In the third stage, the eight points around the search point are explored; the most accurate one becomes the new search point, its surrounding region is again divided into four parts, and accuracies are computed. In the last stage, the process continues until the accuracy at the current point is higher than at all neighboring points; if this is not satisfied, the procedure repeats from the second stage with the current values as input. Experimental results using normal and defective bearings show that the proposed deep search algorithm outperforms conventional algorithms in both accuracy and search time.
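
A possible reading of the coarse-to-fine search, sketched with scikit-learn cross-validation as the accuracy oracle; the search ranges, depth, and quadrant rule are assumptions, not the paper's exact procedure:

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=200, random_state=0)

def accuracy(logC, logG):
    clf = SVC(C=10**logC, gamma=10**logG)
    return cross_val_score(clf, X, y, cv=3).mean()

# Coarse-to-fine search in the (log C, log gamma) plane: evaluate the
# centre of each quadrant, recurse into the best one, and stop when the
# depth budget runs out. A sketch of the idea, not the paper's rules.
def deep_search(c_lo=-2, c_hi=4, g_lo=-5, g_hi=1, depth=4):
    best = None
    for _ in range(depth):
        c_mid, g_mid = (c_lo + c_hi) / 2, (g_lo + g_hi) / 2
        quadrants = [(c_lo, c_mid, g_lo, g_mid), (c_mid, c_hi, g_lo, g_mid),
                     (c_lo, c_mid, g_mid, g_hi), (c_mid, c_hi, g_mid, g_hi)]
        scored = [(accuracy((a + b) / 2, (u + v) / 2), (a, b, u, v))
                  for a, b, u, v in quadrants]
        acc, (c_lo, c_hi, g_lo, g_hi) = max(scored)
        best = (acc, (c_lo + c_hi) / 2, (g_lo + g_hi) / 2)
    return best

print(deep_search())   # (cv accuracy, log10 C, log10 gamma)
```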

Analysis and Evaluation of Frequent Pattern Mining Technique based on Landmark Window (랜드마크 윈도우 기반의 빈발 패턴 마이닝 기법의 분석 및 성능평가)

  • Pyun, Gwangbum;Yun, Unil
    • Journal of Internet Computing and Services, v.15 no.3, pp.101-107, 2014
  • With the development of online services, databases have shifted from static structures to dynamic stream structures. Earlier data mining techniques served as decision-making tools for tasks such as marketing strategy and DNA analysis, but recent application areas such as sensor networks, robotics, and artificial intelligence demand the ability to analyze real-time data quickly. Landmark window-based frequent pattern mining, one of the stream mining approaches, performs mining over parts of the database or over individual transactions instead of over all the data. In this paper, we analyze and evaluate two well-known landmark window-based frequent pattern mining algorithms, Lossy Counting and hMiner. When Lossy Counting mines frequent patterns from a set of new transactions, it performs union operations between the previous and current mining results. hMiner, a state-of-the-art algorithm based on the landmark window model, conducts mining whenever a new transaction arrives; because it extracts frequent patterns as soon as a transaction is entered, the latest mining results always reflect real-time information, which is why such algorithms are also called online mining approaches. We evaluate and compare the primitive algorithm, Lossy Counting, against the more recent hMiner. As criteria for the performance analysis, we first consider total runtime and average processing time per transaction; to compare the efficiency of their storage structures, maximum memory usage is also evaluated; lastly, we show how stably the two algorithms behave on databases with a gradually increasing number of items. In mining time and transaction processing, hMiner is faster than Lossy Counting: hMiner stores candidate frequent patterns in a hash structure and can access them directly, whereas Lossy Counting stores them in a lattice and must traverse multiple nodes to reach a candidate pattern. On the other hand, hMiner performs worse than Lossy Counting in maximum memory usage: hMiner must keep complete information for each candidate pattern in its hash buckets, while Lossy Counting compresses this information in its lattice, whose storage can share items that appear in multiple patterns. However, hMiner is more efficient in the scalability evaluation for the following reasons: as the number of items grows, fewer items are shared, weakening Lossy Counting's memory efficiency, and as the number of transactions grows, its pruning effect degrades. From the experimental results, we conclude that landmark window-based frequent pattern mining algorithms are suitable for real-time systems although they require a significant amount of memory; their data structures should be made more efficient so they can also be used in resource-constrained environments such as wireless sensor networks (WSNs).
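
For reference, a minimal single-item version of Lossy Counting (the classic Manku-Motwani formulation) is sketched below; the paper mines full frequent patterns, so this only illustrates the bucket-and-prune mechanism that drives the memory behavior discussed above:

```python
# Minimal Lossy Counting sketch over a transaction stream (single items,
# not full itemsets, to stay short). epsilon is the allowed error bound.
def lossy_counting(stream, epsilon=0.1):
    counts, deltas = {}, {}
    bucket_width = int(1 / epsilon)
    for n, item in enumerate(stream, start=1):
        b = (n - 1) // bucket_width + 1          # current bucket id
        if item in counts:
            counts[item] += 1
        else:
            counts[item], deltas[item] = 1, b - 1
        if n % bucket_width == 0:                # prune at bucket boundaries
            for k in list(counts):
                if counts[k] + deltas[k] <= b:
                    del counts[k], deltas[k]
    return counts

stream = list("aabcabbadaabbb")
print(lossy_counting(stream, epsilon=0.25))
```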

System Development for Measuring Group Engagement in the Art Center (공연장에서 다중 몰입도 측정을 위한 시스템 개발)

  • Ryu, Joon Mo;Choi, Il Young;Choi, Lee Kwon;Kim, Jae Kyeong
    • Journal of Intelligence and Information Systems, v.20 no.3, pp.45-58, 2014
  • Korean cultural content is spreading worldwide on the Korean wave, and each country is developing its culture industry to improve its national brand and create high added value. Performance content is an important source of arousal in the entertainment industry: raising audiences' confidence in a product and their positive attitude matters to advertisers, and culture content is no different. If audiences trust a piece of content, they pass information to those around them through word of mouth. Accordingly, researchers have tried to measure individual arousal through statistical surveys, physiological responses, body movement, and facial expression. Statistical surveys cannot measure each person's arousal in real time, and reliable responses are hard to obtain after viewing. Physiological measurement requires sensors to be fitted to each person's chair or body, and the volume of sensor data is difficult to handle in real time. Body movement is easy to capture with a camera, but the experimental conditions are hard to set up and body language is hard to interpret. Finally, facial-expression studies measure expression, eye tracking, and head pose. Most previous studies of arousal and interest are limited to the reaction of a single person and are difficult to apply to multiple audience members: they typically require special laboratory conditions such as controlled room lighting. Arousal within content is also difficult to define, and audience reactions are not easy to collect immediately. We therefore propose a system that measures the reactions of a large theater audience in real time during a performance. We use difference-image analysis for the multi-audience setting; since it is weak in the dark, an IR camera is used to capture the audience area during recording. In addition, we present a Multi-Audience Engagement Index (MAEI) computed from sound, audience movement, and eye-tracking values. The algorithm estimates audience arousal from the mobile survey, the sound level, audience reactions, and eye tracking; to improve the accuracy of the MAEI, we compare it with the mobile survey, and the result is sent to a reporting system and to interested parties. Mobile surveys are easy and fast, minimize visitors' discomfort, and can provide additional information: the mobile application communicates with a database that stores visitors' real-time attitudes toward the content, and the database can serve a different survey each time based on the stored information. Example survey items include "Impressive scene", "Satisfied", "Touched", "Interested", and "Didn't pay attention". The suggested system consists of three parts: an External Device, a Server, and an Internal Device. The External Device records the audience in the dark with an IR camera and captures the sound signal; the mobile survey application sends its data to the server database. The Server holds the content data, such as per-scene weights and group-audience weight indexes, the camera control program, and the algorithm, and computes the Multi-Audience Engagement Index. The Internal Device presents the index through a Web UI, in print, and on a field monitor. Our system has been test-operated by Mogencelab in the DMC display exhibition hall located in Sangam-dong, Mapo-gu, Seoul, where visitor data are still collected daily. If the system identifies audience arousal factors, it will be very useful for creating content.
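
The difference-image analysis mentioned above can be illustrated with simple frame differencing; the threshold and the use of a raw pixel fraction are assumptions, and the actual MAEI additionally weights sound, eye tracking, and per-scene weights:

```python
import numpy as np

def movement_score(prev_frame, curr_frame, threshold=25):
    """Difference-image sketch: fraction of pixels whose intensity change
    exceeds a threshold, usable as a crude audience-movement signal from
    an IR camera. The real MAEI also combines sound, eye tracking and
    per-scene weights, which are not reproduced here."""
    diff = np.abs(curr_frame.astype(int) - prev_frame.astype(int))
    return float((diff > threshold).mean())

prev = np.random.randint(0, 256, (480, 640), dtype=np.uint8)    # fake IR frames
curr = prev.copy()
curr[100:200, 100:300] = np.random.randint(0, 256, (100, 200))  # a moving region
print("movement score:", movement_score(prev, curr))
```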

The way to make training data for deep learning model to recognize keywords in product catalog image at E-commerce (온라인 쇼핑몰에서 상품 설명 이미지 내의 키워드 인식을 위한 딥러닝 훈련 데이터 자동 생성 방안)

  • Kim, Kitae;Oh, Wonseok;Lim, Geunwon;Cha, Eunwoo;Shin, Minyoung;Kim, Jongwoo
    • Journal of Intelligence and Information Systems, v.24 no.1, pp.1-23, 2018
  • Since the start of the 21st century, various high-quality services have emerged with the growth of the internet and information and communication technologies. In particular, the E-commerce industry, in which Amazon and eBay stand out, is growing explosively. As E-commerce grows and more products are registered at online shopping malls, customers can easily compare products and find what they want to buy. However, a problem has arisen with this growth: with so many products registered, it has become difficult for customers to find what they really need in the flood of products. When customers search with a generalized keyword, too many products come out; on the contrary, few products are found when customers type in product details, because concrete product attributes are rarely registered as text. In this situation, automatically recognizing the text in images can be a solution. Because the bulk of product details are written in catalogs in image format, most product information cannot be found by the current text-based search systems; if the information in images could be converted to text, customers could search by product details, making shopping more convenient. Various existing OCR (Optical Character Recognition) programs can recognize text in images, but they are hard to apply to catalogs because they fail in certain circumstances, for example when text is small or fonts are inconsistent. Therefore, this research suggests a way to recognize keywords in catalogs with deep learning, the state of the art in image recognition since the 2010s. The Single Shot Multibox Detector (SSD), a well-regarded object detection model, can be used with its structure redesigned to account for the differences between text and generic objects. However, like other supervised deep learning models, SSD needs a large amount of labeled training data. Manually labeling the location and class of keywords in catalogs raises several problems: humans make mistakes and miss keywords; collection is too time-consuming at the required scale, or too costly if many workers are hired; and finding images that contain the specific keywords to be trained is itself difficult. To solve the data issue, this research developed a program that creates training data automatically: it composes catalog-like images containing various keywords and pictures while saving the location information of each keyword. With this program, data can be collected efficiently and the performance of the SSD model improves; the SSD model recorded a recognition rate of 81.99% with 20,000 images created by the program. Moreover, this research tested the SSD model's efficiency on different data to analyze which features of the data influence text recognition performance. As a result, the number of labeled keywords, the addition of overlapping keyword labels, the presence of unlabeled keywords, the spacing between keywords, and differences in background images all relate to the performance of the SSD model. This analysis can guide performance improvement of the SSD model and other deep learning-based text recognizers through higher-quality data. The SSD model redesigned to recognize text in images and the program developed for creating training data are expected to improve search systems in E-commerce: suppliers can spend less time registering keywords for products, and customers can search by the product details written in catalogs.
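
A minimal sketch of such an automatic generator using Pillow: it pastes keywords onto a blank catalog-like page and records each word's bounding box as a detection label. The keyword list, layout, and default font are placeholders; the paper's generator also varies pictures and backgrounds:

```python
from PIL import Image, ImageDraw
import random, json

# Placeholder keyword vocabulary; the paper's generator draws from a
# much richer set of product-attribute words and background images.
KEYWORDS = ["cotton", "waterproof", "slim-fit", "machine-washable"]

def make_sample(width=600, height=800, n_words=3):
    img = Image.new("RGB", (width, height), "white")
    draw = ImageDraw.Draw(img)
    labels = []
    for word in random.sample(KEYWORDS, n_words):
        x, y = random.randint(0, width - 150), random.randint(0, height - 30)
        draw.text((x, y), word, fill="black")
        x0, y0, x1, y1 = draw.textbbox((x, y), word)  # bounding box = SSD label
        labels.append({"word": word, "box": [x0, y0, x1, y1]})
    return img, labels

img, labels = make_sample()
img.save("sample.png")
print(json.dumps(labels, indent=2))
```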

Evaluation of Dose Change by Using the Deformable Image Registration (DIR) on the Intensity Modulated Radiation Therapy (IMRT) with Glottis Cancer (성문암 세기조절 방사선치료에서 변형영상정합을 이용한 선량변화 평가)

  • Kim, Woo Chul;Min, Chul Kee;Lee, Suk;Choi, Sang Hyoun;Cho, Kwang Hwan;Jung, Jae Hong;Kim, Eun Seog;Yeo, Seung-Gu;Kwon, Soo-Il;Lee, Kil-Dong
    • Progress in Medical Physics, v.25 no.3, pp.167-175, 2014
  • The purpose of this study is to evaluate the variation of the dose delivered to patients with glottis cancer under IMRT (intensity modulated radiation therapy), using 3D registration with CBCT (cone beam CT) images and DIR (deformable image registration) techniques. CBCT images obtained at one-week intervals were registered using a B-spline algorithm in the DIR system, doses were recalculated based on the newly obtained CBCT images, and the dose distributions to the tumor and critical organs were compared with the reference plan. From weeks 3 to 5, patient weight increased by 1.38~2.04 kg on average, while the body surface contracted by 2.1 mm. From week 3 onward, the dose delivered to the carotid artery increased by more than 8.76% compared with the plan, and the dose to the thyroid gland decreased by 26.4%. Among the physical evaluation factors of the tumor, PITV, TCI, rDHI, mDHI, and CN decreased by 4.32%, 5.78%, 44.54%, 12.32%, and 7.11%, respectively. Moreover, $D_{max}$, $D_{mean}$, $V_{67.50}$, and $D_{95}$ for the PTV changed by 2.99%, 1.52%, 5.78%, and 11.94%, respectively. Although tumor volume did not change with weight, body shape did change, and IMRT with its narrow margins responded sensitively to such changes. For glottis IMRT, the patient's weight changes should be observed and recorded, the actual dose distribution should be evaluated using DIR techniques, and adaptive treatment planning during the treatment course is needed to deliver an accurate dose to the patient.
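
For readers who want to experiment with the B-spline DIR step, a hedged sketch using the open-source SimpleITK library is shown below (the study used its own DIR system; the file names and control-point grid size here are placeholders):

```python
import SimpleITK as sitk

# Sketch of B-spline deformable registration between a planning CT and a
# later CBCT. Illustrates the B-spline idea only; not the study's system.
fixed = sitk.ReadImage("planning_ct.nii", sitk.sitkFloat32)    # placeholder paths
moving = sitk.ReadImage("week3_cbct.nii", sitk.sitkFloat32)

transform = sitk.BSplineTransformInitializer(fixed, [8, 8, 8])  # control-point grid
reg = sitk.ImageRegistrationMethod()
reg.SetMetricAsMattesMutualInformation(50)        # robust to CT/CBCT intensity shifts
reg.SetOptimizerAsLBFGSB(numberOfIterations=100)
reg.SetInitialTransform(transform, inPlace=True)
reg.SetInterpolator(sitk.sitkLinear)

final = reg.Execute(fixed, moving)
warped = sitk.Resample(moving, fixed, final, sitk.sitkLinear, 0.0)
sitk.WriteImage(warped, "cbct_deformed_to_ct.nii")
```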

Restoring Omitted Sentence Constituents in Encyclopedia Documents Using Structural SVM (Structural SVM을 이용한 백과사전 문서 내 생략 문장성분 복원)

  • Hwang, Min-Kook;Kim, Youngtae;Ra, Dongyul;Lim, Soojong;Kim, Hyunki
    • Journal of Intelligence and Information Systems, v.21 no.2, pp.131-150, 2015
  • Omission of noun phrases for obligatory cases is a common phenomenon in Korean and Japanese sentences that is not observed in English. In encyclopedia texts, when an argument of a predicate can be filled with a noun phrase co-referential with the title, that argument is especially likely to be omitted; the omitted noun phrase is called a zero anaphor or zero pronoun. Encyclopedias like Wikipedia are a major source for information extraction by intelligent applications such as information retrieval and question answering systems, but the omission of noun phrases degrades the quality of extraction. This paper deals with developing a system that can restore omitted noun phrases in encyclopedia documents, a problem nearly identical to zero anaphora resolution, one of the important problems in natural language processing. A noun phrase in the text that can be used for restoration is called an antecedent; an antecedent must be co-referential with the zero anaphor. While in ordinary zero anaphora resolution the candidate antecedents are only the noun phrases in the same text, in our problem the document title is also a candidate. In our system, the first stage detects the zero anaphor; the second stage searches for an antecedent among the candidates; if the search fails, the third stage attempts to use the title as the antecedent. The main characteristic of our system is the use of a structural SVM for finding the antecedent. The noun phrases appearing in the text before the position of the zero anaphor form the search space. The main technique in previous research works is binary classification over all noun phrases in the search space, selecting as antecedent the one classified with the highest confidence. We instead propose to view antecedent search as assigning antecedent-indicator labels to a sequence of noun phrases, that is, as sequence labeling; we are the first to suggest this idea. To perform sequence labeling, we use a structural SVM that receives a sequence of noun phrases as input and returns a sequence of labels, each taking one of two values: antecedent or not. The structural SVM we used is based on the modified Pegasos algorithm, which exploits a subgradient descent methodology for optimization. To train and test the system we selected a set of Wikipedia texts and constructed an annotated corpus providing gold-standard answers such as zero anaphors and their possible antecedents; training examples prepared from the corpus were used to train the SVMs and test the system. For zero anaphor detection, sentences are parsed by a syntactic analyzer and omitted subject or object cases are identified, so the system's performance depends on that of the syntactic analyzer, which is a limitation of our system. When no antecedent is found in the text, the system tries to restore the zero anaphor with the title, based on binary classification using a regular SVM. Experiments showed a performance of F1 = 68.58%, which means a state-of-the-art system can be developed with our technique. It is expected that future work enabling the system to utilize semantic information can lead to a significant performance improvement.
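
The Pegasos optimizer underlying the structural SVM can be illustrated in its plain binary-SVM form; the sketch below is not the modified structural variant used in the paper, and the features and labels are synthetic:

```python
import numpy as np

def pegasos(X, y, lam=0.01, epochs=20):
    """Pegasos-style subgradient descent for a linear SVM (labels +-1).
    The paper uses a *structural* SVM over noun-phrase sequences; this
    binary version only illustrates the underlying optimizer."""
    w = np.zeros(X.shape[1])
    t = 0
    for _ in range(epochs):
        for i in np.random.permutation(len(y)):
            t += 1
            eta = 1.0 / (lam * t)                   # decaying step size
            if y[i] * (w @ X[i]) < 1:               # margin violated
                w = (1 - eta * lam) * w + eta * y[i] * X[i]
            else:
                w = (1 - eta * lam) * w
    return w

# Toy antecedent-vs-not decision over synthetic candidate features
X = np.random.randn(100, 5)
y = np.sign(X[:, 0] + 0.1 * np.random.randn(100))   # synthetic labels
w = pegasos(X, y)
print("train accuracy:", (np.sign(X @ w) == y).mean())
```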