• Title/Summary/Keyword: Feature Learning

Search Results: 1,916

Classification of hysteretic loop feature for runoff generation through an unsupervised machine learning algorithm (비지도 기계학습을 통한 유출 발생 내 이력 현상 구분)

  • Lee, Eunhyung;Jeon, Hangtak;Kim, Dahong;Friday, Bassey Bassey;Kim, Sanghyun
    • Proceedings of the Korea Water Resources Association Conference / 2022.05a / pp.360-360 / 2022
  • Quantifying the relationship between soil moisture and runoff provides important information for understanding hydrological mechanisms and runoff generation processes. In particular, characterizing the runoff process is essential for controlling soil water and sediment loss in the unsaturated zone during hydrological events and for predicting landslides and non-point source pollution. To identify the nonlinearity and complexity associated with the runoff process, the hysteretic behavior between soil moisture and runoff has been investigated. To characterize hysteresis in hydrological processes, qualitative visual classification schemes and hysteresis indices for quantitative evaluation have been developed. Qualitative visual classification divides multi-loop shapes into clockwise and counterclockwise directions over time, while quantitative evaluation characterizes hysteresis with indices based on the difference between the rising limb and the falling limb of the hysteretic loop. Previously proposed methods are not universal because they involve the researcher's judgment, and fitting the observed hysteresis to the developed indices causes data loss. An appropriate unsupervised machine learning methodology is therefore needed to automatically extract, without data loss, the representative hysteresis patterns that can occur in the unsaturated zone. In this study, we used soil moisture time series measured at 56 points at multiple depths (10, 30, and 60 cm) during rainfall events on a mountainous hillslope in Korea, together with runoff time series obtained from a weir on the hillslope. First, based on existing classification methods, we characterized the soil moisture-runoff hysteresis that occurs dominantly according to seasonal and spatial characteristics. Next, we developed an algorithm that shapes soil moisture-runoff hysteresis patterns without data loss and automatically compiles them into a database. Finally, representative hysteresis patterns were extracted with an unsupervised learning method by implementing, through iterative reconstruction learning, a hidden layer that estimates the probability distribution of the observed hysteresis in the database as closely as possible.
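
Below is a small, hypothetical sketch of the loop-clustering idea: each event's soil moisture-runoff trajectory is resampled to a fixed-length descriptor so that events of different durations become comparable, and an unsupervised learner groups them into representative loop shapes. K-means is used here only as a simple stand-in for the reconstruction-based hidden-layer learning described in the abstract; the helper names and descriptor design are assumptions, not the authors' algorithm.

```python
# Hypothetical sketch: cluster soil moisture-runoff hysteresis loops.
import numpy as np
from sklearn.cluster import KMeans

def loop_descriptor(soil_moisture, runoff, n_points=50):
    """Resample one event's (soil moisture, runoff) loop to a fixed-length vector."""
    t = np.linspace(0.0, 1.0, len(soil_moisture))
    t_new = np.linspace(0.0, 1.0, n_points)
    sm = np.interp(t_new, t, soil_moisture)
    q = np.interp(t_new, t, runoff)
    # Normalize each event to [0, 1] so the descriptor captures loop shape only.
    sm = (sm - sm.min()) / (np.ptp(sm) + 1e-9)
    q = (q - q.min()) / (np.ptp(q) + 1e-9)
    return np.concatenate([sm, q])

def representative_patterns(events, n_patterns=4, seed=0):
    """events: list of (soil_moisture_series, runoff_series) pairs, one per storm."""
    X = np.vstack([loop_descriptor(sm, q) for sm, q in events])
    km = KMeans(n_clusters=n_patterns, n_init=10, random_state=seed).fit(X)
    return km.labels_, km.cluster_centers_  # centers ~ representative loop shapes
```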

Ship Type Prediction using Random Forest with Limited Ship Information (제한적 선박 정보와 무작위의 숲 분류기를 이용한 선종 예측)

  • Ho-Kun Jeon;Jae Rim Han
    • Proceedings of the Korean Institute of Navigation and Port Research Conference / 2022.06a / pp.106-107 / 2022
  • Identifying the type of a surrounding ship is important information for navigators and VTS officers, since it allows them to estimate the ship's maneuverability and near-future route. However, this information is frequently unavailable due to transmission problems and seafarers' unfamiliarity with AIS. This study therefore suggests predicting ship types with a Random Forest classifier after preparing training and test datasets that contain ship features and types. AIS data for the Ulsan coast in 2018 were used for this study. The method may provide an effect comparable to many navigators and VTS officers discussing and sharing their experience of predicting ship types.
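
As a rough illustration of the workflow described above, the sketch below trains scikit-learn's RandomForestClassifier on a table of per-ship features and AIS ship-type labels; the file name and feature columns are placeholders, not the study's actual dataset.

```python
# Illustrative sketch: ship-type prediction with a Random Forest.
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report

df = pd.read_csv("ulsan_ais_features_2018.csv")     # hypothetical feature table
X = df[["length", "width", "mean_sog", "max_sog"]]  # illustrative feature columns
y = df["ship_type"]

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42)

clf = RandomForestClassifier(n_estimators=300, random_state=42)
clf.fit(X_train, y_train)
print(classification_report(y_test, clf.predict(X_test)))
```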

A deep and multiscale network for pavement crack detection based on function-specific modules

  • Guolong Wang;Kelvin C.P. Wang;Allen A. Zhang;Guangwei Yang
    • Smart Structures and Systems / v.32 no.3 / pp.135-151 / 2023
  • Using 3D asphalt pavement surface data, a deep and multiscale network named CrackNet-M is proposed in this paper for pixel-level crack detection with improvements in both accuracy and robustness. CrackNet-M consists of four function-specific architectural modules: a central branch net (CBN), a crack map enhancement (CME) module, three pooling feature pyramids (PFP), and an output layer. The CBN maintains crack boundaries by using no pooling reductions throughout all convolutional layers. The CME applies a pooling layer to enhance potential thin cracks for better continuity, introducing no data loss or attenuation when working jointly with the CBN. The PFP modules implement direct down-sampling and pyramidal up-sampling with multiscale contexts, specifically for detecting thick cracks and excluding non-crack patterns. Finally, the output layer is optimized with a proposed skip-layer supervision technique to further improve network performance. Compared with traditional supervision, skip-layer supervision brings not only significant gains in accuracy and robustness but also a faster convergence rate. CrackNet-M was trained on a total of 2,500 pixel-wise annotated 3D pavement images and finely scaled with another 200 images with full consideration of accuracy and efficiency. CrackNet-M can potentially achieve crack detection in real time with a processing speed of 40 ms/image. The experimental results on 500 testing images demonstrate that CrackNet-M can effectively detect both thick and thin cracks on various pavement surfaces with a high level of Precision (94.28%), Recall (93.89%), and F-measure (94.04%). In addition, the proposed CrackNet-M compares favorably to other well-developed networks with respect to the detection of thin cracks and the removal of shoulder drop-offs.
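
The sketch below illustrates, in generic PyTorch, what a "pooling feature pyramid" style module can look like: direct down-sampling at several scales, convolution, and pyramidal up-sampling back to the input resolution before fusion. It is an illustration of the idea only, not the authors' CrackNet-M code; channel counts and scales are arbitrary assumptions.

```python
# Generic pooling-feature-pyramid sketch (not the CrackNet-M implementation).
import torch
import torch.nn as nn
import torch.nn.functional as F

class PoolingFeaturePyramid(nn.Module):
    def __init__(self, in_ch=32, out_ch=32, scales=(2, 4, 8)):
        super().__init__()
        self.scales = scales
        self.branches = nn.ModuleList(
            nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1) for _ in scales)
        self.fuse = nn.Conv2d(out_ch * len(scales) + in_ch, out_ch, kernel_size=1)

    def forward(self, x):
        h, w = x.shape[-2:]
        feats = [x]
        for s, conv in zip(self.scales, self.branches):
            y = F.avg_pool2d(x, kernel_size=s)           # direct down-sampling
            y = F.relu(conv(y))
            y = F.interpolate(y, size=(h, w), mode="bilinear",
                              align_corners=False)       # pyramidal up-sampling
            feats.append(y)
        return F.relu(self.fuse(torch.cat(feats, dim=1)))

# Example usage on a feature map of shape (batch, channels, height, width).
pfp = PoolingFeaturePyramid()
out = pfp(torch.randn(1, 32, 256, 256))                  # -> (1, 32, 256, 256)
```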

Comparative Study of AI Models for Reliability Function Estimation in NPP Digital I&C System Failure Prediction (원전 디지털 I&C 계통 고장예측을 위한 신뢰도 함수 추정 인공지능 모델 비교연구)

  • DaeYoung Lee;JeongHun Lee;SeungHyeok Yang
    • Journal of Korea Society of Industrial Information Systems / v.28 no.6 / pp.1-10 / 2023
  • The instrumentation and control (I&C) system of a nuclear power plant (NPP) periodically conducts integrity checks to maintain its self-diagnostic function during normal operation, and performs functionality and performance checks during planned preventive maintenance periods. However, technology is still needed to diagnose failures and prevent accidents in advance. In this paper, we study methods for estimating the reliability function using environmental data and self-diagnostic data of the I&C equipment. To obtain failure data, we assumed probability distributions for component features of the I&C equipment and generated virtual failure data. Using these failure data, we estimated the reliability function with representative artificial intelligence (AI) models used in survival analysis (DeepSurv, DeepHit). We also estimated the reliability function with the Cox regression model, a traditional semi-parametric method. The feasibility was confirmed through residual lifetime calculations based on environmental and diagnostic data.
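
For the semi-parametric baseline mentioned above, a minimal sketch using the lifelines library is shown below: a Cox proportional-hazards model is fitted on simulated covariates and failure times, and the estimated reliability (survival) function is read off. The covariates and the simulated failure mechanism are invented for illustration and are not the paper's virtual failure data.

```python
# Minimal Cox proportional-hazards sketch with simulated failure data.
import numpy as np
import pandas as pd
from lifelines import CoxPHFitter

rng = np.random.default_rng(0)
n = 500
df = pd.DataFrame({
    "temperature": rng.normal(40, 5, n),          # illustrative covariate
    "diag_error_rate": rng.exponential(0.05, n),  # illustrative covariate
})
# Simulated time-to-failure whose hazard increases with both covariates.
risk = 0.05 * df["temperature"] + 10 * df["diag_error_rate"]
df["duration"] = rng.exponential(1.0 / np.exp(risk - risk.mean()))
df["event"] = rng.integers(0, 2, n)               # 1 = failure observed, 0 = censored

cph = CoxPHFitter()
cph.fit(df, duration_col="duration", event_col="event")
# Estimated reliability (survival) function for the first unit's covariates.
surv = cph.predict_survival_function(df.loc[[0], ["temperature", "diag_error_rate"]])
print(surv.head())
```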

Analysis of Deep Learning-Based Pedestrian Environment Assessment Factors Using Urban Street View Images (도시 스트리트뷰 영상을 이용한 딥러닝 기반 보행환경 평가 요소 분석)

  • Ji-Yeon Hwang;Cheol-Ung Choi;Kwang-Woo Nam;Chang-Woo Lee
    • Journal of Korea Society of Industrial Information Systems / v.28 no.6 / pp.45-52 / 2023
  • Recently, as the importance of walking in daily life has been emphasized, projects to guarantee walking rights and create pedestrian-friendly environments are being promoted across regions. In previous studies, a pedestrian environment assessment was conducted using Jeonju-si road images, and an image comparison pair dataset was constructed. However, datasets expressed only as numbers make it difficult to generalize the judgment criteria of pedestrian environment assessors or to visually identify the pedestrian environments that pedestrians prefer. Therefore, this study proposes a method to interpret the results of the pedestrian environment assessment through data visualization by building a web application. According to the semantic segmentation results for the walking environment components that affect assessors, pedestrians did not prefer environments with a lot of "earth" and "grass," and preferred environments with "signboards" and "sidewalks." The proposed approach is expected to help identify and analyze the results randomly selected by participants in future pedestrian environment evaluations, and higher accuracy is expected when a data-cleaning step is applied during pre-processing.
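
As a small illustrative sketch (not the authors' pipeline), the code below turns a semantic segmentation mask of a street-view image into per-class area ratios, the kind of component shares ("grass", "sidewalk", "signboard") that can be related to assessment scores; the class IDs are hypothetical.

```python
# Illustrative sketch: per-class area ratios from a segmentation mask.
import numpy as np

CLASS_NAMES = {0: "road", 1: "sidewalk", 2: "grass", 3: "earth", 4: "signboard"}

def class_area_ratios(mask: np.ndarray) -> dict:
    """mask: (H, W) array of integer class IDs from any segmentation model."""
    total = mask.size
    ids, counts = np.unique(mask, return_counts=True)
    return {CLASS_NAMES.get(int(i), f"class_{i}"): c / total
            for i, c in zip(ids, counts)}

# Example with a random mask standing in for real model output.
ratios = class_area_ratios(np.random.randint(0, 5, size=(512, 1024)))
print(sorted(ratios.items(), key=lambda kv: -kv[1]))
```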

An Ensemble Approach for Cyber Bullying Text messages and Images

  • Zarapala Sunitha Bai;Sreelatha Malempati
    • International Journal of Computer Science & Network Security / v.23 no.11 / pp.59-66 / 2023
  • Text mining (TM) is widely used to find patterns in text documents. Cyber-bullying refers to abusing a person on an online or offline platform, and it has become increasingly dangerous to people who use social networking sites (SNS). Cyber-bullying takes many forms, such as text messages, morphed images, and morphed videos, and preventing this kind of abuse on online SNS is very difficult. Finding accurate text mining patterns gives better results in detecting cyber-bullying on any platform. Cyber-bullying on online SNS involves sending defamatory statements, verbally bullying other people, or using the online platform to abuse them in front of other SNS users. Deep Learning (DL) is a significant domain used to extract and learn quality features dynamically from low-level text inputs. In this scenario, convolutional neural networks (CNN) are used to train on text data, images, and videos; CNNs are a powerful approach for these types of data and achieve better text classification. In this paper, an ensemble model is introduced that integrates Term Frequency-Inverse Document Frequency (TF-IDF) and a Deep Neural Network (DNN) with advanced feature-extraction techniques to classify bullying text, images, and videos. The proposed approach also focuses on reducing training time and memory usage, which helps improve classification.
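
A minimal sketch of the TF-IDF-plus-neural-network part of such an ensemble is shown below using scikit-learn; the CNN branches for images and videos are omitted, and the tiny corpus and labels are made up for illustration.

```python
# Minimal TF-IDF + neural network text classification sketch.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline

texts = ["you are great", "nobody likes you, loser", "see you at lunch",
         "everyone hates you"]
labels = [0, 1, 0, 1]                     # 1 = bullying, 0 = not (illustrative)

model = make_pipeline(
    TfidfVectorizer(ngram_range=(1, 2), min_df=1),
    MLPClassifier(hidden_layer_sizes=(64,), max_iter=500, random_state=0),
)
model.fit(texts, labels)
print(model.predict(["you are such a loser"]))
```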

Deep learning-based anomaly detection in acceleration data of long-span cable-stayed bridges

  • Seungjun Lee;Jaebeom Lee;Minsun Kim;Sangmok Lee;Young-Joo Lee
    • Smart Structures and Systems / v.33 no.2 / pp.93-103 / 2024
  • Despite the rapid development of sensors, structural health monitoring (SHM) still faces challenges due to device degradation and harsh environmental loads. These challenges can lead to measurement errors, missing data, or outliers, which affect the accuracy and reliability of SHM systems. To address this problem, this study proposes a classification method that detects anomaly patterns in sensor data. The proposed method involves several steps. First, data scaling is conducted to adjust the scale of the raw data, which may have different magnitudes and ranges; this ensures that the data are on the same scale, facilitating comparison across different sensors. Next, informative features in the time and frequency domains are extracted and used as input to a deep neural network model. The model can effectively detect the most probable anomaly pattern, allowing for the timely identification of potential issues. To demonstrate its effectiveness, the proposed method was applied to actual data obtained from a long-span cable-stayed bridge in China. The results verify the method's applicability to practical SHM systems for civil infrastructure. By detecting potential issues and anomalies at an early stage, the method has the potential to significantly enhance the safety and reliability of civil infrastructure.
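
The sketch below illustrates the kind of preprocessing described above under stated assumptions: each acceleration record is scaled, a few time- and frequency-domain features are extracted, and a neural-network classifier is trained over anomaly-pattern labels. The specific features and labels are illustrative, not the authors' exact choices.

```python
# Illustrative sketch: feature extraction + classifier for anomaly patterns.
import numpy as np
from scipy.stats import kurtosis, skew
from sklearn.neural_network import MLPClassifier
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline

def features(signal: np.ndarray, fs: float = 100.0) -> np.ndarray:
    x = (signal - signal.mean()) / (signal.std() + 1e-9)   # per-record scaling
    spectrum = np.abs(np.fft.rfft(x))
    freqs = np.fft.rfftfreq(len(x), d=1.0 / fs)
    return np.array([
        x.std(), np.abs(x).max(), kurtosis(x), skew(x),     # time domain
        freqs[spectrum.argmax()], spectrum.mean(),           # frequency domain
    ])

def train_classifier(records, y):
    """records: (n_records, n_samples) raw acceleration; y: anomaly-pattern labels."""
    X = np.vstack([features(r) for r in records])
    clf = make_pipeline(StandardScaler(),
                        MLPClassifier(hidden_layer_sizes=(32, 32), max_iter=1000))
    return clf.fit(X, y)
```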

The Systematization of Personality Education Contents in the 7th Curriculum for Home Economics (제7차 가정과 교육과정에 따른 학교 인성교육 내용 체계화 방안)

  • 왕석순
    • Journal of Korean Home Economics Education Association / v.16 no.2 / pp.13-26 / 2004
  • This study suggests teaching and learning activities that can be effectively utilized in Home Economics education by analyzing the objectives and contents of personality education in the Home Economics area of the 7th Technology & Home Economics curriculum. The analysis shows that personality education can be implemented in all areas of Home Economics education. In particular, Home Economics education can, first, teach equality among family members by teaching the values of equality and respect for human rights. Second, it can teach students to recognize and practice various values related to environmental protection. Third, it can teach the ethics of care advocated by Gilligan and others: charity, forgiveness, friendship, love, sacrifice, concession, conversation, compromise, and so on. These values extend the ethics of care beyond care for the family alone, for which traditional Home Economics education was criticized as family selfishness, to care for others, the neighborhood, and the community. On the other hand, personality education in Home Economics differs from other subjects: it enables students to act through experience, not merely through emotion or knowledge, by learning from actual relationships among family members in daily life. This feature shows that Home Economics education can play a very effective role in achieving the objective of moral behavior. The results of this study demonstrate that Home Economics education is an effective subject for personality education, with objectives and contents that differ from other subjects, and this provides a strong rationale for Home Economics to be a required subject in school curricula. Future studies should empirically develop personality programs (teaching and learning activities) that can be implemented in Home Economics education and accumulate empirical data on such programs.

The way to make training data for deep learning model to recognize keywords in product catalog image at E-commerce (온라인 쇼핑몰에서 상품 설명 이미지 내의 키워드 인식을 위한 딥러닝 훈련 데이터 자동 생성 방안)

  • Kim, Kitae;Oh, Wonseok;Lim, Geunwon;Cha, Eunwoo;Shin, Minyoung;Kim, Jongwoo
    • Journal of Intelligence and Information Systems / v.24 no.1 / pp.1-23 / 2018
  • Since the beginning of the 21st century, various high-quality services have emerged with the growth of the internet and information and communication technologies. In particular, the e-commerce industry, in which Amazon and eBay stand out, has grown explosively. As e-commerce grows, customers can easily find what they want to buy while comparing various products, because more products are registered at online shopping malls. However, a problem has arisen with this growth: with so many products registered, it has become difficult for customers to find what they really need in the flood of products. When customers search for desired products with a generalized keyword, too many products come up as results; conversely, few products are found if customers type in product details, because concrete product attributes are rarely registered. In this situation, automatically recognizing text in images with a machine can be a solution. Because the bulk of product details are written in catalogs in image format, most product information cannot be found with text input in the current text-based search systems. If the information in images can be converted to text format, customers can search for products by product details, which makes shopping more convenient. There are various existing OCR (Optical Character Recognition) programs that can recognize text in images, but they are hard to apply to catalogs because they have trouble recognizing text in certain circumstances, such as when the text is not big enough or the fonts are not consistent. Therefore, this research suggests a way to recognize keywords in catalogs with deep learning algorithms, which have been the state of the art in image recognition since the 2010s. The Single Shot MultiBox Detector (SSD), a well-regarded model for object detection, can be used with its structure re-designed to take into account the differences between text and general objects. However, the SSD model needs a large amount of labeled training data, because deep learning models of this kind are trained by supervised learning. To collect data, location and classification information could be labeled on catalog text manually, but manual collection causes many problems: some keywords would be missed because humans make mistakes while labeling, the process becomes too time-consuming given the scale of data needed, or too costly if many workers are hired to shorten the time. Furthermore, if specific keywords need to be trained, finding images that contain those words is also difficult. To solve this data issue, this research developed a program that creates training data automatically. The program makes catalog-like images containing various keywords and pictures while saving the location information of the keywords at the same time. With this program, not only can data be collected efficiently, but the performance of the SSD model also improves. The SSD model recorded a recognition rate of 81.99% with 20,000 images created by the program. Moreover, this research tested the efficiency of the SSD model under different data conditions to analyze which features of the data influence the performance of recognizing text in images. As a result, it was found that the number of labeled keywords, the addition of overlapped keyword labels, the existence of unlabeled keywords, the spaces among keywords, and the differences in background images are related to the performance of the SSD model. This test can lead to performance improvements of the SSD model, or of other deep learning-based text-recognition models, through high-quality data. The SSD model re-designed to recognize text in images and the program developed for creating training data are expected to contribute to improving search systems in e-commerce: suppliers can spend less time registering keywords for products, and customers can search for products using the product details written in the catalog.
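
A hedged sketch of the training-data generator idea follows: keyword strings are drawn at random positions on a blank catalog-like canvas, and each keyword's bounding box is recorded as the label an SSD-style text detector would train on. The keyword list, canvas size, and font are placeholders, and Pillow 8 or later is assumed for ImageDraw.textbbox.

```python
# Hypothetical sketch: generate labeled synthetic catalog images for text detection.
import random
from PIL import Image, ImageDraw, ImageFont

KEYWORDS = ["cotton", "waterproof", "100ml", "free shipping"]   # illustrative

def make_sample(width=400, height=600, n_words=3, seed=None):
    rng = random.Random(seed)
    img = Image.new("RGB", (width, height), "white")
    draw = ImageDraw.Draw(img)
    font = ImageFont.load_default()
    labels = []                                 # (keyword, x0, y0, x1, y1)
    for _ in range(n_words):
        word = rng.choice(KEYWORDS)
        x, y = rng.randint(0, width - 100), rng.randint(0, height - 20)
        draw.text((x, y), word, fill="black", font=font)
        labels.append((word, *draw.textbbox((x, y), word, font=font)))
    return img, labels

img, labels = make_sample(seed=0)
img.save("synthetic_catalog_0.png")
print(labels)   # location + class labels for detector training
```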

Optimal supervised LSA method using selective feature dimension reduction (선택적 자질 차원 축소를 이용한 최적의 지도적 LSA 방법)

  • Kim, Jung-Ho;Kim, Myung-Kyu;Cha, Myung-Hoon;In, Joo-Ho;Chae, Soo-Hoan
    • Science of Emotion and Sensibility
    • /
    • v.13 no.1
    • /
    • pp.47-60
    • /
    • 2010
  • Most studies of classification have used kNN (k-Nearest Neighbor) and SVM (Support Vector Machine), which are known as learning-based models, and the Bayesian classifier and NNA (Neural Network Algorithm), which are known as statistics-based methods. However, there are limitations of space and time when classifying the enormous number of web pages on today's internet. Moreover, most classification studies use a uni-gram feature representation, which is not good at capturing the real meaning of words. In Korean web page classification, additional problems arise because Korean words often have multiple meanings (polysemy). For these reasons, LSA (Latent Semantic Analysis) has been proposed to classify well in this environment (large data sets and polysemous words). LSA uses SVD (Singular Value Decomposition), which decomposes the original term-document matrix into three matrices and reduces their dimensionality. From this decomposition, a new low-dimensional semantic space can be created for representing vectors, which makes classification efficient and allows the latent meaning of words or documents (web pages) to be analyzed. Although LSA is good at classification, it has drawbacks: because SVD reduces the dimensions of the matrix and creates the new semantic space by considering which dimensions represent vectors well rather than which dimensions discriminate between them, LSA does not improve classification performance as much as expected. In this paper, we propose a new supervised LSA that selects the optimal dimensions to both discriminate and represent vectors well, minimizing these drawbacks and improving performance. The proposed method shows better and more stable performance than other LSA variants in low-dimensional space. In addition, we obtain further improvement in classification by creating and selecting features, removing stopwords, and weighting specific values statistically.
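
As a rough sketch of the overall idea (not the authors' selection criterion), the pipeline below builds an LSA space with truncated SVD over TF-IDF features and then keeps only the latent dimensions that discriminate classes well, using a simple ANOVA F-test as a stand-in, before a kNN classifier; the toy corpus is invented for illustration.

```python
# Sketch: LSA (truncated SVD) followed by supervised dimension selection + kNN.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.decomposition import TruncatedSVD
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline

docs = ["stock market rises", "team wins the match", "shares fall sharply",
        "coach praises players"]
labels = ["economy", "sports", "economy", "sports"]   # toy data

model = make_pipeline(
    TfidfVectorizer(),
    TruncatedSVD(n_components=3, random_state=0),     # LSA semantic space
    SelectKBest(f_classif, k=2),                       # keep discriminative dims
    KNeighborsClassifier(n_neighbors=1),
)
model.fit(docs, labels)
print(model.predict(["market shares climb"]))
```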
