• Title/Summary/Keyword: Automatic validation

Grading System of Movie Review through the Use of An Appraisal Dictionary and Computation of Semantic Segments (감정어휘 평가사전과 의미마디 연산을 이용한 영화평 등급화 시스템)

  • Ko, Min-Su;Shin, Hyo-Pil
    • Korean Journal of Cognitive Science / v.21 no.4 / pp.669-696 / 2010
  • Assuming that the meaning of a whole document is composed of the meanings of its parts, this paper studies the automatic grading of movie reviews that contain sentiment expressions. This is accomplished by calculating the values of semantic segments and classifying each review. The ARSSA (Automatic Rating System for Sentiment Analysis using an Appraisal dictionary) system is an effort to model decision-making processes in a manner similar to that of the human mind. It aims to resolve the discontinuity between the numerical rating and the textual rationale in the binary structure of the current review rating system: {rate: review}. The model is realized by analyzing the abstract means extracted from each review. System performance was measured with a 10-fold cross-validation test on 1,000 reviews obtained from the Naver Movie site. The system achieved an F1 score of 85% when compared with predefined values using a predefined appraisal dictionary.

Automatic Construction of SHACL Schemas for RDF Knowledge Graphs Generated by R2RML Mappings

  • Choi, Ji-Woong
    • Journal of the Korea Society of Computer and Information / v.25 no.8 / pp.9-21 / 2020
  • With the proliferation of RDF knowledge graphs (KGs), a need arose for a standardized schema representation of the graph model to support effective data interchange and interoperability. This need led the W3C to develop the SHACL specification, which describes and validates the structure of RDF graphs. Relational databases (RDBs) are one of the major sources of structured knowledge. The standard for automatic generation of RDF KGs from RDBs is R2RML, which was also developed by the W3C. Since R2RML is designed to generate only RDF data graphs from RDBs, additional manual tasks are required to create the schemas for the graphs. In this paper we propose an approach to automatically generate SHACL schemas for RDF KGs populated by R2RML mappings. The key to our approach is that the SHACL schemas are built only from R2RML documents. We describe an implementation of our approach. Then, we show the validity of our approach with the R2RML test cases designed by the W3C.

Korean Syntactic Rules using Composite Labels (복합 레이블을 적용한 한국어 구문 규칙)

  • 김성용;이공주;최기선
    • Journal of KIISE:Software and Applications / v.31 no.2 / pp.235-244 / 2004
  • We propose a format of a binary phrase structure grammar with composite labels. The grammar adopts binary rules so that the dependency between two sub-trees can be represented in the label of the tree. The label of a tree is composed of two attributes, each of which is extracted from each sub-tree so that it can represent the compositional information of the tree. The composite label is generated from part-of-speech tags using an automatic labeling algorithm. Since the proposed rule description scheme is binary and uses only part-of-speech information, it can readily be used in dependency grammar and be applied to other languages as well. In the best-1 context-free cross validation on 31,080 tree-tagged corpus, the labeled precision is 79.30%, which outperforms phrase structure grammar and dependency grammar by 5% and by 4%, respectively. It shows that the proposed rule description scheme is effective for parsing Korean.

A Novel, Deep Learning-Based, Automatic Photometric Analysis Software for Breast Aesthetic Scoring

  • Joseph Kyu-hyung Park;Seungchul Baek;Chan Yeong Heo;Jae Hoon Jeong;Yujin Myung
    • Archives of Plastic Surgery / v.51 no.1 / pp.30-35 / 2024
  • Background: Breast aesthetics evaluation often relies on subjective assessments, leading to the need for objective, automated tools. We developed the Seoul Breast Esthetic Scoring Tool (S-BEST), a photometric analysis software that utilizes a DenseNet-264 deep learning model to automatically evaluate breast landmarks and asymmetry indices. Methods: S-BEST was trained on a dataset of frontal breast photographs annotated with 30 specific landmarks, divided into an 80-20 training-validation split. The software requires the sternal notch-to-nipple or nipple-to-nipple distance as input and performs image preprocessing steps, including ratio correction and 8-bit normalization. Breast asymmetry indices and centimeter-based measurements are provided as the output. The accuracy of S-BEST was validated using a paired t-test and Bland-Altman plots, comparing its measurements to those obtained from physical examinations of 100 females diagnosed with breast cancer. Results: S-BEST demonstrated high accuracy in automatic landmark localization, with most distances showing no statistically significant difference compared with physical measurements. However, the nipple-to-inframammary fold distance showed a significant bias, with a coefficient of determination ranging from 0.3787 to 0.4234 for the left and right sides, respectively. Conclusion: S-BEST provides a fast, reliable, and automated approach for breast aesthetic evaluation based on 2D frontal photographs. While limited by its inability to capture volumetric attributes or multiple viewpoints, it serves as an accessible tool for both clinical and research applications.
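
The paired comparison named in this abstract (paired t-test and Bland-Altman analysis) can be sketched as follows. This is only an illustration under assumptions: the function name `bland_altman` and the sample numbers are hypothetical, the 1.96 multiplier is the conventional choice for 95% limits of agreement, and none of it is taken from the S-BEST software itself.

```python
import math
import statistics


def bland_altman(method_a, method_b):
    """Return bias, 95% limits of agreement, and the paired t statistic.

    method_a and method_b are paired measurements of the same subjects,
    e.g. software output versus physical examination (hypothetical data).
    """
    diffs = [a - b for a, b in zip(method_a, method_b)]
    bias = statistics.mean(diffs)        # mean difference between methods
    sd = statistics.stdev(diffs)         # sample SD of the differences
    lower = bias - 1.96 * sd             # conventional 95% limits of agreement
    upper = bias + 1.96 * sd
    t_stat = bias / (sd / math.sqrt(len(diffs)))  # paired t statistic
    return bias, lower, upper, t_stat


# Hypothetical distances in centimeters, for illustration only.
software = [10.0, 12.0, 14.0]
physical = [9.0, 12.0, 13.0]
bias, lower, upper, t_stat = bland_altman(software, physical)
```

A bias close to zero with narrow limits of agreement suggests the two methods are interchangeable for that distance; a large bias corresponds to the kind of systematic offset the abstract reports for one measurement.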

Developing Surface Water Quality Modeling Framework Considering Spatial Resolution of Pollutant Load Estimation for Saemangeum Using HSPF (오염원 산정단위 수준의 소유역 세분화를 고려한 새만금유역 수문·수질모델링 적용성 검토)

  • Seong, Chounghyun;Hwang, Syewoon;Oh, Chansung;Cho, Jaepil
    • Journal of The Korean Society of Agricultural Engineers / v.59 no.3 / pp.83-96 / 2017
  • This study presents a surface water quality modeling framework that considers the spatial resolution of pollutant load estimation to better represent stream water quality characteristics in the Saemangeum watershed, where keeping water resources sustainable has been a focus since construction of the Saemangeum embankment. The watershed was delineated into 804 sub-watersheds in total, based on the administrative districts (the units for pollutant load estimation, of which there are 739 in the watershed), the Digital Elevation Model (DEM), and agricultural structures such as drainage canals. The established model consists of 7 Mangyung (MG) sub-models, 7 Dongjin (DJ) sub-models, and 3 Reclaimed sub-models, and the sub-models were simulated in sequence from upstream to downstream based on their connectivity. The hydrologic calibration and validation of the model were conducted at 14 flow stations for the period of 2009 and 2013 using an automatic calibration scheme. Model performance at the hydrologic stations for calibration and validation showed that, at a monthly time step, the Nash-Sutcliffe coefficient (NSE) ranged from 0.66 to 0.97, PBIAS from -31.0 to 16.5 %, and $R^2$ from 0.75 to 0.98, which demonstrates the model's hydrological applicability to the watershed. The water quality calibration and validation were conducted at 29 stations for the constituents DO, BOD, TN, and TP over the same period as the flow. The water quality model was manually calibrated and generally showed applicability, producing reasonable variability and seasonality, although some exceptional simulation results were identified at some upstream stations under low-flow conditions. The spatial subdivision in the model framework was compared with previous studies to assess the effect of considering administrative boundaries in watershed delineation; this study outperformed them in flow but showed a similar level of model performance in water quality. The framework presented here is applicable to regional-scale watersheds as well as to cases that need fine-resolution simulation.
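
The three goodness-of-fit statistics reported in this abstract (NSE, PBIAS, and $R^2$) can be computed as in the minimal sketch below. The function names and sample values are hypothetical. The PBIAS sign convention used here (positive means the model underestimates) is one common choice and is an assumption, since the abstract does not state which convention the authors used.

```python
def nse(observed, simulated):
    """Nash-Sutcliffe efficiency: 1 is a perfect fit, 0 matches the observed mean."""
    mean_obs = sum(observed) / len(observed)
    sse = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    sst = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - sse / sst


def pbias(observed, simulated):
    """Percent bias; with this sign convention, positive means underestimation."""
    return 100.0 * sum(o - s for o, s in zip(observed, simulated)) / sum(observed)


def r_squared(observed, simulated):
    """Squared Pearson correlation between observed and simulated series."""
    mean_o = sum(observed) / len(observed)
    mean_s = sum(simulated) / len(simulated)
    cov = sum((o - mean_o) * (s - mean_s) for o, s in zip(observed, simulated))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_s = sum((s - mean_s) ** 2 for s in simulated)
    return cov ** 2 / (var_o * var_s)


# Hypothetical monthly flows, for illustration only.
obs = [1.0, 2.0, 3.0]
sim = [2.0, 4.0, 6.0]
scores = (nse(obs, sim), pbias(obs, sim), r_squared(obs, sim))
```

In this toy series the simulated flow is perfectly correlated with the observations ($R^2 = 1$) yet doubles every value, so NSE and PBIAS are poor. This is why the three statistics are usually reported together.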

Automatic gasometer reading system using selective optical character recognition (관심 문자열 인식 기술을 이용한 가스계량기 자동 검침 시스템)

  • Lee, Kyohyuk;Kim, Taeyeon;Kim, Wooju
    • Journal of Intelligence and Information Systems / v.26 no.2 / pp.1-25 / 2020
  • In this paper, we propose an application system architecture that provides accurate, fast, and efficient automatic gasometer reading. The system captures a gasometer image with a mobile device camera, transmits the image to a cloud server over a private LTE network, and analyzes the image to extract the device ID and gas usage amount by selective optical character recognition based on deep learning. In general, an image contains many types of characters, and optical character recognition extracts all of them. Some applications, however, need to ignore characters that are not of interest and focus only on specific types. For example, an automatic gasometer reading system only needs to extract the device ID and gas usage amount from gasometer images in order to bill users. Character strings that are not of interest, such as device type, manufacturer, manufacturing date, and specifications, are not valuable to the application. The application therefore has to analyze only the region of interest and the specific character types to extract the valuable information. We adopted CNN (Convolutional Neural Network) based object detection and CRNN (Convolutional Recurrent Neural Network) technology for selective optical character recognition, which analyzes only the region of interest for selective character information extraction. We built three neural networks for the application system. The first is a convolutional neural network that detects the regions of interest containing the gas usage amount and device ID character strings; the second is another convolutional neural network that transforms the spatial information of the region of interest into spatial sequential feature vectors; and the third is a bi-directional long short-term memory network that converts the spatial sequential information into character strings using time-series analysis mapping from feature vectors to character strings. In this research, the character strings of interest are the device ID and gas usage amount. The device ID consists of 12 Arabic numerals and the gas usage amount consists of 4 to 5 Arabic numerals. All system components are implemented in the Amazon Web Services cloud with an Intel Xeon E5-2686 v4 CPU and an NVIDIA Tesla V100 GPU. The system architecture adopts a master-slave processing structure for efficient and fast parallel processing, coping with about 700,000 requests per day. The mobile device captures a gasometer image and transmits it to the master process in the AWS cloud. The master process runs on the Intel Xeon CPU and pushes reading requests from mobile devices to an input queue with a FIFO (First In First Out) structure. The slave process consists of the three types of deep neural networks that conduct character recognition and runs on the NVIDIA GPU module. The slave process continuously polls the input queue for recognition requests. If there are requests from the master process in the input queue, the slave process converts the image in the input queue into the device ID character string, the gas usage amount character string, and the position information of the strings, returns the information to the output queue, and switches to idle mode to poll the input queue. The master process gets the final information from the output queue and delivers it to the mobile device. We used a total of 27,120 gasometer images for training, validation, and testing of the three types of deep neural networks: 22,985 images were used for training and validation, and 4,135 images were used for testing. For each training epoch, we randomly split the 22,985 images in an 8:2 ratio for training and validation, respectively. The 4,135 test images were categorized into five types (normal, noise, reflex, scale, and slant). Normal data are clean images, noise means images with a noise signal, reflex means images with light reflection in the gasometer region, scale means images with a small object size due to long-distance capture, and slant means images that are not horizontally level. The final character string recognition accuracies for the device ID and gas usage amount on normal data are 0.960 and 0.864, respectively.
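
The queue-based request flow described in this abstract (a master pushing requests to a FIFO input queue, recognition processes taking requests and returning results to an output queue) can be sketched in a single process with threads. Everything here is an assumption made for illustration: `recognize` is a hypothetical stub standing in for the three neural networks, the names `worker` and `run_master` are invented, and the workers block on the queue rather than polling it as the abstract describes.

```python
import queue
import threading


def recognize(image):
    """Hypothetical stand-in for the detection + CRNN recognition pipeline."""
    return {"device_id": "ID-" + image, "usage": len(image)}


def worker(input_q, output_q):
    """Take requests from the FIFO input queue and push results to the output queue."""
    while True:
        request = input_q.get()      # blocks until a request is available
        if request is None:          # sentinel value: shut this worker down
            break
        request_id, image = request
        output_q.put((request_id, recognize(image)))


def run_master(images, n_workers=2):
    """Enqueue one request per image, wait for the workers, collect the results."""
    input_q, output_q = queue.Queue(), queue.Queue()   # queue.Queue is FIFO
    threads = [threading.Thread(target=worker, args=(input_q, output_q))
               for _ in range(n_workers)]
    for t in threads:
        t.start()
    for request_id, image in enumerate(images):
        input_q.put((request_id, image))
    for _ in threads:
        input_q.put(None)            # one sentinel per worker
    for t in threads:
        t.join()
    results = {}
    while not output_q.empty():      # safe here: all workers have finished
        request_id, result = output_q.get()
        results[request_id] = result
    return results


results = run_master(["meter-a", "meter-bb"])
```

Tagging each request with an ID lets the master match results to requests even when workers finish out of order, which matters once more than one recognition process serves the same input queue.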

Retrieving Protein Domain Encoding DNA Sequences Automatically Through Database Cross-referencing

  • Choi, Yoon-Sup;Yang, Jae-Seong;Ryu, Sung-Ho;Kim, Sang-Uk
    • Bioinformatics and Biosystems / v.1 no.2 / pp.95-98 / 2006
  • Recent proteomic studies of protein domains require high-throughput and systematic approaches. Since most experiments using protein domains, the modules of protein-protein interactions, require gene cloning, the first experimental step should be retrieving the DNA sequences of domain-encoding regions from databases. For large-scale proteomic research, however, it is laborious to extract a large number of domain sequences manually from several inter-linked databases. We present a new methodology to retrieve the DNA sequences of domain-encoding regions through automatic database cross-referencing. To extract protein domain-encoding regions, it traverses several inter-connected databases with a validation process. We applied this method to retrieve all the EGF domain-encoding DNA sequences of Homo sapiens. The new algorithm was implemented using the Python library PAMIE, which enables automatic cross-referencing across distinct databases.

Development of high-sensitivity wireless strain sensor for structural health monitoring

  • Jo, Hongki;Park, Jong-Woong;Spencer, B.F. Jr.;Jung, Hyung-Jo
    • Smart Structures and Systems / v.11 no.5 / pp.477-496 / 2013
  • Due to their cost-effectiveness and ease of installation, wireless smart sensors (WSS) have received considerable recent attention for structural health monitoring of civil infrastructure. Though various wireless smart sensor networks (WSSN) have been successfully implemented for full-scale structural health monitoring (SHM) applications, monitoring of low-level ambient strain remains a challenging problem for WSS due to A/D converter (ADC) resolution, inherent circuit noise, and the need for automatic operation. In this paper, the design and validation of a high-precision strain sensor board for the Imote2 WSS platform and its application to SHM of a cable-stayed bridge are presented. By accurate and automated balancing of the Wheatstone bridge, signal amplification of up to 2507 times can be obtained while keeping the signal mean close to the center of the ADC span, which allows utilization of the full span of the ADC. For better applicability to SHM of real-world structures, temperature compensation and shunt calibration are also implemented. Moreover, the sensor board has been designed to accommodate a friction-type magnet strain sensor, in addition to traditional foil-type strain gages, facilitating fast and easy deployment. The wireless strain sensor board performance is verified through both laboratory-scale tests and deployment on a full-scale cable-stayed bridge.

Composition and Use of Biosafety Level 3 Facility (생물안전 3등급 연구시설의 구성 및 이용)

  • Kim, Changhwan;Hur, Gyeunghaeng;Lee, Wangeol;Jung, Seongtae
    • Journal of the Korea Institute of Military Science and Technology / v.18 no.3 / pp.335-342 / 2015
  • Laboratory facilities for biology are designated as biosafety level 1, biosafety level 2, biosafety level 3, or biosafety level 4. Biosafety level designations are based on a composite of the design features, construction, containment facilities, equipment, practices, and operating procedures required for working with agents from the various risk groups. Generally, biosafety level 3 denotes a facility appropriate for experiments using pathogens that can cause serious diseases by aerosol transmission. The biosafety level assigned to the specific work to be done is driven by professional judgement based on a risk assessment, rather than by automatic assignment according to the particular risk group designation of the pathogenic agents to be used. In this paper, we introduce the biosafety level 3 facility operated by the ADD (Agency for Defense Development). The paper gives an overview of the facility and covers microbiological experiments, animal experiments, decontamination, and waste disposal. The biosafety level 3 laboratory in the ADD has served a vital role in research on biological agents and antidote development.

On variable bandwidth Kernel Regression Estimation (변수평활량을 이용한 커널회귀함수 추정)

  • Seog, Kyung-Ha;Chung, Sung-Suk;Kim, Dae-Hak
    • Journal of the Korean Data and Information Science Society / v.9 no.2 / pp.179-188 / 1998
  • Local polynomial regression is the most popular kernel-type regression estimator. As in kernel estimation, bandwidth selection is a crucial problem in local polynomial regression function estimation. When the regression curve has a complicated structure, variable bandwidth selection is appropriate. In this paper, we propose a fully data-driven variable bandwidth selection method. The bandwidth is chosen by minimizing the estimated MSE, which is estimated from a pilot bandwidth study via cross-validation. A Monte Carlo simulation was conducted to show the superiority of the proposed bandwidth selection method.
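
Cross-validated bandwidth selection can be illustrated with a deliberately simplified sketch. It is not the variable-bandwidth method proposed in this paper: it substitutes a local constant (Nadaraya-Watson) estimator with a Gaussian kernel for local polynomial regression, and it picks a single global bandwidth by leave-one-out cross-validation. All function names and the sample data are hypothetical.

```python
import math


def gaussian_kernel(u):
    """Unnormalized Gaussian kernel; the constant cancels in the estimator."""
    return math.exp(-0.5 * u * u)


def nw_estimate(x0, xs, ys, h, skip=None):
    """Nadaraya-Watson estimate at x0 with bandwidth h, optionally leaving out one point."""
    num = den = 0.0
    for i, (x, y) in enumerate(zip(xs, ys)):
        if i == skip:
            continue
        w = gaussian_kernel((x0 - x) / h)
        num += w * y
        den += w
    return num / den if den > 0.0 else None   # None if all weights underflow


def loocv_score(xs, ys, h):
    """Mean squared leave-one-out prediction error for bandwidth h."""
    total = 0.0
    for i in range(len(xs)):
        pred = nw_estimate(xs[i], xs, ys, h, skip=i)
        if pred is None:
            return math.inf                   # bandwidth too small to be usable
        total += (ys[i] - pred) ** 2
    return total / len(xs)


def select_bandwidth(xs, ys, candidates):
    """Pick the candidate bandwidth with the smallest leave-one-out error."""
    return min(candidates, key=lambda h: loocv_score(xs, ys, h))


# Hypothetical noise-free data on a grid, for illustration only.
xs = [i / 10 for i in range(21)]
ys = [math.sin(3 * x) for x in xs]
best_h = select_bandwidth(xs, ys, [0.05, 0.2, 1.0, 5.0])
```

A variable-bandwidth method would repeat a selection of this kind locally, so that regions where the curve bends sharply receive a smaller bandwidth than flat regions.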