• Title/Summary/Keyword: Term Extraction

339 search results

An Automatic Extraction of English-Korean Bilingual Terms by Using Word-level Presumptive Alignment (단어 단위의 추정 정렬을 통한 영-한 대역어의 자동 추출)

  • Lee, Kong Joo
    • KIPS Transactions on Software and Data Engineering / v.2 no.6 / pp.433-442 / 2013
  • A set of bilingual terms is one of the most important resources for building language-related applications such as machine translation systems and cross-lingual information systems. In this paper, we introduce a new approach that automatically extracts candidate English-Korean bilingual terms by using a bilingual parallel corpus and a basic English-Korean lexicon. This approach can be useful even when the parallel corpus is small. Sentence alignment is performed first on the document-level parallel corpus. Words between a pair of aligned sentences are then aligned by referencing the basic bilingual lexicon. For the words that remain unaligned, several presumptive assumptions, such as a word's position in the sentence, relations between words, and linguistic information about the two languages, are applied to align bilingual term candidates. Experimental results show approximately 71.7% accuracy for the English-Korean bilingual term candidates automatically extracted from 1,000 bilingual sentence pairs.
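The lexicon-based word alignment step described in the abstract can be sketched in a few lines. The lexicon, sentence pair, and greedy matching strategy below are illustrative assumptions, not the paper's actual data or algorithm:

```python
# Minimal sketch of lexicon-based word alignment between a sentence pair.
# The lexicon and sentences are toy examples.
def align_words(en_tokens, ko_tokens, lexicon):
    """Align words via a basic bilingual lexicon; return aligned index
    pairs and the leftover (unaligned) tokens on each side."""
    aligned, used_ko = [], set()
    for i, en in enumerate(en_tokens):
        for j, ko in enumerate(ko_tokens):
            if j not in used_ko and ko in lexicon.get(en, ()):
                aligned.append((i, j))
                used_ko.add(j)
                break
    aligned_en = {i for i, _ in aligned}
    un_en = [t for i, t in enumerate(en_tokens) if i not in aligned_en]
    un_ko = [t for j, t in enumerate(ko_tokens) if j not in used_ko]
    return aligned, un_en, un_ko

lexicon = {"machine": {"기계"}, "translation": {"번역"}}
pairs, un_en, un_ko = align_words(
    ["machine", "translation", "system"], ["기계", "번역", "시스템"], lexicon)
```

The leftover tokens on each side ("system" and "시스템" here) are exactly the words to which the paper's presumptive assumptions (sentence position, relations between words, linguistic information) would then be applied.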

Earthquake events classification using convolutional recurrent neural network (합성곱 순환 신경망 구조를 이용한 지진 이벤트 분류 기법)

  • Ku, Bonhwa;Kim, Gwantae;Jang, Su;Ko, Hanseok
    • The Journal of the Acoustical Society of Korea / v.39 no.6 / pp.592-599 / 2020
  • This paper proposes a Convolutional Recurrent Neural Network (CRNN) structure that can simultaneously reflect both the static and dynamic characteristics of seismic waveforms for classifying various earthquake events. Handling various earthquake events, including not only micro-earthquakes and artificial earthquakes but also macro-earthquakes, requires both effective feature extraction and a classifier that can discriminate seismic waveforms in noisy environments. First, we extract the static characteristics of the seismic waveform through an attention-based convolution layer. The extracted feature map is then fed sequentially into a multi-input single-output Long Short-Term Memory (LSTM) network to extract the dynamic characteristics for classifying the various seismic events. Finally, we classify earthquake events through two fully connected layers and a softmax function. Representative experimental results on domestic and foreign earthquake databases show that the proposed model provides an effective structure for classifying various earthquake events.
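As a rough illustration of the attention idea in the feature-extraction stage, the sketch below pools a toy feature sequence with softmax attention weights. The feature values and scores are made up, and the paper's actual layer is a trainable attention-based convolution, not this fixed pooling:

```python
import math

def attention_pool(features, scores):
    """Softmax-weighted average of a feature sequence: time steps with
    higher attention scores contribute more to the pooled vector."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]       # stable softmax
    z = sum(exps)
    weights = [e / z for e in exps]
    dim = len(features[0])
    return [sum(w * f[d] for w, f in zip(weights, features))
            for d in range(dim)]

# Three time steps of 2-D features; the middle step gets most attention.
pooled = attention_pool([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]],
                        [0.0, 5.0, 0.0])
```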

Development and Evaluation of a Badge-type Passive Sampler for the Measurement of Short-term Nitrogen Dioxide in Ambient Air (대기 중 이산화질소의 단기 측정을 위한 뱃지형 passive sampler의 개발 및 평가)

  • Kim Sun Kyu;Yim Bong Been;Jung Eui Suk;Kim Sun Tae
    • Journal of Korean Society for Atmospheric Environment / v.22 no.1 / pp.117-126 / 2006
  • The purpose of this study is to develop a badge-type passive sampler for the short-term measurement of nitrogen dioxide in ambient air and to evaluate its performance. The method is based on a colorimetric reaction of nitrogen dioxide with sulfanilic acid, N-1-naphthylethylenediamine, and phosphoric acid. First, it was found that the filter paper should be rinsed with ultrapure water under ultrasound and then dried in a vacuum desiccator. The concentration and volume of the absorption reagent (triethanolamine) were set to 20% and 100 μL, respectively, and the extraction time was determined as 60 min. Second, duplicate measurements (n = 116) were carried out to evaluate the precision of the passive sampler. The relative error and the correlation coefficient between duplicates were 3.4±3.0% and 0.994, respectively. In addition, the 95% confidence interval of the intraclass correlation coefficient was 0.992~0.996, with an estimated value of 0.994. Third, a paired t-test was carried out to evaluate the accuracy of the passive sampler (n = 40); the 95% confidence interval of the difference was -1.710 ppb < γ < 0.788 ppb. Finally, the average blank concentration, measurement detection limit, limit of detection, and limit of quantification were 2.4±0.4 ppb, 104 ppb, 3.8 ppb, and 7.0 ppb, respectively.

Characteristics of thunderstorms relevant to the wind loading of structures

  • Solari, Giovanni;Burlando, Massimiliano;De Gaetano, Patrizia;Repetto, Maria Pia
    • Wind and Structures / v.20 no.6 / pp.763-791 / 2015
  • "Wind and Ports" is a European project, carried out since 2009, that handles wind forecasting in port areas through an integrated system made up of an extensive in-situ wind monitoring network, numerical simulation of wind fields, statistical analysis of the wind climate, and algorithms for medium-term (1-3 days) and short-term (0.5-2 hours) wind forecasting. The in-situ wind monitoring network, currently made up of 22 ultrasonic anemometers, provides a unique opportunity for detecting high-resolution thunderstorm records and studying their dominant characteristics relevant to wind engineering, with special concern for wind actions on structures. In this framework, the wind velocity of thunderstorms is first decomposed into the sum of a slowly-varying mean part and a residual fluctuation treated as a non-stationary random process. The fluctuation, in turn, is expressed as the product of its slowly-varying standard deviation and a reduced turbulence component treated as a rapidly-varying stationary Gaussian random process with zero mean and unit standard deviation. The mean part of the wind velocity is extracted with a moving average filter, and the effect of the moving average period on the statistical properties of the decomposed signals is evaluated. Among other aspects, special attention is given to the thunderstorm duration, the turbulence intensity, the power spectral density, and the integral length scale. Some noteworthy wind velocity ratios that play a crucial role in the thunderstorm loading and response of structures are also analyzed.
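The decomposition described above, a slowly-varying mean extracted by a moving average filter plus a residual fluctuation, can be sketched as follows. The window length and velocity record are toy values, not the project's anemometer data:

```python
def moving_average(signal, window):
    """Slowly-varying mean: centered moving average, truncated at the
    record edges so every sample gets a mean value."""
    n = len(signal)
    half = window // 2
    mean = []
    for i in range(n):
        lo, hi = max(0, i - half), min(n, i + half + 1)
        mean.append(sum(signal[lo:hi]) / (hi - lo))
    return mean

velocity = [2.0, 2.5, 8.0, 9.5, 3.0, 2.2]   # toy record with a gust peak
mean = moving_average(velocity, 3)
residual = [v - m for v, m in zip(velocity, mean)]
```

In the paper the residual is further normalized by its own slowly-varying standard deviation to obtain the reduced turbulence component; the choice of averaging period is exactly the sensitivity the authors evaluate.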

Design of Very Short-term Precipitation Forecasting Classifier Based on Polynomial Radial Basis Function Neural Networks for the Effective Extraction of Predictive Factors (예보인자의 효과적 추출을 위한 다항식 방사형 기저 함수 신경회로망 기반 초단기 강수예측 분류기의 설계)

  • Kim, Hyun-Myung;Oh, Sung-Kwun;Kim, Hyun-Ki
    • The Transactions of The Korean Institute of Electrical Engineers / v.64 no.1 / pp.128-135 / 2015
  • In this study, we develop a very short-term precipitation forecasting model and classifier based on polynomial radial basis function neural networks (RBFNNs), using AWS (Automatic Weather Station) and KLAPS (Korea Local Analysis and Prediction System) meteorological data. The structure of the proposed RBFNNs consists of three modules: the condition, conclusion, and inference phases. The input space of the condition phase is partitioned using Fuzzy C-Means (FCM) clustering, and the local regions of the conclusion phase are represented by four types of polynomial functions. The connection weights are estimated by weighted least squares estimation (WLSE) for the forecasting model and by the least squares estimation (LSE) method for the classifier. The final output of the inference phase is obtained through fuzzy inference. The essential parameters of the proposed model and classifier, such as the input variables, polynomial order type, number of rules, and fuzzification coefficient, are optimized by means of Particle Swarm Optimization (PSO) and Differential Evolution (DE). The performance of the proposed precipitation forecasting system is evaluated using KLAPS meteorological data.
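The condition phase's FCM partitioning can be illustrated with the membership computation for fixed cluster centers. The 1-D points and centers below are toy values, and the fuzzification coefficient m (one of the parameters the paper optimizes with PSO/DE) is set to the common default of 2:

```python
def fcm_memberships(points, centers, m=2.0):
    """Fuzzy C-Means membership of each point in each cluster for fixed
    centers: u_ij = 1 / sum_k (d_ij / d_ik)^(2/(m-1))."""
    memberships = []
    for x in points:
        dists = [abs(x - c) for c in centers]
        if 0.0 in dists:
            # Point coincides with a center: crisp membership there.
            row = [1.0 if d == 0.0 else 0.0 for d in dists]
        else:
            row = [1.0 / sum((d / dk) ** (2.0 / (m - 1.0)) for dk in dists)
                   for d in dists]
        memberships.append(row)
    return memberships

u = fcm_memberships([0.0, 5.0, 10.0], centers=[0.0, 10.0])
```

A point midway between the two centers belongs to each with degree 0.5; in the full RBFNN these membership degrees gate the local polynomial models of the conclusion phase.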

Prevalence of Common YMDD Motif Mutations in Long-term Treated Chronic HBV Infections in a Turkish Population

  • Alagozlu, Hakan;Ozdemir, Ozturk;Koksal, Binnur;Yilmaz, Abdulkerim;Coskun, Mahmut
    • Asian Pacific Journal of Cancer Prevention / v.14 no.9 / pp.5489-5494 / 2013
  • In the current study we aimed to characterize the common YMDD motif mutations in the viral polymerase gene of chronic hepatitis B patients during lamivudine and adefovir therapy. Forty-one serum samples obtained from chronic hepatitis B patients (24 male, 17 female; age range: 34-68 years) were included in the study. HBV-DNA was extracted from the peripheral blood of the patients using an extraction kit (Invisorb Instant Spin DNA/RNA Virus Mini Kit, Germany). A line probe assay and direct sequencing analyses (INNO-LiPA HBV DR v2; INNOGENETICS N.V., Ghent, Belgium) were applied to determine target mutations of the viral polymerase gene in HBV-DNA-positive samples. A total of 41 mutations located in 21 different codons were detected. In 17 patients (41.5%), various point mutations leading to lamivudine, adefovir, and/or combined drug resistance were detected; wild-type polymerase gene profiles were found in the other 24 HBV-positive patients (58.5%) of the current cohort. Eight of the 17 mutated samples (19.5%) had rtM204V/I/A missense transition and/or transversion point mutations and showed resistance to lamivudine. Six of the mutated samples (14.6%) had the rtL180M missense transversion mutation and showed resistance to combined adefovir and lamivudine. Three of the mutated samples (7.5%) had rtG215H, arising from a double base substitution, and showed resistance to adefovir. Three of the mutated samples (7.5%) had rtL181W due to missense transversion point mutations and showed resistance to combined adefovir and lamivudine. Previously unreported point mutations were also detected in different codons of the polymerase gene region in the current HBV-positive cohort from the Turkish population. The current results provide evidence that rtL180M and rtM204V/I/A mutations of HBV-DNA may be associated with a poor antiviral response and HBV chronicity during conventional therapy in Turkish patients.

A Study on the Development of Thesaurus Using Terminological Definitions (용어 정의를 도입한 시소러스 개발 연구)

  • Kim, Tae-Soo
    • Journal of the Korean Society for information Management / v.18 no.2 / pp.231-254 / 2001
  • As contemporary thesauri have become large and complex, it is increasingly difficult to assess the intended meaning of each term. The meanings of many descriptors seem very similar, and it is often impossible to distinguish among them and identify the correct term. The purpose of this article is to introduce definitions of descriptors into the thesaurus by specifying the characteristics of each concept, locating it in its domain, and providing clear, prescriptive information on the meaning of each descriptor in the form of a standardized terminological definition. In this study, a small prototype thesaurus using definitions of terms in the field of the information industry in the Korean Standards has been developed. In this thesaurus, definitions are written for each descriptor with the help of a proposed defining model and in accordance with defining rules borrowed from the field of terminology. In addition, elements of the analyzed definitions have been included in the relation structure of the descriptors. It is shown that terminological definitions added to a thesaurus permit the extraction of separate items of information from the definitions for representing knowledge structures, and make it easier to confine the scope of the descriptors to be included in a thesaurus for a given subject field.


Spatial analysis of Shoreline change in Northwest coast of Taean Peninsula

  • Yun, MyungHyun;Choi, ChulUong
    • Korean Journal of Remote Sensing / v.31 no.1 / pp.29-38 / 2015
  • The coastline, influenced by both natural and artificial factors, changes dynamically. While long-term change is driven by sea-level rise and changes in river water levels, short-term change is driven by tides, earthquakes, and storms. In addition, thoughtless development, such as the construction of embankments and reclaimed land without considering coastal erosion and deformation, has degraded the functions of the coast and damaged the natural environment. To manage the coastal environment and resources effectively, this study analyzes and predicts coastal erosion and changes in sedimentation quantitatively by detecting shoreline changes from satellite images and aerial LiDAR data. The shorelines of 2007 and 2012 were extracted from a Digital Surface Model (DSM) produced from aerial LiDAR data. For the shorelines of 2009 and 2010, the Normalized Difference Vegetation Index (NDVI) method was applied to KOMPSAT-2 images selected after considering tide level and wave height. The rate of shoreline change varies with the form of the observation target, but most of the topography shows a tendency to erode over time. Compared to the relatively monotonous beaches of Taean, gravel and rock coasts have very complex forms; there are therefore more errors in shoreline extraction and in combining transects with the shoreline, which affect the overall change estimates. Correction of the anomalies caused by these properties is required in future research.
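The NDVI-based land/water separation used for the 2009 and 2010 shorelines can be sketched as below. The band reflectances and the zero threshold are illustrative assumptions, not values from the study:

```python
def ndvi(nir, red):
    """Normalized Difference Vegetation Index: (NIR - Red) / (NIR + Red).
    Water reflects very little near-infrared, so its NDVI is typically
    negative, while vegetated land is strongly positive."""
    return [(n - r) / (n + r) if (n + r) else 0.0
            for n, r in zip(nir, red)]

def land_mask(nir, red, threshold=0.0):
    """Classify pixels as land (True) where NDVI exceeds the threshold;
    the land/water boundary of this mask traces the shoreline."""
    return [v > threshold for v in ndvi(nir, red)]

# Toy 1-D transect: two water pixels followed by two land pixels.
mask = land_mask(nir=[0.05, 0.08, 0.40, 0.55],
                 red=[0.10, 0.12, 0.10, 0.08])
```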

An XML Tag Indexing Method Using Lexical Similarity (XML 태그를 분류에 따른 가중치 결정)

  • Jeong, Hye-Jin;Kim, Yong-Sung
    • The KIPS Transactions:PartB / v.16B no.1 / pp.71-78 / 2009
  • For more effective index extraction and index-weight determination, studies have extracted indices by using document structure as well as content. However, most studies concentrate on calculating the importance of content rather than that of XML tags, and they determine tag importance from a common-sense standpoint rather than verifying it through objective experiments. For automatic indexing using the tag information of XML documents, which have become the standard for web document management, this paper classifies the major tags of a paper according to their importance and calculates the weights of terms extracted from low-weight tags. Using the weights obtained, it proposes a method of calculating the final weight by updating the weights of terms also extracted from high-weight tags. To determine more objective weights, we survey which tags users consider important and reflect the results in the weight calculation by classifying tag importance accordingly. Finally, by comparing retrieval performance against index weights calculated with an existing method of determining tag importance, we verify the effectiveness of the index weights calculated with the proposed method.
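The idea of boosting a term's index weight when it also appears under high-importance tags can be sketched as follows. The tag set and numeric weights are made-up illustrations, not the weights the paper derives from its user survey:

```python
# Illustrative tag-aware index weighting: each occurrence of a term
# contributes the weight of the tag it appears under, so terms found in
# high-importance tags accumulate larger final weights.
TAG_WEIGHTS = {"title": 3.0, "abstract": 2.0, "body": 1.0}

def index_weights(tagged_terms):
    """tagged_terms: list of (tag, term) occurrences.
    Returns a mapping term -> accumulated weight."""
    weights = {}
    for tag, term in tagged_terms:
        weights[term] = weights.get(term, 0.0) + TAG_WEIGHTS.get(tag, 1.0)
    return weights

w = index_weights([("body", "xml"), ("title", "xml"), ("body", "index")])
```

Here "xml" appears in both the body and the title, so its body-derived weight is updated by the title occurrence, which is the update pattern the paper describes.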

Concept Extraction Technique from Documents Using Domain Ontology (지식 문서에서 도메인 온톨로지를 이용한 개념 추출 기법)

  • Mun Hyeon-Jeong;Woo Yong-Tae
    • The KIPS Transactions:PartD / v.13D no.3 s.106 / pp.309-316 / 2006
  • We propose a novel technique to categorize XML documents and extract concepts efficiently using a domain ontology. First, we create a domain ontology using text mining and statistical techniques. We then propose a DScore technique that classifies XML documents by exploiting the structural characteristics of XML, and a TScore technique that extracts concepts by comparing the association term sets of the domain ontology with the terms in an XML document. To verify the efficiency of the proposed techniques, we performed experiments on 295 papers in the computer science area. The results show that the proposed technique, which uses the structural information in XML documents, is more efficient than existing techniques. In particular, the TScore technique effectively extracts the concept of a document even when term frequencies are low. Hence, the proposed concept-based retrieval techniques can be expected to contribute to the development of efficient ontology-based knowledge management systems.
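The TScore idea, matching a document's terms against each concept's association term set, can be sketched with a simple overlap score. The ontology, document terms, and scoring formula here are illustrative and may differ from the paper's actual TScore definition:

```python
# Sketch of concept extraction against a toy domain ontology: score each
# concept by the fraction of its association term set found in the
# document, and pick the best-scoring concept.
def best_concept(doc_terms, ontology):
    """ontology: concept -> set of associated terms.
    Returns (best concept, all concept scores)."""
    doc = set(doc_terms)
    scores = {concept: len(terms & doc) / len(terms)
              for concept, terms in ontology.items() if terms}
    return max(scores, key=scores.get), scores

ontology = {
    "information retrieval": {"index", "query", "ranking"},
    "databases": {"table", "query", "transaction"},
}
concept, scores = best_concept(["index", "query", "xml", "ranking"],
                               ontology)
```

Because the score is a coverage ratio over the concept's term set rather than a raw count, a concept can still win even when each matching term occurs only once, mirroring the paper's observation that TScore works well at low term frequencies.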