• Title/Summary/Keyword: Co-occurrence


Korean Probabilistic Syntactic Model using Head Co-occurrence (중심어 간의 공기정보를 이용한 한국어 확률 구문분석 모델)

  • Lee, Kong-Joo;Kim, Jae-Hoon
    • The KIPS Transactions:PartB / v.9B no.6 / pp.809-816 / 2002
  • Because natural language is inherently structurally ambiguous, one of the main difficulties of parsing is resolving that ambiguity. Probabilistic approaches to disambiguation have recently received considerable attention because they offer automatic learning, wide coverage, and robustness. This paper focuses on a Korean probabilistic parsing model that uses head co-occurrence. Because head co-occurrence is lexical, it is prone to the data sparseness problem, so handling sparseness is the central concern. To alleviate the problem, the model uses a restricted and simplified phrase-structure grammar together with a back-off model for smoothing. The proposed model achieves an accuracy of about 84%.
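
The back-off idea mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' model: the function names, the toy head pairs, and the back-off weight `alpha = 0.4` are all assumptions made for illustration, and the result is a relative score rather than a normalized probability.

```python
from collections import Counter

def train(head_pairs):
    """Count (governor head, dependent head) pairs and their marginals."""
    pair_counts = Counter(head_pairs)
    governor_counts = Counter(gov for gov, _ in head_pairs)
    dependent_counts = Counter(dep for _, dep in head_pairs)
    return pair_counts, governor_counts, dependent_counts

def backoff_score(governor, dependent, model, alpha=0.4):
    """Use the lexical pair estimate when the pair was seen; otherwise
    back off to a down-weighted estimate based on the dependent alone.
    alpha is an assumed constant, not a value from the paper."""
    pair_counts, governor_counts, dependent_counts = model
    seen = pair_counts[(governor, dependent)]
    if seen > 0:
        return seen / governor_counts[governor]
    total = sum(dependent_counts.values())
    return alpha * dependent_counts[dependent] / total

# Hypothetical training pairs (governor, dependent).
model = train([("eat", "apple"), ("eat", "apple"), ("eat", "rice"), ("read", "book")])
seen_score = backoff_score("eat", "apple", model)     # pair observed in training
unseen_score = backoff_score("read", "apple", model)  # pair unseen, backs off
```

The sketch shows why back-off helps with sparse lexical data: an unseen head pair still receives a nonzero score as long as the dependent head occurred somewhere in training.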

Multi-Topic Meeting Summarization using Lexical Co-occurrence Frequency and Distribution (어휘의 동시 발생 빈도와 분포를 이용한 다중 주제 회의록 요약)

  • Lee, Byung-Soo;Lee, Jee-Hyong
    • Proceedings of the Korean Society of Computer Information Conference / 2015.07a / pp.13-16 / 2015
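  • This paper proposes a meeting summarization method that uses the co-occurrence frequency and distribution of words. Unlike ordinary documents, meeting transcripts contain several fine-grained topics within a single document, as well as ill-formed sentences and unnecessary small talk, so these characteristics must be taken into account during summarization. Conventional document summarization methods summarize a document by its most important sentences on the basis of a single topic, so they are not suited to multi-topic meeting transcripts. The proposed method first segments the meeting transcript using word co-occurrence frequency. Next, for each block separated by topic, it builds a set of important words and extracts important sentences, thereby selecting the important sentences of the transcript. Finally, it produces the summary by considering the positions and dependency relations of the extracted sentences. Experiments on the AMI meeting corpus confirmed that the proposed method outperforms baseline summarization methods both in the evaluation by summarization ratio and in the evaluation of the summaries by subtopic.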


A Study for the Generation of the Lightweight Ontologies (경량 온톨로지 생성 연구)

  • Han, Dong-Il;Kwon, Hyeong-In;Baek, Sun-Kyoung
    • Journal of Information Technology Services / v.8 no.1 / pp.203-215 / 2009
  • This paper illustrates the application of co-occurrence theory to generate lightweight ontologies semi-automatically. The proposed model comprises three steps of (semi-)automatic ontology creation: the Syntactic-based Ontology, the Semantic-based Ontology, and the Ontology Refinement. The three steps are designed to work together interactively to generate lightweight ontologies. The Syntactic-based Ontology step generates association words using co-occurrence in web documents. The Semantic-based Ontology step aligns the large set of association words with a small ontology, using semantic relations derived from contextual terms. Finally, in the Ontology Refinement step, a domain expert refines the lightweight ontologies. We also conducted a case study that generated lightweight ontologies in a specific domain (the news domain). The paper identifies two directions: (1) employing co-occurrence theory to generate the Syntactic-based Ontology automatically and (2) aligning large sets of association words with a small ontology to generate lightweight ontologies semi-automatically. As far as the design and generation of large ontologies are concerned, the proposed research offers useful implications for researchers and practitioners seeking to move from research toward commercial use.

Texture analysis of Thyroid Nodules in Ultrasound Image for Computer Aided Diagnostic system (컴퓨터 보조진단을 위한 초음파 영상에서 갑상선 결절의 텍스쳐 분석)

  • Park, Byung eun;Jang, Won Seuk;Yoo, Sun Kook
    • Journal of Korea Multimedia Society / v.20 no.1 / pp.43-50 / 2017
  • The number of deaths due to thyroid diseases has increased, a trend attributed to the living environment. In this paper, we propose an algorithm for detecting thyroid nodules using texture analysis based on shape, the gray level co-occurrence matrix (GLCM), and the gray level run length matrix (GLRLM). First, we segmented the region of interest (ROI) using an active contour model algorithm. Then, we applied a total of 18 features (5 first order descriptors, 10 GLCM features, 2 GLRLM features, and a shape feature) to each thyroid region of interest. The extracted features were then used in a statistical analysis. Our results show that first order statistics (Skewness, Entropy, Energy, Smoothness), GLCM features (Correlation, Contrast, Energy, Entropy, Difference Variance, Difference Entropy, Homogeneity, Maximum Probability, Sum Average, Sum Entropy), GLRLM features, and the shape feature helped to distinguish benign from malignant thyroid nodules. This algorithm will be helpful for diagnosing thyroid nodules in ultrasound images.
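
As a rough illustration of the GLCM features named in this abstract, the sketch below counts gray-level pairs at a fixed pixel offset and derives a few texture statistics. It is a minimal sketch, not the authors' pipeline: the function names, the horizontal offset, the non-symmetric counting, and the 4x4 toy image are assumptions.

```python
import math
from collections import Counter

def glcm(image, dx=1, dy=0):
    """Count gray-level pairs (i, j) where j lies at offset (dx, dy) from i."""
    counts = Counter()
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                counts[(image[r][c], image[r2][c2])] += 1
    return counts

def glcm_features(counts):
    """Normalize the counts to probabilities and compute texture statistics."""
    total = sum(counts.values())
    probs = {pair: n / total for pair, n in counts.items()}
    return {
        "contrast": sum(p * (i - j) ** 2 for (i, j), p in probs.items()),
        "energy": sum(p * p for p in probs.values()),
        "entropy": -sum(p * math.log2(p) for p in probs.values()),
        "max_probability": max(probs.values()),
    }

# Toy 4x4 image with four gray levels (assumed data, not from the paper).
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 2, 2, 2],
    [2, 2, 3, 3],
]
counts = glcm(image)
features = glcm_features(counts)
```

In practice the matrix is usually computed for several offsets and angles and the resulting features are averaged or concatenated; this sketch uses a single horizontal offset for brevity.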

Research trends related to childhood and adolescent cancer survivors in South Korea using word co-occurrence network analysis

  • Kang, Kyung-Ah;Han, Suk Jung;Chun, Jiyoung;Kim, Hyun-Yong
    • Child Health Nursing Research / v.27 no.3 / pp.201-210 / 2021
  • Purpose: This study analyzed research trends related to childhood and adolescent cancer survivors (CACS) using word co-occurrence network analysis on studies registered in the Korean Citation Index (KCI). Methods: This word co-occurrence network analysis study explored major research trends by constructing a network based on relationships between keywords (semantic morphemes) in the abstracts of published articles. Research articles published in the KCI over the past 10 years were collected using the Biblio Data Collector tool included in the NetMiner Program (version 4), using "cancer survivors", "adolescent", and "child" as the main search terms. After pre-processing, analyses were conducted on centrality (degree and eigenvector), cohesion (community), and topic modeling. Results: For centrality, the top 10 keywords included "treatment", "factor", "intervention", "group", "radiotherapy", "health", "risk", "measurement", "outcome", and "quality of life". In terms of cohesion and topic analysis, three categories were identified as the major research trends: "treatment and complications", "adaptation and support needs", and "management and quality of life". Conclusion: The keywords from the three main categories reflected interdisciplinary identification. Many studies on adaptation and support needs were identified in our analysis of nursing literature. Further research on managing and evaluating the quality of life among CACS must also be conducted.
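
The keyword network and degree centrality described in this abstract can be sketched in a few lines. This is a minimal sketch under assumptions: the study used NetMiner, whereas the code below is plain Python, and the function names and the three toy keyword lists are hypothetical.

```python
from collections import Counter
from itertools import combinations

def cooccurrence_edges(documents):
    """Weight each keyword pair by the number of documents containing both."""
    edges = Counter()
    for keywords in documents:
        for pair in combinations(sorted(set(keywords)), 2):
            edges[pair] += 1
    return edges

def degree_centrality(edges):
    """Fraction of the other keywords that each keyword co-occurs with."""
    neighbors = {}
    for a, b in edges:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)
    others = len(neighbors) - 1
    return {word: len(adjacent) / others for word, adjacent in neighbors.items()}

# Hypothetical keyword lists, one per abstract.
abstracts = [
    ["treatment", "risk", "outcome"],
    ["treatment", "risk"],
    ["treatment", "health"],
]
edges = cooccurrence_edges(abstracts)
centrality = degree_centrality(edges)
```

Here "treatment" co-occurs with every other keyword, so it receives the highest degree centrality; eigenvector centrality, community detection, and topic modeling would require additional steps not shown.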

Correlational Structure Modelling for Fall Accident Risk Factors of Portable Ladders Using Co-occurrence Keyword Networks (동시 출현 기반 키워드 네트워크 기법을 이용한 이동식 사다리 추락 재해 위험 요인 연관 구조 모델링)

  • Hwang, Jong Moon;Shin, Sung Woo
    • Journal of the Korean Society of Safety / v.36 no.3 / pp.50-59 / 2021
  • The main purpose of accident analysis is to identify the causal factors and the mechanisms of those factors leading to the accident. However, current accident analysis techniques focus only on finding the factors related to the accident without providing more insightful results, such as structures or mechanisms. For this reason, preventive actions for safety management are concentrated on the elimination of causal factors rather than blocking the connection or chain of accident processes. This greatly reduces the effectiveness of safety management in practice. In the present study, a technique to model the correlational structure of accident risk factors is proposed by using the co-occurrence keyword network analysis technique. To investigate the effectiveness of the proposed technique, a case study involving a portable ladder fall accident is conducted. The results indicate that the proposed technique can construct the correlational structure model of the risk factors of a portable ladder fall accident. This proves the effectiveness of the proposed technique in modeling the correlational structure of accident risk factors.

Color Component Analysis For Image Retrieval (이미지 검색을 위한 색상 성분 분석)

  • Choi, Young-Kwan;Choi, Chul;Park, Jang-Chun
    • The KIPS Transactions:PartB / v.11B no.4 / pp.403-410 / 2004
  • Image analysis has recently been actively studied as a preprocessing stage for medical image analysis and image retrieval. This paper proposes a way of using color components for image retrieval. Retrieval is based on color components, and color is analyzed with the CLCM (Color Level Co-occurrence Matrix) and statistical techniques. The CLCM proposed in this paper projects color components onto 3D space through a geometric rotation transform and then interprets the distribution formed by their spatial relationships. The CLCM is a 2D histogram built in a color model that is itself created through a geometric rotation transform of a color model, and it is analyzed with a statistical technique. Like the CLCM, the GLCM (Gray Level Co-occurrence Matrix) [1] and Invariant Moment [2,3] use a 2D distribution chart and interpret the 2D data with basic statistical techniques. However, even though the GLCM and Invariant Moment are optimized for their respective domains, they cannot perfectly interpret irregular data in spatial coordinates. That is, because they use only basic statistical techniques, the reliability of the extracted features is low. To interpret the spatial relationships and weights of the data, this study uses Principal Component Analysis [4,5], a technique from multivariate statistics. To increase accuracy, it proposes projecting color components onto 3D space, rotating them, and then extracting features of the data from all angles.

Current Research Trends in Entrepreneurship Based on Topic Modeling and Keyword Co-occurrence Analysis: 2002~2021 (토픽모델링과 동시출현단어 분석을 이용한 기업가정신에 대한 연구동향 분석: 2002~2021)

  • Jang, Sung Hee
    • Asia-Pacific Journal of Business Venturing and Entrepreneurship / v.17 no.3 / pp.245-256 / 2022
  • The purpose of this study is to provide comprehensive insights into current research trends in entrepreneurship based on topic modeling and keyword co-occurrence analysis. The study queried the Web of Science database with the term 'entrepreneurship' and collected 14,953 research articles published between 2002 and 2021. It used R for topic modeling and VOSviewer for keyword co-occurrence analysis. The results are as follows. First, the keyword co-occurrence analysis yielded five clusters: an entrepreneurship and innovation cluster, an entrepreneurship education cluster, a social entrepreneurship and sustainability cluster, an enterprise performance cluster, and a knowledge and technology transfer cluster. Second, the topic modeling analysis found 12 topics, including start-up environment and economic development, international entrepreneurship, venture capital, government policy and support, social entrepreneurship, management-related issues, regional city planning and development, entrepreneurship research, and entrepreneurial intention. Finally, the study identified two hot topics (venture capital and entrepreneurial intention) and one cold topic (international entrepreneurship). The results are useful for understanding current research trends in entrepreneurship and provide insights for future entrepreneurship research.

A Study on Safety of Hydrogen Station (수소충전소의 안전성에 관한 연구)

  • Ko, Jae-Wook;Lee, Dae-Hee;Jung, In-Hee
    • Journal of the Korean Institute of Gas / v.13 no.1 / pp.45-51 / 2009
  • A safety assessment was performed through a process analysis of a hydrogen station. The purpose of this study is to provide basic information for establishing standards for hydrogen stations. The processes of a hydrogen station were classified into four steps (manufacture, compression, storage, and charging). The FMEA (Failure Mode and Effect Analysis) method was applied to evaluate safety. Each risk element was rated on three factors: S (severity), O (occurrence), and D (detection). Priorities were then determined using the RPN (Risk Priority Number), obtained by multiplying the three factors. Scenarios were generated based on the FMEA results, and consequence analysis (CA) was carried out using the PHAST program. The consequence analysis showed jet fire and explosion as the accident types. For a leak in the feed line of the PSA process, the concentration of CO gas is also considered, in order to prevent CO poisoning when a raw material that can produce CO gas is used.
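
The RPN ranking described in this abstract is the product of the three FMEA ratings, which can be sketched as follows. The failure modes and their ratings below are hypothetical placeholders, not values from the study, and the function name is an assumption.

```python
def rank_by_rpn(failure_modes):
    """Return (name, RPN) pairs sorted from highest to lowest priority.

    failure_modes maps a name to (severity, occurrence, detection) ratings,
    and RPN = severity * occurrence * detection.
    """
    scored = [(name, s * o * d) for name, (s, o, d) in failure_modes.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)

# Hypothetical ratings on a 1-10 scale, for illustration only.
modes = {
    "feed line leak": (9, 3, 4),
    "compressor seal wear": (6, 5, 2),
    "dispenser hose damage": (7, 4, 5),
}
ranking = rank_by_rpn(modes)
```

With these assumed ratings the hose damage ranks first (RPN 140) even though the feed line leak has the highest severity, which shows how occurrence and detection ratings can change the priority order.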
