• Title/Summary/Keyword: Text Construction


Implementation of Very Large Hangul Text Retrieval Engine HMG (대용량 한글 텍스트 검색 엔진 HMG의 구현)

  • 박미란;나연묵
    • Journal of Korea Multimedia Society / v.1 no.2 / pp.162-172 / 1998
  • In this paper, we implement HMG (Hangul MG), a gigabyte-scale Hangul text retrieval engine based on the English text retrieval engine MG (Managing Gigabytes) and the Hangul lexical analyzer HAM (Hangul Analysis Module). To support Hangul information, we use the KSC 5601 code in the database construction and query processing stages. The lexical analyzer, parser, and index construction module of the MG system are modified to support Hangul information. To show the usefulness of the HMG system, we implemented NOD (Novel On Demand), a system that supports the retrieval of Hangul novels on the WWW. The proposed HMG system can be used to build massive full-text information retrieval systems that support Hangul.


Automatic Construction of Korean Unknown Word Dictionary using Occurrence Frequency in Web Documents (웹문서에서의 출현빈도를 이용한 한국어 미등록어 사전 자동 구축)

  • Park, So-Young
    • Journal of the Korea Society of Computer and Information / v.13 no.3 / pp.27-33 / 2008
  • In this paper, we propose a method that automatically constructs a dictionary by extracting unknown words from given eojeols, in order to improve the performance of a Korean morphological analyzer. The proposed method consists of a dictionary construction phase based on full-text analysis and a dictionary construction phase based on web document frequency. The first phase recognizes unknown words among strings that occur repeatedly in a given full text, while the second phase recognizes unknown words among strings that occur only once in the text, based on how frequently each string is retrieved from web documents. Experimental results show that, by using web document frequency, the proposed method improves recall by 32.39% compared with a previous method.
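The two phases described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the prefix-counting heuristic, the function names, the thresholds, and the `web_frequency` lookup table (standing in for a real web search count) are all assumptions made for illustration.

```python
from collections import Counter

def candidate_unknown_words(eojeols, known_words, min_count=2, min_len=2):
    """Phase-1 sketch: eojeol prefixes that recur in the text and are not in the dictionary."""
    counts = Counter()
    for eojeol in eojeols:
        # Count every prefix of length >= min_len, including the whole eojeol.
        for end in range(min_len, len(eojeol) + 1):
            counts[eojeol[:end]] += 1
    return {s: c for s, c in counts.items()
            if c >= min_count and s not in known_words}

def filter_by_web_frequency(strings, web_frequency, threshold):
    """Phase-2 sketch: keep strings whose (hypothetical) web document count meets a threshold."""
    return [s for s in strings if web_frequency.get(s, 0) >= threshold]

# Hypothetical input: four eojeols and a one-word dictionary.
eojeols = ["블로그를", "블로그에서", "학교에", "블로그가"]
candidates = candidate_unknown_words(eojeols, known_words={"학교"})

# Hypothetical web counts; a real system would query a search engine here.
confirmed = filter_by_web_frequency(sorted(candidates), {"블로": 3, "블로그": 1000}, 100)
```

Phase 1 alone cannot tell the true stem from its shorter prefix, which is one reason an external frequency signal such as the web count is useful.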


Text-Mining Analysis of Korea Government R&D Trends in Construction Machinery Domains (텍스트 마이닝을 통한 건설기계분야 국내 정부 R&D 연구동향 분석)

  • Bom Yun;Joonsoo Bae
    • Journal of Korean Society of Industrial and Systems Engineering / v.46 no.spc / pp.1-8 / 2023
  • To investigate the national science and technology policy direction in the field of construction machinery, this study analyzed projects selected by the government as national research and development (R&D) initiatives. On the assumption that project titles contain the key terms, text mining was applied to the titles. Project information spanning the nine years from 2014 to 2022 was collected through the National Science & Technology Information Service (NTIS). To observe changes over time, the years were divided into three-year periods. To analyze research trends efficiently, keywords were categorized into three groups: 'equipment,' 'smart,' and 'eco-friendly.' Keyword frequency analysis, N-gram analysis, and topic modeling were performed on the collected data. The findings indicate that domestic government R&D in the construction machinery field focuses primarily on smart-related research and development; in particular, investment in monitoring systems and autonomous operation technologies is increasing. This study is significant in that it analyzes research trends objectively using big data analysis techniques, and it is expected to contribute to future R&D planning, strategy formulation, and project management.
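As a rough illustration of the keyword frequency and N-gram analysis mentioned in this abstract, the sketch below counts unigrams and bigrams over project titles. The titles, the whitespace tokenization, and the function names are hypothetical; the study's actual preprocessing is not described here.

```python
from collections import Counter

def ngrams(tokens, n):
    """Return all contiguous n-token tuples from a token list."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def count_terms(titles, n=1):
    """Count n-grams across titles, using lowercase whitespace tokenization."""
    counts = Counter()
    for title in titles:
        counts.update(ngrams(title.lower().split(), n))
    return counts

# Hypothetical project titles, not taken from NTIS.
titles = [
    "autonomous excavator monitoring system",
    "smart excavator monitoring system",
    "eco-friendly hydraulic excavator",
]
unigrams = count_terms(titles, n=1)
bigrams = count_terms(titles, n=2)
print(unigrams.most_common(3))
print(bigrams.most_common(3))
```

Running the same counts separately for each three-year period would give the kind of over-time comparison the abstract describes.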

Comparison of Three Preservice Elementary School Teachers' Simulation Teaching in Terms of Data-text Transforming Discourses (Data-Text 변형 담화의 측면에서 본 세 초등 예비교사의 모의수업 시연 사례의 비교)

  • Maeng, Seungho
    • Journal of Korean Elementary Science Education / v.41 no.1 / pp.93-105 / 2022
  • This study investigated how three preservice elementary school teachers conducted data-text transforming discourses in their science simulation teaching and how their epistemological conversations supported learners' construction of scientific knowledge. Three preservice teachers who had presented simulation teaching on the seasonal change of constellations participated in the study. The results revealed that one preservice teacher, who had implemented the transforming discourses of data-to-evidence and model-to-explanation, appeared to facilitate learners' knowledge construction. The other two preservice teachers had difficulty helping learners construct science knowledge because they lacked transforming discourses. Considerations for improving preservice elementary school teachers' teaching competencies are discussed based on a detailed comparison of the three cases of data-text transforming.

Analysis of accident types at small and medium-sized construction sites based on web scraping and text mining (웹 스크래핑 및 텍스트마이닝에 기반한 중소규모 건설현장 사고유형 분석)

  • Younggeun Yoon
    • The Journal of the Convergence on Culture Technology / v.10 no.1 / pp.609-615 / 2024
  • The construction industry's fatality count stands at 402, approximately 46% of total industrial accidents. Notably, sites with construction costs of less than 5 billion won account for about 69%, so safety management at small and medium-sized construction sites needs to be strengthened. In this study, 19,511 accident investigation records were collected using web scraping. Through statistical analysis of the structured data and text mining of the unstructured data, accident types and causes were analyzed by construction cost for sites under 5 billion won. The analysis confirmed that accident types and causes differ depending on construction cost. It is hoped that the results of this study will be used for customized safety management at small and medium-sized construction sites.

A Method for Text Information Separation from Floorplan Using SIFT Descriptor

  • Shin, Yong-Hee;Kim, Jung Ok;Yu, Kiyun
    • Korean Journal of Remote Sensing / v.34 no.4 / pp.693-702 / 2018
  • With the development of data analysis methods and data processing capabilities, semantic analysis of floorplans has been actively studied, and studies on extracting text information from drawings have been conducted to support it. However, existing research that separates rasterized text from floorplans suffers from loss of text information, because text cannot be extracted where graphic and text components overlap. To solve this problem, this study defines the morphological characteristics of text in floorplans and classifies each corresponding region by applying the classes that SVM models assign to its SIFT key points. The algorithm developed in this study separated text components with a recall of 94.3% on five sample drawings.

RECENT RESEARCH AND DEVELOPING TREND OF ENGINEERING MANAGEMENT IN CHINA BASED ON TEXT MINING

  • Shaohua Jiang;Wenling Zhang;Zhaohong Qiu;Shaojun Wang
    • International conference on construction engineering and project management / 2009.05a / pp.814-820 / 2009
  • With the rapid development of China's economy, many large-scale, high-investment engineering projects have been constructed in China, some of them the biggest in the world. Along with this engineering practice, research on engineering management in China has made great progress, and a large number of research findings are embodied in research papers and represented by technical words. To understand the state of the art of engineering management research in China, three major parts of research papers (title, abstract, and keywords) from three representative Chinese engineering management journals over the last five years were chosen as research materials. Unlike Western languages, Chinese has no delimiters between words, so the maximum matching and frequency statistics (MMFS) method, a text segmentation technique for Chinese text mining, is presented to extract features consisting of technical words, phrases, and words from the research materials. Recent research and developing trends of engineering management in China were identified by comparing and analyzing the differences in technical words across the research materials of the last five years.
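The abstract does not give the details of MMFS, so the following is only a sketch of one common variant, forward maximum matching followed by a frequency count. The dictionary contents, the `max_len` window, and the single-character fallback are assumptions for illustration.

```python
from collections import Counter

def forward_max_match(text, dictionary, max_len=4):
    """Greedy segmentation: at each position take the longest dictionary word,
    falling back to a single character when nothing matches."""
    words = []
    i = 0
    while i < len(text):
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            if size == 1 or piece in dictionary:
                words.append(piece)
                i += size
                break
    return words

# Hypothetical dictionary of technical words.
dictionary = {"工程", "管理", "工程管理", "研究"}
words = forward_max_match("工程管理研究", dictionary)

# Frequency statistics over the segmented words of several (hypothetical) titles.
frequency = Counter()
for title in ["工程管理研究", "工程的管理"]:
    frequency.update(forward_max_match(title, dictionary))
print(frequency.most_common())
```

Because the longest match wins, "工程管理" is kept as one technical term in the first title rather than being split into "工程" and "管理".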


Analysis on Research Trend of Productivity Using Text Mining - Focusing on KSCE Journal - (텍스트 마이닝을 통한 건설 생산성 분야의 연구동향 분석 - KSCE 저널을 중심으로 -)

  • Gu, Bongil;Huh, Youngki
    • Korean Journal of Construction Engineering and Management / v.21 no.2 / pp.15-21 / 2020
  • The relationships between keywords found in all productivity-related papers published in the KSCE journal over the last 15 years were analyzed using text mining and the Apriori algorithm in order to reveal the research trend in the area. The results show that the word 'productivity' is most closely related to the words 'work' and 'labor'. Furthermore, it is somewhat related to 'factor', 'model', 'simulation', and 'work time'. On the other hand, the words 'machine' and 'equipment' have little relationship with the keyword. This research will help academia understand the research trend in the area of construction productivity.
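To make the keyword-association idea concrete, the sketch below runs the first two levels of Apriori-style pruning: it keeps keywords that meet a minimum support, then counts pairs built only from those keywords. The keyword sets, the support threshold, and the function name are hypothetical and are not taken from the paper.

```python
from collections import Counter
from itertools import combinations

def frequent_pairs(transactions, min_support):
    """Return {pair: support} for keyword pairs meeting min_support,
    considering only pairs whose members are individually frequent."""
    n = len(transactions)
    item_counts = Counter(item for t in transactions for item in set(t))
    frequent_items = {i for i, c in item_counts.items() if c / n >= min_support}

    pair_counts = Counter()
    for t in transactions:
        kept = sorted(set(t) & frequent_items)  # prune infrequent keywords first
        pair_counts.update(combinations(kept, 2))
    return {p: c / n for p, c in pair_counts.items() if c / n >= min_support}

# Hypothetical keyword sets, one per paper.
papers = [
    {"productivity", "work", "labor"},
    {"productivity", "labor", "model"},
    {"productivity", "work"},
    {"equipment", "machine"},
]
pairs = frequent_pairs(papers, min_support=0.5)
print(pairs)
```

In this toy data 'equipment' and 'machine' never pass the support threshold, so no pair involving them is even counted, which is the pruning step that gives Apriori its efficiency.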

A Preliminary Study on Change Management Factors through Analysing Development Phase of Construction IT System (건설 IT 시스템 발전단계분석을 통한 변화관리 요인 기초 연구)

  • Kim, Haneol;Lee, Dongheon;Lim, Hyoungchul
    • Proceedings of the Korean Institute of Building Construction Conference / 2022.04a / pp.214-215 / 2022
  • This study analyzed the development stages of construction IT systems and the need for change management through a review of existing research and literature, and used WordCloud, a text mining technique, to analyze current construction trends and major issues. The need for change management is derived using existing research literature and construction-related social issues as analysis data.
