• Title/Summary/Keyword: document analysis

Analysis of Correlation and Group Difference for Selection of Elementary Fusion Gifted Students (초등융합영재 선발요소의 상관관계 및 그룹 차이 분석)

  • Min, Meekyung;Kim, Kapsu
    • Journal of The Korean Association of Information Education / v.22 no.4 / pp.491-500 / 2018
  • In the era of the Fourth Industrial Revolution, talented people should not be confined to a particular discipline but must be able to converge a variety of disciplines. Fused thinking is important because elementary school students are likely to undergo various changes. Therefore, elementary gifted students are selected as fusion gifted students. This study examines the effects of creative problem solving ability, document evaluation, and interview factors on the selection of gifted students. The results show that creative problem solving ability has the most influence on selection. For fifth graders, creative problem solving ability and document evaluation influence selection. For fourth graders, creative problem solving ability and the interview influence selection. For female students, creative problem solving ability and document evaluation influenced selection. In addition, the gender difference analysis found a gender difference in document evaluation. The grade-by-grade difference analysis found no significant difference between the three groups.

Creation and clustering of proximity data for text data analysis (텍스트 데이터 분석을 위한 근접성 데이터의 생성과 군집화)

  • Jung, Min-Ji;Shin, Sang Min;Choi, Yong-Seok
    • The Korean Journal of Applied Statistics / v.32 no.3 / pp.451-462 / 2019
  • A document-term frequency matrix is a type of data used in text mining. This matrix is often based on various documents provided by the objects to be analyzed. When analyzing objects using this matrix, researchers generally select as keywords only the terms that are common to the documents belonging to one object. These keywords are then used to analyze the object. However, this method misses the unique information of each individual document and removes potential keywords that occur frequently in a specific document. In this study, we define data that can overcome this problem as proximity data. We introduce twelve methods that generate proximity data and cluster the objects through two clustering methods, multidimensional scaling and k-means cluster analysis. Finally, we choose the method best suited to clustering the objects.
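As a rough illustration of the idea in this abstract, the following sketch builds a document-term frequency matrix for a toy corpus and derives one possible kind of proximity data, the Euclidean distance between objects. The abstract does not list its twelve generation methods, so the distance measure, the object names (`obj_a`, `obj_b`), and the sample texts are all assumptions made for illustration.

```python
from collections import Counter
from math import sqrt

# Toy corpus (hypothetical): each object provides several documents.
docs = {
    "obj_a": ["text mining of documents", "mining term frequency"],
    "obj_b": ["cluster analysis of objects", "k-means cluster analysis"],
}

def term_counts(texts):
    """Sum term frequencies over all documents of one object."""
    counts = Counter()
    for text in texts:
        counts.update(text.lower().split())
    return counts

def build_matrix(corpus):
    """Return (vocabulary, rows), where rows[name] is a frequency vector."""
    per_object = {name: term_counts(texts) for name, texts in corpus.items()}
    vocab = sorted(set().union(*per_object.values()))
    rows = {name: [c[t] for t in vocab] for name, c in per_object.items()}
    return vocab, rows

def euclidean(u, v):
    """Euclidean distance between two equal-length frequency vectors."""
    return sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

vocab, rows = build_matrix(docs)
distance = euclidean(rows["obj_a"], rows["obj_b"])
```

A full pairwise distance matrix built this way could then be passed to multidimensional scaling or k-means, the two clustering methods the abstract names.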

Comparison of Significant Term Extraction Based on the Number of Selected Principal Components (주성분 보유수에 따른 중요 용어 추출의 비교)

  • Lee Chang-Beom;Ock Cheol-Young;Park Hyuk-Ro
    • The KIPS Transactions:PartB / v.13B no.3 s.106 / pp.329-336 / 2006
  • In this paper, we propose a method of significant term extraction within a document. The technique used is Principal Component Analysis (PCA), one of the multivariate analysis methods. PCA can make full use of term-term relationships within a document through term-term correlations. We perform PCA on a correlation matrix between terms instead of a covariance matrix. We also try to find thresholds for both the number of components to be selected and the correlation coefficients between selected components and terms. The experimental results on 283 Korean newspaper articles show that using the first six components with correlation coefficients of |0.4| is the best condition for extracting sentences based on the selected significant terms.
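A minimal sketch of PCA on a term-term correlation matrix, assuming toy term counts per sentence. It extracts only the first principal component by power iteration, whereas the abstract reports using the first six. The term names, the counts, and the use of power iteration are illustrative assumptions, not the authors' implementation; the 0.4 cut-off is the threshold quoted in the abstract.

```python
from math import sqrt

def correlation_matrix(columns):
    """Pearson correlation between every pair of term columns."""
    n = len(columns[0])
    means = [sum(c) / n for c in columns]
    centered = [[x - m for x in c] for c, m in zip(columns, means)]
    norms = [sqrt(sum(x * x for x in c)) for c in centered]
    return [[sum(a * b for a, b in zip(ci, cj)) / (ni * nj)
             for cj, nj in zip(centered, norms)]
            for ci, ni in zip(centered, norms)]

def first_component(matrix, iterations=200):
    """Power iteration: dominant eigenvalue and its unit eigenvector."""
    vec = [1.0] * len(matrix)
    for _ in range(iterations):
        nxt = [sum(m * v for m, v in zip(row, vec)) for row in matrix]
        norm = sqrt(sum(x * x for x in nxt))
        vec = [x / norm for x in nxt]
    eigenvalue = sum(v * sum(m * w for m, w in zip(row, vec))
                     for row, v in zip(matrix, vec))
    return eigenvalue, vec

# Hypothetical counts of three terms across five sentences.
names = ["term_a", "term_b", "term_c"]
columns = [[1, 2, 3, 4, 5], [2, 4, 6, 8, 10], [5, 3, 4, 1, 2]]

corr = correlation_matrix(columns)
eigenvalue, vec = first_component(corr)
# Loading = correlation between the component and each term.
loadings = [v * sqrt(eigenvalue) for v in vec]
selected = [n for n, l in zip(names, loadings) if abs(l) >= 0.4]
```

Working from the correlation matrix rather than the covariance matrix puts all terms on the same scale, so frequent terms do not dominate the components simply because their counts vary more.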

On the Development of Risk Factor Map for Accident Analysis using Textmining and Self-Organizing Map(SOM) Algorithms (재해분석을 위한 텍스트마이닝과 SOM 기반 위험요인지도 개발)

  • Kang, Sungsik;Suh, Yongyoon
    • Journal of the Korean Society of Safety / v.33 no.6 / pp.77-84 / 2018
  • Reports on industrial and occupational accidents have continuously accumulated in private and public institutes. In particular, the narrative text in disaster report documents, such as accident processes and risk factors, is becoming increasingly valuable for accident analysis. Despite this growing potential value of text analysis, scientific and algorithmic text analytics for safety management has not yet been carried out. Thus, this study aims to develop data processing and visualization techniques that provide a systematic and structural view of the text information contained in a disaster report document so that safety managers can effectively analyze accident risk factors. To this end, a risk factor map using text mining and a self-organizing map is developed. Text mining is first used to extract risk keywords from disaster report documents, and then the Self-Organizing Map (SOM) algorithm is applied to visualize the risk factor map based on the similarity of disaster report documents. As a result, it is expected that the rich text information buried in a myriad of disaster report documents can be analyzed, providing risk factors to safety managers.

Text Mining of Wood Science Research Published in Korean and Japanese Journals

  • Eun-Suk JANG
    • Journal of the Korean Wood Science and Technology / v.51 no.6 / pp.458-469 / 2023
  • Text mining techniques provide valuable insights into research information across various fields. In this study, text mining was used to identify research trends in wood science from 2012 to 2022, with a focus on representative journals published in Korea and Japan. Abstracts from Journal of the Korean Wood Science and Technology (JKWST, 785 articles) and Journal of Wood Science (JWS, 812 articles) obtained from the SCOPUS database were analyzed in terms of the word frequency (specifically, term frequency-inverse document frequency) and co-occurrence network analysis. Both journals showed a significant occurrence of words related to the physical and mechanical properties of wood. Furthermore, words related to wood species native to each country and their respective timber industries frequently appeared in both journals. CLT was a common keyword in engineering wood materials in Korea and Japan. In addition, the keywords "MDF," "MUF," and "GFRP" were ranked in the top 50 in Korea. Research on wood anatomy was inferred to be more active in Japan than in Korea. Co-occurrence network analysis showed that words related to the physical and structural characteristics of wood were organically related to wood materials.
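The term frequency-inverse document frequency weighting mentioned in this abstract can be sketched as follows. This uses one common variant (relative term frequency multiplied by the log of inverse document frequency); the abstract does not state which variant the study used, and the three sample abstracts are invented for illustration.

```python
from collections import Counter
from math import log

# Toy abstracts (hypothetical), standing in for the journal abstracts.
abstracts = [
    "wood density and strength",
    "wood anatomy of native species",
    "strength of laminated timber",
]

def tf_idf(documents):
    """Return one {term: score} dict per document."""
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)
    # Document frequency: number of documents containing each term.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    scores = []
    for tokens in tokenized:
        counts = Counter(tokens)
        scores.append({
            term: (count / len(tokens)) * log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return scores

scores = tf_idf(abstracts)
# Highest-scoring term of the first abstract.
top_term = max(scores[0], key=scores[0].get)
```

Terms that appear in many abstracts, such as "wood" here, receive lower weights than terms concentrated in a single abstract.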

Assessing Satisfaction on Scholarly Journals and Document Delivery Services at Foreign Journal Supporting Center (외국학술지 지원센터의 학술지 및 원문복사서비스의 만족도 분석)

  • Choi, Jae-Hwang
    • Journal of the Korean Society for Library and Information Science / v.42 no.2 / pp.69-85 / 2008
  • The purpose of this study is to analyze user satisfaction with scholarly journals and document delivery services (DDS) at the foreign journal supporting center, which was founded at K University in 2007. A total of 72 users responded to the scholarly journal survey, and 69 users responded to the DDS survey. The scholarly journal survey asked about currency, usability, and expertise, and the DDS survey asked about the speed of DDS and the ease of application procedures. This study reveals that, overall, users who use or visit the foreign journal supporting center are satisfied with the quality of scholarly journals and DDS.

Automatic Reading System for On-off Type DNA Chip

  • Ryu, Mun-Ho;Kim, Jong-Dae;Kim, Jong-Won
    • Journal of Information Processing Systems / v.2 no.3 s.4 / pp.189-193 / 2006
  • In this study we propose an automatic reading system for diagnostic DNA chips. We define a general specification for an automatic reading system and propose a possible implementation method. The proposed system performs the whole reading process automatically without any user intervention, covering image acquisition, image analysis, and report generation. We applied the system to the automatic report generation of a commercialized DNA chip for cervical cancer detection. The fluorescence image of the hybridization result was acquired with a GenePix™ scanner using its library running in HTML pages. The processing of the acquired image and the report generation were executed by a component object module programmed with Microsoft Visual C++ 6.0. To generate the report document, we made an HWP 2002 document template with marker strings to be searched for and replaced with the corresponding information, such as patient information and diagnosis results. The proposed system generates the report document by reading the template and replacing the marker strings with the resulting contents. The system is expected to facilitate the use of a diagnostic DNA chip for mass screening by automating a conventional manual reading process, shortening its processing time, and quantifying the reading criteria.
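The marker-string approach to report generation can be sketched in a few lines. The original system works on an HWP 2002 document through a component object module; this sketch substitutes a plain-text template, and the marker names (`@@PATIENT_NAME@@`, `@@DIAGNOSIS@@`) and sample values are hypothetical.

```python
def fill_template(template, values):
    """Replace each marker string with its value; raise if a marker is absent."""
    report = template
    for marker, value in values.items():
        if marker not in report:
            raise KeyError(f"marker not found in template: {marker}")
        report = report.replace(marker, str(value))
    return report

# Hypothetical plain-text template standing in for the HWP document.
template = "Patient: @@PATIENT_NAME@@\nResult: @@DIAGNOSIS@@\n"

report = fill_template(
    template,
    {"@@PATIENT_NAME@@": "Jane Doe", "@@DIAGNOSIS@@": "negative"},
)
```

Raising on a missing marker is a design choice of this sketch: it surfaces a mismatch between the template and the analysis output instead of silently producing an incomplete report.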

An Adaptive Binarization of Camera Document Image by Image Quality Estimation (화질 분석을 통한 카메라 문서 영상의 적응적 이진화)

  • Kim, In-Jung
    • Journal of KIISE:Software and Applications / v.34 no.9 / pp.797-803 / 2007
  • Adaptive binarization is very important for camera-based document recognition. This paper proposes a binarization method that can effectively adapt to variation in image quality. First, it analyzes the effect of the binarization parameters on the result and proposes a method to measure image quality. Then, it statistically analyzes the relationship between image quality and the binarization parameter. Finally, using the analysis result, it proposes a binarization method that automatically adapts to the quality of the input image. The experimental results show that there is a meaningful relationship between image quality and the binarization parameter, and therefore the proposed method can effectively adapt to variation in image quality.

A Study on the Perception Among University Librarians towards Resource Sharing (자원공유에 대한 대학도서관 사서들의 인식에 관한 연구)

  • Shim, Won-Sik
    • Journal of the Korean Society for Information Management / v.25 no.2 / pp.5-24 / 2008
  • As resource sharing becomes more active and complicated in academic libraries, a better understanding of how librarians, a key stakeholder, view the current level of resource sharing is needed. Using a survey method, this study collected data on the perceptions of 78 librarians with regard to interlibrary loan, document delivery, union catalog construction, shared acquisition, and community building. Overall, the respondents evaluated well-established forms of resource sharing (interlibrary loan, document delivery, and union catalog construction) more positively than less well-developed ones (shared acquisition and community building). Correlation analysis between perception and library/individual characteristics was conducted. Barriers to each of the five areas of resource sharing are also identified in the study.

A Study on the Notation of Jeongganbo Score using Extensible Markup Language (XML) (확장 마크업 언어(XML)를 이용한 정간보 악보 표기법에 관한 연구)

  • Lee, Yong Ju;Choi, Keunwoo;Park, Tae Jin;Kang, Kyeongok
    • The Journal of the Acoustical Society of Korea / v.32 no.5 / pp.446-453 / 2013
  • In this paper, we propose an efficient method to describe and save Jeongganbo scores, which have various structures and symbols, by using XML (Extensible Markup Language). To do this, an analysis of Jeongganbo's structures and a classification of Jeongganbo symbols were performed. Then, a Jeongganbo DTD (Document Type Definition) was defined to describe a Jeongganbo score in an XML document. To verify the proposed method, we produced a Jeongganbo score XML file for a real Jeongganbo score according to the proposed Jeongganbo DTD, and then evaluated the produced XML file by using Jeongganbo XML interpreter software, which can interpret the Jeongganbo XML file and represent the Jeongganbo score.
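To give a feel for describing a score in XML under a DTD, the following sketch parses a tiny document with an inline DTD. The element and attribute names (`score`, `column`, `cell`, `beats`) and the note values are hypothetical; the abstract does not reproduce the actual Jeongganbo DTD. Note also that Python's `xml.etree.ElementTree` reads the document but does not validate it against the DTD.

```python
import xml.etree.ElementTree as ET

# Hypothetical score markup with an inline DTD (not the paper's DTD).
SCORE_XML = """<!DOCTYPE score [
  <!ELEMENT score (column+)>
  <!ELEMENT column (cell+)>
  <!ELEMENT cell (#PCDATA)>
  <!ATTLIST cell beats CDATA #REQUIRED>
]>
<score>
  <column>
    <cell beats="1">hwang</cell>
    <cell beats="2">tae</cell>
  </column>
</score>"""

# Parse the document and read each cell as (note name, duration in beats).
root = ET.fromstring(SCORE_XML)
notes = [(cell.text, int(cell.get("beats"))) for cell in root.iter("cell")]
```

An interpreter such as the one described in the abstract would walk a structure like `notes` to render the score; a validating parser would be needed to enforce the DTD itself.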