Search | Korea Science

Evaluation of Similarity Analysis of Newspaper Article Using Natural Language Processing

Ayako Ohshiro;Takeo Okazaki;Takashi Kano;Shinichiro Ueda
- International Journal of Computer Science & Network Security, v.24 no.6, pp.1-7, 2024
Comparing text features involves evaluating the "similarity" between texts. It is crucial to use appropriate similarity measures when comparing similarities. This study utilized various techniques to assess the similarities between newspaper articles, including deep learning and a previously proposed method: a combination of Pointwise Mutual Information (PMI) and Word Pair Matching (WPM), denoted as PMI+WPM. For performance comparison, law data from medical research in Japan were utilized as validation data in evaluating the PMI+WPM method. The distribution of similarities in text data varies depending on the evaluation technique and genre, as revealed by the comparative analysis. For newspaper data, non-deep learning methods demonstrated better similarity evaluation accuracy than deep learning methods. Additionally, evaluating similarities in law data is more challenging than in newspaper articles. Despite deep learning being the prevalent method for evaluating textual similarities, this study demonstrates that non-deep learning methods can be effective regarding Japanese-based texts.
https://doi.org/10.22937/IJCSNS.2024.24.6.1
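
The abstract above names Pointwise Mutual Information (PMI) as one ingredient of the PMI+WPM method. As a rough illustration only, the sketch below computes document-level PMI for word pairs, PMI(a, b) = log2(p(a, b) / (p(a) p(b))). It is not the authors' PMI+WPM method; the function name `pmi_table`, the toy documents, and the choice of document-level co-occurrence are all assumptions made for this example.

```python
import math
from collections import Counter
from itertools import combinations


def pmi_table(documents):
    """Return PMI for every word pair that co-occurs in at least one document.

    Probabilities are estimated as the fraction of documents containing
    the word (or both words). This is an illustrative assumption.
    """
    n_docs = len(documents)
    word_counts = Counter()
    pair_counts = Counter()
    for tokens in documents:
        vocab = sorted(set(tokens))  # count each word once per document
        word_counts.update(vocab)
        pair_counts.update(combinations(vocab, 2))  # pairs in sorted order

    pmi = {}
    for (a, b), n_ab in pair_counts.items():
        p_ab = n_ab / n_docs
        p_a = word_counts[a] / n_docs
        p_b = word_counts[b] / n_docs
        pmi[(a, b)] = math.log2(p_ab / (p_a * p_b))
    return pmi


# Hypothetical toy corpus, already tokenized.
docs = [
    ["court", "ruling", "appeal"],
    ["court", "ruling"],
    ["weather", "rain"],
    ["rain", "court"],
]
table = pmi_table(docs)
```

Pairs that co-occur more often than chance (here `("court", "ruling")`) get a positive score, while pairs that co-occur less often than chance (here `("court", "rain")`) get a negative one.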

Voice Similarities between Sisters

Ko, Do-Heung
- Speech Sciences, v.8 no.3, pp.43-50, 2001
This paper deals with voice similarities between sisters who are supposed to have common physiological characteristics from a single biological mother. Nine pairs of sisters who are believed to have similar voices participated in this experiment. The speech samples obtained from one pair of sisters were eliminated in the analysis because their perceptual score was relatively low. The words were measured in both isolation and context, and the subjects were asked to read the text five times with about three seconds of interval between readings. Recordings were made at natural speed in a quiet room. The data were analyzed in pitch and formant frequencies using CSL (Computerized Speech Lab) and PCQuirer. It was found that data of the initial vowels are much more similar and homogeneous than those of vowels in other positions. The acoustic data showed that voice similarities are strikingly high in both pitch and formant frequencies. It is assumed that statistical data obtained from this experiment can be used as a guideline for modelling speaker identification and speaker verification.

An Analysis of Similarities that Students Construct in the Process of Problem Solving (중학생들이 수학 문장제 해결 과정에서 구성하는 유사성 분석)

Park Hyun-Jeong;Lee Chong-Hee
- Journal of Educational Research in Mathematics, v.16 no.2, pp.115-138, 2006
The purpose of this paper is to investigate how students construct similarities in the understanding-the-problem and devising-a-plan phases of problem solving. The relation between the similarities that students construct and how they construct them is examined through a case study. Based on the results, the authors reached the following conclusions. Both students constructed surface similarities at the beginning of the problem-solving process and responded sensitively to the context of the problem information. In particular, the student who constructed similarities and differences along a specific dimension by drawing a diagram herself could translate the equation used to solve the base problem, or a previously experienced problem, into the equation for the target problem. However, the student who understood the target problem globally on the basis of surface similarity could not translate the equation she had used to solve the base problem into the equation for the target problem.

Performance Analysis of Forwarding Schemes Based on Similarities for Opportunistic Networks (기회적 네트워크에서의 유사도 기반의 포워딩 기법의 성능 분석)

Kim, Sun-Kyum;Lee, Tae-Seok;Kim, Wan-Jong
- KIISE Transactions on Computing Practices, v.24 no.3, pp.145-150, 2018
Forwarding in opportunistic networks performs poorly because intermittent connectivity may leave no connected path between the source and destination nodes. Social network analysis has recently been studied in this context, and similarity is one of its methods. In this paper, we propose forwarding schemes based on representative similarities and evaluate how much forwarding performance improves. Because the schemes are based on similarities, they forward messages only to relay nodes with higher similarity to the destination node. As a result, they have low network traffic and hop count while maintaining a stable transmission delay.
https://doi.org/10.5626/KTCP.2018.24.3.145
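
The forwarding rule described in the abstract above (hand a message over only to a node with higher similarity to the destination) might be sketched as follows. The abstract does not say which similarity measures the paper evaluates, so the Jaccard index over contact sets used here is an assumption, as are the names `jaccard`, `should_forward`, and the toy contact graph.

```python
def jaccard(a, b):
    """Jaccard similarity between two contact sets (assumed measure)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def should_forward(contacts, carrier, encountered, destination):
    """Forward only if the encountered node is more similar to the
    destination than the current carrier is."""
    dest_contacts = contacts[destination]
    carrier_sim = jaccard(contacts[carrier], dest_contacts)
    encountered_sim = jaccard(contacts[encountered], dest_contacts)
    return encountered_sim > carrier_sim


# Hypothetical contact history: node -> set of nodes it has met.
contacts = {
    "A": {"B"},
    "B": {"A", "C", "E"},
    "C": {"B", "D", "E"},
    "D": {"C", "E"},
}

forward_a_to_b = should_forward(contacts, "A", "B", "D")  # B shares C and E with D
forward_b_to_a = should_forward(contacts, "B", "A", "D")  # A shares nothing with D
```

Because a copy is handed over only when similarity strictly increases, each message follows a single path of rising similarity, which is consistent with the low traffic and hop count the abstract reports.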

Microbial Genome Analysis and Application to Clinical Bacteriology (미생물의 유전자(Genome) 해석과 임상세균학에 이용)

Kim, Sung-Kwang
- Journal of Yeungnam Medical Science, v.19 no.1, pp.1-10, 2002
With the establishment of rapid sequence analysis of 16S rRNA and the recognition of its potential to determine the phylogenetic position of any prokaryotic organism, the role of 16S rRNA similarities in the present species definition in bacteriology needs to be clarified. Comparative studies clearly reveal the limitations of sequence analysis of this conserved gene and gene product in determining relationships at the pathogenic strain level, for which DNA-DNA reassociation experiments still constitute the superior method. Since the primary structure of 16S rRNA is today easier to determine than hybridization between DNA strands, the strength of sequence analysis lies in recognizing the level at which DNA pairing studies need to be performed, which certainly applies to similarities of 97% and higher.

Analysis of Genetic Diversity of Leaf Blight Pathogen of Sweet Persimmon Pestalotiopsis species with Isozyme Band Patterns (단감나무 둥근갈색 무늬병균 Pestalotiopsis spp.의 isozyme을 통한 유전다양성 분석)

이윤수;우수진;최혜선;김경수;강원희;김명조;심재욱;장태현;임태헌
- Korean Journal Plant Pathology, v.14 no.5, pp.502-506, 1998
In this study, we calculated the genetic relationships of Pestalotiopsis species collected from various places in the southern part of Korea through isozyme analyses. EST showed the largest number of bands, and the number of bands ranged from 5 to 7 on average. All the isozymes used in this study showed distinctive band patterns for each isolate. Similarities among the compared isolates ranged from 48 to 93%. Isolates SP7, SP19 and SP23 showed similarities of more than 90%, and most isolates showed similarities ranging from 65 to 82%.

Analysis of Performance Improvement of Collaborative Filtering based on Neighbor Selection Criteria (이웃 선정 조건에 따른 협력 필터링의 성능 향상 분석)

Lee, Soojung
- The Journal of Korean Association of Computer Education, v.18 no.4, pp.55-62, 2015
Recommender systems based on collaborative filtering have been used successfully in various areas because they make searching for information more convenient. Measuring similarity is critical to the performance of these systems, because it is the criterion that determines the range of recommenders. This study analyzes the distributions of similarity produced by traditional measures and investigates the relation between similarities and the number of co-rated items. On this basis, the study suggests a method for selecting reliable recommenders by restricting similarities, which compensates for the drawbacks of previous measures. Experimental results showed that restricting the similarities of neighbors by upper and lower thresholds yields better performance than previous methods, especially when consulting fewer nearest neighbors. A maximum improvement of 0.047 for cosine similarity and of 0.03 for Pearson was achieved. This result indicates that a collaborative filtering system using Pearson or cosine similarities should not consult neighbors with very high or very low similarities.
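
The idea in the abstract above, keeping only neighbors whose similarity lies between a lower and an upper threshold, can be sketched as below. This is a minimal illustration, not the paper's procedure: the threshold values, the cap `k`, the function names, and the toy ratings are all assumptions. Cosine similarity is computed over co-rated items only.

```python
import math


def cosine_similarity(ratings_a, ratings_b):
    """Cosine similarity over the items both users have rated."""
    common = set(ratings_a) & set(ratings_b)
    if not common:
        return 0.0
    dot = sum(ratings_a[i] * ratings_b[i] for i in common)
    norm_a = math.sqrt(sum(ratings_a[i] ** 2 for i in common))
    norm_b = math.sqrt(sum(ratings_b[i] ** 2 for i in common))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def select_neighbors(target, others, lower=0.5, upper=0.99, k=2):
    """Return up to k users whose similarity to `target` lies in [lower, upper].

    The thresholds are hypothetical values chosen for this example.
    """
    scored = []
    for user, ratings in others.items():
        sim = cosine_similarity(target, ratings)
        if lower <= sim <= upper:
            scored.append((sim, user))
    scored.sort(reverse=True)  # most similar first
    return [user for _, user in scored[:k]]


# Hypothetical ratings: user -> {item: rating}.
target = {"a": 5, "b": 3, "c": 4}
others = {
    "u1": {"a": 5},                  # one co-rated item, so cosine is exactly 1.0
    "u2": {"a": 4, "b": 3, "c": 5},  # cosine 0.98
    "u3": {"a": 1, "b": 5, "c": 1},  # cosine about 0.65
    "u4": {"d": 2},                  # no co-rated items, so cosine is 0.0
}
neighbors = select_neighbors(target, others)
```

User `u1` shows why an upper threshold can help: with a single co-rated item, cosine similarity is trivially 1.0, so a very high score may reflect too little evidence and not real agreement.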

Stress analysis of the restraint test specimen (구속균열 시험편의 용접시 응력 해석)

Choi, Gwang;Lim, Sung-Woo
- Proceedings of the KWS Conference, 2004.05a, pp.288-289, 2004
In this report, stress analysis of a restraint specimen was performed numerically using the finite element method. Calculations were carried out by elastic-plastic analysis and by thermo-elastic-plastic analysis. The results were similar in both cases, and the thermo-elastic-plastic analysis revealed the transient characteristics of welding.

An Experimental Study of Cocitation Analysis on Web Information (웹 정보원의 동시인용분석에 관한 실험적 연구)

정동열;최윤미
- Journal of the Korean Society for Information Management, v.16 no.2, pp.7-26, 1999
This experimental study examines informetric analysis of the World Wide Web based on cocitation analysis of Web pages and on the features of Web resources in the field of communication studies. Cocitation analysis is performed to examine the intellectual structure of communication studies as reflected in link counts on the Web. The selected Web resources in the field are mapped in two dimensions based on similarities in cocitation frequency, using a correlation matrix, multidimensional scaling and cluster analysis. Cocitation analysis using organizational homepages, personal homepages, or Web indexes produced clusters of Web resources that had topical similarities. Although informetric analysis of Web resources is still at a preliminary stage, it shows that the Web can be a new tool for indicating the intellectual structure of a specific research field. In addition, this study analyzes the characteristics of printed resources and Web resources, and the differences in research methods when applying cocitation analysis.

A Study on Correlation between the CMMI SPs and GPs at Maturity Levels 2 and 3

Lee, Min Jae;Rhew, Sung-Yul
- Journal of the Korean Society of Systems Engineering, v.7 no.1, pp.9-21, 2011
Assuming that the Capability Maturity Model Integration for Development v1.2 (CMMI) could be applied to an organization more effectively if the content similarities among practices were improved in terms of structure and composition, this paper presents an analysis of the correlations between the CMMI Specific Practices (SP) and Generic Practices (GP) for the Maturity Level 2 and 3 Process Areas using the Chi-square independence test. According to the analysis, a 22.2% correlation was observed. To minimize the problem of repeatedly applying similarities, 6 GPs that are highly correlated with the SPs were grouped together. Then, three different improvement plans were defined: 1) development of a standard template-based project plan, 2) establishment of a configuration management system based on open source software to control work products and leverage experience, and 3) establishment of project assurance through an independent quality assurance-based organization and intensive review by higher-level management.
https://doi.org/10.14248/JKOSSE.2011.7.1.009
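
The abstract above relies on the Chi-square independence test. As a generic illustration of that test, and not a reproduction of the paper's analysis, the sketch below computes the Pearson chi-square statistic and degrees of freedom for a contingency table. The function name and the observed counts are hypothetical, and the code assumes every row and column total is non-zero.

```python
def chi_square_independence(table):
    """Return (statistic, degrees_of_freedom) for a contingency table.

    Expected counts follow the independence assumption:
    expected[i][j] = row_total[i] * col_total[j] / grand_total.
    Assumes all row and column totals are non-zero.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)

    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            statistic += (observed - expected) ** 2 / expected

    dof = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, dof


# Hypothetical 2x2 counts, e.g. practice pairs judged related or unrelated.
observed = [[10, 20], [30, 40]]
statistic, dof = chi_square_independence(observed)
```

The statistic is then compared against the chi-square distribution with `dof` degrees of freedom; a value above the critical value for the chosen significance level suggests the two classifications are not independent.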
