• Title/Summary/Keyword: Text comparing

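A Comparative Study of the Volumes of Pyramids and Frustums of Pyramids in the Mathematical Texts "San Hak Jeong Ui (算學正義), First Volume (上編)" and "Gu Jang Sel Hae (九章術解)" and in Korean and Chinese Mathematics Textbooks (각뿔과 각뿔대의 부피에 대하여 산학서("산학정의(算學正義)(상편(上編))", "구장술해(九章術解)")와 한국.중국수학교과서와의 내용 비교연구)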

  • Park, Young-Sik;Choi, Kil-Nam
    • East Asian mathematical journal / v.26 no.4 / pp.535-551 / 2010
  • In this paper, we investigate the methods for calculating the volume of a pyramid and of a frustum of a pyramid found in the texts Gu Jang Sel Hae and San Hak Jeong Ui (first volume). We then compare and analyze the content of Korean and Chinese mathematics textbooks that build on these methods, and propose that future mathematics curricula teach solid geometry in greater depth in basic study guides.
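
For reference, the standard modern formulas for these two solids can be sketched as below. This is not taken from either mathematical text; the function names are illustrative, and the frustum formula assumes the two bases are parallel and similar.

```python
import math


def pyramid_volume(base_area: float, height: float) -> float:
    """Volume of a pyramid: one third of base area times height."""
    return base_area * height / 3.0


def frustum_volume(lower_area: float, upper_area: float, height: float) -> float:
    """Volume of a pyramid frustum with parallel, similar bases."""
    mean_term = math.sqrt(lower_area * upper_area)  # geometric mean of the two base areas
    return height * (lower_area + upper_area + mean_term) / 3.0


# A square frustum with lower side 4, upper side 2 and height 3,
# cut from a square pyramid with base side 4 and height 6.
whole = pyramid_volume(16.0, 6.0)
removed_top = pyramid_volume(4.0, 3.0)
print(whole - removed_top, frustum_volume(16.0, 4.0, 3.0))
```

The two printed values should agree, since a frustum is the whole pyramid minus the small pyramid removed from its top.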

Machine Learning Based Keyphrase Extraction: Comparing Decision Trees, Naïve Bayes, and Artificial Neural Networks

  • Sarkar, Kamal;Nasipuri, Mita;Ghose, Suranjan
    • Journal of Information Processing Systems / v.8 no.4 / pp.693-712 / 2012
  • The paper presents three machine learning based keyphrase extraction methods that respectively use Decision Trees, Naïve Bayes, and Artificial Neural Networks. We consider keyphrases to be phrases that consist of one or more words and that represent the important concepts in a text document. The three methods used for experimentation have been compared with a publicly available keyphrase extraction system called KEA. The experimental results show that the Neural Network based method outperforms the other two methods, which use Decision Trees and Naïve Bayes. The results also show that the Neural Network based method performs better than KEA.

Ahn Ji-Jae's 《Xiang Ming Suan Fa》 (안지재의 《상명산법》)

  • Lee, Kyung Eon
    • The Mathematical Education / v.53 no.1 / pp.111-129 / 2014
  • 《Xiang Ming Suan Fa》, written by Ahn Ji-Jae, a scholar of the Yuan Dynasty, is a very important text in the development of mathematics in the Joseon Dynasty. The copy of 《Xiang Ming Suan Fa》 held by Keimyung University was designated a Korean National Treasure on February 25, 2011. In this paper, we analyze the structure and contents of 《Xiang Ming Suan Fa》. We also study its influence on Joseon Dynasty mathematics by comparing it with mathematics books such as 《Mook Sa Jib San Bub》 and 《San Hak Yib Moon》.

Criticality benchmark of McCARD Monte Carlo code for light-water-reactor fuel in transportation and storage packages

  • Jang, Junkyung;Lee, Hochul;Lee, Hyun Chul
    • Nuclear Engineering and Technology / v.50 no.7 / pp.1024-1036 / 2018
  • In this paper, the McCARD code was verified using various models listed in the NUREG/CR-6361 benchmark guide, which provides specifications for single pin-cells, single assemblies, and whole cores classified by nuclear properties and structural characteristics. For the single pin-cell and single assembly benchmark problems, McCARD was verified by comparing its results with those of the SCALE code. The difference in the multiplication factor obtained with the two codes did not exceed 90 pcm. The benchmark guide treats a total of 173 whole core experiments. The experiments are categorized as simple lattices, separator plates, reflecting walls, reflecting walls and separator plates, burnable absorber fuel rods, water holes, poison rods, and borated moderator. In the numerical simulations using McCARD, the mean value of the multiplication factors is 1.00223 and their standard deviation is 285 pcm. The difference between the multiplication factors and the experimental value is in the range of -665 pcm to +1609 pcm. In addition, statistics of the results for experiments categorized by reactor shape, additional structure, burnable poison, etc., are detailed in the main text.
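
As a rough illustration of how such statistics are expressed, the sketch below converts differences in multiplication factor to pcm (1 pcm = 1e-5) and summarizes them. The function names and input values are hypothetical and are not results from the paper. The use of a sample standard deviation and of a plain difference (calculated minus experimental) is also an assumption, since the abstract does not say which conventions the authors used.

```python
from statistics import mean, stdev

PCM = 1.0e-5  # 1 pcm (per cent mille) corresponds to 1e-5 in multiplication factor


def to_pcm(delta_k: float) -> float:
    """Express a difference in multiplication factor in pcm."""
    return delta_k / PCM


def summarize(k_calc: list[float], k_exp: list[float]) -> dict[str, float]:
    """Mean k, sample standard deviation (pcm), and min/max bias (pcm)."""
    bias = [to_pcm(c - e) for c, e in zip(k_calc, k_exp)]
    return {
        "mean_k": mean(k_calc),
        "std_pcm": to_pcm(stdev(k_calc)),  # assumption: sample standard deviation
        "min_bias_pcm": min(bias),
        "max_bias_pcm": max(bias),
    }


# Hypothetical values for illustration only, NOT results from the paper.
k_calc = [1.00150, 0.99850, 1.00400]
k_exp = [1.00000, 1.00000, 1.00000]
print(summarize(k_calc, k_exp))
```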

Implementation of Framework for Efficient and Scalable Disaster Response Services

  • Seokjin Im
    • International Journal of Advanced Culture Technology / v.11 no.1 / pp.290-295 / 2023
  • Global warming caused by greenhouse gases brings about climate change and frequent disasters such as earthquakes and tsunamis, leading to great damage. It is important to build efficient and scalable disaster response services to minimize this damage. Existing disaster warning services based on mobile text messages are limited in scalability and in the size of the data they can deliver. In this paper, we propose a framework for disaster response services that is efficient and flexible, because it can adopt various indexing schemes, and scalable, because it supports any number of clients in disaster situations anytime and anywhere. Moreover, because the framework uses wireless data broadcast, it is free from limits on the size of the data to be delivered. We design, implement, and evaluate the proposed framework. For the evaluation, we simulate the implemented framework with various indexing schemes such as HCI, DSI, and TTSI and compare the access times of the clients. The evaluation shows that the proposed framework can provide efficient, scalable, and flexible disaster response services.

Study on Promotion of ESG Tourism in Bhutan through Big Data Analysis - Focusing on comparison with ESG Tourism status in Korea-

  • Min Kyeong Kim
    • International Journal of Internet, Broadcasting and Communication / v.15 no.2 / pp.39-48 / 2023
  • The purpose of this study is to help revitalize ESG tourism in Bhutan by comparing and analyzing the status of ESG tourism in Bhutan and in Korea. Big data analysis using text mining was performed with "Bhutan ESG Tourism" and "Korea ESG Tourism" as keywords. The top 30 keywords were extracted after word refinement, and on this basis the data were visualized through network analysis and CONCOR analysis of the keywords. The analysis confirmed that Bhutan, unlike Korea, has not made use of ESG in tourism even though the country itself has the elements needed to combine ESG with the tourism industry. Because the ESG elements that Bhutan possesses need to be combined with its tourism industry, this study suggests a direction for combining ESG and the tourism industry.

A Study of 'Emotion Trigger' by Text Mining Techniques (텍스트 마이닝을 이용한 감정 유발 요인 'Emotion Trigger'에 관한 연구)

  • An, Juyoung;Bae, Junghwan;Han, Namgi;Song, Min
    • Journal of Intelligence and Information Systems / v.21 no.2 / pp.69-92 / 2015
  • The explosion of social media data has led researchers to apply text-mining techniques in order to analyze big social media data more rigorously. Although social media text analysis algorithms have improved, previous approaches to social media text analysis have some limitations. In sentiment analysis of social media written in Korean, there are two typical approaches. One is the linguistic approach using machine learning, which is the most common; some studies have added grammatical factors to the feature sets used to train the classification model. The other approach applies semantic analysis to sentiment analysis, but it has mainly been applied to English texts. To overcome these limitations, this study applies the Word2Vec algorithm, an extension of neural network algorithms, to deal with more extensive semantic features that were underestimated in existing sentiment analysis. The result of applying the Word2Vec algorithm is compared with the result of co-occurrence analysis to identify the difference between the two approaches. The results show that Word2Vec extracts three times as many related words expressing some emotion about the keyword as co-occurrence analysis does. The difference between the two results comes from Word2Vec's vectorization of semantic features. It is therefore possible to say that the Word2Vec algorithm is able to catch hidden related words that have not been found in traditional analysis. In addition, Part-Of-Speech (POS) tagging for Korean is used to detect adjectives as "emotional words" in Korean. The emotional words extracted from the text are then converted into word vectors by the Word2Vec algorithm to find related words. Among these related words, nouns are selected because each of them may have a causal relationship with the "emotional word" in the sentence. The process of extracting these trigger factors of emotional words is named "Emotion Trigger" in this study. As a case study, the datasets used in the study were collected by searching with three keywords (professor, prosecutor, and doctor), since these keywords carry rich public emotion and opinion. Preliminary data collection was conducted to select secondary keywords for data gathering. The secondary keywords used to gather the data for the actual analysis are as follows: Professor (sexual assault, misappropriation of research money, recruitment irregularities, polifessor), Doctor (Shin Hae-chul Sky Hospital, drinking and plastic surgery, rebate), and Prosecutor (lewd behavior, sponsor). The text data comprise about 100,000 items (Professor: 25720, Doctor: 35110, Prosecutor: 43225), gathered from news, blogs, and Twitter so that the analysis reflects various levels of public emotion. Gephi (http://gephi.github.io) was used for visualization, and all programs used for text processing and analysis were written in Java. The contributions of this study are as follows. First, different approaches to sentiment analysis are integrated to overcome the limitations of existing approaches. Second, Emotion Trigger can detect hidden connections to public emotion that existing methods cannot detect. Finally, the approach used in this study could be generalized regardless of the type of text data. The limitation of this study is that it is hard to say that a word extracted by Emotion Trigger processing has a significant causal relationship with the emotional word in a sentence. A future study will clarify the causal relationship between emotional words and the words extracted by Emotion Trigger by comparing them with manually tagged relationships. Furthermore, because the text data used in Emotion Trigger include Twitter posts, the data have a number of distinct features that we did not deal with in this study. These features will be considered in further study.
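
The co-occurrence baseline mentioned in the abstract can be pictured with a minimal sketch like the one below. It is an assumption-laden toy, not the authors' Java implementation: it counts co-occurrence at sentence level, expects already-tokenized input, and uses a hypothetical function name and made-up sentences. The Word2Vec side of the comparison is not shown.

```python
from collections import Counter


def cooccurring_words(sentences: list[list[str]], keyword: str, top_n: int = 5) -> list[tuple[str, int]]:
    """Count words that appear in the same sentence as `keyword`."""
    counts = Counter()
    for tokens in sentences:
        if keyword in tokens:
            # Count each distinct word once per sentence, excluding the keyword itself.
            counts.update(set(tokens) - {keyword})
    return counts.most_common(top_n)


# Toy, already-tokenized sentences (hypothetical data, not the study's corpus).
sentences = [
    ["professor", "research", "funds", "misuse"],
    ["professor", "research", "award"],
    ["doctor", "surgery", "research"],
]
print(cooccurring_words(sentences, "professor"))
```

A count-based method of this kind can only surface words that literally share a sentence with the keyword, which is the limitation the abstract contrasts with vector-based related-word search.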

The Analysis on Patterns of Questions in Elementary School Science Textbooks under the 2007 Revised Curriculum (2007년 개정교육과정에 따른 초등 과학교과서에 제시된 발문의 유형 분석)

  • Choi, Yoon-mi;Lee, Hyeong Cheol
    • Journal of Science Education / v.36 no.1 / pp.120-129 / 2012
  • The purpose of this study is to provide information for developing the next elementary school science textbooks, and educational implications for science classrooms, by analyzing the patterns of questions in the elementary school science textbooks under the 2007 revised curriculum. To obtain meaningful results, 2,446 questions extracted according to an operational definition from the grade 3-6 science textbooks were analyzed with a modified analysis framework based on Blosser's classification system. The findings of this study were as follows. First, among the 2,446 questions, the propositional pattern element had the highest rate (49.2%) and the appreciable pattern element had the lowest rate (1.4%) of all pattern elements. Second, comparing the patterns of questions in each grade's science textbook, the rates of the applicable and the divergent pattern elements tended to increase as the grade went higher, while those of the other elements tended to decrease. Third, comparing the patterns of questions across the four fields in elementary science textbooks, the energy field had the largest number of questions, followed by the substance field. The rate of the propositional pattern element was the highest of all question elements in every field. For the reproductive and the propositional pattern elements, the energy and the substance fields had slightly higher rates than the other fields. On the other hand, for the applicable and the divergent pattern elements, the earth and the life fields had slightly higher rates than the other fields.

Detecting Errors in POS-Tagged Corpus on XGBoost and Cross Validation (XGBoost와 교차검증을 이용한 품사부착말뭉치에서의 오류 탐지)

  • Choi, Min-Seok;Kim, Chang-Hyun;Park, Ho-Min;Cheon, Min-Ah;Yoon, Ho;Namgoong, Young;Kim, Jae-Kyun;Kim, Jae-Hoon
    • KIPS Transactions on Software and Data Engineering / v.9 no.7 / pp.221-228 / 2020
  • A Part-of-Speech (POS) tagged corpus is a collection of electronic text in which each word is annotated with its POS tag, and it is widely used as training data for natural language processing. Training data are generally assumed to be error-free, but in reality they contain various types of errors, which degrade the performance of systems trained on them. To alleviate this problem, we propose a novel method for detecting errors in an existing POS-tagged corpus using an XGBoost classifier and cross-validation as evaluation techniques. We first train a POS tagger classifier on the POS-tagged corpus, which contains some errors, and then detect errors in that corpus using cross-validation. The classifier cannot detect errors directly, however, because there is no training data for detecting POS tagging errors. We therefore detect errors by comparing the outputs (POS probabilities) of the classifier while adjusting hyperparameters. The hyperparameters are estimated on a small error-tagged corpus, whose text is sampled from the POS-tagged corpus and in which POS errors are marked by experts. In this paper, we use recall and precision, which are widely used in information retrieval, as evaluation metrics. Because not all detected errors can be checked, we show that the proposed method is valid by comparing the distributions of the sample (the error-tagged corpus) and the population (the POS-tagged corpus). In the near future, we will apply the proposed method to a dependency tree-tagged corpus and a semantic role-tagged corpus.
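
The general idea of flagging suspicious tags with held-out predictions might look roughly like the sketch below. It is not the authors' method: XGBoost is swapped for a toy per-word tag-frequency model so the example needs only the standard library, and the fixed probability threshold, the fold assignment, and all names are assumptions made for illustration.

```python
from collections import Counter, defaultdict

Token = tuple[str, str]  # (word, annotated POS tag)


def train(tokens: list[Token]) -> dict[str, Counter]:
    """Toy stand-in for the classifier: per-word tag counts."""
    model: dict[str, Counter] = defaultdict(Counter)
    for word, tag in tokens:
        model[word][tag] += 1
    return model


def tag_probability(model: dict[str, Counter], word: str, tag: str) -> float:
    """Relative frequency of `tag` for `word`; 1.0 for unseen words (no evidence)."""
    counts = model.get(word)
    if not counts:
        return 1.0
    return counts[tag] / sum(counts.values())


def detect_errors(corpus: list[Token], k: int = 3, threshold: float = 0.3) -> list[int]:
    """Return indices whose annotated tag looks unlikely under held-out models."""
    suspects = []
    for fold in range(k):
        # Each token is scored by a model that never saw it during training.
        training = [corpus[i] for i in range(len(corpus)) if i % k != fold]
        model = train(training)
        for i in range(fold, len(corpus), k):
            word, tag = corpus[i]
            if tag_probability(model, word, tag) < threshold:
                suspects.append(i)
    return sorted(suspects)


# Hypothetical toy corpus with one deliberately mislabeled token at index 3.
corpus = [
    ("run", "VERB"), ("dog", "NOUN"), ("run", "VERB"),
    ("run", "NOUN"), ("dog", "NOUN"), ("run", "VERB"),
    ("run", "VERB"), ("dog", "NOUN"), ("run", "VERB"),
]
print(detect_errors(corpus))
```

In a realistic setting the threshold would be tuned on a small expert-checked sample, in the spirit of the error-tagged corpus the abstract describes.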

Between a Historical Subject and a Novel Subject -Reading The Song of sword based on the Logic of Choice, Transformation, and Exclusion (역사적 인간과 소설적 인간의 사이 -선택, 변형, 배제의 논리로 읽는 『칼의 노래』)

  • Kim, Won-Kyu
    • Journal of Popular Narrative / v.25 no.3 / pp.103-141 / 2019
  • The purpose of this paper is to examine the logic of choice, transformation, and exclusion in The Song of sword by comparing it with the historical records. This paper explains how a novel is 'produced'. In doing so, it traces how The Song of sword became 'a narrative revealing the disillusionment of the novel's subject with the world'. Under the logic of choice, it explores which time and space were chosen in the novel, and which character was chosen to prepare the content and formal framework of the novel. Under the logic of transformation, it confirms that, unlike in the historical records, the meaning of the 'individual' is highlighted in the novel by transforming both the character of the enemy and the meaning of war. Under the logic of exclusion, it studies the characteristics of the modern (novel's) subject, which emerge when the characteristics of the historical subject that existed in a particular time and space are excluded. This paper differs from previous studies in that it examines the way in which a novel is produced by comparing and analyzing The Song of sword against the historical records. Through these analyses, we can see the unity of various heterogeneous elements, such as the historical reality, the writer's ideology and imagination, and the desire of the contemporary, in the form of a novel. Also, by examining the elements of the text that cannot be sutured into a complete form, we can see the meaning of the novel's text as an unstable system.