• Title/Summary/Keyword: Comparative sentence


Research on Recent Quality Estimation (최신 기계번역 품질 예측 연구)

  • Eo, Sugyeong;Park, Chanjun;Moon, Hyeonseok;Seo, Jaehyung;Lim, Heuiseok
    • Journal of the Korea Convergence Society / v.12 no.7 / pp.37-44 / 2021
  • Quality estimation (QE) can evaluate the quality of machine translation output even for those who do not know the target language, and this broad applicability underscores the need for QE. A QE shared task is held every year at the Conference on Machine Translation (WMT), and recent research has mainly applied pretrained language models (PLMs). In this paper, we survey the QE task and its research trends and summarize the features of PLMs. In addition, we apply a multilingual BART model, which has not previously been used for QE, and compare it with models from existing studies such as XLM, multilingual BERT, and XLM-RoBERTa. Through the experiments, we confirm which PLM is most effective when applied to QE and find that the multilingual BART model shows promise for the QE task.

A Basic Performance Evaluation of the Speech Recognition APP of Standard Language and Dialect using Google, Naver, and Daum KAKAO APIs (구글, 네이버, 다음 카카오 API 활용앱의 표준어 및 방언 음성인식 기초 성능평가)

  • Roh, Hee-Kyung;Lee, Kang-Hee
    • Asia-pacific Journal of Multimedia Services Convergent with Art, Humanities, and Sociology / v.7 no.12 / pp.819-829 / 2017
  • In this paper, we describe the current state of speech recognition technology, identify its basic techniques and algorithms, and then explain the code flow of the APIs needed for speech recognition. We use the application programming interfaces (APIs) of Google, Naver, and Daum Kakao, which operate the best-known search engines among speech recognition API providers, to create a voice recognition app in Android Studio. We then run speech recognition experiments on standard language and dialect speech from speakers of different genders, ages, and regions, and tabulate the recognition rates. Experiments were conducted for the Gyeongsang-do, Chungcheong-do, and Jeolla-do provinces, where dialect is strong. Comparative experiments were also conducted on the standard language. The accuracy of each recognized sentence is checked in terms of word spacing, final consonants, postpositions, and words, and the number of errors of each type is counted. Based on the results, we aim to introduce the advantages of each API according to its speech recognition rate and to establish a basic framework for using the APIs most efficiently.

The Effects of Storybooks-making Activities Based on Masterpiece Appreciation on the Language Expression and Picture Appreciation Ability of Young Children (명화감상에 기초한 이야기책 만들기 활동이 유아의 언어표현력과 그림감상능력에 미치는 영향)

  • Lee, Seon Kyung;Choi, Hye Yoon
    • Korean Journal of Child Education & Care / v.18 no.2 / pp.49-63 / 2018
  • The objective of this study is to examine the effects of a storybook-making activity based on masterpiece appreciation on the language expression and picture appreciation ability of young children. The participants were 47 five-year-old children from S & A kindergartens in Gwangju Metropolitan City, who were randomly assigned to an experimental group (23 children) and a comparative group (24 children). The experimental group performed the storybook-making activity based on masterpiece appreciation in four sessions within 12 weeks, while the comparative group expressed their feelings and thoughts in paintings after appreciating masterpieces during the same period. The collected data were analyzed with SPSS 18.0, and t-tests were conducted for differences in language expression and picture appreciation ability. The results of this study are as follows. First, the storybook-making activity based on masterpiece appreciation had significant effects on language expression as a whole, except for sentence length, and it improved the language expression of young children. Second, the storybook-making activity based on masterpiece appreciation had significant effects on overall picture appreciation ability, and it improved the picture appreciation ability of young children. These results imply that the whole process of appreciating masterpieces and then making and appreciating a storybook, by expressing the stories before and after the painting in writing and drawing, would be an effective teaching and learning method for improving the language expression and picture appreciation ability of young children.

A Comparative Analysis on the Primary Mathematics Textbooks for Multiplication and Division of Decimals: Focusing on Korea, Japan, Singapore, and Finland (소수의 곱셈과 나눗셈에 대한 초등 수학교과서 비교 분석: 한국, 일본, 싱가포르, 핀란드를 중심으로)

  • Park, Mangoo;Park, Haemin;Choi, Eunmi;Pyo, Junghee
    • Education of Primary School Mathematics / v.25 no.3 / pp.251-278 / 2022
  • The purpose of this study is to obtain implications for mathematics education by analyzing how the multiplication and division of decimal numbers are presented in elementary mathematics textbooks in Korea, Japan, Singapore, and Finland. Although students often have misconceptions about multiplication and division of decimal numbers, few comparative studies have examined recent elementary mathematics textbooks. For this study, we selected elementary mathematics textbooks that are widely used in Japan, Singapore, and Finland, along with Korean elementary mathematics textbooks. We chose these textbooks because students in the selected countries have scored high in international achievement studies such as TIMSS and PISA. The textbooks were analyzed in terms of the elementary mathematics curriculum related to multiplication and division of decimal numbers, introduction and content, real-life situations, use of visual models, and methods of formalizing algorithms. The results show that the mathematics curricula related to multiplication and division of decimal numbers include estimation in Korea and Finland, while Japan and Singapore place more emphasis on real-life connections, and Finland completes the operations in secondary school. The introduction and content are either concentrated in a short period of time or distributed across various grades and semesters. Real-life situations are presented in a simple sentence format in all countries, and the use of visual models or the formalization of algorithms is linked to operations on natural numbers in unit conversions. Suggestions were made for textbook development and teacher training programs.

Knowledge Extraction Methodology and Framework from Wikipedia Articles for Construction of Knowledge-Base (지식베이스 구축을 위한 한국어 위키피디아의 학습 기반 지식추출 방법론 및 플랫폼 연구)

  • Kim, JaeHun;Lee, Myungjin
    • Journal of Intelligence and Information Systems / v.25 no.1 / pp.43-61 / 2019
  • Artificial intelligence technologies have been developing rapidly with the Fourth Industrial Revolution, and AI research has been actively conducted in a variety of fields such as autonomous vehicles, natural language processing, and robotics. Since the 1950s, this research has focused on solving cognitive problems related to human intelligence, such as learning and problem solving. The field of artificial intelligence has achieved more technological advances than ever, owing to recent interest in the technology and research on various algorithms. The knowledge-based system is a sub-domain of artificial intelligence, and it aims to enable artificial intelligence agents to make decisions by using machine-readable and processible knowledge constructed from complex and informal human knowledge and rules in various fields. A knowledge base is used to optimize information collection, organization, and retrieval, and recently it has been used together with statistical artificial intelligence such as machine learning. More recently, the purpose of the knowledge base has been to express, publish, and share knowledge on the web by describing and connecting web resources such as pages and data. These knowledge bases are used for intelligent processing in various fields of artificial intelligence, such as the question answering systems of smart speakers. However, building a useful knowledge base is a time-consuming task and still requires a lot of effort from experts. In recent years, much research and technology in knowledge-based artificial intelligence has used DBpedia, one of the largest knowledge bases, which aims to extract structured content from the various information in Wikipedia. DBpedia contains various information extracted from Wikipedia, such as titles, categories, and links, but the most useful knowledge comes from Wikipedia infoboxes, which present a user-created summary of some unifying aspect of an article.
This knowledge is created by the mapping rules between infobox structures and the DBpedia ontology schema defined in the DBpedia Extraction Framework. In this way, DBpedia can expect high reliability in terms of the accuracy of knowledge, because it generates knowledge from semi-structured infobox data created by users. However, since only about 50% of all wiki pages in Korean Wikipedia contain an infobox, DBpedia has limitations in terms of knowledge scalability. This paper proposes a method to extract knowledge from text documents according to the ontology schema using machine learning. To demonstrate the appropriateness of this method, we explain a knowledge extraction model that follows the DBpedia ontology schema and is trained on Wikipedia infoboxes. Our knowledge extraction model consists of three steps: classifying documents into ontology classes, classifying the sentences suitable for extracting triples, and selecting values and transforming them into an RDF triple structure. The structure of Wikipedia infoboxes is defined by infobox templates that provide standardized information across related articles, and the DBpedia ontology schema can be mapped to these infobox templates. Based on these mapping relations, we classify the input document according to infobox categories, which correspond to ontology classes. After determining the class of the input document, we classify the appropriate sentences according to the attributes belonging to that class. Finally, we extract knowledge from the sentences classified as appropriate and convert the knowledge into triples. To train the models, we generated a training data set from a Wikipedia dump by adding BIO tags to sentences, and we trained models for about 200 classes and about 2,500 relations for extracting knowledge. Furthermore, we conducted comparative experiments with CRF and Bi-LSTM-CRF for the knowledge extraction process.
Through the proposed process, structured knowledge can be utilized by extracting knowledge from text documents according to the ontology schema. In addition, this methodology can significantly reduce the effort experts need to construct instances according to the ontology schema.
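
The abstract above mentions adding BIO tags to sentences and converting extracted values into triples. As a rough illustration only, and not the authors' implementation, the sketch below shows how BIO tags could mark an attribute value in a tokenized sentence and how the tagged span could be read back as a triple. The function names `bio_tag` and `to_triple`, the `birthPlace` label, and the example sentence are hypothetical, and whitespace tokenization is assumed for simplicity (Korean text would need a proper tokenizer).

```python
def bio_tag(tokens, value_tokens, label):
    """Tag the first occurrence of value_tokens with B-/I- labels; all else is O."""
    tags = ["O"] * len(tokens)
    n = len(value_tokens)
    if n == 0:
        return tags
    for start in range(len(tokens) - n + 1):
        if tokens[start:start + n] == value_tokens:
            tags[start] = f"B-{label}"          # first token of the value span
            for k in range(start + 1, start + n):
                tags[k] = f"I-{label}"          # remaining tokens of the span
            break
    return tags


def to_triple(subject, label, tokens, tags):
    """Collect the tagged span and return a (subject, predicate, object) triple."""
    value = [tok for tok, tag in zip(tokens, tags) if tag.endswith(f"-{label}")]
    return (subject, label, " ".join(value)) if value else None


# Hypothetical example: an infobox says birthPlace = "London" for this article.
tokens = "Ada Lovelace was born in London".split()
tags = bio_tag(tokens, ["London"], "birthPlace")
triple = to_triple("Ada Lovelace", "birthPlace", tokens, tags)
print(list(zip(tokens, tags)))
print(triple)
```

In a setup like the one described, a sequence labeler such as CRF or Bi-LSTM-CRF would be trained to predict the tags, and only the conversion step would resemble `to_triple`.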

Korean Students' Achievement in Scientific Literacy (우리 나라 학생들의 과학적 소양 성취도)

  • Shin, Dong-Hee;Ro, Koog-Hyang
    • Journal of The Korean Association For Science Education / v.22 no.1 / pp.76-92 / 2002
  • OECD/PISA (Programme for International Student Assessment) is significant in that it is the first international comparative study assessing 15-year-old students' scientific literacy. Korean students' percent-correct results on 35 science items revealed the following characteristics. First, in terms of science application area, Korean students showed the highest achievement in the area of 'science in technology', followed by 'science in life and health' and 'science in earth and environment'. Male students achieved significantly better than their female counterparts in all three areas. Second, achievement on science knowledge items was significantly higher than on scientific process items, and the difference between the two was larger for male students. Third, in terms of application context, Korean students showed the highest achievement in the historical context and the lowest achievement in the personal context. Fourth, in terms of item format, Korean students performed significantly better on open-constructed items than on multiple-choice items. Fifth, Korean students showed low performance on items about biotechnology and environment-related issues, which was more pronounced for female students. Sixth, whereas male students performed significantly better than female students in most aspects, it is noteworthy that there were no significant gender differences on scientific process items and that female students performed significantly better than male students on open-constructed items that require long sentences.

Competitive Advantages and Growth Characteristics of Korea's Tourism Industry - Comparative Analysis with Northeast Asian Countries by Using Shift-Share Method (우리나라 관광산업의 경쟁우위와 성장 특성 - 변이할당분석방법을 이용한 동북아시아 지역 국가들과의 비교 분석)

  • Kim, Young-Joon
    • Journal of the Korea Academia-Industrial cooperation Society / v.21 no.3 / pp.370-379 / 2020
  • This study examined the growth characteristics and competitive advantages of Korea's tourism industry compared to other Northeast Asian countries using the Balassa indices and the shift-share method. The results showed that the growth of Korea's tourism industry over the past decade was due mainly to external factors, such as the growth of the global economy and the expansion of the tourism sector, while the growth momentum of the tourism industry itself played an insignificant role. Employment in Korea's tourism industry has increased at relatively higher rates than total sales and value added. This appears to be caused by the decreased absorption of the labor force in the tourism industry due to the overall capacity of job creation. The competitive advantage of Korea's tourism industry has been strengthened over the past decade, but it is still inferior to that of other countries. The travel account balance showed that the economic size of the Chinese tourism sector had grown rapidly over the past decade, but the competitive advantage of the sector has weakened. On the other hand, the economic size of the Japanese tourism sector has shown sluggish growth, while its competitive advantage has been strengthened significantly.
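
The abstract above names the Balassa index and the shift-share method without giving formulas. The sketch below shows the classic textbook forms of both, as an assumption about the general technique rather than the exact variant used in the paper. The function names and all input figures are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def shift_share(e0, e1, ref_ind0, ref_ind1, ref_tot0, ref_tot1):
    """Classic shift-share decomposition of a country's change in one industry.

    e0, e1             : the country's industry level at start and end
    ref_ind0, ref_ind1 : the same industry in the reference area
    ref_tot0, ref_tot1 : all industries in the reference area
    """
    total_growth = ref_tot1 / ref_tot0 - 1       # overall reference growth rate
    industry_growth = ref_ind1 / ref_ind0 - 1    # reference growth of this industry
    own_growth = e1 / e0 - 1                     # the country's own growth rate
    return {
        "reference_growth": e0 * total_growth,
        "industry_mix": e0 * (industry_growth - total_growth),
        "competitive": e0 * (own_growth - industry_growth),
    }


def balassa_index(x_ij, x_j, x_iw, x_w):
    """Revealed comparative advantage: the sector's share in the country's
    exports divided by the sector's share in world exports."""
    return (x_ij / x_j) / (x_iw / x_w)


# Hypothetical figures, not taken from the paper.
parts = shift_share(e0=100, e1=120, ref_ind0=1000, ref_ind1=1100,
                    ref_tot0=10000, ref_tot1=10500)
print(parts)
print(sum(parts.values()))  # equals e1 - e0 up to rounding
print(balassa_index(x_ij=20, x_j=100, x_iw=10, x_w=100))
```

The three components sum to the country's actual change, so a large `reference_growth` or `industry_mix` term points to external factors, while the `competitive` term reflects the sector's own momentum. A Balassa index above 1 is conventionally read as a revealed comparative advantage.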

Design of a Deep Neural Network Model for Image Caption Generation (이미지 캡션 생성을 위한 심층 신경망 모델의 설계)

  • Kim, Dongha;Kim, Incheol
    • KIPS Transactions on Software and Data Engineering / v.6 no.4 / pp.203-210 / 2017
  • In this paper, we propose an effective neural network model for image caption generation and model transfer. This model is a kind of multi-modal recurrent neural network model. It consists of five distinct layers, including a convolutional neural network layer for extracting visual information from images, an embedding layer for converting each word into a low-dimensional feature, a recurrent neural network layer for learning caption sentence structure, and a multi-modal layer for combining visual and language information. In this model, the recurrent neural network layer is built from LSTM units, which are well known to be effective for learning and transferring sequence patterns. Moreover, this model has a unique structure in which the output of the convolutional neural network layer is linked not only to the input of the initial state of the recurrent neural network layer but also to the input of the multi-modal layer, so that the visual information extracted from the image can be used at each recurrent step to generate the corresponding textual caption. Through various comparative experiments using open data sets such as Flickr8k, Flickr30k, and MSCOCO, we demonstrated that the proposed multi-modal recurrent neural network model has high performance in terms of caption accuracy and model transfer effect.

Comparative Analysis of Verbal Interaction between Teachers and Students for the Gifted and the General Science Class in Middle School (중학교 일반학급과 영재학급의 과학수업에서 교사와 학생사이의 언어적 상호작용 비교 분석)

  • Lee, Ji-Hyang;Kim, Dong-Jin;Hwang, Hyun-Sook;Park, Se-Yeol;Baek, In-Hwan;Park, Kuk-Tae
    • Journal of Gifted/Talented Education / v.20 no.3 / pp.721-741 / 2010
  • This study analyzed verbal interactions between teachers and students by observing teachers' questioning and feedback and students' response types and frequencies in middle-school science classes for general and gifted students. In the science classes for general students, teachers predominantly used questions that summarized textbook content or guided students through it as written. They focused on immediate feedback in restatement form. The students generally gave simple responses such as yes/no. The most frequent verbal interaction pattern was a cognitive-memory thinking question, a short answer, and immediate feedback, in that order. On the other hand, teachers of the science classes for gifted students asked open questions requiring divergent and evaluative thinking, such as 'what's the reason?' or 'why is it?' Immediate feedback in explanatory form was also mainly provided. The level of delayed feedback was higher than in the general classes, and that of immediate feedback was lower. The students preferred short words or uncomplicated sentences when they replied, and their participation was more attentive and positive. Hence, the most frequent verbal interaction pattern was a cognitive-memory thinking question, an elaborative short answer, and delayed feedback, in that order.

Sentiment Analysis of Movie Review Using Integrated CNN-LSTM Model (CNN-LSTM 조합모델을 이용한 영화리뷰 감성분석)

  • Park, Ho-yeon;Kim, Kyoung-jae
    • Journal of Intelligence and Information Systems / v.25 no.4 / pp.141-154 / 2019
  • Internet technology and social media are growing rapidly. Data mining technology has evolved to enable unstructured document representations in a variety of applications. Sentiment analysis is an important technology that can distinguish low-quality from high-quality content using text data about products, and it has proliferated within text mining. Sentiment analysis mainly analyzes people's opinions in text data by assigning predefined categories such as positive and negative. It has been studied from various directions in terms of accuracy, from simple rule-based approaches to dictionary-based approaches using predefined labels. In fact, sentiment analysis is one of the most active research areas in natural language processing and is widely studied in text mining. Because real online reviews are openly available to others, they are easy to collect, and they also affect businesses. In marketing, real-world information from customers is gathered on websites rather than through surveys. Whether a website's posts are positive or negative reflects customer response, which in turn is reflected in sales, so companies try to identify this information. However, reviews on a website are not always good, and they are difficult to identify. Earlier studies in this research area used review data from the Amazon.com shopping mall, whereas recent studies use data on stock market trends, blogs, news articles, weather forecasts, IMDB, Facebook, and so on. However, a lack of accuracy is recognized because sentiment calculations change according to the subject, paragraph, sentiment lexicon direction, and sentence strength. This study aims to classify sentiment polarity into positive and negative categories and to increase the prediction accuracy of polarity analysis using the pretrained IMDB review data set.
First, for text classification related to sentiment analysis, popular machine learning algorithms such as NB (naive Bayes), SVM (support vector machines), XGBoost, RF (random forests), and Gradient Boost are adopted as comparative models. Second, deep learning has demonstrated the ability to extract complex, discriminative features from data. Representative algorithms are CNN (convolutional neural networks), RNN (recurrent neural networks), and LSTM (long short-term memory). CNN can be used similarly to BoW when processing a sentence in vector format, but it does not consider the sequential attributes of data. RNN handles ordered data well because it takes the time information of the data into account, but it has a long-term dependency problem. LSTM is used to solve the problem of long-term dependency. For the comparison, CNN and LSTM were chosen as simple deep learning models. In addition to classical machine learning algorithms, CNN, LSTM, and the integrated models were analyzed. Although the algorithms have many parameters, we examined the relationship between parameter values and precision to find the optimal combination. We also tried to determine how well the models work for sentiment analysis and how they work. This study proposes an integrated CNN and LSTM algorithm to extract the positive and negative features of text. The reasons for combining these two algorithms are as follows. CNN can automatically extract features for classification by applying convolution layers and massively parallel processing. LSTM is not capable of highly parallel processing. Like faucets, the LSTM's input, output, and forget gates can be opened and closed at the desired time. These gates have the advantage of placing memory blocks on hidden nodes. The memory block of the LSTM may not store all the data, but it can solve the CNN's long-term dependency problem.
Furthermore, when LSTM is used at the CNN's pooling layer, the model has an end-to-end structure, so that spatial and temporal features can be designed simultaneously. The combined CNN-LSTM model achieved 90.33% accuracy. It is slower than CNN but faster than LSTM. The presented model was more accurate than the other models. In addition, each word embedding layer can be improved when the kernel is trained step by step. CNN-LSTM can compensate for the weaknesses of each model, and it has the advantage of improving learning layer by layer using the end-to-end structure of LSTM. For these reasons, this study tries to enhance the classification accuracy of movie reviews using the integrated CNN-LSTM model.
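
The abstract above compares the LSTM's input, output, and forget gates to faucets. As a small illustration of that idea, the sketch below implements one step of the standard LSTM cell equations for scalar inputs in plain Python. It is not the authors' CNN-LSTM model; the function name `lstm_step`, the parameter dictionary layout, and all parameter values are hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def lstm_step(x, h_prev, c_prev, p):
    """One scalar LSTM step using the standard gate equations.

    p maps names such as "wi", "ui", "bi" to the input weight, recurrent
    weight, and bias of each gate (i, f, o) and of the candidate (g).
    """
    i = sigmoid(p["wi"] * x + p["ui"] * h_prev + p["bi"])    # input gate
    f = sigmoid(p["wf"] * x + p["uf"] * h_prev + p["bf"])    # forget gate
    o = sigmoid(p["wo"] * x + p["uo"] * h_prev + p["bo"])    # output gate
    g = math.tanh(p["wg"] * x + p["ug"] * h_prev + p["bg"])  # candidate value
    c = f * c_prev + i * g       # keep part of the old memory, add new content
    h = o * math.tanh(c)         # expose part of the memory as the output
    return h, c


def make_params(bi, bf, bo=0.0, bg=0.0):
    """Hypothetical parameters: zero weights, so only the biases set the gates."""
    p = {k: 0.0 for k in ("wi", "ui", "wf", "uf", "wo", "uo", "wg", "ug")}
    p.update({"bi": bi, "bf": bf, "bo": bo, "bg": bg})
    return p


# Forget gate open, input gate closed: the cell state is carried forward.
h_keep, c_keep = lstm_step(x=1.0, h_prev=0.0, c_prev=0.7, p=make_params(bi=-20.0, bf=20.0))
# Forget gate closed, input gate closed: the cell state is erased.
h_drop, c_drop = lstm_step(x=1.0, h_prev=0.0, c_prev=0.7, p=make_params(bi=-20.0, bf=-20.0))
print(c_keep, c_drop)
```

With the forget gate saturated open, the stored value passes through a step almost unchanged, which is the mechanism usually credited with easing long-term dependency problems.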