• Title/Summary/Keyword: high-frequency vocabulary

An Analysis on the Vocabulary in the English-Translation Version of Donguibogam Using the Corpus-based Analysis (코퍼스 분석방법을 이용한 『동의보감(東醫寶鑑)』 영역본의 어휘 분석)

  • Jung, Ji-Hun;Kim, Dong-Ryul;Kim, Do-Hoon
    • The Journal of Korean Medical History / v.28 no.2 / pp.37-45 / 2015
  • Objectives: To analyze quantitatively the vocabulary of the English translation of Donguibogam. Methods: This study quantitatively analyzed the English-translated text of Donguibogam using corpus-based analysis and compared the results with those from an analysis of the original text. Results: The corpus analysis of the English translation found about 1,207,376 total words (tokens) and about 20,495 distinct words (types), giving a TTR (type/token ratio) of 1.69. The 1,000 highest-ranking words accounted for a cumulative 83.54% of the text, and the 2,000 highest-ranking words for 90.82%. Among the highest-frequency words, function words such as 'the', 'and', 'of', and 'is' predominated, while content words such as 'radix', 'qi', 'rhizoma', and 'water' also appeared frequently. Compared with the corpus analysis of the original Donguibogam, the TTR of the English translation was higher. The composition of high-frequency function words and content words was similar in the two versions. Both versions were also similar in that the 'Remedies' and 'Acupuncture' parts contained a higher proportion of content words than function words. Conclusions: The vocabulary of the English translation shows that the book is written in complete sentences and is at the same time a Korean medical book. However, the English translation has some problems, such as inconsistent vocabulary resulting from the use of several translators and the incomplete transfer of word meanings from the Chinese-character cultural sphere to the English-speaking one; these issues should be considered in any work translating old Korean medical books into English.
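
The token, type, TTR, and cumulative-coverage figures in the abstract above can be illustrated with a minimal Python sketch. The function names and the toy sentence are hypothetical and are not taken from the study, which does not describe its tooling. The TTR is assumed here to be expressed as a percentage, since dividing the reported type count by the reported token count gives roughly 1.70, close to the reported 1.69 (both counts are given as approximate).

```python
from collections import Counter


def type_token_ratio(tokens):
    """Return the type/token ratio as a percentage."""
    if not tokens:
        return 0.0
    return 100.0 * len(set(tokens)) / len(tokens)


def cumulative_coverage(tokens, top_n):
    """Return the percentage of all tokens covered by the top_n most frequent types."""
    if not tokens:
        return 0.0
    counts = Counter(tokens)
    covered = sum(freq for _, freq in counts.most_common(top_n))
    return 100.0 * covered / len(tokens)


# Toy example (hypothetical text, not from the Donguibogam corpus).
sample = "the qi of the water and the qi of the radix".split()
ttr = type_token_ratio(sample)          # 6 types over 11 tokens
cov = cumulative_coverage(sample, 2)    # share of tokens covered by the 2 most frequent types

# The same ratio applied to the approximate counts reported in the abstract.
reported_ttr = 100.0 * 20495 / 1207376
```

On a real corpus the token list would come from a tokenizer, and the choice of tokenizer (case folding, punctuation handling) changes both the type and token counts.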

The study analyzed a diachronic distribution, social meanings and social evaluations of ONNA : 'Headline Database of Newspaper Articles' by KOKKEN were used as research data. (「여(女)」 관련 어휘의 사용실태 - 国研「ことばに関する新聞記事見出しデータベース」를 분석대상으로)

  • Oh, Mi sun
    • Cross-Cultural Studies / v.29 / pp.341-366 / 2012
  • The 'Headline Database of Newspaper Articles' is a database of about 141,500 newspaper articles from 1949 to March 2009, collected by KOKKEN from two perspectives: 'language' and 'language life'. Of these, 3,312 articles (about 2.34%) included the word ONNA. The number of articles related to ONNA started to increase in 1975 but decreased afterwards. It rose rapidly in 1980 and stayed at that level, then fell rapidly in 1990 and remained low. It rose rapidly again in 2004 and 2007. The main causes of the rapid increases were the instant-noodle commercial "I am the one who is making. I am the one who is eating." in 1975; newspaper articles on the "Starting of full-scale studies on female language" in 1980; comments on "active women" and "men's crime" related to the murder of an elementary school student in Sasebo City, together with mixed attendance books, in 2004; and the comment "Women are machines which give birth to babies" in 2007. These six causes suggest that the previously fixed perception of gender, namely 'Men need to work outside and women need to do housework and take care of children', was changing and becoming a stereotype of virtual reality rather than reality. Vocabulary related to ONNA appeared 3,411 times in the 3,312 articles that included ONNA. Typical forms of the vocabulary related to ONNA were and . They appeared 2,390 times and accounted for 70% of the whole data (3,411 occurrences). The form ONNANOKO appeared 113 times, a high rate. ONNANOKO (113) together with other words such as SHOJO (115), JOJI (28), and YOJO (9) (152 in total) implied that young women were appearing in newspaper articles more often. Also, vocabulary related to 'female language', such as ONNAKOTOBA (28) and ONNANOKOTOBA (10), and to a woman's heart, such as ONNAGOKORO (35) and ONNANOKIMOCHI (34), appeared frequently. The vocabulary related to JOSEI was divided into <JOSEI**> and <**JOSEI>. <JOSEI**> forms were mainly related to occupations. <**JOSEI> forms were mainly used to express women by regional groups such as or combined with modifiers to express women such as . Among the forms with modifiers, WAKAIJOSEI appeared 35 times, the highest frequency, and in many cases carried a negative evaluation. The vocabulary related to JOSI appeared in the form <JOSI**> and was mainly associated with 'a girls' school' and 'a female student'.

A study on the predictability of acoustic power distribution of English speech for English academic achievement in a Science Academy (과학영재학교 재학생 영어발화 주파수 대역별 음향 에너지 분포의 영어 성취도 예측성 연구)

  • Park, Soon;Ahn, Hyunkee
    • Phonetics and Speech Sciences / v.14 no.3 / pp.41-49 / 2022
  • The average acoustic distribution of American English speakers was statistically compared with the English-speaking patterns of gifted students at a Science Academy in Korea. By analyzing speech recordings much longer than those used in previous studies, this research identified the degree of acoustic proximity between the two groups and the predictability of the English academic achievement of gifted high school students. Long-term spectral acoustic power distribution vectors were obtained for 2,048 center frequencies in the range of 20 Hz to 20,000 Hz by applying a long-term average speech spectrum (LTASS) MATLAB code. Three more variables were statistically compared to discover additional indices that can predict future English academic achievement: the receptive vocabulary size test, the cumulative vocabulary scores of English formative assessment, and the English Speaking Proficiency Test scores. Linear regression and correlational analyses between the four variables showed that the receptive vocabulary size test and the low-frequency vocabulary formative assessments, which require both lexical and domain-specific science background knowledge, are relatively more significant predictors of gifted students' academic achievement than basic suprasegmental-level English fluency.

Analysis of the English Textbooks in North Korean First Middle School (북한 제1중학교 영어교과서 분석)

  • Hwang, Seo-yeon;Kim, Jeong-ryeol
    • The Journal of the Korea Contents Association / v.17 no.11 / pp.242-251 / 2017
  • For this research, a corpus was created from the English textbooks of the "First Middle School" for the gifted in North Korea, and their linguistic characteristics were analyzed using the corpus. Although many studies have identified the traits of English textbooks in North Korea's general middle schools, little attention has been paid to the English textbooks used at North Korea's First Middle School. First, the structure of the first-, second-, fourth-, and sixth-grade English textbooks procured from the Information Center on North Korea was reviewed, after which the corpus was created. Then, using Wordsmith Tools 7.0, the linguistic properties and the high-frequency content words appearing in the first-grade English textbook were analyzed in detail. The basic statistics indicated that while the number of vocabulary items did not increase as students progressed through the grades, the words used tended to diversify incrementally. Meanwhile, the distribution of high-frequency content words by grade showed a large difference between the content words used in the English texts of each grade, and it was the subject matter of the texts that determined this difference.

A Case Study of Untact Lecture on Albert Camus' La Peste using Big Data (빅데이터를 활용한 『페스트』(알베르 카뮈) 비대면 문학 강의 운영 사례 연구)

  • MIN, Jinyoung
    • The Journal of the Convergence on Culture Technology / v.7 no.4 / pp.59-65 / 2021
  • This is a case study on the use of Albert Camus' La Peste, which has gained popularity in the post-COVID generation, together with big data analysis tools in major and elective classes. First, we asked students majoring in French to use big data analysis to compare vocabulary use and the number of appearances of each character across about 400 pages of the original text. As a result, we were able to confirm a relationship between Camus' Absurdism and the vocabulary used in La Peste, in addition to noting the high frequency of resistant characters. Students in elective classes were asked to read the work in a Korean translation and determine the frequency of vocabulary and of characters' appearances. Students were able to relate strongly to La Peste because of the parallels between COVID and the plague in the novel. We also observed high levels of class satisfaction regarding the use of big data analysis tools. The students responded positively both to the choice of La Peste as the work of literature and to the use of big data, a main tool of the Fourth Industrial Revolution. We were able to obtain good results even in a non-contact environment, as long as the literature lectures did not rely on traditional methods but instead reflected current situations.

Metrical Comparison of English Textbooks in East Asian Countries, the U.S.A. and U.K.

  • Ban, Hiromi;Ededrick, Toby;Oyabu, Takashi
    • Proceedings of the Korean Institute of Intelligent Systems Conference / 2003.09a / pp.508-512 / 2003
  • In 2000, the economy of Asia made a V-shaped recovery from the currency and financial crisis of 1997. The increase in exports is assumed to be one of the causes. To negotiate with foreign countries, English is indispensable in many cases. In this study, we investigated how English education is conducted in East Asian countries, focusing on English textbooks. We metrically analyzed some textbooks used in junior high schools and high schools in Japan and Korea, and in elementary schools in China and Singapore, and compared them with U.S.A. and U.K. textbooks. We investigated some characteristics of character and word appearance in the English textbooks using an exponential function. Moreover, we derived the degree of difficulty for each material from the variety of words and their frequency, on the basis of the English vocabulary required in Japanese junior high schools. As a result, we could show which level of U.S.A. or U.K. textbooks the English textbooks used in East Asian countries correspond to.


A Genre Analysis of Newspaper Articles for Korean Language Education -Based on the linguistic analysis of newspaper articles and reading materials in Korean language textbooks- (한국어 읽기 교육을 위한 기사문 장르분석 -신문기사 및 교재 기사문의 언어학적 분석을 바탕으로-)

  • Lee, Seungyeon;Sim, Jiyeon;Shin, Jungha
    • Journal of Korean language education / v.28 no.3 / pp.53-83 / 2017
  • The goal of this study is to examine whether the genre characteristics of newspaper articles are appropriately reflected in Korean language textbooks. For this purpose, two corpora were built, one with 17 textbook articles and one with 60 newspaper articles. The average sentence length and the frequency of vocabulary in each corpus were measured. Sentences in textbook articles tended to be longer and structurally more complicated than those in newspaper articles. For instance, sentences in the textbook articles had more verbal endings, such as conjunctive and transforming endings. On the other hand, for vocabulary representing 'timeliness', actual articles showed a high frequency of adverbs and nouns related to year, month, and time, whereas such vocabulary was very limited in textbooks. Also, typical translative styles such as '-ko itta' and '-e ttareumyun' were more prominent in textbooks than in newspaper articles. Abbreviated and omitted particles appeared only in actual articles, because of space constraints. This paper is significant in that it offers suggestions for the development of reading materials for Korean language education by revealing that the genre typology of actual newspaper articles is not adequately reflected in current textbooks.

Characteristics of Environmental Color Image Vocabulary for Public Healthcare Facility (공공보건시설 환경색채이미지 어휘 특성)

  • Park, Heykyung;Oh, Jiyoung
    • Korea Science and Art Forum / v.31 / pp.171-180 / 2017
  • The purpose of this study is to analyze the characteristics of color images in order to establish a color environment that contributes to the promotion of public health in public health facilities, and to use the results as data for public health color planning and index development. For this purpose, the results of previous studies were integrated, and public health facilities were classified into medical facilities (general hospitals), health facilities (public health centers), and sub-healing facilities (elderly care facilities). We visited 18 public health facilities in total, measured their environmental colors with a spectroscopic instrument, compared the results with those of previous studies, and identified color image characteristics and points for future improvement. The results are as follows. First, previous studies on the environmental color image vocabulary of public health facilities showed a preference for comfortable, bright, and positive images. Second, direct measurement of the environmental colors of the public health facilities showed that most of them use high-brightness, low-saturation colors of the Y series. Third, analysis of the environmental color image vocabulary of public health facilities showed that the 'natural' image had the highest frequency, followed by other images such as 'gentle' and 'decent'. It was difficult to understand the characteristics of the color image vocabularies of public health facilities. This study is a convergence study of color science and environmental design; it extends the scope of multidisciplinary design research and will be helpful in environmental planning that considers users' emotions.

Neo-Chinese Style Furniture Design Based on Semantic Analysis and Connection

  • Ye, Jialei;Zhang, Jiahao;Gao, Liqian;Zhou, Yang;Liu, Ziyang;Han, Jianguo
    • KSII Transactions on Internet and Information Systems (TIIS) / v.16 no.8 / pp.2704-2719 / 2022
  • Lately, neo-Chinese style furniture has attracted frequent attention from product design professionals for the large part it has played in promoting traditional Chinese culture. This article attempts to use big data semantic analysis to provide an effective design research method for neo-Chinese furniture design. Using the big data mining program TEXTOM for data collection and analysis, the data obtained from typical websites in a set time period are sorted and analyzed. On the basis of "neo-Chinese furniture" samples, key data are compared, and classification analysis of the overall data and horizontal analysis of typical data are performed using word frequency analysis, connection centrality analysis, and TF-IDF analysis. We then summarize the findings in light of related design views and theories. The results show that the outcomes of the data analysis are close to the relevant definitions of the design. The core high-frequency vocabulary obtained from the data analysis, such as 'popular', 'furniture', and 'modern', can provide a reasonable and effective focus of attention for designs. The results obtained through the systematic sorting and summary of the data can serve as reliable guidance for the direction of a design. This research attempts to introduce big data mining and semantic analysis methods into the product design industry, to supply scientific and objective data and channels for design studies, and to provide a case of the practical application of big data analysis in the industry.
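
The word frequency and TF-IDF analyses named in the abstract above can be sketched in plain Python. This is an illustration only: the abstract does not state which TF-IDF variant TEXTOM uses, so the common form tf × log(N / df) is assumed here, and the three toy documents are hypothetical.

```python
import math
from collections import Counter


def word_frequencies(docs):
    """Count how often each word occurs across all tokenized documents."""
    total = Counter()
    for doc in docs:
        total.update(doc)
    return total


def tf_idf(docs):
    """Return one {word: score} dict per document, using tf * log(N / df)."""
    n_docs = len(docs)
    # Document frequency: number of documents that contain each word.
    df = Counter()
    for doc in docs:
        df.update(set(doc))
    scores = []
    for doc in docs:
        tf = Counter(doc)
        scores.append(
            {w: (c / len(doc)) * math.log(n_docs / df[w]) for w, c in tf.items()}
        )
    return scores


# Hypothetical tokenized documents, not data from the study.
docs = [
    "modern furniture popular design".split(),
    "popular modern furniture wood".split(),
    "furniture rosewood carving".split(),
]
freq = word_frequencies(docs)
scores = tf_idf(docs)
```

With this variant, a word that occurs in every document (here 'furniture') has the highest raw frequency but a TF-IDF score of zero, which is why the two measures are usually read side by side.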

The Blog Polarity Classification Technique using Opinion Mining (오피니언 마이닝을 활용한 블로그의 극성 분류 기법)

  • Lee, Jong-Hyuk;Lee, Won-Sang;Park, Jea-Won;Choi, Jae-Hyun
    • Journal of Digital Contents Society / v.15 no.4 / pp.559-568 / 2014
  • Previous polarity classification using sentiment analysis relies on sentence rules derived from product reviews with rating points. This approach is difficult to apply to blogs, which have no product-review ratings. In addition, product reviews can be fabricated by paid commenters and by the managers of a website, so it is not easy to form a reliable understanding of a product or store from such reviews. Considering these problems, analyzing blogs, which contain personal and frank opinions, and classifying their polarity makes it possible to understand opinions about a product or store correctly. This paper proposes extracting high-frequency vocabulary from blogs in several domains and choosing topic words, then applying a sentiment analysis technique to classify the polarity of the blog contents. To evaluate the performance of the sentiment analysis, we use Precision, Recall, and F-Score, measures from the information retrieval field. The evaluation shows that the proposed sentiment analysis classifies polarity better than previous techniques that use sentence rules based on product reviews.
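
The Precision, Recall, and F-Score measures mentioned in the abstract above can be computed as in the following minimal sketch. The labels, the function name, and the toy predictions are hypothetical; the paper's own data and evaluation setup are not described in the abstract, and the balanced F1 form of the F-Score is assumed.

```python
def precision_recall_f1(gold, predicted, positive="positive"):
    """Compute precision, recall, and F1 for one class over paired label lists."""
    pairs = list(zip(gold, predicted))
    tp = sum(1 for g, p in pairs if g == positive and p == positive)
    fp = sum(1 for g, p in pairs if g != positive and p == positive)
    fn = sum(1 for g, p in pairs if g == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return precision, recall, f1


# Hypothetical gold labels and classifier output for five blog posts.
gold = ["positive", "positive", "negative", "positive", "negative"]
pred = ["positive", "negative", "negative", "positive", "positive"]
p, r, f = precision_recall_f1(gold, pred)
```

In this toy case there are two true positives, one false positive, and one false negative, so precision, recall, and F1 all equal 2/3.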