• Title/Summary/Keyword: Corpus Frequency

A Genre Analysis of Newspaper Articles for Korean Language Education -Based on the linguistic analysis of newspaper articles and reading materials in Korean language textbooks- (한국어 읽기 교육을 위한 기사문 장르분석 -신문기사 및 교재 기사문의 언어학적 분석을 바탕으로-)

  • Lee, Seungyeon;Sim, Jiyeon;Shin, Jungha
    • Journal of Korean language education / v.28 no.3 / pp.53-83 / 2017
  • This study examines whether the genre characteristics of newspaper articles are appropriately reflected in Korean language textbooks. To this end, two corpora were built, one from 17 textbook articles and one from 60 newspaper articles, and the average sentence length and vocabulary frequency of each corpus were measured. Sentences in the textbook articles tended to be longer and structurally more complex than those in the newspaper articles; for instance, they contained more verbal endings, such as conjunctive and transforming endings. By contrast, vocabulary expressing 'timeliness', namely adverbs and nouns related to year, month, and time, occurred with high frequency in actual articles but was very limited in textbooks. Typical translative styles such as '-ko itta' and '-e ttareumyun' were also more prominent in textbooks than in newspaper articles. Abbreviated and omitted particles appeared only in actual articles, owing to space constraints. By showing that the genre typology of actual newspaper articles is not adequately reflected in current textbooks, this paper offers suggestions for developing reading materials for Korean language education.
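
The two measurements named in this abstract, average sentence length and vocabulary frequency, can be sketched in a few lines of Python. This is only an illustration under assumptions: the regex tokenizer, the function name `corpus_stats`, and the English toy sentences are hypothetical and are not the study's data or tooling; Korean text would in practice need a morphological analyzer rather than whitespace-style tokenization.

```python
import re
from collections import Counter

def corpus_stats(texts):
    """Return (average sentence length in tokens, token frequency Counter)."""
    sentence_lengths = []
    freq = Counter()
    for text in texts:
        # Naive sentence split: break after ., ! or ? followed by whitespace.
        for sentence in re.split(r"(?<=[.!?])\s+", text.strip()):
            tokens = re.findall(r"\w+", sentence.lower())
            if tokens:
                sentence_lengths.append(len(tokens))
                freq.update(tokens)
    avg = sum(sentence_lengths) / len(sentence_lengths) if sentence_lengths else 0.0
    return avg, freq

# Toy stand-ins for the two corpora (invented, not taken from the study).
textbook = ["The city opened a new park, and many people visited it because the weather was nice."]
newspaper = ["The city opened a park today. Crowds came. Officials expect more this month."]

tb_avg, tb_freq = corpus_stats(textbook)
news_avg, news_freq = corpus_stats(newspaper)
print(f"textbook avg sentence length:  {tb_avg:.2f}")
print(f"newspaper avg sentence length: {news_avg:.2f}")
print("most frequent textbook tokens:", tb_freq.most_common(3))
```

Comparing the two averages side by side mirrors the kind of contrast the abstract reports between textbook and newspaper sentences.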

Effects of Bombesin on Electrical and Mechanical Activities of Gastric Smooth Muscle Strips of Cats (적출한 고양이 위(胃) 평활근 절편의 전기적 및 기계적 활동에 미치는 Bombesin의 영향과 그 작용기전)

  • Park, Hyoung-Jin;Kwon, Hyeok-Yil;Suh, Sang-Won;Kim, Jeong-Mi;Lee, Tae-Hyung
    • The Korean Journal of Physiology / v.24 no.1 / pp.39-49 / 1990
  • It has been reported that bombesin induces contraction of the smooth muscle of the gastrointestinal tract. Because electrical activity is associated with contractile activity in the smooth muscle of the stomach, the present investigation was undertaken to examine the influence of bombesin on the electrical activity of gastric smooth muscle. Smooth muscle strips $(5\;{\times}\;1.5\;cm)$ that included the corpus and antrum were prepared from the ventral and dorsal portions of the feline stomach along the greater curvature. Circular muscle strips $(1\;{\times}\;0.3\;cm)$ of the corpus were also obtained. Electrical activity of the corpus and antrum of the muscle strip was recorded monophasically using Ag-AgCl capillary electrodes placed on the circular muscle layer. Contractile activity of the circular muscle strip was also recorded. The recordings were performed in Krebs-Ringer solution that was continuously aerated with $O_{2}$ containing 5% $CO_{2}$ and kept at $36^{\circ}C$. Dose-related responses of electrical activity and contractility to bombesin were studied after the frequency of slow waves and contraction of each strip had reached a steady state. The action of $D-leu^{13}-{\psi}\;(CH_{2}NH)-D-leu^{14}-bombesin,\;D-pro^{2}-D-trp^{7,9}-substance\;P$, tetrodotoxin, hexamethonium, atropine, phentolamine, or propranolol on the effect of bombesin was also observed. 1) Bombesin increased the frequency of slow waves and contractions dose-dependently at concentrations from $10^{-9}\;M\;to\;3\;{\times}\;10^{-8}\;M$. 2) The bombesin analogue at a concentration of $3\;{\times}\;10^{-7}\;M$ antagonized the effect of bombesin on the frequency of slow waves. 3) The effect of bombesin on the frequency of slow waves was inhibited by tetrodotoxin $(10^{-6}\;M)$ and hexamethonium $(10^{-3}\;M)$ but unaffected by atropine $(10^{-6}\;M)$, phentolamine $(10^{-5}\;M)$, and propranolol $(10^{-5}\;M)$. 4) The effect of bombesin on the frequency of slow waves was blocked by the substance P analogue at a concentration of $10^{-5}\;M$. 5) Substance P at a concentration of $10^{-5}\;M$ failed to change the frequency of slow waves. It is concluded from these results that bombesin increases the frequency of slow waves as well as contractions of the smooth muscle strip from the feline stomach, and that the effect of bombesin might be mediated by a non-cholinergic or non-adrenergic mechanism at the neuromuscular junction. However, enteric nerves that have substance P as a neurotransmitter do not appear to participate in the action of bombesin on the frequency of slow waves.

Separation of Voiced Sounds and Unvoiced Sounds for Corpus-based Korean Text-To-Speech (한국어 음성합성기의 성능 향상을 위한 합성 단위의 유무성음 분리)

  • Hong, Mun-Ki;Shin, Ji-Young;Kang, Sun-Mee
    • Speech Sciences / v.10 no.2 / pp.7-25 / 2003
  • Predicting the right prosodic elements is a key factor in improving the quality of synthesized speech. Prosodic elements include break, pitch, duration, and loudness. Pitch, which is realized by the fundamental frequency (F0), is the most important element for the quality of synthesized speech. However, the previous method for predicting F0 has some problems. If voiced and unvoiced sounds are not correctly classified, the result is incorrect pitch prediction, selection of the wrong triphone unit when synthesizing voiced and unvoiced sounds, and clicking or vibrating noises. These errors are common at transitions from a voiced sound to an unvoiced sound or vice versa. Such problems are not resolved by grammar-based methods, and they strongly affect the synthesized sound. Therefore, to obtain correct pitch values reliably, this paper proposes a new model for predicting and classifying voiced and unvoiced sounds using the CART tool.

Building and Analyzing Panic Disorder Social Media Corpus for Automatic Deep Learning Classification Model (딥러닝 자동 분류 모델을 위한 공황장애 소셜미디어 코퍼스 구축 및 분석)

  • Lee, Soobin;Kim, Seongdeok;Lee, Juhee;Ko, Youngsoo;Song, Min
    • Journal of the Korean Society for Information Management / v.38 no.2 / pp.153-172 / 2021
  • This study builds a deep learning based classification model that examines the characteristics of panic disorder and classifies documents showing a tendency toward panic disorder, using a panic disorder corpus constructed for the study. To this end, 5,884 documents collected from social media were directly annotated according to a mental disorder diagnostic manual and classified into panic-disorder-prone and non-panic-disorder documents. TF-IDF scores were then calculated and word co-occurrence analysis was performed to analyze the lexical characteristics of the corpus. In addition, symptom frequency was measured and the co-occurrence of annotated symptoms was calculated to analyze the characteristics of panic disorder symptoms and the relationships between symptoms. The performance of the deep learning based classification model was also evaluated. Three pre-trained models, BERT multilingual, KoBERT, and KcBERT, were adopted for classification, and KcBERT showed the best performance among them. By examining the characteristics of panic disorder, this study shows that such analysis can support the early diagnosis and treatment of people suffering from related symptoms, and it extends mental illness research to social media.
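
The lexical analysis steps mentioned in this abstract, TF-IDF scoring and word co-occurrence counting, can be sketched as follows. This is a minimal illustration under assumptions: it uses the plain `tf * log(N / df)` weighting (the abstract does not say which TF-IDF variant was used), counts co-occurrence at the document level, and runs on invented toy documents; the function names are hypothetical.

```python
import math
from collections import Counter

def tf_idf(docs):
    """docs: list of token lists. Returns one {term: score} dict per document."""
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    scores = []
    for doc in docs:
        tf = Counter(doc)
        scores.append({
            term: (count / len(doc)) * math.log(n_docs / df[term])
            for term, count in tf.items()
        })
    return scores

def cooccurrence(docs):
    """Count unordered term pairs that appear in the same document."""
    pairs = Counter()
    for doc in docs:
        terms = sorted(set(doc))
        for i, a in enumerate(terms):
            for b in terms[i + 1:]:
                pairs[(a, b)] += 1
    return pairs

# Invented toy documents, already tokenized.
docs = [
    ["heart", "racing", "fear", "fear"],
    ["fear", "breath", "short"],
    ["lunch", "weather", "nice"],
]
scores = tf_idf(docs)
pairs = cooccurrence(docs)
print(sorted(scores[0].items(), key=lambda kv: -kv[1]))
print(pairs.most_common(3))
```

Terms that occur in many documents receive a lower weight than terms concentrated in a few, which is why "heart" outranks the more widespread "fear" in the first toy document even though "fear" occurs twice there.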

Modality in Korean Learners' Spoken Interlanguage

  • Park, Hyeson
    • English Language & Literature Teaching / v.18 no.1 / pp.197-216 / 2012
  • This study examines the spoken interlanguage of Korean learners of English, focusing on the distribution of modal verbs and devices of epistemic modality. (Semi-)spontaneous speech data were collected from four students participating in a self-organized study group for seven months, which produced a corpus of about 55,000 words. The data analysis reveals the following: 1) The frequency of the modal verbs produced by the learners was lower than that of native speakers (1.99 vs. 2.32 tokens per 100 words). The range of the modal verbs used by the learners was also very limited, with over-reliance on can (43%). 2) The grammatical categories of the devices marking epistemic modality were, in order of frequency, adverbs, lexical verbs, and modal verbs, with a high frequency of a few items in each category. 3) Lexical items conveying certainty and modals of obligation were preferred over markers of weaker commitment, resulting in speech characterized by firmer assertions and a more authoritative tone, a potential cause of pragmatic failure. 4) A weak developmental change was observed in the frequency of modal verbs, but not in their functions, over the seven-month period of data collection. L1 influence, L2 proficiency, mode of communication, and instruction effects are discussed as possible variables involved in the distribution patterns observed.
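
The normalized measure quoted in this abstract (modal tokens per 100 words, plus the share of a single modal such as can) can be computed as in the sketch below. The modal list `MODALS` and the sample utterance are assumptions made for illustration; the abstract does not specify which forms the study counted.

```python
# Hypothetical set of core English modals; the study's own list is not given.
MODALS = {"can", "could", "may", "might", "must", "shall", "should", "will", "would"}

def modal_rate(tokens):
    """Return (modal tokens per 100 words, share of each modal among modal tokens)."""
    hits = [t for t in tokens if t in MODALS]
    rate = 100 * len(hits) / len(tokens) if tokens else 0.0
    shares = {m: hits.count(m) / len(hits) for m in set(hits)}
    return rate, shares

# Invented 20-word utterance.
tokens = ("i can go but i think you should stay because "
          "we can talk later and it can wait right now").split()
rate, shares = modal_rate(tokens)
print(f"{rate:.2f} modal tokens per 100 words")
print({m: f"{s:.0%}" for m, s in shares.items()})
```

Normalizing by corpus size in this way is what allows a 55,000-word learner corpus to be compared with a native-speaker corpus of a different size.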

Unstructured Data Analysis and Multi-pattern Storage Technique for Traffic Information Inference (교통정보 추론을 위한 비정형데이터 분석과 다중패턴저장 기법)

  • Kim, Yonghoon;Kim, Booil;Chung, Mokdong
    • Journal of Korea Multimedia Society / v.21 no.2 / pp.211-223 / 2018
  • Understanding the meaning of data is a common goal of research on unstructured data. Among unstructured data, corpus and sentence data are particularly difficult to analyze for meaning. Existing studies have used LSA to select the sentences whose meaning is most similar to specific words in a sentence. However, examining many sentences continuously in this way is problematic. To address the unstructured data classification problem, several search sites classify word frequencies and provide them to users. In this paper, we propose a method that classifies documents using the frequency of similar words, applies the frequency of non-relevant words as weights, and stores the results in a multi-pattern storage. We apply TensorFlow's softmax to nearby sentences for machine learning and use it for unstructured data analysis and the inference of traffic information.
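
To make the idea of frequency-weighted classification followed by a softmax concrete, here is a small pure-Python sketch. It is not the authors' method: the class labels, the per-word weights in `WEIGHTS` (positive for words assumed to be similar to a class, negative for words assumed to be non-relevant), and the scoring rule are all hypothetical, and the softmax is written out by hand rather than calling TensorFlow.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of real-valued scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical word weights per class (assumed, not from the paper).
WEIGHTS = {
    "congestion": {"jam": 1.0, "slow": 1.0, "delay": 1.0, "clear": -0.5},
    "accident": {"crash": 1.0, "collision": 1.0, "jam": 0.5},
}

def classify(tokens, weights):
    """Sum the weights of matching tokens per class, then normalize with softmax."""
    labels = list(weights)
    scores = [sum(weights[label].get(t, 0.0) for t in tokens) for label in labels]
    return dict(zip(labels, softmax(scores)))

probs = classify("heavy jam and slow traffic after crash".split(), WEIGHTS)
print(probs)
```

The softmax turns the raw weighted counts into probabilities that sum to one, so the class with the larger weighted frequency receives the larger share.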

Intonation Patterns of Korean Spontaneous Speech (한국어 자유 발화 음성의 억양 패턴)

  • Kim, Sun-Hee
    • Phonetics and Speech Sciences / v.1 no.4 / pp.85-94 / 2009
  • This paper investigates the intonation patterns of Korean spontaneous speech through an analysis of four dialogues in the domain of travel planning. The speech corpus, a subset of the spontaneous speech database recorded and distributed by ETRI, is labeled in APs and IPs based on the K-ToBI system using Momel, an intonation stylization algorithm. It was found that, unlike in English, a significant number of APs and IPs include hesitation lengthening, which is known to be a disfluency phenomenon due to speech planning. This paper also claims that hesitation lengthening differs from IP-final lengthening and should be treated as a new category, as it greatly affects the intonation patterns of the language. Apart from the fact that 19.09% of APs show hesitation lengthening, spontaneous speech shows the same AP patterns as read speech, but with a higher frequency of falling patterns such as LHL, whereas read speech shows more LH and LHLH patterns. The IP boundary tones of spontaneous speech show the same five patterns as read speech (L%, HL%, LHL%, H%, and LH%), but with a higher frequency of rising patterns (H% and LH%) and contour tones (HL%, LH%, LHL%), whereas read speech shows a higher frequency of falling patterns and simple tones at the end of IPs.

A Study on Environmental research Trends by Information and Communications Technologies using Text-mining Technology (텍스트 마이닝 기법을 이용한 환경 분야의 ICT 활용 연구 동향 분석)

  • Park, Boyoung;Oh, Kwan-Young;Lee, Jung-Ho;Yoon, Jung-Ho;Lee, Seung Kuk;Lee, Moung-Jin
    • Korean Journal of Remote Sensing / v.33 no.2 / pp.189-199 / 2017
  • This study quantitatively analyzed research trends in the use of ICT in the environmental field using text mining. To that end, it collected 359 papers published over the past two decades (1996-2015) from the National Digital Science Library (NDSL) using 38 environment-related keywords and 16 ICT-related keywords. It applied natural language processing to the environment and ICT text in the papers and reorganized the classification system into corpus units. Based on the keywords of this classification system, it then carried out frequency analysis, keyword analysis, and association rule analysis of keywords. The keywords 'general environment' and 'climate' accounted for 77% of the total, and the keywords 'public convergence service' and 'industrial convergence service' in the ICT field accounted for approximately 30%. According to the time series analysis, studies using ICT in the environmental field increased rapidly over the most recent five years (2011-2015), more than doubling in number compared with the earlier period (1996-2010). The association rules generated among the keywords showed that 'general environment' was associated with 16 ICT-based technologies and 'climate' with 14.
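
Association rule analysis of keywords, as named in this abstract, rests on two quantities: the support of a keyword pair and the confidence of a rule from one keyword to another. The sketch below computes both for keyword pairs. It is an illustration only: the function name, the support threshold, and the toy keyword sets are invented (the ICT tags "sensor network" and "big data" are hypothetical), and it does not reproduce the study's tooling or results.

```python
from collections import Counter
from itertools import combinations

def association_rules(transactions, min_support=0.3):
    """Return (lhs, rhs, support, confidence) for keyword pairs above min_support."""
    n = len(transactions)
    item_counts = Counter(item for t in transactions for item in set(t))
    pair_counts = Counter(
        pair for t in transactions for pair in combinations(sorted(set(t)), 2)
    )
    rules = []
    for (a, b), count in pair_counts.items():
        support = count / n  # fraction of papers containing both keywords
        if support < min_support:
            continue
        # Emit both directions: a -> b and b -> a.
        for lhs, rhs in ((a, b), (b, a)):
            confidence = count / item_counts[lhs]  # P(rhs | lhs)
            rules.append((lhs, rhs, support, confidence))
    return rules

# Each set stands for the keywords assigned to one paper (toy data).
papers = [
    {"climate", "sensor network"},
    {"climate", "big data"},
    {"general environment", "sensor network"},
    {"climate", "sensor network", "big data"},
]
rules = association_rules(papers, min_support=0.5)
for lhs, rhs, support, confidence in rules:
    print(f"{lhs} -> {rhs}: support={support:.2f}, confidence={confidence:.2f}")
```

Counting how many distinct ICT keywords appear on the right-hand side of rules for a given environmental keyword gives the kind of tally reported at the end of the abstract.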

Clinical Analysis and Investigation for the Infertile Women with Hyperprolactinemia (불임환자의 고 Prolactin 혈증에 관한 연구)

  • Kang, S.B.;Kang, B.M.;Kim, J.G.;Lee, J.Y.;Chang, Y.S.
    • Clinical and Experimental Reproductive Medicine / v.13 no.1 / pp.21-28 / 1986
  • It is now apparent that many cases of amenorrhea, oligomenorrhea, corpus luteum deficiency, galactorrhea, and infertility are due to hyperprolactinemia. We investigated the relationships between serum prolactin values and factors such as menstrual pattern and frequency of galactorrhea in 135 hyperprolactinemic patients at the Seoul National University Hospital over a period of 6 years, from January 1979 to December 1984. The results were as follows: 1. Menstrual pattern changed according to the serum prolactin level. The frequency of amenorrhea was 1.7 percent in patients with serum prolactin levels ranging from $25{\sim}40ng/ml$, but 72.4 percent in patients with serum prolactin levels above 100ng/ml. 2. The incidence of galactorrhea in hyperprolactinemic patients was 3.1 percent, and the frequency of galactorrhea was directly related to the serum prolactin level and/or the frequency of abnormal menstrual patterns. 3. The incidence of pituitary tumor in hyperprolactinemic patients was 104 percent, and sixty percent of patients with serum prolactin levels above 100ng/ml had a pituitary tumor. 4. There was an inverse correlation between serum prolactin and progesterone values. 5. The frequency of anovulatory menstrual cycles, as evidenced by basal body temperature, was 23.9 percent in patients with serum prolactin levels ranging from $20{\sim}40ng/ml$, but 76.9 percent in patients with serum prolactin levels above 100ng/ml.

A Diachronic Lexical Analysis of the North Korean English Textbooks (북한 영어 교과서 어휘의 통시적 분석)

  • Kim, Jiyoung;Lee, Je-Young;Kim, Jeong-ryeol
    • The Journal of the Korea Contents Association / v.17 no.4 / pp.331-341 / 2017
  • This paper diachronically analyzes the English vocabulary of North Korean textbooks using a purpose-built English textbook corpus. The North Korean English textbooks, obtained from the Information Center on North Korea of the Ministry of Unification, were divided into two sets, before and after the Kim Jong-Il era, split at 1996, the year in which the curriculum was revised. They were stored as text files and their vocabulary was analyzed using WordSmith Tools 7.0. The vocabulary size of the revised textbooks increased after the curriculum reorganization, but the number of vocabulary types and vocabulary diversity decreased. After the curriculum revision, many words related to the establishment of the Kim Jong-Il system appeared as keywords. Some vocabulary also reflected the economic and social life of North Korea. In addition, a comparison of the 100 highest-frequency words and the keywords suggests that the vocabulary of North Korean English textbooks is gradually shifting from content related to written language toward communicative content.
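
The measures compared in this abstract (vocabulary size in tokens, number of types, lexical diversity, and a high-frequency word list) can be approximated with a short script. This sketch uses a plain type-token ratio and a simple regex tokenizer as assumptions; it does not reproduce WordSmith Tools, and the two sample texts are invented to show how token count can rise while types and diversity fall.

```python
import re
from collections import Counter

def lexical_profile(text, top_n=3):
    """Return token count, type count, type-token ratio, and the top_n words."""
    tokens = re.findall(r"[a-z']+", text.lower())
    freq = Counter(tokens)
    n_tokens = len(tokens)
    n_types = len(freq)
    ttr = n_types / n_tokens if n_tokens else 0.0
    return {"tokens": n_tokens, "types": n_types, "ttr": ttr,
            "top": freq.most_common(top_n)}

# Invented stand-ins for the pre- and post-revision textbook sets.
before = lexical_profile("The boy reads a book. The girl writes a letter.")
after = lexical_profile("We study hard. We study every day. We work hard every day.")
print(before)
print(after)
```

Because a plain type-token ratio falls as texts get longer, comparisons between corpora of different sizes usually rely on a length-normalized variant; that refinement is omitted here for brevity.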