• Title/Summary/Keyword: Seoul Corpus


Pronunciation of the Korean diphthong /jo/: Phonetic realizations and acoustic properties (한국어 /ㅛ/의 발음 양상 연구: 발음형 빈도와 음향적 특징을 중심으로)

  • Hyangwon Lee
    • Phonetics and Speech Sciences / v.15 no.1 / pp.9-17 / 2023
  • This study examines how the Korean diphthong /jo/ varies phonetically across linguistic environments, focusing on the relationship between phonetic variation and the distributional range of vowels. Position in the word (monosyllable, word-initial, word-medial, word-final) and word class (content word, function word) were analyzed using the speech of 10 female speakers from the Seoul Corpus. The frequency of /jo/ in each environment showed that pronunciation type and word class were affected by position in the word. In the acoustic analysis, frequent phonetic reduction was observed in function-word /jo/. Word class did not change the mean phonetic values of /jo/, but it did change the distribution of individual tokens. These results indicate that the linguistic environment affects the phonetic distribution of vowels.

Change Detection Using Spectral Unmixing and IEA(Iterative Error Analysis) for Hyperspectral Images (IEA(Iterative Error Analysis)와 분광혼합분석기법을 이용한 초분광영상의 변화탐지)

  • Song, Ahram;Choi, Jaewan;Chang, Anjin;Kim, Yongil
    • Korean Journal of Remote Sensing / v.31 no.5 / pp.361-370 / 2015
  • Various algorithms, such as Chronochrome (CC), Principal Component Analysis (PCA), and spectral unmixing, have been studied for hyperspectral change detection. Unlike other change detection methods, which provide only the locations of changes in the scene, change detection by spectral unmixing also offers useful information on the nature of the change. However, hyperspectral change detection by spectral unmixing is still at an early stage. This study proposes a new approach that uses Iterative Error Analysis (IEA) and the Spectral Angle Mapper (SAM) to extract endmembers that have identical properties in temporally different images. The change map obtained from the difference in abundance efficiently showed the changed pixels. Simulated images generated from Compact Airborne Spectrographic Imager (CASI) and Hyperion data were used for change detection, and the experimental results showed that the proposed method performed better than CC, PCA, and spectral unmixing using N-FINDR. The proposed method has the advantage of extracting endmembers automatically without prior information, and it could be applied to real images composed of many materials.
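
The abstract above uses the Spectral Angle Mapper (SAM) to pair endmembers across acquisition dates. As a minimal sketch of the spectral angle itself (not the authors' implementation), the following computes the angle between two spectra and pairs each endmember from one date with its closest counterpart at another. The function names and the nearest-angle pairing rule are assumptions made for illustration.

```python
import math


def spectral_angle(a, b):
    """Return the spectral angle in radians between two equal-length spectra."""
    if len(a) != len(b):
        raise ValueError("spectra must have the same number of bands")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("spectra must be non-zero")
    # Clamp to [-1, 1] to guard against floating-point rounding.
    cos_theta = max(-1.0, min(1.0, dot / (norm_a * norm_b)))
    return math.acos(cos_theta)


def match_endmembers(endmembers_t1, endmembers_t2):
    """Hypothetical pairing rule: for each time-1 endmember, return the index
    of the time-2 endmember with the smallest spectral angle."""
    return [
        min(range(len(endmembers_t2)),
            key=lambda j: spectral_angle(e1, endmembers_t2[j]))
        for e1 in endmembers_t1
    ]
```

Because the angle depends only on the direction of the spectrum vector, two spectra that differ by a constant gain factor have an angle of zero, which is why SAM is often used when illumination differs between images.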

A Monitoring of Aflatoxins in Commercial Herbs for Food and Medicine (식·약공용 농산물의 아플라톡신 오염 실태 조사)

  • Kim, Sung-dan;Kim, Ae-kyung;Lee, Hyun-kyung;Lee, Sae-ram;Lee, Hee-jin;Ryu, Hoe-jin;Lee, Jung-mi;Yu, In-sil;Jung, Kweon
    • Journal of Food Hygiene and Safety / v.32 no.4 / pp.267-274 / 2017
  • This paper deals with the natural occurrence of total aflatoxins ($B_1$, $B_2$, $G_1$, and $G_2$) in commercial herbs used for both food and medicine. To monitor aflatoxins in such herbs that are not covered by the specifications of the Food Code, a total of 62 samples of 6 different herbs (Bombycis Corpus, Glycyrrhizae Radix et Rhizoma, Menthae Herba, Nelumbinis Semen, Polygalae Radix, Zizyphi Semen) were collected from Yangnyeong market in Seoul, Korea. The samples were treated by the immunoaffinity column clean-up method and quantified by high performance liquid chromatography (HPLC) with on-line post-column photochemical derivatization (PHRED) and fluorescence detection (FLD). The analytical method for aflatoxins was validated for accuracy, precision, and detection limits. The method showed recovery values in the range of 86.9-114.0% and percent coefficients of variation (CV%) in the range of 0.9-9.8%. The limits of detection (LOD) and quantitation (LOQ) in herbs ranged from 0.020 to 0.363 μg/kg and from 0.059 to 1.101 μg/kg, respectively. Of the 62 samples analyzed, 6 Semen (seed) samples were aflatoxin-positive: 2 Nelumbinis Semen and 2 Zizyphi Semen in original form, and 1 Nelumbinis Semen and 1 Zizyphi Semen in powdered form. Aflatoxin $B_1$ or $B_2$ was detected in all positive samples, and aflatoxins $G_1$ and $G_2$ were not detected. The amount of total aflatoxins ($B_1$, $B_2$, $G_1$, and $G_2$) in the powdered and original forms of Nelumbinis Semen and Zizyphi Semen ranged from ND (not detected) to 21.8 μg/kg, which is not currently regulated in Korea. The remaining 56 samples were below the limits of detection and quantitation.
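
The validation figures quoted above (recovery and CV%) follow from simple arithmetic on replicate measurements. The sketch below shows that arithmetic under stated assumptions: the spiking level and replicate values are hypothetical and are not taken from the paper, and the use of the sample standard deviation for CV% is an assumption.

```python
import statistics


def recovery_percent(measured, spiked):
    """Recovery (%) = measured concentration / spiked concentration x 100."""
    return measured / spiked * 100.0


def cv_percent(values):
    """Percent coefficient of variation, using the sample standard deviation."""
    return statistics.stdev(values) / statistics.mean(values) * 100.0


# Hypothetical replicates (ug/kg) for a sample spiked at 5.0 ug/kg.
replicates = [4.6, 4.8, 5.0]
mean_recovery = recovery_percent(statistics.mean(replicates), 5.0)
precision = cv_percent(replicates)
```

With these made-up replicates, the mean recovery and CV% would both fall inside the ranges reported in the abstract, but they illustrate only the calculation, not the study's data.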

Anisotropy Measurement and Fiber Tracking of the White Matter by Using Diffusion Tensor MR Imaging: Influence of the Number of Diffusion-Sensitizing Gradient Direction (확산텐서 MR 영상을 이용한 백질의 비등방성 측정 및 백질섬유 트래킹: 확산경사자장의 방향수가 미치는 영향)

  • Jun, Woo-Sun;Hong, Sung-Woo;Lee, Jong-Sea;Kim, Sung-Hyun;Kim, Jae-Hyoung
    • Investigative Magnetic Resonance Imaging / v.10 no.1 / pp.1-7 / 2006
  • Purpose: Recent developments in diffusion tensor imaging enable evaluation of the microstructural characteristics of the brain white matter. However, optimal imaging parameters for diffusion tensor imaging, particularly the number of diffusion gradient directions, have not yet been studied thoroughly. The purpose of this study was to evaluate the influence of the number of diffusion gradient directions on fiber tracking of the white matter. Materials and methods: Thirteen healthy volunteers (ten men and three women; mean age 30 years, range 23-37 years) were included in this study. Diffusion tensor imaging was performed with 6, 15, and 32 diffusion gradient directions, keeping the other imaging parameters constant. The imaging field ranged from 1 cm below the pons to 2-3 cm above the lateral ventricle, parallel to the anterior commissure-posterior commissure line. Fractional anisotropy (FA) maps were created via image postprocessing, and FA and its standard deviation were then calculated in the genu and the splenium of the corpus callosum on each FA map. Fiber tracking of the corticospinal tract was performed, and the number of reconstructed fibers of the tract was measured. FA, the standard deviation of FA, and the number of reconstructed fibers were compared statistically across the different numbers of diffusion gradient directions. Results: FA did not differ significantly across the different numbers of diffusion gradient directions. As the number of diffusion gradient directions increased, the standard deviation of FA decreased significantly and the number of reconstructed fibers increased significantly. Conclusion: A higher number of diffusion gradient directions provided better fiber tracking quality.
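
For readers unfamiliar with the FA values compared above, the following is a minimal sketch of the standard fractional anisotropy formula computed from the three eigenvalues of a diffusion tensor. It is not the postprocessing software used in the study, and the function name is an assumption.

```python
import math


def fractional_anisotropy(l1, l2, l3):
    """Standard FA from the three diffusion tensor eigenvalues.

    FA = sqrt(3/2) * sqrt(sum((l_i - mean)^2)) / sqrt(sum(l_i^2))
    """
    mean = (l1 + l2 + l3) / 3.0
    numerator = (l1 - mean) ** 2 + (l2 - mean) ** 2 + (l3 - mean) ** 2
    denominator = l1 * l1 + l2 * l2 + l3 * l3
    if denominator == 0:
        return 0.0  # no diffusion signal; treat as isotropic
    return math.sqrt(1.5 * numerator / denominator)
```

FA is 0 for perfectly isotropic diffusion (all eigenvalues equal) and approaches 1 when diffusion occurs along a single axis, as in tightly packed tracts such as the corpus callosum.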


A Semantic Text Model with Wikipedia-based Concept Space (위키피디어 기반 개념 공간을 가지는 시멘틱 텍스트 모델)

  • Kim, Han-Joon;Chang, Jae-Young
    • The Journal of Society for e-Business Studies / v.19 no.3 / pp.107-123 / 2014
  • Current text mining techniques suffer from the problem that conventional text representation models cannot express the semantic or conceptual information in documents written in natural language. Conventional text models, which include the vector space model, the Boolean model, the statistical model, and the tensor space model, represent documents as bags of words. These models express documents only with term literals for indexing and frequency-based weights for the corresponding terms; that is, they ignore the semantic, sequential order, and structural information of terms. Most text mining techniques have been developed on the assumption that the given documents are represented with 'bag-of-words' text models. However, in the big data era, a new paradigm of text representation is required that can analyze huge amounts of textual documents more precisely. Our text model regards the 'concept' as an independent space on an equal footing with the 'term' and 'document' spaces used in the vector space model, and it expresses the relatedness among the three spaces. To develop the concept space, we use Wikipedia data, in which each article defines a single concept. Consequently, a document collection is represented as a 3-order tensor with semantic information, and we call the proposed model the text cuboid model. Through experiments using the popular 20NewsGroup document corpus, we demonstrate the superiority of the proposed text model in terms of document clustering and concept clustering.

The Study on Possibility of Applying Word-Level Word Embedding Model of Literature Related to NOS -Focus on Qualitative Performance Evaluation- (과학의 본성 관련 문헌들의 단어수준 워드임베딩 모델 적용 가능성 탐색 -정성적 성능 평가를 중심으로-)

  • Kim, Hyunguk
    • Journal of Science Education / v.46 no.1 / pp.17-29 / 2022
  • The purpose of this study is to examine qualitatively how efficiently and reasonably a computer can learn themes related to the Nature of Science (NOS). To this end, a corpus was constructed from NOS-related literature (920 abstracts), and the optimal settings for Word2Vec (CBOW and Skip-gram) were identified. The word-level embeddings were then compared across the four dimensions of NOS (Inquiry, Thinking, Knowledge, and STS). Based on previous studies and a preliminary performance evaluation, the CBOW model was configured with 200 dimensions, 5 threads, a minimum frequency of 10, 100 iterations, and a context window of 1. The Skip-gram model was configured with 200 dimensions, 5 threads, a minimum frequency of 10, 200 iterations, and a context window of 3. When the models were applied to the four dimensions of NOS and the types of high-similarity words were compared, Skip-gram performed better in the Inquiry dimension. In the Thinking and Knowledge dimensions, the two models did not differ in embedding performance, but the high-similarity words for each model shared the names of each other's domains, which suggests that additional models are needed for proper learning. In the STS dimension, embedding performance was likewise insufficient to capture comprehensive STS elements, and words related to problem solving were listed excessively. By having a computer learn NOS-related themes, this study is expected to offer general implications for the models available to science education and for the use of artificial intelligence.
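
The two Word2Vec configurations reported above can be written down as keyword-argument dictionaries. This is only a sketch: the abstract does not name the library used, so the parameter names below follow gensim 4.x (`vector_size`, `workers`, `min_count`, `epochs`, `window`, `sg`) as an assumption, and the helper `differing_settings` is hypothetical.

```python
# Settings as reported in the abstract; key names assume gensim 4.x conventions.
CBOW_PARAMS = {
    "sg": 0,             # 0 selects CBOW
    "vector_size": 200,  # embedding dimension
    "workers": 5,        # number of threads
    "min_count": 10,     # minimum word frequency
    "epochs": 100,       # number of repetitions
    "window": 1,         # context range
}

SKIPGRAM_PARAMS = {
    "sg": 1,             # 1 selects Skip-gram
    "vector_size": 200,
    "workers": 5,
    "min_count": 10,
    "epochs": 200,
    "window": 3,
}


def differing_settings(a, b):
    """Return the sorted names of settings whose values differ between two configs."""
    return sorted(key for key in a if a[key] != b.get(key))
```

If gensim is the library in use, each dictionary could be passed as `Word2Vec(sentences, **CBOW_PARAMS)`, where `sentences` is a tokenized corpus supplied by the reader. Apart from the architecture flag, the two configurations differ only in the number of repetitions and the context range.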

The Need for Paradigm Shift in Semantic Similarity and Semantic Relatedness : From Cognitive Semantics Perspective (의미간의 유사도 연구의 패러다임 변화의 필요성-인지 의미론적 관점에서의 고찰)

  • Choi, Youngseok;Park, Jinsoo
    • Journal of Intelligence and Information Systems / v.19 no.1 / pp.111-123 / 2013
  • Measures of semantic similarity and relatedness between two concepts play an important role in research on system integration and database integration. Current research on keyword recommendation and tag clustering also depends strongly on such measures. For this reason, researchers in fields including computer science and computational linguistics have tried to improve methods for calculating semantic similarity and relatedness. The study of similarity between concepts aims to discover how a computational process can model the way a human determines the relationship between two concepts. Most research on calculating semantic similarity uses ready-made reference knowledge, such as a semantic network or a dictionary, to measure concept similarity. The topological method calculates relatedness or similarity between concepts based on various forms of semantic network, including hierarchical taxonomies. This approach assumes that the semantic network reflects human knowledge well. The nodes in a network represent concepts, and ways to measure the conceptual similarity between two nodes are also regarded as ways to determine the conceptual similarity of two words (i.e., two nodes in a network). Topological methods can be categorized as node-based or edge-based, also called the information content approach and the conceptual distance approach, respectively. The node-based approach calculates similarity between concepts based on how much information the two concepts share in terms of a semantic network or taxonomy, while the edge-based approach estimates the distance between the nodes that correspond to the concepts being compared. Both approaches assume that the semantic network is static; that is, the topological approach does not consider changes in the semantic relations between concepts in the network. However, as information and communication technologies make it easier for people to share knowledge, the semantic relations between concepts in a semantic network may change. To explain this change, we adopt cognitive semantics. The basic assumption of cognitive semantics is that humans judge semantic relations based on their cognition and understanding of concepts. This cognition and understanding is called 'world knowledge,' which can be categorized as personal knowledge and cultural knowledge. Personal knowledge is knowledge gained from personal experience, so different people can have different personal knowledge of the same concept. Cultural knowledge is knowledge shared by people who live in the same culture or use the same language; people in the same culture have a common understanding of specific concepts. Cultural knowledge can be the starting point for discussing changes in semantic relations. If the culture shared by people changes for some reason, their cultural knowledge may also change. Today's society and culture are changing at a fast pace, and the change in cultural knowledge is not a negligible issue in research on the semantic relationships between concepts. In this paper, we propose future directions for research on semantic similarity; in other words, we discuss how such research can reflect the changes in semantic relations caused by changes in cultural knowledge. We suggest three directions for future research. First, research should include versioning and update methodologies for semantic networks. Second, dynamically generated semantic networks can be used to calculate semantic similarity between concepts. If researchers can develop a methodology for extracting a semantic network from a given knowledge base in real time, this approach can solve many problems related to changes in semantic relations. Third, a statistical approach based on corpus analysis can be an alternative to methods that use a semantic network. We believe that these proposed research directions can be a milestone for research on semantic relations.
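
The edge-based and node-based approaches described above can be contrasted on a toy taxonomy. The sketch below is illustrative only: the taxonomy and the occurrence counts are hypothetical, the edge-based score uses a simple inverse path length, and the node-based score uses Resnik's information content measure as one well-known instance of the information content approach. The abstract does not commit to these specific formulas.

```python
import math

# Hypothetical taxonomy: child -> parent (None marks the root).
PARENT = {
    "animal": None,
    "mammal": "animal",
    "bird": "animal",
    "dog": "mammal",
    "cat": "mammal",
    "sparrow": "bird",
}

# Hypothetical corpus counts for each concept's own occurrences.
OWN_COUNT = {"dog": 40, "cat": 30, "sparrow": 30}


def ancestors(concept):
    """Return the chain from the concept itself up to the root."""
    chain = []
    while concept is not None:
        chain.append(concept)
        concept = PARENT[concept]
    return chain


def lowest_common_subsumer(a, b):
    """Return the deepest concept that subsumes both a and b."""
    up_a = ancestors(a)
    for node in ancestors(b):
        if node in up_a:
            return node
    raise ValueError("concepts share no ancestor")


def path_length(a, b):
    """Edge-based distance: number of edges between a and b via their LCS."""
    lcs = lowest_common_subsumer(a, b)
    return ancestors(a).index(lcs) + ancestors(b).index(lcs)


def edge_similarity(a, b):
    """Edge-based similarity as inverse path length (one simple choice)."""
    return 1.0 / (1.0 + path_length(a, b))


def information_content(concept):
    """IC = -log p(concept), counting occurrences of the concept and its descendants."""
    total = sum(OWN_COUNT.values())
    cumulative = sum(n for c, n in OWN_COUNT.items() if concept in ancestors(c))
    return -math.log(cumulative / total)


def resnik_similarity(a, b):
    """Node-based similarity: information content of the lowest common subsumer."""
    return information_content(lowest_common_subsumer(a, b))
```

Both scores are computed from a fixed `PARENT` table and fixed counts, which makes the static-network assumption discussed in the abstract concrete: if usage shifts, neither score changes until the taxonomy or the counts are rebuilt.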