• Title/Summary/Keyword: Linguistic processing

Search Result 171, Processing Time 0.022 seconds

A Statistical Prediction Model of Speakers' Intentions in a Goal-Oriented Dialogue (목적지향 대화에서 화자 의도의 통계적 예측 모델)

  • Kim, Dong-Hyun;Kim, Hark-Soo;Seo, Jung-Yun
    • Journal of KIISE:Software and Applications / v.35 no.9 / pp.554-561 / 2008
  • Predicting a user's intention can serve as a post-processing method that reduces the search space of an automatic speech recognizer, and predicting the system's intention can serve as a pre-processing method for generating flexible sentences. To meet these practical needs, we propose a statistical model that predicts speakers' intentions, which are generalized into pairs of a speech act and a concept sequence. Unlike the previous model, which uses simple n-gram statistics of speech acts, the proposed model represents the dialogue history of the current utterance as a feature set spanning several linguistic levels (i.e., n-grams of speech act and concept sequence pairs, clue words, and state information of a domain frame). The model then predicts the intention of the next utterance by using the feature set as input to CRFs (Conditional Random Fields). In an experiment in a schedule management domain, the proposed model achieved a precision of 76.25% in predicting the user's speech act and 64.21% in predicting the user's concept sequence. It also achieved a precision of 88.11% in predicting the system's speech act and 87.19% in predicting the system's concept sequence. In addition, the proposed model showed 29.32% higher average precision than the previous model.
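
The feature set described in this abstract (n-grams of speech act and concept sequence pairs, clue words, and frame state) could be assembled roughly as follows. This is a minimal sketch, not the authors' implementation: the function name `history_features`, the feature-string format, and the example labels are all hypothetical, and the resulting dictionary is only the kind of input a CRF toolkit typically accepts.

```python
from typing import Dict, List, Tuple

# An intention is a (speech act, concept sequence) pair, as in the abstract.
Intention = Tuple[str, str]


def history_features(
    history: List[Intention],
    clue_words: List[str],
    frame_state: Dict[str, bool],
    n: int = 2,
) -> Dict[str, float]:
    """Turn the dialogue history of the current utterance into binary features.

    All feature names below are illustrative assumptions.
    """
    feats: Dict[str, float] = {}
    # n-grams over the most recent (speech act, concept sequence) pairs
    for k in range(1, n + 1):
        if len(history) >= k:
            gram = "|".join(f"{act}/{concept}" for act, concept in history[-k:])
            feats[f"intent_{k}gram={gram}"] = 1.0
    # clue words observed in the current utterance
    for word in clue_words:
        feats[f"clue={word.lower()}"] = 1.0
    # state of the domain frame: which slots are already filled
    for slot, filled in sorted(frame_state.items()):
        feats[f"frame:{slot}={'filled' if filled else 'empty'}"] = 1.0
    return feats
```

One such dictionary per utterance, paired with the intention label of the following utterance, would form a training sequence for a sequence labeler.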

Detection of Gene Interactions based on Syntactic Relations (구문관계에 기반한 유전자 상호작용 인식)

  • Kim, Mi-Young
    • The KIPS Transactions:PartB / v.14B no.5 / pp.383-390 / 2007
  • Interactions between proteins and genes are often considered essential to describing biomolecular phenomena, and networks of interactions are regarded as an entry point for a Systems Biology approach. Recently, many studies have tried to extract information by analyzing biomolecular text with natural language processing technology. Previous research argues that linguistic information helps improve performance in detecting gene interactions. However, previous systems do not perform reasonably well because of low recall. To improve recall without sacrificing precision, this paper proposes a new method for detecting gene interactions based on syntactic relations. Without biomolecular knowledge, our method performs reasonably well using only a small amount of training data. Using the format of the LLL05 (ICML05 Workshop on Learning Language in Logic) data, we detect the agent gene and the target gene that interact with each other. In the first phase, we detect encapsulation types for each agent and target candidate. In the second phase, we construct verb lists that indicate the interaction information between two genes. In the last phase, we learn direction information to determine which of the two genes is the agent and which is the target. In experiments on the LLL05 data, the proposed method achieved an F-measure of 88% on the training data and 70.4% on the test data, significantly outperforming previous methods. We also describe how much each phase contributes to performance, and show that the first phase improves recall while the second and last phases improve precision.

An Intelligent Marking System based on Semantic Kernel and Korean WordNet (의미커널과 한글 워드넷에 기반한 지능형 채점 시스템)

  • Cho Woojin;Oh Jungseok;Lee Jaeyoung;Kim Yu-Seop
    • The KIPS Transactions:PartA / v.12A no.6 s.96 / pp.539-546 / 2005
  • Recently, as the number of Internet users has grown explosively, e-learning has spread widely, as has remote evaluation of intellectual capacity. However, only multiple-choice and/or objective tests have been applied to e-learning, because of the difficulty of natural language processing. To mark short-essay-type answer papers quickly and fairly, this work utilizes heterogeneous linguistic knowledge. First, we construct a semantic kernel from an untagged corpus. Then the answer papers of students and instructors are transformed into vector form. Finally, we evaluate the similarity between the papers using the semantic kernel and decide whether an answer paper is correct based on the similarity values. To construct the semantic kernel, we used latent semantic analysis based on the vector space model. We further try to reduce the problem of information shortage by integrating the Korean WordNet. For the construction of the semantic kernel, we collected 38,727 newspaper articles and extracted 75,175 index terms. In the experiment, a correlation coefficient of about 0.894 was obtained between the marking results of this system and those of human instructors.
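
The marking step described here, comparing a student's answer vector with a model answer through a semantic kernel, can be sketched as a kernel-normalized cosine similarity. This is a minimal illustration under stated assumptions: the three-term vocabulary, the hand-written matrix `TOY_KERNEL` (standing in for a kernel that would be derived by latent semantic analysis), the function names, and the threshold value are all hypothetical.

```python
import math
from typing import List

Vector = List[float]
Matrix = List[List[float]]

# Hypothetical vocabulary: ["car", "automobile", "banana"].
# A real kernel would come from LSA over a corpus; this one simply
# encodes that "car" and "automobile" are semantically close.
TOY_KERNEL: Matrix = [
    [1.0, 0.8, 0.0],
    [0.8, 1.0, 0.0],
    [0.0, 0.0, 1.0],
]


def kernel_inner(x: Vector, kernel: Matrix, y: Vector) -> float:
    """Compute the kernel-weighted inner product x^T K y."""
    return sum(
        x[i] * kernel[i][j] * y[j] for i in range(len(x)) for j in range(len(y))
    )


def kernel_similarity(x: Vector, y: Vector, kernel: Matrix) -> float:
    """Cosine similarity measured in the space induced by the kernel."""
    denom = math.sqrt(kernel_inner(x, kernel, x) * kernel_inner(y, kernel, y))
    return kernel_inner(x, kernel, y) / denom if denom > 0 else 0.0


def is_correct(student: Vector, model: Vector, kernel: Matrix,
               threshold: float = 0.5) -> bool:
    """Accept the answer when its similarity reaches the (assumed) threshold."""
    return kernel_similarity(student, model, kernel) >= threshold
```

With a plain identity matrix as the kernel, an answer that says "car" and a model answer that says "automobile" would have zero similarity; the off-diagonal kernel entries are what let related terms count as a match.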

Effects of Task Training for Cognitive Activation of Stroke Patients on Upper Function and Activities of Daily Living (뇌졸중 환자의 인지활성화를 위한 과제 훈련이 상지기능 및 일상생활동작에 미치는 영향)

  • Kim, Yu-Jeong;Kang, Bo-Ra;Ahn, Si-Nae
    • Journal of Korean Society of Neurocognitive Rehabilitation / v.10 no.2 / pp.27-34 / 2018
  • The purpose of this study is to investigate the effect of task training for cognitive activation of the upper extremity on upper limb function and activities of daily living, and to suggest an intervention for rehabilitation treatment. From January to February 2018, nine stroke patients received the intervention for 30 minutes a day, five days a week, for four weeks. In the experimental group, the therapist gave patients linguistic guidance so that they would use cognitive strategies. The control group performed active range-of-motion exercises with instruments and passive range-of-motion exercises to reduce stiffness of the joints and upper limbs. As muscle strengthening exercises, the patients were assigned to work on the biceps, triceps, and deltoid muscles according to each patient's level of muscular strength. In the experimental group, the MBI improved by up to ten points, K-AMPS motor skills improved by up to 1.0 logit, and processing skills improved by up to 0.6 logits. The MFT improved by up to two points. In the control group, the MBI improved by up to five points, K-AMPS motor skills improved by up to 0.2, and processing skills improved by up to 0.3. The MFT showed no change. We conclude that task training for cognitive activation of stroke patients has a positive effect on upper limb function and activities of daily living.

A MVC Framework for Visualizing Text Data (텍스트 데이터 시각화를 위한 MVC 프레임워크)

  • Choi, Kwang Sun;Jeong, Kyo Sung;Kim, Soo Dong
    • Journal of Intelligence and Information Systems / v.20 no.2 / pp.39-58 / 2014
  • As big data and related technologies grow in importance in industry, visualizing the results of processing and analyzing big data has come to the fore. Visualization helps people understand analysis results effectively and clearly. It also serves as the GUI (Graphical User Interface) that supports communication between people and analysis systems. To ease development and maintenance, these GUI parts should be loosely coupled to the parts that process and analyze data. Implementing such a loosely coupled architecture calls for design patterns such as MVC (Model-View-Controller), which is designed to minimize coupling between the UI part and the data processing part. Big data can be classified into structured and unstructured data, and structured data is relatively easy to visualize compared with unstructured data. Nevertheless, as the use and analysis of unstructured data has spread, people usually develop a visualization system for each individual project to overcome the limitations of traditional visualization systems built for structured data. Text data, which makes up a large part of unstructured data, is even harder to visualize. The difficulty stems from the complexity of the technologies for analyzing text data, such as linguistic analysis, text mining, and social network analysis, and these technologies are not standardized. This makes it harder to reuse one project's visualization system in other projects. We assume the cause is the lack of a common design for visualization systems that takes extension to other systems into account. In this research, we suggest a common information model for visualizing text data and propose a comprehensive and reusable framework, TexVizu, for visualizing text data. First, we survey representative studies in the text visualization area. We then identify common elements of text visualization and common patterns across its various cases, and review and analyze these elements and patterns from three viewpoints: structural, interactive, and semantic. Next, we design an integrated model of text data that represents the elements for visualization. The structural viewpoint identifies structural elements of text documents, such as title, author, and body. The interactive viewpoint identifies the types of relations and interactions between text documents, such as post, comment, and reply. The semantic viewpoint identifies semantic elements that are extracted by analyzing text data linguistically and represented as tags classifying entity types such as people, place or location, time, and event. After that, we extract and select common requirements for visualizing text data. The requirements fall into four types: structure information, content information, relation information, and trend information. Each type of requirement comprises the required visualization techniques, data, and goal (what to know). These common, key requirements drive the design of a framework that keeps a visualization system loosely coupled to the data processing or analysis system. Finally, we design a common text visualization framework, TexVizu, which is reusable and extensible across visualization projects by collaborating with various Text Data Loaders and Analytical Text Data Visualizers through common interfaces such as ITextDataLoader and IATDProvider. TexVizu also comprises the Analytical Text Data Model, Analytical Text Data Storage, and Analytical Text Data Controller. In this framework, external components are the specifications of the required interfaces for collaborating with the framework. As an experiment, we apply the framework to two text visualization systems: a social opinion mining system and an online news analysis system.
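
The loose coupling this abstract describes could look roughly like the sketch below. Only the names ITextDataLoader, IATDProvider, and the Analytical Text Data Model, Storage, and Controller come from the abstract; every method signature, the `InMemoryLoader` and `AuthorCountView` classes, and the choice to have the controller implement IATDProvider are assumptions made for illustration, not the framework's actual API.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class AnalyticalTextData:
    """Model: structural elements plus semantic tags (fields are assumed)."""
    title: str
    author: str
    body: str
    tags: Dict[str, List[str]] = field(default_factory=dict)


class ITextDataLoader(ABC):
    """Interface an external loader implements (method name is hypothetical)."""
    @abstractmethod
    def load(self) -> List[AnalyticalTextData]: ...


class IATDProvider(ABC):
    """Interface a visualizer reads from (method name is hypothetical)."""
    @abstractmethod
    def documents_by_author(self, author: str) -> List[AnalyticalTextData]: ...


class AnalyticalTextDataStorage:
    """Storage: keeps loaded documents in memory for this sketch."""
    def __init__(self) -> None:
        self._docs: List[AnalyticalTextData] = []

    def add_all(self, docs: List[AnalyticalTextData]) -> None:
        self._docs.extend(docs)

    def all(self) -> List[AnalyticalTextData]:
        return list(self._docs)


class AnalyticalTextDataController(IATDProvider):
    """Controller: pulls data from a loader and serves it to views."""
    def __init__(self, loader: ITextDataLoader,
                 storage: AnalyticalTextDataStorage) -> None:
        self._loader = loader
        self._storage = storage

    def refresh(self) -> None:
        self._storage.add_all(self._loader.load())

    def documents_by_author(self, author: str) -> List[AnalyticalTextData]:
        return [d for d in self._storage.all() if d.author == author]


class InMemoryLoader(ITextDataLoader):
    """Hypothetical loader standing in for a project-specific data source."""
    def __init__(self, docs: List[AnalyticalTextData]) -> None:
        self._docs = docs

    def load(self) -> List[AnalyticalTextData]:
        return list(self._docs)


class AuthorCountView:
    """View: depends only on IATDProvider, never on the loader or storage."""
    def __init__(self, provider: IATDProvider) -> None:
        self._provider = provider

    def render(self, author: str) -> str:
        count = len(self._provider.documents_by_author(author))
        return f"{author}: {count} document(s)"
```

Because the view sees only the provider interface, a different loader or a different visualizer can be swapped in without touching the other side, which is the reuse property the abstract argues for.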

A Study on the Development of Text Communication System based on AIS and ECDIS for Safe Navigation (항해안전을 위한 AIS와 ECDIS 기반의 문자통신시스템 개발에 관한 연구)

  • Ahn, Young-Joong;Kang, Suk-Young;Lee, Yun-Sok
    • Journal of the Korean Society of Marine Environment & Safety / v.21 no.4 / pp.403-408 / 2015
  • A text-based communication system has been developed as a complement to voice communication, with the communication function on AIS and the display and input functions on ECDIS. It is free of linguistic errors and is not affected by VHF usage restrictions or noise. The system is designed to use messages that convey intentions clearly, and it further improves user convenience by offering various UIs through software. It works without additional hardware installation or modification, can transmit a sentence by selection alone through the Message Banner Interface without keyboard input, and has the further advantage of faster processing through its own message coding and decoding. It is judged to be the most useful alternative for reducing users' language limitations and recognition errors and for solving the various problems of voice communication over VHF. In addition, it will help prevent collisions between ships in heavy-traffic areas through reduced VHF use, accurate communication, and text-based requests for cooperation.

An Analysis of the Results of a Mathematics Diagnostic Test taken by Multicultural Koreans in their First or Second Year of Elementary School (다문화가정 학생 대상 언어.인지 진단도구 적용 결과 분석 - 초등학교 1.2학년 수학 -)

  • Cho, Young-Mi;Lee, Og-Young
    • Journal of Educational Research in Mathematics / v.20 no.2 / pp.103-119 / 2010
  • This study aims to identify the characteristics of the mathematical ability of multicultural Korean elementary school learners. To this end, it analyzes the results of a mathematics diagnostic test given to multicultural Korean first- and second-year elementary school students. The findings support three main points. First, regardless of whether the students were multicultural, more second-year students than first-year students had difficulty understanding mathematics. Specifically, a higher percentage of second-year students than first-year students fell below the reference point (cut-off point), and this pattern among Korean students overall was consistent with that among multicultural Koreans. Second, across the sub-fields of mathematics, a higher proportion of students fell below the cut-off point in the 'numbers and arithmetic' area than in 'measure and geometry,' and this pattern again held for the multicultural students. Third, the results imply that, in addition to mathematically more complex questions, linguistically complex sentences contributed to the difficulty of the test items. We suggest that care be taken to support linguistic processing and to use well-defined terms.

ISAAC : An Integrated System with User Interface for Sentence Analysis (ISAAC :문장분석용 통합시스템 및 사용자 인터페이스)

  • Kim, Gon;Kim, Min-Chan;Bae, Jae-Hak;Lee, Jong-Hyuk
    • The KIPS Transactions:PartB / v.11B no.1 / pp.107-116 / 2004
  • This paper introduces ISAAC (An Interface for Sentence Analysis & Abstraction with Cogitation), which provides an integrated user interface for sentence analysis. ISAAC integrates the various linguistic tools and resources necessary for sentence analysis. Most tools and resources for sentence analysis have been developed and accumulated independently, so when analyzing sentences with them, it is difficult for the sentence analyst to manage and control the information produced at each step. We have therefore integrated the usable tools and resources and built ISAAC to provide a consistent, user-oriented interface to each function. We were able to divide the sentence analysis process into 14 steps. In ISAAC, these steps are handled by four individual modules: (1) syntactic analysis of the sentence, (2) retrieval of a root word, (3) searching category information in Roget's Thesaurus, and (4) searching category information in OfN (Ontology for Narratives). Consequently, with ISAAC, a process of 14 steps in total is reduced to 4 steps. This means that ISAAC can improve the sentence analyst's performance by a factor of 3.5 or more. Furthermore, because ISAAC takes over the tedious transcription needed at each step, we expect it to help the analyst maintain the accuracy of sentence analysis.

Predictability effects on speech perception in noise (SPIN) in Korean (한국어 소음속말인지에 나타나는 예측성 효과)

  • Lee, Sun-Young
    • Korean Journal of Cognitive Science / v.27 no.1 / pp.129-157 / 2016
  • This study investigates speech perception in noise (SPIN) in Korean. A new type of Korean SPIN test was developed by adopting a format similar to the English SPIN test. Predictability effects, noise effects, and their interactions were examined in order to verify previous findings based on English. Data collected from 14 Korean adults with this new Korean SPIN test confirmed the previous findings. First, the participants' overall performance was better in low-noise conditions than in high-noise conditions. Second, highly predictable words tended to be perceived more accurately than less predictable words, especially in high-noise conditions. The results were interpreted as showing that listeners actively used both acoustic and contextual information in speech perception. When the acoustic properties of the speech sound were degraded by noise, listeners took advantage of linguistic contextual information in processing the speech sound. The findings of this study conform to those of previous studies based on the English SPIN test. In addition, a possible effect of target word frequency was found, calling for further investigation in this field of research in Korean. Implications of the results are also discussed. (Cyber Hankuk University of Foreign Studies)

A Study on Shot Segmentation and Indexing of Language Education Videos by Content-based Visual Feature Analysis (교육용 어학 영상의 내용 기반 특징 분석에 의한 샷 구분 및 색인에 대한 연구)

  • Han, Heejun
    • Journal of the Korean Society for Information Management / v.34 no.1 / pp.219-239 / 2017
  • As information technology develops rapidly and smart devices become more widespread among individuals, video has become an especially common medium of information transmission among audiovisual materials. Video has become an indispensable element of information service content, and it is used in various ways, such as one-way delivery through TV, interactive services over the Internet, and lending from audiovisual libraries. In the Internet environment in particular, information providers try to reduce the effort and cost of processing the information they provide for video services on smart devices. In addition, users want to use only the parts they need because of the burden of excessive network usage and constraints on time and space. It is therefore necessary to enhance the usability of video by automatically classifying, summarizing, and indexing similar parts of the content. In this paper, we propose a method that automatically segments the shots making up a video by analyzing the content and characteristics of language education videos, and indexes the detailed content information of the language videos by combining visual features. The accuracy of the semantic-based shot segmentation is high, and the method can be applied effectively to summary services for language education videos.