• Title/Summary/Keyword: morphemes

Implementation of User Recommendation System based on Video Contents Story Analysis and Viewing Pattern Analysis (영상 스토리 분석과 시청 패턴 분석 기반의 추천 시스템 구현)

  • Lee, Hyoun-Sup;Kim, Minyoung;Lee, Ji-Hoon;Kim, Jin-Deog
    • Journal of the Korea Institute of Information and Communication Engineering / v.24 no.12 / pp.1567-1573 / 2020
  • The development of Internet technology has brought about the era of one-person media. Individuals produce content on their own and upload it to online services, and many users watch that content on Internet-connected devices. Currently, most users find and watch the content they want through the search functions provided by existing online services. These functions rely on information entered by the user who uploaded the content. When content must be retrieved from such limited word data, the search results include information the user does not want. To solve this problem, this paper presents a system that actively analyzes the videos in an online service and extracts and reflects the characteristics of each video. Morphemes are extracted from the story content, based on the voice data of a video, and analyzed with big data technology.

Knowledge Structure of Chronic Obstructive Pulmonary Disease Health Information on Health-Related Websites and Patients' Needs in the Literature Using Text Network Analysis (웹사이트에 제공된 만성폐쇄성폐질환 건강정보와 연구문헌에 나타난 환자의 건강정보 요구의 지식구조: 텍스트 네트워크 분석 활용)

  • Choi, Ja Yun;Lim, Su Yeon;Yun, So Young
    • Journal of Korean Academy of Nursing / v.51 no.6 / pp.720-731 / 2021
  • Purpose: The purpose of this study was to identify the knowledge structure of health information (HI) for chronic obstructive pulmonary disease (COPD). Methods: Keywords or meaningful morphemes from HI presented on five health-related websites (HRWs) of one national HI institute and four hospitals, as well as HI needs among patients reported in nine articles from the literature, were reviewed, refined, and analyzed using text network analysis, and their co-occurrence matrix was generated. Two networks of 61 and 35 keywords, respectively, were analyzed for degree, closeness, and betweenness centrality, as well as betweenness community analysis. Results: The most common keywords pertaining to HI on HRWs were lung, inhaler, smoking, dyspnea, and infection, focusing on COPD treatment. In contrast, HI needs among patients were lung, medication, support, symptom, and smoking cessation, expanding to disease management. Two common sub-topic groups in HI on HRWs were COPD overview and medication administration, whereas three common sub-topic groups in HI needs among patients in the literature were COPD overview, self-management, and emotional management. Conclusion: The knowledge structure of HI on HRWs is medically oriented, while patients need supportive information. Thus, the support system for self-management and emotional management on HRWs must be informed by the structure of patients' needs for HI. Healthcare providers should consider presenting COPD patient-centered information on HRWs.
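The co-occurrence matrix and degree centrality mentioned in the abstract above can be illustrated with a small sketch. It is not the authors' procedure: the function names `cooccurrence` and `degree_centrality` are hypothetical, and the three keyword lists are made-up examples that only reuse keywords named in the abstract.

```python
from collections import Counter
from itertools import combinations

def cooccurrence(docs):
    """Count how often each unordered keyword pair appears in the same document."""
    counts = Counter()
    for keywords in docs:
        # sorted(set(...)) removes duplicates and gives each pair one canonical order
        for a, b in combinations(sorted(set(keywords)), 2):
            counts[(a, b)] += 1
    return counts

def degree_centrality(counts):
    """Degree centrality: number of distinct neighbours divided by (n - 1)."""
    neighbors = {}
    for a, b in counts:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)
    n = len(neighbors)
    return {k: len(v) / (n - 1) for k, v in neighbors.items()}

# Hypothetical keyword lists, one per document (assumption, not study data)
docs = [
    ["lung", "inhaler", "smoking"],
    ["lung", "dyspnea"],
    ["lung", "infection", "inhaler"],
]
counts = cooccurrence(docs)
centrality = degree_centrality(counts)
print(counts[("inhaler", "lung")])
print(centrality["lung"], centrality["dyspnea"])
```

In this toy network "lung" co-occurs with every other keyword, so it receives the maximum degree centrality. Closeness and betweenness centrality, which the abstract also reports, need shortest-path computations and are left out of this sketch.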

Korean Head-Tail Tokenization and Part-of-Speech Tagging by using Deep Learning (딥러닝을 이용한 한국어 Head-Tail 토큰화 기법과 품사 태깅)

  • Kim, Jungmin;Kang, Seungshik;Kim, Hyeokman
    • IEMEK Journal of Embedded Systems and Applications / v.17 no.4 / pp.199-208 / 2022
  • Korean is an agglutinative language in which one or more morphemes combine to form a single word. A part-of-speech tagging method separates each morpheme from a word and attaches a part-of-speech tag. In this study, we propose a new Korean part-of-speech tagging method based on the Head-Tail tokenization technique, which divides a word into a lexical morpheme part and a grammatical morpheme part without decomposing compound words. In this method, the Head and Tail are divided at a syllable boundary without restoring irregular deformations or abbreviated syllables. A Korean part-of-speech tagger was implemented using Head-Tail tokenization and deep learning. Because segmented tags generate a large number of complex tags and lower the tagging accuracy, we reduced the tag set to complex tags composed of large classification tags, which improved the tagging accuracy. The performance of the Head-Tail part-of-speech tagger was evaluated using BERT, syllable bigram, and subword bigram embeddings, and both the syllable bigram and subword bigram embeddings improved performance compared to general BERT. Part-of-speech tagging was performed by integrating the Head-Tail tokenization model and the simplified part-of-speech tagging model, achieving 98.99% word-unit accuracy and 99.08% token-unit accuracy. The experiment also showed that part-of-speech tagging performance improved when the maximum token length was limited to twice the number of words.

Performance Evaluation of Video Recommendation System with Rich Metadata (풍부한 메타데이터를 가진 동영상 추천 시스템의 성능 평가)

  • Min Hwa Cho;Da Yeon Kim;Hwa Rang Lee;Ha Neul Oh;Sun Young Lee;In Hwan Jung;Jae Moon Lee;Kitae Hwang
    • The Journal of the Institute of Internet, Broadcasting and Communication / v.23 no.2 / pp.29-35 / 2023
  • This paper makes it possible to search videos by sentence, improving on previous research that automatically generates rich metadata from videos and searches videos by keyword. For search by sentence, morphemes are analyzed for each sentence, keywords are extracted, weights are assigned to each keyword, and videos are recommended by applying a ranking algorithm developed in the previous research. Evaluating the performance of video search in this paper requires a sufficient number of videos and a sufficient number of user experiences. However, because these are currently insufficient, three indirect evaluation methods were used: evaluation of overall user satisfaction, comparison of recommendation scores and user satisfaction, and evaluation of user satisfaction by video category. The performance evaluation showed that the rich metadata construction and video recommendation implemented in this paper give users high search satisfaction.

Analysis of the Knowledge Structure of Research related to Reality Shock Experienced by New Graduate Nurses using Text Network Analysis (텍스트네트워크분석을 활용한 신규간호사가 경험하는 현실충격 관련 연구의 지식구조 분석)

  • Heejang Yun
    • The Journal of the Convergence on Culture Technology / v.9 no.1 / pp.463-469 / 2023
  • The aim of this study is to provide basic data that can contribute to improving successful clinical adaptation and reducing turnover of new graduate nurses by analyzing research related to reality shock experienced by new graduate nurses using text network analysis. The topics of reality shock experienced by new graduate nurses were extracted from 115 papers published in domestic and foreign journals from January 2002 to December 2021. Articles were retrieved from 6 databases (Korean DBs: DBpia, KISS, RISS; international DBs: Web of Science, Springer, Scopus). Keywords were extracted from the abstracts and organized using semantic morphemes. Network analysis and topic modeling for subject knowledge structure analysis were performed using the NetMiner 4.5.0 program. The core keywords included 'new graduate nurses', 'reality shock', 'transition', 'student nurse', 'experience', 'practice', 'work environment', 'role', 'care' and 'education'. In recent articles on reality shock experienced by new graduate nurses, three major topics were extracted using LDA (Latent Dirichlet Allocation): 'turnover', 'work environment', and 'experience of transition'. Based on this research, we suggest the need for interventional research that can effectively reduce the reality shock experienced by new graduate nurses and successfully support their clinical adaptation.

A study on research trends for pregnancy in adolescence: Focusing on text network analysis and topic modeling (청소년 임신에 대한 연구 동향 분석: 텍스트 네트워크 분석과 토픽 모델링)

  • Park, Seungmi;Kwak, Eunju;Park, Hye Ok;Hong, Jung Eun
    • The Journal of Korean Academic Society of Nursing Education / v.30 no.2 / pp.149-159 / 2024
  • Purpose: The aim of this study was to identify core keywords and topic groups in the "adolescent pregnancy" field of research for a better understanding of research trends in the past 10 years. Methods: Topics related to adolescent pregnancy were extracted from 3,819 articles that were published in journals between January 2013 and July 2023. Abstracts were retrieved from five databases (MEDLINE, CINAHL, Embase, RISS, and KISS). Keywords were extracted from the abstracts and cleaned using semantic morphemes. Text network analysis and topic modeling were performed using NetMiner 4.3.3. Results: The most important keywords were "health," "woman," "risk," "group," "girl," "school," "service," "family," "program," and "contraception." Five topic groups were identified through topic modeling: "health service," "community program for school girls," "risks for adult women," "relationship risks," and "sexual contraceptive knowledge." Conclusion: This study utilized text network analysis and topic modeling to analyze keywords from abstracts of research conducted over the past decade on adolescent pregnancy. Given that adolescent pregnancy leads to physical, mental, social, and economic issues, it is imperative to provide integrated intervention programs, including prenatal/postnatal care, psychological services, proper contraception methods, and sex education, through school and community partnerships, as well as related research studies. Nurses can play a vital role by actively engaging in prevention efforts and directly supporting and educating socially disadvantaged adolescent mothers, which could significantly contribute to improving their quality of life.

A Study on the Culture Transformation about "Takyung-Takjok" in Traditional Landscape Ruins (탁영·탁족의 문화 변용을 통해 본 정원유구)

  • Rho, Jae-Hyun;Suh, Hyo-Suk;Choi, Jong-Hee;Han, Sang-Yub
    • Journal of the Korean Institute of Traditional Landscape Architecture / v.34 no.1 / pp.97-106 / 2016
  • This study suggests the need for landscaping alternatives that carry on Takjok(濯足) culture by considering the background and meaning of the cultural phenomenon of Takjok, as a front morpheme, shown in old literature, paintings, and ruins of landscape architecture. The results are as follows. 1. An old idiom, 'Takyung Takjok(濯纓濯足)', implying an attitude of detachment from the mundane world and of compliance with nature, has been sublimated into 'Takjokjiyu(濯足之遊)', which means living in comfortable retirement through a life in seclusion(隱逸). 2. The meaning of Takjok has continued to expand, not only into Takyung Takjok but also into Takcheong(濯淸), Tako(濯吾), and Taksa(濯斯). The spaces in which the meaning of Takyung Takjok is implied have also newly expanded into artificial spaces, including Jeong(亭-pavilion), Jae(齋-house), Heon(軒-eaves), and Ji(池-pond), as well as natural spaces, including Am(巖-rock), Dae(臺-flat foundation), Dam(潭-deep pond), Ban(盤-dish rock), Seok(石-stone), So(沼-shallow pond), San(山-mountain), Bong(峰-peak), and Cheon(泉-water hole). 3. As seen here, the cultural phenomenon of Takyung Takjok, which has derived from the Dangho(堂號) of buildings, the names of natural objects in Palgyung and Gugok (eight sceneries and nine curves), the facilities of Byeolseo gardens and Seowon, and the Amgakseo in nature, is worth noting. 4. It should be considered that Takjok includes ordinary people's wisdom for enduring hot weather as well as classical scholars' ideals and the veneration of antiquity. From this perspective, water spaces, Takjok rocks, and the use of water based on environmental supportability deserve renewed attention as recreational spaces, and they remind us that the spirit of Takjok is a classical method of mental healing.

Korean Sentence Generation Using Phoneme-Level LSTM Language Model (한국어 음소 단위 LSTM 언어모델을 이용한 문장 생성)

  • Ahn, SungMahn;Chung, Yeojin;Lee, Jaejoon;Yang, Jiheon
    • Journal of Intelligence and Information Systems / v.23 no.2 / pp.71-88 / 2017
  • Language models were originally developed for speech recognition and language processing. Using a set of example sentences, a language model predicts the next word or character based on sequential input data. N-gram models have been widely used, but they cannot model the correlation between input units efficiently because they are probabilistic models based on the frequency of each unit in the training set. Recently, with the development of deep learning algorithms, recurrent neural network (RNN) and long short-term memory (LSTM) models have been widely used for neural language models (Ahn, 2016; Kim et al., 2016; Lee et al., 2016). These models can reflect dependencies between the objects that are entered sequentially into the model (Gers and Schmidhuber, 2001; Mikolov et al., 2010; Sundermeyer et al., 2012). To train a neural language model, texts need to be decomposed into words or morphemes. However, because a training set of sentences generally includes a huge number of words or morphemes, the dictionary is very large, which increases model complexity. In addition, word-level or morpheme-level models can generate only the vocabulary contained in the training set. Furthermore, for morphologically rich languages such as Turkish, Hungarian, Russian, Finnish, or Korean, morpheme analyzers are more likely to make errors in the decomposition process (Lankinen et al., 2016). Therefore, this paper proposes a phoneme-level language model for Korean based on LSTM models. A phoneme, such as a vowel or a consonant, is the smallest unit that makes up Korean text. We construct the language model using three or four LSTM layers. Each model was trained using the stochastic gradient algorithm and more advanced optimization algorithms such as Adagrad, RMSprop, Adadelta, Adam, Adamax, and Nadam. A simulation study was conducted on Old Testament texts using the deep learning package Keras, based on Theano.
After pre-processing the texts, the dataset included 74 unique characters, including vowels, consonants, and punctuation marks. We then constructed input vectors of 20 consecutive characters, each paired with the following 21st character as the output. In total, 1,023,411 input-output pairs were included in the dataset, and we divided them into training, validation, and test sets in the proportion 70:15:15. All simulations were conducted on a system equipped with an Intel Xeon CPU (16 cores) and an NVIDIA GeForce GTX 1080 GPU. We compared the loss function evaluated on the validation set, the perplexity evaluated on the test set, and the time taken to train each model. As a result, all the optimization algorithms except the stochastic gradient algorithm showed similar validation loss and perplexity, which were clearly superior to those of the stochastic gradient algorithm. The stochastic gradient algorithm also took the longest to train for both the 3- and 4-LSTM-layer models. On average, the 4-LSTM-layer model took 69% longer to train than the 3-LSTM-layer model. However, its validation loss and perplexity did not improve significantly and even became worse under specific conditions. On the other hand, when comparing the automatically generated sentences, the 4-LSTM-layer model tended to generate sentences closer to natural language than the 3-LSTM-layer model. Although there were slight differences in the completeness of the generated sentences between the models, the sentence generation performance was quite satisfactory under all simulation conditions: the models generated only legitimate Korean letters, and the use of postpositions and the conjugation of verbs were almost perfect grammatically. The results of this study are expected to be widely used for processing Korean in the fields of language processing and speech recognition, which are the basis of artificial intelligence systems.
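The data preparation described in the abstract above (phoneme-level text, 20 consecutive characters as input, the 21st as the target, then a 70:15:15 split) can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: the function names are hypothetical, Unicode NFD decomposition stands in for whatever phoneme split the paper used, the split is done in order because the abstract does not say whether the pairs were shuffled, and a window of 3 replaces 20 so the toy input stays short.

```python
import unicodedata

def to_phonemes(text):
    """Decompose Hangul syllables into consonant and vowel letters (jamo).

    Assumption: Unicode NFD is used here only as a stand-in for the
    paper's own phoneme decomposition, which the abstract does not detail.
    """
    return unicodedata.normalize("NFD", text)

def make_windows(chars, window=20):
    """Pair each run of `window` characters with the character that follows it."""
    return [(chars[i:i + window], chars[i + window])
            for i in range(len(chars) - window)]

def split_dataset(pairs, ratios=(0.70, 0.15, 0.15)):
    """Split pairs in order into training, validation, and test sets."""
    n_train = int(len(pairs) * ratios[0])
    n_val = int(len(pairs) * ratios[1])
    return (pairs[:n_train],
            pairs[n_train:n_train + n_val],
            pairs[n_train + n_val:])

phonemes = to_phonemes("한국어")           # three syllables become eight jamo
pairs = make_windows(phonemes, window=3)   # window=3 only for this toy input
train, val, test = split_dataset(make_windows("abcdefghij" * 10, window=20))
print(len(phonemes), len(pairs), len(train), len(val), len(test))
```

Working at this level keeps the symbol inventory small (the abstract reports 74 unique characters), which is the motivation the abstract gives for avoiding word-level or morpheme-level dictionaries.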

The Development of Characters with Artificial Emotion through Analyzing Drama characters - With a Korean Drama titled 'The Sons of Sol Pharmacy House' (드라마 대본 분석을 통한 등장인물의 성격이 반영된 인공정서 캐릭터 개발 - '솔약국집 아들들'을 중심으로)

  • Ham, Jun-Seok;Rhee, Shin-Young;Bang, Green;Ko, Il-Ju
    • Science of Emotion and Sensibility / v.15 no.2 / pp.239-248 / 2012
  • This paper aims to extract personality traits of drama characters from a drama script and to apply them to a character that has an artificial emotion. The method of applying the personality of a character from a drama script is as follows. First, we separate the drama script into several pieces, one for each character therein. Next, we extract emotion-related terms by matching the results of morpheme analysis against an emotion terms database. Then, we analyze the dominant emotion using the extracted emotion terms. Finally, we apply the analyzed dominant emotion to an equation for artificial emotion. We conducted a user evaluation featuring blind testing to verify that the artificial emotion character bears the personality of a drama character. We applied three drama character personalities to artificial emotion characters with the same appearance. The users had to match the three artificial emotion characters to the drama characters according to personality. The users had a high percentage of correct answers, confirming the efficacy of our method of applying a personality using information from a drama script.

A Morphological Analysis Method of Predicting Place-Event Performance by Online News Titles (온라인 뉴스 제목 분석을 통한 특정 장소 이벤트 성과 예측을 위한 형태소 분석 방법)

  • Choi, Sukjae;Lee, Jaewoong;Kwon, Ohbyung
    • The Journal of Society for e-Business Studies / v.21 no.1 / pp.15-32 / 2016
  • Online news on the Internet, as published open data, contains facts or opinions about a specific affair and hence considerably influences the decisions of members of the general public who are interested in a particular issue. Therefore, we can predict people's choices related to an issue by analyzing a large number of related Internet news articles. This study proposes a text analysis method to predict the outcomes of events that take place in a specific place. We used the titles of the news articles because the titles contain more essential text than the article bodies. Moreover, in the mobile environment, people tend to rely more on news titles before clicking through to the articles. We collected the titles of news articles and divided them into learning and evaluation data sets. Morphemes are extracted and their polarity values are identified with the learning data. Then we analyzed the sensitivity of the entire articles. As a result, the prediction success rate was 70.6%, a clear difference from the other analytical methods used for comparison. The derived prediction information will be helpful in determining the expected demand for goods when preparing an event.
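The step "morphemes are extracted and their polarity values are identified with the learning data" in the abstract above can be illustrated with a small sketch. The abstract does not give the actual scoring formula, so everything here is an assumption: polarity is taken as the mean outcome label of the titles containing a token, pre-split English tokens stand in for Korean morphemes, and the function names and sample titles are hypothetical.

```python
from collections import defaultdict

def learn_polarity(labeled_titles):
    """Estimate a polarity per token from (tokens, label) pairs.

    label is +1 for a successful event and -1 for an unsuccessful one;
    a token's polarity is the mean label of the titles that contain it.
    """
    totals = defaultdict(int)
    counts = defaultdict(int)
    for tokens, label in labeled_titles:
        for tok in set(tokens):          # count each token once per title
            totals[tok] += label
            counts[tok] += 1
    return {tok: totals[tok] / counts[tok] for tok in counts}

def predict(tokens, polarity):
    """Predict +1 or -1 from the summed polarity; unseen tokens count as 0."""
    score = sum(polarity.get(tok, 0.0) for tok in tokens)
    return 1 if score >= 0 else -1

# Hypothetical learning data: tokenized titles with event outcomes
learning_data = [
    (["festival", "crowd", "record"], 1),
    (["festival", "rain", "cancelled"], -1),
    (["crowd", "record", "sales"], 1),
]
polarity = learn_polarity(learning_data)
print(polarity["festival"], polarity["rain"])
print(predict(["festival", "rain"], polarity))
print(predict(["record", "crowd"], polarity))
```

A token such as "festival" that appears equally often in successful and unsuccessful titles ends up neutral, so the prediction is driven by the more distinctive tokens.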