• Title/Summary/Keyword: Language Processing


An LSTM Method for Natural Pronunciation Expression of Foreign Words in Sentences (문장에 포함된 외국어의 자연스러운 발음 표현을 위한 LSTM 방법)

  • Kim, Sungdon;Jung, Jaehee
    • KIPS Transactions on Software and Data Engineering / v.8 no.4 / pp.163-170 / 2019
  • Korean has postpositions such as eul, reul, yi, ga, wa, and gwa, which attach to nouns and add meaning to the sentence. When a sentence contains foreign-language notation or abbreviations, the postposition that matches the pronunciation of the foreign word may not be used. Sometimes both postpositions are written, with one in parentheses as in "eul(reul)", so that either reading is acceptable. This study collects examples of unnatural postpositions following foreign words in Korean sentences and proposes a method that selects the natural postposition by learning the final-consonant pronunciation of nouns. The proposed method uses a recurrent neural network model to naturally express the postpositions attached to foreign words, and it is validated through training and testing. In the future, it should be useful for composing well-formed sentences in machine translation by supplying natural postpositions for English abbreviations or new foreign words in Korean sentences.
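The dependence of the postposition on the noun's final consonant can be illustrated with a rule-based sketch for nouns already written in Hangul. This is not the authors' LSTM model; it only shows the rule that the model has to learn for foreign words. The function names and the romanized keys are assumptions made for this example. The rule itself relies on the Unicode layout of Hangul syllables, where each run of 28 code points starting at U+AC00 shares an initial consonant and vowel, and offset 0 means there is no final consonant.

```python
# Hangul syllables occupy U+AC00..U+D7A3 in blocks of 28 code points;
# offset 0 within a block means the syllable has no final consonant.
HANGUL_BASE = 0xAC00
HANGUL_LAST = 0xD7A3

# Hypothetical lookup table for this sketch:
# (form used after a final consonant, form used after a vowel).
POSTPOSITIONS = {
    "eul/reul": ("을", "를"),
    "yi/ga": ("이", "가"),
    "gwa/wa": ("과", "와"),
}


def has_final_consonant(word: str) -> bool:
    """Return True if the last Hangul syllable of `word` ends in a consonant."""
    last = ord(word[-1])
    if not HANGUL_BASE <= last <= HANGUL_LAST:
        # Latin-script nouns such as abbreviations cannot be decided by this rule.
        raise ValueError("last character is not a Hangul syllable")
    return (last - HANGUL_BASE) % 28 != 0


def attach_postposition(noun: str, pair: str) -> str:
    """Append the form of the postposition pair that fits the noun's ending."""
    after_consonant, after_vowel = POSTPOSITIONS[pair]
    chosen = after_consonant if has_final_consonant(noun) else after_vowel
    return noun + chosen
```

For a noun written in Latin script the function raises `ValueError`, because the spelling alone does not reveal how the word is pronounced. That is the case the abstract's learned model is meant to cover.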

Telemedicine Software Application

  • UNGUREANU, Ovidiu Costica;POPESCU, Marius-Constantin;CIOBANU, Daniela;UNGUREANU, Elena;SARLA, Calin Gabriel;CIOBANU, Alina-Elena;TODINCA, Paul
    • International Journal of Computer Science & Network Security / v.21 no.2 / pp.171-180 / 2021
  • Hospitals and medical practices currently hold a large amount of unstructured information, gathered over time at each ward or practice by physicians in a wide range of medical branches. The data must be processed to extract relevant information that can be used to improve the medical system. It is useful for a physician to have access to a patient's entire medical history in an emergency, since it may contain relevant information such as allergies to various medications, personal history, or hereditary collateral conditions. If the information exists in structured form, detecting diseases from specific symptoms is much easier, faster, and more accurate. Physicians may thus investigate particular pathological profiles and conduct cohort clinical trials, including comparing the profile of a particular patient with similar profiles that already have a confirmed diagnosis. Information technology in this field can make the time physicians spend in front of the computer more productive, leaving them more time to interact with patients and listen to their needs. The expert system described in the paper is an application for medical diagnosis of the most frequently encountered conditions, based on logic programming and probability theory. The system's reasoning is a search over the basic domain knowledge about the condition. The web application described in the paper is implemented for the pathological anatomy ward of a hospital in Romania. It aims to ease the healthcare staff's work, to provide one-click communication between the relevant wards, and to reduce the time lost to bureaucratic procedures. The software, written directly in PHP source code, was developed to ease the healthcare staff's work and to be as simple and elegant as possible.

Research Trends of Ergonomics in Occupational Safety and Health through MEDLINE Search: Focus on Abstract Word Modeling using Word Embedding (MEDLINE 검색을 통한 산업안전보건 분야에서의 인간공학 연구동향 : 워드임베딩을 활용한 초록 단어 모델링을 중심으로)

  • Kim, Jun Hee;Hwang, Ui Jae;Ahn, Sun Hee;Gwak, Gyeong Tae;Jung, Sung Hoon
    • Journal of the Korean Society of Safety / v.36 no.5 / pp.61-70 / 2021
  • This study aimed to analyze research trends in the abstracts of ergonomics studies registered in MEDLINE, a medical bibliographic database, using word embedding. Medical-related ergonomics studies mainly focus on work-related musculoskeletal disorders, and no studies have analyzed their words as data using natural language processing techniques such as word embedding. In this study, the abstracts of ergonomics studies were extracted with a Python program written with the Selenium and BeautifulSoup modules. The abstracts were embedded using the word2vec model, which converts the words found in the abstracts into vectors. The vectorized data were visualized in two dimensions using t-Distributed Stochastic Neighbor Embedding (t-SNE). The word "ergonomics" and ten of the most frequently used words in the abstracts were selected as keywords. The results revealed that the most frequently used words in the abstracts of ergonomics studies include "use", "work", and "task". In addition, the t-SNE technique revealed that words such as "workplace", "design", and "engineering" exhibited the highest relevance to ergonomics. The keywords observed in the abstracts using t-SNE were classified into four groups. Ergonomics studies registered in MEDLINE have investigated the risk factors associated with workers performing an operation or task using tools, and this study characterized them by the relationships between keywords obtained through word embedding. The results of this study will provide useful and diverse insights into future research directions in ergonomics.

Frequency and Social Network Analysis of the Bible Data using Big Data Analytics Tools R (빅데이터 분석도구 R을 이용한 성경 데이터의 빈도와 소셜 네트워크 분석)

  • Ban, ChaeHoon;Ha, JongSoo;Kim, Dong Hyun
    • Journal of the Korea Institute of Information and Communication Engineering / v.24 no.2 / pp.166-171 / 2020
  • Big data processing technology, which can store and analyze data and derive new knowledge, has gained importance in many fields of society. Big data is emerging as an important issue in information and communication technology, and interest in the technology continues to rise. R, a tool for analyzing big data, is a language and environment for statistical analysis. In this paper, we use R to analyze Bible data, specifically the four Gospels of the New Testament. We collect the Bible data and filter it for analysis. R is used to examine the frequency distribution of the text and to analyze the Bible through social network analysis, in which the words of each sentence are paired and the relationships between words are analyzed.
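The two analysis steps named in the abstract, word frequency and pairing the words of each sentence, can be sketched as follows. The paper uses R; this sketch is in Python and uses only the standard library. The function names and the whitespace tokenization are assumptions for illustration, and the paper's actual filtering step is not reproduced.

```python
from collections import Counter
from itertools import combinations


def word_frequencies(sentences):
    """Count how often each lowercase, whitespace-separated word occurs."""
    counts = Counter()
    for sentence in sentences:
        counts.update(sentence.lower().split())
    return counts


def cooccurrence_edges(sentences):
    """Count unordered word pairs that appear together in the same sentence.

    Each pair is stored as an alphabetically sorted tuple, so the result can be
    read as a weighted edge list for a word network.
    """
    edges = Counter()
    for sentence in sentences:
        words = sorted(set(sentence.lower().split()))
        edges.update(combinations(words, 2))
    return edges


# Made-up toy sentences, not Bible text.
toy_sentences = ["alpha beta gamma", "beta gamma delta"]
frequencies = word_frequencies(toy_sentences)
edges = cooccurrence_edges(toy_sentences)
```

The edge counts could then be passed to a graph library to compute centrality or draw the network, which is the social network analysis step the abstract refers to.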

Integrated Privacy Protection Model based on RBAC (RBAC에 기초한 통합형 프라이버시 보호 모델)

  • Cho, Hyug-Hyun;Park, Hee-Man;Lee, Young-Lok;Noh, Bong-Nam;Lee, Hyung-Hyo
    • Journal of the Korea Institute of Information Security & Cryptology / v.20 no.4 / pp.135-144 / 2010
  • Privacy protection can only be achieved by enforcing privacy policies within an enterprise's online and offline data processing systems. Privacy policy models include the P-RBAC model, the purpose-based model, and the obligation model, but none of these models alone can deal dynamically with a rapidly changing business environment. Even among users in the same role, a secure system must occasionally select one who is smart, capable, and supremely confident, assign him or her a special mission for a given period, and strengthen privacy protection by allowing access control conditions to be expressed flexibly. To this end, we propose an Integrated Privacy Protection Model based on RBAC. Our model combines the purpose model, P-RBAC, and the obligation model. Finally, we define a high-level, XML-based policy language model that is independent of platforms and applications.

Longitudinal music perception performance of postlingual deaf adults with cochlear implants using acoustic and/or electrical stimulation

  • Chang, Son A;Shin, Sujin;Kim, Sungkeong;Lee, Yeabitna;Lee, Eun Young;Kim, Hanee;Shin, You-Ree;Chun, Young-Myoung
    • Phonetics and Speech Sciences / v.13 no.2 / pp.103-109 / 2021
  • In this study, we investigated the longitudinal music perception of adult cochlear implant (CI) users and how acoustic stimulation combined with a CI affects their music performance. Data from a total of 163 participants were analyzed retrospectively. Of these, 96 participants were using acoustic stimulation with a CI and 67 were using electrical stimulation only via a CI. The music performance data (melody identification, appreciation, and satisfaction) were collected pre-implantation and at 1 year and 2 years post-implantation. Mixed repeated-measures ANOVA and Tukey-adjusted pairwise comparisons were used for the statistical analysis. In both groups, melody identification, music appreciation, and music satisfaction improved significantly at 1 year and 2 years post-implantation compared with pre-implantation, but there was no significant difference between 1 and 2 years in any of the variables. The group using acoustic stimulation with a CI also showed better melody identification than the CI-only group. However, no differences were found in music appreciation and satisfaction between the two groups, and possible explanations are discussed. In conclusion, acoustic and/or electrical hearing devices benefit the recipients in music performance over time. Although acoustic stimulation accompanied by electrical stimulation could benefit the recipients in terms of listening skills, those benefits may not extend to the subjective acceptance of music. These results suggest the need for improved sound processing mechanisms and music rehabilitation.

Modified multi-sense skip-gram using weighted context and x-means (가중 문맥벡터와 X-means 방법을 이용한 변형 다의어스킵그램)

  • Jeong, Hyunwoo;Lee, Eun Ryung
    • The Korean Journal of Applied Statistics / v.34 no.3 / pp.389-399 / 2021
  • In recent years, word embedding has been a popular area of natural language processing research, and skip-gram has become one of the successful word embedding methods. It assigns an embedding vector to each word using its contexts, which provides an effective way to analyze text data. However, because of a limitation of the vector space model, basic word embedding methods assume that every word has only a single meaning. Since real text contains multi-sense words, that is, words with more than one meaning, Neelakantan et al. (2014) proposed the multi-sense skip-gram (MSSG), which uses a clustering method to find an embedding vector for each sense of a multi-sense word. In this paper, we propose a modification of the MSSG to improve statistical accuracy. Moreover, we propose a data-adaptive choice of the number of clusters, that is, the number of meanings of a multi-sense word. Numerical evidence is provided by simulations based on real data.
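As a minimal illustration of what "using contexts" means in plain skip-gram, the sketch below generates the (center word, context word) training pairs from a token sequence. It covers only the basic skip-gram setup. It does not implement the MSSG, the authors' weighted context vectors, or the x-means choice of the number of senses. The function name and the `window` parameter are assumptions for this example.

```python
def skipgram_pairs(tokens, window=2):
    """Return (center, context) pairs for every token and its neighbours.

    A context word is any token at most `window` positions away from the
    center token, excluding the center position itself.
    """
    pairs = []
    for i, center in enumerate(tokens):
        start = max(0, i - window)
        stop = min(len(tokens), i + window + 1)
        for j in range(start, stop):
            if j != i:
                pairs.append((center, tokens[j]))
    return pairs


pairs = skipgram_pairs(["a", "b", "c"], window=1)
```

In a multi-sense model, the context words collected for each occurrence are what get clustered to decide which sense of the center word is in use.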

Development of Artificial Intelligence-based Legal Counseling Chatbot System

  • Park, Koo-Rack
    • Journal of the Korea Society of Computer and Information / v.26 no.3 / pp.29-34 / 2021
  • With the advent of the fourth industrial revolution, information technology is converging with existing industries and fields to create services that did not previously exist. In artificial intelligence in particular, chatbots have advanced dramatically alongside natural language processing technology, and various business processes are now handled through chatbots. This study presents a system that structures legal inquiries using Slot Filling-based chatbot technology and, given a question of a predetermined type, returns the answer closest to what the user is looking for. With the proposed system, legal information, which is unstructured text data, can be turned into more structured question-and-answer data. In addition, by managing the accumulated Q&A data in a big data storage system such as Apache Hive and reusing the data for training, the reliability of the responses can be expected to improve continuously.
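The general idea of slot filling can be sketched as follows: the chatbot keeps a set of named slots, fills whichever ones it can from each user utterance, and asks about the ones still missing. The slot names, the regular expressions, and the function names here are all hypothetical, invented for illustration. The abstract does not describe the slots or the extraction method of the actual system.

```python
import re

# Hypothetical slots and patterns, invented for this sketch only.
SLOT_PATTERNS = {
    "case_type": re.compile(r"\b(lease|loan|divorce)\b", re.IGNORECASE),
    "amount": re.compile(r"\b(\d+)\s*won\b", re.IGNORECASE),
}


def fill_slots(utterance, slots=None):
    """Return a copy of `slots` with any empty slot found in `utterance` filled."""
    filled = dict(slots or {})
    for name, pattern in SLOT_PATTERNS.items():
        if name not in filled:
            match = pattern.search(utterance)
            if match:
                filled[name] = match.group(1).lower()
    return filled


def missing_slots(slots):
    """List the slot names the chatbot still has to ask the user about."""
    return [name for name in SLOT_PATTERNS if name not in slots]


first_turn = fill_slots("I have a question about a lease")
second_turn = fill_slots("It is about 500000 won", first_turn)
```

Once `missing_slots` returns an empty list, the filled slots form a structured query that can be matched against stored question-and-answer data.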

High-Speed Search for Pirated Content and Research on Heavy Uploader Profiling Analysis Technology (불법복제물 고속검색 및 Heavy Uploader 프로파일링 분석기술 연구)

  • Hwang, Chan-Woong;Kim, Jin-Gang;Lee, Yong-Soo;Kim, Hyeong-Rae;Lee, Tae-Jin
    • Journal of the Korea Institute of Information Security & Cryptology / v.30 no.6 / pp.1067-1078 / 2020
  • With the development of internet technology, a great deal of content is produced, and demand for it is increasing. As the amount of content in circulation grows, so does the distribution of illegal copies that infringe copyright. The Korea Copyright Protection Agency operates an illegal-content blocking program based on substring matching, but accurate search is difficult because large amounts of noise are inserted to evade it. Recently, research has explored natural language processing and deep learning to remove noise, as well as various blockchain technologies for copyright protection, but these approaches have limitations. In this paper, noise is removed from data collected online, and illegal copies are searched for by keyword. In addition, accounts likely to belong to the same heavy uploader are estimated through profiling analysis of heavy uploaders. In the future, copyright damage is expected to be minimized if the illegal copy search technology is combined with blocking and response technology based on the results of the heavy uploader profiling analysis.
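The weakness of plain substring matching, and the effect of removing noise first, can be shown with a small sketch. It assumes that the noise consists of punctuation, underscores, and whitespace inserted between characters. The abstract does not say what kind of noise the paper handles or how it is removed, so the functions below are illustrative only.

```python
import re

# Treat every run of non-word characters and underscores as noise (an assumption).
_NOISE = re.compile(r"[\W_]+")


def normalize(text):
    """Strip noise characters and lowercase the rest."""
    return _NOISE.sub("", text).lower()


def naive_match(title, keyword):
    """Plain case-insensitive substring matching, easily evaded by inserted noise."""
    return keyword.lower() in title.lower()


def denoised_match(title, keyword):
    """Substring matching after noise has been removed from both strings."""
    return normalize(keyword) in normalize(title)


noisy_title = "M.o.v.i.e-Title 2020"
```

Removing all separators also merges adjacent words, so this approach can produce false positives. A production system would need a more careful definition of noise.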

Policy-based In-Network Security Management using P4 Network DataPlane Programmability (P4 프로그래머블 네트워크를 통한 정책 기반 인-네트워크 보안 관리 방법)

  • Cho, Buseung
    • Convergence Security Journal / v.20 no.5 / pp.3-10 / 2020
  • The Internet and networks are now regarded as essential infrastructure for society, and security threats have steadily increased. However, the network switches that actually forward packets can cope with security threats only through firewalls or network access control based on fixed rules, so effective defense within the network itself is extremely limited and cannot respond actively. In this paper, we propose an in-network security framework using the high-level data plane programming language P4 (Programming Protocol-independent Packet Processors) to deal with DDoS attacks and IP spoofing attacks at the network level by monitoring all flows in the network in real time and processing specific attack packets at the P4 switch. In addition, by allowing the P4 switch to apply the network user's or administrator's policy through the SDN (Software-Defined Networking) controller, various security requirements in the network application environment can be reflected.