• Title/Summary/Keyword: domain of words


Sentiment Analysis of User-Generated Content on Drug Review Websites

  • Na, Jin-Cheon;Kyaing, Wai Yan Min
    • Journal of Information Science Theory and Practice
    • /
    • v.3 no.1
    • /
    • pp.6-23
    • /
    • 2015
  • This study develops an effective method for sentiment analysis of user-generated content on drug review websites, which has not been investigated extensively compared to other general domains such as product reviews. A clause-level sentiment analysis algorithm is developed, since each sentence can contain multiple clauses discussing multiple aspects of a drug. The method adopts a purely linguistic approach, computing the sentiment orientation (positive, negative, or neutral) of a clause from the prior sentiment scores assigned to words, taking into consideration the grammatical relations and semantic annotation (such as disorder terms) of the words in the clause. Experimental results with 2,700 clauses show the effectiveness of the proposed approach, which performed significantly better than baseline approaches using machine learning. Various challenging issues were identified and discussed through error analysis. The proposed sentiment analysis approach will be useful not only for patients, but also for drug makers and clinicians, to obtain valuable summaries of public opinion. Since sentiment analysis is domain specific, domain knowledge about drug reviews is incorporated into the sentiment analysis algorithm to provide more accurate analysis. In particular, MetaMap is used to map various health and medical terms (such as disease and drug names) to semantic types in the Unified Medical Language System (UMLS) Semantic Network.
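As a rough illustration of clause-level scoring from prior word polarities, here is a minimal Python sketch with a made-up lexicon and simple negation handling; the paper's actual algorithm additionally uses grammatical relations and UMLS semantic types, which are not reproduced here:

```python
# Toy prior-polarity lexicon and negator list -- illustrative only.
PRIOR_SCORES = {"effective": 1, "relief": 1, "severe": -1, "nausea": -1}
NEGATORS = {"no", "not", "never", "without"}

def clause_sentiment(tokens):
    """Return 'positive', 'negative', or 'neutral' for one clause."""
    score, negated = 0, False
    for tok in tokens:
        t = tok.lower()
        if t in NEGATORS:
            negated = True              # flip the next sentiment-bearing word
            continue
        s = PRIOR_SCORES.get(t, 0)
        score += -s if negated else s
        if s != 0:
            negated = False             # negation scope ends after one hit
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"

print(clause_sentiment("the drug was effective".split()))               # positive
print(clause_sentiment("not effective against severe nausea".split())) # negative
```

A real clause-level system would replace the flat negation flag with scopes derived from dependency parses, which is where the grammatical relations mentioned in the abstract come in.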

Study on Effective Extraction of New Coined Vocabulary from Political Domain Article and News Comment (정치 도메인에서 신조어휘의 효과적인 추출 및 의미 분석에 대한 연구)

  • Lee, Jihyun;Kim, Jaehong;Cho, Yesung;Lee, Mingu;Choi, Hyebong
    • The Journal of the Convergence on Culture Technology
    • /
    • v.7 no.2
    • /
    • pp.149-156
    • /
    • 2021
  • Text mining is a useful tool for discovering public opinion and perception regarding political issues from big data. Users of social media commonly express their opinions with newly coined words such as slang and emoji. However, those new words are not effectively captured by traditional text mining methods that process text data using a language dictionary. In this study, we propose effective methods for extracting newly coined words that connote the political stance and opinion of users, and we attempt to discover the context and political meaning of the new words with various text mining techniques.
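One simple way to realize the dictionary-gap idea described above is to flag frequent tokens that are absent from a base vocabulary as candidate coined words; the vocabulary and corpus in this sketch are illustrative placeholders, not the study's data:

```python
from collections import Counter

# Illustrative base vocabulary standing in for a language dictionary.
BASE_VOCAB = {"the", "election", "was", "a", "close", "race", "voters",
              "kept", "threads", "alive", "festival"}

def candidate_coinages(corpus_tokens, min_freq=2):
    """Frequent tokens missing from the base vocabulary are candidates."""
    counts = Counter(t.lower() for t in corpus_tokens)
    return sorted(w for w, c in counts.items()
                  if c >= min_freq and w not in BASE_VOCAB)

tokens = ("the election was a doomscroll festival "
          "voters kept doomscroll threads alive").split()
print(candidate_coinages(tokens))  # ['doomscroll']
```

In practice the frequency threshold would be set against a reference corpus, and the surviving candidates would then be passed to the semantic-analysis stage the abstract describes.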

The Posttraumatic Stress Research Trends of Korean and Foreign Firefighters (국내외 소방대원의 외상 후 스트레스 연구경향)

  • Baek, Mi-Lye
    • The Korean Journal of Emergency Medical Services
    • /
    • v.13 no.2
    • /
    • pp.61-72
    • /
    • 2009
  • Purpose: This study aimed to analyze posttraumatic stress research trends among Korean and foreign firefighters. Method: A total of 63 published international articles were retrieved from the PubMed website and 17 published Korean articles from the Korean Medical Database website, using the search term 'PTSD in firefighters'. These articles were analyzed by publication time, journal domain, research design, key words, and research subjects. Result: 1) By publication time, the 63 international articles comprised 29 disaster-related studies (46.0%) and 34 job-related studies (54.0%), whereas the 17 Korean articles comprised 16 disaster-related studies (94.1%) and 1 job-related study (5.9%). 2) By international research domain, 9 studies (14.3%) were published in The Journal of Nervous and Mental Disease. In the domestic research domain, 9 studies (52.9%) were degree theses, consisting of 6 master's theses and 3 doctoral dissertations; in the analysis of domestic majors, psychology accounted for the largest share with 4 studies (23.5%). 3) In terms of international research design, quantitative methods were used most in both disaster-related (23 studies, 36.5%) and job-related (30 studies, 47.5%) research. In domestic research, quantitative methods were mostly used in 14 job-related studies (82.3%), and Q methodology was used in only 1 disaster-related study (5.9%). 4) Looking at content trends by key word, posttraumatic stress and coping was the most frequent topic with 9 studies (31.0%), followed by posttraumatic stress symptoms; among these studies, PTSD (Posttraumatic Stress Disorder) and PTS (Posttraumatic Stress) were the key words used most. There was also 1 domestic disaster-related study examining trends in posttraumatic stress with PTS as the key word.
In job-related research, the relationship between posttraumatic stress and other factors was the most common topic with 10 studies (62.5%); among these, PTSD was the most frequent key word (5 studies, 31.3%). 5) Regarding research subjects, firefighters were the most common subjects in international research, with 16 cases each in disaster-related and job-related studies; domestic research included 16 studies (94.1%) using only firefighters and 1 (5.9%) that also included their families. Conclusion: Although studies of posttraumatic stress in Korean firefighters started later than those on foreign firefighters, the research should develop across various fields of study and, like studies conducted abroad, address multiple topics and methods.


Constraints of English Poetic Meter: Focused on Iambic (영어율격의 제약 - iambic을 중심으로 -)

  • Sohn Il-Gwon
    • Proceedings of the KSPS conference
    • /
    • 2002.11a
    • /
    • pp.64-69
    • /
    • 2002
  • This study concerns the constraints of English poetic meter. In English poems, the metrical pattern does not always match the linguistic stress on the lines, and these mismatches differ among poets. For lexical stress mismatched with the weak metrical position, *W⇒Strength is established by the concept of the strong syllable. The peaks of monosyllabic words mismatched with the weak metrical position are divided according to which boundary of a phonological domain they are adjacent to: Adjacency Constraint I is suggested for the mismatched peak adjacent to the left boundary of a phonological domain, and *Peak] and Adjacency Constraint II for the mismatched peak adjacent to the right boundary. These constraints vary among the poets (Pope, Milton, and Shakespeare): *[Peak [-stress], *W⇒Strength, and *Peak] in Pope; *[+stress][Peak [-stress] and *Peak] in Milton; *[+stress][Peak [-stress], *W⇒Strength, and Adjacency Constraint II in Shakespeare.


Constraints of English Poetic Meter: Focused on Iambic (영시 율격의 제약 - Iambic을 중심으로 -)

  • Sohn Il-Gwon
    • Korean Journal of English Language and Linguistics
    • /
    • v.2 no.4
    • /
    • pp.555-574
    • /
    • 2002
  • This study concerns the constraints of English poetic meter. In English poems, the metrical pattern does not always match the linguistic stress on the lines, and these mismatches differ among poets. For lexical stress mismatched with the weak metrical position, *W⇒Strength is established by the concept of the strong syllable. The peaks of monosyllabic words mismatched with the weak metrical position are divided according to which boundary of a phonological domain they are adjacent to: Adjacency Constraint I is suggested for the mismatched peak adjacent to the left boundary of a phonological domain, and *Peak] and Adjacency Constraint II for the mismatched peak adjacent to the right boundary. These constraints vary among the poets (Pope, Milton, and Shakespeare): *[Peak [-stress], *W⇒Strength, and *Peak] in Pope; *[+stress][Peak [-stress] and *Peak] in Milton; *[+stress][Peak [-stress], *W⇒Strength, and Adjacency Constraint II in Shakespeare.


ONTOLOGY DESIGN FOR THE EFFICIENT CUSTOMER INFORMATION RETRIEVAL

  • Gu, Mi-Sug;Hwang, Jeong-Hee;Ryu, Keun-Ho
    • Proceedings of the KSRS Conference
    • /
    • 2005.10a
    • /
    • pp.345-348
    • /
    • 2005
  • Because current web search engines estimate the similarity of documents using word frequency, many documents irrelevant to the user query are returned. To solve these kinds of problems, the semantic web is emerging as the future web. Services based on the semantic web can be provided through an ontology, which specifies the knowledge of a particular domain and defines the concepts of that knowledge and the relationships between concepts. In this paper, to search for information on potential customers for home-delivery marketing, we model the specific domain for generating the ontology and investigate how to retrieve information using it. In short, we generate an ontology that defines the domain of potential customers and develop a search robot that collects customer information.


Ontology Matching Method Based on Word Embedding and Structural Similarity

  • Hongzhou Duan;Yuxiang Sun;Yongju Lee
    • International journal of advanced smart convergence
    • /
    • v.12 no.3
    • /
    • pp.75-88
    • /
    • 2023
  • In a specific domain, experts may understand domain knowledge differently or construct ontologies for different purposes, which leads to multiple different ontologies in the same domain; this phenomenon is called ontology heterogeneity. For research fields that require cross-ontology operations, such as knowledge fusion and knowledge reasoning, ontology heterogeneity causes certain difficulties. In this paper, we propose a novel ontology matching model that combines word embedding with a concatenated continuous bag-of-words model. Our goal is to improve word vectors and to distinguish semantic similarity from descriptive associations. Moreover, we make the most of the textual and structural information from the ontology and from external resources. We represent the ontology as a graph and use the SimRank algorithm to calculate structural similarity. Our approach employs a similarity queue to achieve one-to-many matching results, which provide a wider range of insights for subsequent mining and analysis and thereby enhance and refine the methodology used in ontology matching.
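The structural-similarity component named above, SimRank, can be sketched on a toy concept graph; the graph, decay constant, and iteration count below are illustrative, and the model's embedding side is not reproduced:

```python
def simrank(graph, C=0.8, iters=10):
    """Naive SimRank. graph maps each node to its list of in-neighbors.

    Two nodes are similar to the extent that their in-neighbors are similar;
    C is the decay factor from the original formulation.
    """
    nodes = list(graph)
    sim = {(a, b): 1.0 if a == b else 0.0 for a in nodes for b in nodes}
    for _ in range(iters):
        new = {}
        for a in nodes:
            for b in nodes:
                if a == b:
                    new[(a, b)] = 1.0
                    continue
                ina, inb = graph[a], graph[b]
                if not ina or not inb:
                    new[(a, b)] = 0.0   # no in-neighbors, no evidence
                    continue
                total = sum(sim[(x, y)] for x in ina for y in inb)
                new[(a, b)] = C * total / (len(ina) * len(inb))
        sim = new
    return sim

# Two concept nodes hanging off the same parent get a nonzero score.
g = {"Thing": [], "Car": ["Thing"], "Automobile": ["Thing"]}
scores = simrank(g)
print(round(scores[("Car", "Automobile")], 3))  # 0.8
```

On real ontologies this O(n²) formulation is too slow, which is why production matchers use pruned or iterative-matrix variants; the toy version only shows the recursion.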

A New Endpoint Detection Method Based on Chaotic System Features for Digital Isolated Word Recognition System

  • Zang, Xian;Chong, Kil-To
    • Proceedings of the IEEK Conference
    • /
    • 2009.05a
    • /
    • pp.37-39
    • /
    • 2009
  • In speech recognition research, locating the beginning and end of a speech utterance against background noise is of great importance. Background noise present during recording introduces disturbance when we want stationary parameters that represent the corresponding speech section; in particular, a major source of error in automatic recognition of isolated words is inaccurate detection of the beginning and ending boundaries of test and reference templates, so a potent method is needed to remove the unnecessary regions of a speech signal. Conventional methods for speech endpoint detection are based on two simple time-domain measurements, short-time energy and short-time zero-crossing rate, which cannot guarantee precise results in low signal-to-noise-ratio environments. This paper proposes a novel approach that computes the Lyapunov exponent of the time-domain waveform. The proposed method has no need for frequency-domain parameters, e.g., Mel-scale features, which have been introduced in other papers, in the endpoint detection process. Compared with the conventional methods based on short-time energy and zero-crossing rate, the novel approach based on time-domain Lyapunov exponents (LEs) has low complexity and is suitable for a digital isolated-word recognition system.
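The conventional time-domain baseline that the abstract contrasts against can be sketched in a few lines of NumPy; the frame length and energy threshold below are illustrative choices, not the paper's:

```python
import numpy as np

def frame_features(x, frame_len=160):
    """Frame-wise short-time energy and zero-crossing rate."""
    n = len(x) // frame_len
    frames = x[:n * frame_len].reshape(n, frame_len)
    energy = (frames ** 2).sum(axis=1)
    zcr = (np.abs(np.diff(np.sign(frames), axis=1)) > 0).sum(axis=1)
    return energy, zcr

def detect_endpoints(x, frame_len=160, energy_ratio=0.1):
    """Mark the utterance as the span of frames above an energy threshold."""
    energy, _ = frame_features(x, frame_len)
    thresh = energy_ratio * energy.max()
    active = np.where(energy > thresh)[0]
    if active.size == 0:
        return None
    return active[0] * frame_len, (active[-1] + 1) * frame_len

# Silence, a burst of "speech" (a sine tone), then silence again.
t = np.arange(1600) / 8000.0
sig = np.zeros(4800)
sig[1600:3200] = np.sin(2 * np.pi * 440 * t)
start, end = detect_endpoints(sig)
print(start, end)  # 1600 3200
```

The weakness the paper targets is visible here: with broadband noise added, the energy threshold drifts and the detected span becomes unreliable, which motivates the Lyapunov-exponent feature.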


Real-Time Sound Field Effect Implementation Using Block Filtering and QFT (Block Filtering과 QFT를 이용한 실시간 음장 효과 구현)

  • Sohn Sung-Yong;Seo Jeongil;Hahn Minsoo
    • MALSORI
    • /
    • no.51
    • /
    • pp.85-98
    • /
    • 2004
  • It is almost impossible to generate the sound field effect (SFE) in real time with time-domain linear convolution because of the large number of multiplication operations it requires. To solve this, three methods for reducing the number of multiplications are introduced in this paper. First, the time-domain linear convolution is replaced with a frequency-domain circular convolution; in other words, the linear convolution result is derived from that of the circular convolution. This technique reduces the number of multiplication operations remarkably. Second, a subframe concept is introduced, i.e., one original frame is divided into several subframes; the FFT is then executed for each subframe, further reducing the number of multiplications. Finally, the QFT is used instead of the FFT. By combining all three methods into our final SFE generation algorithm, the number of computations is reduced sufficiently and real-time SFE generation becomes possible on a general PC.
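The first of the three ideas above can be sketched directly: a linear convolution of length-N and length-M sequences equals a circular convolution once both are zero-padded to at least N + M - 1 points, so it can be computed with FFTs. A minimal NumPy sketch (the paper's subframe splitting and QFT are not reproduced here):

```python
import numpy as np

def fft_linear_convolve(x, h):
    """Linear convolution via zero-padded (circular) FFT multiplication."""
    n = len(x) + len(h) - 1            # length of the linear convolution
    X = np.fft.rfft(x, n)              # rfft zero-pads both inputs to n
    H = np.fft.rfft(h, n)
    return np.fft.irfft(X * H, n)      # circular conv on the padded signals

x = np.array([1.0, 2.0, 3.0])
h = np.array([1.0, -1.0])
print(np.round(fft_linear_convolve(x, h), 6))  # [ 1.  1.  1. -3.]
```

For long room impulse responses this replaces O(NM) multiplications with O(n log n), which is what makes the real-time budget reachable before the subframe and QFT refinements.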


KNU Korean Sentiment Lexicon: Bi-LSTM-based Method for Building a Korean Sentiment Lexicon (Bi-LSTM 기반의 한국어 감성사전 구축 방안)

  • Park, Sang-Min;Na, Chul-Won;Choi, Min-Seong;Lee, Da-Hee;On, Byung-Won
    • Journal of Intelligence and Information Systems
    • /
    • v.24 no.4
    • /
    • pp.219-240
    • /
    • 2018
  • Sentiment analysis, one of the text mining techniques, is a method for extracting subjective content embedded in text documents. Recently, sentiment analysis methods have been widely used in many fields. As good examples, data-driven surveys are based on analyzing the subjectivity of text data posted by users, and market research is conducted by analyzing users' review posts to quantify their opinion of a target product. The basic method of sentiment analysis is to use a sentiment dictionary (or lexicon), a list of sentiment vocabularies with positive, neutral, or negative semantics. In general, the meaning of many sentiment words is likely to differ across domains. For example, the sentiment word 'sad' indicates a negative meaning in most domains, but not in the movie domain. To perform accurate sentiment analysis, we need to build a sentiment dictionary for the given domain. However, building a sentiment lexicon this way is time-consuming, and without a general-purpose sentiment lexicon as seed data many sentiment vocabularies are not covered. To address this problem, several studies have constructed sentiment lexicons for specific domains based on 'OPEN HANGUL' and 'SentiWordNet', which are general-purpose sentiment lexicons. However, OPEN HANGUL is no longer being serviced, and SentiWordNet does not work well because of language differences in the process of converting Korean words into English; such restrictions limit the use of these general-purpose sentiment lexicons as seed data for building a domain-specific sentiment lexicon. In this article, we construct the 'KNU Korean Sentiment Lexicon (KNU-KSL)', a new general-purpose Korean sentiment dictionary that is more advanced than existing general-purpose lexicons.
The proposed dictionary, a list of domain-independent sentiment words such as 'thank you', 'worthy', and 'impressed', is built to quickly construct a sentiment dictionary for a target domain. In particular, it constructs sentiment vocabularies by analyzing the glosses contained in the Standard Korean Language Dictionary (SKLD) with the following procedure: First, we propose a sentiment classification model based on Bidirectional Long Short-Term Memory (Bi-LSTM). Second, the proposed deep learning model automatically classifies each gloss as having either positive or negative meaning. Third, positive words and phrases are extracted from the glosses classified as positive, while negative words and phrases are extracted from the glosses classified as negative. Our experimental results show that the average accuracy of the proposed sentiment classification model is up to 89.45%. In addition, the sentiment dictionary is further extended using various external sources, including SentiWordNet, SenticNet, Emotional Verbs, and Sentiment Lexicon 0603. Furthermore, we add sentiment information about frequently used coined words and emoticons that are used mainly on the Web. KNU-KSL contains a total of 14,843 sentiment vocabularies, each of which is a 1-gram, a 2-gram, a phrase, or a sentence pattern. Unlike existing sentiment dictionaries, it is composed of words that are not affected by particular domains. The recent trend in sentiment analysis is to use deep learning techniques without sentiment dictionaries, and the importance of developing sentiment dictionaries has gradually declined. However, a recent study shows that the words in a sentiment dictionary can be used as features of deep learning models, resulting in sentiment analysis with higher accuracy (Teng, Z., 2016). This result indicates that a sentiment dictionary is useful not only for sentiment analysis itself but also as a source of features that improve the accuracy of deep learning models.
The proposed dictionary can be used as basic data for constructing the sentiment lexicon of a particular domain and as features of deep learning models. It is also useful for automatically and quickly building large training sets for deep learning models.
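As a rough illustration of how a lexicon with 1-gram and 2-gram entries like the one described above might be applied to score a text, here is a minimal sketch; the entries and scores are made up for illustration, not taken from KNU-KSL:

```python
# Illustrative lexicon keyed by token n-grams with signed polarity scores.
LEXICON = {("thank", "you"): 2, ("worthy",): 1, ("impressed",): 1,
           ("awful",): -2}

def lexicon_score(tokens, max_n=2):
    """Sum lexicon scores, preferring the longest n-gram match at each position."""
    tokens = [t.lower() for t in tokens]
    score, i = 0, 0
    while i < len(tokens):
        for n in range(max_n, 0, -1):
            gram = tuple(tokens[i:i + n])
            if gram in LEXICON:
                score += LEXICON[gram]
                i += n                  # consume the matched n-gram
                break
        else:
            i += 1                      # no match, move one token forward
    return score

print(lexicon_score("thank you I was impressed".split()))  # 3
```

Longest-match-first matters here: without it, a 2-gram entry like 'thank you' would be shadowed by any 1-gram entries for its parts.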