• Title/Summary/Keyword: Lexicon


Semantic Analysis of Color Terms in Chinese Neologisms: Focusing on Black, White, and Gray (중국어 신조어에 나타난 색채어 의미 분석 - 검은색, 흰색, 회색을 중심으로 -)

  • Lee, Myung-Ah;Han, Yong-su
    • Cross-Cultural Studies / v.47 / pp.241-260 / 2017
  • A multitude of neologisms has entered the modern Chinese lexicon, reflecting the changes that Chinese society has undergone, and a variety of color terms has emerged amid this trend. However, the color terms in these neologisms are used somewhat differently from their original senses. First, the achromatic color terms used in Chinese neologisms include black, white, and gray. The semantic categories generally found in these color-term neologisms only partially reflect the meanings the terms carry in modern Chinese. Second, the usage frequency of these semantic categories shows contrasting distributions for black and white. "Black" is used most actively in its expanded meaning, followed by its basic meaning, whereas "white" is used most actively in its basic meaning, followed by its expanded meaning. Third, among the achromatic color terms used in Chinese neologisms, black and gray exhibit expansion of meaning. For example, in neologisms the color term "black" is used to symbolize "in disaster areas" and "socially discriminated against," while "gray" is used to symbolize the "social aspect."

On Doublets (쌍형어에 대하여)

  • Yi, Eun-Gyeong
    • Cross-Cultural Studies / v.50 / pp.425-451 / 2018
  • This paper examines the main issues in the discussion of doublets. By definition, the term doublet generally refers to a pair of words that share a common etymon, but it is also applied to a pair of words or grammatical morphemes that have the same meaning and similar forms. We argue that the typical doublet is a pair of words with a common etymon. Doublets can be divided into subtypes according to similarities or differences in meaning or form. The type furthest from the typical doublet is a pair of words that do not share an etymon but have the same meaning and similar forms. The second issue is whether doublets include only words. For example, josas (postpositions or particles) that share a common etymon can be accepted as a kind of doublet. Suffixes may likewise be recognized as doublets if they share a common etymon. Alternatively, it may be unnecessary to recognize the suffixes themselves as doublets, because the derivatives formed with them can be accepted as doublets. In the case of endings, a pair of endings with the same meaning and a common etymon may be recognized as a doublet. Otherwise, the word forms to which the endings attach can be accepted as doublets. However, given that Korean endings may have syntactic properties, the endings themselves, rather than the words that contain them, should be considered doublets. Finally, there is room for debate as to whether stem doublets or ending doublets constitute lexical items in the lexicon. They can be regarded as plural underlying forms and deserve further research.

A proposal for the classification of Korean taste terms (한국어의 '맛 어휘' 분류 체계)

  • Kim, Hyeong Min
    • 기호학연구 / no.56 / pp.7-44 / 2018
  • The objective of this paper is to propose a classification of Korean taste terms, especially Korean taste adjectives, from the perspective of cognitive science. The classification is grounded in the definitions of 'taste sense', 'flavor', and 'taste' that are usually employed in the cognitive sciences. A large number of domestic studies have addressed taste terms, and many classifications have been published, with differences among researchers. These differences arise largely because researchers have applied subjective rather than objective criteria when categorizing taste terms. Previous studies have established that, in everyday usage, the term 'taste' covers a much wider range of qualities than those perceived through the taste receptor cells alone. In addition, we take it for granted that as much as 80-90% of taste comes from the olfactory modality. It is also important to note that the texture, temperature, color, and sounds of food, as well as atmospheric cues, have an essential effect on taste perception. Many scientists have already pointed out that taste evaluations are influenced by a number of individual and sociocultural factors. Eating and tasting are important parts of everyday life, so linguistic approaches to taste perception are of great significance. A classification of taste terms from the perspective of the cognitive sciences may shed light on the mechanism through which we perceive taste. This paper is preliminary work for a follow-up study that will attempt to build a geometric model of the word field 'taste terms' that exists, or probably exists, in the human mental lexicon.

Multimodal Sentiment Analysis Using Review Data and Product Information (리뷰 데이터와 제품 정보를 이용한 멀티모달 감성분석)

  • Hwang, Hohyun;Lee, Kyeongchan;Yu, Jinyi;Lee, Younghoon
    • The Journal of Society for e-Business Studies / v.27 no.1 / pp.15-28 / 2022
  • With the recent expansion of online markets such as clothing, the use of customer reviews has become a major marketing tool. User reviews have been used to analyze customer sentiment. Sentiment analysis methods can be broadly divided into machine learning-based and lexicon-based methods. Machine learning-based methods train a classification model on reviews and their labels. As sentiment analysis research has advanced, multimodal models trained on the image and video data in reviews have been studied. The characteristics of the words used in reviews differ by product and customer category. In this paper, sentiment is analyzed by considering review data together with product and user metadata. Gated Recurrent Unit (GRU), Long Short-Term Memory (LSTM), self-attention-based multi-head attention, and Bidirectional Encoder Representations from Transformers (BERT) models are used in this study. The same Multi-Layer Perceptron (MLP) model is applied to the product information in every case. This paper proposes a multimodal sentiment analysis model that simultaneously considers user reviews and product meta-information.

Investigating the Influence of ESG Information on Funding Success in Online Crowdfunding Platform by Using Text Mining Technique and Logistic Regression

  • Kyu Sung Kim;Min Gyeong Kim;Francis Joseph Costello;Kun Chang Lee
    • Journal of the Korea Society of Computer and Information / v.28 no.7 / pp.155-164 / 2023
  • In this paper, we examine the influence of Environmental, Social, and Governance (ESG)-related content on the success of online crowdfunding proposals. Along with the increasing significance of ESG standards in business, investment proposals incorporating ESG concepts are now commonplace. Due to the ESG trend, conventional wisdom holds that the majority of proposals with ESG concepts will have a higher rate of success. We investigate by analyzing over 9000 online business presentations found in a Kickstarter dataset to determine which characteristics of these proposals led to increased investment. We first utilized lexicon-based measurement and Feature Engineering to determine the relationship between environment and society scores and financial indicators. Next, Logistic Regression is utilized to determine the effect of including environmental and social terms in a project's description on its ability to obtain funding. Contrary to popular belief, our research found that microentrepreneurs were less likely to succeed with proposals that focused on ESG issues. Our research will generate new opportunities for research in the disciplines of information science and crowdfunding by shedding new light on the environment of online micro-entrepreneurship.
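The lexicon-based measurement mentioned in this abstract can be illustrated with a minimal sketch. The abstract does not give the word lists or the scoring formula, so the two mini-lexicons, the `lexicon_score` function, and the sample description below are all hypothetical; the sketch simply scores a text by the share of its tokens that appear in a word list.

```python
import re

# Hypothetical mini-lexicons; the study's actual word lists are not
# given in the abstract.
ENVIRONMENT_TERMS = {"recycled", "sustainable", "carbon", "renewable"}
SOCIAL_TERMS = {"community", "fair", "diversity", "charity"}


def lexicon_score(text, lexicon):
    """Return the fraction of tokens in `text` that appear in `lexicon`."""
    tokens = re.findall(r"[a-z]+", text.lower())
    if not tokens:
        return 0.0
    hits = sum(1 for token in tokens if token in lexicon)
    return hits / len(tokens)


# Hypothetical project description (10 tokens).
description = "A sustainable backpack made from recycled bottles for our community"
env_score = lexicon_score(description, ENVIRONMENT_TERMS)  # 2 of 10 tokens
soc_score = lexicon_score(description, SOCIAL_TERMS)       # 1 of 10 tokens
```

Scores of this kind could then serve as predictors in a logistic regression on funding success, which is the second step the abstract describes.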

Comparison of One- and Two-Region of Interest Strain Elastography Measurements in the Differential Diagnosis of Breast Masses

  • Hee Jeong Park;Sun Mi Kim;Bo La Yun;Mijung Jang;Bohyoung Kim;Soo Hyun Lee;Hye Shin Ahn
    • Korean Journal of Radiology / v.21 no.4 / pp.431-441 / 2020
  • Objective: To compare the diagnostic performance and interobserver variability of strain ratio obtained from one or two regions of interest (ROI) on breast elastography. Materials and Methods: From April to May 2016, 140 breast masses in 140 patients who underwent conventional ultrasonography (US) with strain elastography followed by US-guided biopsy were evaluated. Three experienced breast radiologists reviewed recorded US and elastography images, measured strain ratios, and categorized them according to the American College of Radiology breast imaging reporting and data system lexicon. Strain ratio was obtained using the 1-ROI method (one ROI drawn on the target mass), and the 2-ROI method (one ROI in the target mass and another in reference fat tissue). The diagnostic performance of the three radiologists among datasets and optimal cut-off values for strain ratios were evaluated. Interobserver variability of strain ratio for each ROI method was assessed using intraclass correlation coefficient values, Bland-Altman plots, and coefficients of variation. Results: Compared to US alone, US combined with the strain ratio measured using either ROI method significantly improved specificity, positive predictive value, accuracy, and area under the receiver operating characteristic curve (AUC) (all p values < 0.05). Strain ratio obtained using the 1-ROI method showed higher interobserver agreement between the three radiologists without a significant difference in AUC for differentiating breast cancer when the optimal strain ratio cut-off value was used, compared with the 2-ROI method (AUC: 0.788 vs. 0.783, 0.693 vs. 0.715, and 0.691 vs. 0.686, respectively, all p values > 0.05). Conclusion: Strain ratios obtained using the 1-ROI method showed higher interobserver agreement without a significant difference in AUC, compared to those obtained using the 2-ROI method. 
Considering that the 1-ROI method can reduce performers' efforts, it could have an important role in improving the diagnostic performance of breast US by enabling consistent management of breast lesions.
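One of the interobserver variability measures named in this abstract, the coefficient of variation, can be sketched as follows. This is an illustration only: it assumes the common definition (sample standard deviation divided by the mean, as a percentage), and the three reader measurements are hypothetical values, not data from the study.

```python
from statistics import mean, stdev


def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean, as a percentage."""
    return stdev(values) / mean(values) * 100.0


# Hypothetical strain ratios for one mass, measured by three readers.
reader_strain_ratios = [2.0, 2.5, 3.0]
cv_percent = coefficient_of_variation(reader_strain_ratios)
```

A lower value indicates that the readers' measurements for the same mass agree more closely.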

Crafting a Quality Performance Evaluation Model Leveraging Unstructured Data (비정형데이터를 활용한 건축현장 품질성과 평가 모델 개발)

  • Lee, Kiseok;Song, Taegeun;Yoo, Wi Sung
    • Journal of the Korea Institute of Building Construction / v.24 no.1 / pp.157-168 / 2024
  • The frequent occurrence of structural failures at building construction sites in Korea has underscored the critical role of rigorous oversight in the inspection and management of construction projects. As mandated by prevailing regulations and standards, onsite supervision by designated supervisors encompasses thorough documentation of construction quality, material standards, and the history of any reconstructions, among other factors. These reports, predominantly consisting of unstructured data, constitute approximately 80% of the data amassed at construction sites and serve as a comprehensive repository of quality-related information. This research introduces the SL-QPA model, which employs text mining techniques to preprocess supervision reports and establish a sentiment dictionary, thereby enabling the quantification of quality performance. The study's findings, demonstrating a statistically significant Pearson correlation between the quality performance scores derived from the SL-QPA model and various legally defined indicators, were substantiated through a one-way analysis of variance of the correlation coefficients. The SL-QPA model, as developed in this study, offers a supplementary approach to evaluating the quality performance of building construction projects. It holds the promise of enhancing quality inspection and management practices by harnessing the wealth of unstructured data generated throughout the lifecycle of construction projects.
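The Pearson correlation used in this abstract to compare model scores with legally defined indicators can be sketched in a few lines. The function below is the standard textbook formula; the score and indicator values are hypothetical and are not taken from the study, and the SL-QPA model itself is not reproduced here.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return covariance / sqrt(var_x * var_y)


# Hypothetical per-site values: a text-derived quality score and an
# indicator score for the same four construction sites.
model_scores = [0.62, 0.71, 0.55, 0.80]
indicator_scores = [70, 78, 64, 85]
r = pearson_r(model_scores, indicator_scores)
```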

An Analysis of Relationship between Social Sentiments and Cryptocurrency Price: An Econometric Analysis with Big Data (소셜 감성과 암호화폐 가격 간의 관계 분석: 빅데이터를 활용한 계량경제적 분석)

  • Sangyi Ryu;Jiyeon Hyun;Sang-Yong Tom Lee
    • Information Systems Review / v.21 no.1 / pp.91-111 / 2019
  • Around the end of 2017, an investment fever for cryptocurrencies, especially Bitcoin, spread all over the world, and South Korea was at the center of this phenomenon. Since it was difficult to find profitable investment opportunities, people started to see cryptocurrency markets as an alternative investment target. However, the cryptocurrency fever in South Korea is based mostly on psychological factors, such as the expectation of short-term profits and the social atmosphere, rather than on the intrinsic value of the assets. Therefore, this study aimed to analyze the influence of social sentiment on cryptocurrency price movements. Data were collected for 181 days, from Nov 1st, 2017 to Apr 30th, 2018, focusing on Bitcoin-related posts on Twitter along with the price of Bitcoin on Bithumb/UPbit. The collected data were classified into neutral, positive, and negative words through sentiment analysis, and these words were entered into a regression model to estimate the impact of social sentiment on the Bitcoin price. Regression analyses and Granger causality tests showed that positive sentiments had a positive relationship with the Bitcoin price, while negative words had a negative relationship with it. The causality tests also showed two-way causality between social sentiment and Bitcoin price movement. We therefore conclude that Bitcoin investors' behavior is affected by changes in social sentiment.

MRI Findings of Triple Negative Breast Cancer: A Comparison with Non-Triple Negative Breast Cancer (삼중음성 유방암의 자기공명영상 소견: 비삼중음성 유방암과의 비교)

  • Choi, Jae-Jeong;Kim, Sung-Hun;Cha, Eun-Suk;Kang, Bong-Joo;Lee, Ji-Hye;Lee, So-Yeon;Jeong, Seung-Hee;Yim, Hyeon-Woo;Song, Byung-Joo
    • Investigative Magnetic Resonance Imaging / v.14 no.2 / pp.95-102 / 2010
  • Purpose: To evaluate the magnetic resonance imaging (MRI) and clinicopathological features of triple negative breast cancer, and compare them with those of non-triple negative breast cancer. Materials and Methods: This study included 231 pathologically confirmed breast cancers from January 2007 to May 2008. We retrospectively reviewed the MRI findings according to the Breast Imaging Reporting and Data System (BI-RADS) lexicon: mass or non-mass type, mass shape, mass margin, non-mass distribution, and enhancement pattern. Histologic type, histologic grade, and the results for epidermal growth factor receptor (EGFR), p53, and Ki 67 were reviewed. Results: Of 231 patients, 43 (18.6%) had triple negative breast cancer. Forty triple negative breast cancers (93.0%) were mass-type lesions on MRI. A round, oval, or lobular shape (p=0.006) and rim enhancement (p=0.004) were significantly more frequent in triple negative breast cancer than in non-triple negative breast cancer. In contrast, irregular shape (p=0.006) and spiculated margins (p=0.032) were significantly more frequent in non-triple negative breast cancer. Old age (p=0.019), high histologic grade (p<0.0001), EGFR positivity (p<0.0001), p53 overexpression (p=0.038), and Ki 67 expression (p<0.0001) were significantly associated with triple negative breast cancer. Conclusion: MRI findings may be helpful for differentiating between triple negative and non-triple negative breast cancer.

The prediction of the stock price movement after IPO using machine learning and text analysis based on TF-IDF (증권신고서의 TF-IDF 텍스트 분석과 기계학습을 이용한 공모주의 상장 이후 주가 등락 예측)

  • Yang, Suyeon;Lee, Chaerok;Won, Jonggwan;Hong, Taeho
    • Journal of Intelligence and Information Systems / v.28 no.2 / pp.237-262 / 2022
  • There has been a growing interest in IPOs (Initial Public Offerings) due to the profitable returns that IPO stocks can offer to investors. However, IPOs can be speculative investments that may involve substantial risk as well because shares tend to be volatile, and the supply of IPO shares is often highly limited. Therefore, it is crucially important that IPO investors are well informed of the issuing firms and the market before deciding whether to invest or not. Unlike institutional investors, individual investors are at a disadvantage since there are few opportunities for individuals to obtain information on the IPOs. In this regard, the purpose of this study is to provide individual investors with the information they may consider when making an IPO investment decision. This study presents a model that uses machine learning and text analysis to predict whether an IPO stock price would move up or down after the first 5 trading days. Our sample includes 691 Korean IPOs from June 2009 to December 2020. The input variables for the prediction are three tone variables created from IPO prospectuses and quantitative variables that are either firm-specific, issue-specific, or market-specific. The three prospectus tone variables indicate the percentage of positive, neutral, and negative sentences in a prospectus, respectively. We considered only the sentences in the Risk Factors section of a prospectus for the tone analysis in this study. All sentences were classified into 'positive', 'neutral', and 'negative' via text analysis using TF-IDF (Term Frequency - Inverse Document Frequency). Measuring the tone of each sentence was conducted by machine learning instead of a lexicon-based approach due to the lack of sentiment dictionaries suitable for Korean text analysis in the context of finance. 
For this reason, the training set was created by randomly selecting 10% of the sentences from each prospectus, and each sentence in the training set was read and labeled manually. Then, based on the training set, a Support Vector Machine model was used to predict the tone of the sentences in the test set. Finally, the machine learning model calculated the percentages of positive, neutral, and negative sentences in each prospectus. To predict the price movement of an IPO stock, four different machine learning techniques were applied: Logistic Regression, Random Forest, Support Vector Machine, and Artificial Neural Network. According to the results, models that use quantitative variables using technical analysis and prospectus tone variables together show higher accuracy than models that use only quantitative variables. More specifically, the prediction accuracy improved by 1.45 percentage points in the Random Forest model, 4.34 percentage points in the Artificial Neural Network model, and 5.07 percentage points in the Support Vector Machine model. Among the techniques tested, the Artificial Neural Network model using both quantitative variables and prospectus tone variables had the highest prediction accuracy, at 61.59%. The results indicate that the tone of a prospectus is a significant factor in predicting the price movement of an IPO stock. In addition, the McNemar test was used to verify that the difference between the models was statistically significant. The model using only quantitative variables was compared with the model using both the quantitative variables and the prospectus tone variables, and the improvement in predictive performance was confirmed to be significant at the 1% significance level.
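The McNemar test mentioned at the end of this abstract compares two classifiers evaluated on the same cases by looking only at the cases where they disagree. The abstract does not say which variant of the test was used, so the sketch below assumes the exact (binomial) two-sided form; the function name and the discordant counts are hypothetical and are not results from the study.

```python
from math import comb


def mcnemar_exact(b, c):
    """Two-sided exact McNemar p-value from the discordant counts.

    b: cases model A classified correctly and model B incorrectly.
    c: cases model B classified correctly and model A incorrectly.
    """
    n = b + c
    if n == 0:
        return 1.0  # the models never disagree, so there is no evidence
    k = min(b, c)
    # Probability of k or fewer "wins" under a fair coin, doubled for
    # the two-sided test and capped at 1.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Hypothetical discordant counts for two models on one test set.
p_value = mcnemar_exact(2, 12)
```

A p-value below the chosen significance level suggests that the two models' error patterns differ by more than chance would explain.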