• Title/Summary/Keyword: Corpus-based Study

Topic Modeling Insomnia Social Media Corpus using BERTopic and Building Automatic Deep Learning Classification Model (BERTopic을 활용한 불면증 소셜 데이터 토픽 모델링 및 불면증 경향 문헌 딥러닝 자동분류 모델 구축)

  • Ko, Young Soo;Lee, Soobin;Cha, Minjung;Kim, Seongdeok;Lee, Juhee;Han, Ji Yeong;Song, Min
    • Journal of the Korean Society for Information Management / v.39 no.2 / pp.111-129 / 2022
  • Insomnia is a chronic disease of modern society, with the number of new patients increasing by more than 20% over the last 5 years. It is a serious condition that requires diagnosis and treatment, because the individual and social problems caused by lack of sleep are severe and the triggers of insomnia are complex. This study collected 5,699 documents from 'insomnia', a community on Reddit, a social media platform where users freely express opinions. Based on the International Classification of Sleep Disorders (ICSD-3) standard and on guidelines prepared with the help of experts, an insomnia corpus was constructed by tagging each document as showing either an insomnia tendency or a non-insomnia tendency. Five deep learning language models (BERT, RoBERTa, ALBERT, ELECTRA, XLNet) were trained on the constructed corpus. In the performance evaluation, RoBERTa performed best, with an accuracy of 81.33%. For an in-depth analysis of the insomnia social data, topic modeling was performed with BERTopic, a recently introduced method that addresses weaknesses of LDA, which has been widely used in the past. The analysis identified 8 topic groups ('Negative emotions', 'Advice and help and gratitude', 'Insomnia-related diseases', 'Sleeping pills', 'Exercise and eating habits', 'Physical characteristics', 'Activity characteristics', 'Environmental characteristics'). Users expressed negative emotions and sought help and advice from the Reddit insomnia community. They also mentioned diseases related to insomnia, discussed the use of sleeping pills, and expressed interest in exercise and eating habits. Among insomnia-related characteristics, we found physical characteristics such as breathing, pregnancy, and heart; activity characteristics such as zombies, hypnic jerk, and groggy; and environmental characteristics such as sunlight, blankets, temperature, and naps.

Two-week Oral Toxicity Study of 1-(4-methylpiperazinyl)-3-phenylisoquinoline (CWJ-a-5) in Sprague-Dawley (SD) Rats (1-(4-methylpiperazinyl)-3-phenylisoquinoline (CWJ-a-5)의 Sprague-Dawley(SD) 랫드를 이용한 2주간 반복 경구투여 독성시험)

  • 강부현;조원제;김대덕;김용범;차신우;장순재
    • Toxicological Research / v.18 no.1 / pp.47-57 / 2002
  • The subacute oral toxicity of 1-(4-methylpiperazinyl)-3-phenylisoquinoline (CWJ-a-5) was investigated in Sprague-Dawley (SD) rats. Five groups of 5 males and 5 females were orally administered CWJ-a-5 at doses of 0, 37.5, 75, 150 and 200 mg/kg for 2 weeks. Among clinical signs, salivation was observed in the 75, 150 and 500 mg/kg male and female groups, and loss of fur was observed in the 500 mg/kg male and female groups. Body weight was significantly decreased in the 150 and 500 mg/kg male groups and in the 500 mg/kg female group. Food consumption was significantly decreased in the 300 mg/kg male group. In serum biochemistry, total cholesterol and phospholipid were significantly increased in the 500 mg/kg male and female groups, and aspartate aminotransferase was significantly increased in the 500 mg/kg female group. On histopathological examination, the following were observed in the 150 and 500 mg/kg male and female groups: vacuolar degeneration of renal tubules in the kidney, of hepatocytes in the liver, of myocytes in the heart, and of histiocytes in the spleen and thymus; atrophy of seminiferous tubules and degeneration of germinal epithelium in the testis; and vacuolar degeneration of corpus luteum, granulosa cells and theca cells in the ovary. Based on these results, the no observed adverse effect level (NOAEL) of CWJ-a-5 was considered to be 75 mg/kg and the absolute toxic dose was considered to be 150 mg/kg in this study.

Trend of Pharmacopuncture Therapy for Treating Cervical Disease in Korea

  • Kim, Seok-Hee;Jung, Da-Jung;Choi, Yoo-Min;Kim, Jong-Uk;Yook, Tae-Han
    • Journal of Pharmacopuncture / v.17 no.4 / pp.7-14 / 2014
  • Objectives: The purpose of this study is to analyze trends in domestic studies on pharmacopuncture therapy for treating cervical disease. Methods: This study was carried out on original copies and abstracts of theses listed in databases or published until July 2014. The search was made on the Oriental medicine Advanced Searching Integrated System (OASIS), the National Digital Science Library (NDSL), and the Korean traditional knowledge portal. Search words were 'pain on cervical spine', 'cervical pain', 'ruptured cervical disk', 'cervical disc disorder', 'stiffness of the neck', 'cervical disk', 'whiplash injury', 'cervicalgia', 'posterior cervical pain', 'neck disability', 'Herniated Nucleus Pulposus (HNP)', and 'Herniated Intervertebral Disc (HIVD)'. Results: Twenty-five clinical theses related to pharmacopuncture were selected and were analyzed by year according to the type of pharmacopuncture used, the academic journal in which the publication appeared, and the effect of pharmacopuncture therapy. Conclusion: The significant conclusions are as follows: (1) Pharmacopunctures used for cervical pain were Bee venom pharmacopuncture, Carthami-flos pharmacopuncture, Scolopendra pharmacopuncture, Ouhyul pharmacopuncture, Hwangryun pharmacopuncture, Corpus pharmacopuncture, Soyeom pharmacopuncture, Hwangryunhaedoktang pharmacopuncture, and Shinbaro pharmacopuncture. (2) Randomized controlled trials showed that pharmacopuncture therapy combined with other methods was more effective. (3) In the past, studies oriented toward Bee venom pharmacopuncture were actively pursued, but the number of studies on various other types of pharmacopuncture gradually began to increase. (4) For treating a patient with cervical pain, the type of pharmacopuncture to be used should be selected based on the cause of the disease and the patient's condition.

Differentiation of Human Adult Adipose Derived Stem Cell in vitro and Immunohistochemical Study of Adipose Derived Stem Cell after Intracerebral Transplantation in Rats

  • Ko, Kwang-Seok;Lee, Il-Woo;Joo, Won-Il;Lee, Kyung-Jun;Park, Hae-Kwan;Rha, Hyung-Keun
    • Journal of Korean Neurosurgical Society / v.42 no.2 / pp.118-124 / 2007
  • Objective : Adipose tissue is derived from the embryonic mesoderm and contains a heterogeneous stromal cell population. The authors sought to verify the stem cell characteristics of adipose derived stromal cells (ADSCs) and to investigate immunohistochemical findings after transplantation of ADSCs into rat brain, in order to evaluate survival, migration and differentiation of the transplanted stromal cells. Methods : First, ADSCs were isolated from human adipose tissue and induced toward adipose, osseous and neuronal differentiation under appropriate culture conditions in vitro, and the phenotype profile of undifferentiated human ADSCs was examined using flow cytometry and immunohistochemical study. Human ADSCs were then transplanted into healthy rat brain to investigate survival, migration and differentiation after 4 weeks. Results : Adipose stem cells were harvested from human adipose tissue and subcultured several times. The cultured ADSCs differentiated into adipocytes, osteocytes and neuron-like cells under conditioned media. Flow cytometric analysis of undifferentiated ADSCs revealed that they were positive for CD29 and CD44 and negative for CD34, CD45, CD117 and HLA-DR. Transplanted human ADSCs were found mainly in the cortex adjacent to the injection site and had migrated at least 1 mm from the injection site along the cortex and corpus callosum. A few transplanted cells had differentiated into neurons and astrocytes. Conclusion : ADSCs differentiated into multilineage cell lines through transdifferentiation. ADSCs survived and migrated in xenograft without immunosuppression. Based on these data, ADSCs may be a potential source of stem cells for many human diseases, including neurologic disorders.

Development of a Sizing System of Women's Fitness Wear for the Senior Population in South Korea (한국 노인 여성을 위한 피트니스 압박웨어 치수 개발)

  • Jeon, Eun-Jin;Lee, Won-sup;Park, Jang-Woon;You, Hee-Cheon
    • Fashion & Textile Research Journal / v.20 no.4 / pp.464-473 / 2018
  • The objective of this study is to develop a sizing system of fitness clothing that can properly accommodate the various body sizes of Korean senior women. The sizing system of upper and lower fitness clothing was developed in three steps: selection of key variables, identification of size category candidates, and determination of an optimal sizing system. First, key anthropometric dimensions (stature and bust circumference for upper clothing; stature and waist circumference for lower clothing) were identified by factor analysis on the direct body measurements (n = 272) and 3D whole-body scan data (n = 271) of Korean senior women in Size Korea. Second, sizing system candidates based on the key dimensions of upper and lower clothing were explored using a grid method and an optimization method. Lastly, among the sizing system candidates, optimal sizing systems of upper and lower clothing were selected in terms of accommodation rate. Five size categories (short/small, short/medium, tall/small, tall/medium, and tall/large) were selected as the optimal sizing systems of upper and lower clothing, with accommodation rates of 89% and 78%, respectively, for Korean senior women. The anthropometric characteristics of the representative humans of the optimal size categories would be of use in designing fitness compression wear that improves fit, exercise effectiveness, and health for Korean senior women.

Multi-Label Classification for Corporate Review Text: A Local Grammar Approach (머신러닝 기반의 기업 리뷰 다중 분류: 부분 문법 적용을 중심으로)

  • HyeYeon Baek;Young Kyun Chang
    • Information Systems Review / v.25 no.3 / pp.27-41 / 2023
  • Unlike the previous works focusing on the state-of-the-art methodologies to improve the performance of machine learning models, this study improves the 'quality' of training data used in machine learning. We propose a method to enhance the quality of training data through the processing of 'local grammar,' frequently used in corpus analysis. We collected a vast amount of unstructured corporate review text data posted by employees working in the top 100 companies in Korea. After improving the data quality using the local grammar process, we confirmed that the classification model with local grammar outperformed the model without it in terms of classification performance. We defined five factors of work engagement as classification categories, and analyzed how the pattern of reviews changed before and after the COVID-19 pandemic. Through this study, we provide evidence that shows the value of the local grammar-based automatic identification and classification of employee experiences, and offer some clues for significant organizational cultural phenomena.

Linac Based Radiosurgery for Cerebral Arteriovenous Malformations (선형가속기 방사선 수술을 이용한 뇌동정맥기형의 치료)

  • Lee, Sung Yeal;Son, Eun Ik;Kim, Ok Bae;Choi, Tae Jin;Kim, Dong Won;Yim, Man Bin;Kim, In Hong
    • Journal of Korean Neurosurgical Society / v.29 no.8 / pp.1030-1036 / 2000
  • Objective : The aim of this study was to retrospectively analyze the safety and effect of the Linac-based Photon Knife Radiosurgery System (PKRS) for treatment of cerebral arteriovenous malformation. Patients and Methods : The authors analyzed the clinical method and results of ten patients who were followed up for more than two years, among the 18 patients who underwent radiosurgery for arteriovenous malformation from June 1992 to Dec. 1997 with the Linac-based PKRS, which was developed in our hospital. Results : The average age of the patients was 30.4 years (range 13-49); seven were male and three were female. Initial clinical symptoms were headache in five patients, seizure in three, hemiparesis in one, and vomiting in one. Before the radiosurgery, computed tomography, MRI, and cerebral angiography were performed. The arteriovenous malformation was located in the cerebral hemisphere in six patients, the thalamus in two, the brainstem in one, and the corpus callosum in one. The nidus was smaller than 3 cm in seven patients and larger than 3 cm in three. Computed tomography, MRI, and cerebral angiography were performed at six months, one year, and two years after radiosurgery to assess the completeness of obliteration. Of the ten cases, six showed complete obliteration and four showed partial obliteration; notably, all six complete obliterations occurred among the seven AVMs smaller than 3 cm (complete obliteration rate: 85.7%). All patients tolerated the treatment and no significant complications were seen. Conclusion : In this study, Linac-based radiosurgery using PKRS for arteriovenous malformation showed excellent effects; the authors therefore believe that it is an ideal method for small or deep-seated AVMs.

A Deep Learning-based Depression Trend Analysis of Korean on Social Media (딥러닝 기반 소셜미디어 한글 텍스트 우울 경향 분석)

  • Park, Seojeong;Lee, Soobin;Kim, Woo Jung;Song, Min
    • Journal of the Korean Society for Information Management / v.39 no.1 / pp.91-117 / 2022
  • The number of depressed patients in Korea and around the world is rapidly increasing every year. However, most mentally ill patients are not aware that they are suffering from the disease, so adequate treatment is not being provided. If depressive symptoms are neglected, they can lead to suicide, anxiety, and other psychological problems. Early detection and treatment of depression are therefore very important for improving mental health. To address this problem, this study presents a deep learning-based depression tendency model using Korean social media text. After collecting data from Naver Knowledge iN, Naver Blog, Hidoc, and Twitter, the DSM-5 diagnostic criteria for major depressive disorder were used to classify and annotate classes according to the number of depressive symptoms. TF-IDF analysis and simultaneous word analysis were then performed to examine the characteristics of each class of the constructed corpus. In addition, word embedding, dictionary-based sentiment analysis, and LDA topic modeling were performed to generate a depression tendency classification model using various text features. Through this, the embedded text, sentiment score, and topic number of each document were calculated and used as text features. The highest accuracy, 83.28%, was achieved when depression tendency was classified with the KorBERT algorithm by combining both the sentiment score and the topic of the document with the embedded text. This study establishes a classification model for Korean depression tendency with improved performance using various text features, and it can detect potentially depressed individuals early among Korean online community users, enabling rapid treatment and prevention. It is significant in that it can help promote the mental health of Korean society.

CRNN-Based Korean Phoneme Recognition Model with CTC Algorithm (CTC를 적용한 CRNN 기반 한국어 음소인식 모델 연구)

  • Hong, Yoonseok;Ki, Kyungseo;Gweon, Gahgene
    • KIPS Transactions on Software and Data Engineering / v.8 no.3 / pp.115-122 / 2019
  • For Korean phoneme recognition, the Hidden Markov-Gaussian Mixture model (HMM-GMM) or hybrid models that combine an artificial neural network with an HMM have mainly been used. However, these approaches are limited in that they require force-aligned corpus training data that is manually annotated by experts. Recently, researchers have used neural network-based phoneme recognition models that combine a recurrent neural network (RNN)-based structure with the connectionist temporal classification (CTC) algorithm to overcome the problem of obtaining manually annotated training data. Yet, in terms of implementation, these RNN-based models pose another difficulty: the amount of data required grows as the structure becomes more sophisticated. This need for large amounts of data is particularly problematic for the Korean language, which lacks refined corpora. In this study, we apply the CTC algorithm, which does not require force-alignment, to create a Korean phoneme recognition model. Specifically, the phoneme recognition model is based on a convolutional neural network (CNN), which requires a relatively small amount of data and can be trained faster than RNN-based models. We present the results of two different experiments and the resulting best-performing phoneme recognition model, which distinguishes 49 Korean phonemes. The best-performing model combines a CNN with a 3hop Bidirectional LSTM and reaches a final Phoneme Error Rate (PER) of 3.26. This PER is a considerable improvement over existing Korean phoneme recognition models, which report PERs ranging from 10 to 12.
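The abstract above does not spell out how CTC output is decoded or how PER is computed. The following is a minimal sketch under common assumptions: greedy CTC decoding collapses repeated frame labels and then drops a blank symbol, and PER is taken as the edit distance between reference and hypothesis phoneme sequences divided by the reference length, as a percentage. The function names, the `"_"` blank symbol, and the toy phoneme labels are hypothetical and are not taken from the paper.

```python
from typing import List, Sequence

BLANK = "_"  # hypothetical blank symbol; the paper's label set is not given


def ctc_greedy_collapse(frame_labels: Sequence[str], blank: str = BLANK) -> List[str]:
    """Collapse consecutive repeated labels, then remove blanks (greedy CTC decoding)."""
    collapsed: List[str] = []
    previous = None
    for label in frame_labels:
        if label != previous and label != blank:
            collapsed.append(label)
        previous = label
    return collapsed


def edit_distance(ref: Sequence[str], hyp: Sequence[str]) -> int:
    """Levenshtein distance between two phoneme sequences."""
    prev_row = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        row = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            row.append(min(prev_row[j] + 1,          # deletion
                           row[j - 1] + 1,           # insertion
                           prev_row[j - 1] + cost))  # substitution or match
        prev_row = row
    return prev_row[-1]


def phoneme_error_rate(ref: Sequence[str], hyp: Sequence[str]) -> float:
    """PER in percent, assuming PER = 100 * edit_distance / len(reference)."""
    if not ref:
        raise ValueError("reference sequence must not be empty")
    return 100.0 * edit_distance(ref, hyp) / len(ref)


# Toy frame-level output: repeats and blanks, with no forced alignment needed.
frames = ["k", "k", "_", "a", "a", "_", "n", "s", "s", "_"]
hypothesis = ctc_greedy_collapse(frames)
reference = ["k", "a", "m", "s", "a"]
per = phoneme_error_rate(reference, hypothesis)
```

Here the hypothesis has one substitution and one deletion relative to the five-phoneme reference. Under this definition a PER of 3.26 would correspond to roughly three errors per hundred reference phonemes.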

A Study on Knowledge Entity Extraction Method for Individual Stocks Based on Neural Tensor Network (뉴럴 텐서 네트워크 기반 주식 개별종목 지식개체명 추출 방법에 관한 연구)

  • Yang, Yunseok;Lee, Hyun Jun;Oh, Kyong Joo
    • Journal of Intelligence and Information Systems / v.25 no.2 / pp.25-38 / 2019
  • Selecting high-quality information that meets the interests and needs of users from an overflowing volume of content is becoming more important over time. Amid this flood of information, efforts are being made to better reflect the user's intention in search results, rather than treating an information request as a simple string. Large IT companies such as Google and Microsoft also focus on developing knowledge-based technologies, including search engines, that provide users with satisfaction and convenience. Finance in particular is a field where text data analysis is expected to be useful, because it constantly generates new information, and the earlier information is obtained, the more valuable it is. Automatic knowledge extraction can be effective in areas such as the financial sector, where the flow of information is vast and new information continues to emerge. However, automatic knowledge extraction faces several practical difficulties. First, it is difficult to build corpora from different fields with the same algorithm, and it is difficult to extract good-quality triples. Second, producing labeled text data manually becomes harder as the extent and scope of knowledge increase and patterns are constantly updated. Third, performance evaluation is difficult because of the characteristics of unsupervised learning. Finally, defining the problem of automatic knowledge extraction is not easy because of the ambiguous conceptual characteristics of knowledge. To overcome the limits described above and improve the semantic performance of stock-related information search, this study attempts to extract knowledge entities using a neural tensor network and evaluates the resulting performance. Unlike previous work, the purpose of this study is to extract knowledge entities related to individual stock items.
Various but relatively simple data processing methods are applied in the presented model to solve the problems of previous research and to enhance the effectiveness of the model. These processes give this study three points of significance. First, it presents a practical and simple automatic knowledge extraction method that can be applied. Second, it shows that performance evaluation is possible through a simple problem definition. Finally, the expressiveness of the knowledge is increased by generating input data on a sentence basis without complex morphological analysis. The results of the empirical analysis and an objective performance evaluation method are also presented. In the empirical study conducted to confirm the usefulness of the presented model, experts' reports on 30 individual stocks, the top 30 items by frequency of publication from May 30, 2017 to May 21, 2018, are used. The total number of reports is 5,600; 3,074 reports, about 55% of the total, are designated as the training set, and the remaining 45% are designated as the testing set. Before constructing the model, all reports in the training set are classified by stock, and their entities are extracted using KKMA as the named entity recognition tool. For each stock, the top 100 entities by appearance frequency are selected and vectorized using one-hot encoding. Then, using the neural tensor network, one score function per stock is trained. Thus, when a new entity from the testing set appears, its score can be calculated with every score function, and the stock whose function gives the highest score is predicted as the item related to the entity. To evaluate the presented model, we confirm its predictive power and determine whether the score functions are well constructed by calculating the hit ratio over all reports in the testing set.
As a result of the empirical study, the presented model shows 69.3% hit accuracy on the testing set, which consists of 2,526 reports. This hit ratio is meaningfully high despite some constraints on conducting the research. Looking at the prediction performance of the model for each stock, only 3 stocks, LG ELECTRONICS, KiaMtr, and Mando, show markedly lower performance than average. This result may be due to interference from other similar items and the generation of new knowledge. In this paper, we propose a methodology for finding the key entities, or combinations of them, that are necessary to search for related information in accordance with the user's investment intention. Graph data is generated using only the named entity recognition tool and is applied to the neural tensor network without a learning corpus or word vectors for the field. From the empirical test, we confirm the effectiveness of the presented model as described above. However, some limits remain and some points need to be complemented. Most notably, the fact that model performance is especially poor for only some stocks shows the need for further research. Finally, through the empirical study, we confirmed that the learning method presented in this study can be used to semantically match new text information with the related stocks.
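The prediction and evaluation steps described in this abstract (score a new entity with every per-stock score function, predict the stock with the highest score, then compute the hit ratio on the testing set) can be sketched as follows. This is only an illustration under stated assumptions: the trained neural tensor network score functions are replaced by hypothetical lookup tables, and the stock names, entities, and scores are invented toy values, not data from the paper.

```python
from typing import Callable, Dict, List, Tuple

ScoreFn = Callable[[str], float]  # maps an entity to a relatedness score


def predict_stock(entity: str, score_fns: Dict[str, ScoreFn]) -> str:
    """Return the stock whose score function gives the entity the highest score."""
    return max(score_fns, key=lambda stock: score_fns[stock](entity))


def hit_ratio(samples: List[Tuple[str, str]], score_fns: Dict[str, ScoreFn]) -> float:
    """Fraction of (entity, true_stock) pairs for which the prediction is correct."""
    if not samples:
        raise ValueError("samples must not be empty")
    hits = sum(predict_stock(entity, score_fns) == stock for entity, stock in samples)
    return hits / len(samples)


# Hypothetical stand-ins for trained score functions (one per stock).
toy_tables = {
    "StockA": {"battery": 0.9, "tv": 0.2},
    "StockB": {"battery": 0.1, "tv": 0.8},
}
score_fns: Dict[str, ScoreFn] = {
    stock: (lambda entity, table=table: table.get(entity, 0.0))
    for stock, table in toy_tables.items()
}

# Toy testing set of (entity, true stock) pairs.
test_samples = [("battery", "StockA"), ("tv", "StockB"), ("tv", "StockA")]
ratio = hit_ratio(test_samples, score_fns)

# Split sizes reported in the abstract: 3,074 training + 2,526 testing = 5,600.
train_share = 3074 / (3074 + 2526)
```

In this toy setup two of the three test pairs are predicted correctly. The split arithmetic at the end reproduces the roughly 55% training share stated in the abstract.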