Search | Korea Science

Selective Word Embedding for Sentence Classification by Considering Information Gain and Word Similarity (문장 분류를 위한 정보 이득 및 유사도에 따른 단어 제거와 선택적 단어 임베딩 방안)

Lee, Min Seok;Yang, Seok Woo;Lee, Hong Joo
- Journal of Intelligence and Information Systems / v.25 no.4 / pp.105-122 / 2019
Dimensionality reduction is one way to handle big data in text mining. When reducing dimensionality, we should consider the density of the data, which has a significant influence on the performance of sentence classification. Higher-dimensional data requires more computation, which can lead to high computational cost and overfitting in the model. A dimension reduction process is therefore necessary to improve model performance. Diverse methods have been proposed, ranging from merely reducing noise in the data, such as misspellings or informal text, to incorporating semantic and syntactic information. In addition, the representation and selection of text features affect the performance of the classifier in sentence classification, one of the fields of Natural Language Processing. The common goal of dimension reduction is to find a latent space that is representative of the raw data from the observation space. Existing methods utilize various algorithms for dimensionality reduction, such as feature extraction and feature selection. In addition to these algorithms, word embeddings, which learn low-dimensional vector space representations of words that capture semantic and syntactic information from data, are also utilized. To improve performance, recent studies have suggested methods that modify the word dictionary according to the positive and negative scores of pre-defined words. The basic idea of this study is that similar words have similar vector representations. Once the feature selection algorithm identifies words that are not important, we assume that words similar to those words also have no impact on sentence classification. This study proposes two ways to achieve more accurate classification: conducting selective word elimination under specific rules, and constructing word embeddings based on Word2Vec.
To select words of low importance from the text, we use information gain to measure importance and cosine similarity to search for similar words. First, we eliminate words that have comparatively low information gain values from the raw text and build the word embedding. Second, we additionally select, for elimination, words that are similar to the words with low information gain values and build the word embedding. Finally, the filtered text and word embeddings are applied to two deep learning models: a Convolutional Neural Network and an Attention-Based Bidirectional LSTM. This study uses customer reviews of Kindle products on Amazon.com, IMDB, and Yelp as datasets and classifies each dataset using the deep learning models. Reviews that received more than five helpful votes and whose ratio of helpful votes was over 70% were classified as helpful reviews. Yelp, however, shows only the number of helpful votes. We extracted 100,000 reviews that received more than five helpful votes by random sampling from 750,000 reviews. Minimal preprocessing, such as removing numbers and special characters from the text, was applied to each dataset. To evaluate the proposed methods, we compared their performance with that of Word2Vec and GloVe word embeddings that used all the words. We showed that one of the proposed methods performs better than the embeddings with all the words. Removing unimportant words improves performance; however, removing too many words lowers it. Future research should consider diverse preprocessing approaches and in-depth analysis of word co-occurrence to measure similarity between words. Also, we applied the proposed method only with Word2Vec. Other embedding methods such as GloVe, fastText, and ELMo can be combined with the proposed methods, making it possible to identify the possible combinations of word embedding methods and elimination methods.
https://doi.org/10.13088/jiis.2019.25.4.105
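The two elimination steps described in this abstract can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the binary "word occurs in the document" feature, the function names, both thresholds, and the toy documents and 2-dimensional vectors (standing in for Word2Vec embeddings) are all hypothetical.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in Counter(labels).values())

def information_gain(docs, labels, word):
    """Information gain of the binary feature 'word occurs in the document'."""
    with_word = [y for d, y in zip(docs, labels) if word in d]
    without = [y for d, y in zip(docs, labels) if word not in d]
    conditional = sum(len(part) / len(labels) * entropy(part)
                      for part in (with_word, without) if part)
    return entropy(labels) - conditional

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def words_to_remove(docs, labels, vectors, ig_threshold, sim_threshold):
    """Step 1: words with low information gain.
    Step 2: remaining words whose vector is close to any low-gain word."""
    vocab = set().union(*docs)
    low_ig = {w for w in vocab if information_gain(docs, labels, w) < ig_threshold}
    similar = {w for w in vocab - low_ig for s in low_ig
               if w in vectors and s in vectors
               and cosine(vectors[w], vectors[s]) >= sim_threshold}
    return low_ig, similar

# Toy data (hypothetical): each document is a set of tokens, label 1 = positive.
docs = [{"great", "the", "movie"}, {"awful", "the", "movie"},
        {"great", "a", "film"}, {"awful", "a"}]
labels = [1, 0, 1, 0]
# Hypothetical 2-d vectors standing in for trained word embeddings.
vectors = {"movie": [1.0, 0.0], "film": [0.9, 0.1], "great": [0.0, 1.0],
           "awful": [0.1, -1.0], "the": [0.5, 0.5], "a": [0.5, 0.6]}

low_ig, similar = words_to_remove(docs, labels, vectors,
                                  ig_threshold=0.1, sim_threshold=0.9)
```

In this toy setup "film" carries some information gain of its own but is removed in the second step because its vector is close to that of the low-gain word "movie", which is the behavior the second proposed method relies on.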

Development of Prevotella intermedia ATCC 49046 Strain-Specific PCR Primer Based on a Pig6 DNA Probe (Pig6 DNA probe를 기반으로 하는 Prevotella intermedia ATCC 49046 균주-특이 PCR primer 개발)

Jeong Seung-U;Yoo So-Young;Kang Sook-Jin;Kim Mi-Kwang;Jang Hyun-Seon;Lee Kwang-Yong;Kim Byung-Ok;Kook Joong-Ki
- Korean Journal of Microbiology / v.42 no.2 / pp.89-94 / 2006
The purpose of this study was to develop strain-specific PCR primers for the identification of Prevotella intermedia ATCC 49046, which is frequently used in studies of the pathogenesis of periodontitis. The Hind III-digested genomic DNA of P. intermedia ATCC 49046 was cloned by the random cloning method. The specificity of the cloned DNA fragments was determined by Southern blot analysis. The nucleotide sequences of the cloned DNA probes were determined by the chain termination method. The PCR primers were designed based on the nucleotide sequence of the cloned DNA fragment. The data showed that the Pig6 DNA probe hybridized with the genomic DNA from P. intermedia strains (ATCC $25611^T$ and 49046) isolated from Westerners, but not with the strains isolated from Koreans. The Pig6 DNA probe consisted of 813 bp. The Pig6-F3 and Pig6-R3 primers, designed based on the nucleotide sequence of the Pig6 DNA probe, were also specific only to P. intermedia ATCC $25611^T$ and P. intermedia ATCC 49046. On the other hand, the Pig6-60F and Pig6-770R primers were specific only to P. intermedia ATCC 49046. The two PCR primer sets could detect as little as 4 pg of chromosomal DNA of P. intermedia. These results indicate that the Pig6-60F and Pig6-770R primers are useful for the identification of P. intermedia ATCC 49046, especially with regard to the maintenance of the strain.

Analysis of User's Impact on Vegetation Structure Changes and User's Psychology in Odongdo Island of Hallyo-Haesang National Park (오동도(梧桐島)에서의 이용객(利用客)에 의한 식생구조(植生構造) 변화(變化) 및 이용자(利用者) 심리분석(心理分析)에 관(關)한 연구)

Park, Myong Kyu;Lee, Kyong Jae;Park, In Hyeop
- Journal of Korean Society of Forest Science / v.76 no.4 / pp.397-409 / 1987
This study was conducted to analyze users' impact on vegetation structure changes and user psychology on Odongdo Island in Hallyo-Haesang National Park. Five sites were sampled for vegetation structure changes in the study area according to the extent of impact observed. User psychology was also studied through a questionnaire; 366 responses were collected from visitors by random sampling in May 1986. Evergreen broad-leaved forest, i.e., Machilus thunbergii, Cinnamomum camphora, and Camellia japonica forest, occupied 32.5% (3.91 ha) of the total forest area in terms of actual vegetation. The Camellia japonica community covered 40.0% (4.72 ha) and the Sasa coreana community occupied 41.8% (5.02 ha). Areas of environmental impact grade 3 and 4 covered 44.3% of the total forest area and should be restored, because self-repair seemed to be impossible. The evergreen broad-leaved forest was seriously damaged by overuse, with no younger trees in the middle and lower layers, and would soon be bare. The native flora therefore needs to be preserved by controlling the number of users. Most visitors came on holidays and Sundays, and the places that made a favorable impression were areas with a view of the sea and the Camellia forest. The overall level of satisfaction was comparatively low: 55% of visitors were satisfied. This level of satisfaction was associated with the number of users, the forest landscape, and the number of facilities.

Rubric Development for Performance Evaluation of Middle School Home Economics - Focusing on Experiment and Practice Methods - (중학교 가정교과 수행평가를 위한 루브릭(rubric) 개발 - 실험.실습법에 적용 -)

Bum, Sun-Hwa;Chae, Jung-Hyun
- Journal of Korean Home Economics Education Association / v.20 no.3 / pp.85-105 / 2008
The purpose of this study was to develop a narrative analytic scoring rubric through teacher-student negotiation for assessing tasks that use experiment and practice methods in middle school home economics (HE). The analytic rubric was developed in three stages. In the first stage, everything needed for rubric development was defined and prepared: tasks for rubric application were selected through a questionnaire survey, detailed directions on methods, procedures, and required items were provided, a class was selected for rubric negotiation, and the development schedule was set. The method suggested by Ainsworth and Christinson (1998) in Student Generated Rubrics was used. In the second stage, performance criteria for the tasks were developed in terms of knowledge, skills, and attitude, and the scoring framework and scales were set for each assessment area. Referring to the selected scoring framework and assessment criteria, observable and assessable behaviors were used to describe the rubric on an A, B, and C scale. A primary rubric was then developed through teacher-student negotiation, using the rubrics made by each group. In the last stage, the primary rubric was reviewed by an HE education expert to test its validity. In addition, the suitability of the final rubric was analyzed using 46 questionnaires collected from incumbent home economics teachers, selected by random sampling mainly from teachers who were enrolled in or had completed the Master's degree program at one university. As a result, the average suitability of all the rubrics was over 4.0 on the 5-point scale.

Analysis of Massive Transfusion Blood Product Use in a Tertiary Care Hospital (일개 3차 의료기관의 대량수혈 혈액 사용 분석)

Lim, Young Ae;Jung, Kyoungwon;Lee, John Cook-Jong
- The Korean Journal of Blood Transfusion / v.29 no.3 / pp.253-261 / 2018
Background: A massive blood transfusion (MT) requires significant effort by the Blood Bank. This study examined blood product use in MT and the use of emergency O Rh Positive red cells (O RBCs) made directly available to emergency patients at the Trauma Center in Ajou University Hospital. Methods: MT was defined as a transfusion of 10 or more RBCs within 24 hours. Data on the total RBCs, fresh frozen plasma (FFP), and platelets (PLTs; single donor platelets (SDP) and random platelet concentrates (PC)) issued from the Blood Bank between March 2016 and November 2017 were extracted from the Hospital Information System and reviewed. SDP was considered equivalent to 6 units of PC. Results: There were 345 MTs in total, and 6233/53268 (11.7%) RBCs, 4717/19376 (24.3%) FFP, and 4473/94166 (4.8%) PLTs were used in MT (P<0.001). For the RBC products in MT and non-MT transfusions, 28.0% and 34.1% were group A; 27.1% and 26.0% were group B; 37.3% and 29.7% were group O; and 7.5% and 10.2% were group AB (P<0.001). The ratios of RBC:FFP:PLT use were 1:0.76:0.72 in MT and 1:0.31:1.91 in non-MT (P<0.001). A total of 461 O RBCs were used in 36.2% (125/345) of MT cases, and the number of O RBCs transfused per patient ranged from 1 to 18. Conclusion: RBCs of the O blood group are the most used in MT. Ongoing education of clinicians to minimize the overuse of emergency O RBCs in MT is required. A procedure to have thawed plasma readily available in MT appears to be important because FFP was used frequently in MT.
https://doi.org/10.17945/kjbt.2018.29.3.253
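The RBC:FFP:PLT ratios quoted in this abstract can be reproduced from the unit counts it reports. The short check below is only an arithmetic sketch; it assumes that the denominators (53268, 19376, 94166) are all units issued in the study period, so that non-MT use is the total minus MT use. The function and variable names are illustrative.

```python
def ratio_to_rbc(rbc, ffp, plt):
    """Express FFP and PLT use per unit of RBC, rounded to two decimals."""
    return (1.0, round(ffp / rbc, 2), round(plt / rbc, 2))

# Unit counts taken from the abstract.
mt = {"rbc": 6233, "ffp": 4717, "plt": 4473}
total = {"rbc": 53268, "ffp": 19376, "plt": 94166}

# Assumption: non-MT use is everything issued that was not used in MT.
non_mt = {product: total[product] - mt[product] for product in total}

mt_ratio = ratio_to_rbc(**mt)          # (1.0, 0.76, 0.72)
non_mt_ratio = ratio_to_rbc(**non_mt)  # (1.0, 0.31, 1.91)
```

Under that assumption both results agree with the ratios stated in the abstract (1:0.76:0.72 in MT and 1:0.31:1.91 in non-MT).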

Electroencephalographic Changes Induced by a Neurofeedback Training : A Preliminary Study in Primary Insomniac Patients (뉴로피드백 훈련에 의한 뇌파 변화 연구 : 일차성 불면증 환자에 대한 예비 연구)

Lee, Jin Han;Shin, Hong-Beom;Kim, Jong Won;Suh, Ho-Suk;Lee, Young Jin
- Sleep Medicine and Psychophysiology / v.26 no.1 / pp.44-48 / 2019
Objectives: Insomnia is one of the most prevalent sleep disorders. Recent studies suggest that cognitive and physical arousal play an important role in the generation of primary insomnia. Studies have also shown that information processing disorders due to cortical hyperactivity might interfere with normal sleep onset and sleep continuity. Therefore, focusing on central nervous system arousal and normalizing information processing have become current topics of interest. It is well known that neurofeedback can reduce brain hyperarousal by modulating patients' brain waves during a sequence of behavior therapy. The purpose of this study was to investigate the effects of neurofeedback therapy on electroencephalography (EEG) characteristics in patients with primary insomnia. Methods: Thirteen subjects who met the criteria for an insomnia diagnosis and 14 control subjects matched for sex and age were included. Neurofeedback and sham treatments were each performed for 30 minutes, in random order. EEG spectral power analyses were performed to quantify the effects of the neurofeedback therapy on brain waveforms. Results: In patients with primary insomnia, relative spectral theta and sigma power during a therapeutic neurofeedback session were significantly lower than during a sham session ($13.9 \pm 2.6$ vs. $12.2 \pm 3.8$ and $3.6 \pm 0.9$ vs. $3.2 \pm 1.0$ in %, respectively; p < 0.05). There were no statistically significant changes in other EEG spectral bands. Conclusion: For the first time in Korea, EEG spectral power in the theta band was found to increase when a neurofeedback session was applied to patients with insomnia. This outcome might provide some insight into new interventions for improving sleep onset. However, the treatment response of insomniacs was not precisely evaluated due to the limitations of the current pilot study, and follow-up studies with larger samples are required.
https://doi.org/10.14401/KASMED.2019.26.1.44

Regeneration of a defective Railroad Surface for defect detection with Deep Convolution Neural Networks (Deep Convolution Neural Networks 이용하여 결함 검출을 위한 결함이 있는 철도선로표면 디지털영상 재 생성)

Kim, Hyeonho;Han, Seokmin
- Journal of Internet Computing and Services / v.21 no.6 / pp.23-31 / 2020
This study was carried out to generate various images of railroad surfaces with random defects as training data in order to improve defect detection. Defects on the surface of railroads are caused by various factors, such as friction between track binding devices and adjacent tracks, and can cause accidents such as broken rails, so railroad maintenance for defects is necessary. Accordingly, much research on defect detection and inspection using image processing or machine learning on railway surface images has been conducted to automate railroad inspection and reduce railroad maintenance costs. In general, the performance of image processing methods and machine learning techniques is affected by the quantity and quality of data. For this reason, some studies require specific devices or vehicles to acquire images of the track surface at regular intervals in order to build a database of various railway surface images. In contrast, to reduce the operating cost of image acquisition, this study constructed the 'Defective Railroad Surface Regeneration Model' by applying methods presented in related studies of the Generative Adversarial Network (GAN). We thus aimed to detect defects on railroad surfaces even without a dedicated database. The constructed model is designed to learn to generate railroad surfaces by combining different railroad surface textures with the original surface, taking into account the ground truth of the railroad defects. The generated images of the railroad surface were used as training data for a defect detection network based on the Fully Convolutional Network (FCN). To validate its performance, we clustered and divided the railroad data into three subsets: one subset of original railroad texture images and two subsets of other railroad surface texture images. In the first experiment, we used only the original texture images as the training set for the defect detection model. In the second experiment, we trained on images generated by combining the original images with a few railroad textures from the other images. Each defect detection model was evaluated against the ground truth in terms of intersection over union (IoU) and F1-score. As a result, the scores increased by about 10~15% when the generated images were used, compared with the case in which only the original images were used. This shows that defects can be detected by using existing data and a few different texture images, even for railroad surface images for which no dedicated training database has been constructed.
https://doi.org/10.7472/jksii.2020.21.6.23
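The two evaluation measures named in this abstract, intersection over union (IoU) and F1-score, can be computed for a binary defect mask as in the minimal sketch below. The flattened 0/1 mask representation, the function name, and the example masks are assumptions for illustration; the abstract does not describe the authors' evaluation code.

```python
def iou_and_f1(pred, truth):
    """IoU and F1 for two equal-length binary masks (1 = defect pixel)."""
    tp = sum(1 for p, t in zip(pred, truth) if p and t)       # defect in both
    fp = sum(1 for p, t in zip(pred, truth) if p and not t)   # predicted only
    fn = sum(1 for p, t in zip(pred, truth) if not p and t)   # missed defect
    union = tp + fp + fn
    if union == 0:
        # Neither mask marks any defect: treat as perfect agreement.
        return 1.0, 1.0
    iou = tp / union
    f1 = 2 * tp / (2 * tp + fp + fn)
    return iou, f1

# Hypothetical flattened masks for a six-pixel image.
pred = [1, 1, 0, 0, 1, 0]
truth = [1, 0, 0, 1, 1, 0]
iou, f1 = iou_and_f1(pred, truth)  # tp=2, fp=1, fn=1 -> IoU 0.5, F1 2/3
```

The two measures are monotonically related (F1 = 2·IoU / (1 + IoU)), so they rank models the same way on a single image but can diverge when averaged over a dataset.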

Comparative evaluation of dose according to changes in rectal gas volume during radiation therapy for cervical cancer : Phantom Study (자궁경부암 방사선치료 시 직장가스 용적 변화에 따른 선량 비교 평가 - Phantom Study)

Choi, So Young;Kim, Tae Won;Kim, Min Su;Song, Heung Kwon;Yoon, In Ha;Back, Geum Mun
- The Journal of Korean Society for Radiation Therapy / v.33 / pp.89-97 / 2021
Purpose: The purpose of this study is to compare and evaluate the dose change caused by variations in rectal gas volume that were not included in the treatment plan during radiation therapy for cervical cancer. Materials and methods: Static Intensity Modulated Radiation Therapy (S-IMRT) plans using 9 fields and Volumetric Modulated Arc Therapy (VMAT) plans using 2 full arcs were established with a treatment planning system on Computed Tomography images of a human phantom. Random gas parameters were included in the Planning Target Volume (PTV), with a maximum change of 2.0 cm in increments of 0.5 cm. The Conformity Index (CI), Homogeneity Index (HI), and PTV D_max were then calculated for the target volume, and the minimum dose (D_min), mean dose (D_mean), and maximum dose (D_max) were calculated and compared for the organs at risk (OAR). For statistical analysis, a t-test was performed to obtain a p-value, with the significance level set to 0.05. Result: The coefficients of determination (R²) for HI in S-IMRT and VMAT were 0.9423 and 0.8223, respectively, indicating a relatively clear correlation, and PTV D_max was found to increase by up to 2.8% as the volume of a given gas parameter increased. In the OAR evaluation, the dose in the bladder did not change with gas volume, while a significant D_mean difference of more than 700 cGy was confirmed in the rectum for both treatment plans at gas volumes of 1.0 cm or more. For all values except the D_mean of the bladder, the p-value was less than 0.05, confirming a statistically significant difference. Conclusion: When gas generation was not considered in the reference treatment plan, the dose difference at the PTV and the dose delivered to the rectum increased as the amount of gas increased. Therefore, during radiation therapy, efforts are needed to minimize the dose transmission error caused by large gas volumes in the rectum. Further studies will be necessary to evaluate dose transmission by varying not only the gas volume but also the location of the gas in the treatment field.

Comparison of Bleeding Tendency Between Selective Serotonin Reuptake Inhibitors and Serotonin Norepinephrine Reuptake Inhibitors Using Platelet Function Analyzer (혈소판기능분석기를 이용한 선택적 세로토닌 재흡수 억제제와 세로토닌 노르에피네프린 재흡수 억제제의 출혈 경향성 비교)

Koo, Seung Mo;Kim, Hyun;Lee, Kang Joon
- Korean Journal of Psychosomatic Medicine / v.29 no.2 / pp.153-161 / 2021
Objectives: The purpose of this study is to compare the bleeding tendency of selective serotonin reuptake inhibitors (SSRI) and serotonin norepinephrine reuptake inhibitors (SNRI) using a platelet function analyzer (PFA-100) in patients with major depressive disorder. Methods: This was a prospective open-label study conducted at a single institution. A total of 41 subjects diagnosed with major depressive disorder under the DSM-5 diagnostic criteria participated. The subjects were randomly assigned to an SSRI (escitalopram) group or an SNRI (duloxetine) group. The closure time (CT) was measured using the platelet function analyzer (PFA-100) before each antidepressant was administered and after 6 weeks. A paired-sample t-test was conducted within each group to determine whether a specific antidepressant had an effect on closure time. To assess the relative change in platelet function between the two groups, an independent-sample t-test was conducted to compare the change in closure time between the groups. Results: There were no significant changes in closure time (CEPI-CT, CADP-CT) between baseline and 6 weeks after drug administration in either the SSRI or the SNRI group, and there was no difference in the amount of change in closure time between the two groups. Conclusions: Our results showed no difference in bleeding tendency between SSRI and SNRI. This study suggests that further large-scale studies on the bleeding tendency of various antidepressants are needed.
https://doi.org/10.22722/KJPM.2021.29.2.153
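The two tests described in the Methods can be illustrated with the t statistics they are built on. The sketch below is not the authors' analysis: the closure-time values are made-up numbers, not study data, and the pooled-variance (Student) form of the independent-sample test is an assumption, since the abstract does not say which variant was used.

```python
import math
from statistics import mean, variance

def paired_t(before, after):
    """Paired-sample t statistic and degrees of freedom for after - before."""
    diffs = [a - b for b, a in zip(before, after)]
    n = len(diffs)
    t = mean(diffs) / math.sqrt(variance(diffs) / n)
    return t, n - 1

def independent_t(x, y):
    """Pooled-variance (Student) t statistic and degrees of freedom."""
    n1, n2 = len(x), len(y)
    pooled = ((n1 - 1) * variance(x) + (n2 - 1) * variance(y)) / (n1 + n2 - 2)
    t = (mean(x) - mean(y)) / math.sqrt(pooled * (1 / n1 + 1 / n2))
    return t, n1 + n2 - 2

# Hypothetical closure times in seconds (NOT data from the study).
before = [100, 110, 120, 130]
after = [102, 111, 123, 132]
t_paired, df_paired = paired_t(before, after)

# Hypothetical per-subject changes in closure time for two groups.
change_group_a = [2, 1, 3, 2]
change_group_b = [1, 0, 2, 1]
t_indep, df_indep = independent_t(change_group_a, change_group_b)
```

The paired test is applied within a group (same subjects, before and after), while the independent test compares the per-subject changes between the two groups; a p-value would then be read from the t distribution with the returned degrees of freedom.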

Analysis of the impact of mathematics education research using explainable AI (설명가능한 인공지능을 활용한 수학교육 연구의 영향력 분석)

Oh, Se Jun
- The Mathematical Education / v.62 no.3 / pp.435-455 / 2023
This study primarily focused on the development of an Explainable Artificial Intelligence (XAI) model to discern and analyze papers with significant impact in the field of mathematics education. To achieve this, meta-information from 29 domestic and international mathematics education journals was used to construct a comprehensive academic research network in mathematics education. This academic network was built by integrating five sub-networks: 'paper and its citation network', 'paper and author network', 'paper and journal network', 'co-authorship network', and 'author and affiliation network'. The Random Forest machine learning model was employed to evaluate the impact of individual papers within the mathematics education research network. SHAP, an XAI method, was used to analyze the reasons behind the AI's assessment of impactful papers. Key features identified through XAI for determining impactful papers in the field of mathematics education included 'paper network PageRank', 'changes in citations per paper', 'total citations', 'changes in the author's h-index', and 'citations per paper of the journal'. It became evident that papers, authors, and journals all play significant roles in evaluating individual papers. When domestic and international mathematics education research were analyzed and compared, variations in these discernment patterns were observed. Notably, the significance of 'co-authorship network PageRank' was emphasized in domestic mathematics education research. The XAI model proposed in this study serves as a tool for determining the impact of papers using AI, providing researchers with strategic direction when writing papers. For instance, expanding the paper network, presenting at academic conferences, and activating the author network through co-authorship were identified as major elements enhancing the impact of a paper.
Based on these findings, researchers can gain a clear understanding of how their work is perceived and evaluated in academia and identify the key factors influencing these evaluations. This study offers a novel approach, based on an explainable AI model, to evaluating the impact of mathematics education papers, a process that has traditionally consumed significant time and resources. The approach presents a new paradigm that can be applied to evaluations in academic fields beyond mathematics education and is expected to substantially enhance the efficiency and effectiveness of research activities.
https://doi.org/10.7468/mathedu.2023.62.3.435
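One of the key features named in this abstract, 'paper network PageRank', can be illustrated with a plain power-iteration PageRank over a small citation graph. This is a generic sketch of the standard algorithm, not the study's pipeline: the damping factor of 0.85, the iteration count, and the three-paper graph are assumptions.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Power-iteration PageRank.

    links maps each paper to the list of papers it cites. Papers that cite
    nothing (dangling nodes) spread their rank evenly over all papers.
    """
    nodes = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(iterations):
        dangling = sum(rank[v] for v in nodes if not links.get(v))
        # Teleportation term plus the evenly shared rank of dangling nodes.
        new = {v: (1 - damping) / n + damping * dangling / n for v in nodes}
        for src, targets in links.items():
            for t in targets:
                new[t] += damping * rank[src] / len(targets)
        rank = new
    return rank

# Hypothetical citation graph: papers A and B both cite paper C.
citations = {"A": ["C"], "B": ["C"], "C": []}
ranks = pagerank(citations)
top_paper = max(ranks, key=ranks.get)  # "C", the most-cited paper
```

A score like this would be one column of the feature table passed to a model such as Random Forest, with SHAP then attributing each prediction to the individual features.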
