• Title/Summary/Keyword: prediction technique

Search Results: 2,086

Analysis of Research Trends Related to Drug Repositioning Based on Machine Learning (머신러닝 기반의 신약 재창출 관련 연구 동향 분석)

  • So Yeon Yoo;Gyoo Gun Lim
    • Information Systems Review
    • /
    • v.24 no.1
    • /
    • pp.21-37
    • /
    • 2022
  • Drug repositioning, one of the methods of developing new drugs, is a useful way to discover new indications by allowing drugs that have already been approved for use in people to be used for other purposes. Recently, with the development of machine learning technology, cases of analyzing vast amounts of biological information and using it to develop new drugs are increasing. Applying machine learning technology to drug repositioning will help find effective treatments quickly. Currently, the world is struggling with coronavirus disease (COVID-19), a new disease caused by a severe acute respiratory syndrome coronavirus. Drug repositioning, which repurposes drugs that have already been clinically approved, could be an alternative route to therapeutics for COVID-19 patients. This study examines research trends in the field of drug repositioning using machine learning techniques. From PubMed, a total of 4,821 papers were collected with the keyword 'drug repositioning' using web scraping. After data preprocessing, frequency analysis, LDA-based topic modeling, random forest classification analysis, and prediction performance evaluation were performed on 4,419 papers. Associated words were analyzed based on a Word2vec model; after PCA dimensionality reduction, K-means clustering was applied to generate labels, and the structured organization of the literature was then visualized with the t-SNE algorithm. Hierarchical clustering was applied to the LDA results and visualized as a heat map. This study identified the research topics related to drug repositioning and presented a method to derive and visualize meaningful topics from a large body of literature using machine learning algorithms. These results are expected to serve as basic data for establishing research and development strategies in the field of drug repositioning.

Development and Validation of 18F-FDG PET/CT-Based Multivariable Clinical Prediction Models for the Identification of Malignancy-Associated Hemophagocytic Lymphohistiocytosis

  • Xu Yang;Xia Lu;Jun Liu;Ying Kan;Wei Wang;Shuxin Zhang;Lei Liu;Jixia Li;Jigang Yang
    • Korean Journal of Radiology
    • /
    • v.23 no.4
    • /
    • pp.466-478
    • /
    • 2022
  • Objective: 18F-fluorodeoxyglucose (FDG) PET/CT is often used for detecting malignancy in patients with newly diagnosed hemophagocytic lymphohistiocytosis (HLH), with acceptable sensitivity but relatively low specificity. The aim of this study was to improve the diagnostic ability of 18F-FDG PET/CT in identifying malignancy in patients with HLH by combining 18F-FDG PET/CT and clinical parameters. Materials and Methods: Ninety-seven patients (age ≥ 14 years) with secondary HLH were retrospectively reviewed and divided into the derivation (n = 71) and validation (n = 26) cohorts according to admission time. In the derivation cohort, 22 patients had malignancy-associated HLH (M-HLH) and 49 patients had non-malignancy-associated HLH (NM-HLH). Data on pretreatment 18F-FDG PET/CT and laboratory results were collected. The variables were analyzed using the Mann-Whitney U test or Pearson's chi-square test, and a nomogram for predicting M-HLH was constructed using multivariable binary logistic regression. The predictors were also ranked using decision-tree analysis. The nomogram and decision tree were validated in the validation cohort (10 patients with M-HLH and 16 patients with NM-HLH). Results: The ratio of the maximal standardized uptake value (SUVmax) of the lymph nodes to that of the mediastinum, the ratio of the SUVmax of bone lesions or bone marrow to that of the mediastinum, and age were selected for constructing the model. The nomogram showed good performance in predicting M-HLH in the validation cohort, with an area under the receiver operating characteristic curve of 0.875 (95% confidence interval, 0.686-0.971). At an appropriate cutoff value, the sensitivity and specificity for identifying M-HLH were 90% (9/10) and 68.8% (11/16), respectively. The decision tree integrating the same variables showed 70% (7/10) sensitivity and 93.8% (15/16) specificity for identifying M-HLH. 
In comparison, visual analysis of 18F-FDG PET/CT images demonstrated 100% (10/10) sensitivity and 12.5% (2/16) specificity. Conclusion: 18F-FDG PET/CT may be a practical technique for identifying M-HLH. The model constructed using 18F-FDG PET/CT features and age was able to detect malignancy with better accuracy than visual analysis of 18F-FDG PET/CT images.
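The multivariable binary logistic model behind the nomogram can be sketched as follows. The three predictors match those named in the abstract (two SUVmax ratios and age), but every data value below is synthetic and illustrative, not the study's cohort.

```python
# Hedged sketch of a multivariable binary logistic model with the three
# predictors named in the abstract. All values are synthetic toy data.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

# columns: [LN/mediastinum SUVmax ratio, bone/mediastinum ratio, age]
X = np.array([
    [5.2, 3.1, 62], [4.8, 2.9, 55], [6.1, 4.0, 70], [3.9, 2.2, 48],
    [1.1, 0.9, 30], [0.8, 1.2, 25], [1.4, 0.7, 35], [0.9, 1.0, 28],
])
y = np.array([1, 1, 1, 1, 0, 0, 0, 0])   # 1 = M-HLH, 0 = NM-HLH

model = LogisticRegression().fit(X, y)
probs = model.predict_proba(X)[:, 1]      # predicted probability of M-HLH
auc = roc_auc_score(y, probs)             # separable toy data, so AUC is high
print(f"AUC: {auc:.3f}")
```

In the study, the fitted coefficients would be rendered as a nomogram, and a cutoff on the predicted probability yields the reported sensitivity/specificity trade-off.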

A Study on the Extraction of Psychological Distance Embedded in Company's SNS Messages Using Machine Learning (머신 러닝을 활용한 회사 SNS 메시지에 내포된 심리적 거리 추출 연구)

  • Seongwon Lee;Jin Hyuk Kim
    • Information Systems Review
    • /
    • v.21 no.1
    • /
    • pp.23-38
    • /
    • 2019
  • The social network service (SNS) is one of the important marketing channels, so many companies actively exploit SNSs by posting SNS messages with content and style appropriate for their customers. In this paper, we focused on the psychological distance embedded in SNS messages and developed a method to measure it by combining traditional content analysis, natural language processing (NLP), and machine learning. Through traditional content analysis by human coding, the psychological distance was extracted from each SNS message, and these coding results were used as input data for NLP and machine learning. With NLP, word embedding was performed and a bag-of-words representation was created. A support vector machine (SVM), one of the machine learning techniques, was used to train and test prediction of the psychological distance in SNS messages. Initially, the sensitivity and precision of the SVM predictions were significantly low because of the extreme skewness of the dataset. We improved the performance of the SVM by balancing the class ratio with an upsampling technique and by using only data coded with the same value in the first content analysis. All performance indices exceeded 70%, which showed that psychological distance can be measured well.
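The class-rebalancing fix for the skewed dataset can be sketched with scikit-learn's `resample`. The feature values and the 90:10 skew below are illustrative, not the paper's coded SNS data.

```python
# Sketch of upsampling the minority class before SVM training, the fix
# the paper applies for low sensitivity on a skewed dataset.
import numpy as np
from sklearn.svm import SVC
from sklearn.utils import resample

rng = np.random.default_rng(42)
# skewed toy data: 90 majority-class messages (0), 10 minority (1)
X_maj = rng.normal(0.0, 1.0, size=(90, 5))
X_min = rng.normal(2.0, 1.0, size=(10, 5))

# upsample the minority class (with replacement) to match the majority
X_min_up = resample(X_min, replace=True, n_samples=90, random_state=42)

X = np.vstack([X_maj, X_min_up])
y = np.array([0] * 90 + [1] * 90)        # now balanced 90:90

clf = SVC(kernel="rbf").fit(X, y)
print("balanced class counts:", np.bincount(y))
```

Upsampling duplicates minority examples rather than synthesizing new ones; it is the simplest way to stop the SVM from defaulting to the majority class.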

Radiologic assessment of the optimal point for tube thoracostomy using the sternum as a landmark: a computed tomography-based analysis

  • Jaeik Jang;Jae-Hyug Woo;Mina Lee;Woo Sung Choi;Yong Su Lim;Jin Seong Cho;Jae Ho Jang;Jea Yeon Choi;Sung Youl Hyun
    • Journal of Trauma and Injury
    • /
    • v.37 no.1
    • /
    • pp.37-47
    • /
    • 2024
  • Purpose: This study aimed to develop a novel tube thoracostomy technique using the sternum, a fixed anatomical structure, as an indicator to reduce the possibility of incorrect chest tube positioning and complications in patients with chest trauma. Methods: This retrospective study analyzed the data of 184 patients with chest trauma who were aged ≥18 years, visited a single regional trauma center in Korea between April and June 2022, and underwent chest computed tomography (CT) with their arms down. The conventional gold standard, the 5th intercostal space (ICS) method, was compared to the lower 1/2, 1/3, and 1/4 of the sternum methods by analyzing CT images. Results: When virtual tube thoracostomy routes were drawn at the mid-axillary line at the 5th ICS level, 150 patients (81.5%) on the right side and 179 patients (97.3%) on the left did not pass the diaphragm. However, at the lower 1/2 of the sternum level, 171 patients (92.9%, P<0.001) on the right and 182 patients (98.9%, P=0.250) on the left did not pass the diaphragm. At the 5th ICS level, 129 patients (70.1%) on the right and 156 patients (84.8%) on the left were located in the safety zone and did not pass the diaphragm. In contrast, at the lower 1/2, 1/3, and 1/4 of the sternum levels, 139 (75.5%, P=0.185), 49 (26.6%, P<0.001), and 10 (5.4%, P<0.001) patients, respectively, on the right, and 146 (79.3%, P=0.041), 69 (37.5%, P<0.001), and 16 (8.7%, P<0.001) on the left were located in the safety zone and did not pass the diaphragm. Compared to the conventional 5th ICS method, the sternum 1/2 method had a safety zone prediction sensitivity of 90.0% to 90.7%, and a 97.3% to 100% sensitivity for not passing the diaphragm. Conclusions: Using the sternum length as a tube thoracostomy indicator might be feasible.
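The abstract compares paired proportions measured on the same patients with two landmark methods. It does not name the statistical test used, but an exact McNemar test is one standard choice for such paired comparisons and can be sketched with SciPy; the discordant-pair counts below are illustrative, not the study's data.

```python
# Exact McNemar test for paired proportions (two landmark methods
# evaluated on the same patients). b and c are the discordant pairs:
# method A safe / method B unsafe, and the reverse.
from scipy.stats import binomtest

b, c = 2, 21   # illustrative discordant-pair counts, not study data

# Under H0 the discordant pairs split 50:50, so the exact two-sided
# McNemar p-value is a binomial test of b successes in b + c trials.
p = binomtest(b, b + c, 0.5).pvalue
print(f"exact McNemar p = {p:.5f}")
```

With heavily imbalanced discordant counts like these, the exact test yields a very small p-value, matching the pattern of the P<0.001 comparisons reported above.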

Study on Method to Develop Case-based Security Threat Scenario for Cybersecurity Training in ICS Environment (ICS 환경에서의 사이버보안 훈련을 위한 사례 기반 보안 위협 시나리오 개발 방법론 연구)

  • GyuHyun Jeon;Kwangsoo Kim;Jaesik Kang;Seungwoon Lee;Jung Taek Seo
    • Journal of Platform Technology
    • /
    • v.12 no.1
    • /
    • pp.91-105
    • /
    • 2024
  • As IT systems are increasingly applied to formerly isolated ICS (industrial control system) network environments, security threats in the ICS environment have increased rapidly. Security threat scenarios help in designing security strategies in cybersecurity training, including analysis, prediction, and response to cyberattacks. For successful cybersecurity training, research is needed to develop valid and reliable security threat scenarios for meaningful training. Therefore, this paper proposes a case-based security threat scenario development methodology for cybersecurity training in the ICS environment. To this end, we develop a methodology consisting of five steps based on analyzing actual cybersecurity incident cases targeting ICS. Threat techniques are standardized in the same form using objective data based on the MITRE ATT&CK framework, and then a list of CVEs and CWEs corresponding to each threat technique is identified. Additionally, vulnerable functions in the programming languages used in the identified CWE entries and ICS assets are analyzed and identified. Based on the data generated in the previous steps, security threat scenarios for cybersecurity training are developed for a new ICS. Verification through a comparative analysis between the proposed methodology and existing research confirmed that the proposed method was more effective than existing methods regarding scenario validity, appropriateness of evidence, and diversity of the developed scenarios.
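The technique-to-evidence mapping in the middle steps can be represented with a simple data structure: each standardized ATT&CK technique carries its candidate CWE weaknesses and known CVEs, and a scenario is an ordered walk through the catalog. The ATT&CK, CWE, and CVE identifiers below are illustrative placeholders, not values taken from the paper.

```python
# Sketch of the mapping step: standardized threat techniques linked to
# CWE/CVE evidence, then assembled into an ordered training scenario.
# All identifiers below are illustrative placeholders.
from dataclasses import dataclass, field

@dataclass
class ThreatTechnique:
    attack_id: str                      # MITRE ATT&CK technique ID
    name: str
    cwes: list = field(default_factory=list)
    cves: list = field(default_factory=list)

catalog = {
    "T0000": ThreatTechnique("T0000", "Initial Access via Exposed Service",
                             cwes=["CWE-79"], cves=["CVE-0000-00000"]),
}

def build_scenario(technique_ids, catalog):
    """Collect CWE/CVE evidence for an ordered list of technique IDs."""
    steps = []
    for tid in technique_ids:
        t = catalog[tid]
        steps.append({"technique": t.name, "cwes": t.cwes, "cves": t.cves})
    return steps

scenario = build_scenario(["T0000"], catalog)
print(scenario)
```

Keeping the catalog keyed by ATT&CK ID mirrors the paper's standardization step: incident cases from different sources can be merged once their techniques are expressed in the same form.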


Discussion on Detection of Sediment Moisture Content at Different Altitudes Employing UAV Hyperspectral Images (무인항공 초분광 영상을 기반으로 한 고도에 따른 퇴적물 함수율 탐지 고찰)

  • Kyoungeun Lee;Jaehyung Yu;Chanhyeok Park;Trung Hieu Pham
    • Economic and Environmental Geology
    • /
    • v.57 no.4
    • /
    • pp.353-362
    • /
    • 2024
  • This study examined the spectral characteristics of sediments according to moisture content using an unmanned aerial vehicle (UAV)-based hyperspectral sensor and evaluated the efficiency of moisture content detection at different flight altitudes. For this purpose, hyperspectral images in the 400-1,000 nm wavelength range were acquired and analyzed at altitudes of 40 m and 80 m for sediment samples with various moisture contents. The reflectance of the sediments generally showed a decreasing trend as the moisture content increased. Correlation analysis between moisture content and reflectance showed a strong negative correlation (r < -0.8) across the entire 400-900 nm range. The moisture content detection model constructed using the Random Forest technique showed detection accuracies of RMSE 2.6% and R² 0.92 at 40 m altitude and RMSE 2.2% and R² 0.95 at 80 m altitude, confirming that the difference in accuracy between altitudes was minimal. Variable importance analysis revealed that the 600-700 nm band played a crucial role in moisture content detection. This study is expected to be utilized in efficient sediment moisture management and natural disaster prediction in the field of environmental monitoring.
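The Random Forest moisture model described above can be sketched as a regression from band reflectances to moisture content. The spectra below are synthetic, built only to encode the reported negative moisture-reflectance relation; the band count, sample size, and noise level are assumptions.

```python
# Sketch of Random Forest moisture regression on synthetic spectra that
# encode the reported negative moisture-reflectance correlation.
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(1)
n_samples, n_bands = 200, 60                  # bands spanning ~400-1,000 nm
moisture = rng.uniform(0, 40, n_samples)      # moisture content, %

base = rng.uniform(0.3, 0.6, (1, n_bands))    # dry-sediment reflectance
# reflectance drops as moisture rises, plus sensor noise
spectra = base - 0.005 * moisture[:, None] \
          + rng.normal(0, 0.01, (n_samples, n_bands))

X_tr, X_te, y_tr, y_te = train_test_split(spectra, moisture, random_state=0)
rf = RandomForestRegressor(n_estimators=200, random_state=0).fit(X_tr, y_tr)

pred = rf.predict(X_te)
rmse = mean_squared_error(y_te, pred) ** 0.5
r2 = r2_score(y_te, pred)
print(f"RMSE: {rmse:.2f} %, R2: {r2:.2f}")
```

`rf.feature_importances_` would give the per-band importance ranking analogous to the study's finding that the 600-700 nm region dominates.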

A Study For Optimizing Input Waveforms In Radiofrequency Liver Tumor Ablation Using Finite Element Analysis (유한 요소 해석을 이용한 고주파 간 종양 절제술의 입력 파형 최적화를 위한 연구)

  • Lim, Do-Hyung;NamGung, Bum-Seok;Lee, Tae-Woo;Choi, Jin-Seung;Tack, Gye-Rae;Kim, Han-Sung
    • Journal of Biomedical Engineering Research
    • /
    • v.28 no.2
    • /
    • pp.235-243
    • /
    • 2007
  • Hepatocellular carcinoma is a significant worldwide public health problem, with an estimated annual mortality of 1,000,000 people. Radiofrequency (RF) ablation is an interventional technique that in recent years has come to be used for treatment of hepatocellular carcinoma, destroying tumor tissue at high temperatures. Numerous studies have attempted to prove the excellence of RF ablation and to improve its efficiency by various methods. However, these attempts are sometimes at odds with the advantages of RF ablation, namely its minimally invasive character and operative simplicity. The aim of the current study is, therefore, to suggest an improved RF ablation technique by identifying an optimum RF pattern, one of the important factors capable of controlling the extent of the high-temperature region, without losing the advantages of RF ablation. A three-dimensional finite element (FE) model was developed and validated by comparison with results reported in the literature. Four representative RF patterns (sine, square, exponential, and simulated RF waves), corresponding to currents fed during simulated RF ablation, were investigated. The following parameters for each RF pattern were analyzed to identify which is the most effective at eliminating tumor tissue: 1) the maximum temperature; 2) the degree of alteration of the maximum temperature in a constant time range (30-40 seconds); 3) the domain of temperature over the 47°C isothermal temperature (IT); and 4) the domain inducing over 63% cell damage. Here, heat transfer characteristics within the tissues were determined by the bioheat governing equation. The developed FE model showed approximately 90-95% accuracy in predicting the maximum temperature and the domains of interest achieved during RF ablation. The maximum temperatures for the sine, square, exponential, and simulated RF waves were 69.0°C, 66.9°C, 65.4°C, and 51.8°C, respectively.
While the maximum temperatures decreased in the constant time range, the average time intervals for the sine, square, exponential, and simulated RF waves were 0.49±0.14, 1.00±0.00, 1.65±0.02, and 1.66±0.02 seconds, respectively. The average magnitudes of the decrease in maximum temperature over the time range were 0.45±0.15°C for the sine wave, 1.93±0.02°C for the square wave, 2.94±0.05°C for the exponential wave, and 1.53±0.06°C for the simulated RF wave. The volumes of the temperature domain over the 47°C IT for the sine, square, exponential, and simulated RF waves were 1,480 mm³, 1,440 mm³, 1,380 mm³, and 395 mm³, respectively. The volumes inducing over 63% cell damage for the sine, square, exponential, and simulated RF waves were 114 mm³, 62 mm³, 17 mm³, and 0 mm³, respectively. These results suggest that applying a sine wave during RF ablation may generally be the most effective at destroying tumor tissue, compared with the other RF patterns.
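The "bioheat governing equation" named above is usually the Pennes form, and the 63% cell-damage criterion corresponds to an Arrhenius damage integral equal to one. These are standard textbook forms with commonly used symbols, not equations reproduced from the paper:

```latex
% Pennes bioheat equation (standard form; symbols as commonly defined)
\rho c \frac{\partial T}{\partial t}
  = \nabla \cdot (k \nabla T) + q_{RF} - \omega_b \rho_b c_b (T - T_a)
% \rho, c: tissue density and specific heat; k: thermal conductivity
% q_{RF}: RF (Joule) heat source; \omega_b, \rho_b, c_b: blood perfusion
% rate, density, specific heat; T_a: arterial blood temperature

% Arrhenius thermal damage: the damaged cell fraction is 1 - e^{-\Omega},
% so \Omega = 1 corresponds to 1 - e^{-1} \approx 63\% cell damage
\Omega(t) = \int_0^t A \exp\!\left(-\frac{E_a}{R\,T(\tau)}\right) d\tau
```

This is why the 63% threshold appears alongside the 47°C isotherm: the former tracks accumulated thermal dose, the latter an instantaneous temperature boundary.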

Development of Unfolding Energy Spectrum with Clinical Linear Accelerator based on Transmission Data (물질투과율 측정정보 기반 의료용 선형가속기의 에너지스펙트럼 유도기술 개발)

  • Choi, Hyun Joon;Park, Hyo Jun;Yoo, Do Hyeon;Kim, Byoung-Chul;Yi, Chul-Young;Min, Chul Hee
    • Journal of Radiation Protection and Research
    • /
    • v.41 no.1
    • /
    • pp.41-47
    • /
    • 2016
  • Background: For accurate dose assessment in radiation therapy, the energy spectrum of the photon beam generated from the linac head is essential. The aim of this study was to develop a technique to accurately unfold the energy spectrum with the transmission analysis method. Materials and Methods: A clinical linear accelerator and the Monte Carlo method were employed to evaluate the transmission signals according to the thickness of the absorber material, and the response function of the ion chamber was determined with mono-energy beams. Finally, the energy spectrum was unfolded with the HEPROW program. An Elekta Synergy Platform linac and the Geant4 toolkit were used in this study. Results and Discussion: In the comparison between calculated and measured transmission signals using an aluminum alloy as the attenuator, the root mean square error was 0.43%. In the comparison between the spectrum unfolded with the HEPROW program and the spectrum calculated with Geant4, the differences in peak and mean energy were 0.066 and 0.03 MeV, respectively. However, for accurate prediction of the energy spectrum, additional experiments with various types of materials and improvement of the unfolding program are required. Conclusion: This research demonstrated that the spectrum unfolding technique can be applied to megavoltage photon beams using an aluminum alloy and the HEPROW program.
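The transmission analysis that HEPROW unfolds can be written as a Fredholm integral equation relating the measured signal to the unknown spectrum. This is the standard form of the problem (symbols assumed, not quoted from the paper):

```latex
% Measured transmission signal for absorber thickness t_j:
S(t_j) = \int \phi(E)\, R(E)\, e^{-\mu(E)\, t_j}\, dE
% \phi(E): photon energy spectrum to be unfolded
% R(E): ion chamber response function (from mono-energy Monte Carlo runs)
% \mu(E): linear attenuation coefficient of the absorber (aluminum alloy)
% Discretizing E turns this into the linear system S = A\,\phi,
% which the unfolding program inverts.
```

The ill-conditioning of this inversion is why the abstract notes that more absorber materials and an improved unfolding program are needed for accurate spectra.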

The Adaptive Personalization Method According to Users' Purchasing Index: Application to Beverage Purchasing Predictions (고객별 구매빈도에 동적으로 적응하는 개인화 시스템 : 음료수 구매 예측에의 적용)

  • Park, Yoon-Joo
    • Journal of Intelligence and Information Systems
    • /
    • v.17 no.4
    • /
    • pp.95-108
    • /
    • 2011
  • This is a study of a personalization method that intelligently adapts the level of clustering considering the purchasing index of a customer. In the e-biz era, many companies gather customers' demographic and transactional information such as age, gender, purchasing date, and product category. They use this information to predict customers' preferences or purchasing patterns so that they can provide more customized services to their customers. The conventional Customer-Segmentation method provides customized services for each customer group. This method clusters the whole customer set into different groups based on similarity and builds predictive models for the resulting groups. Thus, it can manage the number of predictive models and also provide more data for customers who do not have enough data to build a good predictive model, by using the data of other similar customers. However, this method often fails to provide highly personalized services to each customer, which is especially important for VIP customers. Furthermore, it clusters customers who already have a considerable amount of data together with customers who have only a small amount of data, which increases computational cost unnecessarily without significant performance improvement. The other conventional method, called the 1-to-1 method, provides more customized services than the Customer-Segmentation method for each individual customer, since the predictive model is built using only that customer's data. This method not only provides highly personalized services but also builds a relatively simple and less costly model that satisfies each customer. However, the 1-to-1 method has a limitation: it does not produce a good predictive model when a customer has only a small amount of data. In other words, if a customer has an insufficient amount of transactional data, the performance of this method deteriorates.
In order to overcome the limitations of these two conventional methods, we suggest a new method, called the Intelligent Customer Segmentation method, that provides adaptively personalized services according to the customer's purchasing index. The suggested method clusters customers according to their purchasing index, so that predictions for customers who purchase less are based on the data of more intensively clustered groups, while VIP customers, who already have a considerable amount of data, are clustered to a much lesser extent or not at all. The main idea of this method is to apply the clustering technique when the number of transactional records for the target customer is less than a predefined criterion data size. In order to find this criterion number, we suggest an algorithm called sliding window correlation analysis. The algorithm aims to find the transactional data size at which the performance of the 1-to-1 method radically decreases due to data sparsity. After finding this criterion data size, we apply the conventional 1-to-1 method to customers who have more data than the criterion and apply the clustering technique to those who have less, until at least the predefined criterion amount of data can be used for model building. We apply the two conventional methods and the newly suggested method to Nielsen's beverage purchasing data to predict the purchasing amounts and purchasing categories of the customers. We use two data mining techniques (Support Vector Machine and Linear Regression) and two types of performance measures (MAE and RMSE) to predict the two dependent variables. The results show that the suggested Intelligent Customer Segmentation method can outperform the conventional 1-to-1 method in many cases and produces the same level of performance as the Customer-Segmentation method at much lower computational cost.
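The routing rule at the heart of the Intelligent Customer Segmentation method can be sketched as a simple decision on each customer's transaction count. The function name, the example customers, and the criterion value of 30 are illustrative assumptions; the paper derives the actual criterion with its sliding window correlation analysis.

```python
# Sketch of the routing rule: customers with at least `criterion`
# transactions get an individual (1-to-1) model; the rest are pooled
# into clusters. The criterion value and customers are illustrative.
def choose_strategy(n_transactions, criterion=30):
    """Return the modeling strategy for one customer."""
    if n_transactions >= criterion:
        return "1-to-1"          # enough data: build an individual model
    return "cluster"             # sparse data: borrow from similar customers

customers = {"vip": 120, "regular": 45, "new": 5}
plan = {cid: choose_strategy(n) for cid, n in customers.items()}
print(plan)   # {'vip': '1-to-1', 'regular': '1-to-1', 'new': 'cluster'}
```

This captures why the method saves computation relative to segmenting everyone: clustering cost is spent only where the 1-to-1 method would fail for lack of data.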

The Simulation of Pore Size Distribution from Unsaturated Hydraulic Conductivity Data Using the Hydraulic Functions (토양 수리학적 함수를 이용한 불포화 수리전도도로부터 공극크기분포의 모사)

  • Yoon, Young-Man;Kim, Jeong-Gyu;Shin, Kook-Sik
    • Korean Journal of Soil Science and Fertilizer
    • /
    • v.43 no.4
    • /
    • pp.407-414
    • /
    • 2010
  • Until now, the pore size distribution (PSD) of a soil profile has been calculated from soil moisture characteristic data by the water release method or mercury porosimetry using the capillary rise equation. However, the current methods are often difficult to use and time consuming. Thus, in this work, a theoretical framework for an easy and fast technique was suggested to estimate the PSD from unsaturated hydraulic conductivity data in an undisturbed field soil profile. In this study, unsaturated hydraulic conductivity data were collected and simulated by varying the soil parameters within the given boundary conditions (Brooks-Corey soil parameters, α_BC = 1-5 L⁻¹ and b = 1-10; van Genuchten soil parameters, α_VG = 0.001-1.0 L⁻¹ and m = 0.1-0.9). Then, K_s (1.0 cm h⁻¹) was used as the fixed input parameter for the simulation of each model. The PSDs were estimated from the collected K(h) data by model simulation. In the simulation of the Brooks-Corey parameters, the saturated hydraulic conductivity K_s played the role of a scaling factor for the unsaturated hydraulic conductivity K(h). Changes in parameter b closely explained the shape of the PSD curve of the soil, and α_BC affected the sensitivity of the PSD curve. In the case of the van Genuchten model, K_s and α_VG played the role of scaling factors for the vertical and horizontal axes, respectively. Parameter m systematically described the shapes of the PSD curve and K(h). This study suggests that the new theoretical technique can be applied to the in situ prediction of the PSD in undisturbed field soils.
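The capillary rise equation mentioned above links matric head to an equivalent pore radius, and a retention function such as van Genuchten's supplies the simulated K(h). These are textbook forms with commonly used symbols, not equations quoted from the paper:

```latex
% Capillary rise: equivalent pore radius r for matric head h
r = \frac{2\,\sigma \cos\theta}{\rho_w\, g\, h}
% \sigma: surface tension; \theta: contact angle;
% \rho_w: water density; g: gravitational acceleration

% van Genuchten effective saturation used in simulating K(h)
S_e = \left[\,1 + (\alpha_{VG}\, h)^{n}\,\right]^{-m}
% one common constraint is m = 1 - 1/n, though m may also be varied
% independently, as in the parameter ranges given above
```

Because each matric head h maps to one equivalent radius r, sweeping h through the simulated K(h) curve yields the pore size distribution directly, which is the shortcut the study proposes.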