• Title/Summary/Keyword: data partitioning (데이터 분할)

High-Speed Implementation and Efficient Memory Usage of Min-Entropy Estimation Algorithms in NIST SP 800-90B (NIST SP 800-90B의 최소 엔트로피 추정 알고리즘에 대한 고속 구현 및 효율적인 메모리 사용 기법)

  • Kim, Wontae;Yeom, Yongjin;Kang, Ju-Sung
    • Journal of the Korea Institute of Information Security & Cryptology, v.28 no.1, pp.25-39, 2018
  • NIST (National Institute of Standards and Technology) recently published the second draft of SP 800-90B, a document for evaluating the security of entropy sources, which are a key element of cryptographic random number generators (RNGs), and provided an accompanying tool implemented in Python. In SP 800-90B, the security evaluation of an entropy source is a process of estimating min-entropy with several estimators. The process is divided into an IID track and a non-IID track. In the IID track, the entropy source is assessed using only the MCV estimator. In the non-IID track, it is assessed using 10 estimators, including the MCV estimator. NIST's tool takes approximately 20 minutes to run in the non-IID track and uses over 5.5 GB of memory. For evaluation agencies that must repeatedly evaluate various samples, and for developers or researchers who must run experiments in various environments, estimating entropy with the tool can be inconvenient and, depending on the environment, may be impossible. In this paper, we propose high-speed implementations and an efficient memory usage technique for the min-entropy estimation algorithms of SP 800-90B. Our major contributions are three methods: one that exploits the advantages of C++ to speed up the MultiMCW estimator, one that reduces memory usage and improves the speed of MultiMMC by rebuilding the data storage structure, and one that improves the speed of LZ78Y by rebuilding the data structure. A tool incorporating the proposed methods is 14 times faster and uses 13 times less memory than NIST's tool.
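As a rough illustration of what the simplest of these estimators computes, the Python sketch below implements a most-common-value (MCV) style estimate. It is not the NIST tool or the authors' C++ code. The function name `mcv_min_entropy` is hypothetical, and the upper confidence bound (the constant 2.576 and the L - 1 denominator) is assumed to follow the final SP 800-90B text, which may differ from the second draft discussed in the abstract.

```python
import math
from collections import Counter


def mcv_min_entropy(samples):
    """Estimate min-entropy per sample from the most common value.

    Assumed form: take the observed frequency of the most common value,
    raise it to an upper confidence bound, and return -log2 of that bound.
    """
    length = len(samples)
    if length < 2:
        raise ValueError("need at least two samples")
    # Proportion of the most frequently observed value.
    p_hat = Counter(samples).most_common(1)[0][1] / length
    # Upper bound on that proportion (2.576 is the normal quantile
    # for a 99% two-sided interval), capped at 1.
    p_upper = min(1.0, p_hat + 2.576 * math.sqrt(p_hat * (1 - p_hat) / (length - 1)))
    return -math.log2(p_upper)


# A constant source carries no entropy; a balanced binary source
# is estimated slightly below 1 bit because of the confidence bound.
constant_estimate = mcv_min_entropy([0] * 100)
balanced_estimate = mcv_min_entropy([0, 1] * 500)
```

The other nine non-IID estimators are considerably more involved, which is where the run time and memory costs described in the abstract arise.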

Factors Influencing the Activation of Brown Adipose Tissue in 18F-FDG PET/CT in National Cancer Center (양전자방출단층촬영 시 갈색지방조직 활성화에 영향을 미치는 요인 분석)

  • You, Yeon Wook;Lee, Chung Wun;Jung, Jae Hoon;Kim, Yun Cheol;Lee, Dong Eun;Park, So Hyeon;Kim, Tae-Sung
    • The Korean Journal of Nuclear Medicine Technology, v.25 no.1, pp.21-28, 2021
  • Purpose: Brown fat, or brown adipose tissue (BAT), is involved in non-shivering thermogenesis and generates heat through glucose metabolism. BAT activation occurs stochastically, depending on internal factors such as age, sex, and body mass index (BMI) and external factors such as temperature and environment. In this retrospective observational study based on electronic medical records (EMR), we conducted a statistical analysis of the relationship between BAT activation and various factors. Materials and Methods: EMR of patients who underwent PET/CT scans at the National Cancer Center over the two years from January 2018 to December 2019 were collected; a total of 9,155 patients were extracted, and 13,442 cases, including repeat scans, were analyzed. After performing univariable logistic regression analyses to determine whether BAT activation is affected by the environment (outdoor temperature) and the patient's condition (BMI, cancer type, sex, and age), a multivariable regression model was built from the univariable factors with P<0.1. Results: BAT activation occurred in 93 cases (0.7%). In the univariable logistic regression analysis, the likelihood of BAT activation was increased in patients under 50 years old (P<0.001), in females (P<0.001), at outdoor temperatures below 14.5℃ (P<0.001), in patients with lower BMI (P<0.001), and in patients who received an injection before 12:30 PM (P<0.001). It decreased in patients with higher BMI (P<0.001) and in patients diagnosed with lung cancer (P<0.05). In the multivariable results, BAT activation was significantly increased in patients under 50 years (P<0.001), in females (P<0.001), and at outdoor temperatures below 14.5℃ (P<0.001). It was significantly decreased in patients with higher BMI (P<0.05). Conclusion: We conducted a retrospective study of factors affecting BAT activation in patients who underwent PET/CT scans over 2 years at the National Cancer Center. The results confirmed that BAT was significantly activated in normal-weight women under 50 years old who underwent PET/CT scans when the outdoor temperature was below 14.5℃. Based on this result, patients to whom these factors apply can be identified in advance, which is expected to help reduce BAT activation through further studies.

Detection of Wildfire Burned Areas in California Using Deep Learning and Landsat 8 Images (딥러닝과 Landsat 8 영상을 이용한 캘리포니아 산불 피해지 탐지)

  • Youngmin Seo;Youjeong Youn;Seoyeon Kim;Jonggu Kang;Yemin Jeong;Soyeon Choi;Yungyo Im;Yangwon Lee
    • Korean Journal of Remote Sensing, v.39 no.6_1, pp.1413-1425, 2023
  • The increasing frequency of wildfires due to climate change is causing extreme loss of life and property. They cause loss of vegetation and affect ecosystem changes depending on their intensity and occurrence. Ecosystem changes, in turn, affect wildfire occurrence, causing secondary damage. Thus, accurate estimation of the areas affected by wildfires is fundamental. Satellite remote sensing is used for forest fire detection because it can rapidly acquire topographic and meteorological information about the affected area after forest fires. In addition, deep learning algorithms such as convolutional neural networks (CNN) and transformer models show high performance for more accurate monitoring of fire-burnt regions. To date, the application of deep learning models has been limited, and there is a scarcity of reports providing quantitative performance evaluations for practical field utilization. Hence, this study emphasizes a comparative analysis, exploring performance enhancements achieved through both model selection and data design. This study examined deep learning models for detecting wildfire-damaged areas using Landsat 8 satellite images in California. Also, we conducted a comprehensive comparison and analysis of the detection performance of multiple models, such as U-Net and High-Resolution Network-Object Contextual Representation (HRNet-OCR). Wildfire-related spectral indices such as normalized difference vegetation index (NDVI) and normalized burn ratio (NBR) were used as input channels for the deep learning models to reflect the degree of vegetation cover and surface moisture content. As a result, the mean intersection over union (mIoU) was 0.831 for U-Net and 0.848 for HRNet-OCR, showing high segmentation performance. The inclusion of spectral indices alongside the base wavelength bands resulted in increased metric values for all combinations, affirming that the augmentation of input data with spectral indices contributes to the refinement of pixels. 
This study can be applied to other satellite images to build a recovery strategy for fire-burnt areas.
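The spectral indices and the mIoU metric named in this abstract can be sketched in a few lines of plain Python. The abstract does not give formulas, so the conventional definitions are assumed here: NDVI = (NIR - Red) / (NIR + Red) and NBR = (NIR - SWIR2) / (NIR + SWIR2), which for Landsat 8 are usually computed from bands 5, 4, and 7. All function names are hypothetical, and the reflectance values and labels are invented for illustration only.

```python
def normalized_difference(a, b):
    """Return (a - b) / (a + b), or 0.0 when the denominator is zero."""
    denom = a + b
    return 0.0 if denom == 0 else (a - b) / denom


def ndvi(nir, red):
    """Normalized difference vegetation index for one pixel."""
    return normalized_difference(nir, red)


def nbr(nir, swir2):
    """Normalized burn ratio for one pixel."""
    return normalized_difference(nir, swir2)


def mean_iou(pred, truth, classes=(0, 1)):
    """Mean intersection over union across classes for flat label lists."""
    ious = []
    for c in classes:
        inter = sum(1 for p, t in zip(pred, truth) if p == c and t == c)
        union = sum(1 for p, t in zip(pred, truth) if p == c or t == c)
        if union:  # skip classes absent from both prediction and truth
            ious.append(inter / union)
    if not ious:
        raise ValueError("no class present in prediction or truth")
    return sum(ious) / len(ious)


# Invented example: one pixel's indices and a four-pixel burned/unburned mask.
pixel_ndvi = ndvi(nir=0.5, red=0.1)
pixel_nbr = nbr(nir=0.5, swir2=0.3)
score = mean_iou(pred=[1, 1, 0, 0], truth=[1, 0, 0, 0])
```

In a setup like the one the abstract describes, index values of this kind would be stacked with the base wavelength bands as extra input channels before segmentation.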

Performance Analysis of Frequent Pattern Mining with Multiple Minimum Supports (다중 최소 임계치 기반 빈발 패턴 마이닝의 성능분석)

  • Ryang, Heungmo;Yun, Unil
    • Journal of Internet Computing and Services, v.14 no.6, pp.1-8, 2013
  • Data mining techniques are used to find important and meaningful information in huge databases, and pattern mining, which discovers useful patterns in such databases, is one of the most significant of these techniques. Frequent pattern mining, a branch of pattern mining, extracts patterns whose frequencies are higher than a minimum support threshold; these are called frequent patterns. Traditional frequent pattern mining applies a single minimum support threshold to the whole database. This single support model implicitly assumes that all items in the database have the same nature. In real-world applications, however, each item can have its own characteristics, so a pattern mining technique that reflects them is required. In frequent pattern mining frameworks that ignore the nature of items, the single minimum support threshold must be set very low to mine patterns containing rare items, but this yields too many patterns, including meaningless ones. Conversely, no patterns can be mined if the threshold is too high. This dilemma is called the rare item problem. To solve it, early studies proposed approximate approaches that split the data into several groups according to item frequencies or that group related rare items. However, because they are approximate, these methods cannot find all frequent patterns, including rare frequent patterns. Hence, a pattern mining model with multiple minimum supports was proposed to solve the rare item problem. In this model, each item has its own minimum support threshold, called MIS (Minimum Item Support), which is calculated from item frequencies in the database. By applying the MIS, the multiple minimum supports model finds all rare frequent patterns without generating meaningless patterns or losing significant ones. Meanwhile, candidate patterns are extracted while mining frequent patterns, and in the single minimum support model only the single minimum support is compared with the frequencies of the candidate patterns. Therefore, the characteristics of the items that make up the candidate patterns are not reflected, and the rare item problem occurs. To address this issue, the multiple minimum supports model uses the smallest MIS value among the items in a candidate pattern as the minimum support threshold for that pattern. To mine frequent patterns, including rare frequent patterns, efficiently with this concept, tree-based algorithms of the multiple minimum supports model sort the items in a tree in descending order of MIS, whereas those of the single minimum support model sort them in descending order of frequency. In this paper, we study the characteristics of frequent pattern mining based on multiple minimum supports and evaluate its performance against a general frequent pattern mining algorithm in terms of runtime, memory usage, and scalability. Experimental results show that the algorithm based on multiple minimum supports outperforms the one based on a single minimum support but requires more memory for the MIS information. Moreover, both algorithms show good scalability.
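The rule that a candidate pattern is judged against the smallest MIS of its items can be shown with a brute-force Python sketch. This is not the tree-based algorithm evaluated in the paper. The function names, the toy transactions, and the MIS values (given as absolute counts) are all hypothetical, and the `>=` comparison is an assumption, since the abstract only says "higher than".

```python
from itertools import combinations


def support(itemset, transactions):
    """Count the transactions that contain every item of `itemset`."""
    return sum(1 for t in transactions if itemset <= t)


def frequent_patterns_mis(transactions, mis, max_len=2):
    """Brute-force search for frequent patterns under per-item thresholds.

    `mis` maps each item to its Minimum Item Support (absolute count).
    A pattern's threshold is the smallest MIS among its items.
    """
    items = sorted(mis)
    found = {}
    for size in range(1, max_len + 1):
        for combo in combinations(items, size):
            pattern = frozenset(combo)
            threshold = min(mis[item] for item in pattern)
            count = support(pattern, transactions)
            if count >= threshold:
                found[pattern] = count
    return found


# Toy data: "caviar" is a rare item with a correspondingly low MIS.
transactions = [
    frozenset({"bread", "milk"}),
    frozenset({"bread", "milk"}),
    frozenset({"bread", "caviar"}),
    frozenset({"bread"}),
]
mis = {"bread": 3, "milk": 2, "caviar": 1}
patterns = frequent_patterns_mis(transactions, mis)
```

With a single threshold of 3 on the same toy data, only {bread} would qualify, so the rare pattern {bread, caviar} would be lost. Enumerating every combination is exponential in the number of items, which is why practical algorithms use tree structures instead.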

The Comparison of the Ultra-Violet Radiation of Summer Outdoor Screened by the Landscaping Shade Facilities and Tree (조경용 차양시설과 수목에 의한 하절기 옥외공간의 자외선 차단율 비교)

  • Lee, Chun-Seok;Ryu, Nam-Hyong
    • Journal of the Korean Institute of Landscape Architecture, v.41 no.6, pp.20-28, 2013
  • The purpose of this study was to compare the ultraviolet (UV) radiation under landscaping shade facilities and a tree with the natural solar UV of outdoor space at summer midday. UVA+B and UVB were recorded every minute from the 20th of June to the 26th of September 2012 at a height of 1.1 m under four different shading conditions, using four identical measuring systems consisting of paired analog UVA+B sensors (220~370 nm, Genicom's GUVA-T21GH) and UVB sensors (220~320 nm, Genicom's GUVA-T21GH) and data acquisition systems (Comfile Tech.'s Moacon). The four shading conditions were a wooden shelter (W 4.2 m × L 4.2 m × H 2.5 m), a polyester membrane structure (W 4.9 m × L 4.9 m × H 2.6 m), a Salix koreensis tree (H11 × B30), and a brick-paved plot without any shading. Based on 648 records from 17 sunny days, the time series of natural solar UVA+B and UVB during midday periods were analyzed and compared, and differences among the four shading conditions were tested statistically using 2,052 records from the daytime period of 10 A.M. to 4 P.M. The major findings were as follows. 1. The average UVA+B under the wooden shelter, the membrane, and the tree was 39 μW/cm² (3.4%), 74 μW/cm² (6.4%), and 87 μW/cm² (7.6%), respectively, while the solar UVA+B was 1,148 μW/cm², which means that the facilities and the tree screened at least 93% of solar UVA+B. 2. The average UVB under the wooden shelter, the membrane, and the tree was 12 μW/cm² (5.8%), 26 μW/cm² (13%), and 17 μW/cm² (8.2%), respectively, while the solar UVB was 207 μW/cm². The membrane showed the highest level and the wooden shelter the lowest. 3. According to the time-series analysis, the differences among the three shaded conditions around noon were very small, but those in the early morning and late afternoon were clearly large, which seems to be caused by the form and structure of the shading facilities and tree rather than by the shading materials themselves. In summary, the landscaping shade facilities and the tree performed very well at screening solar UV at summer midday outdoors but poorly at screening lateral UV during early morning and late afternoon. Therefore, the more carefully shading facilities and large trees or forests are designed to block the additional lateral UV, the more effective they will be in conditioning outdoor space by reducing radiation that is useless or even harmful to human activities.
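The percentages in findings 1 and 2 are the shaded readings divided by the unshaded solar reading. The short Python check below recomputes them from the averages quoted in the abstract, assuming the solar UVA+B reference is 1,148 μW/cm², which is the value the reported percentages imply. The function name `transmitted_percent` is hypothetical.

```python
def transmitted_percent(shaded, unshaded):
    """Share of the unshaded reading still measured under the shade, in %."""
    return 100.0 * shaded / unshaded


# Average readings in microwatts per square centimetre, from the abstract.
SOLAR_UVAB = 1148  # assumed reference value (see lead-in)
SOLAR_UVB = 207

uvab_under = {"wooden shelter": 39, "membrane": 74, "tree": 87}
uvb_under = {"wooden shelter": 12, "membrane": 26, "tree": 17}

uvab_percent = {
    name: round(transmitted_percent(value, SOLAR_UVAB), 1)
    for name, value in uvab_under.items()
}
uvb_percent = {
    name: round(transmitted_percent(value, SOLAR_UVB), 1)
    for name, value in uvb_under.items()
}
```

The screening rate is then 100 minus the transmitted percentage for each condition.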

Application of MicroPACS Using the Open Source (Open Source를 이용한 MicroPACS의 구성과 활용)

  • You, Yeon-Wook;Kim, Yong-Keun;Kim, Yeong-Seok;Won, Woo-Jae;Kim, Tae-Sung;Kim, Seok-Ki
    • The Korean Journal of Nuclear Medicine Technology, v.13 no.1, pp.51-56, 2009
  • Purpose: Most hospitals have recently been introducing PACS, and its use continues to expand. Meanwhile, a small-scale PACS called MicroPACS is already in use through open source programs. The aim of this study is to demonstrate the utility of operating a MicroPACS as a substitute back-up device for conventional storage media such as CDs and DVDs, in addition to the full PACS already in use. This study describes how to set up a MicroPACS with open source programs and assesses its storage capability, stability, compatibility, and performance of operations such as "retrieve" and "query". Materials and Methods: 1. We first searched for open source software that met the following criteria for establishing a MicroPACS: (1) it must run on the Windows operating system; (2) it must be freeware; (3) it must be compatible with the PET/CT scanner; (4) it must be easy to use; (5) it must not limit storage capacity; (6) it must support DICOM. 2. (1) To evaluate data storage, we compared the time needed to back up data with the open source software and with optical discs (CDs and DVD-RAMs), and we likewise compared the time needed to retrieve data with the system and with optical discs. (2) To estimate work efficiency, we measured the time needed to find data on CDs, DVD-RAMs, and the MicroPACS. Seven technologists participated in this study. 3. To evaluate the stability of the software, we examined whether any data were lost while the system was maintained for a year. For comparison, we counted the errors that occurred in 500 randomly selected CDs. Results: 1. From among 11 open source software packages, we chose the Conquest DICOM Server, used with MySQL as the database management system. 2. (1) Back-up and retrieval times (min) were as follows: DVD-RAM (5.13, 2.26)/Conquest DICOM Server (1.49, 1.19) with GE DSTE (p<0.001), CD (6.12, 3.61)/Conquest (0.82, 2.23) with GE DLS (p<0.001), and CD (5.88, 3.25)/Conquest (1.05, 2.06) with SIEMENS. (2) The time (sec) needed to find data was as follows: CD (156 ± 46), DVD-RAM (115 ± 21), and Conquest DICOM Server (13 ± 6). 3. There was no data loss (0%) over a year, and 12,741 PET/CT studies were stored in 1.81 TB of storage. In contrast, errors occurred in 14 of 500 CDs (2.8%). Conclusions: We found that a MicroPACS could be set up with open source software and that its performance was excellent. The system built with open source software proved more efficient and more robust than the back-up process using CDs or DVD-RAMs. We believe that a MicroPACS can be an effective data storage device as long as its operators develop and systematize it.

Application of Support Vector Regression for Improving the Performance of the Emotion Prediction Model (감정예측모형의 성과개선을 위한 Support Vector Regression 응용)

  • Kim, Seongjin;Ryoo, Eunchung;Jung, Min Kyu;Kim, Jae Kyeong;Ahn, Hyunchul
    • Journal of Intelligence and Information Systems, v.18 no.3, pp.185-202, 2012
  • Since the value of information has come to be recognized in the information society, the collection and use of information have become important. Just as an artistic painting can be described in thousands of words, a facial expression carries a great deal of information. Following this idea, there have recently been a number of attempts to provide customers and companies with intelligent services that perceive human emotions from facial expressions. For example, MIT Media Lab, the leading organization in this research area, has developed a human emotion prediction model and applied its studies to commercial business. In academia, conventional methods such as Multiple Regression Analysis (MRA) and Artificial Neural Networks (ANN) have been applied to predict human emotion in prior studies. However, MRA is generally criticized for its low prediction accuracy. This is inevitable, since MRA can only explain a linear relationship between the dependent variable and the independent variables. To mitigate the limitations of MRA, some studies, such as Jung and Kim (2012), have used ANN as an alternative and reported that ANN produced more accurate predictions than statistical methods like MRA. However, ANN has also been criticized for overfitting and for the difficulty of network design (e.g., setting the number of layers and the number of nodes in the hidden layers). Against this background, we propose a novel model using Support Vector Regression (SVR) to increase prediction accuracy. SVR is an extension of the Support Vector Machine (SVM) designed to solve regression problems. The model produced by SVR depends only on a subset of the training data, because the cost function for building the model ignores any training data that are close (within a threshold ε) to the model prediction. Using SVR, we built a model that measures the level of arousal and valence from facial features. To validate the usefulness of the proposed model, we collected data on facial reactions to appropriate visual stimuli and extracted features from the data. Preprocessing steps were then taken to choose statistically significant variables. In total, 297 cases were used for the experiment. As comparative models, we also applied MRA and ANN to the same data set. For SVR, we adopted the ε-insensitive loss function and the grid search technique to find the optimal values of parameters such as C, d, σ², and ε. For ANN, we adopted a standard three-layer backpropagation network with a single hidden layer. The learning rate and momentum rate of ANN were set to 10%, and we used the sigmoid function as the transfer function of the hidden and output nodes. We repeated the experiments while varying the number of nodes in the hidden layer to n/2, n, 3n/2, and 2n, where n is the number of input variables. The stopping condition for ANN was set to 50,000 learning events. We used MAE (Mean Absolute Error) as the measure for performance comparison. In the experiment, SVR achieved the highest prediction accuracy on the hold-out data set compared to MRA and ANN. Regardless of the target variable (the level of arousal or the level of positive/negative valence), SVR showed the best performance on the hold-out data set. ANN also outperformed MRA; however, it showed considerably lower prediction accuracy than SVR for both target variables. The findings of our research are expected to be useful to researchers and practitioners who want to build models for recognizing human emotions.
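The two quantities this abstract relies on, the ε-insensitive loss used when fitting SVR and the MAE used for comparison, are simple to state in code. The Python sketch below uses their standard textbook definitions. It does not reproduce the authors' model, the function names are hypothetical, and the numbers are invented for illustration.

```python
def epsilon_insensitive_loss(y_true, y_pred, eps):
    """Per-sample loss that is zero inside a tube of half-width `eps`.

    Errors smaller than `eps` cost nothing, which is why points close
    to the prediction do not influence the fitted SVR model.
    """
    return [max(0.0, abs(t - p) - eps) for t, p in zip(y_true, y_pred)]


def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between targets and predictions."""
    if not y_true:
        raise ValueError("empty input")
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


# Invented arousal scores: the first error (0.05) falls inside the tube,
# the second (0.5) is penalized only by the part exceeding eps.
targets = [1.0, 2.0]
predictions = [1.05, 2.5]
losses = epsilon_insensitive_loss(targets, predictions, eps=0.1)
mae = mean_absolute_error(targets, predictions)
```

MAE counts every error in full, so it remains a neutral yardstick for comparing SVR with MRA and ANN even though SVR itself is trained with the tube-shaped loss.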

Analysis of components according to different collecting time and production method in sun-dried salt (채취시기 및 생산방법에 따른 천일염의 성분 분석)

  • Jin, Yong-Xie;Kim, Haeng-Ryan;Kim, So-Young
    • Food Science and Preservation, v.20 no.6, pp.791-797, 2013
  • This study investigated changes in the composition and microbiological properties of domestic sun-dried salt (white and gray salts) according to collection time and production method. Depending on the month, the moisture contents of the white and gray sun-dried salts were 10.4~13.2% and 5.2~8.0%, respectively, and the sand contents were 0.1% and 0.2~0.3%, respectively. Several samples exceeded the criteria of 15% moisture content and 0.2% sand content. The ash content and salinity of gray salt (below 85% and 90%, respectively) were higher than those of white salt (both below 80%). The total chloride contents of the salts collected in September and October were slightly lower than those of the others but were above the 40% criterion. For mineral contents, there was no significant difference among collection times because the measurements varied widely. The microbiological analysis showed no significant difference between the production methods, but the salt samples collected in September and October had relatively high detection rates of total aerobes, staphylococci, and halophilic bacteria.

Encounters and Acceptable Number of Encounters at the Seoseokdae Trail Section of Mudeungsan National Park (무등산국립공원 서석대 구간의 탐방객 조우수와 허용가능 조우수)

  • Kim, Sang-Mi;Kim, Sang-Oh
    • Korean Journal of Environment and Ecology, v.34 no.5, pp.454-465, 2020
  • This study measured the current number of encounters and established an evaluation criterion for the acceptable number of encounters in the Seoseokdae summit area (SSA) of Mudeungsan National Park in order to examine the management conditions for the number of visitors to the Seoseokdae trail section (STS). Data were obtained from a questionnaire survey of 263 visitors to STS selected through convenience sampling during June 2019. The average number of encounters in SSA was 18.7. Most of the respondents (95.4%) encountered fewer than 30 other visitors. The average maximum number of simultaneous users (AMNSU, measured at 15-minute intervals) in SSA was 13.4 persons (range: 3~31 persons). The hourly AMNSU was highest at 13-14 (21.0 persons), followed by 11-12 (19.8 persons), 14-15 (15.5 persons), 12-13 (15.3 persons), 10-11 (12.3 persons), and 8-9 (10.8 persons). The acceptable encounter number (AEN) derived with the long-question format (LQF) was 59.2 persons, and that derived with the short-question format (SQF) was 55.1 persons. The AEN of respondents who preferred a "near-nature experience" (27.5 persons) was lower than that of those who preferred a "resort/tourism area like experience" (46.6 persons). The current number of encounters and the AMNSU (range: 3~31 persons) in SSA were lower than the AENs derived from LQF (59.2 persons) and SQF (55.1 persons). Eighty-three percent of the respondents preferred a "near-nature experience," while only 10.5% preferred a "resort/tourism area like experience." In addition, 78.4% of the respondents did not perceive SSA as crowded. The vast majority of the respondents (92.3%) reported a personal AEN higher than their perceived encounter number (PEN). The gaps between the personal AEN and the PEN were negatively correlated with perceived crowding.

Correlational Analysis of Supine Position Time and Sleep-related Variables in Obstructive Sleep Apnea Syndrome (폐쇄성 수면무호흡 증후군에서 앙와위 자세시간과 수면관련변인 간 상관관계 분석)

  • Kim, Si Young;Park, Doo-Heum;Yu, Jaehak;Ryu, Seung-Ho;Ha, Ji-Hyeon
    • Sleep Medicine and Psychophysiology, v.24 no.1, pp.32-37, 2017
  • Objectives: A supine sleep position increases sleep apneas compared to non-supine positions in obstructive sleep apnea syndrome (OSAS). However, supine position time (SPT) is not highly associated with apnea-hypopnea index (AHI) in OSAS. We evaluated the correlation among sleep-related variables and SPT in OSAS. Methods: A total of 365 men with OSAS were enrolled in this study. We analyzed how SPT was correlated with demographic data, sleep structure-related variables, OSAS-related variables and heart rate variability (HRV). Multiple linear regression analysis was conducted to investigate the factors that affected SPT. Results: SPT had the most significant correlation with total sleep time (TST ; r = 0.443, p < 0.001), followed by sleep efficiency (SE ; r = 0.300, p < 0.001). Snoring time (r = 0.238, p < 0.001), time at < 90% SpO2 (r = 0.188, p < 0.001), apnea-hypopnea index (AHI ; r = 0.180, p = 0.001) and oxygen desaturation index (ODI ; r = 0.149, p = 0.004) were significantly correlated with SPT. Multiple regression analysis revealed that TST (t = 7.781, p < 0.001), snoring time (t = 3.794, p < 0.001), AHI (t = 3.768, p < 0.001) and NN50 count (t = 1.993, p = 0.047) were associated with SPT. Conclusion: SPT was more highly associated with sleep structure-related parameters than OSAS-related variables. SPT was correlated with TST, SE, AHI, snoring time and NN50 count. This suggests that SPT is likely to be determined by sleep structure, HRV and the severity of OSAS.