• Title/Summary/Keyword: Data driven method


Incidence, Prevalence, and Mortality Rate of Gastrointestinal Cancer in Isfahan, Iran: Application of the MIAMOD Method

  • Moradpour, Farhad;Gholami, Ali;Salehi, Mohammad;Mansori, Kamiar;Maracy, Mohammad Reza;Javanmardi, Setareh;Rajabi, Abdolhalim;Moradi, Yousef;Khodadost, Mahmod
    • Asian Pacific Journal of Cancer Prevention
    • /
    • v.17 no.sup3
    • /
    • pp.11-15
    • /
    • 2016
  • Gastrointestinal cancers remain the most prevalent cancers in many developing countries such as Iran. The aim of this study was to estimate incidence, prevalence and mortality, as well as time trends, for gastrointestinal cancers in Isfahan province of Iran for the period 2001 to 2010 and to project these estimates to the year 2020. Estimates were derived by applying the MIAMOD method (a backward calculation approach using mortality and relative survival rates). Mortality data were obtained from the Ministry of Health, and the relative survival rate for all gastrointestinal cancers combined was derived from the Eurocare 3 study. Results indicated clear upward trends in age-adjusted incidence (males 22.9 to 74.2 and females 14.9 to 44.2), prevalence (males 52.6 to 177.7 and females 38.3 to 111.03), and mortality (males 14.6 to 47.2 and females 9.6 to 28.2) rates per 100,000 for the period 2001 to 2010, and this upward trend is projected to persist over the projection period. For the entire period, the male-to-female ratio increased slightly for all parameters (incidence from 1.5 to 1.7, prevalence from 1.4 to 1.6, and mortality from 1.5 to 1.7). In males, totals of 2,179 incident cases, 5,097 prevalent cases and 1,398 deaths were predicted to occur during the study period. For females the predicted figures were 1,379, 3,190 and 891, respectively. It was concluded that the upward trend in incidence, alongside increasing survival rates, would place a high burden on the health care infrastructure of Isfahan province in the future.
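The core idea behind MIAMOD-style back-calculation can be illustrated with a toy model. This sketch is not the MIAMOD software: it assumes a simple linear forward model in which deaths in year t come from cases diagnosed in earlier years, weighted by made-up post-diagnosis death probabilities, and recovers incidence by inverting that relation.

```python
import numpy as np

# Hypothetical illustration of the back-calculation idea: observed
# mortality is modeled as past incidence filtered through fatality
# probabilities (the complement of relative survival), and incidence
# is recovered by inverting that linear relation. All numbers are made up.

def mortality_from_incidence(incidence, death_prob):
    """Forward model: deaths in year t come from cases diagnosed in
    year t-k that die k years after diagnosis (death_prob[k])."""
    T = len(incidence)
    mortality = np.zeros(T)
    for t in range(T):
        for k in range(t + 1):
            mortality[t] += incidence[t - k] * death_prob[k]
    return mortality

def back_calculate_incidence(mortality, death_prob):
    """Invert the forward model (a lower-triangular linear system)."""
    T = len(mortality)
    A = np.zeros((T, T))
    for t in range(T):
        for k in range(t + 1):
            A[t, t - k] = death_prob[k]
    return np.linalg.solve(A, mortality)

true_incidence = np.array([20.0, 25.0, 30.0, 36.0, 43.0])  # per 100,000 (synthetic)
death_prob = np.array([0.30, 0.15, 0.08, 0.04, 0.02])      # death k years after diagnosis (synthetic)
observed_mortality = mortality_from_incidence(true_incidence, death_prob)
estimated = back_calculate_incidence(observed_mortality, death_prob)
print(np.allclose(estimated, true_incidence))  # the inversion recovers incidence
```

In the real method, survival comes from an external source (here, Eurocare 3) precisely because incidence data are incomplete, so only the mortality series and survival model are treated as known.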

Estimation of Energy Expenditure using Unfixed Accelerometer during Exercise (비고정식 가속도계를 이용한 운동 중 에너지소비 추정)

  • Kim, Joo-Han;Lee, Jeon;Lee, Hee-Young;Kim, Young-Ho;Lee, Kyoung-Joung
    • Journal of the Institute of Electronics Engineers of Korea SC
    • /
    • v.48 no.4
    • /
    • pp.63-70
    • /
    • 2011
  • In this paper, we propose a method for estimating energy expenditure using the unfixed axis of an accelerometer. Most studies adopt waist placement because the waist is close to the center of mass of the whole human body. We instead adopted pocket placement, which allows the use of an unfixed sensor axis and is more convenient than conventional methods. To evaluate the proposed method, 28 male subjects walked and ran on a motor-driven treadmill. All subjects wore an indirect calorimeter and a fixed accelerometer, and data were measured simultaneously during exercise. Regression analysis was performed on the test group (n=20) and the resulting regression equation was applied to the control group (n=8). A strong linear relationship between energy expenditure and the unfixed accelerometer signal was found. Furthermore, the coefficient of determination was highly reliable ($R^2$=0.98) with a p-value of effectively zero. The error of energy expenditure estimation relative to the indirect calorimeter was 15.0% (fixed) and 17.0% (unfixed), respectively. These results show that the unfixed accelerometer can be used to estimate energy expenditure during exercise.
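The regression step described above can be sketched in a few lines. This is a minimal illustration with synthetic data, not the authors' dataset: a linear model of energy expenditure against an accelerometer activity count is fitted on a training split and applied to a held-out split, mirroring the 20/8 subject design.

```python
import numpy as np

# Synthetic sketch of the paper's regression step: fit EE = a*counts + b
# on a training group, apply it to a held-out group, and compute R^2.
rng = np.random.default_rng(0)
counts_train = rng.uniform(100, 1000, 20)                      # activity counts (arbitrary units)
ee_train = 0.05 * counts_train + 1.2 + rng.normal(0, 0.5, 20)  # kcal/min (fabricated)

a, b = np.polyfit(counts_train, ee_train, 1)   # least-squares line

counts_test = rng.uniform(100, 1000, 8)
ee_pred = a * counts_test + b                  # apply equation to control group

# Coefficient of determination on the training set
ee_fit = a * counts_train + b
ss_res = np.sum((ee_train - ee_fit) ** 2)
ss_tot = np.sum((ee_train - ee_train.mean()) ** 2)
r2 = 1 - ss_res / ss_tot
print(round(r2, 2))
```

With a genuinely linear signal and modest noise, R² comes out close to 1, which is the same qualitative picture the paper reports (R²=0.98) for the unfixed-axis signal.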

A Meta Study on Research Trend of Digital Forensic in Korea (메타스터디를 통한 국내 디지털 포렌식 연구 동향)

  • Kwak, Na-Yeon;Lee, Choong C.;Maeng, Yun-Ho;Cho, Bang-Ho;Lee, Sang-Eun
    • Informatization Policy
    • /
    • v.24 no.3
    • /
    • pp.91-107
    • /
    • 2017
  • Digital forensics is the process of uncovering and interpreting electronic data and materials found in digital devices in relation to crime. The goal of the process is to preserve any evidence in its most original form so that it can carry the force of law. The digital forensic market is growing along with ICT in both domestic and global markets. Many countries, including the U.S., are actively performing research on structured investigation that collects, identifies and validates digital information for the purpose of reconstructing past events, as is the academic community in Korea. This paper aims to understand the overall research trend in digital forensics and derive a future strategy by integrating the results of a meta-analysis into practice, based on five criteria - main theme and topic, analysis phase, technical method of analysis, author's affiliation, and unit of analysis and method. 239 papers were analyzed, selected out of 470 papers published over 10 years (2007~2016) in academic journals on the KCI (Korea Citation Index) list. The results of this analysis will be used to examine the characteristics of research in the field of digital forensics, and will contribute to understanding the research trends and characteristics leading this technology-driven academic field, through which measures for further research development and facilitation are suggested.

Prediction of the remaining time and time interval of pebbles in pebble bed HTGRs aided by CNN via DEM datasets

  • Mengqi Wu;Xu Liu;Nan Gui;Xingtuan Yang;Jiyuan Tu;Shengyao Jiang;Qian Zhao
    • Nuclear Engineering and Technology
    • /
    • v.55 no.1
    • /
    • pp.339-352
    • /
    • 2023
  • Prediction of the time-related traits of pebble flow inside pebble-bed HTGRs is of great significance for reactor operation and design. In this work, an image-driven approach with the aid of a convolutional neural network (CNN) is proposed to predict the remaining time of initially loaded pebbles and the time interval of paired flow images of the pebble bed. Two types of strategies are put forward: one adds FC layers to classic classification CNN models and uses regression training; the other is CNN-based deep expectation (DEX), which treats time prediction as a deep classification task followed by softmax expected-value refinement. The dataset is obtained from discrete element method (DEM) simulations. Results show that the CNN-aided models generally make satisfactory predictions of the remaining time, with a determination coefficient larger than 0.99. Among these models, VGG19+DEX performs best and its CumScore (proportion of the test set with prediction error within 0.5 s) reaches 0.939. In addition, the remaining time of additional test sets and new cases can also be well predicted, indicating good generalization ability of the model. In the task of predicting the time interval of image pairs, the VGG19+DEX model also generates satisfactory results. In particular, the trained model, with promising generalization ability, has demonstrated great potential for accurately and instantaneously predicting the traits of interest without additional computationally intensive DEM simulations. Nevertheless, data diversity and model optimization need to be improved to achieve the full potential of the CNN-aided prediction tool.
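The DEX refinement mentioned above reduces to a simple computation once the network has produced class logits: take the softmax over discrete time bins, then report the expected value of the bin centers. The sketch below fabricates the logits (in the paper they come from a CNN such as VGG19 over pebble-flow images); only the refinement step is shown.

```python
import numpy as np

# Deep-expectation (DEX) refinement: classification over time bins,
# final estimate = softmax-weighted expectation of the bin centers.
def dex_expected_time(logits, bin_centers):
    """Softmax over class logits, then expectation over bin centers."""
    z = logits - logits.max()            # shift for numerical stability
    p = np.exp(z) / np.exp(z).sum()
    return float(np.dot(p, bin_centers))

bin_centers = np.arange(0.0, 10.0, 0.5)   # remaining-time bins, in seconds
logits = -((bin_centers - 6.2) ** 2)      # fake logits peaked near 6.2 s
t_hat = dex_expected_time(logits, bin_centers)
print(round(t_hat, 2))
```

Because the expectation interpolates between bins, the refined estimate can land between bin centers (here near 6.2 s), which is why DEX tends to beat plain argmax classification on continuous targets.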

Predictive Clustering-based Collaborative Filtering Technique for Performance-Stability of Recommendation System (추천 시스템의 성능 안정성을 위한 예측적 군집화 기반 협업 필터링 기법)

  • Lee, O-Joun;You, Eun-Soon
    • Journal of Intelligence and Information Systems
    • /
    • v.21 no.1
    • /
    • pp.119-142
    • /
    • 2015
  • With the explosive growth in the volume of information, Internet users experience considerable difficulty in obtaining necessary information online. Against this backdrop, ever-greater importance is being placed on recommender systems that provide information catering to user preferences and tastes in an attempt to address information overload. To this end, a number of techniques have been proposed, including content-based filtering (CBF), demographic filtering (DF) and collaborative filtering (CF). Among them, CBF and DF require external information and thus cannot be applied across a variety of domains. CF, on the other hand, is widely used since it is relatively free from this domain constraint. The CF technique is broadly classified into memory-based CF, model-based CF and hybrid CF. Model-based CF addresses the drawbacks of CF by adopting a Bayesian model, clustering model or dependency network model. This filtering technique not only mitigates the sparsity and scalability issues but also boosts predictive performance. However, it involves expensive model building and results in a tradeoff between performance and scalability. This tradeoff is attributed to reduced coverage, which is a type of sparsity issue. In addition, expensive model building may lead to performance instability, since changes in the domain environment cannot be immediately incorporated into the model due to the high costs involved. Cumulative changes in the domain environment that fail to be reflected eventually undermine system performance. This study incorporates a Markov model of transition probabilities and the concept of fuzzy clustering into CBCF to propose predictive clustering-based CF (PCCF), which addresses the issues of reduced coverage and unstable performance. The method improves performance stability by tracking changes in user preferences and bridging the gap between the static model and dynamic users.
Furthermore, the issue of reduced coverage is also mitigated by expanding the coverage based on transition probabilities and clustering probabilities. The proposed method consists of four processes. First, user preferences are normalized in preference clustering. Second, changes in user preferences are detected from review score entries during preference transition detection. Third, user propensities are normalized using patterns of change (propensities) in user preferences in propensity clustering. Lastly, a preference prediction model is developed to predict user preferences for items. The proposed method was validated by testing its robustness against performance instability and the scalability-performance tradeoff. The initial test compared and analyzed the performance of individual recommender systems enabled by IBCF, CBCF, ICFEC and PCCF under an environment where data sparsity had been minimized. The following test adjusted the optimal number of clusters in CBCF, ICFEC and PCCF for a comparative analysis of subsequent changes in system performance. The test results revealed that the suggested method produced only a slight improvement in performance over the existing techniques, and it did not achieve a significant improvement in the standard deviation, which indicates the degree of data fluctuation. Nevertheless, it showed marked improvement over the existing techniques in terms of range, which indicates the level of performance fluctuation. The level of performance fluctuation before and after model generation improved by 51.31% in the initial test, and in the following test there was a 36.05% improvement in the level of performance fluctuation driven by changes in the number of clusters. This signifies that the proposed method, despite the modest performance improvement, clearly offers better performance stability than the existing techniques.
Further research will be directed toward enhancing the recommendation performance, which did not improve significantly over the existing techniques. Future work will consider introducing a high-dimensional parameter-free clustering algorithm or a deep learning-based model to improve recommendation performance.
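The Markov component of the approach above can be sketched concretely. This is an illustrative reconstruction, not the authors' code: each user's sequence of preference clusters is used to estimate a row-normalized transition matrix, which can then be used to anticipate the next likely preference cluster and thereby expand coverage.

```python
import numpy as np

# Illustrative sketch of the Markov-transition idea in PCCF: estimate
# cluster-to-cluster transition probabilities from user cluster sequences.
def transition_matrix(sequences, n_states):
    """Row-normalized counts of cluster-to-cluster transitions."""
    counts = np.zeros((n_states, n_states))
    for seq in sequences:
        for a, b in zip(seq, seq[1:]):
            counts[a, b] += 1
    row_sums = counts.sum(axis=1, keepdims=True)
    row_sums[row_sums == 0] = 1.0      # avoid division by zero for unseen states
    return counts / row_sums

# Hypothetical preference-cluster sequences for three users (clusters 0..2)
sequences = [[0, 0, 1, 2], [0, 1, 1, 2], [2, 2, 0]]
P = transition_matrix(sequences, 3)
print(P[0])   # probabilities of moving from cluster 0 to clusters 0, 1, 2
```

A static model would keep using a user's current cluster; the transition row makes the model follow the user as preferences drift, which is the stability mechanism the abstract describes.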

Target Word Selection Disambiguation using Untagged Text Data in English-Korean Machine Translation (영한 기계 번역에서 미가공 텍스트 데이터를 이용한 대역어 선택 중의성 해소)

  • Kim Yu-Seop;Chang Jeong-Ho
    • The KIPS Transactions:PartB
    • /
    • v.11B no.6
    • /
    • pp.749-758
    • /
    • 2004
  • In this paper, we propose a new method that uses only a raw corpus, without additional human effort, to disambiguate target word selection in English-Korean machine translation. We use two data-driven techniques: Latent Semantic Analysis (LSA) and Probabilistic Latent Semantic Analysis (PLSA). These techniques can represent complex semantic structures in given contexts such as text passages. We construct linguistic semantic knowledge using the two techniques and apply it to target word selection in English-Korean machine translation. For target word selection, we utilize grammatical relationships stored in a dictionary. We use the k-nearest neighbor learning algorithm to resolve the data sparseness problem in target word selection, estimating the distance between instances based on these models. In experiments, we use TREC AP news data to construct the latent semantic space and the Wall Street Journal corpus to evaluate target word selection. With the latent semantic analysis methods, the accuracy of target word selection improved by over 10%, and PLSA showed better accuracy than LSA. Finally, we show the relationship between accuracy and two important factors: the dimensionality of the latent space and the k value in k-NN learning, using correlation analysis.
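The LSA step described above can be demonstrated at toy scale. This sketch uses a fabricated word-by-context count matrix (the paper built its space from TREC AP news text): a truncated SVD yields latent word vectors, and cosine similarity in that space is what a k-NN selector would use to compare instances.

```python
import numpy as np

# Toy LSA: SVD of a word-by-context matrix, then cosine similarity
# between words in the reduced latent space.
# rows: 4 words, columns: 4 contexts (fabricated co-occurrence counts)
X = np.array([[2.0, 0.0, 1.0, 0.0],
              [1.0, 0.0, 2.0, 0.0],
              [0.0, 3.0, 0.0, 1.0],
              [0.0, 1.0, 0.0, 2.0]])

U, s, Vt = np.linalg.svd(X, full_matrices=False)
k = 2                                   # latent dimensionality
word_vecs = U[:, :k] * s[:k]            # word representations in latent space

def cos(u, v):
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

# words 0 and 1 share contexts, words 0 and 2 do not, and the latent
# space preserves that distinction
print(cos(word_vecs[0], word_vecs[1]) > cos(word_vecs[0], word_vecs[2]))
```

The choice of k here corresponds to the latent-space dimensionality whose effect on accuracy the paper analyzes; PLSA replaces the SVD with a probabilistic topic decomposition but is used the same way downstream.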

KNU Korean Sentiment Lexicon: Bi-LSTM-based Method for Building a Korean Sentiment Lexicon (Bi-LSTM 기반의 한국어 감성사전 구축 방안)

  • Park, Sang-Min;Na, Chul-Won;Choi, Min-Seong;Lee, Da-Hee;On, Byung-Won
    • Journal of Intelligence and Information Systems
    • /
    • v.24 no.4
    • /
    • pp.219-240
    • /
    • 2018
  • Sentiment analysis, one of the text mining techniques, is a method for extracting subjective content embedded in text documents. Recently, sentiment analysis methods have been widely used in many fields. As good examples, data-driven surveys are based on analyzing the subjectivity of text data posted by users, and market research is conducted by analyzing users' review posts to quantify a target product's reputation. The basic method of sentiment analysis is to use a sentiment dictionary (or lexicon), a list of sentiment vocabularies with positive, neutral, or negative semantics. In general, the meaning of many sentiment words is likely to differ across domains. For example, the sentiment word 'sad' carries negative meaning in most domains, but not in the movie domain. In order to perform accurate sentiment analysis, we need to build a sentiment dictionary for the given domain. However, building such a sentiment lexicon is time-consuming, and many sentiment vocabularies are missed without a general-purpose sentiment lexicon as a starting point. To address this problem, several studies have constructed sentiment lexicons for specific domains based on 'OPEN HANGUL' and 'SentiWordNet', which are general-purpose sentiment lexicons. However, OPEN HANGUL is no longer serviced, and SentiWordNet does not work well because of language differences in converting Korean words into English. These restrictions limit the use of such general-purpose sentiment lexicons as seed data for building domain-specific lexicons. In this article, we construct the 'KNU Korean Sentiment Lexicon (KNU-KSL)', a new general-purpose Korean sentiment dictionary that is more advanced than existing general-purpose lexicons.
The proposed dictionary, a list of domain-independent sentiment words such as 'thank you', 'worthy', and 'impressed', is built to allow quick construction of a sentiment dictionary for a target domain. In particular, it constructs sentiment vocabularies by analyzing the glosses in the Standard Korean Language Dictionary (SKLD) through the following procedure: First, we propose a sentiment classification model based on Bidirectional Long Short-Term Memory (Bi-LSTM). Second, the proposed deep learning model automatically classifies each gloss as having positive or negative meaning. Third, positive words and phrases are extracted from glosses classified as positive, while negative words and phrases are extracted from glosses classified as negative. Our experimental results show that the average accuracy of the proposed sentiment classification model reaches 89.45%. In addition, the sentiment dictionary is further extended using various external sources, including SentiWordNet, SenticNet, Emotional Verbs, and Sentiment Lexicon 0603. Furthermore, we add sentiment information about frequently used coined words and emoticons that appear mainly on the Web. KNU-KSL contains a total of 14,843 sentiment vocabularies, each of which is a 1-gram, 2-gram, phrase, or sentence pattern. Unlike existing sentiment dictionaries, it is composed of words that are not affected by particular domains. The recent trend in sentiment analysis is to use deep learning techniques without sentiment dictionaries, and the perceived importance of developing sentiment dictionaries has gradually declined. However, a recent study shows that words in a sentiment dictionary can be used as features of deep learning models, resulting in sentiment analysis with higher accuracy (Teng, Z., 2016). This indicates that a sentiment dictionary is useful not only for sentiment analysis itself but also as a source of features for improving the accuracy of deep learning models.
The proposed dictionary can be used as base data for constructing the sentiment lexicon of a particular domain and as features for deep learning models. It is also useful for automatically and quickly building large training sets for deep learning models.

A Study on Land Acquisition Priority for Establishing Riparian Buffer Zones in Korea (수변녹지 조성을 위한 토지매수 우선순위 산정 방안 연구)

  • Hong, Jin-Pyo;Lee, Jae-Won;Choi, Ok-Hyun;Son, Ju-Dong;Cho, Dong-Gil;Ahn, Tong-Mahn
    • Journal of the Korean Society of Environmental Restoration Technology
    • /
    • v.17 no.4
    • /
    • pp.29-41
    • /
    • 2014
  • The Korean government has purchased land parcels alongside significant water bodies before establishing buffers to secure water quality. Since annual budgets are limited, however, there has always been the issue of which land parcels ought to be given priority. Therefore, this study aims to develop an efficient mechanism for prioritizing land acquisition in stream corridors that would ultimately be vegetated as riparian buffer zones. The criteria for land acquisition priority were derived through a literature review along with experts' advice. The relative weights and priorities for each criterion were computed using the Analytic Hierarchy Process (AHP) method. Major findings of the study are as follows: 1. The decision-making structural model for land acquisition priority focuses mainly on the reduction of non-point source pollutants (NSPs), which is highly associated with natural and physical conditions and the land use types of surrounding areas. The criteria were classified into two categories: NSP runoff areas and potential NSP runoff areas. 2. The land acquisition priority weights derived for NSP runoff areas and potential NSP runoff areas were 0.862 and 0.138, respectively, implying that much higher priority should be given to land parcels in NSP runoff areas. 3. The weights and priorities of the sub-criteria suggested by this study are: proximity to the streams (0.460), land cover (0.189), soil permeability (0.117), topographical slope (0.096), proximity to the roads (0.058), land-use types (0.036), visibility to the streams (0.032), and the land price (0.012). This order of importance suggests, as one might expect, that it is better to purchase land parcels adjacent to the streams. 4. A standard scoring system including the criteria and weights for land acquisition priority was developed, which is likely to allow expedited decision making and easy quantification for priority evaluation through the use of measurable spatial data. Further studies focusing on both point and non-point pollutants and on GIS-based spatial analysis and mapping of land acquisition priority are needed.
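The AHP weighting step described above follows a standard recipe: build a reciprocal pairwise comparison matrix from expert judgments, take the principal eigenvector as the priority weights, and check consistency. The sketch below uses a made-up 3-criterion matrix (the study used 8 sub-criteria), so the weights here are illustrative only.

```python
import numpy as np

# AHP priority weights from a reciprocal pairwise comparison matrix.
# A[i, j] states how strongly criterion i outranks criterion j
# (Saaty's 1-9 scale); entries below are hypothetical.
A = np.array([[1.0, 3.0, 5.0],
              [1/3, 1.0, 3.0],
              [1/5, 1/3, 1.0]])

eigvals, eigvecs = np.linalg.eig(A)
principal = np.argmax(eigvals.real)        # index of lambda_max
w = np.abs(eigvecs[:, principal].real)
w = w / w.sum()                            # normalized priority weights

# Consistency index: (lambda_max - n) / (n - 1); small values mean
# the expert judgments are close to internally consistent.
n = A.shape[0]
ci = (eigvals.real[principal] - n) / (n - 1)
print(np.round(w, 3), round(ci, 3))
```

In the study, this procedure is what produces weights such as 0.460 for proximity to the streams and 0.012 for land price; a consistency index near zero justifies using the resulting ranking.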

Temporal and Spatial Characteristics of Visual and Somatosensory Integration in Normal Adult Brain (정상성인의 시각 및 촉각 통합 작용 시 뇌신경세포의 전기생리적활동의 시간 및 공간적 특성: 예비실험)

  • Ju, Yu-Mi;Kim, Ji-Hyun
    • The Journal of Korean Academy of Sensory Integration
    • /
    • v.8 no.1
    • /
    • pp.41-49
    • /
    • 2010
  • Objective : Multisensory integration (MSI) is an essential process for using diverse sensory information in cognitive tasks or the execution of motor actions. In particular, visual and somatosensory integration is critical for motor behavior and coordination. This study was designed to characterize the spatial and temporal properties of visual and somatosensory integration using a neurophysiological research method that identifies the time course and brain locations of the MSI process. Methods : Electroencephalography (EEG) and event-related potentials (ERPs) were used to observe neural activity during the integration of visual and tactile input. We calculated the linear summation (SUM) of visual evoked potentials (VEPs) and somatosensory evoked potentials (SEPs), and compared the SUM with the ERPs from simultaneously presented visual-tactile stimuli (SIM). Results : There were significant differences between the SIM and SUM in later time epochs (about 200-300 ms) at the contralateral somatosensory area (C4) and occipital cortices (O1 & O2). The amplitude of the SIM was larger than the summed signals, implying that integration produced extra neural activity. Conclusion : This study provides empirical neural evidence that multisensory integration is more than simply combining two unisensory inputs in the brain, and the ERP data reveal neural signatures of the multisensory integrative process. Since this was a preliminary pilot study, a larger population and stricter significance criteria are needed. Further study should consider issues including the effect of internally driven attention and the laterality of interaction to solidify the evidence provided here.
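The SUM-versus-SIM comparison at the heart of this design can be illustrated numerically. All waveforms below are fabricated Gaussians, not EEG data: the point is only the arithmetic of the test, in which a SIM response exceeding the linear sum of the unisensory ERPs in a late window is taken as evidence of integration-related activity.

```python
import numpy as np

# Synthetic illustration of the SUM-vs-SIM logic: compare the recorded
# bimodal response against the linear sum of the unisensory responses.
t = np.arange(0, 400)                            # time in ms (1 kHz sampling)
vep = 2.0 * np.exp(-((t - 120) ** 2) / 800.0)    # fabricated visual ERP
sep = 1.5 * np.exp(-((t - 150) ** 2) / 800.0)    # fabricated somatosensory ERP
sum_erp = vep + sep                              # linear summation (SUM)
# fabricated SIM response with super-additive activity around 250 ms
sim_erp = sum_erp + 0.8 * np.exp(-((t - 250) ** 2) / 1200.0)

window = (t >= 200) & (t <= 300)                 # the late epoch reported above
diff = np.mean(sim_erp[window] - sum_erp[window])
print(diff > 0)   # positive difference: SIM exceeds the linear sum
```

In practice the comparison is made per electrode (e.g. C4, O1, O2) with statistics across trials and subjects; the sketch shows only the single-channel arithmetic.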


Mega Flood Simulation Assuming Successive Extreme Rainfall Events (연속적인 극한호우사상의 발생을 가정한 거대홍수모의)

  • Choi, Changhyun;Han, Daegun;Kim, Jungwook;Jung, Jaewon;Kim, Duckhwan;Kim, Hung Soo
    • Journal of Wetlands Research
    • /
    • v.18 no.1
    • /
    • pp.76-83
    • /
    • 2016
  • Recently, a series of extreme storm events caused by successive typhoons has produced severe flood damage, including loss of life and destruction of property. In this study, we use the term "Mega flood" for the extreme flood produced by such successive storm events, and we construct a hypothetical Mega flood by assuming that extreme events occur in succession with a certain time interval between them. The Inter-Event Time Definition (IETD) method was used to determine the time interval between continuous events in order to simulate a Mega flood. The continuous extreme rainfall events are determined with the IETD, and a Mega flood is then simulated from the consecutive events: (1) consecutive occurrence of two historical extreme events, and (2) consecutive occurrence of two design events obtained by frequency analysis of the historical data. We show that Mega floods from continuous extreme rainfall events are 6-17% larger than a typical flood from a single event. Flood damage caused by a Mega flood can therefore be expected to be much greater than damage driven by a single rainfall event: the second flood peak is not much larger than the first, but continuous heavy rain brings flood damage twice over, so the damage caused by a hypothetical Mega flood is judged to be very large. The hypothetical rainfall events used here to generate Mega floods could help in preparing for unexpected flood disasters by simulating Mega floods as defined in this study.
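The event-combination step described above is simple to sketch: two extreme-event hyetographs are concatenated with a dry gap equal to the IETD between them. The hourly depths and the 6-hour IETD below are assumptions for illustration, not values from the study.

```python
# Sketch of building a hypothetical Mega-flood rainfall series: two
# extreme-event hyetographs separated by an IETD-long dry spell.
def mega_event(event1, event2, ietd_hours):
    """Concatenate two hyetographs with an IETD-long dry gap between them."""
    return event1 + [0.0] * ietd_hours + event2

event_a = [5.0, 22.0, 48.0, 30.0, 8.0]     # first extreme event (mm/h, synthetic)
event_b = [10.0, 35.0, 55.0, 18.0]         # second extreme event (mm/h, synthetic)
mega = mega_event(event_a, event_b, 6)     # 6 h IETD (assumed)

print(len(mega), sum(mega))  # duration in hours and total depth in mm
```

The combined series would then be routed through a rainfall-runoff model; because soils are still wet when the second event begins, the second peak rises less than the totals suggest, which matches the abstract's observation that the second flood is not much larger than the first.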