• Title/Summary/Keyword: sampling algorithm (표본추출 알고리즘)


Development of Prediction Model of Financial Distress and Improvement of Prediction Performance Using Data Mining Techniques (데이터마이닝 기법을 이용한 기업부실화 예측 모델 개발과 예측 성능 향상에 관한 연구)

  • Kim, Raynghyung;Yoo, Donghee;Kim, Gunwoo
    • Information Systems Review / v.18 no.2 / pp.173-198 / 2016
  • Financial distress can damage stakeholders and even lead to significant social costs. Thus, financial distress prediction is an important issue in macroeconomics. However, most existing studies on building a financial distress prediction model have considered only idiosyncratic risk factors and ignored systematic risk factors. In this study, we propose a prediction model that considers both the idiosyncratic risk based on financial ratios and the systematic risk based on the business cycle. We build several IT artifacts associated with financial ratios and add them to the idiosyncratic risk factors, and we address the imbalanced data problem by using oversampling and the synthetic minority oversampling technique (SMOTE) to ensure good performance. To account for systematic risk, each data set contains both financially distressed and financially sound companies in each business cycle phase. We conducted several experiments in which SMOTE changes the initial imbalanced sample ratio between the two company groups into a 1:1 ratio, and we compared the prediction results across the individual data sets. We also trained a prediction model on data sets from the business contraction phase, used it to predict data sets from the subsequent business cycle phase as a test set, and compared the earlier and subsequent prediction performance. Our findings can provide insights that help stakeholders experiencing an economic crisis make rational decisions.
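
The SMOTE step mentioned in the abstract can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: the function name `smote`, its parameters, and the toy financial-ratio values are all assumptions made for illustration. Each synthetic minority sample is placed at a random point on the segment between a minority sample and one of its k nearest minority neighbours.

```python
import math
import random

def smote(minority, n_new, k=5, seed=0):
    """Generate n_new synthetic minority samples by interpolating between
    a minority sample and one of its k nearest minority neighbours."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # k nearest neighbours of base among the other minority samples
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic

# Hypothetical toy data: 3 distressed firms vs 8 sound firms,
# each described by two made-up financial ratios.
distressed = [[0.10, 1.8], [0.15, 2.1], [0.05, 1.5]]
sound_count = 8
new_rows = smote(distressed, sound_count - len(distressed), k=2)
balanced_distressed = distressed + new_rows  # 8 rows, i.e. a 1:1 ratio
```

Because every synthetic row is a convex combination of two real minority rows, the new points stay inside the region already covered by the minority class. In practice, oversampling of this kind is normally applied to the training split only.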

A Study on the Knowledge Acquisition from Local Companies and Job Seekers using Data Mining Techniques (데이터마이닝 기법을 이용한 지역 기업과 구직자로부터의 지식 도출에 관한 연구)

  • Kim, Jin-Sung
    • Journal of the Korean Institute of Intelligent Systems / v.22 no.2 / pp.141-147 / 2012
  • The purpose of this study is to acquire knowledge related to job searching from local companies and job seekers using data mining techniques. In the first step, we selected local companies whose headquarters are located in Jeonbuk province, and we selected as job seekers the graduating students of high schools, colleges, and universities in the same area. After targeting the sample, we surveyed 560 local companies and 14 schools to collect preliminary data, and we obtained 173 responses from the companies and 551 responses from the job seekers. In the second step, we applied the C5.0 algorithm to extract inference rules. In the third step, we used the Visual Basic (VB) programming language to visualize the rules. In the fourth step, we transformed the inference rules into DB tables. In the final step, we executed rule inference to support the development of long-term human resources development (HRD) strategies. As a result, we could offer helpful information to HRD directors and job seekers for designing their strategies for managing jobs and career development.

Location Estimation Method using Extended Kalman Filter with Frequency Offsets in CSS WPAN (CSS WPAN에서 주파수 편이를 보상하는 확장 Kalman 필터를 사용한 이동노드의 위치추정 방식)

  • Nam, Yoon-Seok
    • The KIPS Transactions:PartC / v.19C no.4 / pp.239-246 / 2012
  • Location estimation in WPAN has been studied and specified as an option for ultra-wideband (UWB). However, devices based on the CSS (Chirp Spread Spectrum) specification are widely used in the market because of their functionality, low cost, and development support. Because a CSS device uses a 2.4 GHz carrier frequency and its sampling frequency is lower than that of UWB, the resolution of a timestamp is very coarse. As a result, the error of a measured distance is large, about 30 cm to 1 m at a separation of 10 m, and the location error in a 10 m × 10 m environment is known to be about 1 m to 2 m. For applications that require more accurate location information, it is therefore important to develop a sophisticated post-processing algorithm applied after the distance measurements. In this paper, we study the extended Kalman filter with the frequency offsets of anchor nodes and propose a novel algorithm, the frequency offset compensated extended Kalman filter. The frequency offsets are modeled as a variable representing a common frequency offset and constants representing individual frequency offsets. The proposed algorithm shows that accurate location estimation, with a distance error of less than 10 cm, is practically possible with CSS WPAN nodes.
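
To make the post-processing idea concrete, here is a minimal sketch of a plain extended Kalman filter that estimates a 2-D position from range measurements to fixed anchors. It deliberately omits the frequency offset compensation that the paper proposes, so it is not the authors' algorithm. The function names, the anchor layout, the noise variances, and the noise-free toy ranges are all assumptions chosen for illustration.

```python
import math

def ekf_range_update(state, P, anchor, z, r_var):
    """One scalar EKF measurement update of a 2-D position estimate
    from a single range measurement z to a known anchor."""
    x, y = state
    ax, ay = anchor
    d = math.hypot(x - ax, y - ay)  # predicted range h(state)
    if d == 0.0:
        return state, P  # Jacobian undefined exactly at the anchor
    H = [(x - ax) / d, (y - ay) / d]  # Jacobian of the range w.r.t. (x, y)
    PHt = [P[0][0] * H[0] + P[0][1] * H[1],
           P[1][0] * H[0] + P[1][1] * H[1]]
    S = H[0] * PHt[0] + H[1] * PHt[1] + r_var  # innovation variance
    K = [PHt[0] / S, PHt[1] / S]  # Kalman gain
    innov = z - d
    new_state = [x + K[0] * innov, y + K[1] * innov]
    HP = [H[0] * P[0][0] + H[1] * P[1][0],
          H[0] * P[0][1] + H[1] * P[1][1]]
    new_P = [[P[i][j] - K[i] * HP[j] for j in range(2)] for i in range(2)]
    return new_state, new_P

def locate(anchors, ranges, start, passes=10, p0=25.0, q=0.01, r_var=0.01):
    """Repeatedly fuse the ranges to all anchors for a static node."""
    state = list(start)
    P = [[p0, 0.0], [0.0, p0]]
    for _ in range(passes):
        # Predict step for a static node: state unchanged, covariance grows by q.
        P[0][0] += q
        P[1][1] += q
        for anchor, z in zip(anchors, ranges):
            state, P = ekf_range_update(state, P, anchor, z, r_var)
    return state

# Hypothetical setup: four anchors at the corners of a 10 m x 10 m area
# and noise-free ranges to a node at (3, 4).
anchors = [(0.0, 0.0), (10.0, 0.0), (10.0, 10.0), (0.0, 10.0)]
true_pos = (3.0, 4.0)
ranges = [math.hypot(true_pos[0] - ax, true_pos[1] - ay) for ax, ay in anchors]
estimate = locate(anchors, ranges, start=(5.0, 5.0))
error = math.hypot(estimate[0] - true_pos[0], estimate[1] - true_pos[1])
```

In a filter closer to the one described in the abstract, the state vector would presumably be extended with the common frequency offset so that it is estimated jointly with the position. That extension is left out here because the abstract does not give the measurement model.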

A Review of Multivariate Analysis Studies Applied for Plant Morphology in Korea (국내 식물 형태 연구에 사용된 다변량분석 논문에 대한 재고)

  • Chang, Kae Sun;Oh, Hana;Kim, Hui;Lee, Heung Soo;Chang, Chin-Sung
    • Journal of Korean Society of Forest Science / v.98 no.3 / pp.215-224 / 2009
  • We review the role of traditional morphometrics in plant morphological studies using 54 studies published from 1997 to 2008 in three major journals and others in Korea, such as Journal of Korean Forestry Society, Korean Journal of Plant Taxonomy, Korean Journal of Breeding, Korean Journal of Apiculture, Journal of Life Science, and Korean Journal of Plant Resources. The two most commonly used data analysis techniques, cluster analysis (CA) and principal components analysis (PCA), are discussed together with other statistical tests. A common problem with PCA lies in the underlying assumptions of the method, such as random sampling and a multivariate normal distribution of the data. The procedure was intended mainly for continuous data and was not efficient for data that were not well summarized by variances or covariances. Likewise, CA was most appropriate for categorical rather than continuous data. In addition, CA produced clusters whether or not natural groupings existed, and the results depended on both the similarity measure chosen and the clustering algorithm used. Additional problems with PCA and CA arose when both qualitative and quantitative data were used with a limited number of variables and/or too few samples. Some of these problems may be avoided if a sufficient number of variables (more than 20) and samples (at least 40-50) are considered for morphometric analyses, but we do not think that these methods are almighty tools for data analysts. Instead, we believe that reasonable application, combined with a focus on the objectives and limitations of each procedure, would be a step forward.

A Study on Electron Dose Distribution of Cones for Intraoperative Radiation Therapy (수술중 전자선치료에 있어서 선량분포에 관한 연구)

  • Kang, Wee-Saing;Ha, Sung-Whan;Yun, Hyong-Geun
    • Progress in Medical Physics / v.3 no.2 / pp.1-12 / 1992
  • For intraoperative radiation therapy (IORT) using electron beams, a cone system is needed that delivers a large dose to the tumor during a surgical operation while sparing the surrounding normal tissue, and dosimetry for the cone system is necessary to find the proper X-ray collimator setting and to obtain useful data for clinical use. We developed a docking-type cone system consisting of two aluminum parts: a holder and a cone. The cones, which range from 4 cm to 9 cm in 1 cm steps at 100 cm SSD of the photon beam, are 28 cm long circular tubular cylinders. The system has two 26 cm long holders: one for cones 7 cm or larger in diameter and another for cones smaller than 7 cm. On the side of the holder is an aperture for inserting a lamp and mirror to observe the treatment field. The depth dose curve, the dose profile and output factor at the depth of dose maximum, and the dose distribution in water were measured for each cone size with a p-type silicon detector controlled by a linear scanner, for several extra openings of the X-ray collimators. For each combination of electron energy and cone size, the opening of the X-ray collimator affected the surface dose, the depths of dose maximum and 80% dose, the dose profile, and the output factor. The variation of the output factor was the most remarkable: the output factors for 9 MeV electrons, for example, ranged from 0.637 to 1.549. The opening of the X-ray collimators would change the quantity of scattered electrons reaching the IORT cone system, which in turn would change the dose distribution as well as the output factor. Dosimetry for an IORT cone system is indispensable for minimizing uncertainty in clinical use.
