• Title/Summary/Keyword: Combined dataset


Decision Tree Induction with Imbalanced Data Set: A Case of Health Insurance Bill Audit in a General Hospital (불균형 데이터 집합에서의 의사결정나무 추론: 종합 병원의 건강 보험료 청구 심사 사례)

  • Hur, Joon;Kim, Jong-Woo
    • Information Systems Review / v.9 no.1 / pp.45-65 / 2007
  • In the medical industry, the health insurance bill audit is a unique and essential process in general hospitals. The process matters not only for a hospital's profit but also for its reputation. At large general hospitals in particular, many workers, including analysts and nurses, are engaged in it. This paper introduces a case of health insurance bill audit that uses decision tree induction techniques to find reducible health insurance bill cases at a large general hospital in Korea. When supervised learning methods were applied, one major problem was data imbalance in the health insurance bill audit data: a bill audit dataset contained many normal (passing) cases and relatively few reduction cases. To resolve the problem, this study combines well-known methods for imbalanced data sets, including oversampling of rare cases, undersampling of majority cases, and adjusting the misclassification cost, in several ways to find appropriate decision trees that satisfy the conditions required in the health insurance bill audit setting.
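The three rebalancing techniques named in this abstract can be sketched in a few lines. This is a minimal illustration, not the authors' procedure: the function names, the `"pass"`/`"reduce"` labels, and the cost values are assumptions, and the abstract does not say which split criterion or sampling ratios the study used.

```python
import random
from collections import Counter


def oversample(rows, labels, rare, factor, rng):
    """Resample rare-class rows with replacement until the class is `factor` times larger."""
    rare_idx = [i for i, y in enumerate(labels) if y == rare]
    extra = [rng.choice(rare_idx) for _ in range(len(rare_idx) * (factor - 1))]
    idx = list(range(len(labels))) + extra
    return [rows[i] for i in idx], [labels[i] for i in idx]


def undersample(rows, labels, major, keep_fraction, rng):
    """Keep only a random `keep_fraction` of the majority-class rows."""
    major_idx = [i for i, y in enumerate(labels) if y == major]
    keep = set(rng.sample(major_idx, int(len(major_idx) * keep_fraction)))
    idx = [i for i, y in enumerate(labels) if y != major or i in keep]
    return [rows[i] for i in idx], [labels[i] for i in idx]


def weighted_gini(labels, cost):
    """Gini impurity in which each case counts cost[label] times.

    Raising the cost of the rare class makes a tree split criterion
    treat errors on that class as more expensive.
    """
    weights = Counter()
    for y in labels:
        weights[y] += cost[y]
    total = sum(weights.values())
    if total == 0:
        return 0.0
    return 1.0 - sum((w / total) ** 2 for w in weights.values())
```

With 90 passing and 10 reduction cases, oversampling the rare class by a factor of 3 and then keeping half of the majority class yields 45 passing and 30 reduction cases. Alternatively, a cost of 9 on the reduction class makes the weighted impurity behave as if the two classes were balanced.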

Research on Basic Investigation and Analysis for Land Substitution Planning using High-resolution Satellite Imagery (환지계획 수립시 고해상 위성영상을 이용한 기초조사 및 분석에 관한 연구)

  • Choi, Seung Pil;Jeong, Cheol Ju;Yeu, Yeon
    • Journal of Korean Society for Geospatial Information Science / v.21 no.2 / pp.3-9 / 2013
  • Various data, such as digital maps (1/1,000 or 1/5,000), field surveys, online materials, and literature, are used in the preliminary investigation for urban development, which covers feasibility evaluation, profitability analysis, zoning proposal, zoning designation, and land replotting planning. Urban development methods include expropriation, replotting, and a mixed method. The replotting method requires considering land replotting types based on topography and building conditions, but it is not easy to gather such data for the preliminary investigation while maintaining the security of the development plan. A preliminary investigation using aerial photos is limited in detecting topographic and building changes in a specific period. GIS data combined with high-resolution imagery has advantages over the current dataset: satellite images of various spatial resolutions are easy to acquire, swath coverage is wide, an imagery resolution can be chosen to suit the purpose, the cost is lower than that of aerial photos, and distance and area can be calculated on the imagery through image modeling. For these reasons, the method proposed in this study enables a more appropriate preliminary investigation using more accurate information.

Three-Dimensional Positional Accuracy Analysis of UAV Imagery Using Ground Control Points Acquired from Multisource Geospatial Data (다종 공간정보로부터 취득한 지상기준점을 활용한 UAV 영상의 3차원 위치 정확도 비교 분석)

  • Park, Soyeon;Choi, Yoonjo;Bae, Junsu;Hong, Seunghwan;Sohn, Hong-Gyoo
    • Korean Journal of Remote Sensing / v.36 no.5_3 / pp.1013-1025 / 2020
  • Unmanned Aerial Vehicle (UAV) platforms are widely used in disaster monitoring and smart cities because they can quickly acquire images of small areas at low cost. Ground Control Points (GCPs) for positioning UAV images are essential for achieving cm-level accuracy when producing UAV-based orthoimages and Digital Surface Models (DSMs). However, on-site acquisition of GCPs takes considerable manpower and time. This research aims to provide an efficient and accurate way to replace on-site GNSS surveying with three different sources of geospatial data. The three sources used in this study are as follows: 1) 25 cm aerial orthoimages and a Digital Elevation Model (DEM) based on a 1:1000 digital topographic map, 2) point cloud data acquired by a Mobile Mapping System (MMS), and 3) hybrid point cloud data created by merging MMS data with UAV data. For each dataset, a three-dimensional positional accuracy analysis of the UAV-based orthoimage and DSM was performed by comparing the three-dimensional coordinates of independent check points with those obtained from the RTK-GNSS survey. The results show that the third case, in which MMS data and UAV data were combined, was the most accurate, with an RMSE of 8.9 cm horizontally and 24.5 cm vertically. In addition, the distribution of the geospatial GCPs was shown to affect vertical accuracy more than horizontal accuracy.
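The check-point comparison described in this abstract reduces to a horizontal and a vertical RMSE. The sketch below assumes the common convention that horizontal error is the planar distance in x and y. The abstract does not state the exact formula used, and the function name is hypothetical.

```python
import math


def positional_rmse(estimated, reference):
    """Return (horizontal RMSE, vertical RMSE) for paired (x, y, z) check points.

    Both lists must hold the same points in the same order and units.
    Horizontal error is taken as the planar distance in x and y (an assumption).
    """
    if not estimated or len(estimated) != len(reference):
        raise ValueError("need two non-empty point lists of equal length")
    n = len(estimated)
    sq_h = sum((ex - rx) ** 2 + (ey - ry) ** 2
               for (ex, ey, _), (rx, ry, _) in zip(estimated, reference))
    sq_v = sum((ez - rz) ** 2
               for (_, _, ez), (_, _, rz) in zip(estimated, reference))
    return math.sqrt(sq_h / n), math.sqrt(sq_v / n)
```

Here `estimated` would hold coordinates read from the orthoimage and DSM, and `reference` the RTK-GNSS coordinates of the same check points.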

Estimation of Daily per Capita Intake of Total Phenolics, Total Flavonoids, and Antioxidant Capacities from Commercial Products of Japanese Apricot (Prunus mume) in the Korean Diet, Based on the Korea National Health and Nutrition Examination Survey in 2010 (2010년 국민건강영양조사에 근거한 매실가공품 섭취로부터 한국인의 일인당 하루 총페놀, 총플라보노이드 및 항산화능 섭취량 추정)

  • Lee, Bong Han;Yoo, Hee Geun;Baek, Youngsu;Kwon, O Jun;Chung, Dae Kyun;Kim, Dae-Ok
    • Korean Journal of Food Science and Technology / v.46 no.2 / pp.237-244 / 2014
  • The total phenolics, total flavonoids, and antioxidant capacities of ten commercial products of Japanese apricot (maesil) found in the Korean market were evaluated, including four kinds of alcoholic drinks, two kinds of soft drinks, and four kinds of concentrate. The daily per capita consumption (g/capita/day) of each product was calculated from the existing dataset of the Korea National Health and Nutrition Examination Survey in 2010. Using the combined datasets indicated above, the daily per capita intake of total phenolics from maesil product consumption was found to be 1.05 mg gallic acid equivalents. The daily per capita intake of total flavonoids was determined to be 0.13 mg catechin equivalents, and the daily per capita intake of antioxidant capacity was measured at 0.70 mg vitamin C equivalents (1,1-diphenyl-2-picrylhydrazyl assay) and at 1.04 mg vitamin C equivalents (2,2'-azino-bis(3-ethylbenzthiazoline-6-sulfonic acid) assay). The daily per capita intakes of total phenolics, total flavonoids, and antioxidant capacities were influenced by the daily quantity of maesil products consumed, as well as by their compositional contents.
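Combining the two datasets amounts to a sum of consumption times content over the products. The sketch below assumes that per-product content is expressed per gram of product. The function name and the example values in the usage note are illustrative, not figures from the paper.

```python
def daily_intake(products):
    """Sum consumption (g/capita/day) times content (mg equivalents per g) over products.

    `products` maps a product name to a (consumption, content) pair.
    The result is in mg equivalents per capita per day.
    """
    return sum(consumption * content for consumption, content in products.values())
```

For example, `daily_intake({"drink": (2.0, 0.5), "concentrate": (1.0, 0.25)})` gives 1.25 mg per capita per day for these made-up inputs.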

Creative Project and Reward Based Crowdfunding: Determinants of Success (창의적 프로젝트와 후원형 크라우드펀딩: 성공요인)

  • Chun, Hesuk
    • The Journal of the Korea Contents Association / v.15 no.5 / pp.560-569 / 2015
  • Crowdfunding is a method of raising money for a project or company from a large group of people via the Internet, in return for future products or equity. Kickstarter is the largest and most successful crowdfunding site, where creative projects raise reward-based funding. Drawing on a dataset of 80,267 projects with combined funding of over $1.3b from 8.1m people, this paper suggests that backers select projects based on their preference for the project rather than its profitability. It also suggests that a well-established platform and a large network increase a project's chance of success through ripple and blockbuster effects. Clear communication about the project's idea and goal is highly correlated with success. Regular communication on the project site, such as constant progress updates, helps the project succeed. Equity-based crowdfunding is emerging as an innovative means of raising capital for businesses, so it has been receiving a lot of attention and expectation from the government and the market. The findings of this paper and others will help provide some understanding of and insight into equity-based crowdfunding. However, Kickstarter differs from equity-based crowdfunding in the goals of the backers: Kickstarter's backers are not investors but contributors. Understanding equity-based crowdfunding will require further study.

Performance Evaluations for Leaf Classification Using Combined Features of Shape and Texture (형태와 텍스쳐 특징을 조합한 나뭇잎 분류 시스템의 성능 평가)

  • Kim, Seon-Jong;Kim, Dong-Pil
    • Journal of Intelligence and Information Systems / v.18 no.3 / pp.1-12 / 2012
  • Many trees grow along roadsides and in parks and landscaped facilities. Although we see trees all around us, it is difficult to classify a tree and obtain information about it, such as its name, species, and surroundings. To find this information, one has to consult illustrated plant guides or search the Internet. The important components of a tree include its leaves, flowers, and bark. Generally, a tree can be classified by its leaves. A leaf has inherent features such as shape and veins. Shape plays an important role in deciding what the tree is, and the texture contained in the veins is also an effective feature for classification. This paper evaluates the performance of a leaf classification system using both shape and texture features. We use Fourier descriptors for shape features and both gray-level co-occurrence matrices and wavelets for texture features, and we evaluate combinations of these features on images from the Flavia dataset. We compared the recognition rates and the precision-recall performance of these features. Various experiments showed that combining shape and texture gave better performance. The best result came from combining shape and texture features with a flipped contour for the Fourier descriptor.
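The shape features mentioned in this abstract can be illustrated with a textbook Fourier descriptor. The sketch below is a generic formulation and not the authors' exact variant: it treats contour points as complex numbers, drops the zeroth coefficient for translation invariance, divides by the first harmonic for scale invariance, and keeps only magnitudes for rotation invariance. The function name is hypothetical, and the flipped-contour step from the paper is not reproduced.

```python
import cmath


def fourier_descriptors(contour, n_coeffs):
    """Return n_coeffs normalized Fourier descriptor magnitudes of a closed contour.

    `contour` is an ordered list of (x, y) boundary points, with more than
    n_coeffs + 1 points. The naive O(n * n_coeffs) DFT is fine for a sketch.
    """
    z = [complex(x, y) for x, y in contour]
    n = len(z)
    # Harmonics 1 .. n_coeffs + 1; harmonic 0 (the centroid term) is skipped.
    coeffs = [
        sum(z[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(1, n_coeffs + 2)
    ]
    scale = abs(coeffs[0])
    if scale == 0:
        raise ValueError("first harmonic is zero; cannot normalize for scale")
    return [abs(c) / scale for c in coeffs[1:]]
```

Two contours that differ only by translation, rotation, or uniform scaling produce the same descriptor vector. This makes the vector usable as a shape feature alongside texture features.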

Factors Affecting the Morbidity Related to Respiratory Diseases in Urban Korea (한국 도시의 만성호흡기 질환 이환율에 영향을 주는 요인)

  • Han, Sung-Hyun;Park, Jae-Sung;Seo, Seung-Hee;Yoon, Jee-Eun;Jee, Sun-Ha
    • Korea journal of population studies / v.28 no.2 / pp.205-217 / 2005
  • Purpose: To evaluate the factors affecting hospital utilization for respiratory diseases using an ecological study design and a GIS tool, and to raise social concern about respiratory disease with the results. Methods: Hospital admission data provided by the National Health Insurance Corporation were converted to spreadsheet format and combined with an air monitoring dataset. Air pollution data were collected from the annual air monitoring report published by the Korea Ministry of Environment. Socioeconomic statistics, including population density, age distribution, and forest ratio, were compiled from the Korea National Statistical Office database. Multiple linear regression analysis was performed to evaluate the factors affecting hospital utilization for respiratory diseases. The unit of analysis was the city (52 cities). Results: The factors affecting hospital utilization for respiratory diseases were the proportion of the population aged 60 years and over, seaside location, $O_3$ level, and smoking rate. Conclusions: Outdoor pollutant monitoring data and smoking rates are weak at reflecting individual exposure. Further research is required to propose more illustrative means of evaluating the causal relationship between air pollution and respiratory health effects.

A Prospect on the Changes in Short-term Cold Hardiness in "Campbell Early" Grapevine under the Future Warmer Winter in South Korea (남한의 겨울기온 상승 예측에 따른 포도 "캠벨얼리" 품종의 단기 내동성 변화 전망)

  • Chung, U-Ran;Yun, Jin-I.
    • Korean Journal of Agricultural and Forest Meteorology / v.10 no.3 / pp.94-101 / 2008
  • Warming trends during winter seasons in East Asian regions are expected to accelerate in the future according to the climate projection by the Intergovernmental Panel on Climate Change (IPCC). Warmer winters may affect the short-term cold hardiness of deciduous fruit trees, yet phenological observations are scant compared to long-term climate records in the region. Dormancy depth, which can be estimated from daily temperature, is expected to serve as a reasonable proxy for the physiological tolerance of flowering buds to low winter temperatures. To delineate the geographical pattern of short-term cold hardiness in grapevines, a selected dormancy depth model was parameterized for "Campbell Early", the major cultivar in South Korea. Gridded data sets of daily maximum and minimum temperature with a 270 m cell spacing ("High Definition Digital Temperature Map", HD-DTM) were prepared for the current climatological normal year (1971-2000) based on observations at the 56 Korea Meteorological Administration (KMA) stations and a geospatial interpolation scheme for correcting land surface effects (e.g., land use, topography, and site elevation). To generate relevant datasets for future climatological normal years, we combined a 25 km-resolution, 2011-2100 temperature projection dataset covering South Korea (under the IPCC SRES A2 scenario) with the 1971-2000 HD-DTM. The dormancy depth model was run with the gridded datasets to estimate the geographical pattern of change in the cold-hardiness period (the number of days between endo- and forced dormancy release) across South Korea for the normal years (1971-2000, 2011-2040, 2041-2070, and 2071-2100). Results showed that the cold-hardiness zone with a cold-tolerant period of 60 days or longer would diminish from 58% of the total land area of South Korea in 1971-2000 to 40% in 2011-2040, 14% in 2041-2070, and less than 3% in 2071-2100. This method can be applied to other deciduous fruit trees to delineate the geographical shift of the cold-hardiness zone under projected climate change, thereby providing valuable information for adaptation strategies in the fruit industry.

An Enhanced Density and Grid based Spatial Clustering Algorithm for Large Spatial Database (대용량 공간데이터베이스를 위한 확장된 밀도-격자 기반의 공간 클러스터링 알고리즘)

  • Gao, Song;Kim, Ho-Seok;Xia, Ying;Kim, Gyoung-Bae;Bae, Hae-Young
    • The KIPS Transactions:PartD / v.13D no.5 s.108 / pp.633-640 / 2006
  • Spatial clustering, which groups similar objects based on their distance, connectivity, or relative density in space, is an important component of spatial data mining. Density-based and grid-based clustering are the two main clustering approaches. The former is known for its ability to discover clusters of various shapes and to eliminate noise, while the latter is well known for its high speed. Clustering large data sets has always been a serious challenge for clustering algorithms, because a huge data set makes the clustering process extremely costly. In this paper, we propose an enhanced density-grid based clustering algorithm for large spatial databases that sets a default number of intervals and effectively removes outliers with the help of a proper measure for identifying areas of high density in the input data space. We use a density threshold DT to recognize dense cells before neighboring dense cells are combined to form clusters. When the proposed algorithm is run on a large dataset, a proper granularity for each dimension of the data space and a density threshold for recognizing dense areas can improve its performance. We combine grid-based and density-based methods not only to increase efficiency but also to find clusters of arbitrary shape. Experimental evaluation on synthetic datasets shows that the proposed method achieves high performance and accuracy.
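The dense-cell idea in this abstract can be sketched for two-dimensional points as follows. This is a generic grid-density illustration, not the authors' algorithm: the function name, the fixed `cell_size`, and the 8-neighbor merge rule are assumptions, and the paper's default interval count and outlier measure are not given in the abstract.

```python
from collections import defaultdict, deque


def grid_density_clusters(points, cell_size, density_threshold):
    """Cluster 2-D points by merging neighboring grid cells that are dense.

    A cell is dense when it holds at least `density_threshold` points (the DT
    of the abstract). Points in non-dense cells are treated as outliers.
    """
    cells = defaultdict(list)
    for x, y in points:
        cells[(int(x // cell_size), int(y // cell_size))].append((x, y))
    dense = {c for c, pts in cells.items() if len(pts) >= density_threshold}

    clusters, seen = [], set()
    for start in sorted(dense):
        if start in seen:
            continue
        seen.add(start)
        queue, members = deque([start]), []
        while queue:  # breadth-first search over adjacent dense cells
            cx, cy = queue.popleft()
            members.extend(cells[(cx, cy)])
            for dx in (-1, 0, 1):
                for dy in (-1, 0, 1):
                    nb = (cx + dx, cy + dy)
                    if nb in dense and nb not in seen:
                        seen.add(nb)
                        queue.append(nb)
        clusters.append(members)
    return clusters
```

Because each point is binned once and only cells are compared afterwards, the cost depends mainly on the number of occupied cells. This is where the speed of grid-based methods comes from. Merging adjacent dense cells lets clusters take arbitrary shapes.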

Tectonic Link between NE China and Korean Peninsula, Revealed by Interpreting CHAMP Satellite Magnetic and GRACE Satellite Gravity Data

  • Choi, Sungchan;Oh, Chang-Whan;Luehr, Herrmann
    • Journal of the Korean Geophysical Society / v.9 no.3 / pp.209-217 / 2006
  • The major continental blocks in NE Asia are the North China Block and the South China Block, which have collided, starting from the Korean peninsula. The suture zone in NE China between the two blocks is well defined from the Qinling-Dabie Orogenic Belt to the Jiaodong (Sulu) Belt by geological and geophysical interpretation. The discovery of high-pressure metamorphic rocks in the Hongsung area of the Korean peninsula can be used to estimate the suture zone, and it indicates that the suture zone in the Jiaodong Belt might extend to the Hongsung area. However, due to the lack of geological and geophysical data over the Yellow Sea, the extension of the suture zone to the Korean peninsula across the Yellow Sea is obscure. To find the tectonic relationship between NE China and the Korean peninsula, it is necessary to complete the homogeneous geophysical dataset of NE Asia, which can be provided by satellite observations. The CHAMP lithospheric magnetic field (MF3) and the CHAMP-GRACE gravity field, combined with surface measured data, allow a much more accurate inference of tectonic structures than was previously available. The CHAMP magnetic anomaly map reveals significant magnetic lows in the Yellow Sea near Nanjing and Hongsung, which are characterized by gravity highs on the CHAMP-GRACE gravity anomaly map. To evaluate the depth and location of the bodies causing the potential field anomalies, the Euler deconvolution method is implemented. Comparing the two potential field solutions with the simplified geological map containing tectonic lines and the distribution of earthquake epicenters shows that the structural boundaries derived from both coincide well with the seismic activity as well as with the tectonic lineaments. The interpretation of the CHAMP satellite magnetic and GRACE satellite gravity datasets reveals two tectonic boundaries in the Yellow Sea and the Korean peninsula, indicating the northern and southern margins of the suture zone between the North China Block and the South China Block. The former extends from the Jiaodong Belt in East China to the Imjingang Belt on the Korean peninsula, the latter from Nanjing, East China, to Hongsung on the Korean peninsula. The tectonic movement in or near the suture zone might be responsible for the seismic activity in the western region of the Korean peninsula and the development of the Yellow Sea sedimentary basin.
