• Title/Summary/Keyword: index clustering

EDGE: An Enticing Deceptive-content GEnerator as Defensive Deception

  • Li, Huanruo;Guo, Yunfei;Huo, Shumin;Ding, Yuehang
    • KSII Transactions on Internet and Information Systems (TIIS) / v.15 no.5 / pp.1891-1908 / 2021
  • Cyber deception defense mitigates Advanced Persistent Threats (APTs) by deploying deceptive entities, such as the Honeyfile. The Honeyfile distracts attackers from valuable digital documents and attracts unauthorized access by deliberately exposing fake content. Its effectiveness as a distraction and a trap depends on how enticing the fake content is. However, existing studies on the Honeyfile pay little attention to this aspect. In this work, we seek to improve the enticement of fake text content by enhancing its readability, indistinguishability, and believability. To this end, an enticing deceptive-content generator, EDGE, is presented. EDGE is constructed in three steps: extracting key concepts with a semantics-aware K-means clustering algorithm, searching for candidate deceptive concepts within the Word2Vec model, and generating deceptive text content under the Integrated Readability Index (IR). Furthermore, readability and believability performance analyses are undertaken. The experimental results show that EDGE generates indistinguishable deceptive text content without decreasing readability. Overall, EDGE proves effective at generating enticing deceptive text content as a deception defense against APTs.

Classification of Forest Cover Types in the Baekdudaegan, South Korea

  • Chung, Sang Hoon;Lee, Sang Tae
    • Journal of Forest and Environmental Science / v.37 no.4 / pp.269-279 / 2021
  • This study was carried out to introduce the forest cover types of the Baekdudaegan, which is home to a number of native tree species. To understand the vegetation distribution characteristics of the Baekdudaegan, a vegetation survey was conducted on its 20 major mountains. The vegetation data were collected from 3,959 sample points by the point-centered quarter method. Each mountain was classified into 4-7 forests by using multivariate statistical methods such as cluster analysis, indicator species analysis, multiple discriminant analysis, and species composition analysis. The forests were classified mainly according to the relative abundance of Quercus mongolica. A total of 111 forests were classified, and these were integrated into the following nine forest cover types using the percentage similarity index and by clustering according to vegetation type: 1) Mongolian oak, 2) Mongolian oak and other deciduous, 3) Oaks (Mixed Quercus spp.), 4) Korean red pine, 5) Korean red pine and oaks, 6) ash, 7) mixed mesophytic, 8) subalpine zone coniferous, and 9) miscellaneous forest. The subalpine zone coniferous type grouped forests characterized by similar environmental conditions, and the miscellaneous type grouped those forests that did not fit in any other category.
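
The abstract does not define the percentage similarity index it uses to merge forests. A minimal sketch, assuming the common Renkonen form (the sum over species of the smaller of the two relative abundances, expressed in percent); the function name and the two sample stands are hypothetical:

```python
def percentage_similarity(stand_a, stand_b):
    """Renkonen-style percentage similarity between two stands.

    Each stand is a dict mapping species name -> abundance (any unit).
    Abundances are converted to relative abundances before comparison.
    """
    total_a = sum(stand_a.values())
    total_b = sum(stand_b.values())
    if total_a <= 0 or total_b <= 0:
        raise ValueError("each stand needs a positive total abundance")
    species = set(stand_a) | set(stand_b)
    shared = sum(
        min(stand_a.get(s, 0) / total_a, stand_b.get(s, 0) / total_b)
        for s in species
    )
    return 100.0 * shared


# Hypothetical stands, not data from the study.
oak_stand = {"Quercus mongolica": 80, "Pinus densiflora": 20}
mixed_stand = {"Quercus mongolica": 50, "Pinus densiflora": 50}
similarity = percentage_similarity(oak_stand, mixed_stand)
```

Stands whose similarity exceeds a chosen threshold could then be merged into one cover type; the threshold itself is not given in the abstract.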

Machine Learning based Optimal Location Modeling for Children's Smart Pedestrian Crosswalk: A Case Study of Changwon-si (머신러닝을 활용한 어린이 스마트 횡단보도 최적입지 선정 - 창원시 사례를 중심으로 -)

  • Lee, Suhyeon;Suh, Youngwon;Kim, Sein;Lee, Jaekyung;Yun, Wonjoo
    • Journal of KIBIM / v.12 no.2 / pp.1-11 / 2022
  • Road traffic accidents (RTAs) are the leading cause of accidental death among children, and reducing child RTAs is becoming an increasingly important social issue. Municipalities aim to resolve this issue by introducing "Smart Pedestrian Crosswalks" that help prevent traffic accidents near children's facilities. Nonetheless, such facilities tend to be installed in a relatively limited number of areas, such as school zones. For budget allocation to be efficient and policy effects to be maximized, optimal location selection based on machine learning is needed. In this paper, we employ machine learning models to select the optimal locations for smart pedestrian crosswalks to reduce the RTAs of children. This study develops an optimal location index using variable importance measures. Using the k-means clustering method, the authors classified the crosswalks into three types after the optimal location selection. This study has broadened the scope of research in relation to smart crosswalks and traffic safety. The study also makes a unique contribution by basing policy design decisions on public and open data.

Water resources monitoring technique using multi-source satellite image data fusion (다종 위성영상 자료 융합 기반 수자원 모니터링 기술 개발)

  • Lee, Seulchan;Kim, Wanyub;Cho, Seongkeun;Jeon, Hyunho;Choi, Minhae
    • Journal of Korea Water Resources Association / v.56 no.8 / pp.497-508 / 2023
  • Agricultural reservoirs are crucial structures for water resources monitoring, especially in Korea, where the resources are unevenly distributed across seasons. Optical and Synthetic Aperture Radar (SAR) satellites, which are used as tools for monitoring the reservoirs, each have limitations: optical sensors are sensitive to weather conditions, and SAR sensors are sensitive to noise and to multiple scattering over dense vegetation. In this study, we tried to improve water body detection accuracy through optical-SAR data fusion and to quantitatively analyze the complementary effects. We first detected water bodies at the Edong and Cheontae reservoirs by applying the K-means clustering technique to the Normalized Difference Water Index (NDWI) derived from the Compact Advanced Satellite 500 (CAS500), Kompsat-3/3A, and Sentinel-2, and to the SAR backscattering coefficient from Sentinel-1. After that, the improvements in accuracy were analyzed by applying K-means clustering to the 2-D grid space consisting of NDWI and SAR. Kompsat-3/3A was found to have the best accuracy (0.98 at both reservoirs), followed by Sentinel-2 (0.83 at Edong, 0.97 at Cheontae), Sentinel-1 (0.93 at both), and CAS500 (0.69, 0.78). By applying K-means clustering to the 2-D space at Cheontae reservoir, the accuracy of CAS500 was improved by around 22% (resulting accuracy: 0.95), with an improvement in precision (85%) and a degradation in recall (14%). The precision of Kompsat-3A (Sentinel-2) was improved by 3% (5%), and recall was degraded by 4% (7%). More precise water resources monitoring is expected to become possible with the development of high-resolution SAR satellites, including CAS500-5, and of image fusion and water body detection techniques.
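
The fusion step described above, clustering pixels in a 2-D space of NDWI and SAR backscatter, can be sketched as follows. This is an illustrative sketch only: it assumes the McFeeters form of NDWI, min-max scaling of both axes, and a plain Lloyd's K-means with a deterministic start; none of these details are stated in the abstract, and the pixel values are made up.

```python
import math


def ndwi(green, nir):
    """NDWI in the McFeeters form (assumed): (green - NIR) / (green + NIR)."""
    return (green - nir) / (green + nir)


def min_max_scale(values):
    """Rescale a feature to [0, 1] so neither axis dominates the distance."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) if hi > lo else 0.0 for v in values]


def kmeans(points, k, iters=100):
    """Plain Lloyd's K-means with a deterministic farthest-first start."""
    centroids = [points[0]]
    while len(centroids) < k:
        # Add the point that is farthest from its nearest existing centroid.
        centroids.append(
            max(points, key=lambda p: min(math.dist(p, c) for c in centroids))
        )
    labels = []
    for _ in range(iters):
        labels = [
            min(range(k), key=lambda j: math.dist(p, centroids[j])) for p in points
        ]
        updated = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                updated.append(tuple(sum(axis) / len(members) for axis in zip(*members)))
            else:
                updated.append(centroids[j])  # keep an empty cluster's centroid
        if updated == centroids:
            break
        centroids = updated
    return labels, centroids


# Hypothetical pixels: (green reflectance, NIR reflectance, SAR backscatter in dB).
pixels = [
    (0.10, 0.02, -22.0), (0.09, 0.03, -20.5), (0.11, 0.02, -23.1),  # water-like
    (0.08, 0.30, -8.0), (0.07, 0.28, -9.4), (0.09, 0.35, -7.2),     # land-like
]
ndwi_values = [ndwi(g, n) for g, n, _ in pixels]
sar_values = [s for _, _, s in pixels]
features = list(zip(min_max_scale(ndwi_values), min_max_scale(sar_values)))

labels, centroids = kmeans(features, k=2)
# The water cluster is taken to be the one whose centroid has the higher NDWI.
water_label = max(range(2), key=lambda j: centroids[j][0])
water_mask = [lab == water_label for lab in labels]
```

Scaling matters here because NDWI lies in [-1, 1] while backscatter in dB spans tens of units; without it the SAR axis would dominate the Euclidean distance.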

A Study on the Measurement of Technological Impact using Citation Analysis of Patent Information (특허정보분석을 이용한 기술파급효과 측정에 관한 연구)

  • Yoo, Sun-Hi;Lee, Yong-Ho;Won, Dong-Kyu
    • Journal of Korea Technology Innovation Society / v.10 no.4 / pp.687-705 / 2007
  • In an environment that is increasingly complex, uncertain, and costly, it has become more important to measure the technological impact of a given R&D technology on others when making strategic decisions or selections. However, there have been very few proper methods for measuring this quantitatively. We therefore studied how to measure the technological impact of one group of technologies on others, that is, the flow of disembodied knowledge, using patent citation analysis. We reviewed the prior art on the measurement of technological impact and designed an effective citation analysis method using patent information, analyzing the prior art of patent citation analysis method and ie index. Finally, using KISTI's US patent database (USPA), we developed a disembodied knowledge flow matrix between technology groups by counting citation frequencies between them, and an index that represents technological impact on others, using the developed matrix as well as the intrinsic nature of the technology groups clustered by network analysis. The results of this study give quantitative insight into the impact of a technology on others, and are intended to serve as a reference for R&D budgeting and decision making in R&D planning, or as basic information for understanding technology conversion or fusion.
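
Building a knowledge flow matrix by counting citation frequencies between technology groups might look like the sketch below. It assumes that knowledge flows from the cited patent's group to the citing patent's group; the function name, patent ids, and group labels are hypothetical and do not come from the USPA database.

```python
from collections import Counter


def knowledge_flow_matrix(citations, group_of):
    """Count citations between technology groups.

    citations: iterable of (citing_patent, cited_patent) pairs.
    group_of:  dict mapping patent id -> technology group.
    Returns (groups, matrix), where matrix[i][j] is the number of citations
    made by patents in groups[j] to patents in groups[i], read here as
    knowledge flowing from groups[i] to groups[j].
    """
    counts = Counter()
    for citing, cited in citations:
        # Skip citations to or from patents outside the classified set.
        if citing in group_of and cited in group_of:
            counts[(group_of[cited], group_of[citing])] += 1
    groups = sorted(set(group_of.values()))
    matrix = [[counts[(src, dst)] for dst in groups] for src in groups]
    return groups, matrix


# Hypothetical patents and citations.
group_of = {"P1": "A", "P2": "A", "P3": "B", "P4": "C"}
citations = [("P3", "P1"), ("P3", "P2"), ("P4", "P3"), ("P2", "P1"), ("P4", "P1")]
groups, matrix = knowledge_flow_matrix(citations, group_of)
```

Row sums of such a matrix give one simple reading of a group's outward impact; the index actually developed in the paper is not specified in the abstract.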

Indexing and Retrieval Mechanism using Variation Patterns of Theme Melodies in Content-based Music Information Retrievals (내용 기반 음악 정보 검색에서 주제 선율의 변화 패턴을 이용한 색인 및 검색 기법)

  • 구경이;신창환;김유성
    • Journal of KIISE:Databases / v.30 no.5 / pp.507-520 / 2003
  • This paper proposes an automatic method for constructing a theme melody index for a large music database, together with an associative content-based music retrieval mechanism that relies mainly on the constructed theme melody index to improve users' response time. First, the system automatically extracts the theme melody from a music file with a graphical clustering algorithm based on the similarities between motifs of the music. To place an extracted theme melody into the metric space of an M-tree, we chose the average length variation and the average pitch variation of the theme melody as the major features. Moreover, we added a pitch signature and a length signature, which summarize the pitch variation pattern and the length variation pattern of a theme melody, respectively, to increase the precision of retrieval results. We also propose an associative content-based music retrieval mechanism in which the k-nearest neighbor search and range search algorithms of the M-tree are used to select melodies similar to the user's query melody from the theme melody index. To improve users' satisfaction, the proposed retrieval mechanism includes ranking and user relevance feedback functions. We also implemented the proposed mechanisms as essential components of a content-based music retrieval system to verify their usefulness.
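
The two major features and the two query types can be illustrated with a small sketch. It makes two assumptions that the abstract does not confirm: the average pitch and length variations are taken as mean absolute differences between consecutive notes, and a linear scan stands in for the M-tree (it returns the same answers, without the index's pruning). All names and melodies are hypothetical.

```python
import math


def melody_features(notes):
    """Map a melody to a 2-D point (avg pitch variation, avg length variation).

    notes: list of (pitch, length) pairs, e.g. MIDI pitch and duration in beats.
    Variation is taken as the mean absolute difference between consecutive
    notes, which is an assumed definition.
    """
    steps = list(zip(notes, notes[1:]))
    if not steps:
        raise ValueError("a melody needs at least two notes")
    pitch_var = sum(abs(b[0] - a[0]) for a, b in steps) / len(steps)
    length_var = sum(abs(b[1] - a[1]) for a, b in steps) / len(steps)
    return (pitch_var, length_var)


def knn_search(index, query, k):
    """Names of the k indexed melodies closest to the query point."""
    return sorted(index, key=lambda name: math.dist(index[name], query))[:k]


def range_search(index, query, radius):
    """Names of all indexed melodies within `radius` of the query point."""
    return sorted(
        name for name, point in index.items() if math.dist(point, query) <= radius
    )


# Hypothetical theme melody index: name -> feature point.
theme_index = {
    "theme_a": melody_features([(60, 1), (62, 1), (64, 2)]),
    "theme_b": melody_features([(60, 1), (67, 1), (64, 0.5)]),
    "theme_c": melody_features([(72, 1), (74, 1), (77, 2)]),
}
query_point = melody_features([(48, 1), (50, 1), (52, 2)])
nearest = knn_search(theme_index, query_point, k=2)
within = range_search(theme_index, query_point, radius=1.0)
```

Because the features are differences between consecutive notes, a transposed copy of a melody maps to the same point, which is why the query above matches `theme_a` exactly.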

Association between High Diffusion-Weighted Imaging-Derived Functional Tumor Burden of Peritoneal Carcinomatosis and Overall Survival in Patients with Advanced Ovarian Carcinoma

  • He An;Jose AU Perucho;Keith WH Chiu;Edward S Hui;Mandy MY Chu;Siew Fei Ngu;Hextan YS Ngan;Elaine YP Lee
    • Korean Journal of Radiology / v.23 no.5 / pp.539-547 / 2022
  • Objective: To investigate the association between functional tumor burden of peritoneal carcinomatosis (PC) derived from diffusion-weighted imaging (DWI) and overall survival in patients with advanced ovarian carcinoma (OC). Materials and Methods: This prospective study was approved by the local research ethics committee, and informed consent was obtained. Fifty patients (mean age ± standard deviation, 57 ± 12 years) with stage III-IV OC scheduled for primary or interval debulking surgery (IDS) were recruited between June 2016 and December 2021. DWI (b values: 0, 400, and 800 s/mm²) was acquired with a 16-channel phased-array torso coil. The functional PC burden on DWI was derived based on K-means clustering to discard fat, air, and normal tissue. A score similar to the surgical peritoneal cancer index was assigned to each abdominopelvic region, with additional scores assigned to the involvement of critical sites, denoted as the functional peritoneal cancer index (fPCI). The apparent diffusion coefficient (ADC) of the largest lesion was calculated. Patients were dichotomized by immediate surgical outcome into high- and low-risk groups (with and without residual disease, respectively) with subsequent survival analysis using the Kaplan-Meier curve and log-rank test. Multivariable Cox proportional hazards regression was used to evaluate the association between DWI-derived results and overall survival. Results: Fifteen (30.0%) patients underwent primary debulking surgery, and 35 (70.0%) patients received neoadjuvant chemotherapy followed by IDS. Complete tumor debulking was achieved in 32 patients. Patients with residual disease after debulking surgery had reduced overall survival (p = 0.043). The fPCI/ADC was negatively associated with overall survival after accounting for clinicopathological information, with a hazard ratio of 1.254 for high fPCI/ADC (95% confidence interval, 1.007-1.560; p = 0.043). Conclusion: A high DWI-derived functional tumor burden was associated with decreased overall survival in patients with advanced OC.
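
For readers unfamiliar with the Kaplan-Meier curve mentioned in the Methods, the product-limit estimate can be sketched in a few lines. The follow-up times and event flags below are invented for illustration and are not the study's data; the function name is hypothetical.

```python
def kaplan_meier(times, events):
    """Product-limit survival estimate.

    times:  follow-up time for each patient.
    events: 1 if the event (death) was observed at that time, 0 if censored.
    Returns a list of (time, survival probability) at each observed event time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Group all patients who share this follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving  # deaths and censored patients both leave the risk set
    return curve


# Hypothetical follow-up in months; 0 marks a censored observation.
curve = kaplan_meier([5, 8, 8, 12, 20], [1, 1, 0, 1, 0])
```

Computing one such curve per risk group and comparing them is what the log-rank test formalizes.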

The Metabolic Syndrome in Obese Children (소아 비만에서 대사증후군의 고찰)

  • Yom, Hye Won;Shin, Jee Seon;Lee, Hyun Joo;Park, So Eun;Jo, Su Jin;Seo, Jeong Wan
    • Pediatric Gastroenterology, Hepatology & Nutrition / v.7 no.2 / pp.228-238 / 2004
  • Purpose: Obesity is rapidly increasing in Korean children. Obesity is a risk factor for cardiovascular morbidity and is frequently associated with hypertension, diabetes mellitus and coronary artery disease. This study was designed to evaluate risk factors of the metabolic syndrome in obese children. Methods: From February 2000 to June 2004, eighty-eight obese (body mass index $\geq$ 95th percentile) children aged 4 to 15 years were included. We measured serum lipid levels (total cholesterol, triglyceride, HDL cholesterol, LDL cholesterol), fasting sugar levels and insulin levels. Insulin resistance was determined by homeostasis model assessment, fasting insulin/glucose ratio and quantitative insulin sensitivity check index. Results: Clustering of risk factors for the metabolic syndrome in obese children demonstrated that 60.2% had more than one risk factor. Hypertension (14.8%), hypertriglyceridemia (14.8%), HDL-hypocholesterolemia (14.8%), LDL-hypercholesterolemia (12.5%) and hyperinsulinemia (12.5%) were observed. As BMI increased, there was a statistically significant increase in systolic blood pressure, insulin and insulin resistance values. Insulin resistance was correlated with systolic blood pressure, serum lipid and insulin levels. The more risk factors for the metabolic syndrome obese children had, the higher their insulin resistance was. Conclusion: The increase in insulin resistance and clustering of risk factors for the metabolic syndrome are already apparent in obese children. Monitoring these risk factors for the metabolic syndrome should become a part of routine medical care for obese children.
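
The three insulin resistance measures named in the Methods are simple functions of fasting insulin and glucose. The sketch below assumes the widely used formulas and units (insulin in µU/mL, glucose in mg/dL); the abstract states neither, and the sample values are hypothetical.

```python
import math

MG_DL_PER_MMOL_L = 18.0  # approximate glucose conversion factor


def homa_ir(insulin_uU_ml, glucose_mg_dl):
    """Homeostasis model assessment: insulin x glucose (mmol/L) / 22.5."""
    glucose_mmol_l = glucose_mg_dl / MG_DL_PER_MMOL_L
    return insulin_uU_ml * glucose_mmol_l / 22.5


def insulin_glucose_ratio(insulin_uU_ml, glucose_mg_dl):
    """Fasting insulin/glucose ratio, in the order given in the abstract."""
    return insulin_uU_ml / glucose_mg_dl


def quicki(insulin_uU_ml, glucose_mg_dl):
    """Quantitative insulin sensitivity check index:
    1 / (log10(insulin) + log10(glucose))."""
    return 1.0 / (math.log10(insulin_uU_ml) + math.log10(glucose_mg_dl))


# Hypothetical fasting values for one child.
insulin, glucose = 10.0, 90.0
indices = {
    "HOMA-IR": homa_ir(insulin, glucose),
    "insulin/glucose": insulin_glucose_ratio(insulin, glucose),
    "QUICKI": quicki(insulin, glucose),
}
```

Higher HOMA-IR and insulin/glucose values indicate more insulin resistance, whereas QUICKI moves in the opposite direction because it measures sensitivity.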

Notes on the Benthic Macrofauna During September 1997 Namdaecheon Estuary, Gangneung, Korea (강릉 남대천 하구역의 1997년 9월중 대형저서동물의 분포패턴)

  • 홍재상;서인수;윤건탁;황인서;김창수
    • Korean Journal of Environmental Biology / v.22 no.2 / pp.341-350 / 2004
  • We examined estuarine macrobenthos in the Namdaecheon estuary, Gangneung, Korea, on September 22, 1997. A total of 56 species were found, with an abundance of 378 individuals $m^{-2}$ and a biomass of 20.79 gWWt $m^{-2}$. The dominant species were an unidentified oligochaete, followed by the polychaetes Hediste japonica, Rhynchospio glutaea, Poecilochaetus trilobatus, Scoloplos armiger, and Spiophanes bombyx, and a talitrid amphipod, Platorchestia crassicornis. The study area was divided into two different groups of stations and species, based on q-mode and r-mode clustering analysis. In the case of q-mode, there are two groups: one is a marine station group and the other is an estuarine station group. The r-mode clustering analysis showed two main communities: 1) a marine species group that occurred only at stations 7, 8 and 9, and 2) an estuarine species group composed of the species present at stations 1 to 6 and 10 to 15. In terms of the number of species, the estuarine station group includes 13 species, whereas 43 species were present in the marine station group. Total macrofaunal abundance and total biomass were higher in the marine station group than in the estuarine group. The species diversity index was also high in the marine group (> 2), whereas it was less than ca. 1 in the estuarine group.
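
The abstract does not say which species diversity index was used. As an illustration only, the sketch below assumes the Shannon index H' with the natural logarithm, a common choice in benthic surveys; the station counts are hypothetical.

```python
import math


def shannon_diversity(counts):
    """Shannon index H' = -sum(p_i * ln(p_i)) over species with nonzero counts.

    counts: number of individuals per species at one station.
    """
    total = sum(counts)
    if total <= 0:
        raise ValueError("a station needs at least one individual")
    return sum(-(c / total) * math.log(c / total) for c in counts if c > 0)


# Hypothetical stations: an even, species-rich sample and one dominated
# by a single species.
even_station = [12, 10, 11, 9, 10, 12, 8, 11, 10, 9]
dominated_station = [90, 4, 3, 2, 1]
h_even = shannon_diversity(even_station)
h_dominated = shannon_diversity(dominated_station)
```

An even, species-rich sample yields a high value, while a sample dominated by one taxon yields a low one, which matches the direction of the marine versus estuarine contrast reported above.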

Analysis of Area Type Classification of Seoul Using Geodemographics Methods (Geodemographics의 연구기법을 활용한 서울시 지역유형 분석 연구)

  • Woo, Hyun-Jee;Kim, Young-Hoon
    • Journal of the Korean association of regional geographers / v.15 no.4 / pp.510-523 / 2009
  • Geodemographics (GD) can be defined as an analytical approach that uses socio-economic and behavioral data about people to investigate geographical patterns. GD is based on the assumption that the demographic and behavioral characteristics of people who live in the same neighborhood are similar, so that neighborhoods can be categorized using geographical classifications. To assess the applicability of the geographical classification of GD, this paper applies the concepts of geodemographics to areas of Seoul, using Korean census data sets that contain key characteristics of the demographic profiles of each area. This paper then attempts to explain each area classification profile by using clustering techniques with Ward's and k-means statistical methods. For this purpose, this paper employs the 2005 Census dataset released by the Korea National Statistics Office, and the neighborhood unit is based on the Dong level, the smallest administrative boundary unit in Korea. After variables are selected and standardized, the areas are categorized by the clustering techniques into 13 distinctive cluster profiles. These cluster profiles are used to write a short description of each cluster and to expand on the cluster names. Finally, the results of the classification offer a reasonable basis for judging target area types, which benefits people who make spatial decisions in their spatial problem-solving.