• Title/Summary/Keyword: On-Line Mining

Reinforcement Mining Method for Anomaly Detection and Misuse Detection using Post-processing and Training Method (이상탐지(Anomaly Detection) 및 오용탐지(Misuse Detection) 분석의 정확도 향상을 위한 개선된 데이터마이닝 방법 연구)

  • Choi Yun-Jeong;Park Seung-Soo
    • Proceedings of the Korean Information Science Society Conference / 2006.06b / pp.238-240 / 2006
  • Mining systems designed to analyze the large volumes of diverse data generated on networks accurately and efficiently mostly focus on technique, such as how many data mining methods they support and can apply, rather than on how training data should be constructed and handled in a goal-oriented way. As a result, methods for accurately detecting recent security attacks, which are becoming increasingly agent-based, distributed, automated, and stealthy, remain inadequate. To detect the complex and intelligent intrusion patterns that can arise in ubiquitous environments, this study proposes the RTPID (Refinement Training and Post-processing for Intrusion Detection) system, which combines data mining techniques and fault-tolerance methods in an improved learning algorithm and a post-processing method. The RTPID system uses active learning and post-processing to handle and analyze the forms of intrusion that can occur within a network accurately and efficiently. It improves on existing data mining analysis that focuses on technique alone. In particular, by adopting the advantages of active learning during the proposed analysis process, it is expected to deliver the effect of a self-learning method that raises learning effectiveness while reducing cost. The method minimizes administrator intervention and is also expected to reduce False Positive and False Negative errors very efficiently. Because the proposed method does not depend on a particular analysis tool or system, it can be applied to network environments in many fields that face similar problems.

The Analysis of Changes in East Coast Tourism using Topic Modeling (토핑 모델링을 활용한 동해안 관광의 변화 분석)

  • Jeong, Eun-Hee
    • The Journal of Korea Institute of Information, Electronics, and Communication Technology / v.13 no.6 / pp.489-495 / 2020
  • In a hyper-connected society where the 4th industrial revolution is under way, the amount of data generated by various IT devices is increasing, and new value can be created by analyzing that data. This study collected a total of 1,526 articles published from 2017 to 2019 in central magazines, economic magazines, regional associations, and major broadcasting companies through Bigkinds, using the keyword "(East Coast Tourism or East Coast Travel) and Gangwon-do". Topic modeling was performed on the collected 1,526 articles using the LDA algorithm implemented in the R language. Keywords were extracted for each year from 2017 to 2019, and high-frequency keywords were classified and compared by year. The optimal number of topics was set to 8 using log likelihood and perplexity, and the 8 topics were then inferred using the Gibbs sampling method. The inferred topics were Gangneung and Beach, Goseong and Mt. Geumgang, KTX and Donghae-Bukbu line, weekend sea tour, Sokcho and Unification Observatory, Yangyang and Surfing, experience tour, and transportation network infrastructure. Changes in articles on East Coast tourism were analyzed using the proportions of the eight inferred topics. As a result, in 2018 compared to 2017, the proportions of Unification Observatory and Mt. Geumgang showed no significant change, the proportions of KTX and experience tour increased, and the proportions of the other topics decreased. In 2019, the proportions of KTX and experience tour decreased, while the proportions of the other topics showed no significant change.
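
The topic-inference step described in this abstract can be sketched as a small collapsed Gibbs sampler for LDA, followed by averaging topic proportions per year. The study used an LDA implementation in R; the Python below is only an illustrative stand-in. The names `lda_gibbs` and `yearly_topic_share`, the hyperparameters `alpha` and `beta`, and the toy documents are all assumptions, not taken from the paper.

```python
import random

def lda_gibbs(docs, num_topics, iterations=200, alpha=0.1, beta=0.01, seed=0):
    """Collapsed Gibbs sampling for LDA; returns per-document topic proportions."""
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    vocab_size = len(vocab)
    doc_topic = [[0] * num_topics for _ in docs]                # tokens per (doc, topic)
    topic_word = [[0] * vocab_size for _ in range(num_topics)]  # tokens per (topic, word)
    topic_total = [0] * num_topics                              # tokens per topic
    assignments = []
    # Start from a random topic assignment for every token.
    for d, doc in enumerate(docs):
        z_doc = []
        for w in doc:
            k = rng.randrange(num_topics)
            z_doc.append(k)
            doc_topic[d][k] += 1
            topic_word[k][word_id[w]] += 1
            topic_total[k] += 1
        assignments.append(z_doc)
    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                v = word_id[w]
                k = assignments[d][i]
                # Remove the token's current assignment from the counts.
                doc_topic[d][k] -= 1
                topic_word[k][v] -= 1
                topic_total[k] -= 1
                # Resample its topic from the conditional distribution.
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][v] + beta)
                    / (topic_total[t] + vocab_size * beta)
                    for t in range(num_topics)
                ]
                k = rng.choices(range(num_topics), weights=weights)[0]
                assignments[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][v] += 1
                topic_total[k] += 1
    # Smoothed topic proportions for each document.
    return [
        [(n + alpha) / (len(doc) + num_topics * alpha) for n in doc_topic[d]]
        for d, doc in enumerate(docs)
    ]

def yearly_topic_share(theta, years):
    """Average the per-document topic proportions within each year."""
    grouped = {}
    for row, year in zip(theta, years):
        grouped.setdefault(year, []).append(row)
    return {
        year: [sum(col) / len(rows) for col in zip(*rows)]
        for year, rows in grouped.items()
    }

# Toy, made-up corpus: four tokenized "articles" with publication years.
docs = [
    ["beach", "gangneung", "beach", "sea"],
    ["ktx", "train", "line", "ktx"],
    ["beach", "sea", "surfing"],
    ["train", "ktx", "line", "station"],
]
years = [2017, 2017, 2018, 2018]
theta = lda_gibbs(docs, num_topics=2)
yearly = yearly_topic_share(theta, years)
```

Comparing the entries of `yearly` across years mirrors the kind of proportion comparison the abstract reports. Selecting the number of topics by log likelihood or perplexity is not shown here.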

Development of Selection Model of Interchange Influence Area in Seoul Belt Expressway Using Chi-square Automatic Interaction Detection (CHAID) (CHAID분석을 이용한 나들목 주변 지가의 공간분포 영향모형 개발 - 서울외곽순환고속도로를 중심으로 -)

  • Kim, Tae Ho;Park, Je Jin;Kim, Young Il;Rho, Jeong Hyun
    • KSCE Journal of Civil and Environmental Engineering Research / v.29 no.6D / pp.711-717 / 2009
  • This study develops a model for analyzing the relationship between a major node (an expressway interchange) and the land price formation of apartments along the Seoul Belt Expressway, using CHAID analysis. The results show, first, that the regions along the Seoul Belt Expressway differ (outer side: Gyeonggi-do; inner side: Seoul), and that although a linear relationship between land price and the traffic node would generally be expected, the graphs do not show one; second, CHAID analysis shows two distinct spatial distributions, split at 2.6 km, on the outer side, but three distinct spatial distributions, split at 1.4 km and 3.8 km, on the inner side. In other words, traffic access does not necessarily guarantee high housing prices, since the graphs show that land price follows a composite spatial distribution. This implies that the residential environment (highway noise and regional discontinuity) and traffic accessibility interact to produce this phenomenon. Therefore, the highway IC land price model will be useful for estimating land prices in the New Towns that are continually being built along the highway.
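
A CHAID-style split point of the kind reported above can be illustrated by scanning candidate distance thresholds and keeping the one with the largest Pearson chi-square statistic. This is a simplified sketch of one splitting step, not full CHAID (which also merges categories and adjusts significance levels). The function names, the distances, and the price classes are made up for illustration and are not the paper's data.

```python
def chi_square(table):
    """Pearson chi-square statistic for a contingency table given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            if expected > 0:
                stat += (observed - expected) ** 2 / expected
    return stat

def best_split(distances_km, price_classes, candidates):
    """Return (threshold, statistic) for the candidate with the largest chi-square."""
    labels = sorted(set(price_classes))
    best = None
    for threshold in candidates:
        # Rows: near side (<= threshold) and far side; columns: price classes.
        table = [[0] * len(labels) for _ in range(2)]
        for distance, label in zip(distances_km, price_classes):
            side = 0 if distance <= threshold else 1
            table[side][labels.index(label)] += 1
        stat = chi_square(table)
        if best is None or stat > best[1]:
            best = (threshold, stat)
    return best

# Hypothetical observations: distance to the interchange and a land price class.
distances = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0]
classes = ["high", "high", "high", "high", "low", "low", "low", "low"]
split = best_split(distances, classes, candidates=[1.0, 2.0, 3.0])
```

With this toy data the classes separate cleanly at 2.0 km, so that candidate wins. Real land price data would need a significance test before a split is accepted.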

Seasonal Variation of Surface Sediments in the Myeongsasipri Tidal Flat, Gochanggun, SW Korea (고창군 명사십리 조간대 표층 퇴적물의 계절 변화)

  • So, Kwang-Suk;Ryang, Woo-Hun;Kwon, Yi-Kyun
    • The Sea:JOURNAL OF THE KOREAN SOCIETY OF OCEANOGRAPHY / v.14 no.3 / pp.181-188 / 2009
  • The macrotidal flat of Myeongsasipri, Gochanggun, located on the southwestern coast of Korea, is studied in terms of seasonal variations in surface sediment and sedimentary environment. Surface sediments were sampled at 45 sites in the winter (February) and in the summer (August) along three survey lines (15 sites on each line). The open-coast Myeongsasipri tidal flat is mainly composed of fine to medium sand, the distribution of which shows a coast-parallel trend. Grain-size distribution is bimodal, and grain size in the winter is coarser than in the summer. During the winter, the upper tidal flat is dominated by medium sand, while the lower tidal flat is dominated by fine sand. This feature is attributed to wave-dominated sedimentation in the winter. The finer grain size in the summer than in the winter, together with the relationships between texture parameters, suggests that tidal energy plays an important role in tidal-flat sedimentation during the summer. This study documents an environmental change from wave-dominated conditions in the winter to tide-dominated conditions in the summer, resulting from seasonal variation in the intensity of onshore-directed winds and waves on the Myeongsasipri tidal flat.

An Analysis of the Internal Marketing Impact on the Market Capitalization Fluctuation Rate based on the Online Company Reviews from Jobplanet (직원을 위한 내부마케팅이 기업의 시가 총액 변동률에 미치는 영향 분석: 잡플래닛 기업 리뷰를 중심으로)

  • Kichul Choi;Sang-Yong Tom Lee
    • Information Systems Review / v.20 no.2 / pp.39-62 / 2018
  • Thanks to the growth of computing power and recent developments in data analytics, researchers have started to work on data produced by users through the Internet and social media. This study is in line with these research trends and adopts data analytic techniques. We focus on the impact of "internal marketing" factors on firm performance, which is typically studied through survey methodologies. We examined the job review platform Jobplanet (www.jobplanet.co.kr), a website where current and former employees anonymously review companies and their management. Using web crawling, we collected over 40K data points and performed morphological analysis to classify employees' reviews into internal marketing data. We then conducted econometric analysis to examine the relationship between internal marketing and market capitalization. Contrary to the findings of extant survey studies, internal marketing is positively related to a firm's market capitalization only in limited areas; in most areas, the relationship is negative. In particular, a female-friendly environment and human resource development (HRD) are the areas that exhibit positive relationships with market capitalization in the manufacturing industry. In the service industry, most areas, such as employee welfare and work-life balance, are negatively related to market capitalization. When the firm is small (or its history is short), a female-friendly environment positively affects firm performance. In contrast, when the firm is large (or its history is long), most of the internal marketing factors are either negative or insignificant. We explain the theoretical contributions and managerial implications of these results.

Sedimentary History and Tectonics in the Southeastern Continental Shelf of Korea based on High Resolution Shallow Seismic Data. (고해상탄성파탐사자료에 의한 한국남동대륙붕의 퇴적사 및 조구조운동)

  • Min Geon Hong;Park Yong Ahn
    • The Korean Journal of Petroleum Geology / v.5 no.1_2 s.6 / pp.1-8 / 1997
  • Seismic stratigraphic analysis of the high-resolution profiles obtained from the southeastern shelf of Korea divided the deposits into four sequences: 1) sequence D, 2) sequence C, 3) sequence B, and 4) sequence A (Holocene sediments). Sequence D was deposited in a shallow-water environment west of the Yangsan Fault as the basin subsided, whereas its eastern part was formed at the slope front. The landward part of the slope-front fill sediments was eroded and redeposited near the slope due to syndepositional tilting of the basin. This tilting probably resulted from the continuous closing of the Ulleung Basin. Sequence C consists of stacked successions of lowstand fluvial sediments, transgressive sediments, and marine highstand sediments derived from the paleo-river west of the Yangsan Fault. Sequence C east of the Yangsan Fault was formed at the shelf break. Progradation of the lowstand sediments broadened the shelf. Sequence C in the eastern part was also tilted, but more weakly than sequence D. During the formation of sequence B, the tilting stopped and a point source, instead of a line source, became active on both sides of the Yangsan Fault. Sequence B is composed of a highstand systems tract partially preserved around Yokji Island, a lowstand systems tract mainly preserved in the Korea Trough, and a transgressive systems tract. After the tilting stopped, the compressive force due to the closing of the Ulleung Basin may have been released by strike-slip faulting instead of tilting.

Geochemical Characteristics of Granodiorite and Arenaceous Sedimentary Rocks in Chon-Ashuu Area, Kyrgyzstan (키르키스스탄 촌아슈 지역 화강섬록암질암 및 사질원 퇴적암의 지화학적 특징)

  • Kim, Soo-Young;Chi, Sei-Jung;Park, Sung-Won
    • Economic and Environmental Geology / v.44 no.4 / pp.273-288 / 2011
  • In terms of geotectonic setting, the Chon-Ashuu copper mining claim area is located in the northern part of the suture line that borders the margin of the Issik-kul micro-continent, in the southern part of the North Tien-Shan terrane. The geological blocks of the Chon-Ashuu district belong to the southern tip of the Kazakhstan orocline. The rock formations of this area are composed of continental crust and/or arc collage, and of a paleo-continental fragment and accretionary wedge complex of pre-Altaid orogenic materials. According to the ASI (Alumina Saturation Index), the Paleozoic plutonic rocks in the Chon-Ashuu area are peraluminous and metaluminous rocks generated by fractional crystallization of island-arc and volcanic-arc crust in a syn- to post-collisional plate setting. The geology of the Chon-Ashuu area consists of upper Proterozoic and Paleozoic rock formations. According to Harker variation diagrams for the Chon-Ashuu arenaceous sedimentary rocks, the silty sandstone of the Chon-Ashuu area, which shows mineralogical immaturity, was derived from an island arc or an active continental margin environment in the Cambro-Carboniferous period. Numerous intrusive rocks in the Chon-Ashuu area are distributed along northeast-trending tectonic structures and are bounded on four sides by a conjugate pattern. The most common types of plutonic rock are granodiorite and monzodiorite. According to the molecular normative An-Ab-Or composition (Barker, 1979), the plutonic rocks in the Chon-Ashuu area are classified as a tonalite-trondhjemite-granodiorite (TTG) series, an assemblage of rocks formed by melting of hydrous mafic crust at high pressure, which is the country rock of the copper mineralization.
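
The peraluminous/metaluminous distinction mentioned above is commonly made from molar oxide ratios. The sketch below uses the widely used simplified form A/CNK = molar Al2O3 / (CaO + Na2O + K2O), together with A/NK, following the usual Shand-type classification. It assumes no apatite correction to CaO, and the oxide weight percentages are hypothetical values, not analyses from the paper.

```python
# Molar masses in g/mol for the oxides used by the index.
MOLAR_MASS = {"Al2O3": 101.96, "CaO": 56.08, "Na2O": 61.98, "K2O": 94.20}

def moles(sample_wt_pct, oxide):
    """Convert an oxide's weight percent into a molar proportion."""
    return sample_wt_pct.get(oxide, 0.0) / MOLAR_MASS[oxide]

def alumina_indices(sample_wt_pct):
    """Return the molar ratios (A/CNK, A/NK) for one whole-rock analysis."""
    al = moles(sample_wt_pct, "Al2O3")
    ca = moles(sample_wt_pct, "CaO")
    na = moles(sample_wt_pct, "Na2O")
    k = moles(sample_wt_pct, "K2O")
    return al / (ca + na + k), al / (na + k)

def classify(sample_wt_pct):
    """Classify a sample by alumina saturation."""
    a_cnk, a_nk = alumina_indices(sample_wt_pct)
    if a_cnk > 1.0:
        return "peraluminous"   # Al exceeds what Ca, Na and K can balance
    if a_nk > 1.0:
        return "metaluminous"   # Al exceeds the alkalis but not alkalis plus Ca
    return "peralkaline"        # alkalis exceed Al

# Hypothetical whole-rock analyses in weight percent (illustration only).
sample_a = {"Al2O3": 15.0, "CaO": 3.5, "Na2O": 3.8, "K2O": 3.0}
sample_b = {"Al2O3": 16.0, "CaO": 1.0, "Na2O": 3.0, "K2O": 4.0}
result_a = classify(sample_a)
result_b = classify(sample_b)
```

Here `sample_a` has A/CNK of about 0.95 and classifies as metaluminous, while `sample_b` has A/CNK of about 1.44 and classifies as peraluminous.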

A Mineralogical and Gemological Studies for the Enhancement of Tanzania Ruby by Heat Treatment (탄자니아산 루비의 열처리에 의한 보석·광물학적 품질개선 연구)

  • Kim, Seon-Ok;Wang, Sookyun;Oh, Sul-Mi;Park, Hee Yul;Park, Maeng-Eon
    • Economic and Environmental Geology / v.47 no.6 / pp.563-569 / 2014
  • Ruby is one of the most favored colored gems, valued for its beautiful red tone and its scarcity. However, high-quality rubies are produced only in restricted regions, such as Thailand, Sri Lanka, Myanmar, and Tanzania, and they have been gradually exhausted by long-term mining. Therefore, improving the quality of low-grade rubies through various treatments is emerging as an alternative way to obtain better rubies. The gemological and mineralogical properties of natural ruby from Tanzania were studied under heat treatment. These characteristics were compared between heating alone and heating with added flux materials. Raw Tanzanian rubies were heat-treated ($1,600^{\circ}C$ for 6 hours). However, chromameter and UV-Vis analyses found that simple heat treatment is inappropriate for the Tanzanian ruby. Although the $Cr^{3+}$ content responsible for the red color of the ruby increased with heat treatment, the ruby displays a dark medium red because of Fe present in the ruby as $Fe_2O_3$. The low transparency after heat treatment is attributed to the recrystallization of $SiO_2$, which has a low melting point. Chromameter measurements confirmed that adding a Pb-containing flux during heating greatly improves the clarity and color of Tanzanian rubies with micro-fractures and cavities on the surface. EMPA results show that Pb as an additive fills the cavities and cracks in raw Tanzanian rubies during heat treatment. As a result, the quality of the raw Tanzanian ruby improved dramatically. These results indicate that heat treatment with an additive (Pb in this study) is an effective way to obtain better-quality Tanzanian ruby. Consequently, this study suggests a suitable method for improving the properties of Tanzanian ruby. The results of this study should provide useful information for upgrading the quality of similar gemstones such as corundum and sapphire.

A New Approach to Automatic Keyword Generation Using Inverse Vector Space Model (키워드 자동 생성에 대한 새로운 접근법: 역 벡터공간모델을 이용한 키워드 할당 방법)

  • Cho, Won-Chin;Rho, Sang-Kyu;Yun, Ji-Young Agnes;Park, Jin-Soo
    • Asia pacific journal of information systems / v.21 no.1 / pp.103-122 / 2011
  • Recently, numerous documents have been made available electronically. Internet search engines and digital libraries commonly return query results containing hundreds or even thousands of documents. In this situation, it is virtually impossible for users to examine complete documents to determine whether they might be useful. For this reason, some on-line documents are accompanied by a list of keywords specified by the authors in an effort to guide users by facilitating the filtering process. In this way, a set of keywords is often considered a condensed version of the whole document and therefore plays an important role in document retrieval, Web page retrieval, document clustering, summarization, text mining, and so on. Since many academic journals ask authors to provide a list of five or six keywords on the first page of an article, keywords are most familiar in the context of journal articles. However, many other types of documents could also benefit from the use of keywords, including Web pages, email messages, news reports, magazine articles, and business papers. Although the potential benefit is large, the implementation itself is the obstacle: manually assigning keywords to all documents is a daunting or even impractical task, because it is extremely tedious and time-consuming and requires a certain level of domain knowledge. Therefore, it is highly desirable to automate the keyword generation process. There are two main approaches to achieving this aim: the keyword assignment approach and the keyword extraction approach. Both approaches use machine learning methods and require, for training purposes, a set of documents with keywords already attached. In the former approach, there is a given vocabulary, and the aim is to match its terms to the texts. In other words, the keyword assignment approach seeks to select the words from a controlled vocabulary that best describe a document. 
Although this approach is domain dependent and is not easy to transfer and expand, it can generate implicit keywords that do not appear in a document. In the latter approach, on the other hand, the aim is to extract keywords according to their relevance in the text, without a prior vocabulary. In this approach, automatic keyword generation is treated as a classification task, and keywords are commonly extracted using supervised learning techniques. Thus, keyword extraction algorithms classify candidate keywords in a document into positive or negative examples. Several systems, such as Extractor and Kea, were developed using the keyword extraction approach. The most indicative words in a document are selected as keywords for that document; as a result, keyword extraction is limited to terms that appear in the document. Therefore, keyword extraction cannot generate implicit keywords that are not included in a document. According to the experimental results of Turney, about 64% to 90% of keywords assigned by the authors can be found in the full text of an article. Conversely, this means that 10% to 36% of the keywords assigned by the authors do not appear in the article and cannot be generated by keyword extraction algorithms. Our preliminary experiment also shows that 37% of keywords assigned by the authors are not included in the full text. This is why we decided to adopt the keyword assignment approach. In this paper, we propose a new approach to automatic keyword assignment, namely IVSM (Inverse Vector Space Model). The model is based on the vector space model, a conventional information retrieval model that represents documents and queries as vectors in a multidimensional space. IVSM generates an appropriate keyword set for a specific document by measuring the distance between the document and the keyword sets. 
The keyword assignment process of IVSM is as follows: (1) calculating the vector length of each keyword set based on each keyword weight; (2) preprocessing and parsing a target document that does not have keywords; (3) calculating the vector length of the target document based on the term frequency; (4) measuring the cosine similarity between each keyword set and the target document; and (5) generating keywords that have high similarity scores. Two keyword generation systems were implemented using IVSM: an IVSM system for a Web-based community service and a stand-alone IVSM system. The first IVSM system is implemented in a community service for sharing knowledge and opinions on current trends such as fashion, movies, social problems, and health information. The stand-alone IVSM system is dedicated to generating keywords for academic papers and has been tested on a number of academic papers, including those published by the Korean Association of Shipping and Logistics, the Korea Research Academy of Distribution Information, the Korea Logistics Society, the Korea Logistics Research Association, and the Korea Port Economic Association. We measured the performance of IVSM by the number of matches between the IVSM-generated keywords and the author-assigned keywords. According to our experiment, the precision of IVSM applied to the Web-based community service and to academic journals was 0.75 and 0.71, respectively. The performance of both systems is much better than that of baseline systems that generate keywords based on simple probability. IVSM also shows performance comparable to that of Extractor, a representative keyword extraction system developed by Turney. As the number of electronic documents increases, we expect that the IVSM proposed in this paper can be applied to many electronic documents in Web-based communities and digital libraries.
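
The five steps listed in this abstract can be sketched in a few lines. This is a minimal illustration of cosine-similarity keyword assignment under stated assumptions, not the authors' implementation. The function names, the whitespace tokenizer, and the toy keyword sets with their weights are hypothetical.

```python
import math
from collections import Counter

def vector_length(weights):
    """Euclidean length of a sparse term-weight vector (steps 1 and 3)."""
    return math.sqrt(sum(w * w for w in weights.values()))

def cosine_similarity(a, b):
    """Cosine similarity between two sparse term-weight vectors (step 4)."""
    len_a, len_b = vector_length(a), vector_length(b)
    if len_a == 0 or len_b == 0:
        return 0.0
    dot = sum(w * b.get(term, 0.0) for term, w in a.items())
    return dot / (len_a * len_b)

def assign_keywords(document_text, keyword_sets, top_n=2):
    """Rank candidate keywords by similarity to the target document."""
    # Step 2: a deliberately naive parse into lower-cased whitespace tokens.
    doc_vector = Counter(document_text.lower().split())
    # Step 4: score each keyword's term vector against the document.
    scores = {
        keyword: cosine_similarity(terms, doc_vector)
        for keyword, terms in keyword_sets.items()
    }
    # Step 5: keep the keywords with the highest similarity scores.
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return ranked[:top_n]

# Hypothetical keyword sets: each keyword maps to weighted terms that were
# seen in documents already carrying that keyword.
keyword_sets = {
    "logistics": {"port": 2.0, "shipping": 3.0, "cargo": 1.0},
    "fashion": {"dress": 2.0, "trend": 1.0, "style": 2.0},
    "health": {"diet": 2.0, "exercise": 2.0},
}
document = "shipping cargo through the port delays shipping schedules"
result = assign_keywords(document, keyword_sets, top_n=1)
```

In this toy run the top-ranked keyword is "logistics", a word that never occurs in the document itself. That is the kind of implicit keyword the assignment approach is meant to produce.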