• Title/Summary/Keyword: 연관규칙분석 (association rule analysis)


A study on 3-step complex data mining in society indicator survey (사회지표조사에서의 3단계 복합 데이터마이닝의 적용 방안)

  • Cho, Kwang-Hyun;Park, Hee-Chang
    • Journal of the Korean Data and Information Science Society / v.23 no.5 / pp.983-992 / 2012
  • A social indicator survey captures the state of society as a whole and is an important measure of social change. When policy is made, such a survey can reflect public opinion in the region. Social indicator surveys have been conducted in many municipalities (Seoul, Incheon, Busan, Ulsan, Gyeongsangnamdo, etc.), but their results are mostly analyzed with basic statistics only. In this study, we propose a new data mining methodology for more effective analysis: a 3-step complex data mining approach for social indicator surveys, which uses three data mining methods (intervening association rules, clustering, and decision trees).

A Study on a Working Pattern Analysis Prototype using Correlation Analysis and Linear Regression Analysis in Welding BigData Environment (용접 빅데이터 환경에서 상관분석 및 회귀분석을 이용한 작업 패턴 분석 모형에 관한 연구)

  • Jung, Se-Hoon;Sim, Chun-Bo
    • The Journal of the Korea institute of electronic communication sciences / v.9 no.10 / pp.1071-1078 / 2014
  • Recently, information services based on Big Data have been expanding, and Big Data processing technology is being actively researched as an important issue in the IT industry. In this paper, we analyze the working patterns of skilled welders through Big Data analysis and extraction of welding data based on R programming. By providing the analysis results to unskilled welders, we aim to reduce the cost of welding work, including costs related to weld quality and weld operation time. A known problem in welding is that it takes a long time to become a skilled welder. To address this, we apply association rule algorithms and regression to many pattern variables in order to analyze the welding patterns of skilled welders. We analyze these patterns according to the variables of the derived rules by examining the top N rules. Through the experimental results, we confirmed the pattern structure of power consumption rate and wire consumption length.

Development of Intelligent Job Classification System based on Job Posting on Job Sites (구인구직사이트의 구인정보 기반 지능형 직무분류체계의 구축)

  • Lee, Jung Seung
    • Journal of Intelligence and Information Systems / v.25 no.4 / pp.123-139 / 2019
  • The job classification systems of major job sites differ from site to site and also differ from the job classification system of the 'SQF (Sectoral Qualifications Framework)' proposed in the SW field. Therefore, a new job classification system is needed that SW companies, SW job seekers, and job sites can all understand. The purpose of this study is to establish a standard job classification system that reflects market demand by analyzing SQF based on the job offer information of major job sites and the NCS (National Competency Standards). For this purpose, association analysis between the occupations of major job sites is conducted, and association rules between SQF and those occupations are derived. Using these association rules, we propose an intelligent job classification system based on data that maps the job classification systems of major job sites to the SQF job classification system. First, major job sites are selected to obtain information on the job classification system of the SW market. Then we identify ways to collect job information from each site and collect data through open APIs. Focusing on the relationships in the data, we keep only the job postings that were posted on multiple job sites at the same time and delete the other job information. Next, we map the job classification systems between job sites using the association rules derived from the association analysis. We complete the mapping between these market segments, discuss it with experts, further map it to the SQF, and finally propose a new job classification system. As a result, more than 30,000 job listings were collected in XML format using open APIs from 'WORKNET,' 'JOBKOREA,' and 'saramin', which are the main job sites in Korea. After filtering down to about 900 job postings simultaneously posted on multiple job sites, 800 association rules were derived by applying the Apriori algorithm, a frequent pattern mining algorithm.
Based on the 800 association rules, the job classification systems of WORKNET, JOBKOREA, and saramin were mapped to the SQF job classification system and classified into levels 1 through 4. In the new job taxonomy, the first primary classification, covering IT consulting, computer system, network, and security related jobs, consisted of three secondary classifications, five tertiary classifications, and five fourth-level classifications. The second primary classification, covering database and system operation related jobs, consisted of three secondary, three tertiary, and four fourth-level classifications. The third primary classification, covering Web Planning, Web Programming, Web Design, and Game, consisted of four secondary, nine tertiary, and two fourth-level classifications. The last primary classification, covering jobs related to ICT management and computer and communication engineering technology, consisted of three secondary and six tertiary classifications. In particular, the new job classification system has a relatively flexible depth of classification, unlike existing classification systems. WORKNET divides jobs down to the third level; JOBKOREA divides jobs down to the second level and then subdivides them by keyword; saramin likewise divides jobs down to the second level and then subdivides them by keyword. The newly proposed standard job classification system accepts some keyword-based jobs and treats some product names as jobs. In this system, some jobs stop at the second level, while others are subdivided down to the fourth level. This reflects the idea that not all jobs can be broken down into the same number of levels. We also combined the rules derived from association analysis of the collected market data with experts' opinions.
Therefore, the newly proposed job classification system can be regarded as a data-based intelligent job classification system that reflects market demand, unlike existing job classification systems. This study is meaningful in that it suggests a new job classification system that reflects market demand by mapping occupations based on data, through association analysis between occupations, rather than on the intuition of a few experts. However, this study has a limitation in that it cannot fully reflect market demand that changes over time, because the data were collected at a single point in time. As market demand changes over time, including seasonal factors and the timing of major corporate public recruitment, continuous data monitoring and repeated experiments are needed to achieve more accurate matching. The results of this study can be used to suggest directions for improving SQF in the SW industry, and the approach is expected to transfer to other industries based on the experience of success in the SW industry.

A Case Study on Data Mining Techniques Using Association Analysis (연관분석을 이용한 데이터마이닝 기법에 관한 사례연구)

  • Ryu, Gwi-Yeol;Mun, Yeong-Su;Choi, Seung-Du
    • Proceedings of the Korean Data and Information Science Society Conference (한국데이터정보과학회:학술대회논문집) / 2006.04a / pp.109-120 / 2006
  • The current computing environment produces huge amounts of information, more than people can absorb. People want information that they can understand and accept easily, and they may want not only simple information but also knowledge. That is why data mining has become central to information processing. We use RFM analysis to create customer scores. Customers are classified into five groups (most excellent, excellent, common, lower, lowest) for various marketing activities. We can find significant patterns in each group and classify customers, from loyal customers to those likely to leave in the near future, by indirect data mining (e.g. association analysis) and direct data mining (e.g. decision tree, logistic regression analysis, etc.), as they are named in this study. Our research focuses on advanced models that apply association rules in data mining. Our results indicate that indirect and direct data mining seem to produce the same outputs, but the former shows a clearer pattern than the latter.
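The RFM scoring step mentioned in this abstract can be illustrated with a minimal Python sketch. The abstract does not describe how the authors computed their scores, so everything here is an assumption made for illustration: the rank-based 1 to 5 scoring, the equal weighting of recency, frequency, and monetary value, the helper names `quintile_scores` and `rfm_groups`, and the toy customer data.

```python
# Hypothetical RFM scoring sketch; not the method used in the paper.
LABELS = ["lowest", "lower", "common", "excellent", "most excellent"]


def quintile_scores(values, higher_is_better=True):
    """Give each value a 1..5 score by rank (5 = best). Ties are broken by input order."""
    n = len(values)
    # Worst value first, so that rank 0 maps to score 1.
    order = sorted(range(n), key=lambda i: values[i], reverse=not higher_is_better)
    scores = [0] * n
    for rank, i in enumerate(order):
        scores[i] = 1 + (rank * 5) // n
    return scores


def rfm_groups(customers):
    """Map customer id -> group label from (recency_days, frequency, monetary) tuples."""
    ids = list(customers)
    r = quintile_scores([customers[c][0] for c in ids], higher_is_better=False)
    f = quintile_scores([customers[c][1] for c in ids])
    m = quintile_scores([customers[c][2] for c in ids])
    totals = [a + b + c for a, b, c in zip(r, f, m)]  # assumed equal weights
    groups = quintile_scores(totals)
    return {cid: LABELS[g - 1] for cid, g in zip(ids, groups)}


# Toy data: (days since last purchase, number of purchases, total spend)
customers = {
    "A": (5, 20, 900.0),
    "B": (40, 8, 300.0),
    "C": (90, 3, 120.0),
    "D": (200, 1, 30.0),
    "E": (15, 12, 500.0),
}
groups = rfm_groups(customers)
print(groups)
```

In this toy data, customer A is the most recent, most frequent, and highest-spending buyer and lands in the top group, while D lands in the bottom group. A real analysis would need a rule for ties and possibly unequal weights.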


Design and implementation of data mining tool using PHP and WEKA (피에이치피와 웨카를 이용한 데이터마이닝 도구의 설계 및 구현)

  • You, Young-Jae;Park, Hee-Chang
    • Journal of the Korean Data and Information Science Society / v.20 no.2 / pp.425-433 / 2009
  • Data mining is a method for finding useful information in large amounts of data in a database. It is used to find hidden knowledge, unexpected patterns, relations, and new rules in massive data. We need a data mining tool to explore large amounts of information. There are many data mining tools or solutions, such as E-Miner, Clementine, WEKA, and R. Most of them focus on diversity and general-purpose use, and they are not easy for laymen to use. In this paper we design and implement a web-based data mining tool using PHP and WEKA. The system makes results easy to interpret, so general users are able to handle it. We implement the Apriori algorithm for association rules, the K-means algorithm for cluster analysis, and the J48 algorithm for decision trees.


Clustering and Pattern Analysis for Building Semantic Ontologies in RESTful Web Services (RESTful 웹 서비스에서 시맨틱 온톨로지를 구축하기 위한 클러스터링 및 패턴 분석 기법)

  • Lee, Yong-Ju
    • Journal of Internet Computing and Services / v.12 no.4 / pp.119-133 / 2011
  • With the advent of Web 2.0, the use of RESTful web services is expected to overtake that of traditional SOAP-based web services. The growing number of RESTful web services available on the web raises the challenging issue of how to locate the desired web services. However, the existing keyword search method is inadequate because of its poor recall and precision. In this paper, we propose a novel method for building semantic ontologies that employs both a clustering technique based on association rules and a semantic analysis technique based on patterns. With this method, we can generate ontologies automatically, reduce the burden of semantic annotation, and support more efficient web service search. We ran our experiments on a subset of 168 RESTful web services downloaded from the ProgrammableWeb site. The experimental results show that our method achieves up to 35% improvement in recall and up to 18% in precision compared to the existing keyword search method.

A Study on the Lunch Box Promotion of Convenience Store by Commercial Areas (상권별 편의점 도시락 판매 전략에 관한 연구)

  • Choi, Sung-WooK;Shin, Yong Jae
    • Journal of Digital Convergence / v.17 no.6 / pp.77-91 / 2019
  • To establish a sales strategy for convenience store lunch boxes, this study applied association rule analysis to POS data obtained from convenience stores located in four commercial districts. For this purpose, the data were divided into the time zones from 6:00 am to 8:00 pm and from 17:00 to 19:00, and by the commercial area of each convenience store. The analysis found that the products sold together with a lunch box were mainly those that can be eaten with lunch, such as milk, beverages, and noodles. For the other products, however, the types and numbers of products sold together with lunch boxes differed between the morning and the afternoon hours. These results and this approach are expected to help convenience stores find and respond to needs for goods and services that change along with sociocultural conditions.

Text Mining and Association Rules Analysis to a Self-Introduction Letter of Freshman at Korea National College of Agricultural and Fisheries (2) (한국농수산대학 신입생 자기소개서의 텍스트 마이닝과 연관규칙 분석 (2))

  • Joo, J.S.;Lee, S.Y.;Kim, J.S.;Shin, Y.K.;Park, N.B.
    • Journal of Practical Agriculture & Fisheries Research / v.22 no.2 / pp.99-114 / 2020
  • In this study we carried out topic analysis and correlation analysis by text mining on the self-introduction letters of freshmen at Korea National College of Agriculture and Fisheries (KNCAF) in 2020. The analysis items of the 3rd question were and the 4th question were the motivation for applying to college, the academic plan and the career plan. Text mining of the 3rd question showed that the frequency of 'friends' was overwhelmingly high, followed by keywords such as 'thought', 'time', 'opinion', 'activity', and 'club'. In the 4th question, the frequency of keywords such as 'thought', 'agriculture', 'KNCAF', 'farm', and 'father' was high. The association rule analysis for each question showed that the relationships with the highest support, which indicates the frequency and importance of a rule, were {friend} <=> {thought} and {thought} <=> {KNCAF}. The confidence between keywords was highest for the rules {teacher}=>{friend} and {agriculture, KNCAF}=>{thought}. The lift, which indicates the closeness of two words, was highest for the rules {friend} <=> {teacher} and {knowledge} <=> {professional}. These keywords were found to play very important roles in the analysis of betweenness centrality and degree centrality between keywords. The results of the frequency analysis and association analysis were visualized with word clouds and correlation graphs to make all the results easier to understand.
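For readers unfamiliar with the three rule measures named in this abstract, the following minimal Python sketch computes support, confidence, and lift using their standard definitions. The keyword sets in `docs` are toy data invented for illustration and are not taken from the paper; the function names are likewise only illustrative.

```python
from typing import Iterable, List, Set


def support(transactions: List[Set[str]], itemset: Iterable[str]) -> float:
    """Fraction of transactions that contain every item in `itemset`."""
    items = frozenset(itemset)
    hits = sum(1 for t in transactions if items <= t)
    return hits / len(transactions)


def confidence(transactions: List[Set[str]], antecedent: Iterable[str],
               consequent: Iterable[str]) -> float:
    """Estimate of P(consequent | antecedent) for the rule antecedent => consequent."""
    base = support(transactions, antecedent)
    if base == 0:
        raise ValueError("antecedent never occurs in the transactions")
    both = frozenset(antecedent) | frozenset(consequent)
    return support(transactions, both) / base


def lift(transactions: List[Set[str]], antecedent: Iterable[str],
         consequent: Iterable[str]) -> float:
    """Confidence divided by the consequent's support; values above 1 suggest positive association."""
    cons = support(transactions, consequent)
    if cons == 0:
        raise ValueError("consequent never occurs in the transactions")
    return confidence(transactions, antecedent, consequent) / cons


# Toy keyword sets, one per document (hypothetical data).
docs = [
    {"friend", "thought", "teacher"},
    {"friend", "thought"},
    {"friend", "teacher"},
    {"thought", "agriculture"},
]

print(support(docs, {"friend", "thought"}))      # share of documents with both keywords
print(confidence(docs, {"teacher"}, {"friend"}))  # how often 'friend' appears given 'teacher'
print(lift(docs, {"teacher"}, {"friend"}))        # association strength relative to chance
```

Note that lift is symmetric in its two sides, which is consistent with the abstract writing the lift rules with <=>, while confidence depends on the direction of the rule.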

Comparing Accuracy of Imputation Methods for Categorical Incomplete Data (범주형 자료의 결측치 추정방법 성능 비교)

  • 신형원;손소영
    • The Korean Journal of Applied Statistics / v.15 no.1 / pp.33-43 / 2002
  • Various kinds of estimation methods have been developed for imputation of categorical missing data. They include the modal category method, logistic regression, and association rules. In this study, we propose two fusion algorithms, based on a neural network and on a voting scheme, that combine the results of individual imputation methods. A Monte Carlo simulation is used to compare the performance of these methods. Five factors are used to simulate the missing data pattern: (1) input-output function, (2) data size, (3) noise of the input-output function, (4) proportion of missing data, and (5) pattern of missing data. The experimental results indicate the following: when the data size is small and the missing data proportion is large, the modal category method, association rules, and neural network based fusion perform better than the other methods. However, when the data size is small and the correlation between input and missing output is strong, logistic regression and the neural network based fusion algorithm appear better than the others. When the data size is large with a low missing data proportion, large noise, and strong correlation between input and missing output, the neural network based fusion algorithm turns out to be the best choice.
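As a rough illustration of two ideas named in this abstract, the Python sketch below shows modal category imputation and a voting-based combination of several imputation results. The abstract does not specify the voting rule, so a plain majority vote with a first-listed tie-break is assumed here; the function names and the example predictions are hypothetical.

```python
from collections import Counter
from typing import List, Sequence


def modal_category(observed: Sequence[str]) -> str:
    """Impute with the most frequent observed category (first seen wins ties)."""
    return Counter(observed).most_common(1)[0][0]


def majority_vote(predictions: List[str]) -> str:
    """Combine imputed values from several methods by majority vote.

    Assumed tie-break: the earliest-listed prediction among the most frequent ones.
    """
    counts = Counter(predictions)
    best = max(counts.values())
    for p in predictions:
        if counts[p] == best:
            return p
    raise ValueError("no predictions given")


# Observed (non-missing) values of a categorical variable, toy data.
observed = ["a", "b", "a", "c"]
from_mode = modal_category(observed)

# Suppose two other methods (e.g. logistic regression, association rules)
# both suggested "b" for the same missing cell; these values are made up.
fused = majority_vote([from_mode, "b", "b"])
print(from_mode, fused)
```

A fusion step like this only changes the outcome when the individual methods disagree, which is why the choice of tie-break matters when an even number of methods is combined.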

Developing an Intelligent System for the Analysis of Signs Of Disaster (인적재난사고사례기반의 새로운 재난전조정보 등급판정 연구)

  • Lee, Young Jai
    • Journal of Korean Society of societal Security / v.4 no.2 / pp.29-40 / 2011
  • The objective of this paper is to develop an intelligent decision support system that can advise on disaster countermeasures and the degree of incidents on the basis of collected and analyzed signs of disasters. Concepts derived from ontology, text mining, and case-based reasoning are adapted to design the system. The functions of the system include a term-document matrix, frequency normalization, confidence, association rules, and criteria for judgment. Qualitative data collected from signs of new incidents are processed by these functions and finally compared with similar past disaster cases through reasoning. The system provides the disaster manager with the degree of danger of the new signs and a few countermeasures. It will help the decision-maker judge how dangerous the signs of disaster are and carry out specific countermeasures in advance. As a result, the disaster can be prevented.
