• Title/Summary/Keyword: Pattern mining

A Case Study of a Text Mining Method for Discovering Evolutionary Patterns of Mobile Phone in Korea (국내 휴대폰의 진화패턴 규명을 위한 텍스트 마이닝 방안 제안 및 사례 연구)

  • On, Byung-Won
    • Journal of the Korea Society of Computer and Information / v.20 no.2 / pp.29-45 / 2015
  • Systematic theories, concepts, and methodologies for biological evolution have been developed, and the patterns and principles of evolution have been actively studied over the past 200 years. They have also been applied to fields such as evolutionary economics, evolutionary psychology, and evolutionary linguistics, where they have led to significant research progress. Existing studies have applied the main biological evolutionary models to artifacts, although such models do not fit them well. These models also generalize poorly to the evolutionary patterns of artifacts because they are designed from the subjective viewpoint of experts who know the artifacts well. Because artifacts, unlike biological organisms, tend to reflect human imagination and will, the theory of biological evolution is known not to apply directly to artifacts. In this paper, we aim to go beyond individual subjectivity and present the evolutionary patterns of a given artifact by drawing on the opinions of the public. To this end, we propose a text mining approach that provides a systematic framework for discovering the evolutionary patterns of a given artifact and visualizing them effectively. In particular, we focus on a case study of the mobile phone, which has emerged as an icon of innovation in recent years. We collect and analyze review posts on mobile phones available in the domestic market over the past decade and discuss the detailed results on the evolutionary patterns of the mobile phone. Conventionally, this kind of task is tedious and takes a long time, because a small number of experts must carry out an extensive literature survey and summarize a huge amount of material to finally draw a diagram of the evolutionary patterns of the mobile phone. 
In this work, by contrast, we present a semi-automatic mining algorithm that minimizes human effort, and through this research we can understand how human creativity and imagination are realized. The approach should also help predict future mobile phone trends in business and industry.

Trend Analysis of Barrier-free Academic Research using Text Mining and CONCOR (텍스트 마이닝과 CONCOR을 활용한 배리어 프리 학술연구 동향 분석)

  • Jeong-Ki Lee;Ki-Hyok Youn
    • Journal of Internet of Things and Convergence / v.9 no.2 / pp.19-31 / 2023
  • Barrier-free environments are gaining importance worldwide. This study attempted to identify barrier-free research trends using text mining, with the intent of supporting research and policies that create a barrier-free environment. The data analyzed are 227 papers published in domestic academic journals from 1996, when barrier-free research began, to 2022. The researchers converted the title, keywords, and abstract of each paper into text and then analyzed the patterns of the papers and the meaning of the data. The research results are summarized as follows. First, barrier-free research began to increase after 2009, with an annual average of 17.1 papers published. This is related to the implementation guidelines for the barrier-free certification system that took effect on July 15, 2008. Second, the text mining produced the following results. i) Word frequency analysis of the top keywords identified important keywords such as barrier free, disabled, design, universal design, access, elderly, certification, improvement, evaluation, space, facility, and environment. ii) TF-IDF analysis identified the main keywords as universal design, design, certification, house, access, elderly, installation, disabled, park, evaluation, architecture, and space. iii) N-gram analysis showed related word pairs such as barrier free+certification, barrier free+design, barrier free+barrier free, elderly+disabled, disabled+elderly, disabled+convenience facilities, the disabled+the elderly, society+the elderly, convenience facilities+installation, certification+evaluation index, physical+environment, and life+quality. Third, the CONCOR analysis produced four clusters: cluster 1, barrier-free issues and challenges; cluster 2, universal design and space utilization; cluster 3, improving accessibility for the disabled; and cluster 4, barrier-free certification and evaluation. 
Based on the analysis results, this study presented policy implications for vitalizing barrier-free research and establishing a desirable barrier-free environment.
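
The TF-IDF and N-gram steps mentioned in the abstract above can be illustrated with a minimal sketch. It assumes the common weighting tf × log(N/df) and adjacent word pairs (bigrams); the abstract does not state which variant or tool the authors used, and the three token lists are hypothetical toy documents, not the study's data.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: tf-idf score} dict per tokenized document.

    Assumed weighting: (term count / document length) * log(N / document frequency).
    """
    n_docs = len(docs)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for doc in docs for term in set(doc))
    scores = []
    for doc in docs:
        tf = Counter(doc)
        scores.append({
            term: (count / len(doc)) * math.log(n_docs / df[term])
            for term, count in tf.items()
        })
    return scores


def bigram_counts(docs):
    """Count adjacent word pairs (2-grams) across all documents."""
    counts = Counter()
    for doc in docs:
        counts.update(zip(doc, doc[1:]))
    return counts


# Hypothetical toy corpus; each list stands in for one tokenized abstract.
docs = [
    ["barrier", "free", "certification", "evaluation"],
    ["barrier", "free", "design", "elderly"],
    ["universal", "design", "elderly", "disabled"],
]

scores = tf_idf(docs)
top_terms = sorted(scores[0], key=scores[0].get, reverse=True)
pairs = bigram_counts(docs).most_common(3)
print(top_terms)
print(pairs)
```

Terms that occur in fewer documents receive a higher weight under this scheme, which is why TF-IDF rankings can differ from raw frequency rankings.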

Pattern Analysis for Civil Complaints of Local Governments Using a Text Mining (텍스트마이닝에 의한 지자체 민원청구 패턴 분석)

  • Won, Tae Hong;Yoo, Hwan Hee
    • Journal of the Korean Society of Surveying, Geodesy, Photogrammetry and Cartography / v.34 no.3 / pp.319-327 / 2016
  • Korea faces a wide range of problems in areas such as safety, environment, and traffic due to rapid economic development and urbanization. Despite local governments' efforts to handle electronic civil complaints and solve urban problems, civil complaints have increased year by year. In this study, we collected civil complaint data for the last six years from Jinju-si, a small to medium-sized city. To analyze the spatial distribution pattern, we classified the reasons for the civil complaints, geocoded the locations where they occurred, and then extracted those locations to analyze the correlation between electronic civil complaints and land use. The results showed that electronic civil complaints in Jinju-si were clustered in residential, central commercial, and residential-industrial mixed-use areas, that is, areas within the city center where land development had been completed. When the complaints were analyzed by land use, complaints about illegal parking were the most frequent. An analysis of facility distribution within a 50 m radius of the complaint locations showed that civil complaints occurred frequently in detached housing areas located within the commercial and residential-industrial mixed-use areas. In residential areas (the old downtown), civil complaints were concentrated in areas with many ordinary restaurants. This research explored civil complaints in terms of urban space and is expected to be useful in finding solutions to civil complaints.

The genesis of Ulsan carbonate rocks: a possibility of carbonatite? (울산 광산에 분포하는 탄산염암체의 성인에 관한 연구: 카보내타이트의 가능성)

  • 양경희;황진연;옥수석
    • The Journal of the Petrological Society of Korea / v.10 no.1 / pp.1-12 / 2001
  • A small body of carbonate rocks and spatially associated ultramafic rocks occurs uniquely in the Ulsan iron-serpentinite mine in the southeastern Kyungsang basin. Field geology, core drilling data, and stable isotope analysis suggest that the carbonate rocks are carbonatite formed from a melt, reflecting their intrusive nature. Based on this study, the geology of the Ulsan iron-serpentinite mining area consists of Cretaceous sedimentary, volcanic, granitic, ultramafic, and carbonate rocks in ascending order. The carbonate and ultramafic rocks show concentric and ellipsoidal shapes at the outcrop and a funnel shape in cross-sectional view. Carbon and oxygen stable isotope analysis shows a bimodal pattern rather than a typical mantle pattern, which may indicate that the melt was a secondary melt generated within the crust, not directly in the mantle. The rising ultramafic melts would have melted lime-bearing rocks, forming a secondary carbonate melt in the upper crust. Then, the intrusion of the ultramafic melt was followed by the intrusion of the carbonate melt along deep-seated fractures. Well-developed major fractures in this area, the fluid inclusion characteristics of the carbonate rocks, the spatial relation between the ultramafic and carbonate rocks, and the stable isotope data support interpreting the Ulsan carbonate rocks as carbonatite.

Development of Intelligent Job Classification System based on Job Posting on Job Sites (구인구직사이트의 구인정보 기반 지능형 직무분류체계의 구축)

  • Lee, Jung Seung
    • Journal of Intelligence and Information Systems / v.25 no.4 / pp.123-139 / 2019
  • The job classification systems of major job sites differ from site to site and also differ from the job classification system of the SQF (Sectoral Qualifications Framework) proposed in the SW field. A new job classification system that SW companies, SW job seekers, and job sites can all understand is therefore needed. The purpose of this study is to establish a standard job classification system that reflects market demand by analyzing the SQF based on the job posting information of major job sites and the NCS (National Competency Standards). To this end, we conduct association analysis between the occupations of major job sites, and between the SQF and those occupations, to derive association rules between occupations. Using these association rules, we propose an intelligent job classification system based on data that maps the job classification systems of major job sites to the SQF job classification system. First, major job sites are selected to obtain information on the job classification systems of the SW market. Then we identify ways to collect job information from each site and collect the data through open APIs. Focusing on the relationships in the data, we keep only the job postings that appear on multiple job sites at the same time and delete the rest. Next, we map the job classification systems between job sites using the association rules derived from the association analysis. We complete the mapping between these market classifications, discuss it with experts, further map it to the SQF, and finally propose a new job classification system. As a result, more than 30,000 job listings were collected in XML format using the open APIs of 'WORKNET,' 'JOBKOREA,' and 'saramin', which are the main job sites in Korea. After filtering down to about 900 job postings posted simultaneously on multiple job sites, 800 association rules were derived by applying the Apriori algorithm, a frequent pattern mining method. 
Based on the 800 association rules, the job classification systems of WORKNET, JOBKOREA, and saramin and the SQF job classification system were mapped and classified into first through fourth levels. In the new job taxonomy, the first primary classification, job systems related to IT consulting, computer systems, networks, and security, consisted of three secondary classifications, five tertiary classifications, and five fourth-level classifications. The second primary classification, job systems related to databases and system operation, consisted of three secondary, three tertiary, and four fourth-level classifications. The third primary classification, Web Planning, Web Programming, Web Design, and Game, was composed of four secondary, nine tertiary, and two fourth-level classifications. The last primary classification, job systems related to ICT management and computer and communication engineering technology, consisted of three secondary classifications and six tertiary classifications. In particular, the new job classification system has a relatively flexible classification depth, unlike existing classification systems. WORKNET divides jobs down to a third level, while JOBKOREA and saramin each divide jobs down to a second level and then subdivide them by keyword. The newly proposed standard job classification system accepts some keyword-based jobs and treats some product names as jobs. In the classification system, some jobs stop at the second level, while others are subdivided down to the fourth level. This reflects the idea that not all jobs can be broken down to the same depth. We also combined the rules derived by association analysis of the collected market data with experts' opinions. 
Therefore, the newly proposed job classification system can be regarded as a data-based intelligent job classification system that reflects market demand, unlike existing job classification systems. This study is meaningful in that it proposes a new job classification system reflecting market demand by mapping occupations on the basis of data, through association analysis between occupations, rather than on the intuition of a few experts. However, the study is limited in that it cannot fully reflect market demand that changes over time, because the data were collected at a single point in time. As market demand changes over time, including seasonal factors and the timing of major corporate public recruitment, continuous data monitoring and repeated experiments are needed to achieve more accurate matching. The results of this study can be used to suggest directions for improving the SQF in the SW industry, and the approach is expected to transfer to other industries once it has proven successful in the SW industry.
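
As a rough illustration of how Apriori-style association rules could link category labels across job sites, the sketch below runs a small level-wise frequent itemset search and then derives rules by confidence. It is a simplified teaching version, not the authors' pipeline: the transactions, the `site:category` labels, and both thresholds are hypothetical, and each transaction is assumed to hold the labels one posting received on the sites where it appeared.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Level-wise search returning {frozenset(itemset): support}."""
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)

    def support(itemset):
        return sum(itemset <= t for t in transactions) / n

    current = {frozenset([item]) for t in transactions for item in t}
    frequent = {}
    k = 1
    while current:
        level = {c: support(c) for c in current}
        level = {c: s for c, s in level.items() if s >= min_support}
        frequent.update(level)
        k += 1
        # Join step: unions of frequent (k-1)-itemsets that have exactly k items.
        candidates = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: keep a candidate only if all its (k-1)-subsets are frequent.
        current = {
            c for c in candidates
            if all(frozenset(s) in level for s in combinations(c, k - 1))
        }
    return frequent


def association_rules(frequent, min_confidence):
    """Return (antecedent, consequent, support, confidence) tuples."""
    rules = []
    for itemset, sup in frequent.items():
        for r in range(1, len(itemset)):
            for antecedent in combinations(sorted(itemset), r):
                a = frozenset(antecedent)
                confidence = sup / frequent[a]
                if confidence >= min_confidence:
                    rules.append((a, itemset - a, sup, confidence))
    return rules


# Hypothetical labels for postings that appeared on more than one site.
transactions = [
    {"worknet:web_dev", "jobkorea:web_programming", "saramin:web_developer"},
    {"worknet:web_dev", "jobkorea:web_programming"},
    {"worknet:web_dev", "saramin:web_developer"},
    {"worknet:dba", "jobkorea:database"},
    {"worknet:dba", "jobkorea:database", "saramin:web_developer"},
]

frequent = apriori(transactions, min_support=0.4)
rules = association_rules(frequent, min_confidence=0.8)
for antecedent, consequent, sup, conf in rules:
    print(sorted(antecedent), "->", sorted(consequent), sup, conf)
```

A high-confidence rule between two labels from different sites is the kind of evidence that could justify mapping them to the same class; in the study that mapping was also reviewed with experts.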

Analysis on Relation between Rehabilitation Training Movement and Muscle Activation using Weighted Association Rule Discovery (가중연관규칙 탐사를 이용한 재활훈련운동과 근육 활성의 연관성 분석)

  • Lee, Ah-Reum;Piao, Youn-Jun;Kwon, Tae-Kyu;Kim, Jung-Ja
    • Journal of the Institute of Electronics Engineers of Korea CI / v.46 no.6 / pp.7-17 / 2009
  • Precise analysis of exercise data is very important for designing an effective rehabilitation system, because it serves as feedback for planning the next exercise step. Many subjective and reliable research outcomes, obtained by analyzing and evaluating human motor ability with various biomechanical experimental methods, have been reported. Most of them rely on quantitative analysis based on basic statistical methods, which are not practical enough for real clinical problems. In this situation, data mining technology is a promising approach for clinical decision support systems because it discovers meaningful hidden rules and patterns in large volumes of data obtained from the problem domain. In this research, to find relational rules between posture training type and muscle activation pattern, we investigated the application of WAR (Weighted Association Rule) discovery to biomechanical data obtained mainly for evaluating postural control ability. The discovered rules can be used as quantitative prior knowledge for the decision-making of rehabilitation and clinical experts and as an index for planning an optimal rehabilitation exercise model for a patient.

Exploration of relationship between confirmation measures and association thresholds (기준 확인 측도와 연관성 평가기준과의 관계 탐색)

  • Park, Hee Chang
    • Journal of the Korean Data and Information Science Society / v.24 no.4 / pp.835-845 / 2013
  • Association rule mining is a data mining technique that quantifies the relevance between sets of items in a large database, and it has been applied in various fields such as manufacturing, shopping malls, healthcare, insurance, and education. Philosophers of science have proposed interestingness measures for various kinds of patterns, analyzed their theoretical properties, evaluated them empirically, and suggested strategies for selecting appropriate measures for particular domains and requirements. Such interestingness measures are divided into objective, subjective, and semantic measures. Objective measures are based on the data used in the discovery process and are typically motivated by statistical considerations. Subjective measures take into account not only the data but also the knowledge and interests of the users who examine the pattern, while semantic measures additionally take into account utility and actionability. In a very different context, researchers have devoted a lot of attention to measures of confirmation or evidential support. This paper focused on asymmetric confirmation measures and compared them with basic association thresholds using simulation data. As a result, we could distinguish the direction of an association rule by confirmation measures and interpret the degree of association operationally with them. Furthermore, the results showed that the measure by Rips and that by Kemeny and Oppenheim were better than the other confirmation measures.
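
To make the comparison between basic association thresholds and a confirmation measure concrete, the sketch below computes support, confidence, and lift for a rule X → Y, together with the Kemeny-Oppenheim measure in its standard form, F(H, E) = [P(E|H) − P(E|¬H)] / [P(E|H) + P(E|¬H)], taking the antecedent X as evidence E and the consequent Y as hypothesis H. That mapping of rule parts to E and H, the function name, and the 2×2 counts are assumptions made for illustration; they are not taken from the paper.

```python
def rule_measures(n_xy, n_x_noty, n_notx_y, n_notx_noty):
    """Measures for the rule X -> Y from a 2x2 contingency table of counts.

    n_xy: transactions with X and Y; n_x_noty: X without Y;
    n_notx_y: Y without X; n_notx_noty: neither.
    """
    n = n_xy + n_x_noty + n_notx_y + n_notx_noty
    support = n_xy / n
    confidence = n_xy / (n_xy + n_x_noty)          # P(Y | X)
    lift = confidence / ((n_xy + n_notx_y) / n)    # P(Y | X) / P(Y)

    # Kemeny-Oppenheim with E = X (antecedent) and H = Y (consequent).
    p_x_given_y = n_xy / (n_xy + n_notx_y)
    p_x_given_not_y = n_x_noty / (n_x_noty + n_notx_noty)
    kemeny_oppenheim = (p_x_given_y - p_x_given_not_y) / (p_x_given_y + p_x_given_not_y)

    return {
        "support": support,
        "confidence": confidence,
        "lift": lift,
        "kemeny_oppenheim": kemeny_oppenheim,
    }


# Hypothetical counts for 100 transactions.
forward = rule_measures(n_xy=40, n_x_noty=10, n_notx_y=20, n_notx_noty=30)
# The reverse rule Y -> X swaps the two off-diagonal cells.
backward = rule_measures(n_xy=40, n_x_noty=20, n_notx_y=10, n_notx_noty=30)
print(forward)
print(backward)
```

Lift is the same for X → Y and Y → X, while the Kemeny-Oppenheim value differs between the two directions and turns negative when X makes Y less likely, which is consistent with the abstract's point that confirmation measures can indicate the direction of a rule.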

An Analysis of the Research Trends for Urban Study using Topic Modeling (토픽모델링을 이용한 도시 분야 연구동향 분석)

  • Jang, Sun-Young;Jung, Seunghyun
    • Journal of the Korea Academia-Industrial cooperation Society / v.22 no.3 / pp.661-670 / 2021
  • Research trends can be used to determine the importance of research topics by period, identify under-researched fields, and discover new fields. In this study, topic modeling was used to analyze research trends on urban spaces, where various problems are occurring due to population concentration and urbanization. The analysis targeted the abstracts of papers listed in the Korea Citation Index (KCI) and published between 2002 and 2019. Topic modeling is an algorithm-based text mining technique that can discover patterns across an entire body of content and makes clustering easy. In this study, keyword frequency, trends by year, topic derivation, clusters by topic, and trends by topic type were analyzed. Research on urban regeneration is increasing continuously and was identified as a field in which detailed topics could expand in the future. Furthermore, urban regeneration is now becoming an established research field. On the other hand, topics related to development/growth and energy/environment have entered a period of stagnation. This study is meaningful because it used topic modeling to analyze the correlations and trends between keywords across all domestic urban studies.

The Evaluation for Web Mining and Analytics Service from the View of Personal Information Protection and Privacy (개인정보보호 관점에서의 웹 트래픽 수집 및 분석 서비스에 대한 타당성 연구)

  • Kang, Daniel;Shim, Mi-Na;Bang, Je-Wan;Lee, Sang-Jin;Lim, Jong-In
    • Journal of the Korea Institute of Information Security & Cryptology / v.19 no.6 / pp.121-134 / 2009
  • Consumer-centric marketing is one of the most successful emerging businesses, but it poses a threat to personal privacy. Service providers and users hold conflicting positions on many issues. Enterprises assert that using anonymized personal data poses no problem, whereas individuals are unwilling to simply accept a risk that remains latent. Web traffic analysis technology does not create issues by itself, but it may raise concerns when applied to data of a personal nature. The most criticized ethical issue in web traffic analysis is the invasion of privacy. We therefore need to inspect how much and what kind of personal information is being used, and whether any personal information is handled illegally. In this paper, we inspect the operation of consumer-centric marketing tools such as web log analysis solutions and data gathering services that use web browser toolbars. We also use reverse engineering to inspect a Microsoft Explorer-based toolbar application that records and analyzes personal web browsing patterns. Finally, this paper identifies and explores security and privacy requirements for developing more reliable solutions. This study is important for the balanced development of personal privacy protection and the web traffic analysis industry.

Development of Market Growth Pattern Map Based on Growth Model and Self-organizing Map Algorithm: Focusing on ICT products (자기조직화 지도를 활용한 성장모형 기반의 시장 성장패턴 지도 구축: ICT제품을 중심으로)

  • Park, Do-Hyung;Chung, Jaekwon;Chung, Yeo Jin;Lee, Dongwon
    • Journal of Intelligence and Information Systems / v.20 no.4 / pp.1-23 / 2014
  • Market forecasting aims to estimate the sales volume of a product or service sold to consumers during a specific selling period. From the perspective of the enterprise, accurate market forecasting helps determine the timing of new product introduction and product design, and helps establish production plans and marketing strategies, enabling a more efficient decision-making process. Moreover, accurate market forecasting enables governments to organize the national budget efficiently. This study aims to generate market growth curves for ICT (information and communication technology) goods using past time series data; categorize products showing similar growth patterns; understand markets in the industry; and forecast the future outlook of such products. This study suggests a useful and meaningful process (or methodology) for identifying market growth patterns with a quantitative growth model and a data mining algorithm. The study employs the following methodology. In the first stage, past time series data are collected for the target products or services of the categorized industry. The data, such as sales volume and domestic consumption for a specific product or service, are collected from the relevant government ministry, the National Statistical Office, and other relevant government organizations. Collected data that cannot be analyzed as-is, because past data are missing or code names have changed, must be pre-processed. In the second stage, an optimal model for market forecasting is selected. This model can vary with the characteristics of each categorized industry. Because this study focuses on the ICT industry, where new technologies appear frequently and change the market structure, the Logistic model, Gompertz model, and Bass model are selected. A hybrid model that combines different models can also be considered. 
The hybrid model considered in this study estimates the size of the market potential through the Logistic and Gompertz models and then uses those figures in the Bass model. The third stage is to evaluate which model most accurately explains the data. To do this, the parameters are estimated from the collected past time series data to generate each model's predicted values and calculate the root mean squared error (RMSE). The model that shows the lowest average RMSE across all product types is considered the best model. In the fourth stage, based on the parameter values estimated by the best model, a market growth pattern map is constructed with the self-organizing map algorithm. The self-organizing map is trained with the market pattern parameters of all products or services as input data, and the products or services are organized onto an $N \times N$ map. The number of clusters increases from 2 to M, depending on the characteristics of the nodes on the map. The clusters are divided into zones, and the clusters that provide the most meaningful explanation are selected. Based on the final selection of clusters, the boundaries between the nodes are chosen and, ultimately, the market growth pattern map is completed. The last step is to determine the final characteristics of the clusters as well as their market growth curves. The average of the market growth pattern parameters in each cluster is taken as its representative figure. Using this figure, a growth curve is drawn for each cluster, and its characteristics are analyzed. In addition, the characteristics of each cluster can be described qualitatively by considering the product types it contains. We expect that the process and system suggested in this paper can be used as a tool for forecasting demand in the ICT and other industries.
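
The third stage described above (score each growth model by RMSE and keep the lowest) can be sketched as follows. The code uses common textbook parameterizations of the cumulative Logistic, Gompertz, and Bass curves, which may differ from the forms used in the paper. All parameter values are hypothetical, and the "observed" series is synthetic, generated from the Bass curve itself, so the Bass model wins here by construction. Parameter estimation and the averaging of RMSE over product types are omitted.

```python
import math


def logistic(t, m, a, b):
    """Cumulative logistic curve with market potential m."""
    return m / (1.0 + a * math.exp(-b * t))


def gompertz(t, m, a, b):
    """Cumulative Gompertz curve with market potential m."""
    return m * math.exp(-a * math.exp(-b * t))


def bass(t, m, p, q):
    """Cumulative Bass diffusion curve (p: innovation, q: imitation)."""
    decay = math.exp(-(p + q) * t)
    return m * (1.0 - decay) / (1.0 + (q / p) * decay)


def rmse(observed, predicted):
    """Root mean squared error between two equal-length sequences."""
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


years = list(range(1, 11))
# Synthetic stand-in for a product's cumulative sales series (assumption).
observed = [bass(t, 1000.0, 0.03, 0.4) for t in years]

# Hypothetical, already-estimated parameters for each candidate model.
candidates = {
    "logistic": (logistic, (1000.0, 30.0, 0.6)),
    "gompertz": (gompertz, (1000.0, 5.0, 0.4)),
    "bass": (bass, (1000.0, 0.03, 0.4)),
}

errors = {
    name: rmse(observed, [model(t, *params) for t in years])
    for name, (model, params) in candidates.items()
}
best_model = min(errors, key=errors.get)
print(errors)
print(best_model)
```

With real data, the parameters of each model would first be fitted to the series, and the fitted parameters of the winning model would then serve as the input vectors for the self-organizing map stage.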