• Title/Summary/Keyword: pattern mining

Keyword Network Analysis for Technology Forecasting (기술예측을 위한 특허 키워드 네트워크 분석)

  • Choi, Jin-Ho;Kim, Hee-Su;Im, Nam-Gyu
    • Journal of Intelligence and Information Systems / v.17 no.4 / pp.227-240 / 2011
  • New concepts and ideas often result from extensive recombination of existing concepts or ideas. Both researchers and developers build on existing concepts and ideas in published papers or registered patents to develop new theories and technologies that in turn serve as a basis for further development. As the importance of patents increases, so does that of patent analysis. Patent analysis is largely divided into network-based and keyword-based analyses. The former lacks the ability to analyze technological information in detail, while the latter is unable to identify the relationships between such technologies. To overcome the limitations of network-based and keyword-based analyses, this study blends the two methods and proposes a keyword-network-based analysis methodology. In this study, we collected significant technology information from each patent related to Light Emitting Diodes (LEDs) through text mining, built a keyword network, and then executed a community network analysis on the collected data. The results of the analysis are as follows. First, the patent keyword network showed very low density and an exceptionally high clustering coefficient. Technically, density is obtained by dividing the number of ties in a network by the number of all possible ties. The value ranges between 0 and 1, with higher values indicating denser networks and lower values indicating sparser networks. In real-world networks, the density varies depending on the size of a network; increasing the size of a network generally leads to a decrease in the density. The clustering coefficient is a network-level measure that illustrates the tendency of nodes to cluster in densely interconnected modules. This measure reflects the small-world property, in which a network can be highly clustered even though it has a small average distance between nodes despite the large number of nodes. 
Therefore, the low density of the patent keyword network means that its nodes are connected sporadically, while the high clustering coefficient shows that nodes in the network are closely connected to one another. Second, the cumulative degree distribution of the patent keyword network, like that of other knowledge networks such as citation or collaboration networks, followed a clear power-law distribution. A well-known mechanism behind this pattern is preferential attachment, whereby a node with more links is likely to attain further new links as the network evolves. Unlike normal distributions, the power-law distribution does not have a representative scale. This means that one cannot pick a representative or an average value because there is always a considerable probability of finding much larger values. Networks with power-law distributions are therefore often referred to as scale-free networks. The presence of a heavy-tailed scale-free distribution represents the fundamental signature of an emergent collective behavior of the actors who contribute to forming the network. In our context, the more frequently a patent keyword is used, the more often it is selected by researchers and associated with other keywords or concepts to constitute and convey new patents or technologies. The evidence of a power-law distribution suggests that preferential attachment may be the origin of heavy-tailed distributions in a wide range of growing patent keyword networks. Third, we found that among keywords that flowed into a particular field, the vast majority of keywords with new links join existing keywords in the associated community in forming the concept of a new patent. This finding held for both the short-term (4-year) and long-term (10-year) analyses. 
Furthermore, the keyword combination information derived from the methodology proposed in this study enables one to forecast which concepts will combine to form a new patent dimension and to refer to those concepts when developing a new patent.
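
The two network measures defined in this abstract can be illustrated with a minimal sketch. The toy keyword network below is hypothetical and is not data from the study; the function names are illustrative only. Density is the number of ties divided by the number of possible ties, and the clustering coefficient is taken here as the mean of the local coefficients (one common definition, assumed for this example).

```python
from itertools import combinations


def density(nodes, edges):
    """Number of ties divided by the number of all possible ties (undirected)."""
    n = len(nodes)
    possible = n * (n - 1) // 2
    return len(edges) / possible if possible else 0.0


def average_clustering(nodes, edges):
    """Mean local clustering coefficient over all nodes."""
    neighbors = {v: set() for v in nodes}
    for a, b in edges:
        neighbors[a].add(b)
        neighbors[b].add(a)
    total = 0.0
    for v in nodes:
        k = len(neighbors[v])
        if k < 2:
            continue  # nodes with fewer than two neighbors contribute 0
        # Count pairs of neighbors of v that are themselves connected.
        linked = sum(1 for a, b in combinations(neighbors[v], 2)
                     if b in neighbors[a])
        total += linked / (k * (k - 1) / 2)
    return total / len(nodes)


# Hypothetical toy keyword network: two triangles joined by a single tie.
nodes = ["led", "chip", "phosphor", "lens", "driver", "heatsink"]
edges = {("led", "chip"), ("chip", "phosphor"), ("led", "phosphor"),
         ("lens", "driver"), ("driver", "heatsink"), ("lens", "heatsink"),
         ("phosphor", "lens")}

print(density(nodes, edges))             # 7 ties out of 15 possible
print(average_clustering(nodes, edges))  # most nodes sit inside a triangle
```

In this toy graph the clustering coefficient is well above the density, which is the same qualitative contrast the abstract reports; a real keyword network with many more nodes would typically show a far lower density.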

Corporate Credit Rating based on Bankruptcy Probability Using AdaBoost Algorithm-based Support Vector Machine (AdaBoost 알고리즘기반 SVM을 이용한 부실 확률분포 기반의 기업신용평가)

  • Shin, Taek-Soo;Hong, Tae-Ho
    • Journal of Intelligence and Information Systems / v.17 no.3 / pp.25-41 / 2011
  • Recently, support vector machines (SVMs) have been recognized as competitive tools compared with other data mining techniques for solving pattern recognition or classification decision problems. Furthermore, many studies have shown them to be more powerful than traditional artificial neural networks (ANNs) (Amendolia et al., 2003; Huang et al., 2004; Huang et al., 2005; Tay and Cao, 2001; Min and Lee, 2005; Shin et al., 2005; Kim, 2003). Classification decisions, whether binary or multi-class, made by any classifier (i.e., data mining technique) are highly cost-sensitive, particularly in financial classification problems such as credit rating: if credit ratings are misclassified, investors or financial decision makers may suffer severe economic losses. Therefore, it is necessary to convert the outputs of the classifier into well-calibrated posterior probabilities and to base multi-class credit ratings on the resulting bankruptcy probabilities. However, SVMs do not natively provide such probabilities, so a method is required to create them (Platt, 1999; Drish, 2001). This paper applied AdaBoost algorithm-based SVMs to bankruptcy prediction, treated as a binary classification problem, for IT companies in Korea, and then assigned multi-class credit ratings to the companies by forming a normal-distribution shape of posterior bankruptcy probabilities from the loss functions extracted from the SVMs. The proposed approach also showed that misclassification problems can be minimized by adjusting the credit grade interval ranges, on the condition that each credit grade for credit loan borrowers has its own credit risk, i.e., bankruptcy probability.
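
As a rough illustration of turning a classifier score into a probability and then into a grade, the sketch below uses a sigmoid mapping in the spirit of the cited Platt (1999) approach. It is not the paper's procedure: the sigmoid parameters, the grade cutoffs, and the grade labels are all hypothetical, and in practice the parameters would be fitted on held-out data.

```python
import math
from bisect import bisect_right


def platt_probability(decision_value, a, b):
    """Sigmoid mapping P(y=1 | f) = 1 / (1 + exp(a * f + b))."""
    return 1.0 / (1.0 + math.exp(a * decision_value + b))


def credit_grade(p_bankrupt, cutoffs, grades):
    """Return the grade whose probability interval contains p_bankrupt."""
    return grades[bisect_right(cutoffs, p_bankrupt)]


# Hypothetical values for illustration only.
A, B = -2.0, 0.0               # sigmoid parameters (normally fitted)
CUTOFFS = [0.05, 0.20, 0.50]   # upper bounds of the first three grade intervals
GRADES = ["A", "B", "C", "D"]  # ordered from lowest to highest bankruptcy risk

for score in (-2.0, 0.0, 1.5):  # hypothetical SVM decision values
    p = platt_probability(score, A, B)
    print(score, round(p, 3), credit_grade(p, CUTOFFS, GRADES))
```

Shifting the values in `CUTOFFS` corresponds to adjusting the credit grade interval ranges mentioned in the abstract: the same probability can fall into a different grade once the intervals move.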

Real-time CRM Strategy of Big Data and Smart Offering System: KB Kookmin Card Case (KB국민카드의 빅데이터를 활용한 실시간 CRM 전략: 스마트 오퍼링 시스템)

  • Choi, Jaewon;Sohn, Bongjin;Lim, Hyuna
    • Journal of Intelligence and Information Systems / v.25 no.2 / pp.1-23 / 2019
  • Big data refers to data that is difficult to store, manage, and analyze with existing software. As changes in consumers' lifestyles increase the size and variety of their needs, companies are investing considerable time and money to understand those needs. Companies in various industries use big data to improve their products and services to meet customer needs, to analyze unstructured data, and to respond in real time to reactions to their products and services. The financial industry operates decision support systems that use financial data to develop financial products and manage customer risks. By using big data, financial institutions can effectively add value across the value chain and develop more advanced customer relationship management (CRM) strategies. Financial institutions can use the purchase data and unstructured data generated by credit cards to identify and satisfy customers' desires. As CRM has grown along with information and knowledge systems, it has acquired a granular process that can be measured in real time. With the development of information services and CRM, platforms have changed, and it has become possible to meet consumer needs in various environments. Recently, as consumer needs have diversified, more companies are providing systematic marketing services using data mining and advanced CRM techniques. KB Kookmin Card, which started its credit card business in 1980, stabilized its processes and computer systems early and has actively introduced new technologies and systems. In 2011, the credit card company separated from the bank and led the market with the 'Hye-dam Card' and 'One Card', which departed from existing concepts. In 2017, total domestic use of credit cards and check cards grew by 5.6% year-on-year to 886 trillion won. In 2018, the company received a long-term rating of AA+ in its credit evaluation. 
The company's credit rating was confirmed to be at the top of the list, owing to effective marketing strategies and services. At present, KB Kookmin Card emphasizes strategies that meet the individual needs of customers and maximize customer lifetime value by using customers' payment data. KB Kookmin Card combines internal and external big data to conduct marketing in real time and to build monitoring systems. Using customer card information, it has built a marketing system that detects real-time behavior from big data such as homepage visits and purchase history. The system is designed to capture customers' action events in real time and to execute marketing using the stores, locations, amounts, and usage patterns of card transactions. The company has created more than 280 different scenarios based on the customer life cycle and runs marketing plans that accommodate various customer groups in real time. It operates the smart offering system, a highly efficient marketing management system that detects customers' card usage, behavior, and location information in real time and provides more refined services in combination with various apps. This study traces the change from traditional CRM to the current CRM strategy. Finally, we examine the current CRM strategy through KB Kookmin Card's big data utilization strategy and marketing activities and propose a marketing plan for its future CRM strategy. For the success and continued growth of the smart offering system, KB Kookmin Card should invest in securing increasingly sophisticated ICT and human resources. It also needs to establish a strategy for securing profit from a long-term perspective and pursue it systematically. 
In particular, given current concerns about privacy violations and personal information leakage, efforts should be made to build customers' acceptance of marketing that uses customer information and to form a corporate image that emphasizes security.

Automatic Extraction of Opinion Words from Korean Product Reviews Using the k-Structure (k-Structure를 이용한 한국어 상품평 단어 자동 추출 방법)

  • Kang, Han-Hoon;Yoo, Seong-Joon;Han, Dong-Il
    • Journal of KIISE:Software and Applications / v.37 no.6 / pp.470-479 / 2010
  • Most of the methods for extracting opinion words proposed in existing English-language studies may be difficult to apply directly to Korean. In addition, the manual method suggested by studies in Korea is time-consuming. English thesaurus-based extraction of Korean opinion words suffers from reduced precision because Korean and English words do not match one to one. Studies based on Korean phrase analyzers may fail because they select opinion words with low frequency. Therefore, this study proposes the k-Structure (k=5 or 8) method, which may improve precision while complementing existing studies in Korea, for automatically extracting opinion words from simple sentences in Korean product reviews. A simple sentence is defined as one composed of at least 3 words, i.e., a sentence that includes an opinion word within a distance of $\pm 2$ from the attribute name (e.g., the 'battery' of a camera) of an evaluated product (e.g., a 'camera'). In the performance experiment, opinion words for 8 previously given attribute names were automatically extracted with k-Structure from 1,868 product reviews collected from major domestic shopping malls, and their precision was estimated. The results showed that k=5 led to a recall of 79.0% and a precision of 87.0%, while k=8 led to a recall of 92.35% and a precision of 89.3%. A test was also conducted using PMI-IR (Pointwise Mutual Information - Information Retrieval), one of the methods suggested in English studies, which resulted in a recall of 55% and a precision of 57%.
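
The $\pm 2$ window around an attribute name and the recall/precision figures can be illustrated with a short sketch. This is not the k-Structure method itself, whose details the abstract does not give; it only shows candidate collection within a fixed distance and the two evaluation metrics. The English tokens, the gold list, and the function names are hypothetical stand-ins.

```python
def window_candidates(tokens, attribute, distance=2):
    """Return tokens within +/- distance positions of each attribute occurrence."""
    found = []
    for i, tok in enumerate(tokens):
        if tok != attribute:
            continue
        lo = max(0, i - distance)
        hi = min(len(tokens), i + distance + 1)
        # Keep every token in the window except the attribute itself.
        found.extend(t for j, t in enumerate(tokens[lo:hi], start=lo) if j != i)
    return found


def precision_recall(extracted, gold):
    """Precision and recall of an extracted word set against a gold set."""
    extracted, gold = set(extracted), set(gold)
    hits = len(extracted & gold)
    precision = hits / len(extracted) if extracted else 0.0
    recall = hits / len(gold) if gold else 0.0
    return precision, recall


# Hypothetical review sentence, tokenized on whitespace.
tokens = "the battery lasts long but the lens is heavy".split()
candidates = window_candidates(tokens, "battery")
print(candidates)

# Hypothetical gold opinion words for the attribute 'battery'.
print(precision_recall(candidates, ["long", "short"]))
```

The raw window still contains function words such as "the", which is why a window alone yields modest precision and some further filtering step is needed.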

Science and Technology Policy Studies, Society, and the State : An Analysis of a Co-evolution Among Social Issue, Governmental Policy, and Academic Research in Science and Technology (과학기술정책 연구와 사회, 정부 : 과학기술의 사회이슈, 정부정책, 학술연구의 공진화 분석)

  • Kwon, Ki-Seok;Jeong, Seohwa;Yi, Chan-Goo
    • Journal of Korea Technology Innovation Society / v.21 no.1 / pp.64-91 / 2018
  • This study explores the pattern of interaction among social issues, academic research, and governmental policy on science and technology during the last 20 years. In particular, we try to understand whether science and technology policy research and governmental policy meet social needs appropriately. To do this, we collected text data from news articles, papers, and governmental documents. Based on these data, social network analysis and cluster analysis were carried out. According to the results, science and technology policy research tended to focus on fragmented technological innovation meeting urgent practical needs at the initial stage. Recently, however, the main characteristics of science and technology policy research show co-evolutionary patterns responding to society. Furthermore, a time lag has also been observed in the process of interaction among the three bodies. Based on these results, we put forward some suggestions for future research in science and technology policy. First, the level of analysis needs to shift from the micro level to the meso or macro level. Second, more research effort should be focused on the policy process in science and technology and its public management. Finally, we have to enhance sensitivity to social issues through studies on agenda setting in science and technology policy.

Design and Analysis of Efficient Operation Sequencing in FMC Robot Using Simulation and Sequential Patterns (시뮬레이션과 순차 패턴을 이용한 FMC 로봇의 효율적 작업 순서 설계 및 분석)

  • Kim, Sun-Gil;Kim, Youn-Jin;Lee, Hong-Chul
    • Journal of the Korea Academia-Industrial cooperation Society / v.11 no.6 / pp.2021-2029 / 2010
  • This paper proposes a method to design and analyze the dispatching rule of an FMC robot using simulation and sequential patterns. First, we built the FMC in simulation, extracted the signals with which facilities call the robot, and saved them as logs. Second, we built the robot's optimal path using sequential pattern mining on the results of analyzing the logs and the relationship between machine and robot actions. Lastly, we applied it to the manufacturing line of A corp. to verify its performance. As a result of applying the new dispatching rule in the FMC, total throughput and total flow time decreased because material loss time decreased and robot utilization increased. Furthermore, because this method can be applied to any manufacturing plant using simulation, it can also help improve overall FMC efficiency.
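
The idea of mining frequent call sequences from robot call logs can be sketched as follows. This is a simplified stand-in, not the paper's procedure: it counts only contiguous call sequences of a fixed length, whereas full sequential pattern mining algorithms also allow gaps between items. The machine names and logs are hypothetical.

```python
from collections import Counter


def frequent_sequences(logs, length, min_support):
    """Find contiguous call sequences of a given length that occur in at
    least min_support logs (support = number of logs containing the sequence)."""
    support = Counter()
    for log in logs:
        # Use a set so a sequence is counted at most once per log.
        seen = {tuple(log[i:i + length]) for i in range(len(log) - length + 1)}
        support.update(seen)
    return {seq: n for seq, n in support.items() if n >= min_support}


# Hypothetical logs: the order in which machines called the robot in each run.
logs = [
    ["M1", "M2", "M3", "M1"],
    ["M2", "M3", "M1", "M2"],
    ["M1", "M2", "M3"],
]

patterns = frequent_sequences(logs, length=2, min_support=3)
print(patterns)
```

Sequences that recur across most runs, such as a call from `M1` followed by a call from `M2` in this toy data, are the kind of regularity a dispatching rule could exploit when deciding where to send the robot next.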

Spacio-temporal Analysis of Urban Population Exposure to Traffic-Related air Pollution (교통흐름에 기인하는 미세먼지 노출 도시인구에 대한 시.공간적 분석)

  • Lee, Keum-Sook
    • Journal of the Economic Geographical Society of Korea / v.11 no.1 / pp.59-77 / 2008
  • The purpose of this study is to investigate the impact of traffic-related air pollution on the urban population in the Metropolitan Seoul area. In particular, this study analyzes urban population exposure to traffic-related particulate matter (PM). For this purpose, the study examines the relationships between traffic flows and PM concentration levels during the last fifteen years. Traffic volumes have decreased significantly in Seoul in recent years; however, PM levels have declined less than traffic volumes. This may be related to the rapid growth in population and vehicle numbers in Gyeonggi, the outskirts of Seoul, where several New Towns were developed in the mid-1990s. The spatial pattern of commuting has changed, and thus travel distances and traffic volumes have increased along the main roads connecting the CBDs in Seoul and the New Towns, which consist of large residential apartment complexes. These changes in traffic flows and travel behaviors increase the urban population's exposure to traffic-related air pollution across the Metropolitan Seoul area. GIS techniques are applied to analyze the spatial patterns of traffic flows, population distributions, PM distributions, and passenger flows comprehensively. This study also analyzes real-time traffic flow data and passenger flow data obtained from the T-card transaction database by applying data mining techniques, and attempts to develop a space-time model for assessing journey-time exposure to traffic-related air pollutants based on a travel passenger frequency distribution function. The results of this study can inform sustainable transport systems, public health, and transportation policy aimed at reducing urban air pollution and road traffic in the Metropolitan Seoul area.

Investigation of Subsurface Structure of Cheju Island by Gravity and Magnetic Methods (중력 및 자력 탐사에 의한 제주도 지질구조 연구)

  • Kwon, Byung-Doo;Lee, Heui-Soon;Jung, Gwi-Geum;Chung, Seung-Whan
    • Economic and Environmental Geology / v.28 no.4 / pp.395-404 / 1995
  • The geologic structure of the Cheju volcanic island has been investigated by analyzing gravity and magnetic data. The Bouguer gravity map shows apparent circular low anomalies at the central volcanic edifice, and the maximum difference of the anomaly values on the island appears to be 30 mgal. The subsurface structure of the island is modeled by three-dimensional depth inversion of gravity data, assuming that the model consists of a stacked grid of rectangular prisms of volcanic rocks bounded below by basement rocks. The gravity modeling reveals that the interface between the upper volcanic rocks and the underlying basement warps downward under Mt. Halla, with a maximum depth of 5 km. The magnetic data comprise aeromagnetic and surface magnetic survey data. Both magnetic anomaly maps show characteristic features that resemble the typical pattern of total magnetic anomalies caused by a magnetic body magnetized in the direction of the geomagnetic field in the middle latitude region, though the details of the two maps are somewhat different. The reduced-to-pole magnetic anomaly maps reveal that the main magnetic sources in the island are rift zones and the Halla volcanic edifice. The apparent magnetic boundaries inferred by the method of Cordell and Grauch (1985) match relatively well with known geologic boundaries such as those of the Pyosunri basalt and Sihungri basalt, which form the latest erupted masses. Inversion of aeromagnetic data was conducted with two variables: depth and susceptibility. The inversion results show high-susceptibility bodies in rift zones along the long axis of the island and at the central volcano. Depths to the basement are 1.5~3 km under the major axis and 1~1.5 km under the lava plateau, and culminate at about 5 km under Mt. Halla. Prominent N-S trending anomalies appear in the eastern part of both the gravity and magnetic maps. It is speculated that this trend may be associated with an undefined fault developed across the rift zones.

Sources and Distributions of Dissolved Organic Matter by Fluorescence Method in the Northeastern Pacific Ocean (북동태평양에서 형광 기법을 이용한 용존유기물의 기원 및 분포)

  • Son, Ju-Won;Son, Seung-Kyu;Ju, Se-Jong;Kim, Kyeong-Hong;Kim, Woong-Seo;Park, Yong-Chul
    • Ocean and Polar Research / v.29 no.2 / pp.87-99 / 2007
  • This study was conducted to understand the source and behavior of organic matter using a fluorescence technique (excitation-emission matrix) as part of the environmental monitoring program at the Korean manganese nodule mining site in the Northeastern Pacific Ocean. Water samples were collected at $0^{\circ}$, $6^{\circ}$N, and $10.5^{\circ}$N along $131.5^{\circ}$W in August 2005. The concentration of total organic carbon (TOC) ranged from 58.01 to 171.93 $\mu$M-C. TOC was higher in the surface layer and decreased with depth. At $6^{\circ}$N, depth-integrated (from the surface to 200 m depth) TOC was 337.1 gC/m$^2$, which was 1.4 times higher than at the other stations. The exponential decay curve fitted to the vertical profile of TOC indicated that 59% of the organic carbon produced by primary production in the surface layer could be decomposed by bacteria in the water column. Dissolved organic matter is generally classified into two distinct groups based on fluorescence characteristics obtained with the three-dimensional excitation/emission (Ex/Em) fluorescence mapping technique. One is known as biomacromolecule (BM; protein-like substance; showing a maximum at Ex 280/Em 330), which mainly originates from biological metabolism. The other is geomacromolecule (GM; humic-like substance; showing a maximum at Ex 330/Em 430), which mainly originates from microbial degradation processes. The concentrations of BM and GM ranged from 0.42 to 7.29 TU (tryptophan unit) and from 0.06 to 1.81 QSU (quinine sulfate unit), respectively. The vertical distribution of BM was similar to that of TOC, high at the surface and decreasing with depth. However, the vertical distribution of GM showed the reverse pattern. From these results, it appeared that BM made up a major part of TOC and was rapidly consumed by bacteria in the surface layer. GM was mainly transformed from BM by microbial processes and was a dominant component of TOC in the deep-sea layer.
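
A depth-integrated TOC value in gC/m$^2$ can be obtained from a concentration profile in $\mu$M-C by integrating over depth and converting moles of carbon to grams. The sketch below assumes trapezoidal integration, which the abstract does not specify, and uses a made-up profile rather than the study's measurements. It relies on the identity 1 $\mu$M = 1 mmol/m$^3$.

```python
CARBON_G_PER_MOL = 12.011  # molar mass of carbon


def depth_integrated_toc(depths_m, toc_um):
    """Trapezoidal integral of TOC over depth.

    depths_m: sampling depths in metres, increasing.
    toc_um:   TOC at those depths in uM-C (equal to mmol C per m^3).
    Returns the depth-integrated TOC in g C per m^2.
    """
    mmol_per_m2 = 0.0
    for i in range(len(depths_m) - 1):
        layer_thickness = depths_m[i + 1] - depths_m[i]
        mean_concentration = 0.5 * (toc_um[i] + toc_um[i + 1])
        mmol_per_m2 += mean_concentration * layer_thickness
    return mmol_per_m2 * CARBON_G_PER_MOL / 1000.0  # mmol -> g


# Hypothetical profile: a constant 100 uM-C from the surface to 200 m.
uniform = depth_integrated_toc([0, 100, 200], [100.0, 100.0, 100.0])
print(round(uniform, 2))
```

A uniform 100 $\mu$M-C over 200 m gives roughly 240 gC/m$^2$, so the reported 337.1 gC/m$^2$ corresponds to a mean concentration of about 140 $\mu$M-C over the same layer, which lies inside the reported concentration range.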

Neutron Diffraction Study on the Crystal Structure of Yttria-Stabilized Zirconium Oxide (중성자회절법을 이용한 이트리아 저코니아의 결정구조 연구)

  • Jin-Ho Lee;Chang-Hee Lee;Won-Sa Kim
    • Journal of the Mineralogical Society of Korea / v.13 no.3 / pp.164-170 / 2000
  • Neutron single crystal and powder diffraction techniques have been applied to the structure analysis of yttria-stabilized zirconium oxide, $\mathrm{Zr_{0.73}Y_{0.27}O_{1.87}}$, prepared by the skull-melting method. The crystal structure has been determined to have cubic symmetry, space group Fm/equation omitted/, with a=5.155(2) $\AA$, V=136.99(5) $\AA^3$, Z=4, and R(F)=5.65%, $\omega$R(I)=10.57% for 70 integrated intensities of Bragg peaks observed from a single crystal of $\mathrm{Zr_{0.73}Y_{0.27}O_{1.87}}$. The stabilizer atoms randomly occupy the zirconium sites, and the oxygen atoms are displaced from the ideal positions of the fluorite structure with amplitudes of $\Delta/a \sim 0.033$ and 0.11 along the <110> and <100> directions, respectively. There are no significant differences in crystallographic data between the single crystal and powder studies. The diffraction pattern after Rietveld refinement, using neutron powder data, shows evidence of a tetragonal impurity phase or a slight tetragonal distortion.
