• Title/Summary/Keyword: 실제 시스템 (real system)

Prediction of commitment and persistence in heterosexual involvements according to the styles of loving using a datamining technique (데이터마이닝을 활용한 사랑의 형태에 따른 연인관계 몰입수준 및 관계 지속여부 예측)

  • Park, Yoon-Joo
    • Journal of Intelligence and Information Systems / v.22 no.4 / pp.69-85 / 2016
  • A successful relationship with a loving partner is one of the most important factors in life. In psychology, previous studies have examined the factors that influence romantic relationships. However, most of these studies relied on statistical analysis, which limits their ability to capture complex non-linear relationships or rule-based reasoning. This research analyzes commitment and persistence in heterosexual involvement according to styles of loving, using a datamining technique as well as statistical methods. In addition to the factors suggested by previous studies, we consider six styles of loving that influence romantic relationships: 'eros', 'ludus', 'storge', 'pragma', 'mania' and 'agape'. Lee (1977) defines these six types of love as follows: 'eros' is romantic, passionate love; 'ludus' is a game-playing or uncommitted love; 'storge' is a slow-developing, friendship-based love; 'pragma' is a pragmatic, practical, mutually beneficial relationship; 'mania' is an obsessive or possessive love; and 'agape' is a gentle, caring, giving type of love, a brotherly love not concerned with the self. Data were collected from 105 heterosexual couples. Linear regression was first applied to identify the important factors associated with commitment to a partner. The results show that 'satisfaction', 'eros' and 'agape' are significant factors associated with the commitment level for both males and females. Interestingly, for males 'agape' has a greater effect on commitment than 'eros', whereas for females 'eros' is a more significant factor than 'agape'. In addition, the male's 'investment' is also a crucial factor for male commitment. Next, decision tree analysis was performed to characterize high-commitment and low-commitment couples. 
To build the decision tree models in this experiment, the 'decision tree' operator of the datamining tool RapidMiner was used. The experimental results show that males with a high satisfaction level in the relationship show a high commitment level. However, even when a male does not have a high satisfaction level, he also shows a high commitment level to the female if he has made a large financial or mental investment in the relationship and his partner shows him a certain amount of 'agape'. Among females, a woman with high 'eros' and 'satisfaction' levels shows a high commitment level. Otherwise, even when a female does not have a high satisfaction level, she also shows a high commitment level if her partner shows a certain amount of 'mania'. Finally, this research used a decision tree to build a prediction model of whether the relationship will persist or break up. The results show that the most important factor influencing a break-up is the 'narcissistic tendency' of the male. In addition, the 'satisfaction', 'investment' and 'mania' of both the male and the female also affect a break-up. Interestingly, while the 'mania' level of the male works positively to maintain the relationship, that of the female has a negative influence. The contribution of this research is the adoption of a new analysis technique, datamining, for psychology. In addition, the results can provide useful advice to couples for building a harmonious relationship. This research has several limitations. First, the experimental data were oversampled to balance the class sizes, which limits how objectively the performance of the predictive models can be evaluated. Second, the outcome data, whether the relationship persisted or not, were collected after a relatively short period, 6 months after the initial data collection. Lastly, most of the survey respondents were in their 20s. 
In order to get more general results, we would like to extend this research to general populations.

Determinants of Mobile Application Use: A Study Focused on the Correlation between Application Categories (모바일 앱 사용에 영향을 미치는 요인에 관한 연구: 앱 카테고리 간 상관관계를 중심으로)

  • Park, Sangkyu;Lee, Dongwon
    • Journal of Intelligence and Information Systems / v.22 no.4 / pp.157-176 / 2016
  • For a long time, the mobile phone served the sole function of communication. Recently, however, rapid innovations in technology have extended the range of mobile phone activities. Technological development has made an almost computer-like environment possible even on a very small device. This advancement yielded several new high-tech devices, such as smartphones and tablet PCs, which quickly proliferated. Along with the diffusion of mobile devices, mobile applications for those devices also prospered and soon became deeply embedded in consumers' daily lives. Numerous mobile applications have been released in app stores, yielding trillions of cumulative downloads. However, the large majority of applications are disregarded by consumers. Even after applications are purchased, they do not survive long on consumers' mobile devices and are soon abandoned. It is therefore imperative for both app developers and app-store operators to understand consumer behavior and to develop marketing strategies for a sustainable business, first by increasing sales of mobile applications and also by designing survival strategies for applications. Accordingly, this research analyzes consumers' mobile application usage behavior in terms of substitutive/supplementary relationships among application categories and several explanatory variables. Considering that consumers of mobile devices use multiple apps simultaneously, this research adopts multivariate probit models to explain mobile application usage behavior and to derive the correlations between application categories, from which substitutive/supplementary use can be observed. 
The research adopts several explanatory variables: sociodemographic data, user experience of purchased applications (which reflects future purchasing behavior of paid applications), consumer attitudes toward marketing efforts, variables representing consumer attitudes toward app ratings, and variables representing consumer attitudes toward app-store promotion efforts (i.e., the top developer badge and the editor's choice badge). The results of this study can be explained within a hedonic and utilitarian framework. Consumers who use hedonic applications, such as game and entertainment applications, tend to be young with a low education level. In contrast, consumers who are older and have a higher education level prefer utilitarian application categories such as life and information. Whether users of SNS are hedonic or utilitarian is disputed. In our results, consumers who are younger and those with a higher education level prefer SNS applications, which places SNS between the utilitarian and hedonic results. Also, applications directly related to tangible assets, such as banking, stock and mobile shopping, are the only ones negatively related to experience of purchasing paid apps, meaning that consumers who put weight on tangible assets do not prefer buying paid applications. Regarding categories, most correlations among categories are significantly positive. This is because someone who spends more time on mobile devices tends to use more applications. The game and entertainment categories show a significant positive correlation; however, there are significant negative correlations between game and information, as well as between game and e-commerce categories. Meanwhile, game and SNS as well as game and finance show no significant correlations. 
This result shows that mobile application usage behavior is clearly distinguishable: the purposes of using mobile devices are polarized into utilitarian and hedonic purposes. This research substantiates several arguments that can be explained only by second-hand real data, not by survey data, and offers behavioral explanations of mobile application usage from the consumer's perspective. This research also shows substitutive/supplementary patterns of consumer application usage, which in turn explain consumers' mobile application usage behavior. However, this research has some limitations. The classification of categories is itself disputable, since classifications diverge among studies, so the results may change depending on the classification. In addition, although the data were collected at the level of individual applications, we reduced the observations to the level of individual users. Further research will be done to resolve these limitations.

Evaluation of Setup Uncertainty on the CTV Dose and Setup Margin Using Monte Carlo Simulation (몬테칼로 전산모사를 이용한 셋업오차가 임상표적체적에 전달되는 선량과 셋업마진에 대하여 미치는 영향 평가)

  • Cho, Il-Sung;Kwark, Jung-Won;Cho, Byung-Chul;Kim, Jong-Hoon;Ahn, Seung-Do;Park, Sung-Ho
    • Progress in Medical Physics / v.23 no.2 / pp.81-90 / 2012
  • The effect of setup uncertainties on the CTV dose and the correlation between setup uncertainties and setup margin were evaluated by Monte Carlo based numerical simulation. Patient-specific information from an IMRT treatment plan for rectal cancer, designed on the VARIAN Eclipse planning system, was used in the Monte Carlo simulation program, including the planned dose distribution and the tumor volume information of a rectal cancer patient. The simulation program was developed for this study in a Linux environment using open source packages, GNU C++ and the ROOT data analysis framework. All misalignments of patient setup were assumed to follow the central limit theorem. Thus, systematic and random errors were generated according to Gaussian statistics, with a given standard deviation as the simulation input parameter. After the setup error simulations, the change of dose in the CTV was analyzed from the simulation results. To verify the conventional margin recipe, the correlation between setup error and setup margin was compared with the margin formula developed for three-dimensional conformal radiation therapy. The simulation was performed 2,000 times for each simulation input of systematic and random errors independently. The standard deviation used to generate patient setup errors was varied from 1 mm to 10 mm in 1 mm steps. For the systematic error, the minimum dose to the CTV, $D_{min}^{syst.}$, decreased from 100.4% to 72.50% and the mean dose, $\bar{D}_{syst.}$, decreased from 100.45% to 97.88%, while the standard deviation of the dose distribution in the CTV increased from 0.02% to 3.33%. The random error likewise reduced the mean and minimum dose to the CTV: the minimum dose, $D_{min}^{rand.}$, was reduced from 100.45% to 94.80% and the mean dose, $\bar{D}_{rand.}$, decreased from 100.46% to 97.87%. 
As with the systematic error, the standard deviation of the CTV dose, $\Delta D_{rand}$, increased from 0.01% to 0.63%. After calculating the margin size for each systematic and random error, the "population ratio" was introduced and applied to verify the margin recipe. It was found that the conventional margin formula satisfies the margin objective for IMRT treatment of rectal cancer. The developed Monte Carlo based simulation program may be useful for studying patient setup error and dose coverage of the CTV under variations of margin size and setup error.
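
The sampling idea in this abstract can be illustrated with a toy sketch. It is not the authors' GNU C++/ROOT program: the 1-D dose profile, the field and CTV widths, and the function names `dose_at` and `simulate_min_ctv_dose` are all assumptions made for illustration, and only a single Gaussian (systematic-type) shift per trial is modeled.

```python
import random

def dose_at(x_mm, field_half_width=30.0, penumbra=5.0):
    """Toy 1-D dose profile in %: flat 100 inside the field, linear fall-off in the penumbra."""
    d = abs(x_mm)
    if d <= field_half_width:
        return 100.0
    if d >= field_half_width + penumbra:
        return 0.0
    return 100.0 * (1.0 - (d - field_half_width) / penumbra)

def simulate_min_ctv_dose(sigma_mm, ctv_half_width=25.0, n_trials=2000, seed=0):
    """Average the minimum CTV dose over n_trials Gaussian setup shifts."""
    rng = random.Random(seed)
    # Sample the CTV every 1 mm (hypothetical geometry: 5 mm margin to the field edge)
    ctv_points = [-ctv_half_width + i * 1.0 for i in range(int(2 * ctv_half_width) + 1)]
    total = 0.0
    for _ in range(n_trials):
        shift = rng.gauss(0.0, sigma_mm)  # one setup shift per simulated treatment
        total += min(dose_at(x + shift) for x in ctv_points)
    return total / n_trials

# Sweep the standard deviation from 1 mm to 10 mm in 1 mm steps, as in the abstract
sweep = {sigma: simulate_min_ctv_dose(float(sigma)) for sigma in range(1, 11)}
```

In this toy geometry the mean minimum dose stays near 100% while the shifts remain inside the 5 mm margin and drops as the standard deviation grows, which is the qualitative trend the abstract reports. The numbers themselves are not comparable to the paper's results.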

Diffusion equation model for geomorphic dating (지형연대 측정을 위한 디퓨젼 공식 모델)

  • Lee, Min Boo
    • Journal of the Korean Geographical Society / v.28 no.4 / pp.285-297 / 1993
  • For the application of the diffusion equation, slope height and maximum slope angle are calculated from the plotted slope profile. Using the denudation rate in a solution of the diffusion equation, an apparent age index can be calculated, which is the total amount of denudation through total time. Plots of slope angle versus slope height and of apparent age index versus slope height are useful for determining relative or absolute ages and denudation rates. Mathematical simulation plots of slope angle versus slope height can generate equal denudation-rate lines for a given age. Mathematical simulations of slope angle versus age for a given slope height can likewise be plotted, both for equal denudation rates at a particular profile site and for comparison with other sites having controlled ages.
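
As a hedged illustration of how slope height and maximum slope angle lead to an apparent age index, the sketch below assumes the classic scarp solution of the diffusion equation, h(x, t) = a·erf(x / (2·sqrt(κt))) + b·x, whose maximum slope is tan θ = a / sqrt(πκt) + b. The abstract does not state which solution the paper uses, so this form, the function name `apparent_age_index`, and the parameter names are assumptions.

```python
import math

def apparent_age_index(slope_height_m, max_slope_deg, far_field_slope_deg=0.0):
    """Return kappa*t in m^2 (denudation rate times age) for the assumed erf scarp solution."""
    a = slope_height_m / 2.0  # half of the slope height (scarp offset)
    b = math.tan(math.radians(far_field_slope_deg))
    tan_max = math.tan(math.radians(max_slope_deg))
    if tan_max <= b:
        raise ValueError("maximum slope must exceed the far-field slope")
    # The steepest point is at x = 0: tan(theta_max) = a / sqrt(pi * kappa * t) + b
    return a * a / (math.pi * (tan_max - b) ** 2)

# Two hypothetical profiles of equal height: the gentler one has the larger index
steep = apparent_age_index(2.0, 30.0)
gentle = apparent_age_index(2.0, 10.0)
```

Dividing the index by an independently known denudation rate κ would give an age, and dividing by a known age would give κ, which is how such an index supports both relative and absolute dating.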

A Methodology for Automatic Multi-Categorization of Single-Categorized Documents (단일 카테고리 문서의 다중 카테고리 자동확장 방법론)

  • Hong, Jin-Sung;Kim, Namgyu;Lee, Sangwon
    • Journal of Intelligence and Information Systems / v.20 no.3 / pp.77-92 / 2014
  • Recently, numerous documents containing unstructured data and text have been created due to the rapid increase in the use of social media and the Internet. Each document is usually assigned a specific category for the convenience of users. In the past, categorization was performed manually. However, with manual categorization, not only can the accuracy not be guaranteed, but the categorization also requires a large amount of time and cost. Many studies have been conducted on the automatic creation of categories to overcome the limitations of manual categorization. Unfortunately, most of these methods cannot be applied to complex documents with multiple topics, because they assume that one document belongs to one category only. To overcome this limitation, some studies have attempted to categorize each document into multiple categories. However, these are also limited in that their learning process requires training on a multi-categorized document set. These methods therefore cannot be applied to multi-categorization of most documents unless multi-categorized training sets are provided. To remove the requirement for a multi-categorized training set imposed by traditional multi-categorization algorithms, we propose a new methodology that extends the category of a single-categorized document to multiple categories by analyzing the relationships among categories, topics, and documents. First, we find the relationship between documents and topics by using the result of topic analysis for single-categorized documents. Second, we construct a correspondence table between topics and categories by investigating the relationship between them. Finally, we calculate the matching scores of each document against multiple categories. 
The results imply that a document can be classified into a certain category if and only if the matching score is higher than the predefined threshold. For example, we can classify a document into the three categories whose matching scores exceed the predefined threshold. The main contribution of our study is that our methodology can improve the applicability of traditional multi-category classifiers by generating multi-categorized documents from single-categorized documents. Additionally, we propose a module for verifying the accuracy of the proposed methodology. For performance evaluation, we performed intensive experiments with news articles. News articles are clearly categorized by theme, and they contain less vulgar language and slang than other typical text documents. We collected news articles from July 2012 to June 2013. The number of articles varies widely across categories. This is because readers have different levels of interest in each category, and because events occur with different frequencies in each category. To minimize the distortion caused by differing numbers of articles per category, we extracted 3,000 articles from each of the eight categories, so the total number of articles used in our experiments was 24,000. The eight categories were "IT Science," "Economy," "Society," "Life and Culture," "World," "Sports," "Entertainment," and "Politics." Using the collected news articles, we calculated the document/category correspondence scores from the topic/category and document/topic correspondence scores. The document/category correspondence score indicates the degree to which each document corresponds to a certain category. As a result, we could present two additional categories for each of the 23,089 documents. 
Precision, recall, and F-score were revealed to be 0.605, 0.629, and 0.617 respectively when only the top 1 predicted category was evaluated, whereas they were revealed to be 0.838, 0.290, and 0.431 when the top 1 - 3 predicted categories were considered. It was very interesting to find a large variation between the scores of the eight categories on precision, recall, and F-score.
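
The three steps of the methodology (document/topic weights, a topic/category correspondence table, then thresholded matching scores) can be sketched as follows. The abstract does not say how the two kinds of scores are combined, so the weighted sum used here is an assumption, as are the function names and every number in the example.

```python
def document_category_scores(doc_topics, topic_categories):
    """Combine document/topic and topic/category weights into document/category scores."""
    scores = {}
    for topic, doc_weight in doc_topics.items():
        for category, cat_weight in topic_categories.get(topic, {}).items():
            # Assumed combination rule: sum over topics of the product of the two weights
            scores[category] = scores.get(category, 0.0) + doc_weight * cat_weight
    return scores

def assign_categories(scores, threshold):
    """Keep every category whose matching score exceeds the threshold, best first."""
    kept = [c for c, s in scores.items() if s > threshold]
    return sorted(kept, key=lambda c: scores[c], reverse=True)

# Hypothetical topic weights for one document and a hypothetical correspondence table
doc_topics = {"t1": 0.6, "t2": 0.3, "t3": 0.1}
topic_categories = {
    "t1": {"Economy": 0.8, "Politics": 0.2},
    "t2": {"Politics": 0.7, "World": 0.3},
    "t3": {"Sports": 1.0},
}
scores = document_category_scores(doc_topics, topic_categories)
categories = assign_categories(scores, threshold=0.2)
```

With these made-up weights the document keeps "Economy" and "Politics" and drops "World" and "Sports". Raising the threshold trades recall for precision, which matches the direction of the top-1 versus top 1-3 figures reported in the abstract.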

Game Theoretic Optimization of Investment Portfolio Considering the Performance of Information Security Countermeasure (정보보호 대책의 성능을 고려한 투자 포트폴리오의 게임 이론적 최적화)

  • Lee, Sang-Hoon;Kim, Tae-Sung
    • Journal of Intelligence and Information Systems / v.26 no.3 / pp.37-50 / 2020
  • Information security has become an important issue worldwide. Various information and communication technologies, such as the Internet of Things, big data, cloud, and artificial intelligence, are developing, and the need for information security is increasing. Although the necessity of information security is growing with the development of information and communication technology, interest in information security investment is insufficient. In general, measuring the effect of information security investment is difficult, so appropriate investment is not being practiced, and organizations are decreasing their information security investment. In addition, since the types and specifications of information security measures are diverse, it is difficult to compare and evaluate information security countermeasures objectively, and decision-making methods for information security investment are lacking. For an organization to develop, policies and decisions related to information security are essential, and measuring the effect of information security investment is necessary. Therefore, this study proposes a method of constructing an investment portfolio for information security measures using game theory and derives an optimal defence probability. Using a two-person game model, the information security manager and the attacker are assumed to be the players, and the information security countermeasures and information security threats are assumed to be their respective strategies. A zero-sum game, in which the sum of the players' payoffs is zero, is assumed, and we derive the solution of a mixed strategy game in which a strategy is selected according to a probability distribution over strategies. In the real world, various types of information security threats exist, so multiple information security measures should be considered to maintain an appropriate information security level for information systems. 
We assume that the defence ratio of the information security countermeasures is known, and we derive the optimal solution of the mixed strategy game using linear programming. The contributions of this study are as follows. First, we conduct analysis using real performance data of information security measures. Information security managers of organizations can use the methodology suggested in this study to make practical decisions when establishing investment portfolio for information security countermeasures. Second, the investment weight of information security countermeasures is derived. Since we derive the weight of each information security measure, not just whether or not information security measures have been invested, it is easy to construct an information security investment portfolio in a situation where investment decisions need to be made in consideration of a number of information security countermeasures. Finally, it is possible to find the optimal defence probability after constructing an investment portfolio of information security countermeasures. The information security managers of organizations can measure the specific investment effect by drawing out information security countermeasures that fit the organization's information security investment budget. Also, numerical examples are presented and computational results are analyzed. Based on the performance of various information security countermeasures: Firewall, IPS, and Antivirus, data related to information security measures are collected to construct a portfolio of information security countermeasures. The defence ratio of the information security countermeasures is created using a uniform distribution, and a coverage of performance is derived based on the report of each information security countermeasure. 
According to numerical examples that considered Firewall, IPS, and Antivirus as information security countermeasures, the investment weights of Firewall, IPS, and Antivirus are optimized to 60.74%, 39.26%, and 0%, respectively. The result shows that the defence probability of the organization is maximized to 83.87%. When the methodology and examples of this study are used in practice, information security managers can consider various types of information security measures, and the appropriate investment level of each measure can be reflected in the organization's budget.
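
To make the mixed-strategy idea concrete, here is a minimal sketch for the smallest case: two countermeasures against two threats. It uses the closed-form solution of a 2x2 zero-sum game rather than the linear programming formulation the study applies to its larger portfolio, and the payoff matrix (defence ratios) is hypothetical, not the paper's data. The function name `solve_2x2_zero_sum` is likewise an assumption.

```python
def solve_2x2_zero_sum(payoff):
    """Return (probability of row 0, game value) for the maximizing row player."""
    (a, b), (c, d) = payoff
    maximin = max(min(a, b), min(c, d))  # best payoff the defender can guarantee with a pure strategy
    minimax = min(max(a, c), max(b, d))  # lowest payoff the attacker can enforce with a pure strategy
    if maximin == minimax:
        # Saddle point: a pure strategy is already optimal
        p = 1.0 if min(a, b) >= min(c, d) else 0.0
        return p, maximin
    denom = a - b - c + d
    p = (d - c) / denom                  # weight that makes the attacker indifferent
    value = (a * d - b * c) / denom
    return p, value

# Rows: countermeasures, columns: threats, entries: hypothetical defence ratios
defence_ratio = [[0.9, 0.2],
                 [0.3, 0.8]]
weight_first, defence_probability = solve_2x2_zero_sum(defence_ratio)
```

Here the first countermeasure receives a weight of 5/12 and the guaranteed defence probability is 0.55, whichever threat the attacker chooses. With three or more countermeasures the same indifference conditions become the constraints of a linear program, which is why the study turns to linear programming.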

Experimental Analysis of Nodal Head-outflow Relationship Using a Model Water Supply Network for Pressure Driven Analysis of Water Distribution System (상수관망 압력기반 수리해석을 위한 모의 실험시설 기반 절점의 압력-유량 관계 분석)

  • Chang, Dongeil;Kang, Kihoon
    • Journal of Korean Society of Environmental Engineers / v.36 no.6 / pp.421-428 / 2014
  • Demand-driven and pressure-driven analysis methods have been proposed for the analysis of water supply networks. Of the two, demand-driven analysis (DDA) can be used to evaluate the hydraulic status of a pipe network only under normal operating conditions. Under abnormal conditions, e.g., unexpected pipe destruction or abnormally low pressure, the pressure-driven analysis (PDA) method should be used to estimate the suppliable flowrate at each node in a network. To carry out pressure-driven analysis, the head-outflow relationship (HOR), which estimates the flowrate at a given pressure at each node, must first be determined. Most previous studies suggested empirically that each node possesses its own characteristic head-outflow relationship, which therefore requires verification with actual field data before it can be applied properly in PDA modeling. In this study, a model pipe network was constructed, and various operation scenarios of normal and abnormal conditions, which cannot be realized in real pipe networks, were established. Using the model network, data on pressure and flowrate at each node were obtained under each operating condition. Using these data, previously proposed HOR equations were evaluated. In addition, the head-outflow relationship at each node was analyzed, especially under multiple pipe destruction events. Analysis of the experimental data from the model network showed that the flowrate reduction corresponding to a given pressure drop (caused by pipe destruction at one or multiple points in the network) followed the intrinsic head-outflow relationship of each node. When the experimentally obtained head-outflow relationship was compared with the HOR equations proposed by previous studies, the one proposed by Wagner et al. showed the best agreement, with an exponent parameter m of 3.0.
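
A head-outflow relationship of the Wagner type is usually written as a piecewise function of nodal head. The sketch below assumes the commonly cited form in which the parameter enters as the exponent 1/m. The abstract does not state the convention behind its m of 3.0, so that convention, the function name `available_outflow`, and all numbers in the example are assumptions.

```python
def available_outflow(head, demand, h_min, h_des, m=2.0):
    """Outflow at a node for a given head, using an assumed Wagner-type relationship."""
    if h_des <= h_min:
        raise ValueError("h_des must be greater than h_min")
    if head <= h_min:
        return 0.0     # below the minimum head nothing can be supplied
    if head >= h_des:
        return demand  # at or above the desired head the full demand is met
    # Partial supply between the two thresholds
    return demand * ((head - h_min) / (h_des - h_min)) ** (1.0 / m)

# Hypothetical node: demand 8.0 L/s, minimum head 0 m, desired head 20 m
partial = available_outflow(head=10.0, demand=8.0, h_min=0.0, h_des=20.0, m=2.0)
```

A demand-driven solver would return the full 8.0 L/s regardless of head, whereas this relationship reduces the supply to about 5.66 L/s at half the desired head. That difference is the reason PDA is preferred under pipe destruction or low-pressure conditions.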

The Intelligent Determination Model of Audience Emotion for Implementing Personalized Exhibition (개인화 전시 서비스 구현을 위한 지능형 관객 감정 판단 모형)

  • Jung, Min-Kyu;Kim, Jae-Kyeong
    • Journal of Intelligence and Information Systems / v.18 no.1 / pp.39-57 / 2012
  • Recently, with the introduction of high-tech equipment, much attention has focused on interactive exhibits, which can double the exhibition effect through interaction with the audience. An interactive exhibition also makes it possible to measure a variety of audience reactions. Among these reactions, this research uses the change in facial features, which can be collected in an interactive exhibition space. This research develops an artificial neural network-based model that predicts the response of the audience by measuring the change in facial features when a stimulus is given to an audience member in a non-excited state. To represent the emotional state of the audience, this research uses the Valence-Arousal model. The research proposes an overall framework composed of the following six steps. The first step collects data for modeling. The data were collected from people who participated in the 2012 Seoul DMC Culture Open and were used for the experiments. The second step extracts 64 facial features from the collected data and compensates the facial feature values. The third step generates the independent and dependent variables of the artificial neural network model. The fourth step uses a statistical technique to select the independent variables that affect the dependent variable. The fifth step builds the artificial neural network model and performs the learning process using a training set and a test set. The sixth and final step validates the prediction performance of the artificial neural network model using the validation data set. The proposed model was compared with a statistical predictive model to see whether it performed better. As a result, although the data set in this experiment was noisy, the proposed model showed better results than a multiple regression analysis model. 
If the prediction model of audience reaction is used in a real exhibition, it will be possible to provide countermeasures and services appropriate to the reaction of the audience viewing the exhibits. Specifically, if the audience's arousal toward an exhibit is low, action can be taken to increase it, for instance by recommending other preferred content or by using light or sound to draw attention to the exhibit. In other words, future exhibitions could be planned to satisfy various audience preferences, and a personalized environment that helps visitors concentrate on the exhibits is expected. However, the proposed model still shows low prediction accuracy, for the following reasons. First, the data cover diverse visitors to a real exhibition, so it was difficult to control the experimental environment. The collected data are therefore noisy, which lowers the accuracy. In further research, data will be collected in a more controlled experimental environment in order to increase the prediction accuracy of the model. Second, changes in facial expression alone are thought to be insufficient for extracting audience emotions. Combining facial expression with other responses, such as sound and audience behavior, would yield a better result.

Ontology-Based Process-Oriented Knowledge Map Enabling Referential Navigation between Knowledge (지식 간 상호참조적 네비게이션이 가능한 온톨로지 기반 프로세스 중심 지식지도)

  • Yoo, Kee-Dong
    • Journal of Intelligence and Information Systems / v.18 no.2 / pp.61-83 / 2012
  • A knowledge map describes the network of related knowledge into the form of a diagram, and therefore underpins the structure of knowledge categorizing and archiving by defining the relationship of the referential navigation between knowledge. The referential navigation between knowledge means the relationship of cross-referencing exhibited when a piece of knowledge is utilized by a user. To understand the contents of the knowledge, a user usually requires additionally information or knowledge related with each other in the relation of cause and effect. This relation can be expanded as the effective connection between knowledge increases, and finally forms the network of knowledge. A network display of knowledge using nodes and links to arrange and to represent the relationship between concepts can provide a more complex knowledge structure than a hierarchical display. Moreover, it can facilitate a user to infer through the links shown on the network. For this reason, building a knowledge map based on the ontology technology has been emphasized to formally as well as objectively describe the knowledge and its relationships. As the necessity to build a knowledge map based on the structure of the ontology has been emphasized, not a few researches have been proposed to fulfill the needs. However, most of those researches to apply the ontology to build the knowledge map just focused on formally expressing knowledge and its relationships with other knowledge to promote the possibility of knowledge reuse. Although many types of knowledge maps based on the structure of the ontology were proposed, no researches have tried to design and implement the referential navigation-enabled knowledge map. This paper addresses a methodology to build the ontology-based knowledge map enabling the referential navigation between knowledge. 
The ontology-based knowledge map resulted from the proposed methodology can not only express the referential navigation between knowledge but also infer additional relationships among knowledge based on the referential relationships. The most highlighted benefits that can be delivered by applying the ontology technology to the knowledge map include; formal expression about knowledge and its relationships with others, automatic identification of the knowledge network based on the function of self-inference on the referential relationships, and automatic expansion of the knowledge-base designed to categorize and store knowledge according to the network between knowledge. To enable the referential navigation between knowledge included in the knowledge map, and therefore to form the knowledge map in the format of a network, the ontology must describe knowledge according to the relation with the process and task. A process is composed of component tasks, while a task is activated after any required knowledge is inputted. Since the relation of cause and effect between knowledge can be inherently determined by the sequence of tasks, the referential relationship between knowledge can be circuitously implemented if the knowledge is modeled to be one of input or output of each task. To describe the knowledge with respect to related process and task, the Protege-OWL, an editor that enables users to build ontologies for the Semantic Web, is used. An OWL ontology-based knowledge map includes descriptions of classes (process, task, and knowledge), properties (relationships between process and task, task and knowledge), and their instances. Given such an ontology, the OWL formal semantics specifies how to derive its logical consequences, i.e. facts not literally present in the ontology, but entailed by the semantics. Therefore a knowledge network can be automatically formulated based on the defined relationships, and the referential navigation between knowledge is enabled. 
To verify the validity of the proposed concepts, two real business process-oriented knowledge maps are presented as examples: the knowledge maps of the 'Business Trip Application' and 'Purchase Management' processes. The performance of the implemented ontology-based knowledge map was examined using 'DL-Query', a plug-in module provided with Protege-OWL. Two kinds of queries were tested: one to check whether the knowledge is networked with respect to the referential relations, and one to check whether the ontology-based knowledge network can infer further facts that are not literally described. The test results show that referential navigation between knowledge was correctly realized and that the additional inference was performed accurately.
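
The idea of deriving referential relationships from task inputs and outputs can be illustrated with a minimal plain-Python sketch. It is not the paper's OWL ontology or its DL-Query tests: the task names, knowledge names, and the rule that an output refers to the inputs of the task producing it are all assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical tasks for a 'Business Trip Application' process.
# All task and knowledge names are illustrative assumptions.
TASKS = {
    "plan_trip": {"inputs": ["travel_policy"], "outputs": ["trip_plan"]},
    "estimate_cost": {
        "inputs": ["trip_plan", "expense_rules"],
        "outputs": ["cost_estimate"],
    },
    "approve_trip": {"inputs": ["cost_estimate"], "outputs": ["approval_record"]},
}


def direct_references(tasks):
    """Map each output knowledge item to the input knowledge of its task."""
    refs = defaultdict(set)
    for task in tasks.values():
        for out in task["outputs"]:
            refs[out].update(task["inputs"])
    return refs


def inferred_references(tasks, item):
    """Collect all knowledge reachable from `item` by following references.

    This transitive closure stands in for the inference of facts that
    are not stated literally in the task descriptions.
    """
    refs = direct_references(tasks)
    seen, stack = set(), [item]
    while stack:
        current = stack.pop()
        for ref in refs.get(current, ()):
            if ref not in seen:
                seen.add(ref)
                stack.append(ref)
    return seen


if __name__ == "__main__":
    print(sorted(direct_references(TASKS)["cost_estimate"]))
    print(sorted(inferred_references(TASKS, "approval_record")))
```

In this sketch, only the link from `approval_record` to `cost_estimate` is stated directly; the links to `trip_plan`, `expense_rules`, and `travel_policy` are obtained by following the task sequence.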

Finding Weighted Sequential Patterns over Data Streams via a Gap-based Weighting Approach (발생 간격 기반 가중치 부여 기법을 활용한 데이터 스트림에서 가중치 순차패턴 탐색)

  • Chang, Joong-Hyuk
    • Journal of Intelligence and Information Systems
    • /
    • v.16 no.3
    • /
    • pp.55-75
    • /
    • 2010
  • Sequential pattern mining aims to discover interesting sequential patterns in a sequence database, and it is one of the essential data mining tasks, widely used in application fields such as Web access pattern analysis, customer purchase pattern analysis, and DNA sequence analysis. General sequential pattern mining considers only the generation order of the data elements in a sequence, so it can easily find simple sequential patterns but is limited in finding the more interesting sequential patterns needed in real-world applications. One of the essential research topics for overcoming this limit is weighted sequential pattern mining, in which not only the generation order of data elements but also their weights are considered to obtain more interesting sequential patterns. Recently, as data has increasingly taken the form of continuous data streams rather than finite stored data sets in various application fields, the database research community has begun to focus its attention on processing data streams. A data stream is a massive, unbounded sequence of data elements continuously generated at a rapid rate. In data stream processing, each data element should be examined at most once, and the memory used for the analysis should remain bounded even though new data elements are continuously generated. Moreover, newly generated data elements should be processed as fast as possible to keep the analysis result up to date, so that it can be used immediately upon request. To satisfy these requirements, data stream processing sacrifices some correctness of its analysis result by allowing some error. Considering these changes in the form of data generated in real-world application fields, much research has been actively conducted to find various kinds of knowledge embedded in data streams. 
These studies mainly focus on efficient mining of frequent itemsets and sequential patterns over data streams, which have proven useful in conventional data mining of finite data sets. In addition, mining algorithms have been proposed that efficiently reflect changes in data streams over time in their mining results. However, they have targeted naively interesting patterns, such as frequent patterns and simple sequential patterns, which are found intuitively, and have taken little interest in mining novel interesting patterns that better express the characteristics of the target data streams. Therefore, defining novel interesting patterns and developing a mining method that finds them can be a valuable research topic in the field of data stream mining, and such patterns can be used effectively to analyze recent data streams. This paper proposes a gap-based weighting approach for sequential patterns and a mining method for weighted sequential patterns over sequence data streams based on that approach. A gap-based weight of a sequential pattern can be computed from the gaps between data elements in the pattern without any pre-defined weight information. That is, the approach uses the gaps between data elements in each sequential pattern, as well as their generation order, to obtain the weight of the pattern, and can therefore help find more interesting and useful sequential patterns. Recently, most computer application fields generate data in the form of data streams rather than finite data sets. Considering this change, the proposed method mainly focuses on sequence data streams.
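
A minimal sketch of what a gap-based weight could look like is given below. The abstract does not state the weighting formula, so the one used here (the mean of the reciprocal gaps in the leftmost occurrence of a pattern) is purely an assumption for illustration, as are the function names. The support is accumulated in a single pass, so each sequence is examined only once.

```python
def first_occurrence(sequence, pattern):
    """Return positions of the leftmost occurrence of `pattern` as a
    subsequence of `sequence` (a list), or None if it does not occur."""
    positions = []
    start = 0
    for item in pattern:
        try:
            idx = sequence.index(item, start)
        except ValueError:
            return None
        positions.append(idx)
        start = idx + 1
    return positions


def gap_weight(positions):
    """Assumed weight: mean of 1/gap over consecutive matched positions.

    Adjacent elements (gap 1) give weight 1.0; wider gaps give less.
    A pattern with fewer than two elements has no gaps and gets 1.0.
    """
    if len(positions) < 2:
        return 1.0
    gaps = [b - a for a, b in zip(positions, positions[1:])]
    return sum(1.0 / g for g in gaps) / len(gaps)


def weighted_support(stream, pattern):
    """One-pass weighted support of `pattern` over an iterable of sequences."""
    total = 0.0
    count = 0
    for sequence in stream:
        count += 1
        positions = first_occurrence(sequence, pattern)
        if positions is not None:
            total += gap_weight(positions)
    return total / count if count else 0.0


if __name__ == "__main__":
    stream = [["a", "b"], ["a", "x", "b"], ["c"]]
    print(weighted_support(stream, ["a", "b"]))
```

Under this assumed formula, a pattern whose elements occur close together contributes more to the support than one whose elements are spread far apart, and no pre-defined item weights are needed.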