• Title/Abstract/Keywords: sequence information

Mining High Utility Sequential Patterns Using Sequence Utility Lists

  • 박종수
    • KIPS Transactions on Software and Data Engineering
    • /
    • Vol. 7, No. 2
    • /
    • pp.51-62
    • /
    • 2018
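  • High utility sequential pattern mining is regarded as an important research topic in data mining. Several algorithms have been proposed for it, but they run into the problem that the search space of high utility sequential pattern mining grows large. A tighter utility upper bound for a sequence can prune more unpromising patterns early in the search space. This paper proposes a new utility upper bound, the sequence expected utility (SEU), which is the maximum expected utility of a sequence and its descendant sequences. To maintain the information essential for mining high utility sequential patterns, a sequence utility list for each pattern is used as a new data structure. We propose High Sequence Utility List-Span (HSUL-Span), an algorithm that uses SEU to find high utility sequential patterns. Experimental results on synthetic and real datasets from different domains show that HSUL-Span generates considerably fewer candidate patterns and outperforms other algorithms in execution time.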

The Specification of Air-to-Air Combat Tactics Using UML Sequence Diagram

  • 박명환;오지현;김천영;설현주
    • Journal of the Korea Institute of Military Science and Technology
    • /
    • Vol. 24, No. 6
    • /
    • pp.664-675
    • /
    • 2021
  • Air force air-to-air combat tactics unfold at high speed in three-dimensional space. Specifying these tactics involves a considerable amount of information, which makes it challenging to describe their maneuvering procedures accurately. Natural language specification is unsuitable because of the intrinsic ambiguity of natural languages. This paper therefore proposes using the UML Sequence Diagram to describe air-to-air combat tactics. Since the current Sequence Diagram notation cannot express all aspects of the tactics, we extend its syntax to accommodate the features that air-to-air combat tactics require. We evaluate the applicability of the extended Sequence Diagram using a case example, the manned-unmanned teaming combat tactic. The results show that the Sequence Diagram specification is more advantageous than natural language specification in terms of readability, conciseness, and accuracy. However, its expressiveness is evaluated to be lower than that of natural language, and further study is required to address this issue.

The Sequence Labeling Approach for Text Alignment of Plagiarism Detection

  • Kong, Leilei;Han, Zhongyuan;Qi, Haoliang
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • Vol. 13, No. 9
    • /
    • pp.4814-4832
    • /
    • 2019
  • Plagiarism detection increasingly relies on text alignment, which extracts the plagiarized passages from a pair consisting of a suspicious document and its source document. Heuristic methods have achieved excellent performance in text alignment. However, further improvement of these heuristics depends mainly on the experience of experts, which limits their capacity for continuous improvement. Machine learning may be a suitable way to address this problem. Considering the positional relations and the context of pairs of text segments, we formalize the text alignment task as a sequence labeling problem, improving on current methods at the model level. Specifically, this paper proposes using a probabilistic graphical model to tag the observed sequence of text segment pairs, and presents a sequence labeling approach for text alignment in plagiarism detection based on Conditional Random Fields. The proposed approach is evaluated on the PAN@CLEF 2012 artificial high obfuscation plagiarism corpus and the simulated paraphrase plagiarism corpus, and compared with the methods that achieved the best performance in PAN@CLEF 2012, 2013 and 2014. Experimental results demonstrate that the proposed approach significantly outperforms the state-of-the-art methods.

Improving Malicious Web Code Classification with Sequence by Machine Learning

  • Paik, Incheon
    • IEIE Transactions on Smart Processing and Computing
    • /
    • Vol. 3, No. 5
    • /
    • pp.319-324
    • /
    • 2014
  • Web applications make life more convenient. Many web applications accept several kinds of user input (e.g., personal information, a user's comments on commercial goods). However, the input functions of web applications have a range of vulnerabilities, and the free accessibility of many web applications allows malicious actions to be attempted. Attackers can exploit these input vulnerabilities by injecting malicious web code, which enables a variety of illegal actions, such as SQL Injection Attacks (SQLIAs) and Cross Site Scripting (XSS). These actions amount to theft, replacement of personal information, or phishing. Existing solutions use a parser for the code, are limited to fixed and very small patterns, and are difficult to adapt to variations. A machine learning method can cover a far broader range of malicious web code and adapts easily to variations and changes. Therefore, this paper proposes adaptable classification of malicious web code by machine learning approaches to detect the exploitation of user inputs. The approach usually identifies "looks-like malicious" code as a proxy for real malicious code. A more detailed classification using sequence information is also introduced. Precision is 99% for the "looks-like malicious" classification and 90% for the more precise classification using sequence information.

Sequence driven features for prediction of subcellular localization of proteins

  • 김종경;최승진
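    • Korean Institute of Information Scientists and Engineers: Conference Proceedings
    • /
    • Korean Institute of Information Scientists and Engineers, Proceedings of the 2005 Korea Computer Congress, Vol. 32, No. 1 (B)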
    • /
    • pp.226-228
    • /
    • 2005
  • Predicting the cellular location of an unknown protein gives valuable information for inferring its possible function. A more accurate prediction system needs a good feature extraction method that transforms raw sequence data into a numerical feature vector while minimizing information loss. In this paper we propose new methods of extracting underlying features from the sequence data alone by computing pairwise sequence alignment scores. In addition, we use composition-based features to improve prediction accuracy. To construct an SVM ensemble from separately trained SVM classifiers, we propose specificity-based weighted majority voting. The overall prediction accuracy evaluated by 5-fold cross-validation reached $88.53\%$ for the eukaryotic animal data set. By comparing the prediction accuracy of various feature extraction methods, we gain biological insight into the location of targeting information. Our numerical experiments confirm that the new feature extraction methods are very useful for predicting subcellular localization of proteins.

Development of the Recommender System of Arabic Books Based on the Content Similarity

  • Alotaibi, Shaykhah Hajed;Khan, Muhammad Badruddin
    • International Journal of Computer Science & Network Security
    • /
    • Vol. 22, No. 8
    • /
    • pp.175-186
    • /
    • 2022
  • This research article develops a content-similarity-based recommendation system for Arabic books that helps users search for the right book and predicts books suited to their literary style. The system directs users toward books that meet their needs from a large dataset. It makes its predictions from data gathered from different books, which it converts to vectors using TF-IDF. Recommendation algorithms, namely cosine similarity, sequence matcher similarity, and semantic similarity, then aggregate the data to produce an efficient and effective recommendation. This approach is advantageous for recommending previously unrated books to users with unique interests. The results show that cosine similarity over the full content of books, sequence matcher similarity over the Arabic titles, and semantic similarity over the English titles give the best results, which are extremely close to the average of the human-assigned (annotated) similarity. A Flask web application with a simple interface is developed to show the Arabic books recommended by the cosine similarity, sequence matcher similarity, and semantic similarity algorithms across all the experiments conducted.
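
As a rough illustration of two of the similarity measures named in this abstract, the sketch below computes a TF-IDF cosine similarity over book content and a sequence-matcher similarity over titles. It is not the authors' implementation. The whitespace tokenizer, the smoothed IDF formula, and the helper names `tfidf_vectors`, `cosine`, and `title_similarity` are assumptions made for this example, and using Python's `difflib.SequenceMatcher` for the "sequence matcher similarity" is a guess about what the abstract refers to.

```python
import math
from collections import Counter
from difflib import SequenceMatcher


def tfidf_vectors(docs):
    """Return one {term: weight} dict per document.

    Assumed weighting: tf * (log(N / df) + 1), with whitespace tokenization.
    """
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        vectors.append({
            term: (count / len(tokens)) * (math.log(n_docs / df[term]) + 1.0)
            for term, count in tf.items()
        })
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse {term: weight} vectors."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def title_similarity(a, b):
    """Character-level similarity of two titles, in the range [0, 1]."""
    return SequenceMatcher(None, a, b).ratio()
```

Under these assumptions, a recommender could rank candidate books by `cosine` over full-content vectors and fall back to `title_similarity` when only titles are available.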

M-sequence and its applications to nonlinear system identification

  • Kashiwagi, Hiroshi
    • Institute of Control, Robotics and Systems: Conference Proceedings
    • /
    • Institute of Control, Robotics and Systems, 1994, Proceedings of the Korea Automatic Control Conference, 9th (KACC); Taejeon, Korea; 17-20 Oct. 1994
    • /
    • pp.7-12
    • /
    • 1994
  • This paper outlines the pseudorandom M-sequence and its applications to measurement and control engineering. First, the generation and properties of the M-sequence are briefly described, followed by its applications to delay time measurement, information transmission using the M-array, two-dimensional positioning, fault detection of logic circuits, fault detection of RAM, and linear and nonlinear system identification.
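
The generation and correlation property mentioned in this abstract can be sketched with a small linear feedback shift register. The example below is illustrative only and is not taken from the paper. It assumes the degree-4 primitive polynomial x^4 + x + 1, which corresponds to the recurrence a[k+4] = a[k+1] XOR a[k]. The function names and the `taps` convention are choices made for this sketch.

```python
def m_sequence(taps, n, seed=None):
    """Generate one period (2**n - 1 bits) of a shift-register sequence.

    The recurrence is a[k+n] = XOR of a[k+t] for every t in taps.
    The output is maximal-length only if the taps match a primitive polynomial.
    """
    state = list(seed) if seed is not None else [1] + [0] * (n - 1)
    out = []
    for _ in range(2 ** n - 1):
        out.append(state[0])
        feedback = 0
        for t in taps:
            feedback ^= state[t]
        # Shift left by one position and append the feedback bit.
        state = state[1:] + [feedback]
    return out


def periodic_autocorrelation(bits):
    """Periodic autocorrelation of the sequence after mapping 0 -> +1, 1 -> -1."""
    s = [1 - 2 * b for b in bits]
    n = len(s)
    return [sum(s[i] * s[(i + k) % n] for i in range(n)) for k in range(n)]


# Assumed example: x^4 + x + 1, i.e. a[k+4] = a[k+1] XOR a[k].
seq = m_sequence(taps=(0, 1), n=4)
acf = periodic_autocorrelation(seq)
```

For a maximal-length sequence the autocorrelation equals the period at zero shift and -1 at every other shift. That two-valued shape is what makes such sequences useful for delay measurement and system identification.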

MODEL FOR DESIGN MANAGEMENT IN COLLABORATIVE ENVIRONMENT USING DESIGN STRUCTURE MATRIX AND DESIGN PARAMETERS' INFORMATION

  • Salman Akram;Jeonghwan Kim;Jongwon Seo
    • International Conference Proceedings
    • /
    • The 3rd International Conference on Construction Engineering and Project Management
    • /
    • pp.1307-1312
    • /
    • 2009
  • Design is an act based on multidisciplinary information. The involvement of various stakeholders makes it difficult to process, plan, and integrate. Iteration is frequent in most engineering design and development projects, including construction. Design iterations cause rework, and extra effort is required to find the optimal sequence and to manage the projects. Simple project management techniques are insufficient to fulfill the requirements of integrated design. This paper covers two things: the design structure matrix and a model based on design parameters' information. Emphasis is given to the optimal sequence and crucial iterations, obtained using the design structure matrix analysis technique. Design projects were studied using survey data from industry, and the resulting optimal sequence and crucial iterations were used for the proposed model. The model integrates two kinds of information: the key design parameters produced and required, and the design changes made during the design process. It will help practitioners become familiar with design management in order to fulfill contemporary needs.
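
To make the idea of sequencing with a design structure matrix (DSM) more concrete, here is a minimal sketch. It is not the model proposed in the paper. It assumes a binary DSM in which `dsm[i][j] == 1` means task `i` needs information from task `j`, and it uses a simple layered ordering. The task names and the function `sequence_dsm` are hypothetical.

```python
def sequence_dsm(names, dsm):
    """Order tasks so that each one follows the tasks it depends on.

    Returns (ordered, blocked). `blocked` lists the tasks that cannot be
    scheduled because they lie in, or downstream of, a dependency cycle,
    which is where design iteration occurs.
    """
    remaining = set(range(len(names)))
    order = []
    while remaining:
        # A task is ready when none of its inputs are still unscheduled.
        ready = sorted(
            i for i in remaining
            if not any(dsm[i][j] for j in remaining if j != i)
        )
        if not ready:
            break  # every remaining task waits on another remaining task
        order.extend(ready)
        remaining -= set(ready)
    return [names[i] for i in order], sorted(names[i] for i in remaining)


# Hypothetical example: B and C depend on each other, so they form an iteration.
names = ["A", "B", "C", "D"]
dsm = [
    [0, 0, 0, 0],  # A needs nothing
    [1, 0, 1, 0],  # B needs A and C
    [0, 1, 0, 0],  # C needs B
    [1, 0, 0, 0],  # D needs A
]
ordered, blocked = sequence_dsm(names, dsm)
```

In this toy case `ordered` is `["A", "D"]` and `blocked` is `["B", "C"]`. A fuller DSM analysis would also decide where to break the coupled block, which this sketch does not attempt.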

Performance Comparison and Improvement of STDR/SSTDR Schemes Using Various Sequences

  • 한정재;박소령
    • The Journal of Korean Institute of Communications and Information Sciences
    • /
    • Vol. 39A, No. 11
    • /
    • pp.637-644
    • /
    • 2014
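  • This paper compares the fault location detection performance of STDR (sequence time domain reflectometry) and SSTDR (spread spectrum time domain reflectometry) schemes using sequences of various lengths and types, and proposes an applied-signal elimination method to improve the performance of the SSTDR scheme. Using the m-sequence, a representative PN (pseudo-noise) sequence, together with the binary Barker sequence and the four-phase Frank sequence, which have excellent autocorrelation properties, we compare and analyze false detection rates in a power line channel model while varying the fault type, the fault location, and whether the proposed method is used. Simulations confirm that the proposed applied-signal elimination method greatly improves fault location detection performance when attenuation is severe and when the fault is very close.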
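
The correlation-based detection described in this abstract can be illustrated with a toy example. The sketch below is not the authors' simulation. It assumes a noise-free line with a single reflection, a length-13 Barker sequence, and an idealized elimination step that simply subtracts the known applied signal before correlating. All function names and parameter values are hypothetical.

```python
BARKER13 = [1, 1, 1, 1, 1, -1, -1, 1, 1, -1, 1, -1, 1]


def simulate_line(seq, delay, reflection, length):
    """Toy noise-free line: the applied signal plus one delayed, scaled echo."""
    rx = [0.0] * length
    for i, chip in enumerate(seq):
        rx[i] += chip                       # applied (incident) signal
        rx[i + delay] += reflection * chip  # reflection from the fault
    return rx


def correlate(rx, seq):
    """Sliding correlation of the received signal with the known sequence."""
    n_lags = len(rx) - len(seq) + 1
    return [sum(rx[lag + i] * c for i, c in enumerate(seq))
            for lag in range(n_lags)]


def estimate_delay(rx, seq, remove_applied=True):
    """Return the lag with the largest correlation magnitude."""
    if remove_applied:
        # Assumed idealization: subtract the known applied signal exactly.
        rx = [r - (seq[i] if i < len(seq) else 0.0) for i, r in enumerate(rx)]
    corr = correlate(rx, seq)
    return max(range(len(corr)), key=lambda lag: abs(corr[lag]))


# Hypothetical scenario: a weak, inverted echo only 5 samples away.
rx = simulate_line(BARKER13, delay=5, reflection=-0.3, length=40)
with_removal = estimate_delay(rx, BARKER13, remove_applied=True)
without_removal = estimate_delay(rx, BARKER13, remove_applied=False)
```

Here `with_removal` recovers the 5-sample delay, while `without_removal` locks onto lag 0 because the applied signal dominates the correlation. In a real system the round-trip delay would be converted to a distance as propagation velocity times delay divided by two.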