• Title/Summary/Keyword: 결정규칙 (decision rules)

Rough Set Analysis for Stock Market Timing (러프집합분석을 이용한 매매시점 결정)

  • Huh, Jin-Nyung;Kim, Kyoung-Jae;Han, In-Goo
    • Journal of Intelligence and Information Systems / v.16 no.3 / pp.77-97 / 2010
  • Market timing is an investment strategy used to obtain excess return from the financial market. In general, detecting market timing means determining when to buy and sell in order to earn an excess return from trading. In many market timing systems, trading rules serve as the engine that generates trade signals. On the other hand, some researchers have proposed rough set analysis as a suitable tool for market timing because, through its control function, it does not generate a trade signal when the market pattern is uncertain. Numeric data must be discretized for rough set analysis because rough sets accept only categorical data. Discretization searches for appropriate "cuts" in numeric data that determine intervals; all values that lie within an interval are transformed into the same value. In general, four discretization methods are used in rough set analysis: equal frequency scaling, expert's knowledge-based discretization, minimum entropy scaling, and naïve and Boolean reasoning-based discretization. Equal frequency scaling fixes the number of intervals, examines the histogram of each variable, and then determines cuts so that approximately the same number of samples falls into each interval. Expert's knowledge-based discretization determines cuts according to the knowledge of domain experts, gathered through literature review or interviews. Minimum entropy scaling recursively partitions the value set of each variable so that a local measure of entropy is optimized. Naïve and Boolean reasoning-based discretization first finds candidate cuts by naïve scaling of the data and then selects optimized discretization thresholds through Boolean reasoning. Although rough set analysis is promising for market timing, there is little research on how the various discretization methods affect trading performance. In this study, we compare stock market timing models that use rough set analysis with various data discretization methods. The research data are the KOSPI 200 from May 1996 to October 1998. The KOSPI 200 is the underlying index of the KOSPI 200 futures, the first derivative instrument in the Korean stock market. It is a market value weighted index consisting of 200 stocks selected by criteria on liquidity and their status in the corresponding industry, including manufacturing, construction, communication, electricity and gas, distribution and services, and financing. The total number of samples is 660 trading days. In addition, this study uses popular technical indicators as independent variables. The experimental results show that the most profitable method for the training sample is naïve and Boolean reasoning-based discretization, whereas expert's knowledge-based discretization is the most profitable for the validation sample. Moreover, expert's knowledge-based discretization produced robust performance on both the training and validation samples. We also compared rough set analysis with a decision tree, experimenting with C4.5 for this purpose. The results show that rough set analysis with expert's knowledge-based discretization produced more profitable rules than C4.5.
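
As a rough illustration of the first discretization method described above (not code from the paper), equal frequency scaling could be sketched as follows in Python; the function names and the choice of four intervals are illustrative assumptions.

    import numpy as np

    def equal_frequency_cuts(values, n_intervals=4):
        # Choose cuts so that roughly the same number of samples
        # falls into each interval (equal frequency scaling).
        quantiles = np.linspace(0, 1, n_intervals + 1)[1:-1]
        return np.quantile(np.asarray(values, dtype=float), quantiles)

    def discretize(values, cuts):
        # Map each numeric value to its interval index, so all values
        # within an interval share the same categorical value.
        return np.searchsorted(cuts, values)

    # Example: discretize one technical indicator over 660 trading days.
    indicator = np.random.randn(660)
    categories = discretize(indicator, equal_frequency_cuts(indicator))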

Improving the Accuracy of Document Classification by Learning Heterogeneity (이질성 학습을 통한 문서 분류의 정확성 향상 기법)

  • Wong, William Xiu Shun;Hyun, Yoonjin;Kim, Namgyu
    • Journal of Intelligence and Information Systems / v.24 no.3 / pp.21-44 / 2018
  • In recent years, the rapid development of internet technology and the popularization of smart devices have resulted in massive amounts of text data, produced and distributed through various media platforms such as the World Wide Web, Internet news feeds, microblogs, and social media. However, this enormous amount of easily obtained information lacks organization. This problem has raised the interest of many researchers seeking to manage such huge amounts of information, and it also requires professionals capable of classifying relevant information; hence, text classification was introduced. Text classification is a challenging task in modern data analysis, in which a text document must be assigned to one or more predefined categories or classes. In the text classification field, various techniques are available, such as K-Nearest Neighbor, Naïve Bayes, Support Vector Machine, Decision Tree, and Artificial Neural Network. However, when dealing with huge amounts of text data, model performance and accuracy become a challenge. The performance of a text classification model can vary according to the type of words used in the corpus and the type of features created for classification. Most attempts have been based on proposing a new algorithm or modifying an existing one, and such research can be said to have reached certain limits for further improvement. In this study, rather than proposing a new algorithm or modifying an existing one, we focus on finding a way to modify the use of data. It is widely known that classifier performance is influenced by the quality of the training data upon which the classifier is built. Real-world datasets frequently contain noise, which can affect the decisions made by classifiers built from such data. In this study, we consider that data from different domains, that is, heterogeneous data, may have noise characteristics that can be exploited in the classification process. A machine learning classifier is usually built under the assumption that the characteristics of the training data and the target data are the same or very similar. However, for unstructured data such as text, the features are determined by the vocabulary of the documents, so if the viewpoints of the training data and the target data differ, the features may appear different between the two. In this study, we attempt to improve classification accuracy by strengthening the robustness of the document classifier through artificially injecting noise into the process of constructing it. Data coming from various sources are likely to be formatted differently, which causes difficulties for traditional machine learning algorithms because they were not developed to recognize different types of data representation at one time and combine them into the same generalization. Therefore, in order to utilize heterogeneous data in the learning process of the document classifier, we apply semi-supervised learning. However, unlabeled data may degrade the performance of the document classifier. We therefore propose a method called Rule Selection-Based Ensemble Semi-Supervised Learning Algorithm (RSESLA) to select only the documents that contribute to improving the accuracy of the classifier. RSESLA creates multiple views by manipulating the features using different types of classification models and different types of heterogeneous data. The most confident classification rules are selected and applied for the final decision making. In this paper, three different types of real-world data sources were used: news, Twitter, and blogs.
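
The abstract does not spell out RSESLA's internals; as a hedged sketch of the underlying semi-supervised idea (adding only the unlabeled documents the classifier labels with high confidence), a single-view self-training loop might look like the following. All names, the 0.9 confidence threshold, and the choice of Naïve Bayes are illustrative assumptions, and RSESLA's multi-view rule selection is not reproduced.

    import numpy as np
    from scipy.sparse import vstack
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.naive_bayes import MultinomialNB

    def self_train(labeled_docs, labels, unlabeled_docs,
                   confidence=0.9, rounds=5):
        # Repeatedly add only those unlabeled documents that the current
        # classifier labels with high confidence (a simplified stand-in
        # for RSESLA's rule selection; the real method uses multiple views).
        vec = TfidfVectorizer()
        X = vec.fit_transform(list(labeled_docs) + list(unlabeled_docs))
        X_lab, X_unlab = X[:len(labeled_docs)], X[len(labeled_docs):]
        y = np.asarray(labels)
        remaining = np.arange(X_unlab.shape[0])
        clf = MultinomialNB().fit(X_lab, y)
        for _ in range(rounds):
            if remaining.size == 0:
                break
            proba = clf.predict_proba(X_unlab[remaining])
            confident = proba.max(axis=1) >= confidence
            if not confident.any():
                break
            picked = remaining[confident]
            X_lab = vstack([X_lab, X_unlab[picked]])
            y = np.concatenate([y, clf.classes_[proba[confident].argmax(axis=1)]])
            remaining = remaining[~confident]
            clf = MultinomialNB().fit(X_lab, y)
        return clf, vec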

Legal Status of Private Transaction Regarding the Geostationary Satellite Orbit (지구정지궤도의 사적 거래의 국제법상 지위에 관한 연구)

  • Shin, Hong Kyun
    • The Korean Journal of Air & Space Law and Policy / v.29 no.2 / pp.239-272 / 2014
  • The rights and obligations of the Member States of the ITU in the domain of international frequency management of the spectrum/orbit resource are incorporated in the Constitution and Convention of the ITU and in the Radio Regulations that complement them. These instruments contain the main principles and lay down the specific regulations governing the major elements, such as the rights and obligations of member administrations in obtaining access to the spectrum/orbit resource, as well as international recognition of these rights through the recording of frequency assignments and, as appropriate, any associated orbits, including the geostationary-satellite orbits used or intended to be used, in the Master International Frequency Register (MIFR). Coordination is a further step in the process leading up to notification of the frequency assignments for recording in the MIFR. This procedure is a formal regulatory obligation both for an administration seeking to assign a frequency in its network and for an administration whose existing or planned services may be affected by that assignment. The regulatory problem lies in allowing administrations to fulfill their "bringing into use" duty, and thereby preserve their filing, simply by putting any satellite, whatever its nationality or technical specification, into the filed orbit. This regulatory gap may result in the emergence of a secondary market for satellite orbits. Within the satellite orbit secondary market, the object of the transaction may be the satellite itself, the regulatory rights in rem, or the orbit registered in the MIFR. The recent case of the sale of the Koreasat is a typical example of an orbit transaction between private companies, the legality of which remains controversial from the perspective of international space law as well as international transaction law. It must be noted, however, that Koreasat 3 and its filed orbit are in fact for sale.

Methodology for Issue-related R&D Keywords Packaging Using Text Mining (텍스트 마이닝 기반의 이슈 관련 R&D 키워드 패키징 방법론)

  • Hyun, Yoonjin;Shun, William Wong Xiu;Kim, Namgyu
    • Journal of Internet Computing and Services / v.16 no.2 / pp.57-66 / 2015
  • Considerable research efforts are being directed towards analyzing unstructured data such as text files and log files using commercial and noncommercial analytical tools. In particular, researchers are trying to extract meaningful knowledge through text mining not only in business but also in many other areas such as politics, economics, and cultural studies. For instance, several studies have examined national pending issues by analyzing large volumes of text on various social issues. However, it is difficult to provide successful information services that can identify R&D documents on specific national pending issues. While users may specify certain keywords relating to national pending issues, they usually fail to retrieve appropriate R&D information, primarily due to discrepancies between these terms and the corresponding terms actually used in the R&D documents. Thus, we need an intermediate logic to overcome these discrepancies and to identify and package appropriate R&D information on specific national pending issues. To address this requirement, three methodologies are proposed in this study: a hybrid methodology for extracting and integrating keywords pertaining to national pending issues, a methodology for packaging R&D information that corresponds to national pending issues, and a methodology for constructing an associative issue network based on relevant R&D information. Data analysis techniques such as text mining, social network analysis, and association rules mining are utilized to establish these methodologies. Experimental results show that the keyword enhancement rate of the proposed integration methodology is about 42.8%. For the second objective, three key analyses were conducted, and a number of association rules between national pending issue keywords and R&D keywords were derived. The experiment for the third objective, issue clustering based on R&D keywords, is still in progress and is expected to yield tangible results in the future.
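
As a hedged sketch of the association-rule step mentioned above (the paper's actual pipeline, thresholds, and keyword sets are not given in the abstract), support and confidence for "issue keyword → R&D keyword" rules could be computed as follows; all names and data are illustrative.

    from collections import Counter

    def association_rules(documents, min_support=0.05, min_confidence=0.5):
        # Derive simple "keyword A -> keyword B" rules from keyword
        # co-occurrence, using standard support and confidence measures.
        n = len(documents)
        item_count, pair_count = Counter(), Counter()
        for keywords in documents:
            ks = set(keywords)
            item_count.update(ks)
            for a in ks:
                for b in ks:
                    if a != b:
                        pair_count[(a, b)] += 1
        rules = []
        for (a, b), c in pair_count.items():
            support, confidence = c / n, c / item_count[a]
            if support >= min_support and confidence >= min_confidence:
                rules.append((a, b, support, confidence))
        return sorted(rules, key=lambda r: -r[3])

    # Example: documents as keyword sets (issue terms and R&D terms mixed).
    docs = [{"fine dust", "air filter"}, {"fine dust", "air filter", "sensor"},
            {"fine dust", "sensor"}]
    for a, b, s, c in association_rules(docs, 0.3, 0.6):
        print(f"{a} -> {b} (support={s:.2f}, confidence={c:.2f})")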

Environmental Assessment of Contaminated Soils around Abandoned Mines using The Current Soil Quality Standards

  • Ahn, Joo-Sung;Kim, Kyoung-Woong
    • Journal of Environmental Policy / v.2 no.1 / pp.87-105 / 2003
  • Soil quality standards can serve as useful guidelines for the investigation, assessment, and remediation of contaminated soils by land use or by contaminant exposure pathway. In this study, the soil quality standards of several countries, including Korea, were reviewed with particular emphasis on heavy metals. In addition, the results of environmental contamination surveys of some abandoned mine areas in Korea were assessed against the soil quality standards. Countries such as the United Kingdom, the Netherlands, Germany, Canada, and Japan have developed soil quality standards as screening tools for contaminated soils, both for setting remediation priorities and for protecting human health and ecosystems. In Korea, with the enforcement of the Soil Environment Conservation Act (and its enforcement regulations) in 1996, soil quality standards were established, specific soil-contaminating facilities have been managed, and regular soil contamination surveys have been conducted; the Act was amended in 2001 to expand the list of regulated soil contaminants and to supplement the methods for soil contamination investigation, management, and remediation. Nevertheless, when the current standards for heavy metals (Cd, Cu, and Pb) are applied, mine wastes and the soils they have contaminated in abandoned mine areas may not be designated as contaminated even though they contain large amounts of heavy metals. The results of this study indicate that the 0.1N HCl extraction method extracts only some of the exchangeable and reducible chemical forms of heavy metals in soil and does not sufficiently dissolve the sulfide forms. Moreover, a strong acid extraction method is required to assess changes in the leachability of heavy metals in soil under long-term changes in environmental conditions. Therefore, for a sound environmental assessment of heavy metal contamination of soils in abandoned mine areas, it is suggested that a total-content analysis method be applied alongside the current analytical method and that new contamination standards be established accordingly.

Review on the Updated International Standard Banking Practice(ISBP745) (국제표준은행관행(ISBP 745)의 주요개정사항 분석과 시사점)

  • Kim, Dong-Yoon
    • International Commerce and Information Review / v.15 no.3 / pp.177-197 / 2013
  • The updated International Standard Banking Practice (ISBP), International Chamber of Commerce Publication No. 745, was published to bring its contents in line with UCP 600. ISBP 745 deals, in its sections A-Q, with issues that commonly arise in credit transactions and also explains some of the articles of UCP 600 in detail, with the proper interpretation. The update involved aligning the publication with UCP 600 and making certain technical adjustments in wording, such as "in conflict" instead of "inconsistent". Despite the update, one question remains: where the words "from" or "after" are used in regard to something other than a period of shipment or the maturity date of drafts, the UCP gives no guidance as to their meaning. The parties concerned in credit transactions are also required to have a full understanding of ISBP 745 and the articles of UCP 600 in order to prepare and examine documents in a way that minimizes discrepancies.

Hyper-Rectangle Based Prototype Selection Algorithm Preserving Class Regions (클래스 영역을 보존하는 초월 사각형에 의한 프로토타입 선택 알고리즘)

  • Baek, Byunghyun;Euh, Seongyul;Hwang, Doosung
    • KIPS Transactions on Software and Data Engineering / v.9 no.3 / pp.83-90 / 2020
  • Prototype selection offers the advantage of low learning time and storage space by selecting, from the training data, the minimum data representative of in-class partitions. This paper designs a new training data generation method using hyper-rectangles that can be applied to general classification algorithms. Each hyper-rectangular region contains no data of a different class and divides the space of its own class. The median of the data within a hyper-rectangle is selected as a prototype to form the new training data, and the size of the hyper-rectangle is adjusted to reflect the data distribution in the class region. A set cover optimization algorithm is proposed to select the minimum prototype set that represents the whole training data. The proposed method reduces the polynomial time complexity required by set cover optimization by using a greedy algorithm and a multiplication-free distance computation. In experimental comparison with hyper-sphere prototype selection, the proposed method is superior in terms of prototype rate and generalization performance.
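
As a sketch of the greedy set cover step named above (not the paper's exact algorithm; the hyper-rectangle construction and the multiplication-free distance computation are omitted), selecting prototype regions greedily might look like this; the example data are invented.

    def greedy_set_cover(universe, candidates):
        # Greedy set cover: repeatedly pick the candidate set covering the
        # most still-uncovered elements. Each candidate could be the set
        # of training points covered by one hyper-rectangle prototype.
        uncovered = set(universe)
        chosen = []
        while uncovered:
            best = max(range(len(candidates)),
                       key=lambda i: len(candidates[i] & uncovered))
            gain = candidates[best] & uncovered
            if not gain:
                break  # remaining points cannot be covered
            chosen.append(best)
            uncovered -= gain
        return chosen

    # Example: 6 training points, 4 candidate prototype regions.
    points = range(6)
    regions = [{0, 1, 2}, {2, 3}, {3, 4, 5}, {1, 4}]
    print(greedy_set_cover(points, regions))  # [0, 2]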

Reading Children's Mind from Digital Drawings based on Dominant Color Analysis using ART2 Clustering and Fuzzy Logic (ART2 군집화와 퍼지 논리를 이용한 디지털 그림의 색채 주조색 분석에 의한 아동 심리 분석)

  • Kim, Kwang-baek
    • Journal of the Korea Institute of Information and Communication Engineering / v.20 no.6 / pp.1203-1208 / 2016
  • For young children, who are not spontaneous or accurate in the verbal communication of their emotions and experiences, drawing is a good means of expressing their state of mind, and drawing analysis based on chromatics is therefore a traditional tool in art therapy. Recently, children have come to enjoy digital drawing via painting tools, so there is a growing need for an automatic digital drawing analysis tool based on chromatics and art therapy theory. In this paper, we propose such an analysis tool based on dominant color analysis. Technically, we use ART2 clustering and fuzzy logic to capture the fuzziness of the subject's state of mind as expressed in a digital drawing. The frequency of color usage is fuzzified with respect to the membership functions. After applying fuzzy logic to this fuzzified central vector, we determine the dominant and supporting colors of the drawing, and the child's state of mind is then analyzed according to the color-personality relationships established in Alschuler and Hattwick's historical research.
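
The abstract does not give the paper's membership functions; as an illustrative sketch only, fuzzifying color-usage ratios with a triangular membership function and picking the dominant color could look like this. The cut points 0.3/0.6/1.0 and the pixel counts are invented assumptions.

    import numpy as np

    def triangular(x, a, b, c):
        # Triangular fuzzy membership function peaking at b.
        return np.maximum(np.minimum((x - a) / (b - a), (c - x) / (c - b)), 0.0)

    def dominant_color(color_freq):
        # Fuzzify each color's usage ratio into a "high usage" membership
        # and pick the color with the strongest membership.
        total = sum(color_freq.values())
        best, best_high = None, -1.0
        for color, count in color_freq.items():
            high = triangular(count / total, 0.3, 0.6, 1.0)
            if high > best_high:
                best, best_high = color, high
        return best

    # Example: pixel counts per ART2 color cluster (illustrative).
    freqs = {"red": 420, "blue": 130, "yellow": 250}
    print(dominant_color(freqs))  # 'red'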

Maximum Profit Priority Goods First Loading Algorithm for Barge Loading Problem (바지선 적재 문제의 최대이득 물품 우선 적재 알고리즘)

  • Lee, Sang-Un
    • Journal of the Korea Society of Computer and Information / v.19 no.10 / pp.169-173 / 2014
  • It has not yet been determined conclusively whether NP-complete problems are in fact solvable in polynomial time. Guéret et al. obtain the optimal solution using linear programming with $O(m^4)$ time complexity for the barge loading problem, a kind of bin packing problem classified as nondeterministic polynomial time (NP)-complete. This paper, on the other hand, suggests a profit-priority loading rule with $O(m \log m)$ time complexity. We first determine the profit priority rank. Then we obtain an initial loading by loading goods in profit-priority order. Finally, in case of underloading, we balance the load against the barge capacity by swapping previously loaded goods with unloaded ones. Experimental results show that this algorithm reduces the $O(m^4)$ time complexity of linear programming to $O(m \log m)$ for the NP-complete barge loading problem.
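
As a minimal sketch of the profit-priority loading rule (the swap-based rebalancing step from the paper is omitted, and the data layout is an assumption), loading goods in descending profit order runs in $O(m \log m)$ because of the sort:

    def profit_priority_loading(goods, capacity):
        # Load goods in descending profit order until the barge capacity
        # is reached. `goods` is a list of (profit, weight) pairs;
        # returns the indices of the loaded goods.
        order = sorted(range(len(goods)), key=lambda i: -goods[i][0])
        loaded, used = [], 0.0
        for i in order:
            profit, weight = goods[i]
            if used + weight <= capacity:
                loaded.append(i)
                used += weight
        return loaded

    # Example: four goods as (profit, weight); capacity 10.
    goods = [(60, 5), (50, 4), (40, 6), (30, 3)]
    print(profit_priority_loading(goods, 10))  # [0, 1]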

Smart Affect Jewelry based on Multi-modal (멀티 모달 기반의 스마트 감성 주얼리)

  • Kang, Yun-Jeong
    • Journal of the Korea Institute of Information and Communication Engineering / v.20 no.7 / pp.1317-1324 / 2016
  • The jewelry uses the Arduino platform to express emotions through colors. For emotional color expression, Plutchik's Wheel of Emotions model is applied, exploiting the similarity between emotions and colors. The system receives readings from the temperature, light, sound, pulse, and gyro sensors of the smart jewelry, which can be easily accessed from a smartphone, and recognizes the wearer's emotion by applying ontology-based inference rules. The color corresponding to the emotion recognized in context is then applied to the smart LED jewelry. The emotion-color combination derived from the contextual information extracted by the recognition sensors is reflected in the built-in smart LED jewelry according to the wearer's emotions. By adding light to emotion, the smart jewelry can represent the emotions of a situation and can serve as a tool for expressing the wearer's intention.
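
As a purely illustrative sketch (the paper's ontology-based inference rules and sensor processing are not given in the abstract), a Plutchik-style emotion-to-LED-color lookup with invented thresholds and color assignments might look like this:

    # Illustrative Plutchik emotion-to-color table; the actual mapping
    # and inference rules used in the paper are not reproduced here.
    PLUTCHIK_COLORS = {
        "joy": (255, 223, 0),    # yellow
        "trust": (0, 200, 80),   # green
        "fear": (0, 100, 0),     # dark green
        "sadness": (0, 80, 255), # blue
        "anger": (255, 0, 0),    # red
    }

    def infer_emotion(pulse_bpm, sound_db):
        # Toy stand-in for the ontology-based inference: these thresholds
        # are invented for illustration only.
        if pulse_bpm > 110 and sound_db > 70:
            return "anger"
        if pulse_bpm > 100:
            return "fear"
        if sound_db > 60:
            return "joy"
        return "trust"

    def led_color(pulse_bpm, sound_db):
        return PLUTCHIK_COLORS[infer_emotion(pulse_bpm, sound_db)]

    print(led_color(pulse_bpm=115, sound_db=75))  # (255, 0, 0) -> red LED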