• Title/Summary/Keyword: Data Management Techniques

Using Data Mining Techniques to Predict Win-Loss in Korean Professional Baseball Games (데이터마이닝을 활용한 한국프로야구 승패예측모형 수립에 관한 연구)

  • Oh, Younhak;Kim, Han;Yun, Jaesub;Lee, Jong-Seok
    • Journal of Korean Institute of Industrial Engineers / v.40 no.1 / pp.8-17 / 2014
  • In this research, we employed various data mining techniques to build predictive models for win-loss prediction in Korean professional baseball games. Historical data on players and teams were obtained from the official materials provided on the KBO website. From the collected raw data, we prepared two additional datasets, in ratio and binary format respectively. The ratio dataset was generated by dividing the away team's records by the corresponding home team's records, while the binary dataset was obtained by comparing the record values. We applied seven classification techniques to the three (raw, ratio, and binary) datasets: decision tree, random forest, logistic regression, neural network, support vector machine, linear discriminant analysis, and quadratic discriminant analysis. Among the 21 (= 3 datasets $\times$ 7 techniques) prediction scenarios, the most accurate model was the random forest built on the binary dataset, with a prediction accuracy of 84.14%. We also observed that the ratio and binary datasets yielded better prediction models than the raw data. Using the variable selection capability of decision tree, random forest, and stepwise logistic regression, we found that annual salary, earned runs, strikeouts, pitcher's winning percentage, and bases on balls are important winning factors in a game. This research is distinct from existing studies in that we used three different types of data and various data mining techniques for win-loss prediction in Korean professional baseball games.
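
The ratio and binary encodings described in the abstract above can be sketched as follows. This is a minimal illustration, not the authors' code: the function names, the record fields, and the sample values are hypothetical, and the direction of the binary comparison (1 when the away team's value is larger) is an assumption, since the abstract only says the values are compared.

```python
from typing import Dict


def to_ratio(away: Dict[str, float], home: Dict[str, float]) -> Dict[str, float]:
    """Ratio encoding: away-team record divided by the home-team record.

    Assumes every home-team value is non-zero (hypothetical simplification).
    """
    return {name: away[name] / home[name] for name in away}


def to_binary(away: Dict[str, float], home: Dict[str, float]) -> Dict[str, int]:
    """Binary encoding: 1 if the away-team value is larger, else 0.

    The comparison direction is an assumption made for this sketch.
    """
    return {name: int(away[name] > home[name]) for name in away}


# Hypothetical per-game records for the two teams.
away_record = {"strikeouts": 8.0, "earned_runs": 3.0}
home_record = {"strikeouts": 4.0, "earned_runs": 6.0}

ratio_row = to_ratio(away_record, home_record)
binary_row = to_binary(away_record, home_record)
print(ratio_row)
print(binary_row)
```

Each encoded row would then be paired with the game outcome and passed to a classifier, which is the step the abstract reports for the seven techniques.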

Design of the Database Learning System based on Learner Management Techniques

  • Ahn, Jeong-Yong
    • Journal of the Korean Data and Information Science Society / v.15 no.4 / pp.707-716 / 2004
  • Recently, many application areas such as statistics and industrial engineering have become interested in effective database education. In this article we design and implement a database learning system based on learner management techniques. The system supports a personalized/team-centered learning environment, monitoring of learners' attitudes toward learning, and an assessment method.

Application of SE Management Techniques for Space Launch System Development (우주발사체 시스템 개발에 있어서의 SE관리기법 적용)

  • Jo, Mi-Ok;Jo, Byeong-Gyu;O, Beom-Seok;Park, Jeong-Ju;Jo, Gwang-Rae
    • Systems Engineering Workshop (시스템엔지니어링워크숍) / s.4 / pp.90-94 / 2004
  • System engineering (SE) management techniques applied to space launch system development are introduced in order to assess their current status and address their effectiveness. Management plans and guides are prepared for the work breakdown structure, data, configuration, interface control, quality assurance, procurement, reliability, risk, and verification/validation. Further improvement of the system engineering management plan (SEMP) is required to merge international cooperation into the current engineering management system.

A Survey of State-of-the-Art Multi-Authority Attribute Based Encryption Schemes in Cloud Environment

  • Reetu, Gupta;Priyesh, Kanungo;Nirmal, Dagdee
    • KSII Transactions on Internet and Information Systems (TIIS) / v.17 no.1 / pp.145-164 / 2023
  • Cloud computing offers a platform that is both adaptable and scalable, making it ideal for outsourcing data for sharing. Various organizations outsource their data to cloud storage servers to use management and sharing services. When organizations outsource their data, they lose direct control over it, which raises privacy and security concerns. Cryptographic encryption methods can secure the data against intruders as well as cloud service providers. Data owners may also specify access control policies so that only users who satisfy the policies can access the data. Attribute-based access control techniques are well suited to the cloud environment because they cover a large number of users from various domains. Multi-authority attribute-based encryption (MA-ABE) is one of the promising attribute-based access control techniques; it allows a data owner to enforce access policies on encrypted data. The main aim of this paper is to comprehensively survey various state-of-the-art MA-ABE schemes and explore features such as attribute and key management techniques, access policy structure and its expressiveness, revocation of access rights, policy updating techniques, privacy preservation techniques, fast decryption and computation outsourcing, and proxy re-encryption. Moreover, the paper presents a feature-wise comparison of all the pertinent schemes in the field. Finally, it summarizes research challenges and directions that need to be addressed in the near future.

Applications of Data Science Technologies in the Field of Groundwater Science and Future Trends (데이터 사이언스 기술의 지하수 분야 응용 사례 분석 및 발전 방향)

  • Jina Jeong;Jae Min Lee;Subi Lee;Woojong Yang;Weon Shik Han
    • Journal of Soil and Groundwater Environment / v.28 no.spc / pp.18-39 / 2023
  • The rapid development of geophysical exploration and hydrogeologic monitoring techniques has yielded a remarkable increase in datasets related to groundwater systems. The growing number of datasets contributes to the understanding of general aquifer characteristics such as groundwater yield and flow, but understanding complex heterogeneous aquifer systems is still a challenging task. Recently, applications of data science techniques have become popular in the fields of geophysical exploration and monitoring, and such attempts have also been extended to the groundwater field. This work reviews the current status of and advances in the use of data science in the groundwater field. The application of data science techniques facilitates effective and realistic analyses of aquifer systems and allows accurate prediction of aquifer system changes in response to extreme climate events. Because of these benefits, data science techniques have become an effective tool for establishing more sustainable groundwater management systems. It is expected that the techniques will further strengthen the theoretical framework of groundwater management to cope with upcoming challenges and limitations.

Optimization Methodology Integrated Data Mining and Statistical Method (데이터 마이닝과 통계적 기법을 통합한 최적화 기법)

  • Jung, Hey-Jin;Song, Suh-Ill
    • Proceedings of the Korean Society for Quality Management Conference / 2006.11a / pp.205-210 / 2006
  • Nowadays, manufacturing technology and the manufacturing environment are changing rapidly. With the development of computers and the expansion of technology, most manufacturing fields have been computerized. As a result, many quality characteristics are measured automatically and a great deal of data is generated every day. Given these changes, it is important for corporations to obtain useful information quickly from large volumes of data in order to lead in international competition. Statistical process control (SPC) techniques have so far been used as a problem-solving tool in manufacturing processes. However, these statistical methods have not been applied more extensively because they have many restrictions in realistic problems. In this paper, we aim to develop a more realistic and scientific statistical design technique that integrates data mining (DM) and statistical methods as an alternative for coping with these problems. The first step selects significant factors using DM techniques from manufacturing process data that include many factors, and the second step finds the process optimum after obtaining the estimated response function through response surface methodology (RSM), a statistical technique.

A Comparison of Data Extraction Techniques and an Implementation of Data Extraction Technique using Index DB -S Bank Case- (원천 시스템 환경을 고려한 데이터 추출 방식의 비교 및 Index DB를 이용한 추출 방식의 구현 -ㅅ 은행 사례를 중심으로-)

  • 김기운
    • Korean Management Science Review / v.20 no.2 / pp.1-16 / 2003
  • Previous research on data extraction and integration for data warehousing has concentrated mainly on relational DBMSs and partly on object-oriented DBMSs. It mostly describes issues related to change data (delta) capture and incremental update using the triggering technique of active database systems. However, little attention has been paid to data extraction from other types of source systems, such as hierarchical DBMSs, or from source systems without triggering capability. This paper argues, from a practical point of view, that to find appropriate data extraction techniques for different source systems we need to consider not only the types of information sources and the capabilities of ETT tools but also other factors of the source systems, such as operational characteristics (i.e., whether they support a DBMS log, a user log, no log, or timestamps) and DBMS characteristics (i.e., whether they have triggering capability). Having applied several different data extraction techniques (e.g., DBMS log, user log, triggering, timestamp-based extraction, file comparison) to S bank's source systems (e.g., IMS, DB2, ORACLE, and SAM files), we discovered that the data extraction techniques available in a commercial ETT tool do not completely support data extraction from the DBMS log of an IMS system. For such IMS systems, a new data extraction technique is proposed which first creates an Index database and then updates the data warehouse using the Index database. We illustrate this technique using an example application.
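
Of the extraction techniques listed in the abstract above, file comparison is the simplest to illustrate. The sketch below is a generic snapshot-differencing example, not the paper's Index DB method (whose details the abstract does not give); the function name, record layout, and sample keys are hypothetical.

```python
from typing import Dict, Tuple

Row = Dict[str, object]
Snapshot = Dict[str, Row]  # primary key -> row contents


def diff_snapshots(old: Snapshot, new: Snapshot) -> Tuple[Snapshot, Snapshot, Snapshot]:
    """Derive change data (deltas) by comparing two full extracts of a source file."""
    # Keys present only in the new extract are inserts.
    inserts = {key: row for key, row in new.items() if key not in old}
    # Keys present only in the old extract are deletes.
    deletes = {key: row for key, row in old.items() if key not in new}
    # Keys present in both extracts with different contents are updates.
    updates = {key: row for key, row in new.items() if key in old and old[key] != row}
    return inserts, updates, deletes


# Hypothetical account extracts taken on two consecutive days.
yesterday = {"A1": {"balance": 100}, "A2": {"balance": 250}, "A3": {"balance": 40}}
today = {"A1": {"balance": 100}, "A2": {"balance": 300}, "A4": {"balance": 75}}

inserts, updates, deletes = diff_snapshots(yesterday, today)
print("inserts:", inserts)
print("updates:", updates)
print("deletes:", deletes)
```

Only the three delta sets would need to be applied to the warehouse. The cost is that both full extracts must be read on every run, which is one reason log-based or trigger-based capture is preferred when the source system supports it.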

Optimization Methodology Integrated Data Mining and Statistical Method (데이터 마이닝과 통계적 기법을 통합한 최적화 기법)

  • Song, Suh-Ill;Shin, Sang-Mun;Jung, Hey-Jin
    • Journal of Korean Society for Quality Management / v.34 no.4 / pp.33-39 / 2006
  • These days, manufacturing technology and the manufacturing environment are changing rapidly. With the development of computers and the expansion of technology, most manufacturing fields have been computerized. To win in international competition, it is important for companies to obtain useful information quickly from vast amounts of data. Statistical process control (SPC) techniques have so far been used as a problem-solving tool in manufacturing processes. However, these statistical methods have not been applied more extensively because they have many restrictions in realistic problems, and they run into many difficulties when large amounts of data and many factors are analyzed. In this paper, we propose a more practical and efficient statistical design technique that integrates data mining (DM) and statistical methods as an alternative for addressing these problems. The first step selects significant factors using a DM feature selection algorithm from manufacturing process data that include many factors. The second step finds the process optimum after estimating the response function through response surface methodology (RSM), a statistical technique.
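
The second step described in the abstract above (estimating a response function and locating its optimum) can be sketched for a single factor as a least-squares quadratic fit. This is an illustrative simplification under stated assumptions, not the authors' procedure: the function names and the synthetic data are hypothetical, only one factor is modeled, and the DM feature-selection step is assumed to have already picked that factor.

```python
from typing import List, Sequence, Tuple


def solve_linear(a: Sequence[Sequence[float]], b: Sequence[float]) -> List[float]:
    """Solve a small linear system by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_quadratic(xs: Sequence[float], ys: Sequence[float]) -> Tuple[float, float, float]:
    """Least-squares fit of y = b0 + b1*x + b2*x^2 via the normal equations."""
    xtx = [[sum(x ** (i + j) for x in xs) for j in range(3)] for i in range(3)]
    xty = [sum(y * x ** i for x, y in zip(xs, ys)) for i in range(3)]
    b0, b1, b2 = solve_linear(xtx, xty)
    return b0, b1, b2


# Synthetic responses generated from y = 1 + 6x - x^2 (hypothetical data).
xs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
ys = [1.0 + 6.0 * x - x * x for x in xs]

b0, b1, b2 = fit_quadratic(xs, ys)
# Stationary point of the fitted curve; it is a maximum when b2 < 0.
x_opt = -b1 / (2.0 * b2)
print(b0, b1, b2, x_opt)
```

With several selected factors, the same idea extends to a full second-order model with cross terms, where the stationary point comes from solving a linear system in the fitted coefficients.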

A Study of Data Mining Techniques in Bankruptcy Prediction (데이터 마이닝 기법의 기업도산예측 실증분석)

  • Lee, Kidong
    • Journal of the Korean Operations Research and Management Science Society / v.28 no.2 / pp.105-127 / 2003
  • In this paper, four different data mining techniques, two neural networks and two statistical modeling techniques, are compared in terms of prediction accuracy in the context of bankruptcy prediction. In business settings, how to accurately detect the condition of a firm has been an important issue in the literature. Among neural networks, the backpropagation (BP) network and the Kohonen self-organizing feature map are selected and compared with each other, while among statistical modeling techniques, discriminant analysis and logistic regression are performed to provide performance benchmarks for the neural network experiment. The findings suggest that the BP network is the better choice among the data mining tools compared. This paper also identifies some distinctive characteristics of the Kohonen self-organizing feature map.

An Integrated Framework for Data Quality Management of Traffic Data Warehouses (고품질 데이터를 지원하는 교통데이터 웨어하우스 구축 기법)

  • Hwang, Jae-Il;Park, Seung-Yong;Nah, Yun-Mook
    • Journal of Korea Spatial Information System Society / v.10 no.4 / pp.89-95 / 2008
  • In this paper, we propose an integrated technique for managing data quality in traffic data warehousing environments. We describe how to collect data from operational databases, such as FTMS and ARTIS, and construct traffic data warehouses from them. We explain how to configure the traffic data warehouses efficiently. We also propose a quality management technique that provides high-quality traffic data for various analytical transactions. The proposed techniques can help provide high-quality traffic data to traffic-related users and researchers, thus reducing data preprocessing and evaluation costs.
