• Title/Summary/Keyword: data mining use factor

A review of big data analytics and healthcare (빅데이터 분석과 헬스케어에 대한 동향)

  • Moon, Seok-Jae;Lee, Namju
    • Journal of the Korean Applied Science and Technology / v.37 no.1 / pp.76-82 / 2020
  • Big data analysis in healthcare research appears to be a necessary strategy for the convergence of sports science and technology in the era of the Fourth Industrial Revolution. The purpose of this study is to provide a basic review that supports diverse forms of big data and healthcare convergence by discussing the concept, analysis methods, and application examples of big data. Text mining, data mining, opinion mining, process mining, cluster analysis, and social network analysis are currently used. Big data analysis makes it possible to identify high-risk factors for a given condition, determine specific health determinants of diseases, monitor biosignals, predict diseases, provide training and treatment, and analyze healthcare measurements. As further work, the characteristics of big data provide an appropriate basis for using promising software platforms to develop applications that can handle big data in healthcare and, even more so, in sports science.

Merchandise Management Using Web Mining in Business To Customer Electronic Commerce (기업과 소비자간 전자상거래에서의 웹 마이닝을 이용한 상품관리)

  • 임광혁;홍한국;박상찬
    • Journal of Intelligence and Information Systems / v.7 no.1 / pp.97-121 / 2001
  • Until now, one supposed advantage of the cyber market has been that it can display and sell goods virtually, because it does not necessarily need to maintain expensive physical shops and inventories. In a highly competitive environment, however, a business model that does away with goods in stock must be modified. As the case of Amazon shows, leading companies already consider merchandise management a critical success factor in their business model. That is, merchandise management is a way to compete in a highly competitive environment, just as in the traditional retail market. Unlike the traditional retail market, the cyber market has not only past sales data but also pre-sale web log data, which record the paths customers follow as they search and purchase. If we can correctly analyze pre-sale patterns using web log data, we can better prepare for potential customers and manage inventories and merchandise effectively. We introduce a systematic analysis method that uses web mining, the application of data mining techniques to the World Wide Web, to extract useful data for merchandise management (demand forecasting, evaluation, and selection). We use various web mining techniques, such as clustering, association rule mining, and sequential pattern mining.
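
To make the association-rule step concrete, here is a minimal sketch that is not taken from the paper. It assumes web logs have already been reduced to per-visitor sessions (sets of viewed product pages); the session data, the function name `pair_rules`, and the thresholds are all hypothetical. It computes support and confidence for rules between pairs of items only.

```python
from collections import Counter
from itertools import combinations

# Hypothetical sessions: each set holds the product pages one visitor viewed.
sessions = [
    {"laptop", "mouse", "bag"},
    {"laptop", "mouse"},
    {"laptop", "bag"},
    {"mouse", "keyboard"},
    {"laptop", "mouse", "keyboard"},
]


def pair_rules(sessions, min_support=0.4, min_confidence=0.7):
    """Return sorted (antecedent, consequent, support, confidence) tuples for item pairs."""
    n = len(sessions)
    # How many sessions contain each single item.
    item_counts = Counter(item for s in sessions for item in s)
    # How many sessions contain each unordered pair of items.
    pair_counts = Counter(
        pair for s in sessions for pair in combinations(sorted(s), 2)
    )
    rules = []
    for (a, b), count in pair_counts.items():
        support = count / n  # share of sessions containing both items
        if support < min_support:
            continue
        # Test the rule in both directions: a -> b and b -> a.
        for lhs, rhs in ((a, b), (b, a)):
            confidence = count / item_counts[lhs]  # estimate of P(rhs | lhs)
            if confidence >= min_confidence:
                rules.append((lhs, rhs, support, confidence))
    return sorted(rules)


rules = pair_rules(sessions)
for lhs, rhs, support, confidence in rules:
    print(f"{lhs} -> {rhs}: support={support:.2f}, confidence={confidence:.2f}")
```

A rule such as laptop -> mouse with high confidence would suggest stocking or promoting the two items together; a full implementation would also consider itemsets larger than pairs.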

Application of data mining and statistical measurement of agricultural high-quality development

  • Yan Zhou
    • Advances in nano research / v.14 no.3 / pp.225-234 / 2023
  • In this study, we aim to use big data resources and statistical analysis to obtain reliable guidance for achieving high-quality, high-yield agricultural production. To this end, soil type data, rainfall and temperature data, and annual wheat production are collected for a specific region. Using statistical methods, the acquired data were cleaned to remove incomplete and defective records. We then applied several machine learning classification methods to distinguish between different factors and their influence on the final crop yield. The efficacy of the machine learning methods is discussed by comparing the models' predictions with actual crop yields using two statistical quantities, the correlation factor and the mean squared error. The results show that machine learning methods predict crop yields with high accuracy. Moreover, the random forest (RF) classification approach provides the best results among the classification methods used in this study.
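
The two evaluation quantities named in the abstract can be computed as in the sketch below. It is an illustration only: the yield values are invented, and the correlation is assumed here to be the Pearson coefficient, which the abstract does not state explicitly.

```python
import math


def mean_squared_error(actual, predicted):
    """Average of squared differences between actual and predicted values."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return covariance / (spread_x * spread_y)


# Hypothetical wheat yields (t/ha): observed values and one model's predictions.
actual = [3.1, 2.8, 3.6, 4.0, 3.3]
predicted = [3.0, 2.9, 3.4, 4.1, 3.5]

mse = mean_squared_error(actual, predicted)
r = pearson_r(actual, predicted)
print(f"MSE = {mse:.3f}, r = {r:.3f}")
```

A lower MSE and a correlation closer to 1 both indicate that a model's predictions track the observed yields more closely, which is how competing models can be ranked.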

The Application of Data Warehouse for Developing Construction Productivity Management System (건설생산성 관리 시스템 구축을 위한 데이터웨어하우스의 적용)

  • Oh, Se-Wook;Kim, Myoung-Ho;Kim, Young-Suk
    • Korean Journal of Construction Engineering and Management / v.7 no.2 s.30 / pp.127-137 / 2006
  • Productivity is important for evaluating the efficiency of work performed and of organizations in the construction industry. Productivity should be defined at the activity level rather than the macro level in order to use productivity data effectively and manage a project. The primary objective of this study is to develop a construction productivity management system using data warehouse, OLAP, and data mining technologies, which makes it easy to accumulate construction productivity data and perform multi-layer analysis. Effective use of the developed system is expected to make it possible to measure project results and to plan similar projects reliably.

Unstructured Data Processing Using Keyword-Based Topic-Oriented Analysis (키워드 기반 주제중심 분석을 이용한 비정형데이터 처리)

  • Ko, Myung-Sook
    • KIPS Transactions on Software and Data Engineering / v.6 no.11 / pp.521-526 / 2017
  • Big data comes in diverse formats and vast volumes and is generated very quickly, so it requires new management and analysis methods rather than traditional data processing methods. Text mining techniques can be used to extract useful information from unstructured, human-language text in online documents on social networks. Identifying trends in the political, economic, and cultural messages left on social media helps in understanding which topics users are interested in. In this study, text mining was performed on online news related to a given keyword using a topic-oriented analysis technique. We use Latent Dirichlet Allocation (LDA) to extract information from web documents and to analyze which subjects are of interest for a given keyword and which topics are related to which core values.

A Management Improvement Study by the Use Survey of an Academic Library - Focused on the Analysis of Circulation Records of the C-Academic Library Users - (대학도서관 이용조사를 통한 경영개선 연구 - C 대학도서관 이용자의 대출기록 분석을 중심으로-)

  • Yoo, Kyeong-Jong;Park, Il-Jong
    • Journal of the Korean Society for Information Management / v.24 no.3 / pp.93-117 / 2007
  • This paper collects and analyzes book and circulation data from the Library Automation System (LAS) of C-academic library and, based on the results, suggests a method that may be applied to Customer Relationship Management (CRM). The collected data comprised 269,387 bibliographic records of books, 12,281 patron records, and 39,269 circulation records. User identity, circulation frequency, total number of circulated books, and publication year were extracted from the circulation records as relation factors. These factors were then analyzed and verified using correlation coefficients.

A Study on the Basic Geometry Analysis of Abandoned Underground Mine Tunnels in Korea and Advanced Measuring-Analysis Technology for Underground Mine Cavities (한국의 폐광산 지하 채굴갱도 기초형상 분석 및 개선된 광산 지하공동 측정·분석 기술 연구)

  • Kim, Soo-Lo;Park, Sung-Bin;Choi, Byung-Hee;Yun, Jung-Mann;Jeong, Gyo-Cheol
    • Tunnel and Underground Space / v.26 no.6 / pp.455-465 / 2016
  • The collapse of underground voids developed for mining mineral resources can cause subsidence of the ground surface. To prevent ground subsidence, data such as maps or pictures of past mining sites are important information for current mine reclamation work. In particular, mine subsidence management has been based on mining maps and pictures. Mining area surveys, safety evaluation, and ground reinforcement normally depend on such information from past mining. During the Japanese colonial period and the 1960s, many mines were developed indiscriminately in Korea. However, the mining information from that time is of limited use, and in the future it will become even rarer. MIRECO intends to establish a realistic alternative. In this study, basic numerical information on developed mine tunnels in Korea is statistically reviewed, and advanced underground cavity measuring technology is studied. Data on 4,473 mine tunnel openings from 1,784 abandoned mines in Korea were collected and sorted. The analysis shows that small mine tunnel openings in Korea average 1.982 m in height and 1.959 m in width, with a mean shape factor of 0.485. This summary of numerical mine data will be helpful for understanding and researching abandoned mines in Korea. The development of measurement technology for abandoned mine cavities and tunnels is expected to facilitate more effective mine subsidence management in Korea.

Data Mining for the Effectiveness of Government Support Strategies for Technology Innovation in Service Sectors (서비스 부문의 기술혁신목적별 정부 지원제도의 활용도 분석 연구)

  • Hwang, Doo-Hyun;Kim, Woo-Jin;Sohn, So-Young
    • IE interfaces / v.21 no.2 / pp.237-246 / 2008
  • In today's competitive global environment, technological innovation is an important issue. Many countries are devising national-level strategies to strengthen industrial capacity in support of innovative companies. South Korea is no exception, and multiple strategies are in place to aid innovative development in the private sector. This study postulates that such national-level strategies are applied differently depending on the innovation goal pursued by the service sector in Korea. We use data mining methods to test this research hypothesis. Factor analysis is used to cluster various service companies, and association rules are used to find the relationships within each cluster. The results show that national-level strategies are underutilized and unequally distributed. This may be attributed to the disparity between the demands and needs of the private sector and the views of the government, which leads to underutilized and indistinguishable strategies.

Predicting movie audience with stacked generalization by combining machine learning algorithms

  • Park, Junghoon;Lim, Changwon
    • Communications for Statistical Applications and Methods / v.28 no.3 / pp.217-232 / 2021
  • The Korean film industry has matured, and the number of movies watched per capita has reached the highest level in the world. Since then, the industry's growth rate has been decreasing, and total annual movie sales even decreased slightly in 2018. The number of moviegoers is the primary driver of sales in the movie industry and also an important factor influencing additional sales. Thus, it is important to predict the size of movie audiences. In this study, we predict the cumulative audience of films using stacking, an ensemble method that combines all the algorithms used in the prediction. We use box office data from the Korea Film Council and web comment data from Daum Movie (www.movie.daum.net). This paper describes the collection and preprocessing of explanatory variables and explains the regression models used in stacking. The final stacking model outperforms on the test set in terms of RMSE.
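
The idea of stacking can be sketched in a few dozen lines. This is a toy illustration, not the paper's pipeline: the two base learners (a least-squares line and 1-nearest-neighbour), the no-intercept linear meta-learner, the single feature, and all data values are assumptions chosen to keep the example self-contained.

```python
import math


def fit_line(xs, ys):
    """Base learner 1: ordinary least-squares line y = a + b * x."""
    mean_x, mean_y = sum(xs) / len(xs), sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sxx
    a = mean_y - b * mean_x
    return lambda x: a + b * x


def fit_nearest(xs, ys):
    """Base learner 2: 1-nearest-neighbour regression."""
    pairs = list(zip(xs, ys))
    return lambda x: min(pairs, key=lambda p: abs(p[0] - x))[1]


BASE_LEARNERS = [fit_line, fit_nearest]


def out_of_fold(xs, ys, k=3):
    """Give each row base predictions from models that never saw that row."""
    n = len(xs)
    preds = [[0.0] * len(BASE_LEARNERS) for _ in range(n)]
    for fold in range(k):
        train = [i for i in range(n) if i % k != fold]
        held_out = [i for i in range(n) if i % k == fold]
        for j, fit in enumerate(BASE_LEARNERS):
            model = fit([xs[i] for i in train], [ys[i] for i in train])
            for i in held_out:
                preds[i][j] = model(xs[i])
    return preds


def fit_meta(preds, ys):
    """Meta-learner: least-squares weights (w1, w2), no intercept (2x2 normal equations)."""
    s11 = sum(p[0] * p[0] for p in preds)
    s12 = sum(p[0] * p[1] for p in preds)
    s22 = sum(p[1] * p[1] for p in preds)
    t1 = sum(p[0] * y for p, y in zip(preds, ys))
    t2 = sum(p[1] * y for p, y in zip(preds, ys))
    det = s11 * s22 - s12 * s12
    return ((t1 * s22 - t2 * s12) / det, (t2 * s11 - t1 * s12) / det)


def fit_stack(xs, ys):
    """Learn meta weights on out-of-fold predictions, then refit base models on all data."""
    weights = fit_meta(out_of_fold(xs, ys), ys)
    models = [fit(xs, ys) for fit in BASE_LEARNERS]
    return lambda x: sum(w * m(x) for w, m in zip(weights, models))


def rmse(actual, predicted):
    """Root mean squared error."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


# Hypothetical data: x = opening-week audience, y = cumulative audience (both in 10,000s).
xs = [5, 8, 12, 15, 20, 24, 30, 33, 40]
ys = [14, 20, 31, 35, 52, 60, 71, 85, 98]
stack = fit_stack(xs, ys)
holdout_x, holdout_y = [10, 28], [25, 70]
stack_rmse = rmse(holdout_y, [stack(x) for x in holdout_x])
print(f"held-out RMSE of the stacked model: {stack_rmse:.2f}")
```

Training the meta-learner on out-of-fold predictions rather than in-sample predictions is the key design choice: it prevents a base learner that memorizes its training data (such as nearest-neighbour) from receiving an inflated weight.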

Capacitively-coupled Resistivity Method - Applicability and Limitation (비접지식 전기비저항 탐사 - 적용성과 한계)

  • Lee Seong Kon;Cho Seong-Jun;Song Yoonho;Chung Seung-Hwan
    • Geophysics and Geophysical Exploration / v.5 no.1 / pp.23-32 / 2002
  • The capacitively-coupled resistivity (CCR) system is known to be very useful where galvanic contact with the earth is impossible, such as in areas covered with thick ice, snow, concrete, or asphalt. This system injects current into the earth non-galvanically, i.e., capacitively, through a line antenna and measures the potential difference in the same manner. We derived geometric factors for two types of antenna configuration and presented a method for processing the data obtained with the CCR system and converting them into a form suitable for conventional resistivity inversion analysis. The CCR system, however, has limitations in conductive or electrically noisy areas, since it is very difficult to inject sufficient current into the earth with this system as one can with a conventional resistivity system. This causes a low S/N ratio when acquiring data with the CCR system, so great care must be taken during acquisition. Additionally, uniform contact between the line antennas and the earth is a crucial factor in obtaining data with a good S/N ratio. Nevertheless, the CCR method enables continuous profiling over a survey line by dragging the entire system and will thus be useful for rapid investigation of the conductivity distribution in the shallow subsurface.
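
As a reminder of where a geometric factor enters, the general relation between a measured potential difference and apparent resistivity can be written as below. This is the standard textbook definition, not the paper's derivation: the abstract does not give the form of $K$ for either line-antenna configuration, so $K$ is left symbolic.

```latex
\rho_a = K \, \frac{\Delta V}{I}
```

Here $\rho_a$ is the apparent resistivity (in $\Omega\cdot\mathrm{m}$), $K$ is the geometric factor (in m) determined by the transmitter and receiver geometry, $\Delta V$ is the measured potential difference, and $I$ is the injected current. Once CCR readings are expressed as $\rho_a$ in this way, they can be passed to a conventional resistivity inversion.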