• Title/Summary/Keyword: 유추 예측 (analogical inference, prediction)

Search Result 121, Processing Time 0.047 seconds

Korean Symptom-Based Disease Prediction Model according to Input Data Format and Positive/Negative (입력 데이터 형식 및 Positive/Negative에 따른 한국어 증상 기반 질병 예측 모델)

  • Min-Jung Kim;In-Whee Joe
    • Proceedings of the Korea Information Processing Society Conference / 2023.11a / pp.418-421 / 2023
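  • This paper presents a Korean symptom-based disease prediction model built with Word2Vec. Data crawled from the Asan Medical Center disease encyclopedia is divided into three formats to find the format best suited to the model, which is then applied to the model. The best-fitting format is the one that combines diseases listed per symptom with symptoms listed per disease. Increasing the amount of data widened the embedding space, and the similarity between the most important symptoms and diseases was output accurately. This indicates that highly similar diseases and symptoms were learned properly. With the resulting prediction model, we confirmed that entering a symptom as positive raises the similarity, while entering it as negative lowers it. Thus, entering the patient's symptoms as positive brings diseases with those symptoms closer, whereas entering symptoms the patient does not have as negative pushes away diseases that do not fit the patient. The model can therefore infer diseases that match the patient's condition, which is useful when doctors or patients want to know which disease corresponds to a symptom, or for search. In addition, by adding data on the medical department for each disease, it can help find the appropriate department for a patient.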
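The positive/negative behavior described in this abstract can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the authors' model: the three-dimensional vectors and the symptom and disease names are made up, and the query is formed as the mean of unit-length positive vectors and sign-flipped negative vectors, which is one common way such similarity queries are combined.

```python
import math

# Hypothetical toy embeddings; a trained Word2Vec model would supply real ones.
EMBEDDINGS = {
    "cough": [0.9, 0.1, 0.0],
    "fever": [0.7, 0.3, 0.1],
    "rash": [0.0, 0.2, 0.9],
    "influenza": [0.8, 0.3, 0.0],
    "measles": [0.4, 0.3, 0.8],
}
DISEASES = ["influenza", "measles"]


def unit(v):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def cosine(a, b):
    """Cosine similarity between two vectors."""
    return sum(x * y for x, y in zip(unit(a), unit(b)))


def query_vector(positive, negative=()):
    """Average the positive vectors (+1) and the negative vectors (-1)."""
    signed = [(unit(EMBEDDINGS[w]), 1.0) for w in positive]
    signed += [(unit(EMBEDDINGS[w]), -1.0) for w in negative]
    dim = len(signed[0][0])
    return [sum(sign * vec[i] for vec, sign in signed) / len(signed)
            for i in range(dim)]


def rank_diseases(positive, negative=()):
    """Return (disease, similarity) pairs, most similar first."""
    q = query_vector(positive, negative)
    scores = {d: cosine(q, EMBEDDINGS[d]) for d in DISEASES}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


if __name__ == "__main__":
    print(rank_diseases(["cough", "fever"]))
    print(rank_diseases(["cough", "fever"], negative=["rash"]))
```

In this toy setup, adding "rash" as a negative lowers the similarity of "measles", and adding it as a positive raises it, mirroring the effect the abstract reports.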

Prediction of Protein Subcellular Localization using Label Power-set Classification and Multi-class Probability Estimates (레이블 멱집합 분류와 다중클래스 확률추정을 사용한 단백질 세포내 위치 예측)

  • Chi, Sang-Mun
    • Journal of the Korea Institute of Information and Communication Engineering / v.18 no.10 / pp.2562-2570 / 2014
  • Knowledge of a protein's subcellular localization is one of the most important clues for inferring the function of unknown proteins. Recently, considerable research has addressed the prediction of subcellular localization for proteins that exist at multiple subcellular locations simultaneously. In this paper, label power-set classification is improved to predict multiple subcellular localizations accurately. The multi-labels predicted by the label power-set classifier are combined with their prediction probabilities to give the final result. To obtain accurate multi-class probability estimates, this paper employs the pair-wise comparison and error-correcting output codes frameworks. Prediction experiments on protein subcellular localization show a significant performance improvement.
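The label power-set idea can be sketched as follows. The transformation itself (one class per distinct label set) is standard; the way class probabilities are turned into per-label scores here (summing over every label set that contains the label, then thresholding) is an assumption for illustration and not necessarily the combination rule used in the paper. The class probabilities are taken as given; the pair-wise comparison and error-correcting output code estimators are not shown.

```python
from collections import defaultdict


def to_powerset_classes(label_sets):
    """Map each distinct label set to one class id (label power-set transform)."""
    class_of = {}
    y = []
    for labels in label_sets:
        key = frozenset(labels)
        if key not in class_of:
            class_of[key] = len(class_of)
        y.append(class_of[key])
    return y, class_of


def label_scores(class_probs, class_of):
    """Sum class probabilities over every label set that contains each label."""
    set_of = {cid: key for key, cid in class_of.items()}
    scores = defaultdict(float)
    for cid, p in enumerate(class_probs):
        for label in set_of[cid]:
            scores[label] += p
    return dict(scores)


def predict_labels(class_probs, class_of, threshold=0.5):
    """Keep every label whose summed score reaches the (hypothetical) threshold."""
    scores = label_scores(class_probs, class_of)
    return {label for label, s in scores.items() if s >= threshold}


if __name__ == "__main__":
    train = [{"nucleus"}, {"cytoplasm"}, {"nucleus", "cytoplasm"}, {"nucleus"}]
    y, class_of = to_powerset_classes(train)
    print(y)
    print(predict_labels([0.2, 0.3, 0.5], class_of))
```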

A Data-based Sales Forecasting Support System for New Businesses (데이터기반의 신규 사업 매출추정방법 연구: 지능형 사업평가 시스템을 중심으로)

  • Jun, Seung-Pyo;Sung, Tae-Eung;Choi, San
    • Journal of Intelligence and Information Systems / v.23 no.1 / pp.1-22 / 2017
  • Analysis of future business or investment opportunities, such as business feasibility analysis and company or technology valuation, requires objective estimation of the relevant market and expected sales. While there are various ways to classify methods for estimating new sales or market size, they can be broadly divided into top-down and bottom-up approaches according to their benchmark references. Both approaches, however, require considerable resources and time. We therefore propose a data-based intelligent demand forecasting system to support the evaluation of new businesses. This study focuses on analogical forecasting, one of the traditional quantitative forecasting methods, to develop an intelligent sales forecasting system for new businesses. Instead of simply estimating sales for a few years, we propose estimating the sales of a new business from the initial sales and sales growth rates of similar companies. To demonstrate the appropriateness of this method, we examine whether the sales performance of recently established companies in the same industry category in Korea can serve as a reference variable for analogical forecasting. We examined whether "mean reversion" is observed in the sales of start-up companies, in order to identify errors in estimating new-business sales from industry sales growth rates, and whether differences in business environment arising from different launch timing affect the growth rate. We also conducted analysis of variance (ANOVA) and latent growth model (LGM) analysis to identify differences in sales growth rates by industry category. Based on the results, we propose industry-specific range and linear forecasting models.
This study analyzed the sales of 150,000 start-up companies in Korea over the last 10 years and found that the average growth rate of start-ups in Korea is higher than the industry average in the first few years but soon exhibits mean reversion. In addition, although the time of founding affects the sales growth rate, the effect is not highly significant, and the sales growth rate can differ by industry classification. Using both this phenomenon and the performance of start-up companies in the relevant industries, we propose two models of new-business sales based on the sales growth rate. The proposed method makes it possible to estimate the sales of a new business by industry objectively and quickly, and it is expected to provide reference information for judging whether sales estimated by other methods (top-down or bottom-up) fall outside the bounds of ordinary cases in the relevant industry. In particular, the results can serve as practical reference information for business feasibility analysis or technology valuation when entering a new business. With the existing top-down method, they can be used to set the range of market size or market share. With the bottom-up method, the estimation period can be set in accordance with the mean-reverting period of the growth rate. The two proposed models will enable rapid and objective sales estimation for new businesses and are expected to improve the efficiency of business feasibility analysis and technology valuation through the development of an intelligent information system. From an academic perspective, it is an important finding that mean reversion is observed among start-up companies, which are ordinary small and medium enterprises (SMEs), as well as among stable companies such as listed companies.
In particular, the significance of this study lies in showing, on large-scale data, that the mean reversion of start-up firms' sales growth rates differs from that of listed companies and that it differs by industry. While the linear model, which is useful for estimating the sales of a specific company, is likely to be used in practice, the range model, which can be used to estimate the sales of unspecified firms, is likely to be used in policy-making. This implies that, when analyzing the business activities and performance of a specific industry group or enterprise group, the range model has policy value in that a data-based start-up sales forecasting system can provide references and enable comparison.
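The general idea of projecting sales from initial sales and a growth rate that reverts toward an industry average can be sketched as below. This is only an illustration of analogical forecasting with mean reversion: the function names, the `reversion_speed` parameter, and the min/max band are assumptions made for the example and are not the paper's range or linear models.

```python
def forecast_sales(initial_sales, start_growth, industry_growth,
                   reversion_speed, years):
    """Project yearly sales while the growth rate reverts to the industry mean.

    reversion_speed is a hypothetical parameter in [0, 1]: 0 keeps the
    start-up growth rate forever, 1 jumps to the industry rate after one year.
    """
    sales = [initial_sales]
    growth = start_growth
    for _ in range(years):
        sales.append(sales[-1] * (1.0 + growth))
        # Move the growth rate part of the way toward the industry average.
        growth += reversion_speed * (industry_growth - growth)
    return sales


def range_forecast(initial_sales, peer_growth_rates, years):
    """Low/high band from the slowest and fastest growing peer companies."""
    low, high = min(peer_growth_rates), max(peer_growth_rates)
    return [(initial_sales * (1.0 + low) ** t,
             initial_sales * (1.0 + high) ** t)
            for t in range(years + 1)]


if __name__ == "__main__":
    # Made-up numbers: 50% early growth reverting to a 10% industry average.
    print(forecast_sales(100.0, 0.5, 0.1, 0.5, 5))
    print(range_forecast(100.0, [0.05, 0.2, 0.4], 3))
```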

Contextual Inquiry on Multi-tasking Using a Mobile Phone (모바일폰에서의 멀티태스킹 사용 맥락조사)

  • Chung, Seung-Eun;Rhee, Jeong-Yoon;Lee, Shin-Hae;Ryoo, Han-Young
    • Proceedings of the HCI Society of Korea Conference (한국HCI학회 학술대회논문집) / 2009.02a / pp.938-943 / 2009
  • This paper presents, for each main task on a mobile phone, the minimum group of tasks that should support multi-tasking. It is not simple for users to imagine situations in which various tasks occur seamlessly and to state clearly which tasks they need. We therefore first explore the multi-tasking needs between every pair of tasks among 16 functions selected from the general functions of mobile phones. Next, we create multi-tasking scenarios by analogy, connecting each preceding task to the tasks for which user needs were revealed. In this manner, 11 scenarios are finally introduced. We expect the results of this research to be applicable to user-centered design that takes multi-tasking contexts into account.

A Study of Agenda Mining for Humanities-Based Convergence Research (인문사회기반 융합연구 의제 도출 연구)

  • Park, Minsu;Noh, Younghee
    • The Journal of the Korea Contents Association / v.20 no.4 / pp.62-76 / 2020
  • In this study, we analyzed future emerging technologies from the perspective of convergence research and arranged them into mega-trends, trends, and issues, in order to predict the future environment, identify technologies expected to be closely related to human life, and ultimately derive a convergence research agenda that can anticipate various social problems. First, we surveyed the literature on various promising technologies, extracted keywords, and summarized the most frequently used core keywords to infer trends. An agenda emphasizing connectivity with humanities-based convergence research was then drawn up by stratifying and organizing the inferred trends and classifying them into core and derived trends. The necessity, innovativeness, convergence, feasibility, future orientation, and acceptability of the derived agenda items were investigated through a survey. The analysis showed that researchers conducting convergence research had high interest in agenda items that deal closely with daily life using technologies feasible in the near future, while they showed rather low interest in issues such as technologies realizable only in the distant future, terrorism, or international conflicts.

Correlation Analysis between Sea Surface Temperature in the near Korea and Rainfall/Temperature (우리나라 근해의 해수면 온도 및 기온과 강수량과의 상관성 분석)

  • Kwon, Hyun-Han;Oh, Tae-Suk;Ahn, Jae-Hyun;Moon, Young-Il
    • Proceedings of the Korea Water Resources Association Conference / 2006.05a / pp.1460-1464 / 2006
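  • The characteristics and seasonal patterns of precipitation are influenced mainly by climatic phenomena such as sea surface temperature (SST) rather than by local causes. From this perspective, research that infers the long-term behavior of hydrologic variables such as precipitation from climatic factors is of great importance, and such inference can serve as a basic tool for long-term forecasting and simulation of precipitation. The main purpose of this study is therefore to analyze the variability and correlation of precipitation and air temperature with respect to SST and, above all, to examine the direct and indirect relationship with SST in the seas near the Korean Peninsula, thereby evaluating its potential as a variable for more effective precipitation forecasting. To this end, several analyses were performed: linear lagged correlation analysis of data with the annual cycle retained, lagged correlation analysis of data standardized to remove the annual cycle, and nonparametric correlation analysis. The data with the annual cycle retained showed very strong correlation, but this is attributed mainly to seasonality. The anomalies with the annual cycle removed showed much weaker correlation, yet significance testing confirmed that a statistically significant relationship exists. SST therefore appears usable as a variable for precipitation forecasting, but a more integrated examination is needed that covers not only the nearby seas but also climatic factors of other regions linked to the weather of the Korean Peninsula.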
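The two main steps named in this abstract, removing the annual cycle by standardizing each calendar month and then computing a lagged linear correlation, can be sketched as follows. The function names are made up for the example, the series is assumed to be monthly and to start in January, and the nonparametric correlation and the significance test mentioned in the abstract are not shown.

```python
import math
from statistics import mean, pstdev


def monthly_anomalies(series):
    """Standardize each value by the mean and std of its calendar month.

    Assumes a monthly series that starts in January.
    """
    out = []
    for i, value in enumerate(series):
        same_month = series[i % 12::12]  # all values for this calendar month
        sd = pstdev(same_month)
        out.append((value - mean(same_month)) / sd if sd > 0 else 0.0)
    return out


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def lagged_correlation(driver, response, lag):
    """Correlate driver[t] with response[t + lag] for lag >= 0."""
    if lag == 0:
        return pearson(driver, response)
    return pearson(driver[:-lag], response[lag:])


if __name__ == "__main__":
    # Made-up monthly values: a shared seasonal cycle plus a small offset.
    sst = [15 + 8 * math.sin(2 * math.pi * m / 12) + (m % 5) for m in range(48)]
    rain = [90 + 60 * math.sin(2 * math.pi * m / 12) + 3 * (m % 7) for m in range(48)]
    print(lagged_correlation(sst, rain, 0))
    print(lagged_correlation(monthly_anomalies(sst), monthly_anomalies(rain), 0))
```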

Error Estimation Based on the Bhattacharyya Distance for Classifying Multimodal Data (Multimodal 데이터에 대한 분류 에러 예측 기법)

  • Choe, Ui-Seon;Kim, Jae-Hui;Lee, Cheol-Hui
    • Journal of the Institute of Electronics Engineers of Korea SP / v.39 no.2 / pp.147-154 / 2002
  • In this paper, we propose an error estimation method based on the Bhattacharyya distance for multimodal data. First, we look for an empirical relationship between the classification error and the Bhattacharyya distance. Then, we investigate the possibility of deriving an error estimation equation based on the Bhattacharyya distance for multimodal data. We assume that the distribution of multimodal data can be approximated as a mixture of several Gaussian distributions. Experimental results with remotely sensed data showed that there is a strong relationship between the Bhattacharyya distance and the classification error and that the classification error can be predicted from the Bhattacharyya distance for multimodal data.
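For orientation, the Bhattacharyya distance between two univariate Gaussians and the standard Bhattacharyya upper bound on the two-class Bayes error can be computed as below. This sketch covers only the single-Gaussian case with hypothetical function names; it is not the empirical error estimation equation for Gaussian mixtures that the paper investigates.

```python
import math


def bhattacharyya_gaussian_1d(mu1, var1, mu2, var2):
    """Bhattacharyya distance between N(mu1, var1) and N(mu2, var2)."""
    mean_term = (mu1 - mu2) ** 2 / (4.0 * (var1 + var2))
    var_term = 0.5 * math.log((var1 + var2) / (2.0 * math.sqrt(var1 * var2)))
    return mean_term + var_term


def bhattacharyya_error_bound(distance, prior1=0.5, prior2=0.5):
    """Upper bound on the Bayes error: sqrt(P1 * P2) * exp(-B)."""
    return math.sqrt(prior1 * prior2) * math.exp(-distance)


if __name__ == "__main__":
    b = bhattacharyya_gaussian_1d(0.0, 1.0, 2.0, 1.0)
    print(b, bhattacharyya_error_bound(b))
```

The bound tightens as the distance grows, which is the qualitative relationship between distance and classification error that the abstract describes.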

Utilization of A Data Base for Query Processing of natural language on the Repository of natural language (자연어 저장소에 기반을 둔 자연어 질의처리를 위한 데이터베이스 활용 방안에 관한 연구)

  • Jeon, Danny;LEE, Byeong Rae
    • Proceedings of the Korea Information Processing Society Conference / 2012.04a / pp.1058-1061 / 2012
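  • With continual web-based technological progress, demands for the data needed for decision making are becoming increasingly diverse, and research on data extraction methods continues in order to respond effectively to these demands. This paper studies a methodology that lets users easily extract the data they want through natural language. Research on natural language processing is under way in many areas; building on existing work, this paper proceeds in three broad directions. The first is a method that processes natural language by inference from the information the user has entered, or predicts in advance the search that will follow. The second is a method that establishes associations from the natural-language queries each user searches for and guides the user toward predictive search. The third is a methodology that uses the schema information of a database built for decision making so that users can easily generate query statements. The methods studied in this paper were actually implemented, and the process by which the generated query statements are handled effectively by the system was also studied and verified.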

Requirement Analysis and Drag Prediction for the Aerodynamic Configuration of a Bearingless Rotor Hub (무베어링 로터 허브 형상에 대한 요구도 분석 및 항력 예측)

  • Kang, Hee-Jung
    • Aerospace Engineering and Technology / v.11 no.1 / pp.19-26 / 2012
  • The aerodynamic hub drag requirement, allocated from the system requirements for the development of a bearingless rotor hub, was analyzed and made concrete so that it could be substantiated by the methodology assigned to the requirement. Drag for the initial hub configuration was predicted by hand calculation using aerodynamic drag coefficients, and a design change to the sectional shape of the torque tube was suggested to satisfy the requirement. Finally, drag was predicted for the modified hub configuration using an unstructured overset mesh technique and parallel computation, and the calculated result satisfied the aerodynamic hub drag requirement. The drag of the final hub configuration was also found to be within the range of drag inferred from the trendline of previously developed helicopters.

Predict Solar Radiation According to Weather Report (일기예보를 이용한 일사량 예측기법개발)

  • Won, Jong-Min;Doe, Geun-Young;Heo, Na-Ri
    • Journal of Navigation and Port Research / v.35 no.5 / pp.387-392 / 2011
  • Photovoltaic power has little value as an independent power supply, but it is a highly valuable distributed power source for reducing urban carbon emissions and fossil fuel use. Because photovoltaic output fluctuates with weather conditions, however, real-time monitoring is needed to control large swings if it is to be effective as a distributed source. The National Weather Service does not forecast the solar radiation on which photovoltaic generation depends. This study therefore calculates a representative solar radiation value for each weather forecast from the atmospheric transmittance, which can be inferred from past solar radiation records (including diffuse sky radiation), cloud thickness, and weather forecasts, and predicts solar radiation by substituting this value into an expression. The predictions were verified by comparison with measured solar radiation and with values calculated using the Kasten and Czeplak expression of the CRM (Cloud Cover Radiation Model) technique.
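The Kasten and Czeplak cloud cover radiation model used for comparison can be sketched as below. The coefficients (910, 30, 0.75, 3.4) are the values commonly quoted for this model and were originally fitted to data from Hamburg, so they should be treated as assumptions for any other site. The mapping from forecast wording to cloud cover in oktas is entirely hypothetical and is not the paper's transmittance-based method.

```python
import math


def clear_sky_global(elevation_deg):
    """Clear-sky global irradiance in W/m^2 for a given solar elevation angle."""
    return max(0.0, 910.0 * math.sin(math.radians(elevation_deg)) - 30.0)


def cloudy_global(elevation_deg, okta):
    """Global irradiance under cloud cover in oktas (0 = clear, 8 = overcast)."""
    if not 0 <= okta <= 8:
        raise ValueError("okta must be between 0 and 8")
    attenuation = 1.0 - 0.75 * (okta / 8.0) ** 3.4
    return clear_sky_global(elevation_deg) * attenuation


# Hypothetical mapping from forecast wording to cloud cover in oktas.
FORECAST_TO_OKTA = {
    "clear": 0,
    "partly cloudy": 3,
    "mostly cloudy": 6,
    "overcast": 8,
}


def forecast_irradiance(elevation_deg, forecast):
    """Estimate global irradiance for a forecast category at a solar elevation."""
    return cloudy_global(elevation_deg, FORECAST_TO_OKTA[forecast])


if __name__ == "__main__":
    for label in FORECAST_TO_OKTA:
        print(label, round(forecast_irradiance(60.0, label), 1))
```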