• Title/Abstract/Keywords: small data set

Search results: 664

단계적 슈퍼픽셀 병합을 통한 이미지 분할 방법에서 특권정보의 활용 방안 (Image Segmentation by Cascaded Superpixel Merging with Privileged Information)

  • 박용진
    • 한국정보통신학회논문지 / Vol. 23 No. 9 / pp. 1049-1059 / 2019
  • Existing region-merging approaches to image segmentation learn the merging model from information about the two neighboring regions alone. During training, however, global information such as object-level cues is available in addition to the local information between the two regions, so it is desirable to exploit all of it to improve the merging model. This paper proposes applying SVM+, which can exploit privileged information that is available only at training time, to a learning-based image segmentation algorithm. Because privileged information exists only during training, it cannot be used by conventional supervised learning; a framework such as SVM+ allows the merge-decision model to incorporate object-level information as well as local information. Benchmarks on the BSDS 500 and VOC 2012 data sets show improved performance on most evaluation metrics, and the advantage over existing algorithms is especially large when the training data set is small.
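The privileged-information setting can be illustrated with a small sketch. The snippet below is not the paper's SVM+ formulation; it shows the closely related generalized-distillation idea, in which a teacher trained on a privileged feature (available only at training time) produces soft targets for a student that sees only the regular feature. All names and data here are synthetic and illustrative.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fit_logistic(X, y, lr=0.5, epochs=2000):
    # plain gradient descent on the logistic loss; y may be soft labels in [0, 1]
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        w -= lr * X.T @ (sigmoid(X @ w) - y) / len(y)
    return w

rng = np.random.default_rng(0)
n = 200
x_priv = rng.normal(0.0, 1.0, n)        # privileged feature: seen only at training time
x = x_priv + rng.normal(0.0, 0.5, n)    # regular feature: a noisy local view of it
y = (x_priv > 0).astype(float)

X  = np.column_stack([x, np.ones(n)])       # student sees only the regular feature
Xp = np.column_stack([x_priv, np.ones(n)])  # teacher sees the privileged feature

teacher = fit_logistic(Xp, y)
soft = sigmoid(Xp @ teacher)                # teacher's soft predictions
lam = 0.5                                   # mix hard labels with the teacher's soft targets
student = fit_logistic(X, lam * y + (1.0 - lam) * soft)
acc = np.mean((sigmoid(X @ student) > 0.5) == (y > 0.5))
```

The student never observes `x_priv` at test time, yet its targets carry the teacher's knowledge of it, which is exactly the role privileged information plays in SVM+.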

소규모사업장 보건기술지원사업에서의 간호활동경험 : 포커스그룹 인터뷰 (The Experience of Nurses Who are working in the Government-Funded Subsidized Occupational Health Program for Small Scale Industries : Focus Group Interview)

  • 한영란;김수근;하은희
    • 한국직업건강간호학회지 / Vol. 11 No. 2 / pp. 132-149 / 2002
  • Purpose: This study was conducted to provide fundamental information for designing a better occupational health program for small-scale industries (SSI), based on an evaluation of the experiences of nurses working in the government-funded subsidized occupational health program for SSI. Method: The focus group method was used. Data were collected through focus group interviews and analyzed within the grounded theory framework of Strauss and Corbin (1990). The subjects were 14 nurses. Result: We identified 60 concepts, 30 sub-categories, and 6 categories (Table 1). The categories were: various services, difficulties in implementing services, successful strategies, program evaluation, alternative plans, and adaptation to a new field. Conclusion: The nurses judged that although acknowledging the importance of workplace health management created an opportunity for small-scale industries to begin health management, autonomous health-management capability had not changed. Despite many difficulties in implementing services, the nurses provided various health services using successful strategies. They suggested that activating the program requires various measures, such as program models suited to the characteristics of the workplaces and a role model of the occupational health nurse for SSI. The nurses had a hard time in an unfamiliar field when the program began and overcame these difficulties with various strategies.


해파리 출현빈도에 따른 여수 정치망어업의 경영실적 고찰 (A study on the management performance of a set net fishery according to the blooming frequency of jelly fish Nemopliema nomurai in Yeosu)

  • 송세현;이상고;김희용
    • 수산해양기술연구 / Vol. 51 No. 1 / pp. 42-49 / 2015
  • The autumn catch of Scomberomorus niphonius strongly affected the catch value of the set net fishery. Catch production and selling prices were relatively stable over 2008-2011, except in 2009, when a large bloom of the jellyfish Nemopliema nomurai occurred. The fishing costs of the set net fishery in Yeosu have increased gradually, driven by declining catch production and unexpected environmental changes such as jellyfish blooms. The increase in fishing costs diminished net income and hurt profitability; the ratio of gross profit to gross costs reached its lowest value, 60.2%, in 2010. Bycatch was highest in 2008 and lowest in 2009. In general, bycatch occurred from May to July each year, and when Scomber japonicus dominated the bycatch, it was advantageous in terms of catch value. However, the bycatch also increases the take of immature small fish, which will contribute to the decline of fisheries resources. The present state of the set net fishery may therefore work against the long-term management of fisheries resources.

An improvement of LEM2 algorithm

  • The, Anh-Pham;Lee, Young-Koo;Lee, Sung-Young
    • 한국정보과학회:학술대회논문집 / 한국정보과학회 2011년도 한국컴퓨터종합학술대회논문집 Vol. 38 No. 1(A) / pp. 302-304 / 2011
  • Rule-based machine learning techniques are very important in the real world today; notable applications include medical data mining and business transaction mining. The difference between rule-based and model-based machine learning is that model-based methods output models that are often difficult for an expert or a human to understand, whereas rule-based techniques output rule sets in IF-THEN format, for example: IF blood pressure = 90 AND kidney problem = yes THEN take this drug. In this way, a medical doctor can easily modify and update usable rules; this is the typical scenario in a medical decision support system. Currently, rough set theory is one of the best-known theories for producing such rules, and LEM2 is an algorithm based on it that can produce a small set of rules from a database. In this paper, we present an improvement of the LEM2 algorithm that incorporates variable precision techniques.
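The flavor of rough-set rule induction can be sketched with a simplified, LEM2-like greedy covering over a toy decision table. This is a minimal illustration of growing conditions until each rule covers only one class, not the exact LEM2 local-covering procedure or the paper's variable-precision extension; the table and attribute names are made up.

```python
def matches(row, cond):
    return all(row[a] == v for a, v in cond.items())

def induce_rules(rows, target):
    """Greedy covering: grow a condition until it covers only `target` rows,
    then repeat until every positive example is covered by some rule."""
    uncovered = [r for r in rows if r["class"] == target]
    rules = []
    while uncovered:
        cond = {}
        while True:
            covered = [r for r in rows if matches(r, cond)]
            if covered and all(r["class"] == target for r in covered):
                break
            # candidate pairs come from still-uncovered positives matching cond
            cands = [r for r in uncovered if matches(r, cond)]
            best = None
            for r in cands:
                for a, v in r.items():
                    if a == "class" or a in cond:
                        continue
                    score = sum(1 for x in cands if x[a] == v)
                    if best is None or score > best[0]:
                        best = (score, a, v)
            cond[best[1]] = best[2]
        rules.append(cond)
        uncovered = [r for r in uncovered if not matches(r, cond)]
    return rules

table = [
    {"pressure": "high",   "kidney": "yes", "class": "drug"},
    {"pressure": "high",   "kidney": "no",  "class": "drug"},
    {"pressure": "normal", "kidney": "yes", "class": "drug"},
    {"pressure": "normal", "kidney": "no",  "class": "none"},
    {"pressure": "low",    "kidney": "no",  "class": "none"},
]
rules = induce_rules(table, "drug")
```

On this table the sketch yields rules such as IF pressure = high THEN drug, mirroring the IF-THEN output format described above.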

자유수면의 모의를 위한 레벨 셋 기법의 적용 (Application of the Level Set Method for Free Surface Modeling)

  • 이해균
    • 한국콘텐츠학회논문지 / Vol. 10 No. 10 / pp. 451-455 / 2010
  • Fluid phenomena treated in hydraulics often involve a free surface, the interface between water and air. When the curvature of the water surface is small, the hydrostatic pressure assumption is adequate; otherwise, a non-hydrostatic pressure distribution must be considered. Such problems often require the Navier-Stokes equations rather than the depth-integrated shallow water equations. To simulate two immiscible fluids with different properties, such as water and air or water and oil, this study applies the well-known level set method to the classical dam-break problem, compares the results with experiments and previous numerical modeling, and confirms its utility.
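The core of the level set method is advecting a signed-distance function φ and reading the interface off its zero crossing. A minimal 1D sketch follows (first-order upwind, constant velocity; the grid and speed are illustrative, and a real free-surface solver would couple this to the Navier-Stokes equations and periodically reinitialize φ):

```python
import numpy as np

# grid and signed-distance function: the interface (phi = 0) starts at x = 0.3
nx, dx = 101, 0.01
x = np.linspace(0.0, 1.0, nx)
phi = x - 0.3

u, dt, steps = 1.0, 0.005, 40   # CFL number = u*dt/dx = 0.5

for _ in range(steps):
    # first-order upwind for u > 0: backward difference in space
    phi[1:] = phi[1:] - dt * u * (phi[1:] - phi[:-1]) / dx

# locate the zero crossing of phi by linear interpolation
interface = np.interp(0.0, phi, x)
```

After t = 0.2 the interface has been carried from x = 0.3 to x ≈ 0.5, exactly as the advection equation φ_t + u φ_x = 0 predicts.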

인공지능 접근방법에 의한 S/W 공수예측 (Software Effort Estimation Using Artificial Intelligence Approaches)

  • 전응섭
    • 한국IT서비스학회:학술대회논문집 / 한국IT서비스학회 2003년도 추계학술대회 / pp. 616-623 / 2003
  • Since the computing environment changes very rapidly, estimating software effort is difficult because it is not easy to collect a sufficient number of relevant cases from historical data. If we pinpoint the cases, the number of cases becomes too small; if we adopt too many cases, the relevance declines. In this paper we therefore attempt to balance the number of cases against their relevance. Since many studies on software effort estimation have shown that neural network models perform at least as well as other approaches, we selected the neural network model as the basic estimator, and we propose a search method that finds the right level of relevant cases for it. For a selected case set, eliminating the qualitative input factors that share the same values can reduce the scale of the neural network model. Since there is a multitude of combinations of case sets, we need to search for the optimal reduced neural network model and the corresponding case set. To find a quasi-optimal model in the hierarchy of reduced neural network models, we adopted the beam search technique and devised the Case-Set Selection Algorithm. This algorithm can be adopted in case-adaptive software effort estimation systems.
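The size-versus-relevance trade-off can be illustrated with a much simpler analogue than the paper's beam-searched neural models: estimate effort as the mean of the k most similar historical cases and choose k by leave-one-out error. Everything below (data, names, the k-NN estimator) is synthetic and illustrative; the paper's actual estimator is a neural network searched with the Case-Set Selection Algorithm.

```python
import numpy as np

rng = np.random.default_rng(1)
size = rng.uniform(1.0, 100.0, 60)               # historical project sizes
effort = 3.0 * size + rng.normal(0.0, 5.0, 60)   # effort with measurement noise

def knn_estimate(train_x, train_y, x, k):
    # average effort of the k historical cases closest in size
    idx = np.argsort(np.abs(train_x - x))[:k]
    return train_y[idx].mean()

def choose_k(xs, ys, k_grid):
    # leave-one-out error: too few cases is noisy,
    # too many dilutes the estimate with irrelevant cases
    errors = []
    for k in k_grid:
        e = [abs(knn_estimate(np.delete(xs, i), np.delete(ys, i), xs[i], k) - ys[i])
             for i in range(len(xs))]
        errors.append(np.mean(e))
    return k_grid[int(np.argmin(errors))]

k_grid = [1, 2, 3, 5, 10, 20, 40]
k = choose_k(size, effort, k_grid)
pred = knn_estimate(size, effort, 50.0, k)
```

The selected k plays the role of the "right level of relevant cases"; the paper searches a far richer space (case sets and reduced network structures) with beam search.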


ARIMA 모델을 이용한 항공운임예측에 관한 연구 (A Study of Air Freight Forecasting Using the ARIMA Model)

  • 서상석;박종우;송광석;조승균
    • 유통과학연구 / Vol. 12 No. 2 / pp. 59-71 / 2014
  • Purpose - In recent years, many firms have attempted various approaches to cope with the continual increase in air transportation. Previous research into freight charge forecasting models has focused on regression analyses that use a few influencing factors to calculate future prices. However, these approaches have limitations that make them difficult to apply in practice: they cannot respond promptly to small price changes, and their predictive power is relatively low. The current study therefore proposes a freight charge forecasting model that uses time series data instead of a regression approach. The main purposes of this study can be summarized as follows. First, a proper model of freight charges using the autoregressive integrated moving average (ARIMA) model, which is mainly used for time series forecasting, is presented. Second, a modified ARIMA model for freight charge prediction, and the standard process of determining freight charges based on it, are presented. Third, a straightforward freight charge prediction model that practitioners can apply and utilize is presented. Research design, data, and methodology - To develop a new freight charge model, this study proposes the ARIMAC(p,q) model, which applies constant time differencing to address the correlation coefficient (autocorrelation function and partial autocorrelation function) problems that appear in the ARIMA(p,q) model, and materializes an error-adjusted ARIMAC(p,q). Cargo Account Settlement Systems (CASS) data from the International Air Transport Association (IATA) are used to predict the air freight charge. In the modeling, freight charge data for 72 months (January 2006 to December 2011) are used as the training set, and a prediction interval of 23 months (January 2012 to November 2013) is used as the validation set. The freight charge from November 2012 to November 2013 is predicted for three routes - Los Angeles, Miami, and Vienna - and the accuracy over the prediction interval is analyzed using the mean absolute percentage error (MAPE). Results - The proposed model shows better prediction accuracy: on the L.A. route the MAPE of the error-adjusted ARIMAC model is 10%, against 11.2% for ARIMAC. For the Miami route the proposed model is also slightly more accurate, with a MAPE of 3.5% for the error-adjusted ARIMAC model versus 3.7% for ARIMAC. For the Vienna route, however, ARIMAC is more accurate: its MAPE is 14.5%, against 15.7% for the error-adjusted ARIMAC model. Conclusions - The error-adjusted ARIMAC model is more accurate when a route's freight charge variance is large, while ARIMAC is more accurate when the variance is small or the series has a steady ascending or descending trend. From these results it can be concluded that the ARIMAC model, which uses moving averages, has less predictive power for small price changes, while the error-adjusted ARIMAC model, which uses error correction, can respond to price changes quickly.
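A minimal pure-NumPy sketch of the ingredients above: a stripped-down ARIMA(1,1,0)-with-constant forecaster (difference once, fit an AR(1) with intercept by least squares, integrate the forecasts back) and the MAPE accuracy measure. This illustrates the model class only, not the paper's ARIMAC or error-adjusted ARIMAC, and the series below is synthetic:

```python
import numpy as np

def arima_110_forecast(y, steps):
    """ARIMA(1,1,0) with constant: difference once, fit d_t = c + phi*d_{t-1}
    by ordinary least squares, then integrate the forecast differences back."""
    y = np.asarray(y, dtype=float)
    d = np.diff(y)
    X = np.column_stack([np.ones(len(d) - 1), d[:-1]])
    (c, phi), *_ = np.linalg.lstsq(X, d[1:], rcond=None)
    level, last_d = y[-1], d[-1]
    out = []
    for _ in range(steps):
        last_d = c + phi * last_d   # forecast the next difference
        level += last_d             # undo the differencing
        out.append(level)
    return np.array(out)

def mape(actual, forecast):
    # mean absolute percentage error, in percent
    actual = np.asarray(actual, dtype=float)
    forecast = np.asarray(forecast, dtype=float)
    return np.mean(np.abs((actual - forecast) / actual)) * 100.0
```

For a series growing by 2 per month, the sketch extrapolates that growth; MAPE then scores forecasts exactly as in the route comparisons above.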

Hot Data Verification Method Considering Continuity and Frequency of Write Requests Using Counting Filter

  • Lee, Seung-Woo;Ryu, Kwan-Woo
    • 한국컴퓨터정보학회논문지 / Vol. 24 No. 6 / pp. 1-9 / 2019
  • Hard disks, which have long been used as secondary storage in computing systems, are increasingly being replaced by solid state drives (SSDs) because of their relatively fast input/output speeds, small size, and light weight. SSDs, which use NAND flash memory as the storage medium, differ significantly from hard disks in their physical and internal operation. In particular, data cannot be overwritten in place, so an erase operation is required before writing. To manage this, pages whose data are updated frequently (hot data) are distinguished from pages that are updated relatively rarely (cold data); identifying and managing hot data separately helps improve overall performance. Among the various known hot data identification methods, one technique records consecutive write requests in a Bloom filter and judges from the stored values whether data are hot. However, the Bloom filter technique has the problem that a new bit array must be generated every time the set of items changes, and because the judgment is based only on consecutive write requests, it can be wrong. In this paper, we propose a method that uses a counting filter for accurate hot data verification. The proposed method examines consecutive write requests and also records how many times consecutive write requests occur, enabling more accurate hot data verification.
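A counting filter replaces the Bloom filter's bits with small counters, so counts can rise and fall without rebuilding a bit array. A minimal sketch of how such a filter could back hot-data identification follows; the sizes, hash construction, threshold, and decay policy are illustrative assumptions, not the paper's parameters:

```python
import hashlib

class CountingFilter:
    """Counting filter: k counters per key; a key's estimated write count is
    the minimum of its counters (an upper bound on the true count)."""
    def __init__(self, size=1024, num_hashes=3, hot_threshold=4):
        self.size = size
        self.num_hashes = num_hashes
        self.hot_threshold = hot_threshold
        self.counters = [0] * size

    def _slots(self, key):
        # derive k independent slot indexes from one cryptographic hash
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.size

    def record_write(self, lba):
        for s in self._slots(lba):
            self.counters[s] += 1

    def write_count(self, lba):
        return min(self.counters[s] for s in self._slots(lba))

    def is_hot(self, lba):
        return self.write_count(lba) >= self.hot_threshold

    def decay(self):
        # periodic aging so pages that stop being written cool down
        self.counters = [c >> 1 for c in self.counters]

f = CountingFilter()
for _ in range(5):
    f.record_write(42)   # logical block 42 written five times -> hot
f.record_write(7)        # logical block 7 written once -> cold
```

Unlike a plain Bloom filter, the same structure supports aging (here, halving the counters), so the item set can change without regenerating anything.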

로버스트 추정을 이용한 다중 프로세서에서의 데이터 통신 예측 모델 (Data Communication Prediction Model in Multiprocessors based on Robust Estimation)

  • 전장환;이강우
    • 정보처리학회논문지A / Vol. 12A No. 3 / pp. 243-252 / 2005
  • This paper proposes a method for modeling the frequency of data communication in a multiprocessor system using least squares estimation and robust estimation. Several small input data sets of different sizes are fed to a workload program, the communication frequency is measured for each, and the two statistical estimation techniques are applied in sequence to the measurements to build a model that accurately predicts communication frequency. Because this modeling technique depends only on the input data size, independent of the workload or the architectural specification of the target system, it can be applied unchanged to a variety of workloads and target systems. Moreover, since the algorithmic dynamic behavior of the workload on the target system is captured in a mathematical formula, the approach can also be used to model performance data other than data communication. We built a model of the frequency of cache misses, the key source of data communication in a shared-memory system, a representative multiprocessor; in 5 of 12 experiments the prediction error was under 1%, and in the remaining cases it was around 3%.
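The least-squares-then-robust idea can be sketched as ordinary least squares followed by iteratively reweighted least squares with Huber weights, which keeps a single outlying measurement (e.g., one perturbed run) from distorting the fitted communication-frequency model. The data below are synthetic and the Huber weighting is one common choice, not necessarily the paper's exact estimator:

```python
import numpy as np

def ols(X, y):
    # ordinary least squares fit
    return np.linalg.lstsq(X, y, rcond=None)[0]

def huber_irls(X, y, delta=1.345, iters=50):
    """Robust fit: start from OLS, then iteratively downweight large residuals."""
    beta = ols(X, y)
    for _ in range(iters):
        r = y - X @ beta
        # robust scale estimate (normalized median absolute deviation)
        s = max(np.median(np.abs(r - np.median(r))) / 0.6745, 1e-8)
        u = np.abs(r) / s
        w = np.minimum(1.0, delta / np.maximum(u, 1e-12))  # Huber weights
        sw = np.sqrt(w)
        beta = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)[0]
    return beta

# synthetic measurements: frequency = 1 + 2*size, with one corrupted run
size = np.arange(10, dtype=float)
freq = 1.0 + 2.0 * size
freq[9] += 100.0                      # outlier
X = np.column_stack([np.ones(10), size])

beta_ols = ols(X, freq)
beta_rob = huber_irls(X, freq)
```

On this data the plain OLS slope is badly pulled by the corrupted run, while the robust fit recovers the underlying linear relation, mirroring why the paper applies the two estimators in sequence.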

Structure Preserving Dimensionality Reduction : A Fuzzy Logic Approach

  • Nikhil R. Pal;Gautam K. Nandal;Kumar, Eluri-Vijaya
    • 한국지능시스템학회:학술대회논문집 / 한국퍼지및지능시스템학회 1998년도 The Third Asian Fuzzy Systems Symposium / pp. 426-431 / 1998
  • We propose a fuzzy rule-based method for structure-preserving dimensionality reduction. The method selects a small representative sample and applies Sammon's method to project it. The input data points are then augmented by the corresponding projected (output) data points, and the augmented data set is clustered with the fuzzy c-means (FCM) clustering algorithm. Each cluster is then translated into a fuzzy rule for projection. Our rule-based system is computationally very efficient compared with Sammon's method and is quite effective at projecting new points, i.e., it has good predictability.
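The clustering step can be sketched with a minimal fuzzy c-means implementation; the Sammon projection and rule-extraction steps are omitted, the initial centers are supplied explicitly for simplicity, and the data are synthetic:

```python
import numpy as np

def fcm(X, centers, m=2.0, iters=100):
    """Fuzzy c-means: alternate the membership update and the center update."""
    centers = np.asarray(centers, dtype=float)
    for _ in range(iters):
        # distances from every point to every center
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        d = np.maximum(d, 1e-12)
        # standard FCM membership update: u_ik proportional to d_ik^(-2/(m-1))
        inv = d ** (-2.0 / (m - 1.0))
        U = inv / inv.sum(axis=1, keepdims=True)
        # centers are membership-weighted means (weights raised to m)
        W = U ** m
        centers = (W.T @ X) / W.sum(axis=0)[:, None]
    return centers, U

# two well-separated 1-D blobs; crude initial guesses at the data extremes
X = np.array([[0.0], [0.1], [0.2], [10.0], [10.1], [10.2]])
centers, U = fcm(X, centers=[[0.0], [10.2]])
```

Each resulting cluster (a center plus the fuzzy memberships around it) is the kind of object the paper then translates into one projection rule.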
