• Title/Abstract/Keywords: Data quality metrics

135 search results (processing time: 0.028 s)

Assessment and Validation of New Global Grid-based CHIRPS Satellite Rainfall Products Over Korea

  • 전민기;남원호;문영식;김한중
    • Journal of the Korean Society of Agricultural Engineers (한국농공학회논문집) / Vol. 62, No. 2 / pp. 39-52 / 2020
  • A high-quality, long-term, high-resolution precipitation dataset is essential for climate analyses and studies of the global water cycle. Rainfall data from station observations are inadequate over many parts of the world, especially North Korea, because observation networks are non-existent or gauge observations are only sparsely reported. As a result, satellite-based rainfall estimates have been used as an alternative or supplement to station observations. The Climate Hazards Group Infrared Precipitation (CHIRP) product and CHIRP combined with station observations (CHIRPS) are recently produced satellite-based rainfall products with relatively high spatial and temporal resolution and global coverage. CHIRPS is a global precipitation product available at daily to seasonal time scales, with a spatial resolution of 0.05° and a period of record from 1981 to near real time. In this study, we analyze the applicability of CHIRPS data to the Korean Peninsula as a way to compensate for the lack of precipitation data for North Korea. We compared daily precipitation estimates from CHIRPS with observations from 81 rain gauges across Korea using several statistical metrics over the long-term period 1981-2017. In summary, the CHIRPS product showed acceptable performance over the Korean Peninsula when used for hydrological applications based on monthly rainfall amounts. Overall, this study concludes that CHIRPS can be a valuable complement to gauge precipitation data for estimating precipitation and for climate and hydrological applications, such as drought monitoring, in this region.
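
The abstract above does not name the statistical metrics it uses. As a minimal sketch, assuming two commonly used ones (mean bias and the Pearson correlation coefficient), a gauge-versus-satellite comparison could look like the following. The function names and the monthly totals are hypothetical and are not taken from the paper.

```python
import math

def mean_bias(gauge, satellite):
    """Mean of (satellite - gauge); a positive value means the satellite overestimates."""
    return sum(s - g for g, s in zip(gauge, satellite)) / len(gauge)

def pearson_r(gauge, satellite):
    """Pearson correlation coefficient between gauge and satellite series."""
    n = len(gauge)
    mean_g = sum(gauge) / n
    mean_s = sum(satellite) / n
    cov = sum((g - mean_g) * (s - mean_s) for g, s in zip(gauge, satellite))
    var_g = sum((g - mean_g) ** 2 for g in gauge)
    var_s = sum((s - mean_s) ** 2 for s in satellite)
    return cov / math.sqrt(var_g * var_s)

# Hypothetical monthly rainfall totals in mm (illustrative values only).
gauge_mm = [20.0, 35.0, 80.0, 150.0]
chirps_mm = [25.0, 40.0, 85.0, 160.0]

print("bias (mm):", mean_bias(gauge_mm, chirps_mm))
print("correlation:", pearson_r(gauge_mm, chirps_mm))
```

Aggregating daily values to monthly totals before computing such metrics would match the abstract's observation that performance is acceptable at the monthly scale.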

An AutoML-driven Antenna Performance Prediction Model in the Autonomous Driving Radar Manufacturing Process

  • So-Hyang Bak;Kwanghoon Pio Kim
    • KSII Transactions on Internet and Information Systems (TIIS) / Vol. 17, No. 12 / pp. 3330-3344 / 2023
  • This paper proposes an antenna performance prediction model for the autonomous driving radar manufacturing process. Our work is based on a challenge dataset, the Driving Radar Manufacturing Process Dataset, and on PyCaret, an open-source Python library that provides a typical AutoML workflow engine. The dataset contains a total of 70 data items, of which 54 are used as input features and 16 as output features, so the task is a multi-output regression problem. During the regression analysis and preprocessing phase, we identified several highly correlated input features and removed some of them, since they can cause multicollinearity that degrades overall model performance. In the training phase, we trained a regression model for each output feature using the AutoML approach. Next, we selected the top 5 models by performance in the AutoML result reports and applied an ensemble method to improve on the selected models' performance. For the experimental evaluation of the regression model we used two metrics, MAE and RMSE, which were 0.6928 and 1.2065, respectively. Additionally, we carried out a series of experiments to verify the proposed model's performance by comparing it with that of existing models. In conclusion, the proposed approach improves accuracy for safer autonomous vehicles, reduces manufacturing costs through the AutoML-PyCaret and ensemble machine learning model, and prevents the production of faulty radar systems, conserving resources. Ultimately, the proposed model holds significant promise not only for antenna performance prediction but also for improving manufacturing quality and advancing radar systems in autonomous vehicles.
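
The abstract above reports MAE and RMSE for an ensemble of the top models but does not say how the ensemble combines them. The sketch below assumes simple averaging of predictions for a single output feature and then computes both metrics. It deliberately avoids PyCaret calls. All names and numbers are hypothetical.

```python
import math

def average_ensemble(model_predictions):
    """Average predictions element-wise across models (an assumed blending rule)."""
    n_models = len(model_predictions)
    return [sum(values) / n_models for values in zip(*model_predictions)]

def mae(actual, predicted):
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def rmse(actual, predicted):
    """Root mean squared error."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

# Hypothetical measurements and predictions from three models (illustrative only).
actual = [10.0, 12.0, 15.0]
model_predictions = [
    [9.0, 13.0, 15.0],
    [11.0, 13.0, 13.0],
    [10.0, 13.0, 14.0],
]

blended = average_ensemble(model_predictions)
print("blended:", blended)
print("MAE:", mae(actual, blended))
print("RMSE:", rmse(actual, blended))
```

RMSE penalizes large errors more heavily than MAE, which is consistent with the reported RMSE (1.2065) being larger than the reported MAE (0.6928).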

Study for Prediction System of Learning Achievements of Cyber University Students using Deep Learning based on Autoencoder

  • 이현진
    • Journal of Digital Contents Society (디지털콘텐츠학회 논문지) / Vol. 19, No. 6 / pp. 1115-1121 / 2018
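  • This paper studies a deep-learning-based data analysis method for predicting learning outcomes from data accumulated in a cyber university's learning management system. Predicting learners' academic achievement can serve as a tool for promoting their learning and raising the quality of education. To improve prediction accuracy, the method uses an autoencoder to predict attendance for the whole semester, combines that prediction with the assessment components available during the semester, and trains a deep learning model on the combined data. To validate the proposed method, final academic achievement was predicted from the predicted attendance data together with the assessment-component data. The experiments showed that learners' achievement can be predicted while the semester is still in progress.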

A Practical Quality Model for Evaluation of Mobile Services Based on Mobile Internet Device

  • 오상헌;라현정;김수동
    • Journal of KIISE: Software and Applications (한국정보과학회논문지:소프트웨어및응용) / Vol. 37, No. 5 / pp. 341-353 / 2010
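  • A Mobile Internet Device (MID) is a device for using application services over various wireless networks (Wi-Fi, GSM, CDMA, 3G, etc.) regardless of location. MIDs are expected to be widely used thanks to conveniences such as portability and Internet connectivity, but as small devices they are resource-constrained: a low-performance CPU, small memory, limited power, and a small screen. Consequently, there are limits on installing demanding applications on an MID or on storing and processing large amounts of data in memory. An effective way to overcome these limits is to deploy and operate the required functionality on the server side as cloud services and to invoke it from the MID over the Internet. Because functionality is distributed between the MID client and the server, remote services are used, and services developed by third parties are involved, the heterogeneity and independence of the services can lower the quality of service (QoS). These inherent properties of cloud services, heterogeneity and independence, make quality measurement technically harder than traditional software quality measurement. This paper identifies the characteristics of MIDs and cloud services and, on that basis, derives and presents a quality model for measuring the quality of mobile services. The quality model consists of quality attributes and the metrics applicable to each attribute. A case study demonstrates the practicality and applicability of the proposed quality model.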

Calculation Model of Time Varying Loudness by Using the Critical-banded Filters

  • 정혁;이정권
    • The Journal of the Acoustical Society of Korea (한국음향학회지) / Vol. 19, No. 5 / pp. 65-70 / 2000
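  • Loudness is regarded as the most important factor in sound quality evaluation, and an international standard exists for calculating it for stationary sounds. To generalize this, the present study proposes a new method that incorporates transient-sound analysis into the loudness calculation model. To this end, signal processing techniques for band-splitting transient signals and predicting the change of sound pressure level in each band were introduced, together with post-masking and temporal loudness integration models of the auditory response to transient sounds. In addition, to remedy the signal-analysis problems that existing loudness models have in analyzing pure tones, a total of 47 critical-band filters spaced at half the critical bandwidth were used. To verify the validity of the proposed model, its results were compared with existing clinical experimental results, and the predicted and clinical values were found to agree very well.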

Prediction & Assessment of Change Prone Classes Using Statistical & Machine Learning Techniques

  • Malhotra, Ruchika;Jangra, Ravi
    • Journal of Information Processing Systems / Vol. 13, No. 4 / pp. 778-804 / 2017
  • Software today has become an inseparable part of our lives. To meet the ever-growing demands of customers, it has to evolve rapidly and absorb a large number of changes. In this paper, our aim is to study the relationship between object-oriented metrics and the change proneness of a class. Prediction models based on this study can help identify the change-prone classes of a software system. Testing effort can then be focused on these change-prone classes to yield better-quality software. Researchers have previously used statistical methods to predict change-prone classes, but machine learning methods have rarely been used for this purpose. In our study, we evaluate and compare the performance of ten machine learning methods with that of the statistical method. This evaluation is based on two open-source software systems developed in Java. We also validated the developed prediction models using another software data set from the same domain (3D modelling). The performance of the prediction models was evaluated using receiver operating characteristic (ROC) analysis. The results indicate that the machine learning methods are on par with the statistical method for predicting change-prone classes. A further analysis showed that models constructed for one software system can also be used to predict the change proneness of classes in another system from the same domain. This study would help developers perform effective regression testing at low cost and effort. It will also help developers design an effective model that results in fewer change-prone classes and hence better maintainability.
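
The abstract above evaluates its models with receiver operating characteristic analysis. A common summary of an ROC curve is the area under it (AUC), which equals the probability that a randomly chosen change-prone class receives a higher score than a randomly chosen stable one. The sketch below computes AUC directly from that definition. The labels and scores are hypothetical, and the abstract does not state which ROC summary the authors report.

```python
def roc_auc(labels, scores):
    """AUC via pairwise comparison: fraction of (positive, negative) pairs
    ranked correctly, counting ties as half a correct pair."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))

# Hypothetical data: 1 = class changed in the next release, 0 = class did not change.
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
print("AUC:", roc_auc(labels, scores))
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation. The pairwise loop is quadratic, which is fine for a sketch but slow for large data sets.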

Design of Query Processing System to Retrieve Information from Social Network using NLP

  • Virmani, Charu;Juneja, Dimple;Pillai, Anuradha
    • KSII Transactions on Internet and Information Systems (TIIS) / Vol. 12, No. 3 / pp. 1168-1188 / 2018
  • Social network aggregators are used to maintain and manage multiple accounts across several online social networks. Displaying the activity feed of each social network on a common dashboard has long been the status quo for social aggregators; however, retrieving the desired data from the various social networks remains a major concern. A user enters a query expecting a specific outcome from the social networks. Because the intention behind the query is known only to the user, the output may not match the user's expectation unless the system considers user-centric factors. Moreover, the quality of the solution depends on these user-centric factors, on the user's inclination, and on the nature of the network. Thus, there is a need for a system that understands the user's intent and serves structured objects. Choosing the best execution and optimal ranking functions is also a high-priority concern. The current work is motivated by these requirements and proposes the design of a query processing system that extracts the user's intent and retrieves information from various social networks. To further improve the results, machine learning techniques such as Latent Dirichlet Allocation (LDA) and a ranking algorithm are incorporated to refine the query results and fetch the information using data mining techniques. The proposed framework contributes a user-centric query retrieval model based on natural language, and it is efficient when compared on temporal metrics. The proposed Query Processing System to Retrieve Information from Social Network (QPSSN) will increase the discoverability of the user and help businesses to execute promotions collaboratively and to identify new networks and people. It is an innovative approach to investigating new aspects of social networks. The proposed model offers a significant breakthrough scoring up to precision and recall respectively.

The Effect of an Integrated Rating Prediction Method on Performance Improvement of Collaborative Filtering

  • 이수정
    • The Journal of the Institute of Internet, Broadcasting and Communication (한국인터넷방송통신학회논문지) / Vol. 21, No. 5 / pp. 221-226 / 2021
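  • Collaborative filtering-based recommender systems recommend products that the current user is likely to prefer on the basis of users' rating histories, and they are now an indispensable feature for a variety of commercial purposes. To decide which products to recommend, predicted preferences for unrated products are computed from similar rating histories; previous studies have generally used two methods, the similar-user-based method and the similar-item-based method, each on its own. These methods suffer from reduced prediction accuracy when users' rating data are sparse or when similar users or similar items are hard to find. This study proposes a new method that integrates the two methods to predict ratings. Its advantage is that a larger number of similar ratings can be referenced, which improves recommendation quality. In performance experiments, the proposed method greatly improved on the existing methods on a sparse dataset in every respect: prediction accuracy, relevance of recommended items, and relevance of item ranking. On a somewhat denser dataset it was the best in prediction accuracy and comparable to the existing methods on the other evaluation measures.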
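
The abstract above does not specify how the user-based and item-based predictions are integrated. Purely as an illustration of why pooling the two sources helps on sparse data, the sketch below assumes one possible rule: a similarity-weighted average over the union of ratings from similar users and ratings of similar items. The function name, inputs, and numbers are hypothetical and are not the paper's method.

```python
def integrated_prediction(user_neighbors, item_neighbors):
    """Similarity-weighted average over pooled neighbor ratings.

    user_neighbors: (similarity, rating) pairs from similar users who rated the target item.
    item_neighbors: (similarity, rating) pairs from similar items rated by the target user.
    Returns None when neither source offers a usable neighbor.
    """
    pooled = list(user_neighbors) + list(item_neighbors)
    total_weight = sum(abs(sim) for sim, _ in pooled)
    if total_weight == 0:
        return None
    return sum(sim * rating for sim, rating in pooled) / total_weight

# Sparse case: no similar user has rated the item, but one similar item exists,
# so the pooled method can still produce a prediction.
print(integrated_prediction([], [(1.0, 5.0)]))

# Both sources available: the prediction draws on more reference ratings.
print(integrated_prediction([(0.5, 4.0)], [(0.5, 2.0)]))
```

A user-based predictor alone would return nothing in the first case, which illustrates the abstract's point that referencing more similar ratings improves coverage on sparse data.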

An Application of Machine Learning in Retail for Demand Forecasting

  • Muhammad Umer Farooq;Mustafa Latif;Waseemullah;Mirza Adnan Baig;Muhammad Ali Akhtar;Nuzhat Sana
    • International Journal of Computer Science & Network Security / Vol. 23, No. 9 / pp. 1-7 / 2023
  • Demand prediction is an essential component of any business or supply chain. Large retailers need to keep track of tens of millions of item flows each day to ensure smooth operations and strong margins, and demand prediction is at the epicenter of this planning effort. For retail business processes that deal with a variety of short-shelf-life products and foodstuffs, forecast accuracy is of the utmost importance because demand patterns shift in a dynamic, fast-response environment. All sectors strive to produce the ideal quantity of goods at the ideal time, but for retailers this issue is especially crucial because they also need to manage perishable inventories effectively. In light of this, this research aims to show how machine learning approaches can help with demand forecasting and future sales prediction in retail. This is done in two steps: first using historical data, and then using open data on weather conditions, fuel, the Consumer Price Index (CPI), holidays, specific local events, and so on. Several machine learning algorithms were applied and compared using the R-squared and mean absolute percentage error (MAPE) metrics. The suggested method improves the effectiveness and quality of feature selection while using a small number of well-chosen features to increase demand prediction accuracy. The model is trained on a two-year weekly dataset and tested on a one-year weekly dataset. The results show that the suggested expanded feature selection approach yields a very good MAPE range, a respectable and encouraging value for anticipating demand in retail systems.
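
The abstract above compares models using R-squared and MAPE. A minimal sketch of both metrics follows, using their standard definitions. The weekly demand figures are hypothetical and are not from the paper.

```python
def mape(actual, forecast):
    """Mean absolute percentage error, in percent. Undefined when an actual value is zero."""
    if any(a == 0 for a in actual):
        raise ValueError("MAPE is undefined when an actual value is zero")
    return 100.0 * sum(abs((a - f) / a) for a, f in zip(actual, forecast)) / len(actual)

def r_squared(actual, forecast):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_a = sum(actual) / len(actual)
    ss_res = sum((a - f) ** 2 for a, f in zip(actual, forecast))
    ss_tot = sum((a - mean_a) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot

# Hypothetical weekly unit sales and forecasts (illustrative values only).
actual_units = [100.0, 200.0, 400.0]
forecast_units = [110.0, 180.0, 400.0]

print("MAPE (%):", mape(actual_units, forecast_units))
print("R-squared:", r_squared(actual_units, forecast_units))
```

MAPE is scale-free, which makes it convenient for comparing items with very different sales volumes, but it is unstable for items whose actual demand is close to zero.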

An Application of Machine Learning in Retail for Demand Forecasting

  • Muhammad Umer Farooq;Mustafa Latif;Waseem;Mirza Adnan Baig;Muhammad Ali Akhtar;Nuzhat Sana
    • International Journal of Computer Science & Network Security / Vol. 23, No. 8 / pp. 210-216 / 2023
  • Demand prediction is an essential component of any business or supply chain. Large retailers need to keep track of tens of millions of item flows each day to ensure smooth operations and strong margins, and demand prediction is at the epicenter of this planning effort. For retail business processes that deal with a variety of short-shelf-life products and foodstuffs, forecast accuracy is of the utmost importance because demand patterns shift in a dynamic, fast-response environment. All sectors strive to produce the ideal quantity of goods at the ideal time, but for retailers this issue is especially crucial because they also need to manage perishable inventories effectively. In light of this, this research aims to show how machine learning approaches can help with demand forecasting and future sales prediction in retail. This is done in two steps: first using historical data, and then using open data on weather conditions, fuel, the Consumer Price Index (CPI), holidays, specific local events, and so on. Several machine learning algorithms were applied and compared using the R-squared and mean absolute percentage error (MAPE) metrics. The suggested method improves the effectiveness and quality of feature selection while using a small number of well-chosen features to increase demand prediction accuracy. The model is trained on a two-year weekly dataset and tested on a one-year weekly dataset. The results show that the suggested expanded feature selection approach yields a very good MAPE range, a respectable and encouraging value for anticipating demand in retail systems.