• Title/Summary/Keyword: data scarcity

Influence of climate change on crop water requirements to improve water management and maize crop productivity

  • Adeola, Adeyemi Khalid;Adelodun, Bashir;Odey, Golden;Choi, Kyung Sook
    • Proceedings of the Korea Water Resources Association Conference / 2022.05a / pp.126-126 / 2022
  • Climate change continues to affect meteorological factors such as rainfall in many countries, including Nigeria, altering rainfall patterns and, in turn, crop yield. Maize is an important cereal grown in northern Nigeria, along with sorghum, rice, and millet. Because of water scarcity during the dry season, it has become critical to design appropriate strategies for planning, developing, and managing the limited available water resources to increase maize yield. This study therefore determines the quantity of water required to produce maize from planting to harvesting and the impact of drought on maize during different growth stages in the region. Rainfall data from six rain gauge stations over a 36-year period (1979-2014) were used for the analysis. The standardized precipitation evapotranspiration index (SPEI) was used to evaluate drought severity. Using the CROPWAT model, evapotranspiration was calculated with the Penman-Monteith method, and the crop water requirements (CWRs) and irrigation schedule for the maize crop were determined. Irrigation was considered for 100% of critical soil moisture loss. The model predicted daily and monthly crop water requirements at different phases of maize growth. The crop water requirement was found to be 319.0 mm and the irrigation requirement 15.5 mm. The CROPWAT 8.0 model adequately estimated the yield reduction caused by water stress and climatic impacts, which makes it appropriate for determining crop water requirements and for irrigation planning and management.
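
As a rough illustration of the quantities named in this abstract, the sketch below uses the standard FAO-style relations: crop evapotranspiration is the crop coefficient times reference evapotranspiration (ETc = Kc × ET0), and the net irrigation requirement is the crop water requirement minus effective rainfall, floored at zero. The function names and all stage values are hypothetical and are not taken from the study or from CROPWAT itself.

```python
def crop_water_requirement(stages):
    """Total crop water requirement in mm.

    stages: iterable of (kc, et0_mm_per_day, days) tuples, one per growth stage.
    Each stage contributes ETc = Kc * ET0 for every day of the stage.
    """
    return sum(kc * et0 * days for kc, et0, days in stages)


def irrigation_requirement(cwr_mm, effective_rain_mm):
    """Net irrigation requirement in mm: CWR minus effective rainfall, never negative."""
    return max(cwr_mm - effective_rain_mm, 0.0)


if __name__ == "__main__":
    # Hypothetical maize stages: (Kc, ET0 in mm/day, length in days).
    stages = [(0.3, 5.0, 20), (0.7, 5.5, 35), (1.2, 6.0, 40), (0.6, 5.0, 30)]
    cwr = crop_water_requirement(stages)
    # Hypothetical seasonal effective rainfall in mm.
    print(cwr, irrigation_requirement(cwr, effective_rain_mm=400.0))
```

A small irrigation requirement next to a much larger crop water requirement, as reported above, is what this relation gives when effective rainfall covers most of the seasonal demand.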

Why Data Capability is Important to become an AI Matured Organization?

  • Gyeung-min Kim
    • Journal of Information Technology Applications and Management / v.31 no.3 / pp.165-179 / 2024
  • Firms with advanced analytics and machine learning (often called AI) capabilities are considered highly successful in the market because they base decisions and actions on quantitative analysis of data. However, the scarcity of historical data and the lack of the right data infrastructure hinder organizations from carrying out such projects. The objective of this study is to identify a roadmap by which an organization can reach data capability maturity and become an AI matured organization. First, this study defines the terms AI capability, data capability, and AI matured organization. Then, using content analysis, the data practices that organizations perform for AI system development and operation are analyzed to infer a data capability roadmap toward becoming an AI matured organization.

Changes in the Winter-Spring Center Timing over Upper Indus River Basin in Pakistan

  • Ali, Shahid;Kam, Jonghun
    • Proceedings of the Korea Water Resources Association Conference / 2021.06a / pp.372-372 / 2021
  • The agriculture sector plays a vital role in the economy of Pakistan, accounting for about 20% of GDP and 42% of the labor force. Rivers originating in the Himalayas are the major water resources for this sector. Recent reports have found that Pakistan is one of the countries most vulnerable to climate change, which can cause water scarcity, a major challenge for communities. Previous studies have investigated the impact of climate change on streamflow trends, but the understanding of seasonal change in regional hydrologic regimes has remained limited. A better understanding of seasonal hydrologic change will therefore help in coping with future water scarcity. In this study, we used daily streamflow data for four major river basins of Pakistan (Chenab, Indus, Jhelum, and Kabul) over 1962-2019. From these daily river discharge data, we calculated the winter-spring and summer-autumn center times. The winter-spring center time (WSCT) is defined as the day of the calendar year on which half of the total six-month (Jan-Jun) discharge volume was exceeded. Results show that the four river basins experienced a statistically significant decreasing trend in WSCT; that is, the center time is arriving earlier than in the past. We further used Climatic Research Unit (CRU) climate data comprising average temperature and precipitation for the four basins and found that increasing average temperature causes earlier melting of snow cover and glaciers, which decreased the first center time value by 4 to 8 days. The findings of this study indicate an alarming situation, for the agriculture sector in particular.
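
The WSCT definition in this abstract can be sketched as a cumulative-sum search over the January to June window. This is a minimal illustration, not the authors' code: the function name `center_timing`, the dictionary input format, and the choice to treat "exceeded" as "reached or exceeded" are assumptions.

```python
from datetime import date, timedelta


def center_timing(daily_flow, year, start_month=1, end_month=6):
    """Day of year on which cumulative discharge first reaches half of the window total.

    daily_flow: dict mapping datetime.date -> daily discharge (any consistent unit).
    The window runs from the first day of start_month to the last day of end_month.
    """
    start = date(year, start_month, 1)
    # First day after the window ends.
    end = date(year + 1, 1, 1) if end_month == 12 else date(year, end_month + 1, 1)

    days = []
    d = start
    while d < end:
        days.append(d)
        d += timedelta(days=1)

    total = sum(daily_flow[d] for d in days)
    if total <= 0:
        raise ValueError("window has no discharge")

    running = 0.0
    for d in days:
        running += daily_flow[d]
        if running >= total / 2:  # assumption: "exceeded" read as "reached or exceeded"
            return d.timetuple().tm_yday
    raise RuntimeError("unreachable for positive totals")


if __name__ == "__main__":
    # Synthetic constant flow over Jan-Jun 2019, for illustration only.
    flows = {date(2019, 1, 1) + timedelta(days=i): 1.0 for i in range(181)}
    print(center_timing(flows, 2019))
```

Computing this value for each year and fitting a trend to the resulting series would show whether the center time is moving earlier, which is the pattern the abstract reports.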

Named entity recognition using transfer learning and small human- and meta-pseudo-labeled datasets

  • Kyoungman Bae;Joon-Ho Lim
    • ETRI Journal / v.46 no.1 / pp.59-70 / 2024
  • We introduce a high-performance named entity recognition (NER) model for written and spoken language. To overcome challenges related to labeled data scarcity and domain shifts, we use transfer learning to leverage our previously developed KorBERT as the base model. We also adopt a meta-pseudo-label method using a teacher/student framework with labeled and unlabeled data. Our model presents two modifications. First, the student model is updated with an average loss from both human- and pseudo-labeled data. Second, the influence of noisy pseudo-labeled data is mitigated by considering feedback scores and updating the teacher model only when below a threshold (0.0005). We achieve the target NER performance in the spoken language domain and improve that in the written language domain by proposing a straightforward rollback method that reverts to the best model based on scarce human-labeled data. Further improvement is achieved by adjusting the label vector weights in the named entity dictionary.

Privacy-Preserving Two-Party Collaborative Filtering on Overlapped Ratings

  • Memis, Burak;Yakut, Ibrahim
    • KSII Transactions on Internet and Information Systems (TIIS) / v.8 no.8 / pp.2948-2966 / 2014
  • To promote recommendation services through prediction quality, some privacy-preserving collaborative filtering solutions have been proposed to let e-commerce parties collaborate on partitioned data. It is quite probable that two parties hold ratings for the same users and items simultaneously; however, existing two-party privacy-preserving collaborative filtering solutions do not cover such overlaps. Since rating values and rated items are confidential, overlapping ratings make privacy preservation more challenging. This study examines how to estimate predictions privately based on partitioned data with overlapped entries between two e-commerce companies. We consider both user-based and item-based collaborative filtering approaches and propose novel privacy-preserving collaborative filtering schemes for this setting. We also evaluate our schemes using a real movie dataset, and the empirical outcomes show that the parties can promote collaborative services using our schemes.

Optimal SMDP-Based Connection Admission Control Mechanism in Cognitive Radio Sensor Networks

  • Hosseini, Elahe;Berangi, Reza
    • ETRI Journal / v.39 no.3 / pp.345-352 / 2017
  • Traffic management is a highly beneficial mechanism for satisfying quality-of-service requirements and overcoming the resource scarcity problems in networks. This paper introduces an optimal connection admission control mechanism to decrease the packet loss ratio and end-to-end delay in cognitive radio sensor networks (CRSNs). This mechanism admits data flows based on the value of information sent by the sensor nodes, the network state, and the estimated required resources of the data flows. The number of required channels of each data flow is estimated using a proposed formula that is inspired by a graph coloring approach. The proposed admission control mechanism is formulated as a semi-Markov decision process and a linear programming problem is derived to obtain the optimal admission control policy for obtaining the maximum reward. Simulation results demonstrate that the proposed mechanism outperforms a recently proposed admission control mechanism in CRSNs.

Calibration for Gingivitis Binary Classifier via Epoch-wise Decaying Label-Smoothing (라벨 스무딩을 활용한 치은염 이진 분류기 캘리브레이션)

  • Lee, Sanghyun
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2021.10a / pp.594-596 / 2021
  • Future healthcare systems will rely heavily on ill-labeled data because of the scarcity of experts trained well enough to label the data. Given such contamination of the dataset, it is undesirable for a neural network to become overconfident on the dataset; leaving some margin in its predictions is preferable. In this paper, we propose a novel epoch-wise decaying label-smoothing function to alleviate model overconfidence, and it outperforms a neural network trained with conventional cross entropy by 6.0%.
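
The abstract does not state the form of the decay function, so the sketch below only illustrates the general idea for a binary classifier: the smoothing strength starts at some value and shrinks with the epoch index, and the smoothed target replaces the hard label in the cross-entropy loss. The linear schedule, the starting value `eps0`, and all function names are assumptions made for illustration, not the paper's method.

```python
import math


def smoothing_at_epoch(epoch, total_epochs, eps0=0.1):
    """Assumed linear decay: eps0 at epoch 0, falling to 0 at the final epoch."""
    if total_epochs <= 1:
        return 0.0
    frac = min(max(epoch / (total_epochs - 1), 0.0), 1.0)
    return eps0 * (1.0 - frac)


def smooth_binary_label(y, eps):
    """Standard two-class label smoothing: pull the hard label toward 0.5 by eps."""
    return y * (1.0 - eps) + 0.5 * eps


def binary_cross_entropy(p, target):
    """Cross entropy between predicted probability p and a (possibly soft) target."""
    p = min(max(p, 1e-12), 1.0 - 1e-12)  # clamp to avoid log(0)
    return -(target * math.log(p) + (1.0 - target) * math.log(1.0 - p))


if __name__ == "__main__":
    total_epochs = 5
    for epoch in range(total_epochs):
        eps = smoothing_at_epoch(epoch, total_epochs, eps0=0.2)
        target = smooth_binary_label(1, eps)
        # Loss for a fixed, very confident prediction under the current soft target.
        print(epoch, round(eps, 3), round(binary_cross_entropy(0.99, target), 4))
```

With a schedule like this, early epochs penalize very confident predictions more, and the last epoch reduces to ordinary cross entropy on hard labels.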

A Movie Recommendation System based on Fuzzy-AHP with User Preference and Partition Algorithm (사용자 선호도와 군집 알고리즘을 이용한 퍼지-계층적 분석 기법 기반 영화 추천 시스템)

  • Oh, Jae-Taek;Lee, Sang-Yong
    • Journal of Digital Convergence / v.15 no.11 / pp.425-432 / 2017
  • Current recommendation systems have several problems: it is difficult to tell whether they recommend items that users actually prefer or items in which users have only a passing interest; data for recommending suitable items are scarce because the number of users is extremely small; and, in the cold-start problem, performance in recommending satisfying items drops as new users arrive. To address these problems, this study implemented a movie recommendation system that aims to ensure user satisfaction by using the Fuzzy-Analytic Hierarchy Process (Fuzzy-AHP), which can reflect uncertain situations and problems, and a data partition algorithm that groups similar items among those given. Survey data on the movie preferences of 61 users were applied to the system, and the results show that it solved the data scarcity problem based on the Fuzzy-AHP and recommended items that fit a user with the data partition algorithm even with the influx of new users. Research on density-based clustering will likely be needed to filter out noise or outlier data in the future.

Document Image Binarization by GAN with Unpaired Data Training

  • Dang, Quang-Vinh;Lee, Guee-Sang
    • International Journal of Contents / v.16 no.2 / pp.8-18 / 2020
  • Data is critical in deep learning, but data scarcity often occurs in research, especially in the preparation of paired training data. In this paper, document image binarization with unpaired data is studied by introducing adversarial learning, removing the need for supervised or labeled datasets. However, a simple extension of previous unpaired training to binarization inevitably leads to poor performance compared to paired data training. Thus, a new deep learning approach is proposed by introducing a multi-diversity of higher quality generated images. A two-stage model is proposed that comprises a generative adversarial network (GAN) followed by a U-net network. In the first stage, the GAN uses the unpaired image data to create paired image data. In the second stage, the generated paired image data are passed through the U-net network for binarization. The trained U-net thus serves as the binarization model at test time. The proposed model has been evaluated on the publicly available DIBCO dataset, and it outperforms other techniques on unpaired training data. The paper shows, for the first time in the literature, the potential of using unpaired data for binarization, which can be further improved to replace paired data training for binarization in the future.

Multi-channel Long Short-Term Memory with Domain Knowledge for Context Awareness and User Intention

  • Cho, Dan-Bi;Lee, Hyun-Young;Kang, Seung-Shik
    • Journal of Information Processing Systems / v.17 no.5 / pp.867-878 / 2021
  • In context awareness and user intention tasks, dataset construction is expensive because domain-specific data are required. Although pretraining with a large corpus can effectively resolve the lack of data, it ignores domain knowledge. Herein, we concentrate on data domain knowledge while addressing data scarcity and accordingly propose a multi-channel long short-term memory (LSTM). Because multi-channel LSTM integrates pretrained vectors such as task and general knowledge, it effectively prevents catastrophic forgetting between vectors of task and general knowledge to represent the context as a set of features. To evaluate the proposed model against the baseline model, which is a single-channel LSTM, we performed two tasks: voice phishing with context awareness and movie review sentiment classification. The results verified that multi-channel LSTM outperforms single-channel LSTM in both tasks. We further experimented with different multi-channel LSTMs depending on the domain and data size of the general knowledge in the model and confirmed the effect of a multi-channel LSTM that integrates the two types of knowledge, from downstream task data and raw data, in overcoming the lack of data.