• Title/Summary/Keyword: unbalanced data

A Study on the Relationship between the Raising Conditions and the Physical Growth and Health in Early Childhood (유아의 신체 발육 및 건강도에 대한 생활 제 조건의 관여도에 관한 연구)

  • Kim, Myung
    • Korean Journal of Health Education and Promotion / v.12 no.1 / pp.111-127 / 1995
  • This study investigated the relationships between children's physical growth and health and their raising conditions, and sought to identify the conditions most important for improving growth and health. The raising conditions were classified into three major areas: family conditions, nutritional intake, and rest or sleep and exercise or play. A questionnaire containing items on these three areas and items evaluating health status was given to the children's mothers or fathers, who completed it. Data on four anthropometric measures (body weight, stature, sitting height, and chest girth) were collected from the children's latest health examination records. The health status data were converted to health scores representing six domains: digestive organs, respiratory organs, autonomic nervous system, fatigue, others, and health as a whole. Correlations of the raising conditions were then determined with the four anthropometric measures and six health scores as criterion variables. The following items were selected for examining their relationships with physical growth and health status: in the domain of family conditions, the number of family members living together and the child's birth order; in the domain of nutritional intake, unbalanced diet habits and the eating frequency of eggs, fruits, green and yellow vegetables, light-colored vegetables, and milk products; and in the domain of rest and play, the time for study at home, the place for play, the number of playmates, the hours of outdoor play, the hours of playing sports, and the hours of physical activity spent helping with housekeeping. Unbalanced diet habits and the eating frequency of eggs, green or yellow vegetables, and milk products were found to be the more influential conditions; in addition, birth order, time for study at home, and time for outdoor play showed a moderate degree of association with physical growth and health status in early childhood.

Rendezvous Node Selection in Interworking of a Drone and Wireless Sensor Networks (드론과 무선 센서 네트워크 연동에서 랑데부 노드 선정)

  • Min, Hong;Jung, Jinman;Heo, Junyoung;Kim, Bongjae
    • The Journal of the Institute of Internet, Broadcasting and Communication / v.17 no.1 / pp.167-172 / 2017
  • Mobile nodes are used to prolong the lifetime of wireless sensor networks, and with the development of drone technology, many studies have examined using drones to collect data. Associating a drone with tactical wireless sensor networks improves real-time performance and efficiency. Previous studies focus so heavily on reducing the drone's flight distance that the energy consumption of sensor nodes becomes unbalanced. This unbalanced energy consumption accelerates network partition and increases the drone's flight distance. In this paper, we propose a new selection scheme that considers both the drone's flight distance and the nodes' lifetime when selecting rendezvous nodes, which collect data from their cluster and communicate directly with the drone.

Bayesian Model Selection in the Unbalanced Random Effect Model

  • Kim, Dal-Ho;Kang, Sang-Gil;Lee, Woo-Dong
    • Journal of the Korean Data and Information Science Society / v.15 no.4 / pp.743-752 / 2004
  • In this paper, we develop a Bayesian model selection procedure using the reference prior for comparing two nested models, such as the independent and intraclass models, using the distance or divergence between the two as the basis of comparison. A suitable criterion for this is the power divergence measure introduced by Cressie and Read (1984). This measure includes the Kullback-Leibler and Hellinger divergence measures as special cases. For this problem, the power divergence measure turns out to be a function solely of $\rho$, the intraclass correlation coefficient. This function is also convex, and its minimum is attained at $\rho=0$. We use the reference prior for $\rho$. Owing to the duality between hypothesis tests and set estimation, the hypothesis testing problem can also be solved by solving a corresponding set estimation problem. The present paper develops a Bayesian method based on the Kullback-Leibler and Hellinger divergence measures, rejecting $H_0:\rho=0$ when the specified divergence measure exceeds some number $d$. This number $d$ is chosen so that the resulting credible interval for the divergence measure has a specified coverage probability $1-\alpha$. The length of such an interval is compared with the equal two-tailed credible interval and the HPD credible interval for $\rho$ with the same coverage probability, which can also be inverted into acceptance regions of $H_0:\rho=0$. An example is considered in which the HPD interval based on the one-at-a-time reference prior turns out to be the shortest credible interval having the same coverage probability.
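
The statement that the power divergence family contains the Kullback-Leibler and Hellinger measures as special cases can be checked numerically. The sketch below is only an illustration: it uses the discrete form of the Cressie-Read family on two made-up probability vectors, not the continuous intraclass model of the paper, and the function name `power_divergence` and the inputs are assumptions introduced here.

```python
import math

def power_divergence(p, q, lam):
    """Discrete Cressie-Read power divergence between p and q.

    Assumes p and q are strictly positive and each sums to 1.
    lam -> 0 gives the Kullback-Leibler divergence KL(p || q);
    lam = -1/2 gives twice the sum of squared differences of square roots
    (a Hellinger-type divergence).
    """
    if abs(lam) < 1e-12:
        # Limiting case lam -> 0: Kullback-Leibler divergence
        return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))
    if abs(lam + 1.0) < 1e-12:
        # Limiting case lam -> -1: Kullback-Leibler divergence with roles swapped
        return sum(qi * math.log(qi / pi) for pi, qi in zip(p, q))
    total = sum(pi * ((pi / qi) ** lam - 1.0) for pi, qi in zip(p, q))
    return total / (lam * (lam + 1.0))

# Hypothetical example distributions
p = [0.5, 0.3, 0.2]
q = [0.4, 0.4, 0.2]

kl = power_divergence(p, q, 0.0)
hellinger_type = power_divergence(p, q, -0.5)
print(kl, hellinger_type)
```

With `lam = -0.5` the value equals `2 * sum((sqrt(p_i) - sqrt(q_i))**2)`, which is the sense in which the Hellinger measure is a member of the family.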

The Impact of Dual Labor Markets on Labor Productivity: Evidence from the OECD (노동시장 이중구조가 노동생산성에 미치는 영향: OECD 국가를 중심으로)

  • Choi, Koangsung;Lee, Jieun;Choe, Chung
    • Economic Analysis / v.25 no.3 / pp.1-29 / 2019
  • This paper examines the impact of a dual labor market structure on labor productivity using unbalanced panel data from 29 OECD member countries between 1990 and 2015. By applying a variety of regression models to the panel data (e.g., pooled regression, a fixed effects model, and GMM), we explore how changes in worker-type composition among temporary, permanent, and self-employed workers contribute to productivity growth. While our results differ slightly depending on the econometric model, overall an increase in the share of permanent workers leads to a relatively higher increase in productivity growth. On the other hand, the effects of the share of temporary workers on labor productivity are considerably lower than those of permanent and self-employed workers. In sum, our findings indicate that an increase in temporary workers could have an adverse effect on labor productivity.
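
As a rough illustration of how a fixed effects model handles an unbalanced panel, the sketch below applies the within transformation (demeaning each unit by its own mean) to units observed for different numbers of periods. Everything in it is an assumption made for illustration: the single regressor, the toy data, and the name `within_estimator` are not taken from the paper, and the pooled and GMM estimators are not shown.

```python
from collections import defaultdict

def within_estimator(rows):
    """Fixed-effects (within) slope for one regressor.

    rows: iterable of (unit, x, y). Units may have different numbers of
    observations, which is what makes the panel unbalanced.
    """
    groups = defaultdict(list)
    for unit, x, y in rows:
        groups[unit].append((x, y))

    sxy = 0.0
    sxx = 0.0
    for obs in groups.values():
        # Demean within each unit; this removes the unit-specific intercept
        mean_x = sum(x for x, _ in obs) / len(obs)
        mean_y = sum(y for _, y in obs) / len(obs)
        for x, y in obs:
            sxy += (x - mean_x) * (y - mean_y)
            sxx += (x - mean_x) ** 2
    return sxy / sxx

# Toy unbalanced panel: y = unit effect + 2 * x, with 3, 2, and 4 observations
rows = [
    ("A", 1, 12), ("A", 2, 14), ("A", 3, 16),
    ("B", 2, 1), ("B", 5, 7),
    ("C", 0, 0), ("C", 1, 2), ("C", 4, 8), ("C", 7, 14),
]
beta = within_estimator(rows)
print(beta)
```

Because each unit is demeaned separately, the differing unit intercepts (10, -3, and 0 in the toy data) drop out and the common slope is recovered regardless of how many periods each unit contributes.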

Text filtering by Boosting Linear Perceptrons

  • O, Jang-Min;Zhang, Byoung-Tak
    • Journal of the Korean Institute of Intelligent Systems / v.10 no.4 / pp.374-378 / 2000
  • In information retrieval, a lack of positive examples is a main cause of poor performance. In this case, most learning algorithms may fail to capture the characteristics of the data, which leads to low recall. To solve the problem of unbalanced data, we propose a boosting method that uses linear perceptrons as weak learners. The perceptrons are trained on local data sets. The proposed algorithm is applied to a text filtering problem for which only a small portion of positive examples is available. In an experiment on the crude category of the Reuters-21578 document set, the boosting method achieved a recall of 80.8%, a 37.2% improvement over multilayer perceptrons with comparable precision.
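
The general idea of boosting with linear perceptrons as weak learners might look like the following sketch. It is a plain AdaBoost loop around a weighted perceptron, written under assumptions: the abstract does not specify the weighting scheme or how the local data sets are formed, so neither is reproduced here, and all function names and the toy data are hypothetical.

```python
import math

def train_perceptron(X, y, weights, epochs=20):
    """Perceptron whose mistake-driven updates are scaled by example weight."""
    dim = len(X[0])
    theta = [0.0] * (dim + 1)  # last entry is the bias term
    for _ in range(epochs):
        for xi, yi, wi in zip(X, y, weights):
            score = theta[-1] + sum(t * v for t, v in zip(theta, xi))
            if yi * score <= 0:  # misclassified (or on the boundary)
                for j in range(dim):
                    theta[j] += wi * yi * xi[j]
                theta[-1] += wi * yi
    return theta

def perceptron_predict(theta, xi):
    score = theta[-1] + sum(t * v for t, v in zip(theta, xi))
    return 1 if score > 0 else -1

def boost(X, y, rounds=10):
    """AdaBoost with weighted perceptrons as weak learners; labels are +1/-1."""
    n = len(X)
    weights = [1.0 / n] * n
    ensemble = []
    for _ in range(rounds):
        theta = train_perceptron(X, y, weights)
        preds = [perceptron_predict(theta, xi) for xi in X]
        err = sum(w for w, p, yi in zip(weights, preds, y) if p != yi)
        if err >= 0.5:
            break  # weak learner is no better than chance; stop
        clamped = max(err, 1e-10)  # avoid division by zero for a perfect learner
        alpha = 0.5 * math.log((1.0 - clamped) / clamped)
        ensemble.append((alpha, theta))
        if err == 0.0:
            break  # nothing left to reweight
        # Increase the weight of misclassified examples, then renormalize
        weights = [w * math.exp(-alpha * yi * p)
                   for w, p, yi in zip(weights, preds, y)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return ensemble

def boost_predict(ensemble, xi):
    vote = sum(alpha * perceptron_predict(theta, xi) for alpha, theta in ensemble)
    return 1 if vote > 0 else -1

# Toy unbalanced data: four negatives, one positive
X = [[-1.0], [-2.0], [-3.0], [-4.0], [3.0]]
y = [-1, -1, -1, -1, 1]
model = boost(X, y)
print([boost_predict(model, xi) for xi in X])
```

The reweighting step is what targets unbalanced data: examples the current perceptron gets wrong, typically the rare positives, receive more weight in the next round.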

Virtual System for Manufacture of Train Tilting eXpress using Project Data Management (틸팅 차량 설계를 위한 Virtual System 구축 연구)

  • Song, Yongsoo;Han, Seong-ho;Seo, Sung-Il
    • Journal of the Korean Society of Systems Engineering / v.1 no.2 / pp.19-25 / 2005
  • Tilting trains have been developed to increase the operational speed of trains on conventional lines that have many curves. The train tilts at curves to compensate for unbalanced carbody centrifugal acceleration to a greater extent than the compensation produced by the track cant, so that passengers do not feel the centrifugal acceleration and trains can run at higher speeds through curves. This paper develops a PDM (product data management) system to support the systems engineering of the TTX (Tilting Train eXpress), which has a maximum operating speed of 180 km/h.

Generative Model of Acceleration Data for Deep Learning-based Damage Detection for Bridges Using Generative Adversarial Network (딥러닝 기반 교량 손상추정을 위한 Generative Adversarial Network를 이용한 가속도 데이터 생성 모델)

  • Lee, Kanghyeok;Shin, Do Hyoung
    • Journal of KIBIM / v.9 no.1 / pp.42-51 / 2019
  • Maintenance of aging structures has attracted societal attention, and it can be performed efficiently with a digital twin. Maintaining a structure based on a digital twin requires accurately detecting damage to the structure. Deep learning-based approaches have shown good performance in detecting structural damage. However, developing such approaches requires a large amount of data from both before and after damage, and in reality the amounts of pre-damage and post-damage data are unbalanced. To solve this problem, this study proposes a method based on a generative adversarial network (GAN), a type of generative model, for generating the acceleration data commonly used in damage detection. The results confirm that the acceleration data generated by the GAN have a pattern very similar to the acceleration data generated by simulation with structural analysis software. They show that not only the macroscopic pattern of the data but also its frequency-domain characteristics can be reproduced. These findings indicate that the GAN model can analyze complex acceleration data on its own, and the generated data are expected to help train deep learning-based damage detection approaches.

An Automatic Urban Function District Division Method Based on Big Data Analysis of POI

  • Guo, Hao;Liu, Haiqing;Wang, Shengli;Zhang, Yu
    • Journal of Information Processing Systems / v.17 no.3 / pp.645-657 / 2021
  • Along with the rapid development of the economy, the urban scale has extended rapidly, leading to the formation of different types of urban function districts (UFDs), such as central business, residential and industrial districts. Recognizing the spatial distributions of these districts is of great significance to manage the evolving role of urban planning and further help in developing reliable urban planning programs. In this paper, we propose an automatic UFD division method based on big data analysis of point of interest (POI) data. Considering that the distribution of POI data is unbalanced in a geographic space, a dichotomy-based data retrieval method was used to improve the efficiency of the data crawling process. Further, a POI spatial feature analysis method based on the mean shift algorithm is proposed, where data points with similar attributive characteristics are clustered to form the function districts. The proposed method was thoroughly tested in an actual urban case scenario and the results show its superior performance. Further, the suitability of fit to practical situations reaches 88.4%, demonstrating a reasonable UFD division result.
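
The clustering step can be pictured with a minimal mean shift sketch. This is a flat-kernel version on 2D coordinates only, written under assumptions: the `bandwidth` value, the merge threshold, and the toy points are hypothetical, and the paper's use of POI attribute characteristics is not reproduced.

```python
import math

def mean_shift(points, bandwidth, max_iters=50, tol=1e-6):
    """Flat-kernel mean shift on 2D points.

    Returns (centers, labels), where labels[i] indexes the center that
    points[i] converged to.
    """
    modes = []
    for start in points:
        x, y = start
        for _ in range(max_iters):
            # Neighbors of the current position within the bandwidth
            nbrs = [q for q in points if math.dist((x, y), q) <= bandwidth]
            if not nbrs:
                break
            new_x = sum(q[0] for q in nbrs) / len(nbrs)
            new_y = sum(q[1] for q in nbrs) / len(nbrs)
            shift = math.dist((x, y), (new_x, new_y))
            x, y = new_x, new_y
            if shift < tol:
                break
        modes.append((x, y))

    # Merge modes that ended up close together into one cluster center
    centers, labels = [], []
    for mode in modes:
        for k, center in enumerate(centers):
            if math.dist(mode, center) < bandwidth / 2:
                labels.append(k)
                break
        else:
            centers.append(mode)
            labels.append(len(centers) - 1)
    return centers, labels

# Two hypothetical groups of POI coordinates
pois = [(0, 0), (0, 1), (1, 0), (1, 1),
        (10, 10), (10, 11), (11, 10), (11, 11)]
centers, labels = mean_shift(pois, bandwidth=3.0)
print(centers, labels)
```

Mean shift does not need the number of clusters in advance, which suits a setting where the number of function districts is unknown; the bandwidth controls how coarse the resulting districts are.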

An Efficient Wireless Signal Classification Based on Data Augmentation (데이터 증강 기반 효율적인 무선 신호 분류 연구)

  • Sangsoon Lim
    • Journal of Platform Technology / v.10 no.4 / pp.47-55 / 2022
  • The number of devices using different wireless technologies is steadily increasing in the IoT environment. To accurately identify the various radio signal modulation techniques, it is essential to design an efficient feature extraction approach and to detect the exact types of radio signals. Various deep learning-based techniques have been proposed for wireless signal classification. However, it is difficult to gather labeled wireless signals in a real environment because of the complexity of the process, and when the training dataset is insufficient, deep learning models frequently overfit, which degrades classification performance. In this paper, we propose a data augmentation technique based on a generative adversarial network (GAN) to improve classification performance when various wireless signals exist. When the amount of data representing a specific radio signal is small or unbalanced relative to the other signal types, the proposed solution is used to increase the amount of data for that signal. To verify the proposed data augmentation algorithm, we generated additional data for the specific wireless signal and implemented a CNN and LSTM-based wireless signal classifier based on the result of balancing. The experimental results show that the classification accuracy of the proposed solution is higher than when the data are unbalanced.
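
Before any generator is trained, the balancing step reduces to deciding how many synthetic samples each signal class needs. The sketch below covers only that bookkeeping, under the assumption that every class is topped up to the size of the largest one; the GAN itself is not shown, and the class labels and the name `augmentation_plan` are hypothetical.

```python
from collections import Counter

def augmentation_plan(labels):
    """Number of synthetic samples needed per class to match the largest class."""
    counts = Counter(labels)
    target = max(counts.values())
    return {cls: target - n for cls, n in counts.items()}

# Hypothetical unbalanced label set for three signal types
labels = ["wifi"] * 5 + ["zigbee"] * 2 + ["ble"] * 1
plan = augmentation_plan(labels)
print(plan)
```

In a full pipeline, each nonzero entry of the plan would be the number of samples requested from a generator trained on that class.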

An Adaptive Workflow Scheduling Scheme Based on an Estimated Data Processing Rate for Next Generation Sequencing in Cloud Computing

  • Kim, Byungsang;Youn, Chan-Hyun;Park, Yong-Sung;Lee, Yonggyu;Choi, Wan
    • Journal of Information Processing Systems / v.8 no.4 / pp.555-566 / 2012
  • The cloud environment makes it possible to analyze large data sets in a scalable computing infrastructure. In the bioinformatics field, applications are composed of complex workflow tasks, which require huge data storage as well as a compute-intensive parallel workload. Many distributed solutions have been introduced. However, they focus on static resource provisioning with a batch-processing scheme in a local computing farm and data storage. For a large-scale workflow system, it is inevitable and valuable to outsource all or part of the tasks to public clouds to reduce resource costs. Problems arise, however, from the transfer time for huge datasets and from the unbalanced completion times of different problem sizes. In this paper, we propose an adaptive resource-provisioning scheme that includes run-time data distribution and collection services for hiding the data transfer time. The proposed adaptive resource-provisioning scheme optimizes the allocation ratio of computing elements to the different datasets in order to minimize the total makespan under resource constraints. We conducted experiments with a well-known sequence alignment algorithm, and the results showed that the proposed scheme is efficient for the cloud environment.
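
The idea of choosing an allocation ratio that evens out completion times can be sketched with a simple greedy rule. This is not the paper's scheme: it assumes each dataset's work divides perfectly across its computing elements, that the estimated workload (data size divided by per-element processing rate) is known in advance, and the name `allocate` and the numbers are hypothetical.

```python
def allocate(workloads, total_elements):
    """Greedy allocation of computing elements to datasets.

    workloads: estimated work per dataset (e.g. size / processing rate).
    Each dataset gets one element, then every remaining element goes to
    the dataset with the longest current estimated completion time.
    """
    if total_elements < len(workloads):
        raise ValueError("need at least one computing element per dataset")
    alloc = [1] * len(workloads)
    for _ in range(total_elements - len(workloads)):
        times = [w / a for w, a in zip(workloads, alloc)]
        slowest = times.index(max(times))
        alloc[slowest] += 1
    return alloc

def makespan(workloads, alloc):
    """Estimated completion time of the slowest dataset."""
    return max(w / a for w, a in zip(workloads, alloc))

# Three datasets of very different sizes sharing ten computing elements
work = [60, 30, 10]
alloc = allocate(work, 10)
print(alloc, makespan(work, alloc))  # [6, 3, 1] 10.0
```

An even split of the ten elements would leave the largest dataset as the bottleneck, whereas the size-aware split finishes all three at the same estimated time.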