• Title/Summary/Keyword: Large Scale Data

F_MixBERT: Sentiment Analysis Model using Focal Loss for Imbalanced E-commerce Reviews

  • Fengqian Pang;Xi Chen;Letong Li;Xin Xu;Zhiqiang Xing
    • KSII Transactions on Internet and Information Systems (TIIS), v.18 no.2, pp.263-283, 2024
  • Users' comments after online shopping are critical to product reputation and business improvement. These comments, often called e-commerce reviews, influence other customers' purchasing decisions. To cope with the large volume of e-commerce reviews, automatic analysis based on machine learning and deep learning is drawing increasing attention. A core task therein is sentiment analysis. However, e-commerce reviews exhibit the following characteristics: (1) inconsistency between comment content and star rating; (2) a large amount of unlabeled data, i.e., comments without a star rating; and (3) data imbalance caused by sparse negative comments. This paper employs Bidirectional Encoder Representations from Transformers (BERT), one of the best natural language processing models, as the base model. Based on the above data characteristics, we propose the F_MixBERT framework to make more effective use of inconsistent, low-quality, and unlabeled data and to resolve the problem of data imbalance. In the framework, the proposed MixBERT incorporates the MixMatch approach into BERT's high-dimensional vectors to train on the unlabeled and low-quality data with generated pseudo labels. Meanwhile, data imbalance is addressed by focal loss, which reduces the contribution of large-scale (majority-class) data and easily identifiable data to the total loss. Comparative experiments demonstrate that the proposed framework outperforms BERT and MixBERT for sentiment analysis of e-commerce comments.
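
To illustrate how focal loss reduces the weight of easily classified examples, here is a minimal sketch of the standard per-example focal loss, FL(p) = -alpha * (1 - p)^gamma * log(p), where p is the predicted probability of the true class. The function names and the defaults gamma = 2 and alpha = 1 are assumptions for illustration; the abstract does not state the settings used in F_MixBERT.

```python
import math


def cross_entropy(p_true: float) -> float:
    """Plain cross-entropy for one example, given the probability of its true class."""
    return -math.log(p_true)


def focal_loss(p_true: float, gamma: float = 2.0, alpha: float = 1.0) -> float:
    """Standard focal loss for one example (gamma and alpha defaults are assumed).

    The factor (1 - p_true) ** gamma shrinks the loss of confident, correct
    predictions, so rare or hard examples dominate the total loss.
    """
    return -alpha * (1.0 - p_true) ** gamma * math.log(p_true)


if __name__ == "__main__":
    for p in (0.9, 0.5, 0.1):  # easy, medium, and hard examples
        ratio = focal_loss(p) / cross_entropy(p)
        print(f"p={p}: focal/cross-entropy ratio = {ratio:.2f}")
```

With gamma = 0 the expression reduces to ordinary cross-entropy, which makes it easy to compare the two losses on the same predictions.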

Estimation of the methane generation rate constant using a large-scale respirometer at a landfill site

  • Park, Jin-Kyu;Tameda, Kazuo;Higuchi, Sotaro;Lee, Nam-Hoon
    • Environmental Engineering Research, v.22 no.4, pp.339-346, 2017
  • The objective of this study is to evaluate the performance of a 17.7 L large-scale respirometer (LSR) in determining methane generation rate constant (k) values. To this end, anaerobic (GB21) and LSR tests were compared. The data were modeled using a linear function, and the resulting coefficient of determination ($R^2$) of the linear regression is 0.91. This result shows that, despite the aerobic conditions, the biodegradability values obtained from the LSR test are similar to those from the GB21 test. In this respect, the LSR test can serve as an indicator of the anaerobic biodegradability of landfill waste. In addition, the results show high repeatability of the tests, with an average coefficient of variation (CV) lower than 10%; furthermore, the CV for the LSR is lower than that of the GB21, which indicates that the LSR test method could provide a better representation of waste samples. Therefore, the LSR method allows both the prediction of the long-term biodegradation potential in a shorter time and the reduction of the sampling errors caused by the heterogeneity of waste samples. The k values are $0.156\,\mathrm{y}^{-1}$ and $0.127\,\mathrm{y}^{-1}$ for the cumulative biogas production (GB21) and the cumulative oxygen uptake for the LSR, respectively.
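
To give a feel for the reported k values, the sketch below converts them to half-lives, assuming k is the rate constant of a first-order decay model (the usual setting for a methane generation rate constant; the abstract does not spell out the model). The function names are illustrative.

```python
import math


def half_life_years(k_per_year: float) -> float:
    """Time for the remaining degradable fraction to halve under first-order decay."""
    return math.log(2.0) / k_per_year


def fraction_remaining(k_per_year: float, years: float) -> float:
    """Fraction of the degradable material left after `years` (first-order assumption)."""
    return math.exp(-k_per_year * years)


if __name__ == "__main__":
    # k values quoted in the abstract, in 1/year
    for label, k in (("GB21", 0.156), ("LSR", 0.127)):
        print(f"{label}: k = {k} 1/y, half-life = {half_life_years(k):.2f} y")
```

Under this assumption the two k values correspond to half-lives of roughly 4.4 and 5.5 years.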

Synoptic Characteristics of Cold Days over South Korea and Their Relationship with Large-Scale Climate Variability (한반도 혹한 발생시 종관장 특성과 대규모 기후 변동성 간의 연관성)

  • Yoo, Yeong-Eun;Son, Seok-Woo;Kim, Hyeong-Seog;Jeong, Jee-Hoon
    • Atmosphere, v.25 no.3, pp.435-447, 2015
  • This study explores the synoptic characteristics of cold days over South Korea and their relationship with large-scale climate variability. A cold day, which is distinct from a cold surge, is defined as a day on which the daily-mean surface air temperature, averaged over 11 KMA stations, is colder than the 1st-percentile temperature of each year, taking into account its long-term trend over 1960-2012. Such events are detected by quantile regression, and the related synoptic patterns are identified in reanalysis data. Composite geopotential height anomalies at 500 hPa show that cold days are often preceded by positive anomalies in high latitudes and negative anomalies in midlatitudes to the west of Korea. While the former are quasi-stationary and quasi-barotropic, and often qualify as blocking highs, the latter are associated with transient cyclones. On cold days, the north-south dipole in geopotential height anomalies becomes a west-east dipole in the lower troposphere as the high-latitude anticyclone expands equatorward to northern China and the midlatitude cyclone moves eastward and rapidly develops over the East Sea. The resulting northerlies cause cold days in Korea. Composite analyses of large-scale climate indices further show that these cold days are favored when the Arctic Oscillation is in its negative phase and/or the East Asian monsoon circulation and the Siberian high are anomalously strong.
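
The idea of a trend-aware percentile threshold can be sketched as follows. This is a deliberately simplified stand-in, not the paper's method: it removes an ordinary least-squares trend and then takes the 1st percentile of the residuals, whereas the study fits the percentile directly with quantile regression. All function names and the synthetic usage are assumptions.

```python
import math


def linear_trend(times, temps):
    """Ordinary least-squares slope and intercept of temps against times."""
    n = len(times)
    mean_t = sum(times) / n
    mean_y = sum(temps) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times, temps))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_t


def nearest_rank_percentile(values, q):
    """q-th percentile (0 < q <= 100) using the nearest-rank definition."""
    ordered = sorted(values)
    rank = max(1, math.ceil(q * len(ordered) / 100.0))
    return ordered[rank - 1]


def cold_day_indices(times, temps, q=1.0):
    """Indices of days whose detrended temperature is at or below the q-th percentile."""
    slope, intercept = linear_trend(times, temps)
    residuals = [y - (slope * t + intercept) for t, y in zip(times, temps)]
    threshold = nearest_rank_percentile(residuals, q)
    return [i for i, r in enumerate(residuals) if r <= threshold]
```

Detrending first keeps a warming trend from concentrating all detected cold days in the early part of the record, which is the motivation the abstract gives for considering the long-term trend.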

Strength Characteristics of Square Concrete Column Confined by Carbon Composite Tube (탄소섬유튜브로 횡구속된 각형 콘크리트 기둥의 압축강도 성능에 관한 연구)

  • 홍원기;김희철;윤석한;박순섭
    • Journal of the Earthquake Engineering Society of Korea, v.7 no.1, pp.1-7, 2003
  • The carbon composite tube can play an important role in replacing or complementing longitudinal and transverse reinforcing steel by providing ductility and strength for conventional columns. In this study, both experimental and analytical investigations of the axial behavior of large-scale square concrete columns confined by carbon composite tubes are presented. The specimens are filament-wound carbon composites with winding angles of $90^{\circ}+30^{\circ}$ and $90^{\circ}+45^{\circ}$ with respect to the longitudinal axis of the tube. The instrumented large-scale concrete-filled composite tubes (CFCT) are subjected to monotonic axial loads exerted by a 10,000 kN UTM. The influence of the winding angle and tube thickness on the stress-strain relationships of the confined columns is identified and discussed. The proposed equations for predicting both the strength and ductility of columns confined by carbon composite tubes show good correlation with test data obtained from the large-scale specimens.

A Change of Large-scale Circulations in the Indian Ocean and Asia Since 1976/77 and Its Impact on the Rising Surface Temperature in Siberia

  • Lim, Han-Cheol;Jhun, Jong-Ghap;Kwon, Won-Tae;Moon, Byung-Kwon
    • Journal of the Korean earth science society, v.30 no.5, pp.660-670, 2009
  • This study examines changes in the interdecadal circulation over the Asian continent to find the cause of the surface warming in Siberia from 1958 to 2004. According to our study, there is coherence between the long-term change of sea surface temperature in the Indian Ocean and the rapid increase of air temperature in Siberia since 1976/1977. We suggest that mean wind field changes induced by the positive sea surface temperature anomalies of the Indian Ocean since 1976/1977 are a cause of the interdecadal variations in the large-scale circulation over the Asian continent. The results also indicate that the interdecadal circulation over the Asian continent is accompanied by warm southerly winds near the surface, which have significantly contributed to the increase of surface temperature in Siberia. These southerly winds have been one of the most dominant interdecadal variations over the Asian continent since 1976/1977. In addition, we investigated the long-term trend mode of 850 hPa geopotential height data over the Asian continent using Empirical Orthogonal Function (EOF) analysis for 1958-2004. As a result, we found an anomalously high pressure pattern over the Asian continent, which we call 'the Asian High mode'. It is thus suggested that the Asian High mode is another response to interdecadal changes of large-scale circulations over the Asian continent.

Mass Production of Poly(3-Hydroxybutyrate) by Fed-Batch Cultures of Ralstonia eutropha with Nitrogen and Phosphate Limitation

  • Ryu, Hee-Wook;Cho, Kyung-Suk;Kim, Beom-Soo;Chang, Yong-Keun;Chang, Ho-Nam;Shim, Hyun-Joo
    • Journal of Microbiology and Biotechnology, v.9 no.6, pp.751-756, 1999
  • For mass production of poly(3-hydroxybutyrate) (PHB), high cell density cultures of Ralstonia eutropha were carried out in 2.5-L and 60-L fermentors by two fed-batch culture techniques: nitrogen limitation and phosphate limitation. When the nitrogen limitation technique was employed with an on-line glucose monitoring and control system, a high PHB concentration (121 g/L) was obtained in the small-scale 2.5-L fermentor. However, the PHB concentration obtained in the large-scale 60-L fermentor was only 60 g/L. In contrast, when the phosphate limitation fed-batch technique with dissolved oxygen (DO)-stat glucose feeding was used, a large amount of PHB was successfully produced in both the 60-L and 2.5-L fermentors. In the 2.5-L fermentor, the concentrations of PHB and cells obtained in 58 h were 175 and 210 g/L, respectively, corresponding to a PHB productivity of 3.02 g/L/h. In the 60-L fermentor, a final cell concentration of 221 g/L and a PHB concentration of 180 g/L, with a PHB productivity of 3.75 g/L/h, were obtained in 48 h. The PHB content and the yield from glucose were 81% and 0.38 g PHB/g glucose, respectively. These data suggest that the phosphate limitation technique is more effective than nitrogen limitation for the large-scale mass production of PHB by R. eutropha.
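
The productivity and content figures quoted for the phosphate-limited runs follow from simple ratios, which the short check below reproduces. It assumes that productivity is the final PHB concentration divided by culture time and that PHB content is PHB concentration divided by cell concentration; the function names are illustrative.

```python
def phb_productivity(phb_g_per_l: float, hours: float) -> float:
    """Volumetric productivity in g/L/h: final PHB concentration over culture time."""
    return phb_g_per_l / hours


def phb_content_percent(phb_g_per_l: float, cells_g_per_l: float) -> float:
    """PHB as a percentage of total cell mass."""
    return 100.0 * phb_g_per_l / cells_g_per_l


if __name__ == "__main__":
    # 2.5-L fermentor: 175 g/L PHB in 58 h
    print(f"2.5-L productivity: {phb_productivity(175, 58):.2f} g/L/h")  # 3.02
    # 60-L fermentor: 180 g/L PHB and 221 g/L cells in 48 h
    print(f"60-L productivity:  {phb_productivity(180, 48):.2f} g/L/h")  # 3.75
    print(f"60-L PHB content:   {phb_content_percent(180, 221):.0f} %")  # 81
```

The 81% content matches the 60-L run (180 of 221 g/L), which suggests the reported content and yield refer to that fermentor.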

Role-based Self-Organization Protocol of Clustering Hierarchy for Wireless Sensor Networks (무선 센서 네트워크를 위한 계층형 클러스터링의 역할 기반 자가 구성 프로토콜)

  • Go, Sung-Hyun;Kim, Hyoung-Jin
    • Journal of the Korea Society of Computer and Information, v.13 no.6, pp.137-145, 2008
  • In general, a large-scale wireless sensor network (WSN) is composed of hundreds or thousands of sensor nodes. Such large-scale networks must be maintained and managed so as to lower management cost and achieve high energy efficiency. Users should be provided with sensing services at the quality level they require through an efficient system. In evaluating the quality of the data that the network delivers to users, the number of sensors involved in event detection plays an important role. Accordingly, a network protocol that provides QoS at the level users demand should be designed so that the overall system function is not affected even if some sensor nodes fail, while energy consumption is minimized at the same time. The protocol proposed in this article is based on the LEACH protocol and is a role-based self-organization protocol suited to large-scale networks that need constant monitoring.
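
For readers unfamiliar with LEACH, the sketch below shows the cluster-head election threshold of the original LEACH protocol as it is commonly described, T = p / (1 - p * (r mod 1/p)), where p is the desired fraction of cluster heads and r is the round number. It does not implement the role-based extension proposed in the article, and the function names, node IDs, and parameters are assumptions.

```python
import random


def leach_threshold(p: float, round_number: int) -> float:
    """Election threshold for nodes that have not served as head in the current cycle."""
    cycle_length = round(1.0 / p)  # number of rounds in one rotation cycle
    return p / (1.0 - p * (round_number % cycle_length))


def elect_cluster_heads(node_ids, p, round_number, recent_heads, rng):
    """Each eligible node draws a random number and becomes head if it is below T.

    `recent_heads` holds nodes that already served in this cycle and are excluded.
    """
    threshold = leach_threshold(p, round_number)
    return [
        node
        for node in node_ids
        if node not in recent_heads and rng.random() < threshold
    ]


if __name__ == "__main__":
    rng = random.Random(42)  # fixed seed so the demo is repeatable
    heads = elect_cluster_heads(range(100), p=0.05, round_number=0,
                                recent_heads=set(), rng=rng)
    print(f"round 0: {len(heads)} cluster heads elected out of 100 nodes")
```

The threshold rises toward 1 as a cycle progresses, so every node that has not yet served is eventually elected; this rotation is how LEACH spreads the energy cost of being a cluster head across the network.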

ABox Realization Reasoning in Distributed In-Memory System (분산 메모리 환경에서의 ABox 실체화 추론)

  • Lee, Wan-Gon;Park, Young-Tack
    • Journal of KIISE, v.42 no.7, pp.852-859, 2015
  • As the amount of knowledge information increases significantly, much progress has been made in studies on how to reason effectively over large-scale ontologies at the level of RDFS or OWL. These reasoning methods are divided into TBox classification and ABox realization. TBox classification mainly deals with integrity and dependencies in the schema, whereas ABox realization mainly handles a variety of issues in instances. Therefore, ABox realization is very important in practical applications. In this paper, we propose a realization method that analyzes the constraints of a specified class, so that the reasoning system automatically infers the classes to which instances belong. Unlike conventional methods that rely on a distributed file system based on an object-oriented language, we propose a large-scale ontology reasoning method using Spark, a functional programming-based in-memory system. To verify the effectiveness of the proposed method, we used instances created from the W3C Wine ontology (120 to 600 million triples). In our largest experiment, the proposed system processed 600 million triples and generated 951 million triples in 51 minutes (696 K triples/sec).

Factors Affecting Public Non-compliance With Large-scale Social Restrictions to Control COVID-19 Transmission in Greater Jakarta, Indonesia

  • Rosha, Bunga Christitha;Suryaputri, Indri Yunita;Irawan, Irlina Raswanti;Arfines, Prisca Petty;Triwinarto, Agus
    • Journal of Preventive Medicine and Public Health, v.54 no.4, pp.221-229, 2021
  • Objectives: The Indonesian government issued large-scale social restrictions (called Pembatasan Sosial Berskala Besar, or PSBB) at the beginning of the coronavirus disease 2019 (COVID-19) pandemic to control the spread of COVID-19 in Jakarta, Bogor, Depok, Tangerang, and Bekasi (Greater Jakarta). Public compliance poses a challenge when implementing large-scale social restrictions, and various factors have contributed to public non-compliance with the regulation. This study aimed to determine the degree of non-compliance and identify the factors that contributed to public non-compliance with the PSBB in Greater Jakarta, Indonesia. Methods: This was a quantitative study with a cross-sectional design. A total of 839 residents of Greater Jakarta participated in this study. Data were collected online using a Google Form, and convenience sampling was undertaken. Univariate and multivariate analyses were performed to explore the relationships between public non-compliance with the PSBB regulation and socio-demographic variables, respondents' opinion of the PSBB, and social capital. Results: A total of 22.6% of subjects reported participating in activities that did not comply with the PSBB. The variables that most affected non-compliance with the PSBB were age, gender, income, opinion of the PSBB, and social capital. Conclusions: Strengthening social capital and providing information about COVID-19 prevention measures, such as washing one's hands with soap, wearing masks properly, and maintaining social distancing, are essential. Robust public understanding will foster trust and cooperation with regard to COVID-19 prevention efforts and provide a basis for mutual agreement regarding rules/penalties.

Development of Large-scale Tool Dynamometer for Measuring Three-axis Individual Force (3축 분력 측정이 가능한 대형 공구동력계 개발)

  • Kim, Joong-Seon;Wang, Duck-Hyun
    • Journal of the Korean Society of Manufacturing Process Engineers, v.18 no.5, pp.29-36, 2019
  • In modern society, in which the fourth industrial revolution has come to the fore and rapid technological innovation is taking place, manufacturers increasingly make and sell small quantities of the various products that consumers want instead of mass-producing a single item. As the market moves toward multi-item, small-quantity production, there is a need for a system in which a machine independently judges and carries out machining and post-processing. For a machine to judge processing on its own, the force applied to a product must be measured. This study aimed to develop a large-scale dynamometer that enables three-axis measurement using octagonal ring load cells. The device uses four previously researched octagonal ring load cells to enable three-axis measurement. It was reconfigured by modifying the attachment positions of the strain gauges on the octagonal ring load cells and the Wheatstone bridge of each axis, and a system was set up so that the measured data can be monitored on a display. The strain rate of the configured device was obtained experimentally and compared with the theoretical strain rate to find a correction value. The correction value was entered into the formula to derive a modified formula, which was then entered into the device to complete the large-scale dynamometer.