• Title/Summary/Keyword: Real-time analysis system

Latent topics-based product reputation mining (잠재 토픽 기반의 제품 평판 마이닝)

  • Park, Sang-Min;On, Byung-Won
    • Journal of Intelligence and Information Systems / v.23 no.2 / pp.39-70 / 2017
  • Data-driven analytics techniques have recently been applied to public surveys. Instead of simply gathering survey results or expert opinions to research the preference for a recently launched product, enterprises need a way to collect and analyze various types of online data and then accurately figure out customer preferences. In existing data-based survey methods, the sentiment lexicon for a particular domain is first constructed by domain experts, who usually judge the positive, neutral, or negative meanings of the frequently used words in the collected text documents. To research the preference for a particular product, the existing approach (1) collects review posts related to the product from several product review web sites; (2) extracts sentences (or phrases) from the collection after pre-processing steps such as stemming and stop-word removal; (3) classifies the polarity (either positive or negative) of each sentence (or phrase) based on the sentiment lexicon; and (4) estimates the positive and negative ratios of the product by dividing the total numbers of positive and negative sentences (or phrases) by the total number of sentences (or phrases) in the collection. Furthermore, the existing approach automatically finds important sentences (or phrases) that carry positive or negative meaning toward the product. As a motivating example, given a product like the Sonata made by Hyundai Motors, customers often want to see a summary note listing the positive points in the 'car design' aspect as well as the negative points in the same aspect. They also want more information regarding other aspects such as 'car quality', 'car performance', and 'car service.' Such information will enable customers to make a good choice when they purchase brand-new vehicles. 
In addition, automobile makers will be able to figure out the preference and the positive/negative points for new models on the market. In the near future, the weak points of the models can be improved based on the sentiment analysis. For this, the existing approach computes the sentiment score of each sentence (or phrase) and then selects the top-k sentences (or phrases) with the highest positive and negative scores. However, the existing approach has several shortcomings that limit its use in real applications. Its main disadvantages are as follows: (1) The main aspects (e.g., car design, quality, performance, and service) of a product (e.g., Hyundai Sonata) are not considered. Without aspects, the sentiment analysis reports to customers and car makers only a summary note containing the positive and negative ratios of the product and the top-k sentences (or phrases) with the highest sentiment scores in the entire corpus. This is not enough; the main aspects of the target product need to be considered in the sentiment analysis. (2) In general, since the same word has different meanings across domains, a sentiment lexicon suited to each domain needs to be constructed. An efficient way to construct the sentiment lexicon per domain is required because lexicon construction is labor intensive and time consuming. To address these problems, in this article we propose a novel product reputation mining algorithm that (1) extracts topics hidden in review documents written by customers; (2) mines main aspects based on the extracted topics; (3) measures the positive and negative ratios of the product using the aspects; and (4) presents a digest in which a few important sentences with positive and negative meanings are listed for each aspect. Unlike the existing approach, using hidden topics lets experts construct the sentiment lexicon easily and quickly. 
Furthermore, by reinforcing topic semantics, we can improve the accuracy of the product reputation mining algorithm beyond that of the existing approach. In the experiments, we collected a large set of review documents on domestic vehicles such as the K5, SM5, and Avante; measured the positive and negative ratios of the three cars; showed top-k positive and negative summaries per aspect; and conducted statistical analysis. Our experimental results clearly show the effectiveness of the proposed method compared with the existing method.
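
Steps (3) and (4) of the existing lexicon-based approach described in the abstract above can be illustrated with a minimal sketch. The lexicon, the word scores, the tokenization, and the function names below are all hypothetical and are not taken from the paper; a real lexicon would be built per domain by experts.

```python
from collections import Counter

# Hypothetical toy lexicon (assumption): +1 for positive words, -1 for negative words.
LEXICON = {
    "sleek": 1, "comfortable": 1, "quiet": 1,
    "noisy": -1, "cramped": -1, "expensive": -1,
}


def sentence_polarity(sentence, lexicon=LEXICON):
    """Classify one sentence as 'pos', 'neg', or 'neu' by summing word scores."""
    words = sentence.lower().replace(".", " ").replace(",", " ").split()
    score = sum(lexicon.get(word, 0) for word in words)
    if score > 0:
        return "pos"
    if score < 0:
        return "neg"
    return "neu"


def polarity_ratios(sentences, lexicon=LEXICON):
    """Return (positive ratio, negative ratio) over all sentences in the collection."""
    total = len(sentences)
    if total == 0:
        return 0.0, 0.0
    counts = Counter(sentence_polarity(s, lexicon) for s in sentences)
    return counts["pos"] / total, counts["neg"] / total


reviews = [
    "The design is sleek.",
    "The cabin is cramped and noisy.",
    "Seats are comfortable, ride is quiet.",
    "Delivered on Tuesday.",
]
pos_ratio, neg_ratio = polarity_ratios(reviews)
print(pos_ratio, neg_ratio)
```

Because the ratios are computed over the whole collection, this sketch shares the limitation the abstract points out: it says nothing about which aspect of the product each sentence refers to.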

Analysis on the Positional Accuracy of the Non-orthogonal Two-pair kV Imaging Systems for Real-time Tumor Tracking Using XCAT (XCAT를 이용한 실시간 종양 위치 추적을 위한 비직교 스테레오 엑스선 영상시스템에서의 위치 추정 정확도 분석에 관한 연구)

  • Jeong, Hanseong;Kim, Youngju;Oh, Ohsung;Lee, Seho;Jeon, Hosang;Lee, Seung Wook
    • Progress in Medical Physics / v.26 no.3 / pp.143-152 / 2015
  • In this study, we aim to design the architecture of a kV imaging system for tumor tracking in a dual-head gantry system and to analyze its accuracy by simulation. We established mathematical formulas and algorithms to track the tumor position with the two-pair kV imaging systems when they are in non-orthogonal positions. The algorithms were designed in the homogeneous coordinate framework, and the coordinates of the sources and detectors are used to estimate the tumor position. 4D XCAT (4D extended cardiac-torso) software was used in the simulation to identify the influence of the angle between the two-pair kV imaging systems and of the detector resolution on the accuracy of the position estimation. A metal fiducial marker was inserted in a numerical human phantom of XCAT, and kV projections were acquired at various angles and resolutions using the CT projection software of XCAT. As a result, a positional error of less than about 1 mm was achieved when the detector resolution was finer than 1.5 mm/pixel and the angle between the kV imaging systems was approximately between $50^{\circ}$ and $90^{\circ}$. When the resolution was coarser than 1.5 mm/pixel, the positional errors exceeded 1 mm and the error fluctuated more with angle. The detector resolution was critical to the positional accuracy of tumor tracking and determines the acceptable range of angles between the kV imaging systems. We also found that the positional accuracy analysis method using XCAT developed in this study is highly useful and will be an invaluable tool for further refined design of kV imaging systems for tumor tracking.
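
The geometric idea behind locating a marker from two non-orthogonal views can be sketched as follows. This is not the paper's homogeneous-coordinate algorithm, which the abstract does not spell out; it is a simpler, standard ray-intersection sketch in which each view contributes a ray from the X-ray source through the marker's image on the detector, and the estimate is the midpoint of the shortest segment between the two rays. The source positions, the 60-degree separation, and the `project` helper are illustrative assumptions.

```python
import math


def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def triangulate(src1, det1, src2, det2):
    """Midpoint of closest approach between ray src1->det1 and ray src2->det2."""
    u = sub(det1, src1)
    v = sub(det2, src2)
    w0 = sub(src1, src2)
    a, b, c = dot(u, u), dot(u, v), dot(v, v)
    d, e = dot(u, w0), dot(v, w0)
    denom = a * c - b * b
    if abs(denom) < 1e-12 * a * c:
        raise ValueError("rays are (nearly) parallel; position is not determined")
    s = (b * e - c * d) / denom  # parameter along ray 1
    t = (a * e - b * d) / denom  # parameter along ray 2
    p1 = tuple(p + s * q for p, q in zip(src1, u))
    p2 = tuple(p + t * q for p, q in zip(src2, v))
    return tuple((x + y) / 2 for x, y in zip(p1, p2))


def project(src, point, magnification=1.5):
    """Hypothetical ideal projection: detector hit lies on the source-marker line."""
    return tuple(s + magnification * (p - s) for s, p in zip(src, point))


marker = (10.0, -5.0, 3.0)   # assumed true marker position, in mm
angle = math.radians(60.0)   # assumed angle between the two imaging systems
src1 = (1000.0, 0.0, 0.0)
src2 = (1000.0 * math.cos(angle), 1000.0 * math.sin(angle), 0.0)
estimate = triangulate(src1, project(src1, marker), src2, project(src2, marker))
print(estimate)
```

With ideal, noise-free projections the two rays intersect and the estimate recovers the marker. In practice the detector hit is quantized to a pixel, and as the angle between the systems shrinks the denominator `a * c - b * b` approaches zero, which is one way to see why both resolution and angle matter in the reported results.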

The Stock Portfolio Recommendation System based on the Correlation between the Stock Message Boards and the Stock Market (인터넷 주식 토론방 게시물과 주식시장의 상관관계 분석을 통한 투자 종목 선정 시스템)

  • Lee, Yun-Jung;Kim, Gun-Woo;Woo, Gyun
    • KIPS Transactions on Software and Data Engineering / v.3 no.10 / pp.441-450 / 2014
  • The stock market is constantly changing, and stock prices sometimes plummet or surge unaccountably. The stock market is therefore recognized as a complex system, and changes in stock prices are regarded as unpredictable. Recently, many researchers have tried to understand the stock market as a network among individual stocks and to find clues about stock price changes in big data created in real time on the Internet. We focus on the correlation between stock prices and human interactions on the Internet, especially on stock message boards. To uncover this correlation, we collected and investigated the articles concerning 57 target companies that are members of the KOSPI200. From the analysis, we found that there is no significant correlation between the stock prices and the article volume, but that the strength of the correlation between the article volume and the stock prices is related to the stock return. We propose a new method for recommending a stock portfolio based on the result of our analysis. According to a simulated investment test using article data from the stock message boards of the 'Daum' portal site, the return of our portfolio is about 1.55% per month, which is about 0.72% and 1.21% higher than those of Markowitz's efficient portfolio and the KOSPI average, respectively. In the case using data from the 'Naver' portal site, the return of our proposed portfolio is about 0.90%, which is 0.35%, 0.40%, and 0.58% higher than those of our previous portfolio, Markowitz's efficient portfolio, and the KOSPI average, respectively. This study shows that collective human behavior on Internet stock message boards can be very helpful for understanding the stock market, and that the correlation between the stock price and collective human behavior can be used to invest in stocks.
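
The abstract above relies on the strength of the correlation between article volume and price for each stock. A minimal sketch of that measurement, assuming a plain Pearson coefficient and ranking by its absolute value, is shown below. The tickers, the data, and the ranking rule are hypothetical; the paper's actual portfolio selection procedure is not described in the abstract.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_x * var_y)


def rank_by_correlation_strength(series):
    """series maps ticker -> (article volumes, prices); strongest |r| comes first."""
    scored = [(name, pearson(vol, price)) for name, (vol, price) in series.items()]
    return sorted(scored, key=lambda item: abs(item[1]), reverse=True)


# Hypothetical daily article counts and closing prices for two made-up tickers.
toy = {
    "AAA": ([10, 20, 30, 40], [100, 110, 120, 130]),
    "BBB": ([10, 20, 30, 40], [100, 90, 105, 95]),
}
ranking = rank_by_correlation_strength(toy)
for name, r in ranking:
    print(name, round(r, 3))
```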

Ensemble Learning with Support Vector Machines for Bond Rating (회사채 신용등급 예측을 위한 SVM 앙상블학습)

  • Kim, Myoung-Jong
    • Journal of Intelligence and Information Systems / v.18 no.2 / pp.29-45 / 2012
  • Bond rating is regarded as an important event for measuring the financial risk of companies and for determining the investment returns of investors. As a result, predicting companies' credit ratings by applying statistical and machine learning techniques has been a popular research topic. Statistical techniques, including multiple regression, multiple discriminant analysis (MDA), logistic models (LOGIT), and probit analysis, have traditionally been used in bond rating. However, one major drawback is that they rely on strict assumptions. Such assumptions include linearity, normality, independence among predictor variables, and pre-existing functional forms relating the criterion variables and the predictor variables. These strict assumptions have limited the application of traditional statistics to the real world. Machine learning techniques used in bond rating prediction models include decision trees (DT), neural networks (NN), and the Support Vector Machine (SVM). In particular, SVM is recognized as a new and promising classification and regression analysis method. SVM learns a separating hyperplane that maximizes the margin between two categories. SVM is simple enough to be analyzed mathematically and leads to high performance in practical applications. SVM implements the structural risk minimization principle and seeks to minimize an upper bound of the generalization error. In addition, the solution of SVM may be a global optimum, and thus overfitting is unlikely to occur with SVM. SVM also does not require many data samples for training, since it builds prediction models using only some representative samples near the boundaries, called support vectors. A number of experimental studies have indicated that SVM has been successfully applied in a variety of pattern recognition fields. However, there are three major drawbacks that can degrade SVM's performance. 
First, SVM was originally proposed for solving binary-class classification problems. Methods for combining SVMs for multi-class classification, such as One-Against-One and One-Against-All, have been proposed, but they do not improve performance in multi-class classification problems as much as SVM does for binary-class classification. Second, approximation algorithms (e.g., decomposition methods and the sequential minimal optimization algorithm) can be used to reduce computation time in multi-class problems, but they can deteriorate classification performance. Third, a difficulty in multi-class prediction problems lies in the data imbalance problem, which occurs when the number of instances in one class greatly outnumbers the number of instances in another class. Such data sets often cause a default classifier to be built because of a skewed boundary, which reduces the classification accuracy of the classifier. SVM ensemble learning is one of the machine learning methods for coping with the above drawbacks. Ensemble learning is a method for improving the performance of classification and prediction algorithms. AdaBoost is one of the widely used ensemble learning techniques. It constructs a composite classifier by sequentially training classifiers while increasing the weight on misclassified observations through iterations. The observations that are incorrectly predicted by previous classifiers are chosen more often than those that are correctly predicted. Thus boosting attempts to produce new classifiers that are better able to predict the examples for which the current ensemble's performance is poor. In this way, it can reinforce the training of the misclassified observations of the minority class. This paper proposes a multiclass Geometric Mean-based Boosting (MGM-Boost) to resolve the multiclass prediction problem. 
Since MGM-Boost introduces the notion of the geometric mean into AdaBoost, its learning process can take into account the geometric mean-based accuracy and errors across multiple classes. This study applies MGM-Boost to a real-world bond rating case for Korean companies to examine the feasibility of MGM-Boost. 10-fold cross-validation is performed three times with different random seeds to ensure that the comparison among the three classifiers does not happen by chance. For each 10-fold cross-validation, the entire data set is first partitioned into ten equal-sized sets, and then each set is in turn used as the test set while the classifier trains on the other nine sets. That is, the cross-validated folds have been tested independently for each algorithm. Through these steps, we obtained results for the classifiers on each of the 30 experiments. In the comparison of arithmetic mean-based prediction accuracy between individual classifiers, MGM-Boost (52.95%) shows higher prediction accuracy than both AdaBoost (51.69%) and SVM (49.47%). MGM-Boost (28.12%) also shows higher prediction accuracy than AdaBoost (24.65%) and SVM (15.42%) in terms of geometric mean-based prediction accuracy. A t-test is used to examine whether the performance of the classifiers over the 30 folds is significantly different. The results indicate that the performance of MGM-Boost is significantly different from that of the AdaBoost and SVM classifiers at the 1% level. These results mean that MGM-Boost can provide robust and stable solutions to multi-class problems such as bond rating.
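
The gap between the arithmetic and geometric accuracy figures reported above is easier to see with a small sketch. It assumes that both measures are taken over per-class accuracies (recalls), which is a common definition but is not confirmed by the abstract; the labels and predictions are made up.

```python
import math
from collections import defaultdict


def per_class_recall(y_true, y_pred):
    """Fraction of correctly predicted instances for each true class."""
    total = defaultdict(int)
    hits = defaultdict(int)
    for truth, pred in zip(y_true, y_pred):
        total[truth] += 1
        if truth == pred:
            hits[truth] += 1
    return {label: hits[label] / total[label] for label in total}


def arithmetic_mean_accuracy(y_true, y_pred):
    recalls = list(per_class_recall(y_true, y_pred).values())
    return sum(recalls) / len(recalls)


def geometric_mean_accuracy(y_true, y_pred):
    recalls = list(per_class_recall(y_true, y_pred).values())
    return math.prod(recalls) ** (1.0 / len(recalls))


y_true = ["A"] * 4 + ["B"] * 4
y_partial = ["A"] * 4 + ["B", "A", "A", "A"]   # class B is mostly missed
y_majority = ["A"] * 8                          # class B is never predicted

print(arithmetic_mean_accuracy(y_true, y_partial), geometric_mean_accuracy(y_true, y_partial))
print(arithmetic_mean_accuracy(y_true, y_majority), geometric_mean_accuracy(y_true, y_majority))
```

A classifier that ignores one class entirely still gets a nonzero arithmetic mean, while its geometric mean collapses to zero. This is why a geometric mean-based criterion penalizes poor minority-class performance more heavily on imbalanced data.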

Object Tracking Based on Exactly Reweighted Online Total-Error-Rate Minimization (정확히 재가중되는 온라인 전체 에러율 최소화 기반의 객체 추적)

  • JANG, Se-In;PARK, Choong-Shik
    • Journal of Intelligence and Information Systems / v.25 no.4 / pp.53-65 / 2019
  • Object tracking is one of the important steps in building video-based surveillance systems and is considered an essential task alongside object detection and recognition. To perform object tracking, various machine learning methods (e.g., least squares, the perceptron, and the support vector machine) can be applied in different designs of tracking systems. In general, generative methods (e.g., principal component analysis) have been used because of their simplicity and effectiveness. However, generative methods focus only on modeling the target object. Because of this limitation, discriminative methods (e.g., binary classification) were adopted to distinguish the target object from the background. Among the machine learning methods for binary classification, total error rate minimization has been one of the successful ones. Total error rate minimization can achieve a global minimum thanks to a quadratic approximation to a step function, while other methods (e.g., the support vector machine) seek local minima using nonlinear functions (e.g., the hinge loss function). Because of this quadratic approximation, total error rate minimization has favorable properties for solving optimization problems in binary classification. However, total error rate minimization was based on a batch mode setting, which restricts it to applications under offline learning. With limited computing resources, offline learning cannot handle large-scale data sets. In contrast, online learning can update its solution without storing all training samples during the learning process. As large-scale data sets grow, online learning becomes an essential property for various applications. 
Since object tracking needs to handle data samples in real time, online learning based total error rate minimization methods are necessary to address object tracking problems efficiently. To meet this need, an online learning based total error rate minimization method was developed previously. However, that method relies on an approximately reweighted technique. Despite the approximation, this online version of total error rate minimization achieved good performance in biometric applications. However, the method assumes that total error rate minimization is achieved asymptotically, only when the number of training samples is infinite. Even under this assumption, the approximation can continuously accumulate learning errors as the number of training samples increases. For this reason, the approximated online learning solution can lead to a wrong solution, which can cause significant errors when applied to surveillance systems. In this paper, we propose an exactly reweighted technique to recursively update the solution of total error rate minimization in an online learning manner. In contrast to the approximately reweighted online total error rate minimization, an exactly reweighted online total error rate minimization is achieved. The proposed exact online learning method based on total error rate minimization is then applied to object tracking problems. Our object tracking system adopts particle filtering, in which the observation model consists of both generative and discriminative methods to leverage the advantages of both. In our experiments, the proposed object tracking system achieves promising performance on 8 public video sequences compared with competing object tracking systems. A paired t-test is also reported to evaluate the quality of the results. 
Our proposed online learning method can be extended to deep learning architectures, covering both shallow and deep networks. Moreover, online learning methods that need an exact reweighting process can use our proposed reweighting technique. In addition to object tracking, the proposed online learning method can be easily applied to object detection and recognition. Therefore, our proposed methods can contribute to the online learning community and to the object tracking, detection, and recognition communities.
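
The abstract does not give the update equations of the exactly reweighted method, so the sketch below does not implement it. It shows ordinary recursive least squares, a different and standard technique, only to illustrate what it means to update a quadratic-loss solution one sample at a time without storing earlier samples. The function names, the initial scale `delta`, and the toy data are assumptions.

```python
def rls_init(dim, delta=1e6):
    """Start with zero weights and a large inverse-covariance estimate P = delta * I."""
    w = [0.0] * dim
    P = [[delta if i == j else 0.0 for j in range(dim)] for i in range(dim)]
    return w, P


def rls_update(w, P, x, y):
    """One recursive least-squares step for a new sample (x, y)."""
    dim = len(w)
    Px = [sum(P[i][j] * x[j] for j in range(dim)) for i in range(dim)]
    denom = 1.0 + sum(x[i] * Px[i] for i in range(dim))
    gain = [value / denom for value in Px]
    error = y - sum(w[i] * x[i] for i in range(dim))
    w_new = [w[i] + gain[i] * error for i in range(dim)]
    # P stays symmetric, so the rank-one correction can reuse Px for x^T P.
    P_new = [[P[i][j] - gain[i] * Px[j] for j in range(dim)] for i in range(dim)]
    return w_new, P_new


# Toy stream generated from y = 2*x + 1; each feature vector is [x, 1.0] (bias term).
w, P = rls_init(2)
for k in range(10):
    x_value = float(k)
    w, P = rls_update(w, P, [x_value, 1.0], 2.0 * x_value + 1.0)
print(w)
```

Each step costs a fixed amount of work in the feature dimension and needs only the current weights and matrix, which is the property that makes online updates attractive for real-time tracking.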

Intercomparison of Daegwallyeong Cloud Physics Observation System (CPOS) Products and the Visibility Calculation by the FSSP Size Distribution during 2006-2008 (대관령 구름물리관측시스템 산출물 평가 및 FSSP를 이용한 시정환산 시험연구)

  • Yang, Ha-Young;Jeong, Jin-Yim;Chang, Ki-Ho;Cha, Joo-Wan;Jung, Jae-Won;Kim, Yoo-Chul;Lee, Myoung-Joo;Bae, Jin-Young;Kang, Sun-Young;Kim, Kum-Lan;Choi, Young-Jean;Choi, Chee-Young
    • Korean Journal of Remote Sensing / v.26 no.2 / pp.65-73 / 2010
  • To observe and analyze the characteristics of cloud and precipitation properties, the Cloud Physics Observation System (CPOS) has been operated since December 2003 at Daegwallyeong ($37.4^{\circ}N$, $128.4^{\circ}E$, 842 m) in the Taebaek Mountains. The major instruments of CPOS are as follows: Forward Scattering Spectrometer Probe (FSSP), Optical Particle Counter (OPC), Visibility Sensor (VS), PARSIVEL disdrometer, Microwave Radiometer (MWR), and Micro Rain Radar (MRR). The first four instruments (FSSP, OPC, visibility sensor, and PARSIVEL) are for the observation and analysis of the characteristics of ground cloud (fog) and precipitation, and the others are for the vertical cloud characteristics in real time (http://weamod.metri.re.kr). To verify the CPOS products, comparisons between instrument products were conducted: the qualitative size distributions of FSSP and OPC during the hygroscopic seeding experiments, the precipitable water vapor of MWR and radiosonde, and the rainfall rates of PARSIVEL (or MRR) and a rain gauge. Most comparisons show good agreement, with correlation coefficients above 0.7. These reliable CPOS products will be useful for cloud-related studies such as the cloud-aerosol indirect effect or cloud seeding. The visibility value is derived from the droplet size distribution of FSSP. The derived FSSP visibility shows a consistent overestimation by a factor of 1.7 to 1.9 compared with the values of two visibility sensors (SVS (Sentry Visibility Sensor) and PWD22 (Present Weather Detect 22)). We believe this bias comes from the limited droplet size range ($2{\sim}47\;{\mu}m$) measured by FSSP. Further studies are needed after introducing new instruments with other ranges.
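
The abstract does not state how visibility was computed from the FSSP size distribution. One common way, sketched here under stated assumptions, is to sum the extinction contributed by each size bin and convert it to a visual range with a Koschmieder-type relation. The extinction efficiency of 2 (a large-droplet approximation), the 5% contrast threshold, the function names, and the bin values are all assumptions and are not taken from the paper.

```python
import math


def extinction_coefficient(bins, q_ext=2.0):
    """Extinction in 1/m from bins of (diameter in micrometers, number per cm^3)."""
    beta = 0.0
    for diameter_um, conc_per_cm3 in bins:
        radius_m = 0.5 * diameter_um * 1e-6
        conc_per_m3 = conc_per_cm3 * 1e6
        beta += q_ext * math.pi * radius_m ** 2 * conc_per_m3
    return beta


def visibility_m(bins, contrast_threshold=0.05, q_ext=2.0):
    """Visual range in meters: -ln(threshold) / extinction."""
    beta = extinction_coefficient(bins, q_ext)
    if beta <= 0:
        raise ValueError("no extinction: visibility is unbounded")
    return -math.log(contrast_threshold) / beta


# Hypothetical fog spectrum (assumption): (diameter in um, concentration per cm^3).
fog = [(5.0, 50.0), (10.0, 100.0), (20.0, 30.0)]
full_range = visibility_m(fog)
truncated = visibility_m(fog[:2])  # pretend the largest droplets fall outside the probe range
print(round(full_range, 1), round(truncated, 1))
```

Dropping any size bin removes extinction and therefore raises the computed visibility, which matches the direction of the bias that the authors attribute to the limited FSSP size range.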

Using Viable Eggs to Determine Oviposition Models and Life Table Analysis of Riptortus pedestris (Fabricius) (Hemiptera: Alydidae) (톱다리개미허리노린재의 수정란을 이용한 산란모형과 생명표분석)

  • Ahn, Jeong Joon;Choi, Kyoung San;Koh, Sang Wook
    • Korean journal of applied entomology / v.58 no.2 / pp.111-120 / 2019
  • Riptortus pedestris (Fabricius) (Hemiptera: Alydidae) is an economically important insect pest of soybean and fruit trees. We investigated the effects of temperature on adult fecundity and longevity, and determined the parameters of oviposition models and the life table at constant temperatures of 15.8, 19.7, 24.0, 27.8, 32.6, 34.0, and $35.5^{\circ}C$. R. pedestris females reproduced successfully from 19.7 to $35.5^{\circ}C$ but not at $15.8^{\circ}C$. The longevity of R. pedestris was longest at $15.8^{\circ}C$ and decreased with increasing temperature (76.6 days at $19.7^{\circ}C$ and 20.6 days at $35.5^{\circ}C$). The numbers of total eggs and viable eggs were highest at $24.0^{\circ}C$ (193.5 and 151.2, respectively). Egg hatchability was highest at $27.8^{\circ}C$ (84.0%). We compared the results of oviposition models and life table parameters using both total eggs and viable eggs. The parameter value (c: the maximum reproductive capacity) (190 eggs) of the temperature-dependent total fecundity model using total eggs was higher than that of the model using viable eggs. When we analyzed the life table parameters, the values of net reproductive rate and mean generation time using viable eggs were lower than those using total eggs. The oviposition models and life table analysis using viable eggs will help in understanding the real population transition of R. pedestris in agricultural systems.
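
The two life table parameters compared in the abstract have standard textbook definitions: the net reproductive rate is the sum over ages of survivorship times fecundity, and the mean generation time is the fecundity-weighted mean age of reproduction. The sketch below uses these definitions with made-up numbers; the ages, survivorship, egg counts, and age-dependent hatch fractions are hypothetical and are not data from the paper.

```python
def net_reproductive_rate(survivorship, fecundity):
    """R0 = sum over age classes of l_x * m_x."""
    return sum(l * m for l, m in zip(survivorship, fecundity))


def mean_generation_time(ages, survivorship, fecundity):
    """T = sum(x * l_x * m_x) / R0."""
    r0 = net_reproductive_rate(survivorship, fecundity)
    if r0 <= 0:
        raise ValueError("no reproduction: generation time is undefined")
    weighted = sum(x * l * m for x, l, m in zip(ages, survivorship, fecundity))
    return weighted / r0


# Hypothetical cohort (assumption): age in days, fraction surviving, eggs per female.
ages = [10, 20, 30]
survivorship = [1.0, 0.8, 0.5]
total_eggs = [20.0, 40.0, 30.0]
hatch_fraction = [0.9, 0.8, 0.6]   # assumed to decline with female age
viable_eggs = [m * h for m, h in zip(total_eggs, hatch_fraction)]

r0_total = net_reproductive_rate(survivorship, total_eggs)
r0_viable = net_reproductive_rate(survivorship, viable_eggs)
t_total = mean_generation_time(ages, survivorship, total_eggs)
t_viable = mean_generation_time(ages, survivorship, viable_eggs)
print(r0_total, r0_viable, t_total, t_viable)
```

In this toy cohort, counting only viable eggs lowers the net reproductive rate, and because hatchability is assumed to fall with age it also shifts reproduction toward younger ages and shortens the mean generation time. If hatchability were the same at every age, only the net reproductive rate would change.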

A Study on Sea Surface Temperature Changes in South Sea (Tongyeong coast), South Korea, Following the Passage of Typhoon KHANUN in 2023 (2023년 태풍 카눈 통과에 따른 한국 남해 통영해역 수온 변동 연구)

  • Jae-Dong Hwang;Ji-Suk Ahn;Ju-Yeon Kim;Hui-Tae Joo;Byung-Hwa Min;Ki-Ho Nam;Si-Woo Lee
    • Journal of the Korean Society of Marine Environment & Safety / v.30 no.1 / pp.13-19 / 2024
  • An analysis of the coastal water temperature in the Tongyeong waters, in the eastern part of the South Sea of Korea, revealed that the water temperature rose sharply before the typhoon made landfall. The water temperature rise occurred throughout the entire water column. An analysis of the sea surface temperature data observed by NOAA (National Oceanic and Atmospheric Administration) satellites indicated that seawater with a temperature of 30℃ existed in the eastern waters of the eastern South Sea of Korea before the typhoon landed. The southeastern sea of Korea is an area where ocean currents prevail from west to east owing to the Tsushima Warm Current. However, an analysis of the satellite data showed that seawater at 30℃ moved from east to west, indicating that it was affected by the Ekman transport caused by the typhoon before landing. In addition, because the eastern waters of the South Sea are not as deep as those of the East Sea, the water temperature of the entire water layer may remain constant owing to vertical mixing caused by the wind. Because the rise in water temperature in each water layer occurred on the same day, the rise in the bottom water temperature can be attributed to vertical mixing. Indeed, the southeastern sea of Korea is a sea area where the water temperature can rise rapidly depending on the direction of approach of the typhoon and the location where high-temperature water forms.

Returns and Resale Price Maintenance in Book Distribution (도서유통(圖書流通) 효율화(效率化)를 위한 공정거래정책(公正去來政策))

  • Shin, Kwang-shik
    • KDI Journal of Economic Policy / v.13 no.2 / pp.141-161 / 1991
  • Resale price maintenance has long been employed in book distribution, perhaps longer than for any other product. Another unusual practice in the book trade that has proven to be quite durable in spite of its substantial cost in real resources is the returns policy. Publishers typically grant the right to return unsold books within a stipulated time for full credit against future orders. This paper investigates the functions and effects of resale price maintenance in the book trade, and argues that resale price maintenance and returns are substitute methods of providing the same economic function. Resale price maintenance can be used to compensate booksellers for initially stocking books with uncertain prospects and for providing a conduit through which manufacturers acquire information about consumer demand (market testing services). Permitting the return of unsold books for full credit places a floor under retail prices and transfers a considerable portion of the cost of introducing a new product line back to the publisher. Both reflect publishers' needs to have their books displayed. In the U.S., return privileges were first proposed in 1913, roughly coincident with the Macy decision outlawing RPM. Publishers slowly granted return privileges, which became nearly universal by 1970. The decline in margins in recent years has been accompanied by an increase in returns, as the return policy served to substitute for lost margins on successful titles as a method of compensating full-line booksellers. In contrast, return privileges are unusual in countries where price maintenance in books has been practiced. These observations are consistent with our analysis. In Korea, resale price maintenance of books is practiced under an exception to Korean antitrust law. The availability of effective price maintenance is likely to reduce the use of returns programs. 
Since consumers prefer to obtain books at outlets where they know the books are likely to be stocked rather than taking a chance on stores that carry a more limited line, it also provides a strong incentive for booksellers to expand. But the privilege of resale price maintenance should be confined to books that publishers want to be price maintained. Resale price maintenance and the returns system differ in the transaction costs associated with inventory holding, and publishers' judgement on the comparative advantage of the two schemes should be honored. Publishers should also remain free to authorize discount sales at any time, so as not to impair the ability of booksellers to dispose of product variants that prove unpopular.

Expressional Analysis of Superoxide Dismutase in Olive Flounder (Paralichthys olivaceus) against Viral Hemorrhagic Septicemia Virus Infection (Viral hemorrhagic septicemia virus (VHSV) 감염에 대한 넙치 superoxide dismutase(Of-SOD)의 발현분석)

  • Lee, Young Mee;Kim, Jung-Eun;Noh, Jae Koo;Kim, Hyun Chul;Park, Choul-Ji;Park, Jong-Won;Kim, Kyung-Kil;Lee, Jeong-Ho
    • Journal of Life Science / v.24 no.12 / pp.1371-1377 / 2014
  • Superoxide dismutases are a family of important antioxidant metalloenzymes that catalyze the dismutation of toxic superoxide anions into dioxygen and hydrogen peroxide. A recent study identified a partial superoxide dismutase (SOD) gene in olive flounder (Paralichthys olivaceus). The same study reported that it was strongly induced by benzo[a]pyrene and that it was an indicator of aquatic oxidative stress responses. However, its transcriptional response to viral infection has not been investigated. In the present study, spatial and temporal expression profiles were analyzed to investigate the function of Of-SOD in the antiviral response. Of-SOD transcripts were ubiquitously detected at various levels in diverse tissues by real-time PCR. The expression of Of-SOD was significantly higher in the muscles, liver, and brain but extremely low in the stomach and spleen. Following a VHSV challenge, the expression of Of-SOD increased within 3 h in the kidneys and decreased to the original level 2 days postchallenge. In muscle, liver, and brain, Of-SOD mRNA was similarly up-regulated at 3-6 h postchallenge and then decreased to the basal level. Although the expression pattern and induction time differed slightly depending on the tissue, the Of-SOD transcript consistently increased in the acute infection response, but expression was low in the chronic response. The expression of Of-SOD was induced after VHSV infection, and Of-SOD was probably involved in the immune response against the viral challenge. These results suggest that SOD may play important roles in the immune defense system of P. olivaceus and perhaps contribute to the protective effects against oxidative stress in olive flounder.