• Title/Summary/Keyword: New Algorithm

Search Result 11,736, Processing Time 0.038 seconds

A Collaborative Filtering System Combined with Users' Review Mining : Application to the Recommendation of Smartphone Apps (사용자 리뷰 마이닝을 결합한 협업 필터링 시스템: 스마트폰 앱 추천에의 응용)

  • Jeon, ByeoungKug;Ahn, Hyunchul
    • Journal of Intelligence and Information Systems
    • /
    • v.21 no.2
    • /
    • pp.1-18
    • /
    • 2015
  • Collaborative filtering(CF) algorithm has been popularly used for recommender systems in both academic and practical applications. A general CF system compares users based on how similar they are, and creates recommendation results with the items favored by other people with similar tastes. Thus, it is very important for CF to measure the similarities between users because the recommendation quality depends on it. In most cases, users' explicit numeric ratings of items(i.e. quantitative information) have only been used to calculate the similarities between users in CF. However, several studies indicated that qualitative information such as user's reviews on the items may contribute to measure these similarities more accurately. Considering that a lot of people are likely to share their honest opinion on the items they purchased recently due to the advent of the Web 2.0, user's reviews can be regarded as the informative source for identifying user's preference with accuracy. Under this background, this study proposes a new hybrid recommender system that combines with users' review mining. Our proposed system is based on conventional memory-based CF, but it is designed to use both user's numeric ratings and his/her text reviews on the items when calculating similarities between users. In specific, our system creates not only user-item rating matrix, but also user-item review term matrix. Then, it calculates rating similarity and review similarity from each matrix, and calculates the final user-to-user similarity based on these two similarities(i.e. rating and review similarities). As the methods for calculating review similarity between users, we proposed two alternatives - one is to use the frequency of the commonly used terms, and the other one is to use the sum of the importance weights of the commonly used terms in users' review. In the case of the importance weights of terms, we proposed the use of average TF-IDF(Term Frequency - Inverse Document Frequency) weights. To validate the applicability of the proposed system, we applied it to the implementation of a recommender system for smartphone applications (hereafter, app). At present, over a million apps are offered in each app stores operated by Google and Apple. Due to this information overload, users have difficulty in selecting proper apps that they really want. Furthermore, app store operators like Google and Apple have cumulated huge amount of users' reviews on apps until now. Thus, we chose smartphone app stores as the application domain of our system. In order to collect the experimental data set, we built and operated a Web-based data collection system for about two weeks. As a result, we could obtain 1,246 valid responses(ratings and reviews) from 78 users. The experimental system was implemented using Microsoft Visual Basic for Applications(VBA) and SAS Text Miner. And, to avoid distortion due to human intervention, we did not adopt any refining works by human during the user's review mining process. To examine the effectiveness of the proposed system, we compared its performance to the performance of conventional CF system. The performances of recommender systems were evaluated by using average MAE(mean absolute error). The experimental results showed that our proposed system(MAE = 0.7867 ~ 0.7881) slightly outperformed a conventional CF system(MAE = 0.7939). Also, they showed that the calculation of review similarity between users based on the TF-IDF weights(MAE = 0.7867) leaded to better recommendation accuracy than the calculation based on the frequency of the commonly used terms in reviews(MAE = 0.7881). The results from paired samples t-test presented that our proposed system with review similarity calculation using the frequency of the commonly used terms outperformed conventional CF system with 10% statistical significance level. Our study sheds a light on the application of users' review information for facilitating electronic commerce by recommending proper items to users.

A Study on the Design of Case-based Reasoning Office Knowledge Recommender System for Office Professionals (사례기반추론을 이용한 사무지식 추천시스템)

  • Kim, Myong-Ok;Na, Jung-Ah
    • Journal of Intelligence and Information Systems
    • /
    • v.17 no.3
    • /
    • pp.131-146
    • /
    • 2011
  • It is becoming more essential than ever for office professionals to become competent in information collection/gathering and problem solving in today's global business society. In particular, office professionals do not only assist simple chores but are also forced to make decisions as quickly and efficiently as possible in problematic situations that can end in either profit or loss to their company. Since office professionals rely heavily on their tacit knowledge to solve problems that arise in everyday business situations, it is truly helpful and efficient to refer to similar business cases from the past and share or reuse such previous business knowledge for better performance results. Case-based reasoning(CBR) is a problem-solving method which utilizes previous similar cases to solve problems. Through CBR, the closest case to the current business situation can be searched and retrieved from the case or knowledge base and can be referred to for a new solution. This reduces the time and resources needed and increase success probability. The main purpose of this study is to design a system called COKRS(Case-based reasoning Office Knowledge Recommender System) and develop a prototype for it. COKRS manages cases and their meta data, accepts key words from the user and searches the casebase for the most similar past case to the input keyword, and communicates with users to collect information about the quality of the case provided and continuously apply the information to update values on the similarity table. Core concepts like system architecture, definition of a case, meta database, similarity table have been introduced, and also an algorithm to retrieve all similar cases from past work history has also been proposed. In this research, a case is best defined as a work experience in office administration. However, defining a case in office administration was not an easy task in reality. We surveyed 10 office professionals in order to get an idea of how to define a case in office administration and found out that in most cases any type of office work is to be recorded digitally and/or non-digitally. Therefore, we have defined a record or document case as for COKRS. Similarity table was composed of items of the result of job analysis for office professionals conducted in a previous research. Values between items of the similarity table were initially set to those from researchers' experiences and literature review. The results of this study could also be utilized in other areas of business for knowledge sharing wherever it is necessary and beneficial to share and learn from past experiences. We expect this research to be a reference for researchers and developers who are in this area or interested in office knowledge recommendation system based on CBR. Focus group interview(FGI) was conducted with ten administrative assistants carefully selected from various areas of business. They were given a chance to try out COKRS in an actual work setting and make some suggestions for future improvement. FGI has identified the user-interface for saving and searching cases for keywords as the most positive aspect of COKRS, and has identified the most urgently needed improvement as transforming tacit knowledge and knowhow into recorded documents more efficiently. Also, the focus group has mentioned that it is essential to secure enough support, encouragement, and reward from the company and promote positive attitude and atmosphere for knowledge sharing for everybody's benefit in the company.

Patients Setup Verification Tool for RT (PSVTS) : DRR, Simulation, Portal and Digital images (방사선치료 시 환자자세 검증을 위한 분석용 도구 개발)

  • Lee Suk;Seong Jinsil;Kwon Soo I1;Chu Sung Sil;Lee Chang Geol;Suh Chang Ok
    • Radiation Oncology Journal
    • /
    • v.21 no.1
    • /
    • pp.100-106
    • /
    • 2003
  • Purpose : To develop a patients' setup verification tool (PSVT) to verify the alignment of the machine and the target isocenters, and the reproduclbility of patients' setup for three dimensional conformal radiotherapy (3DCRT) and intensity modulated radiotherapy (IMRT). The utilization of this system is evaluated through phantom and patient case studies. Materials and methods : We developed and clinically tested a new method for patients' setup verification, using digitally reconstructed radiography (DRR), simulation, porial and digital images. The PSVT system was networked to a Pentium PC for the transmission of the acquired images to the PC for analysis. To verify the alignment of the machine and target isocenters, orthogonal pairs of simulation images were used as verification images. Errors in the isocenter alignment were measured by comparing the verification images with DRR of CT Images. Orthogonal films were taken of all the patients once a week. These verification films were compared with the DRR were used for the treatment setup. By performing this procedure every treatment, using humanoid phantom and patient cases, the errors of localization can be analyzed, with adjustments made from the translation. The reproducibility of the patients' setup was verified using portal and digital images. Results : The PSVT system was developed to verify the alignment of the machine and the target isocenters, and the reproducibility of the patients' setup for 3DCRT and IMRT. The results show that the localization errors are 0.8$\pm$0.2 mm (AP) and 1.0$\pm$0.3 mm (Lateral) in the cases relating to the brain and 1.1$\pm$0.5 mm (AP) and 1.0$\pm$0.6 mm (Lateral) in the cases relating to the pelvis. The reproducibility of the patients' setup was verified by visualization, using real-time image acquisition, leading to the practical utilization of our software Conclusions : A PSVT system was developed for the verification of the alignment between machine and the target isocenters, and the reproduclbility of the patients' setup in 3DCRT and IMRT. With adjustment of the completed GUI-based algorithm, and a good quality DRR image, our software may be used for clinical applications.

Development of Gated Myocardial SPECT Analysis Software and Evaluation of Left Ventricular Contraction Function (게이트 심근 SPECT 분석 소프트웨어의 개발과 좌심실 수축 기능 평가)

  • Lee, Byeong-Il;Lee, Dong-Soo;Lee, Jae-Sung;Chung, June-Key;Lee, Myung-Chul;Choi, Heung-Kook
    • The Korean Journal of Nuclear Medicine
    • /
    • v.37 no.2
    • /
    • pp.73-82
    • /
    • 2003
  • Objectives: A new software (Cardiac SPECT Analyzer: CSA) was developed for quantification of volumes and election fraction on gated myocardial SPECT. Volumes and ejection fraction by CSA were validated by comparing with those quantified by Quantitative Gated SPECT (QGS) software. Materials and Methods: Gated myocardial SPECT was peformed in 40 patients with ejection fraction from 15% to 85%. In 26 patients, gated myocardial SPECT was acquired again with the patients in situ. A cylinder model was used to eliminate noise semi-automatically and profile data was extracted using Gaussian fitting after smoothing. The boundary points of endo- and epicardium were found using an iterative learning algorithm. Enddiastolic (EDV) and endsystolic volumes (ESV) and election fraction (EF) were calculated. These values were compared with those calculated by QGS and the same gated SPECT data was repeatedly quantified by CSA and variation of the values on sequential measurements of the same patients on the repeated acquisition. Results: From the 40 patient data, EF, EDV and ESV by CSA were correlated with those by QGS with the correlation coefficients of 0.97, 0.92, 0.96. Two standard deviation (SD) of EF on Bland Altman plot was 10.1%. Repeated measurements of EF, EDV, and ESV by CSA were correlated with each other with the coefficients of 0.96, 0.99, and 0.99 for EF, EDV and ESV respectively. On repeated acquisition, reproducibility was also excellent with correlation coefficients of 0.89, 0.97, 0.98, and coefficient of variation of 8.2%, 5.4mL, 8.5mL and 2SD of 10.6%, 21.2mL, and 16.4mL on Bland Altman plot for EF, EDV and ESV. Conclusion: We developed the software of CSA for quantification of volumes and ejection fraction on gated myocardial SPECT. Volumes and ejection fraction quantified using this software was found valid for its correctness and precision.

Development of Neural Network Based Cycle Length Design Model Minimizing Delay for Traffic Responsive Control (실시간 신호제어를 위한 신경망 적용 지체최소화 주기길이 설계모형 개발)

  • Lee, Jung-Youn;Kim, Jin-Tae;Chang, Myung-Soon
    • Journal of Korean Society of Transportation
    • /
    • v.22 no.3 s.74
    • /
    • pp.145-157
    • /
    • 2004
  • The cycle length design model of the Korean traffic responsive signal control systems is devised to vary a cycle length as a response to changes in traffic demand in real time by utilizing parameters specified by a system operator and such field information as degrees of saturation of through phases. Since no explicit guideline is provided to a system operator, the system tends to include ambiguity in terms of the system optimization. In addition, the cycle lengths produced by the existing model have yet been verified if they are comparable to the ones minimizing delay. This paper presents the studies conducted (1) to find shortcomings embedded in the existing model by comparing the cycle lengths produced by the model against the ones minimizing delay and (2) to propose a new direction to design a cycle length minimizing delay and excluding such operator oriented parameters. It was found from the study that the cycle lengths from the existing model fail to minimize delay and promote intersection operational conditions to be unsatisfied when traffic volume is low, due to the feature of the changed target operational volume-to-capacity ratio embedded in the model. The 64 different neural network based cycle length design models were developed based on simulation data surrogating field data. The CORSIM optimal cycle lengths minimizing delay were found through the COST software developed for the study. COST searches for the CORSIM optimal cycle length minimizing delay with a heuristic searching method, a hybrid genetic algorithm. Among 64 models, the best one producing cycle lengths close enough to the optimal was selected through statistical tests. It was found from the verification test that the best model designs a cycle length as similar pattern to the ones minimizing delay. The cycle lengths from the proposed model are comparable to the ones from TRANSYT-7F.

Development of JPEG2000 Viewer for Mobile Image System (이동형 의료영상 장치를 위한 JPEG2000 영상 뷰어 개발)

  • 김새롬;정해조;강원석;이재훈;이상호;신성범;유선국;김희중
    • Progress in Medical Physics
    • /
    • v.14 no.2
    • /
    • pp.124-130
    • /
    • 2003
  • Currently, as a consequence of PACS (Picture Archiving Communication System) implementation many hospitals are replacing conventional film-type interpretations of diagnostic medical images with new digital-format interpretations that can also be saved, and retrieve However, the big limitation in PACS is considered to be the lack of mobility. The purpose of this study is to determine the optimal communication packet size. This was done by considering the terms occurred in the wireless communication. After encoding medical image using JPGE2000 image compression method, This method embodied auto-error correction technique preventing the loss of packets occurred during wireless communication. A PC class server, with capabilities to load, collect data, save images, and connect with other network, was installed. Image data were compressed using JPEG2000 algorithm which supports the capability of high energy density and compression ratio, to communicate through a wireless network. Image data were also transmitted in block units coeded by JPEG2000 to prevent the loss of the packets in a wireless network. When JPGE2000 image data were decoded in a PUA (Personal Digital Assistant), it was instantaneous for a MR (Magnetic Resonance) head image of 256${\times}$256 pixels, while it took approximately 5 seconds to decode a CR (Computed Radiography) chest image of 800${\times}$790 pixels. In the transmission of the image data using a CDMA 1X module (Code-Division Multiple Access 1st Generation), 256 byte/sec was considered a stable transmission rate, but packets were lost in the intervals at the transmission rate of 1Kbyte/sec. However, even with a transmission rate above 1 Kbyte/sec, packets were not lost in wireless LAN. Current PACS are not compatible with wireless networks. because it does not have an interface between wired and wireless. Thus, the mobile JPEG2000 image viewing system was developed in order to complement mobility-a limitation in PACS. Moreover, the weak-connections of the wireless network was enhanced by re-transmitting image data within a limitations The results of this study are expected to play an interface role between the current wired-networks PACS and the mobile devices.

  • PDF

Water Balance Projection Using Climate Change Scenarios in the Korean Peninsula (기후변화 시나리오를 활용한 미래 한반도 물수급 전망)

  • Kim, Cho-Rong;Kim, Young-Oh;Seo, Seung Beom;Choi, Su-Woong
    • Journal of Korea Water Resources Association
    • /
    • v.46 no.8
    • /
    • pp.807-819
    • /
    • 2013
  • This study proposes a new methodology for future water balance projection considering climate change by assigning a weight to each scenario instead of inputting future streamflows based on GCMs into a water balance model directly. K-nearest neighbor algorithm was employed to assign weights and streamflows in non-flood period (October to the following June) was selected as the criterion for assigning weights. GCM-driven precipitation was input to TANK model to simulate future streamflow scenarios and Quantile Mapping was applied to correct bias between GCM hindcast and historical data. Based on these bias-corrected streamflows, different weights were assigned to each streamflow scenarios to calculate water shortage for the projection periods; 2020s (2010~2039), 2050s (2040~2069), and 2080s (2070~2099). As a result by applying the proposed methodology to project water shortage over the Korean Peninsula, average water shortage for 2020s is projected to increase to 10~32% comparing to the basis (1967~2003). In addition, according to getting decreased in streamflows in non-flood period gradually by 2080s, average water shortage for 2080s is projected to increase up to 97% (516.5 million $m^3/yr$) as maximum comparing to the basis. While the existing research on climate change gives radical increase in future water shortage, the results projected by the weighting method shows conservative change. This study has significance in the applicability of water balance projection regarding climate change, keeping the existing framework of national water resources planning and this lessens the confusion for decision-makers in water sectors.

VKOSPI Forecasting and Option Trading Application Using SVM (SVM을 이용한 VKOSPI 일 중 변화 예측과 실제 옵션 매매에의 적용)

  • Ra, Yun Seon;Choi, Heung Sik;Kim, Sun Woong
    • Journal of Intelligence and Information Systems
    • /
    • v.22 no.4
    • /
    • pp.177-192
    • /
    • 2016
  • Machine learning is a field of artificial intelligence. It refers to an area of computer science related to providing machines the ability to perform their own data analysis, decision making and forecasting. For example, one of the representative machine learning models is artificial neural network, which is a statistical learning algorithm inspired by the neural network structure of biology. In addition, there are other machine learning models such as decision tree model, naive bayes model and SVM(support vector machine) model. Among the machine learning models, we use SVM model in this study because it is mainly used for classification and regression analysis that fits well to our study. The core principle of SVM is to find a reasonable hyperplane that distinguishes different group in the data space. Given information about the data in any two groups, the SVM model judges to which group the new data belongs based on the hyperplane obtained from the given data set. Thus, the more the amount of meaningful data, the better the machine learning ability. In recent years, many financial experts have focused on machine learning, seeing the possibility of combining with machine learning and the financial field where vast amounts of financial data exist. Machine learning techniques have been proved to be powerful in describing the non-stationary and chaotic stock price dynamics. A lot of researches have been successfully conducted on forecasting of stock prices using machine learning algorithms. Recently, financial companies have begun to provide Robo-Advisor service, a compound word of Robot and Advisor, which can perform various financial tasks through advanced algorithms using rapidly changing huge amount of data. Robo-Adviser's main task is to advise the investors about the investor's personal investment propensity and to provide the service to manage the portfolio automatically. In this study, we propose a method of forecasting the Korean volatility index, VKOSPI, using the SVM model, which is one of the machine learning methods, and applying it to real option trading to increase the trading performance. VKOSPI is a measure of the future volatility of the KOSPI 200 index based on KOSPI 200 index option prices. VKOSPI is similar to the VIX index, which is based on S&P 500 option price in the United States. The Korea Exchange(KRX) calculates and announce the real-time VKOSPI index. VKOSPI is the same as the usual volatility and affects the option prices. The direction of VKOSPI and option prices show positive relation regardless of the option type (call and put options with various striking prices). If the volatility increases, all of the call and put option premium increases because the probability of the option's exercise possibility increases. The investor can know the rising value of the option price with respect to the volatility rising value in real time through Vega, a Black-Scholes's measurement index of an option's sensitivity to changes in the volatility. Therefore, accurate forecasting of VKOSPI movements is one of the important factors that can generate profit in option trading. In this study, we verified through real option data that the accurate forecast of VKOSPI is able to make a big profit in real option trading. To the best of our knowledge, there have been no studies on the idea of predicting the direction of VKOSPI based on machine learning and introducing the idea of applying it to actual option trading. In this study predicted daily VKOSPI changes through SVM model and then made intraday option strangle position, which gives profit as option prices reduce, only when VKOSPI is expected to decline during daytime. We analyzed the results and tested whether it is applicable to real option trading based on SVM's prediction. The results showed the prediction accuracy of VKOSPI was 57.83% on average, and the number of position entry times was 43.2 times, which is less than half of the benchmark (100 times). A small number of trading is an indicator of trading efficiency. In addition, the experiment proved that the trading performance was significantly higher than the benchmark.

Measurement and Quality Control of MIROS Wave Radar Data at Dokdo (독도 MIROS Wave Radar를 이용한 파랑관측 및 품질관리)

  • Jun, Hyunjung;Min, Yongchim;Jeong, Jin-Yong;Do, Kideok
    • Journal of Korean Society of Coastal and Ocean Engineers
    • /
    • v.32 no.2
    • /
    • pp.135-145
    • /
    • 2020
  • Wave observation is widely used to direct observation method for observing the water surface elevation using wave buoy or pressure gauge and remote-sensing wave observation method. The wave buoy and pressure gauge can produce high-quality wave data but have disadvantages of the high risk of damage and loss of the instrument, and high maintenance cost in the offshore area. On the other hand, remote observation method such as radar is easy to maintain by installing the equipment on the land, but the accuracy is somewhat lower than the direct observation method. This study investigates the data quality of MIROS Wave and Current Radar (MWR) installed at Dokdo and improve the data quality of remote wave observation data using the wave buoy (CWB) observation data operated by the Korea Meteorological Administration. We applied and developed the three types of wave data quality control; 1) the combined use (Optimal Filter) of the filter designed by MIROS (Reduce Noise Frequency, Phillips Check, Energy Level Check), 2) Spike Test Algorithm (Spike Test) developed by OOI (Ocean Observatories Initiative) and 3) a new filter (H-Ts QC) using the significant wave height-period relationship. As a result, the wave observation data of MWR using three quality control have some reliability about the significant wave height. On the other hand, there are still some errors in the significant wave period, so improvements are required. Also, since the wave observation data of MWR is different somewhat from the CWB data in high waves of over 3 m, further research such as collection and analysis of long-term remote wave observation data and filter development is necessary.

Development of Sample Survey Design for the Industrial Research and Development Statistics (표본조사에 의한 기업 연구개발활동 통계 작성방안)

  • Cho, Seong-Pyo;Park, Sun-Young;Han, Ki-In;Noh, Min-Sun
    • Journal of Technology Innovation
    • /
    • v.17 no.2
    • /
    • pp.1-23
    • /
    • 2009
  • The Survey on the Industrial Research and Development(R&D) is the primary source of information on R&D performed by Korea industrial sector. The results of the survey are used to assess trends in R&D expenditures. Government agencies, corporations, and research organizations use the data to investigate productivity determinants, formulate tax policy, and compare individual company performance with industry averages. Recently, Korea Industrial Technology Association(KOITA) has collected the data by complete enumeration. Koita has, currently, considered sample survey because the number of R&D institutions in industry has been dramatically increased. This study develops survey design for the industrial research and development(R&D) statistics by introducing a sample survey. Companies are divided into 8 groups according to the amount of R&D expenditures and firm size or type. We collect the sample from 24 or 8 sampling strata and compare the results with those of complete enumeration survey. The estimates from 24 sampling strata are not significantly different to the results of complete enumeration survey. We propose the survey design as follows: Companies are divided into 11 groups including the companies of which R&D expenditures are unknown. All large companies are included in the survey and medium and small companies are sampled from 70% and 3%. Simple random sampling (SRS) is applied to the small company partition since they show uniform distribution in R&D expenditures. The independent probability proportionate to size (PPS) sampling procedure may be applied to those companies identified as 'not R&D performers'. When respondents do not provide the requested information, estimates for the missing data are made using imputation algorithms. In the future study, new key variables should be developed in survey questionnaires.

  • PDF