• Title/Summary/Keyword: computer algorithm


Performance analysis of Frequent Itemset Mining Technique based on Transaction Weight Constraints (트랜잭션 가중치 기반의 빈발 아이템셋 마이닝 기법의 성능분석)

  • Yun, Unil; Pyun, Gwangbum
    • Journal of Internet Computing and Services / v.16 no.1 / pp.67-74 / 2015
  • In recent years, frequent itemset mining that considers the importance of each item has been intensively studied as one of the important issues in the data mining field. According to the strategy used to exploit item importance, itemset mining approaches based on item importance are classified as follows: weighted frequent itemset mining, frequent itemset mining using transactional weights, and utility itemset mining. In this paper, we perform an empirical analysis of frequent itemset mining algorithms based on transactional weights. These algorithms compute transactional weights from the weight of each item in large databases, and they discover weighted frequent itemsets on the basis of item frequency and the weight of each transaction. Consequently, the importance of a particular transaction can be seen through database analysis, because a transaction's weight is higher when it contains many items with high weights. We not only analyze the advantages and disadvantages but also compare the performance of the most well-known algorithms in the field of frequent itemset mining based on transactional weights. As a representative of frequent itemset mining using transactional weights, WIS introduced the concept and strategies of transactional weights. In addition, there are other state-of-the-art algorithms, WIT-FWIs, WIT-FWIs-MODIFY, and WIT-FWIs-DIFF, for extracting itemsets with weight information. To efficiently mine weighted frequent itemsets, these three algorithms use a special lattice-like data structure called the WIT-tree. They do not need an additional database scan after the WIT-tree has been constructed, since each node of the WIT-tree stores item information such as the item and its transaction IDs. In particular, traditional algorithms perform a number of database scans to mine weighted itemsets, whereas the WIT-tree-based algorithms avoid this overhead by reading the database only once. Additionally, the algorithms generate each new itemset of length N+1 from two different itemsets of length N. To discover new weighted itemsets, WIT-FWIs combines itemsets by using the information of the transactions that contain all of them. WIT-FWIs-MODIFY reduces the number of operations needed to calculate the frequency of a new itemset. WIT-FWIs-DIFF uses a technique based on the difference of two itemsets. To compare and analyze the performance of the algorithms in various environments, we use real datasets of two types (dense and sparse) and measure runtime and maximum memory usage. Moreover, a scalability test is conducted to evaluate the stability of each algorithm when the size of the database changes. As a result, WIT-FWIs and WIT-FWIs-MODIFY show the best performance on the dense dataset, while on the sparse dataset WIT-FWIs-DIFF mines more efficiently than the other algorithms. Compared to the WIT-tree-based algorithms, WIS, which is based on the Apriori technique, has the worst efficiency because it requires far more computations than the others on average.
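As a rough illustration of the transactional-weight idea described in this abstract, the sketch below computes a transaction's weight as the average of its item weights and enumerates itemsets whose weighted support reaches a threshold. The item weights, threshold, and the exact weighting formula are illustrative assumptions, not the precise definitions used by WIS or the WIT-FWIs family.

```python
# Minimal sketch of transactional-weight-based frequent itemset checking.
# The weighting formula (mean of item weights) and the threshold are
# assumptions for illustration, not the exact definitions of WIS/WIT-FWIs.

from itertools import combinations

# Hypothetical item weights and transaction database.
item_weights = {"a": 0.9, "b": 0.7, "c": 0.4, "d": 0.2}
transactions = [
    {"a", "b", "c"},
    {"a", "b"},
    {"b", "c", "d"},
    {"a", "c", "d"},
]

def transaction_weight(tx):
    """Weight of a transaction = average weight of the items it contains."""
    return sum(item_weights[i] for i in tx) / len(tx)

def weighted_support(itemset):
    """Sum of the weights of the transactions containing every item of the itemset."""
    return sum(transaction_weight(tx) for tx in transactions if itemset <= tx)

def weighted_frequent_itemsets(min_wsup):
    """Enumerate all itemsets whose weighted support reaches the threshold."""
    items = sorted(item_weights)
    result = {}
    for size in range(1, len(items) + 1):
        for combo in combinations(items, size):
            wsup = weighted_support(set(combo))
            if wsup >= min_wsup:
                result[combo] = wsup
    return result

print(weighted_frequent_itemsets(min_wsup=1.0))
```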

Development of Intelligent Job Classification System based on Job Posting on Job Sites (구인구직사이트의 구인정보 기반 지능형 직무분류체계의 구축)

  • Lee, Jung Seung
    • Journal of Intelligence and Information Systems / v.25 no.4 / pp.123-139 / 2019
  • The job classification systems of major job sites differ from site to site and also differ from the job classification system of the SQF (Sectoral Qualifications Framework) proposed for the SW field. Therefore, a new job classification system that SW companies, SW job seekers, and job sites can all understand is needed. The purpose of this study is to establish a standard job classification system that reflects market demand by analyzing the SQF against the job posting information of major job sites and the NCS (National Competency Standards). For this purpose, an association analysis between the occupations of major job sites is conducted, and association rules between the SQF and those occupations are derived. Using these association rules, we propose an intelligent job classification system based on data that maps the job classification systems of major job sites to the SQF. First, major job sites are selected to obtain information on the job classification systems used in the SW market. We then identify ways to collect job information from each site and collect the data through open APIs. Focusing on the relationships in the data, we keep only the job postings published on multiple job sites at the same time and delete the rest. Next, we map the job classification systems between job sites using the association rules derived from the association analysis. After completing the mapping between these market classifications, we discuss the result with experts, further map it to the SQF, and finally propose a new job classification system. As a result, more than 30,000 job postings were collected in XML format using the open APIs of 'WORKNET', 'JOBKOREA', and 'saramin', the main job sites in Korea. After filtering down to about 900 job postings simultaneously published on multiple job sites, 800 association rules were derived by applying the Apriori algorithm, a frequent pattern mining method. Based on these 800 rules, the job classification systems of WORKNET, JOBKOREA, and saramin and the SQF job classification system were mapped and organized into first through fourth levels. In the new job taxonomy, the first primary class, covering IT consulting, computer systems, networks, and security-related jobs, consists of three secondary, five tertiary, and five quaternary classifications. The second primary class, covering databases and system operation jobs, consists of three secondary, three tertiary, and four quaternary classifications. The third primary class, covering web planning, web programming, web design, and games, consists of four secondary, nine tertiary, and two quaternary classifications. The last primary class, covering ICT management and computer and communication engineering technology jobs, consists of three secondary and six tertiary classifications. In particular, the new job classification system has a relatively flexible depth of classification, unlike other existing systems: WORKNET divides jobs into three levels, JOBKOREA divides jobs into two levels with further subdivision by keywords, and saramin likewise divides jobs into two levels with keyword-level subdivision. The newly proposed standard job classification system accepts some keyword-based jobs and treats some product names as jobs. In the new classification system, some jobs stop at the second level while others are subdivided down to the fourth level, reflecting the idea that not all jobs can be broken down into the same number of steps. The proposed system also combines the rules obtained from the association analysis of the collected market data with experts' opinions. Therefore, the newly proposed job classification system can be regarded as a data-based intelligent job classification system that reflects market demand, unlike existing job classification systems. This study is meaningful in that it suggests a new job classification system reflecting market demand by mapping occupations to one another on the basis of data and association analysis rather than the intuition of a few experts. However, it has the limitation that it cannot fully reflect market demand as it changes over time, because the data were collected at a single point in time. As market demand changes over time, including seasonal factors and the timing of major corporate recruitment, continuous data monitoring and repeated experiments are needed to achieve more accurate matching. The results of this study can be used to suggest directions for improving the SQF in the SW industry, and the approach is expected to transfer to other industries after its success in the SW field.
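To make the Apriori-style mapping step concrete, the following toy sketch treats each cross-posted job listing as a transaction of category labels and derives high-confidence rules between labels from different sites. The postings, labels, and thresholds are hypothetical; the paper's actual rule mining was run on the collected WORKNET/JOBKOREA/saramin data.

```python
# Toy Apriori-style pass over job postings that appear on multiple sites at once.
# Each "transaction" holds the category labels the same posting received on the
# different sites; labels, support, and confidence thresholds are hypothetical.

from collections import Counter
from itertools import combinations

postings = [
    {"WORKNET:security", "JOBKOREA:network", "SQF:infosec"},
    {"WORKNET:security", "JOBKOREA:network", "SQF:infosec"},
    {"WORKNET:webdev", "JOBKOREA:web_programming", "SQF:web"},
    {"WORKNET:webdev", "JOBKOREA:web_programming", "SQF:web"},
    {"WORKNET:dba", "JOBKOREA:database", "SQF:db_admin"},
]

min_support = 2       # minimum number of co-postings
min_confidence = 0.8  # minimum rule confidence

# Frequent single labels.
label_counts = Counter(label for tx in postings for label in tx)
frequent_labels = {l for l, c in label_counts.items() if c >= min_support}

# Frequent label pairs (candidate cross-site mappings).
pair_counts = Counter()
for tx in postings:
    for pair in combinations(sorted(tx & frequent_labels), 2):
        pair_counts[pair] += 1

# Association rules A -> B with confidence = support(A, B) / support(A).
for (a, b), c in pair_counts.items():
    if c < min_support:
        continue
    for lhs, rhs in ((a, b), (b, a)):
        conf = c / label_counts[lhs]
        if conf >= min_confidence:
            print(f"{lhs} -> {rhs}  support={c}, confidence={conf:.2f}")
```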

Incorporating Social Relationship discovered from User's Behavior into Collaborative Filtering (사용자 행동 기반의 사회적 관계를 결합한 사용자 협업적 여과 방법)

  • Thay, Setha; Ha, Inay; Jo, Geun-Sik
    • Journal of Intelligence and Information Systems / v.19 no.2 / pp.1-20 / 2013
  • Nowadays, social networks are huge communication platforms that allow people to connect with one another and bring users together to share common interests, experiences, and their daily activities. Users spend hours per day maintaining personal information and interacting with other people via posting, commenting, messaging, games, social events, and applications. Due to the growth of users' distributed information in social networks, there is great potential to utilize this social data to enhance the quality of recommender systems. Some research on social network analysis investigates how social networks can be used in the recommendation domain. Among these studies, we are interested in taking advantage of the interactions between a user and others in a social network, which can be identified as social relationships. Furthermore, most users' purchasing decisions depend on the suggestions of people who have either the same preferences or a closer relationship. For this reason, we believe that users' relationships in a social network can provide an effective way to improve the quality with which a recommender system predicts users' interests. Therefore, the social relationships between users found in a social network are a useful factor for improving the prediction of user preferences in the conventional approach. Recommender systems are dramatically increasing in popularity and are currently used by many e-commerce sites such as Amazon.com, Last.fm, and eBay.com. Collaborative filtering (CF) is one of the essential and powerful techniques in recommender systems for suggesting appropriate items to a user by learning the user's preferences. The CF method focuses on user data and automatically predicts a user's interests by gathering information from users who share a similar background and preferences. Specifically, the intention of the CF method is to find users who have similar preferences and to suggest to the target user the items most preferred by those nearest-neighbor users. There are two basic units that CF must consider: the user and the item. Each user needs to provide rating values on items, e.g., movies, products, and books, to indicate their interest in those items. In addition, CF uses the user-rating matrix to find a group of users whose ratings are similar to the target user's, and then predicts the unknown rating values for items the target user has not rated. Currently, CF has been successfully implemented in both information filtering and e-commerce applications. However, some important challenges remain, such as cold start, data sparsity, and scalability, which affect the quality and accuracy of prediction. To overcome these challenges, many researchers have proposed various kinds of CF methods, such as hybrid CF, trust-based CF, and social-network-based CF. To improve the recommendation performance and prediction accuracy of standard CF, in this paper we propose a method that integrates the traditional CF technique with the social relationships between users discovered from their behavior in a social network, namely Facebook. We identify users' relationships from behavior such as posts and comments exchanged with friends on Facebook. We believe that the social relationships implicitly inferred from user behavior can be applied to compensate for the limitations of the conventional approach. Therefore, we extract the posts and comments of each user through the Facebook Graph API and calculate a feature score for each term to obtain a feature vector for computing user similarity. Then, we combine the result with the similarity value computed using the traditional CF technique. Finally, our system provides a list of recommended items according to the neighbor users who have the highest total similarity to the target user. To verify and evaluate the proposed method, we performed an experiment on data collected from our Movies Rating System. Prediction accuracy is evaluated in terms of MAE to demonstrate how correct the recommendations given to users are. Performance is then evaluated in terms of precision, recall, and F1-measure to show the effectiveness of our method, and coverage is also evaluated to assess the ability to generate recommendations. The experimental results show that the proposed method outperforms the baseline and is more accurate in suggesting items to users. In particular, using users' behavior in the social network yields an improvement of up to 6% in recommendation accuracy. Moreover, the recommendation performance experiment shows that incorporating the social relationships observed from user behavior into CF is beneficial, producing a 7% performance improvement over the benchmark methods. Finally, we confirm that interactions between users in a social network can enhance accuracy and give better recommendations than the conventional approach.
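The sketch below illustrates the blending idea from this abstract: a rating-based CF similarity and a social similarity computed from post/comment term vectors are combined into one score. The 50/50 blend, the raw term frequencies, and the toy data are illustrative assumptions, not the paper's exact scoring formula.

```python
# Sketch of blending rating-based CF similarity with a social similarity
# derived from users' post/comment terms. The 50/50 blend and raw term
# frequencies are illustrative assumptions, not the paper's exact formula.

import math

def cosine(u, v):
    """Cosine similarity between two sparse vectors given as dicts."""
    common = set(u) & set(v)
    dot = sum(u[k] * v[k] for k in common)
    nu = math.sqrt(sum(x * x for x in u.values()))
    nv = math.sqrt(sum(x * x for x in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

# Item ratings (CF side) and term frequencies from posts/comments (social side).
ratings = {
    "alice": {"movie1": 5, "movie2": 3, "movie3": 4},
    "bob":   {"movie1": 4, "movie2": 2, "movie4": 5},
}
post_terms = {
    "alice": {"thriller": 3, "soundtrack": 1, "sequel": 2},
    "bob":   {"thriller": 2, "sequel": 1, "comedy": 4},
}

def combined_similarity(u, v, alpha=0.5):
    """alpha * rating similarity + (1 - alpha) * social (post/comment) similarity."""
    sim_cf = cosine(ratings[u], ratings[v])
    sim_social = cosine(post_terms[u], post_terms[v])
    return alpha * sim_cf + (1 - alpha) * sim_social

print(combined_similarity("alice", "bob"))
```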

An Energy Efficient Cluster Management Method based on Autonomous Learning in a Server Cluster Environment (서버 클러스터 환경에서 자율학습기반의 에너지 효율적인 클러스터 관리 기법)

  • Cho, Sungchul; Kwak, Hukeun; Chung, Kyusik
    • KIPS Transactions on Computer and Communication Systems / v.4 no.6 / pp.185-196 / 2015
  • Energy-aware server clusters aim to reduce power consumption as much as possible while keeping QoS (Quality of Service) at the level of energy-non-aware server clusters. They adjust the power mode of each server at fixed or variable time intervals so that only the minimum number of servers needed to handle the current user requests are kept ON. Previous studies on energy-aware server clusters put effort into reducing power consumption further or keeping QoS, but they do not consider energy efficiency well. In this paper, we propose an energy-efficient cluster management method based on autonomous learning for energy-aware server clusters. Using parameters optimized through autonomous learning, our method adjusts server power modes to achieve the maximum performance per unit of power consumed. Our method repeats the following procedure for adjusting the power modes of the servers. First, according to the current load and traffic pattern, it classifies the current workload into a predetermined pattern type. Second, it searches the learning table to check whether learning has already been performed for that workload pattern type; if so, it uses the already-stored parameters, and otherwise it performs learning for that pattern type to find the best parameters in terms of energy efficiency and stores the optimized parameters. Third, it adjusts the server power modes with those parameters. We implemented the proposed method and performed experiments on a cluster of 16 servers using three different kinds of load patterns. The experimental results show that the proposed method is better than the existing methods in terms of energy efficiency: the number of good responses per unit of power consumed in the proposed method is 99.8%, 107.5%, and 141.8% of that of the existing static method, and 102.0%, 107.0%, and 106.8% of that of the existing prediction method, for the banking load pattern, real load pattern, and virtual load pattern, respectively.
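The skeleton below mirrors the three-step control loop described in the abstract: classify the workload pattern, reuse or learn parameters, then adjust server power modes. The pattern classes, the parameter search, and the adjustment rule are placeholders under stated assumptions, not the paper's actual policy.

```python
# Skeleton of the repeated control loop: classify the current workload pattern,
# reuse learned parameters if available, otherwise learn and store them, then
# adjust server power modes. Pattern classes, the parameter search, and the
# adjustment rule are placeholders, not the paper's exact logic.

learning_table = {}  # workload pattern type -> best parameters found so far

def classify_pattern(load, traffic):
    """Map the current load/traffic into a predetermined pattern type (assumed rule)."""
    if traffic == "bursty":
        return "bursty"
    return "heavy" if load > 0.7 else "light"

def learn_parameters(pattern):
    """Search for energy-efficient control parameters for this pattern (stub)."""
    candidates = [{"interval_s": 30, "margin": 0.1},
                  {"interval_s": 60, "margin": 0.2}]
    # In the real method this search would be driven by measured responses per watt.
    return max(candidates, key=lambda p: p["interval_s"])

def adjust_power_modes(servers, load, params):
    """Keep ON only the minimum number of servers needed for the current load."""
    needed = max(1, int(len(servers) * (load + params["margin"])))
    return {s: ("ON" if i < needed else "OFF") for i, s in enumerate(servers)}

def control_step(servers, load, traffic):
    pattern = classify_pattern(load, traffic)
    if pattern not in learning_table:               # learn once per pattern type
        learning_table[pattern] = learn_parameters(pattern)
    return adjust_power_modes(servers, load, learning_table[pattern])

print(control_step([f"srv{i}" for i in range(16)], load=0.4, traffic="steady"))
```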

Natural Language Processing Model for Data Visualization Interaction in Chatbot Environment (챗봇 환경에서 데이터 시각화 인터랙션을 위한 자연어처리 모델)

  • Oh, Sang Heon; Hur, Su Jin; Kim, Sung-Hee
    • KIPS Transactions on Computer and Communication Systems / v.9 no.11 / pp.281-290 / 2020
  • With the spread of smartphones, services that make use of personalized data are increasing. In particular, healthcare-related services deal with a variety of data, and data visualization techniques are used to show this data effectively. As data visualization techniques are used, interaction with the visualizations is naturally emphasized as well. In the PC environment, interaction with a data visualization is performed with a mouse, so various kinds of data filtering can be provided. In a mobile environment, on the other hand, the screen is small, it is difficult to recognize whether an interaction is even possible, and only the limited visualizations built into the app can be offered through button touches. To overcome these limitations of interaction in the mobile environment, we aim to enable data visualization interactions through conversations with a chatbot, so that users can examine their individual data through various visualizations. To do this, the user's request must be converted into a database query, and the result data must be retrieved with that query from the database in which the data is stored periodically. Many studies are currently being done on converting natural language into queries, but research on converting user requests into queries for visualization has not yet been done. Therefore, in this paper we focus on query generation in a situation where the data visualization technique has been determined in advance. The supported interactions are filtering on x-axis values and comparison between two groups. The test scenario used step-count data: filtering on an x-axis period was shown as a bar graph, and a comparison between two groups was shown as a line graph. To develop a natural language processing model that can receive the requested information through visualization, about 15,800 training examples were collected through a survey of 1,000 people. As a result of the algorithm development and performance evaluation, about 89% accuracy was obtained for the classification model and about 99% accuracy for the query generation model.
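The following sketch shows what the query-generation step could look like once the classification model has produced an interaction type and its slots, building a SQL query over an assumed step-count table. The schema, intent names, and slot fields are hypothetical (the abstract does not specify the model's output format).

```python
# Sketch of the query-generation step: once the classification model has
# decided the interaction type and its slots, build a SQL query over an
# assumed step_counts(user_id, day, steps) table. Schema, intent names, and
# slot fields are hypothetical, not the paper's actual model output.

def build_query(intent, slots):
    if intent == "filter_period":
        # Bar chart: daily steps within an x-axis period.
        return ("SELECT day, steps FROM step_counts "
                "WHERE user_id = ? AND day BETWEEN ? AND ? ORDER BY day",
                (slots["user_id"], slots["start"], slots["end"]))
    if intent == "compare_groups":
        # Line chart: compare two groups (e.g. weekdays vs. weekends, SQLite date syntax).
        return ("SELECT day, steps, CASE WHEN strftime('%w', day) IN ('0','6') "
                "THEN 'weekend' ELSE 'weekday' END AS grp "
                "FROM step_counts WHERE user_id = ? ORDER BY day",
                (slots["user_id"],))
    raise ValueError(f"unsupported intent: {intent}")

sql, params = build_query("filter_period",
                          {"user_id": 1, "start": "2020-10-01", "end": "2020-10-07"})
print(sql, params)
```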

Perceptions of Information Technology Competencies among Gifted and Non-gifted High School Students (영재와 평재 고등학생의 IT 역량에 대한 인식)

  • Shin, Min; Ahn, Doehee
    • Journal of Gifted/Talented Education / v.25 no.2 / pp.339-358 / 2015
  • This study examined perceptions of information technology (IT) competencies among gifted and non-gifted students (i.e., information science high school students and technical high school students). Of the 370 high school students surveyed from three high schools (a gifted academy, an information science high school, and a technical high school) in three metropolitan cities in Korea, 351 students completed and returned the questionnaires, yielding a total response rate of 94.86%. The high school students recognized IT professional competence as the most important factor when recruiting IT employees, and they considered practice-oriented education to be the most needed for improving their IT skills. In addition, the sub-factor of IT core competencies rated most important by gifted academy students and information science high school students was basic software skills, whereas technical high school students responded that networking and security capabilities were the most needed. Finally, the most appropriate training courses for enhancing IT competencies were perceived differently by gifted and non-gifted students. Gifted academy students responded that an 'algorithm' course was most needed for enhancing IT competencies, information science high school students responded that 'data structures' and 'computer architecture' were most needed, and technical high school students responded that a 'programming language' course was most needed. The results are discussed in relation to IT corporate and school settings.

Target Advertisement Service using a Viewer's Profile Reasoning (시청자 프로파일 추론 기법을 이용한 표적 광고 서비스)

  • Kim Munjo; Im Jeongyeon; Kang Sanggil; Kim Munchrul; Kang Kyungok
    • Journal of Broadcast Engineering / v.10 no.1 s.26 / pp.43-56 / 2005
  • In the existing broadcasting environment, it is not easy to provide bi-directional services between a broadcasting server and TV audiences. In uni-directional broadcasting environments, most TV programs are scheduled according to the viewers' popular watching times, and the advertisement content within these programs is mainly arranged by the popularity and ages of the audience. Audiences make an effort to sort and select their favorite programs, yet the advertisements that accompany those programs are not served to the appropriate audiences efficiently. Such randomly provided advertisement content can lead to audience indifference and avoidance. In this paper, we propose a target advertisement service for the appropriate distribution of advertisement content. The proposed service estimates an audience member's profile without any disclosure of private information and provides target-advertised content using the estimated profile. For the experiments, we used real audiences' TV usage history, such as the ages of viewers and the genres and watching times of programs, obtained from AC Neilson Korea. We show the accuracy of the proposed target advertisement algorithm with NDS (Normalized Distance Sum) and the vector correlation method, and describe the implementation of our target advertisement service system.
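The abstract does not define NDS precisely, so the sketch below only illustrates the general spirit of distance-based profile reasoning: pick the demographic group whose average viewing pattern is closest to the viewer's pattern under a normalized distance sum. The group profiles, features, and normalization are assumptions and may differ from the paper's NDS.

```python
# Rough sketch in the spirit of NDS (Normalized Distance Sum) profile reasoning:
# pick the demographic group whose average viewing pattern is closest to the
# viewer's pattern. The group profiles, features, and the exact distance
# normalization are assumptions; the paper's NDS definition may differ.

def normalized_distance_sum(viewer, group_profile):
    """Sum of per-feature absolute differences, each normalized by the group value."""
    total = 0.0
    for genre, avg_minutes in group_profile.items():
        total += abs(viewer.get(genre, 0.0) - avg_minutes) / max(avg_minutes, 1e-9)
    return total

# Hypothetical average weekly viewing minutes per genre for each age group.
group_profiles = {
    "teens":   {"music": 120, "drama": 40,  "news": 10},
    "adults":  {"music": 30,  "drama": 90,  "news": 60},
    "seniors": {"music": 10,  "drama": 100, "news": 120},
}

viewer_history = {"music": 25, "drama": 80, "news": 70}

estimated_group = min(group_profiles,
                      key=lambda g: normalized_distance_sum(viewer_history,
                                                            group_profiles[g]))
print(estimated_group)  # the ad server would then pick ads targeted at this group
```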

Development of Gated Myocardial SPECT Analysis Software and Evaluation of Left Ventricular Contraction Function (게이트 심근 SPECT 분석 소프트웨어의 개발과 좌심실 수축 기능 평가)

  • Lee, Byeong-Il; Lee, Dong-Soo; Lee, Jae-Sung; Chung, June-Key; Lee, Myung-Chul; Choi, Heung-Kook
    • The Korean Journal of Nuclear Medicine / v.37 no.2 / pp.73-82 / 2003
  • Objectives: A new software package (Cardiac SPECT Analyzer: CSA) was developed for quantification of volumes and ejection fraction on gated myocardial SPECT. Volumes and ejection fraction computed by CSA were validated by comparison with those quantified by the Quantitative Gated SPECT (QGS) software. Materials and Methods: Gated myocardial SPECT was performed in 40 patients with ejection fractions ranging from 15% to 85%. In 26 patients, gated myocardial SPECT was acquired again with the patients in situ. A cylinder model was used to eliminate noise semi-automatically, and profile data were extracted using Gaussian fitting after smoothing. The boundary points of the endo- and epicardium were found using an iterative learning algorithm. End-diastolic volume (EDV), end-systolic volume (ESV), and ejection fraction (EF) were calculated. These values were compared with those calculated by QGS; the same gated SPECT data were also repeatedly quantified by CSA, and the variation of the values across sequential measurements of the same patients on the repeated acquisition was examined. Results: For the 40 patients, EF, EDV, and ESV by CSA were correlated with those by QGS, with correlation coefficients of 0.97, 0.92, and 0.96. Two standard deviations (SD) of EF on the Bland-Altman plot was 10.1%. Repeated measurements of EF, EDV, and ESV by CSA were correlated with each other, with coefficients of 0.96, 0.99, and 0.99, respectively. On repeated acquisition, reproducibility was also excellent, with correlation coefficients of 0.89, 0.97, and 0.98, coefficients of variation of 8.2%, 5.4 mL, and 8.5 mL, and 2SD of 10.6%, 21.2 mL, and 16.4 mL on the Bland-Altman plot for EF, EDV, and ESV. Conclusion: We developed the CSA software for quantification of volumes and ejection fraction on gated myocardial SPECT. The volumes and ejection fraction quantified using this software were found to be valid in terms of correctness and precision.
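Two pieces of the processing chain mentioned above lend themselves to a small sketch: Gaussian fitting of a count profile and the standard ejection-fraction formula EF = (EDV − ESV) / EDV. The profile data below is synthetic and this is not the CSA implementation itself.

```python
# Illustration of two steps mentioned above: Gaussian fitting of a (synthetic)
# myocardial count profile and the standard ejection-fraction formula
# EF = (EDV - ESV) / EDV. This is not the CSA implementation itself.

import numpy as np
from scipy.optimize import curve_fit

def gaussian(x, amplitude, center, sigma):
    return amplitude * np.exp(-(x - center) ** 2 / (2.0 * sigma ** 2))

# Synthetic count profile across the myocardial wall (counts vs. position).
x = np.linspace(-10, 10, 41)
y = gaussian(x, amplitude=100.0, center=1.0, sigma=3.0) + np.random.normal(0, 2, x.size)

params, _ = curve_fit(gaussian, x, y, p0=[80.0, 0.0, 2.0])
amplitude, center, sigma = params  # the fitted center/width locate the wall profile

def ejection_fraction(edv_ml, esv_ml):
    """EF (%) from end-diastolic and end-systolic volumes."""
    return 100.0 * (edv_ml - esv_ml) / edv_ml

print(f"fitted center={center:.2f}, sigma={sigma:.2f}, EF={ejection_fraction(120, 48):.1f}%")
```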

Selectively Partial Encryption of Images in Wavelet Domain (웨이블릿 영역에서의 선택적 부분 영상 암호화)

  • ;Dujit Dey
    • The Journal of Korean Institute of Communications and Information Sciences / v.28 no.6C / pp.648-658 / 2003
  • As the usage of image/video content increases, a security problem arises for paid image data or data requiring confidentiality. This paper proposes an image encryption methodology for hiding image information; the target data are the results of quantization in the wavelet domain. The method encrypts only part of the image data rather than the whole original image, using three kinds of data selection. First, exploiting the fact that the wavelet transform decomposes the original image into frequency sub-bands, only some of the frequency sub-bands are included in the encryption, which is enough to make the resulting image unrecognizable. Second, of the data representing each pixel, only the MSBs are taken for encryption. Finally, the pixels to be encrypted in a specific sub-band are selected randomly by using an LFSR (Linear Feedback Shift Register). Part of the encryption key is used as the seed value of the LFSR and for selecting which parallel output bits of the LFSR drive the random selection, so that the strength of the encryption algorithm is increased. The experiments were performed with the proposed methods implemented in software on about 500 images; the results showed that encrypting only about 1/1000 of the original image data is enough to make the original image unrecognizable. Consequently, the proposed methods are efficient image encryption methods that achieve a strong encryption effect with a small amount of encryption. In addition, several encryption schemes according to the selection of sub-bands and the number of LFSR output bits used for pixel selection are proposed, and it is shown that there exists a trade-off between execution time and encryption effect, meaning that the proposed methods can be used selectively according to the application area. Also, because the proposed methods operate at the application layer, they are expected to be a good solution for the end-to-end security problem, which is emerging as one of the important problems in networks with both wired and wireless sections.
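A toy version of the LFSR-driven selective encryption idea is sketched below: an LFSR selection bit decides which quantized coefficients are touched, and a keystream bit flips their MSBs. The tap polynomial, seed handling, and 8-bit quantization are illustrative choices, not the paper's exact parameters; since the operation is a plain XOR, running the same function again decrypts.

```python
# Toy version of the LFSR-driven selective encryption idea: pick wavelet
# coefficients with an LFSR-generated selection bit and flip their MSBs with a
# keystream. The tap polynomial, seed handling, and 8-bit quantization are
# illustrative choices, not the paper's exact parameters.

def lfsr_bits(seed, n, taps=(16, 15, 13, 4)):
    """Fibonacci LFSR over a 16-bit state; yields n pseudo-random bits."""
    state = seed & 0xFFFF
    for _ in range(n):
        feedback = 0
        for t in taps:
            feedback ^= (state >> (t - 1)) & 1
        yield state & 1
        state = (state >> 1) | (feedback << 15)

def encrypt_subband(coeffs, key_seed):
    """XOR the MSB of LFSR-selected 8-bit quantized coefficients."""
    select = list(lfsr_bits(key_seed, len(coeffs)))
    stream = list(lfsr_bits(key_seed ^ 0xA5A5, len(coeffs)))
    out = []
    for c, s, k in zip(coeffs, select, stream):
        if s:                      # only LFSR-selected coefficients are touched
            c ^= (k << 7)          # flip (or keep) the MSB according to the keystream
        out.append(c)
    return out

subband = [12, 200, 45, 190, 33, 90, 250, 7]      # hypothetical quantized coefficients
cipher = encrypt_subband(subband, key_seed=0xBEEF)
plain = encrypt_subband(cipher, key_seed=0xBEEF)  # XOR-based, so the same call decrypts
print(cipher, plain == subband)
```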

VKOSPI Forecasting and Option Trading Application Using SVM (SVM을 이용한 VKOSPI 일 중 변화 예측과 실제 옵션 매매에의 적용)

  • Ra, Yun Seon; Choi, Heung Sik; Kim, Sun Woong
    • Journal of Intelligence and Information Systems / v.22 no.4 / pp.177-192 / 2016
  • Machine learning is a field of artificial intelligence. It refers to an area of computer science concerned with giving machines the ability to perform their own data analysis, decision making, and forecasting. For example, one of the representative machine learning models is the artificial neural network, a statistical learning algorithm inspired by biological neural networks. Other machine learning models include the decision tree model, the naive Bayes model, and the SVM (support vector machine) model. Among these, we use the SVM model in this study because it is mainly used for the classification and regression analysis that fit our study well. The core principle of SVM is to find a reasonable hyperplane that separates different groups in the data space. Given information about the data in two groups, the SVM model judges to which group new data belongs based on the hyperplane obtained from the given data set. Thus, the greater the amount of meaningful data, the better the machine learning ability. In recent years, many financial experts have focused on machine learning, seeing the possibility of combining it with the financial field, where vast amounts of financial data exist. Machine learning techniques have proved to be powerful in describing non-stationary and chaotic stock price dynamics, and many studies have successfully forecast stock prices using machine learning algorithms. Recently, financial companies have begun to provide Robo-Advisor services, a compound of Robot and Advisor, which can perform various financial tasks through advanced algorithms using rapidly changing, huge amounts of data. A Robo-Advisor's main task is to advise investors according to their personal investment propensity and to manage their portfolios automatically. In this study, we propose a method of forecasting the Korean volatility index, VKOSPI, using the SVM model, one of the machine learning methods, and apply it to real option trading to increase trading performance. VKOSPI is a measure of the future volatility of the KOSPI 200 index based on KOSPI 200 index option prices; it is similar to the VIX index, which is based on S&P 500 option prices in the United States. The Korea Exchange (KRX) calculates and announces the real-time VKOSPI index. VKOSPI behaves like ordinary volatility and affects option prices: the direction of VKOSPI and option prices show a positive relationship regardless of the option type (call and put options with various strike prices). If volatility increases, all call and put option premiums increase because the probability of the options being exercised increases. An investor can know the rise in an option price with respect to a rise in volatility in real time through Vega, the Black-Scholes measure of an option's sensitivity to changes in volatility. Therefore, accurate forecasting of VKOSPI movements is one of the important factors that can generate profit in option trading. In this study, we verified with real option data that an accurate forecast of VKOSPI can produce a large profit in real option trading. To the best of our knowledge, there have been no studies on predicting the direction of VKOSPI with machine learning and applying the prediction to actual option trading. In this study we predicted daily VKOSPI changes with the SVM model and then took an intraday option strangle position, which profits as option prices fall, only when VKOSPI was expected to decline during the day. We analyzed the results and tested whether the approach is applicable to real option trading based on the SVM's predictions. The results showed that the prediction accuracy for VKOSPI was 57.83% on average, and the number of position entries was 43.2, less than half of the benchmark (100). A small number of trades is an indicator of trading efficiency. In addition, the experiment showed that the trading performance was significantly higher than the benchmark.
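A minimal scikit-learn sketch of the classification step is given below: an SVM is trained to predict whether VKOSPI will fall during the day. The lagged features and the synthetic data are placeholders under stated assumptions, not the paper's actual inputs or results.

```python
# Minimal scikit-learn sketch of the classification step: train an SVM to
# predict whether VKOSPI will fall during the day (1) or not (0). The lagged
# features and the synthetic data are placeholders, not the paper's inputs.

import numpy as np
from sklearn.svm import SVC
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(0)

# Hypothetical features: yesterday's VKOSPI change, overnight KOSPI 200 return,
# and yesterday's trading volume change.
X = rng.normal(size=(500, 3))
y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.5, size=500) < 0).astype(int)

X_train, X_test = X[:400], X[400:]
y_train, y_test = y[:400], y[400:]

model = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0, gamma="scale"))
model.fit(X_train, y_train)

accuracy = model.score(X_test, y_test)
print(f"direction accuracy: {accuracy:.2%}")
# A short strangle would be entered only on days predicted as 1, i.e. when the
# model expects intraday volatility (and hence option premiums) to decline.
```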