• Title/Summary/Keyword: 급수링


Scalable Collaborative Filtering Technique based on Adaptive Clustering (적응형 군집화 기반 확장 용이한 협업 필터링 기법)

  • Lee, O-Joun;Hong, Min-Sung;Lee, Won-Jin;Lee, Jae-Dong
    • Journal of Intelligence and Information Systems / v.20 no.2 / pp.73-92 / 2014
  • An adaptive clustering-based collaborative filtering technique is proposed to address the fundamental problems of collaborative filtering: cold start, scalability, and data sparsity. Previous collaborative filtering techniques make recommendations based on a user's predicted preference for a particular item, using a similar-item subset and a similar-user subset built from users' preferences for items. For this reason, if the density of the user preference matrix is low, the reliability of the recommendation system decreases rapidly, and building the similar-item and similar-user subsets becomes more difficult. In addition, as the scale of the service increases, the time needed to build these subsets increases geometrically, and so does the response time of the recommendation system. To solve these problems, this paper proposes a collaborative filtering technique that actively adapts the model to conditions and adopts concepts from context-based filtering. The technique consists of four major methodologies. First, items and users are clustered according to their feature vectors, and an inter-cluster preference between each item cluster and each user cluster is then estimated. With this method, the run time for building a similar-item or similar-user subset can be reduced, the reliability of the recommendation system can be made higher than when only user preference information is used to build those subsets, and the cold-start problem can be partially solved. Second, recommendations are made using the previously composed item and user clusters and the inter-cluster preference between each item cluster and user cluster. 
In this phase, a list of items is built for a user by examining the item clusters in order of the magnitude of the inter-cluster preference of the user cluster to which the user belongs, and then selecting and ranking items according to predicted or recorded user preference information. With this method, the model-creation phase bears the highest load of the recommendation system, which minimizes the load at run time. Therefore, the scalability problem can be addressed, and a large-scale recommendation system can be served by highly reliable collaborative filtering. Third, missing user preference information is predicted using the item and user clusters. This mitigates the problem caused by the low density of the user preference matrix. Existing studies used either item-based or user-based prediction. This paper improves on Hao Ji's idea, which uses both. The reliability of the recommendation service can be improved by combining the predicted values of both techniques according to the condition of the recommendation model. By predicting user preference based on the item or user clusters, the time required for prediction can be reduced, and missing user preferences can be predicted at run time. Fourth, the item and user feature vectors are updated by learning from subsequent user feedback. This phase applies normalized user feedback to the item and user feature vectors. This method mitigates the problems caused by adopting concepts of context-based filtering, such as item and user feature vectors based on user profiles and item properties. These problems stem from the limits of quantifying the qualitative features of items and users. 
Therefore, the elements of the user and item feature vectors are matched one to one, and when user feedback on a particular item is obtained, it is applied to each feature vector using its counterpart. The method was verified by comparing its performance with existing hybrid filtering techniques. Two measures were used for verification: MAE (Mean Absolute Error) and response time. The MAE results confirmed that the technique improves the reliability of the recommendation system. The response-time results showed that the technique is suitable for a large-scale recommendation system. This paper proposed an adaptive clustering-based collaborative filtering technique with high reliability and low time complexity, but it has some limitations. The technique focused on reducing time complexity, so an improvement in reliability was not expected. Future work will improve the technique with rule-based filtering.
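
The inter-cluster preference and the MAE measure described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the cluster assignments are taken as given (the paper derives them from feature vectors), the inter-cluster preference is assumed to be a plain average of known ratings, and all function names and sample data are hypothetical.

```python
from collections import defaultdict


def inter_cluster_preference(ratings, user_cluster, item_cluster):
    """Average the known ratings for every (user cluster, item cluster) pair.

    Assumption: a plain mean stands in for the paper's inter-cluster preference.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for (user, item), rating in ratings.items():
        key = (user_cluster[user], item_cluster[item])
        sums[key] += rating
        counts[key] += 1
    return {key: sums[key] / counts[key] for key in sums}


def predict(user, item, ratings, user_cluster, item_cluster, prefs, default=0.0):
    """Return the recorded rating if present, else the cluster-level estimate."""
    if (user, item) in ratings:
        return ratings[(user, item)]
    return prefs.get((user_cluster[user], item_cluster[item]), default)


def mean_absolute_error(pairs):
    """MAE over (predicted, actual) pairs."""
    pairs = list(pairs)
    return sum(abs(p - a) for p, a in pairs) / len(pairs)


# Hypothetical toy data: three users in two clusters, three items in two clusters.
ratings = {("u1", "i1"): 5, ("u2", "i1"): 3, ("u1", "i2"): 4, ("u3", "i3"): 1}
user_cluster = {"u1": "A", "u2": "A", "u3": "B"}
item_cluster = {"i1": "X", "i2": "X", "i3": "Y"}

prefs = inter_cluster_preference(ratings, user_cluster, item_cluster)
# u2 has not rated i2, so the estimate falls back to the (A, X) cluster average.
estimate = predict("u2", "i2", ratings, user_cluster, item_cluster, prefs)
```

Because the cluster-level averages are computed once when the model is built, a run-time prediction reduces to two dictionary lookups, which is the load shift the abstract describes.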

Intelligent Web Crawler for Supporting Big Data Analysis Services (빅데이터 분석 서비스 지원을 위한 지능형 웹 크롤러)

  • Seo, Dongmin;Jung, Hanmin
    • The Journal of the Korea Contents Association / v.13 no.12 / pp.575-584 / 2013
  • The types of data used for big-data analysis are very diverse, including news, blogs, SNS, papers, patents, and sensor data. In particular, the use of web documents, which offer reliable data in real time, is gradually increasing. Web crawlers that collect web documents automatically have grown in importance because big data is being used in many different fields and web data is growing exponentially every year. However, existing web crawlers cannot collect all of the web documents in a web site, because they follow only the URLs contained in documents already collected from that site. Also, existing web crawlers may re-collect web documents that other crawlers have already collected, because information about the documents collected by each crawler is not managed efficiently across crawlers. Therefore, this paper proposes a distributed web crawler. To resolve the problems of existing web crawlers, the proposed crawler collects web documents through the RSS feeds of each web site and the Google search API. It provides fast crawling performance through a client-server model based on RMI and NIO that minimizes network traffic. Furthermore, it extracts the core content of a web document by comparing keyword similarity across the tags in the document. Finally, to verify the superiority of our web crawler, we compare it with existing web crawlers in various experiments.
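
Two ideas from this abstract, collecting document URLs from RSS feeds and avoiding documents that another crawler has already collected, can be sketched as follows. This is only an illustration under simplifying assumptions: the paper's crawler is distributed over RMI and NIO, whereas here the shared registry is a single in-process object, the feeds are passed in as strings rather than fetched, and `SeenRegistry` and `crawl_feeds` are hypothetical names.

```python
import xml.etree.ElementTree as ET


def links_from_rss(rss_xml):
    """Return the <link> of every <item> in an RSS 2.0 document given as a string."""
    root = ET.fromstring(rss_xml)
    links = []
    for item in root.iter("item"):
        link = item.findtext("link")
        if link and link.strip():
            links.append(link.strip())
    return links


class SeenRegistry:
    """Stand-in for a registry shared by all crawlers (assumption: in-process set)."""

    def __init__(self):
        self._seen = set()

    def claim(self, url):
        """Return True only for the first caller that claims a URL."""
        if url in self._seen:
            return False
        self._seen.add(url)
        return True


def crawl_feeds(feeds, registry):
    """Collect the URLs from several feeds, skipping ones that are already claimed."""
    return [url for feed in feeds for url in links_from_rss(feed) if registry.claim(url)]


# Hypothetical feeds: the second repeats one URL from the first.
FEED_A = ("<rss><channel><link>http://example.com/</link>"
          "<item><link>http://example.com/a</link></item>"
          "<item><link>http://example.com/b</link></item></channel></rss>")
FEED_B = ("<rss><channel>"
          "<item><link>http://example.com/b</link></item>"
          "<item><link>http://example.com/c</link></item></channel></rss>")

new_urls = crawl_feeds([FEED_A, FEED_B], SeenRegistry())
```

Iterating over `item` elements rather than every `link` element keeps the channel's own site link out of the result.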

A Labeling Methods for Keyword Search over Large XML Documents (대용량 XML 문서의 키워드 검색을 위한 레이블링 기법)

  • Sun, Dong-Han;Hwang, Soo-Chan
    • Journal of KIISE / v.41 no.9 / pp.699-706 / 2014
  • As XML documents grow larger and more complex, a keyword-based search method that does not require structural information is needed to search them. To use such a method, all keywords expressed as nodes in the XML document must be labeled for indexing, and the structural information must also be well represented. However, existing labeling methods either keep only very simple information about the XML document in the index or represent structural information in a way that copes poorly with growth in document size. As XML documents grow, this causes either poor keyword search performance or an exponential increase in space usage. In this paper, we present the Repetitive Prime Labeling Scheme (RPLS) to address these problems of existing labeling methods for keyword-based search of large XML documents. The method is based on the existing prime number labeling method and allows a parent's prime number to be reused at lower levels, so that fewer prime numbers need to be generated. We then present experimental results comparing our method with existing methods.
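
For context, the following Python sketch shows the base prime number labeling idea that RPLS builds on, not RPLS itself, since the abstract does not give the details of how a parent's prime is reused. Each node receives its own prime, a node's label is its parent's label multiplied by that prime, and an ancestor test becomes a divisibility check. The tree, the node names, and the function names are hypothetical.

```python
def primes():
    """Yield prime numbers in increasing order (simple trial division)."""
    found = []
    n = 2
    while True:
        if all(n % p for p in found):
            found.append(n)
            yield n
        n += 1


def label_tree(tree, root):
    """Label each node with (parent label) * (a fresh prime); the root gets 1.

    `tree` maps a node name to the list of its children; names must be unique.
    """
    gen = primes()
    labels = {root: 1}
    stack = [root]
    while stack:
        node = stack.pop()
        for child in tree.get(node, []):
            labels[child] = labels[node] * next(gen)
            stack.append(child)
    return labels


def is_ancestor(labels, a, b):
    """a is a proper ancestor of b exactly when a's label divides b's label."""
    return a != b and labels[b] % labels[a] == 0


# Hypothetical document structure.
tree = {"book": ["title", "author"], "author": ["name"]}
labels = label_tree(tree, "book")
```

Labels grow as products of primes along the path from the root, which is why the number and size of the primes matter for large documents and why reusing primes, as the abstract describes, can save space.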

Comparative analysis of students and teachers' perceptions on internet-based learning style (사이버 학습에서 학습양식에 대한 학습자와 교수자의 인식차이 분석)

  • Cho Eun-Sun;Kim In-Sook
    • Proceedings of the Korea Contents Association Conference / 2005.05a / pp.19-30 / 2005
  • This study examined students' and teachers' perceptions of learning style in internet-based learning. Many studies have examined students' learning styles in both traditional and internet-based learning settings. These studies emphasized that a student's learning style is an important indicator for establishing learning strategies and anticipating positive learning outcomes. However, comparative analyses of how differently learners and teachers perceive learning style in internet-based learning have been rare. This study analyzed 386 elementary school students, 627 middle school students, and 130 cyber teachers, who completed a 15-item questionnaire on internet-based learning style. The results showed that both students and teachers preferred the self-directed internet learning style over the cooperative and dependent styles. However, the teachers judged the students to prefer the self-directed and dependent styles more strongly than the students perceived themselves to. In conclusion, this study suggests differentiated teaching-learning design and development strategies for internet learning, especially based on subject matter. There should also be advanced tutoring strategies that consider learners' different learning styles. Finally, more research on internet learning styles should be conducted across various learning areas, learner ages, and learner characteristics.


Relationship Analysis between Malware and Sybil for Android Apps Recommender System (안드로이드 앱 추천 시스템을 위한 Sybil공격과 Malware의 관계 분석)

  • Oh, Hayoung
    • Journal of the Korea Institute of Information Security & Cryptology / v.26 no.5 / pp.1235-1241 / 2016
  • Personalized app recommendation systems have recently become popular because the number of apps available on smartphones is increasing exponentially. However, users of the Google Play site, which contains malware, have suffered severe damage such as privacy exposure and extortion, in addition to simple damage such as reduced satisfaction. In addition, Sybil attacks (Sybils) that manipulate the score (rating) of each app may also be present because of the development of social networks. Until now, Sybil detection studies and malicious app studies have been conducted independently. However, when considering intelligent attack types in real time, it is important to determine whether an intelligent attack involving Sybils and malware simultaneously exists. Therefore, in this paper we experimentally evaluate the relationship between malware and Sybils based on a real dataset crawled from Google Play. Extensive evaluations show that the correlation between malware and Sybils is low, because malware providers hide themselves from anti-virus (AV) software.
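
The abstract does not say which correlation measure was used. As one plausible reading, the sketch below computes a Pearson correlation between a per-app malware flag and a per-app Sybil score. The choice of Pearson, the variable names, and the sample values are all assumptions made for illustration and are not data from the paper.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# Hypothetical per-app values: 1 if flagged as malware, and the share of
# ratings attributed to suspected Sybil accounts.
malware_flag = [1, 0, 0, 1, 0, 1]
sybil_share = [0.10, 0.12, 0.30, 0.25, 0.05, 0.08]
r = pearson(malware_flag, sybil_share)
```

A value of `r` near zero would correspond to the low correlation the abstract reports, while values near 1 or -1 would indicate a strong linear relationship.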

Optimal design and operation of water transmission system (상수도 송·배수시스템의 최적 설계 및 운영 모형 개발)

  • Choi, Jeongwook;Jeong, Gimoon;Kim, Kangmin;Kang, Doosun
    • Journal of Korea Water Resources Association / v.51 no.12 / pp.1171-1180 / 2018
  • Because Korea has many mountainous areas, its water transmission system is operated by the non-pressure flow method, in which water flows from highlands to lowlands. Water pumps are installed and operated to store water in the highlands. However, this process consumes a lot of electrical energy. Therefore, it is necessary to minimize energy consumption by optimizing the size and operation schedule of the water pumps. The optimal capacity and operation method of a water pump are affected by the size of the tank (distributing reservoir). Therefore, to design and operate the water transmission system economically, it is reasonable to consider both the construction cost of the water pump and the tank and the long-term operation cost of the water pump when determining the scale of the initial facilities. In this study, an optimum design model was developed that uses a genetic algorithm (GA) to optimize both the sizes of the water pump and the tank and the operation schedule of the water pump. The developed model was verified by applying it to water transmission systems operated in Korea. This study is expected to help estimate the optimal size of the water pump and the tank in the initial design of a water transmission system.
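
The joint optimization described in this abstract can be sketched with a toy genetic algorithm in Python. Everything numeric here is hypothetical (demand curve, tariff, candidate sizes, cost coefficients, penalty weight), and the hydraulic model is reduced to a simple hourly mass balance, so this is a sketch of the general approach rather than the authors' model. An individual encodes a pump size, a tank size, and a 24-hour on/off pump schedule.

```python
import random

# All values below are hypothetical and chosen only for illustration.
DEMAND = [30, 30, 30, 30, 40, 60, 90, 110, 100, 90, 80, 80,
          80, 80, 80, 90, 100, 110, 110, 100, 80, 60, 40, 30]  # m3 per hour
TARIFF = [0.05] * 8 + [0.12] * 14 + [0.05] * 2  # price per kWh by hour
PUMP_SIZES = [100, 150, 200, 250]  # candidate pump capacities, m3 per hour
TANK_SIZES = [200, 400, 600, 800]  # candidate tank volumes, m3
KWH_PER_M3 = 0.4


def cost(ind):
    """Capital cost plus energy cost plus a penalty for tank under/overflow."""
    pump_idx, tank_idx, schedule = ind
    q, v = PUMP_SIZES[pump_idx], TANK_SIZES[tank_idx]
    capital = 2.0 * q + 0.5 * v
    level, energy_cost, penalty = v / 2, 0.0, 0.0
    for hour in range(24):
        inflow = q if schedule[hour] else 0
        level += inflow - DEMAND[hour]
        energy_cost += inflow * KWH_PER_M3 * TARIFF[hour]
        if level < 0:
            penalty += -level
            level = 0
        elif level > v:
            penalty += level - v
            level = v
    return capital + energy_cost + 1000 * penalty


def random_individual(rng):
    return (rng.randrange(len(PUMP_SIZES)), rng.randrange(len(TANK_SIZES)),
            tuple(rng.randint(0, 1) for _ in range(24)))


def crossover(a, b, rng):
    """Pick each size gene from either parent; one-point crossover on the schedule."""
    cut = rng.randrange(1, 24)
    return (rng.choice([a[0], b[0]]), rng.choice([a[1], b[1]]), a[2][:cut] + b[2][cut:])


def mutate(ind, rng, rate=0.05):
    pump, tank, schedule = ind
    if rng.random() < rate:
        pump = rng.randrange(len(PUMP_SIZES))
    if rng.random() < rate:
        tank = rng.randrange(len(TANK_SIZES))
    schedule = tuple((1 - bit) if rng.random() < rate else bit for bit in schedule)
    return (pump, tank, schedule)


def run_ga(generations=60, pop_size=40, seed=0):
    """Return the best individual and the best cost recorded per generation."""
    rng = random.Random(seed)
    pop = [random_individual(rng) for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        pop.sort(key=cost)
        history.append(cost(pop[0]))
        elite = pop[:2]  # elitism: the two best survive unchanged
        children = []
        while len(children) < pop_size - len(elite):
            a = min(rng.sample(pop, 3), key=cost)  # tournament selection
            b = min(rng.sample(pop, 3), key=cost)
            children.append(mutate(crossover(a, b, rng), rng))
        pop = elite + children
    best = min(pop, key=cost)
    history.append(cost(best))
    return best, history
```

Calling `run_ga()` returns the best individual found and the per-generation best cost, which never increases because the elite individuals are carried over unchanged. Encoding the sizes and the schedule in one individual is what lets the search trade construction cost against operating cost, as the abstract argues.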

Evaluation of Water Quality Characteristics on Tributaries of Dongjin River Watershed (동진강 유역내 하천의 특성별 영향평가)

  • Yun, Sun-Gang;Kim, Won-Il;Kim, Jin-Ho;Kim, Seon-Jong;Koh, Mun-Hwan;Eom, Ki-Cheol
    • Korean Journal of Environmental Agriculture / v.21 no.4 / pp.243-247 / 2002
  • Irrigation water quality along the Dongjin river watershed was monitored to identify possible pollutants, with the aim of securing food safety by preserving river water quality. Pollution indicators in the Dongjin river, namely Biological Oxygen Demand (BOD), Chemical Oxygen Demand (COD), Total Nitrogen (T-N), and Total Phosphate (T-P), were examined from May to November 2001. The results were as follows. The BOD level of the Dongjin river ranged from 2.84 to 6.45 mg/L, corresponding to grades II to IV of the potable water criteria of the Ministry of Environment. The average BOD level at downstream site DJ6 (after the Jeongupcheon confluence) was 4.07 mg/L. The average COD level of the Dongjin river ranged from 11.20 to 32.96 mg/L. The COD level at DJ6 increased rapidly after the junction of the Dongjin river and Jungupcheon, because the latter had a relatively high pollution level. T-N content was significantly high at all sites of the Dongjin river, ranging from 4.16 to 5.84 mg/L. Like BOD and COD, T-P was higher after the Jeongupcheon confluence than at the other sites, at 0.19 mg/L. COD in the main stream was high in the dry season after the rainy season. For T-P, pollution appeared in the dry season before the rainy season, and water quality worsened toward the dry season after the rainy season. The water quality of the Dongjin river deteriorated with the inflow of Jungupcheon, which is polluted by municipal and industrial sites near Jungup city.

Validation of Extreme Rainfall Estimation in an Urban Area derived from Satellite Data : A Case Study on the Heavy Rainfall Event in July, 2011 (위성 자료를 이용한 도시지역 극치강우 모니터링: 2011년 7월 집중호우를 중심으로)

  • Yoon, Sun-Kwon;Park, Kyung-Won;Kim, Jong Pil;Jung, Il-Won
    • Journal of Korea Water Resources Association / v.47 no.4 / pp.371-384 / 2014
  • This study developed a new algorithm for extreme rainfall extraction based on Communication, Ocean and Meteorological Satellite (COMS) and Tropical Rainfall Measuring Mission (TRMM) satellite image data, and evaluated its applicability to the heavy rainfall event of July 2011 in Seoul, South Korea. A power-series-regression-based Z-R relationship was employed to account for the empirical relationships between TRMM/PR, TRMM/VIRS, COMS, and the Automatic Weather System (AWS) at each elevation. The estimated Z-R relationship ($Z=303R^{0.72}$) agreed well with AWS observations (correlation coefficient = 0.57). The 10-minute rainfall intensities estimated from the COMS satellite using the Z-R relationship were underestimated. For small rainfall events, the Z-R relationship tended to overestimate rainfall intensities. However, the overall patterns of estimated rainfall were very comparable with the observed data. The correlation coefficient and the Root Mean Square Error (RMSE) between the 10-minute rainfall series from COMS and AWS were 0.517 and 3.146, respectively. In addition, the averaged error value of the spatial correlation matrix ranged from -0.530 to -0.228, indicating negative correlation. To reduce the error in extreme rainfall estimation from satellite datasets, more extreme factors need to be taken into account and the algorithm needs to be improved through further study. This study showed the potential utility of multi-geostationary satellite data for building sub-daily rainfall estimates and establishing a real-time flood alert system in ungauged watersheds.
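
The Z-R relationship quoted in this abstract, $Z=303R^{0.72}$, can be inverted to obtain a rain rate from a reflectivity value, and the RMSE used for validation is straightforward to compute. The Python sketch below shows both. The coefficients come from the abstract, while the function names and the sample series are hypothetical, and units are left implicit because the abstract does not state them.

```python
import math

A = 303.0  # coefficient of the Z-R relationship reported in the abstract
B = 0.72   # exponent of the Z-R relationship reported in the abstract


def z_from_rain_rate(r):
    """Forward relationship: Z = A * R**B."""
    return A * r ** B


def rain_rate_from_z(z):
    """Inverse relationship: R = (Z / A)**(1 / B)."""
    return (z / A) ** (1.0 / B)


def rmse(estimated, observed):
    """Root mean square error between two equal-length series."""
    squared = [(e - o) ** 2 for e, o in zip(estimated, observed)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical reflectivity values and gauge observations for illustration only.
z_values = [303.0, 1000.0, 5000.0]
estimated = [rain_rate_from_z(z) for z in z_values]
observed = [1.2, 5.0, 45.0]
error = rmse(estimated, observed)
```

Because the exponent is below 1, the inverse exponent 1/0.72 is about 1.39, so estimated rain rate grows faster than linearly with Z.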

Multi-Variate Tabular Data Processing and Visualization Scheme for Machine Learning based Analysis: A Case Study using Titanic Dataset (기계 학습 기반 분석을 위한 다변량 정형 데이터 처리 및 시각화 방법: Titanic 데이터셋 적용 사례 연구)

  • Juhyoung Sung;Kiwon Kwon;Kyoungwon Park;Byoungchul Song
    • Journal of Internet Computing and Services / v.25 no.4 / pp.121-130 / 2024
  • As internet and communication technology (ICT) improves exponentially, the types and amount of available data also increase. Although data analysis, including statistics, is important for using this large amount of data, there are inevitable limits to processing varied and complex data in a general way. Meanwhile, there have been many attempts to apply machine learning (ML) in various fields, driven by improved computational performance and increasing demand for autonomous systems. In particular, processing data for the model input and designing the model to solve the objective function are critical to model performance. Many studies have presented data processing methods suited to the type and properties of the data, and ML performance varies greatly depending on the method. Nevertheless, it is difficult to decide which data processing method to use for data analysis, since the types and characteristics of data have become more diverse. Specifically, multi-variate data processing is essential for solving non-linear problems with ML. In this paper, we present a multi-variate tabular data processing scheme for ML-aided data analysis using the Titanic dataset from Kaggle, which includes various kinds of data. We present methods such as input variable filtering based on statistical analysis and normalization according to the data properties. In addition, we analyze the data structure using visualization. Lastly, we design an ML model and train it by applying the proposed multi-variate data processing. We then analyze the passenger survival prediction performance of the trained model. We expect that the proposed multi-variate data processing and visualization can be extended to various environments for ML-based analysis.
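
Processing columns according to their data property, as this abstract describes, can be illustrated with a small standard-library Python sketch. It assumes min-max normalization for numeric columns and one-hot encoding for categorical columns; the abstract names normalization but not these specific choices, so both are assumptions. The column names follow the Kaggle Titanic dataset, while the rows are invented and the function names are hypothetical.

```python
def min_max(values):
    """Scale numeric values to the range [0, 1]; constant columns map to 0.0."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def one_hot(values):
    """Return one 0/1 indicator column per distinct category."""
    categories = sorted(set(values))
    return {c: [1 if v == c else 0 for v in values] for c in categories}


def preprocess(rows, numeric, categorical):
    """Build a dict of model-ready feature columns from a list of row dicts."""
    features = {}
    for col in numeric:
        features[col] = min_max([row[col] for row in rows])
    for col in categorical:
        for category, column in one_hot([row[col] for row in rows]).items():
            features[f"{col}={category}"] = column
    return features


# Invented rows using Titanic-style column names.
rows = [
    {"Age": 20, "Fare": 10.0, "Sex": "male"},
    {"Age": 40, "Fare": 30.0, "Sex": "female"},
    {"Age": 30, "Fare": 10.0, "Sex": "female"},
]
features = preprocess(rows, numeric=["Age", "Fare"], categorical=["Sex"])
```

Keeping the per-type transforms as separate functions makes it easy to swap in a different normalization for a column whose distribution calls for it.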