• Title/Summary/Keyword: Search algorithm


A CF-based Health Functional Recommender System using Extended User Similarity Measure (확장된 사용자 유사도를 이용한 CF-기반 건강기능식품 추천 시스템)

  • Sein Hong;Euiju Jeong;Jaekyeong Kim
    • Journal of Intelligence and Information Systems
    • /
    • v.29 no.3
    • /
    • pp.1-17
    • /
    • 2023
  • With the recent rapid development of ICT (Information and Communication Technology) and the popularization of digital devices, the online market continues to grow, and we live in a flood of information. Customers face information overload problems that require a great deal of time and money to select products, so personalized recommender systems have become an essential methodology. Collaborative Filtering (CF) is the most widely used recommender system. Traditional recommender systems mainly utilize quantitative data such as rating values, resulting in poor recommendation accuracy, since quantitative data cannot fully reflect users' preferences. To solve this problem, studies that reflect qualitative data, such as review contents, are being actively conducted. To quantify user review contents, this study uses text mining. General CF consists of three steps: user-item matrix generation, Top-N neighborhood group search, and Top-K recommendation list generation. In this study, we propose a recommendation algorithm that applies an extended similarity measure, which utilizes quantified review contents in addition to user rating values. After calculating review similarity by applying TF-IDF, Word2Vec, and Doc2Vec techniques to review contents, the extended similarity is created by combining review similarity with user rating similarity. To verify the approach, we used user ratings and review data from the "Health and Personal Care" category of the e-commerce site Amazon. The proposed recommendation model using the extended similarity measure outperformed the traditional model using only rating-based similarity. Among the text mining techniques, the similarity obtained with TF-IDF performed best in the neighborhood group search and recommendation list generation steps.
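The combination of rating similarity and review similarity described above can be sketched as follows. The blending weight `alpha`, the tiny TF-IDF implementation, and the toy data are illustrative assumptions, not the paper's exact formulation.

```python
# Sketch of an "extended" user similarity: rating cosine similarity blended
# with review-text cosine similarity over TF-IDF vectors (assumed weights).
import math
from collections import Counter

def cosine(a, b):
    """Cosine similarity between two sparse vectors given as dicts."""
    num = sum(a[k] * b[k] for k in set(a) & set(b))
    den = (math.sqrt(sum(v * v for v in a.values()))
           * math.sqrt(sum(v * v for v in b.values())))
    return num / den if den else 0.0

def tfidf_vectors(docs):
    """Minimal TF-IDF: docs is a list of token lists, one per user."""
    n = len(docs)
    df = Counter(t for doc in docs for t in set(doc))
    vecs = []
    for doc in docs:
        tf = Counter(doc)
        vecs.append({t: tf[t] * math.log(n / df[t]) for t in tf})
    return vecs

def extended_similarity(rating_u, rating_v, review_u, review_v, alpha=0.5):
    """alpha blends rating-based and review-based similarity."""
    return alpha * cosine(rating_u, rating_v) + (1 - alpha) * cosine(review_u, review_v)
```

A neighborhood search would then rank other users by `extended_similarity` instead of rating similarity alone.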

Metagenomic analysis of bacterial community structure and diversity of lignocellulolytic bacteria in Vietnamese native goat rumen

  • Do, Thi Huyen;Dao, Trong Khoa;Nguyen, Khanh Hoang Viet;Le, Ngoc Giang;Nguyen, Thi Mai Phuong;Le, Tung Lam;Phung, Thu Nguyet;Straalen, Nico M. van;Roelofs, Dick;Truong, Nam Hai
    • Asian-Australasian Journal of Animal Sciences
    • /
    • v.31 no.5
    • /
    • pp.738-747
    • /
    • 2018
  • Objective: In a previous study, analysis of Illumina-sequenced metagenomic DNA data of bacteria in the Vietnamese goat rumen showed a high diversity of putative lignocellulolytic genes. In this study, taxonomic analysis of the microbial community and the lignocellulolytic bacteria population in the rumen was conducted to elucidate the role of bacterial community structure in the effective degradation of plant materials. Methods: The metagenomic data were subjected to the Basic Local Alignment Search Tool (BLASTX) algorithm against the National Center for Biotechnology Information non-redundant sequence database. The BLASTX hits were further processed by the Metagenome Analyzer program to statistically analyze the abundance of taxa. Results: The microbial community in the rumen is defined by a dominance of Bacteroidetes over Firmicutes, with a Firmicutes-to-Bacteroidetes ratio of 0.36:1. An abundance of Synergistetes, uniquely identified in the goat microbiome, may be shaped by host genotype. With regard to bacterial lignocellulose degraders, the ratio of lignocellulolytic genes affiliated with Firmicutes to those linked to Bacteroidetes was 0.11:1, and the genes encoding putative hemicellulases, carbohydrate esterases, and polysaccharide lyases originating from Bacteroidetes were 14 to 20 times more abundant than those from Firmicutes. Firmicutes seem to possess more cellulose hydrolysis capacity, showing a Firmicutes-to-Bacteroidetes ratio of 0.35:1. Analysis of potential lignocellulolytic degraders shows that four species belonged to the Bacteroidetes phylum and two to the Firmicutes phylum, harbouring at least 12 different catalytic domains covering lignocellulose pretreatment as well as cellulose and hemicellulose saccharification. Conclusion: Based on these findings, we speculate that increasing the members of Bacteroidetes to keep a low Firmicutes-to-Bacteroidetes ratio in the goat rumen most likely results in increased lignocellulose digestion.

A Design on the Multimedia Fingerprinting code based on Feature Point for Forensic Marking (포렌식 마킹을 위한 특징점 기반의 동적 멀티미디어 핑거프린팅 코드 설계)

  • Rhee, Kang-Hyeon
    • Journal of the Institute of Electronics Engineers of Korea CI
    • /
    • v.48 no.4
    • /
    • pp.27-34
    • /
    • 2011
  • In this paper, a design of a dynamic multimedia fingerprinting code is presented as an anti-collusion code (ACC) for the protection of multimedia content. Conventional multimedia fingerprinting codes for ACC are designed with a mathematical method that increases k to k+1 by transforming a BIBD incidence matrix into its complement matrix. A codevector of the complement matrix is allocated as the fingerprinting code for a user's authority and embedded into the content. In the proposed algorithm, feature points are extracted from the content a user bought, and the dynamic multimedia fingerprinting code is designed based on them. Candidate ACC codes that satisfy BIBD's v and k+1 conditions are registered in a codebook, and then a matrix (hereafter called the "Rhee matrix") is generated under the ${\lambda}+1$ condition. In the experimental results, the codevectors of the Rhee matrix based on feature points of the content have k within the confidence interval at the significance level ($1-{\alpha}$). The Euclidean distances between rows and between columns of the Rhee matrix yield the same k value as the complement matrices based on BIBD and graphs. Moreover, the first row and column of the Rhee matrix serve as an initial firing vector and a forensic mark for content protection. Because the connections among the remaining codevectors are recorded in the codebook, tracing a colluded code does not require solving for the correlation coefficient between the original fingerprinting code and the colluded code; searching the codebook suffices, so tracing the colluder is easy. Thus, the Rhee matrix generated in this paper has better robustness and fidelity than the mathematically generated BIBD-based ACC matrix.
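The BIBD-complement ACC construction that this paper extends can be illustrated with the (7,3,1) design (the Fano plane): complementing the incidence matrix yields codevectors whose pairwise logical AND uniquely identifies a colluding pair. The dynamic, feature-point-driven "Rhee matrix" itself is not reproduced here; this only shows the baseline construction.

```python
# Baseline ACC from a (7,3,1)-BIBD: codevectors are columns of the
# bit-complement of the incidence matrix. Rows = points, columns = blocks.
blocks = [(0, 1, 2), (0, 3, 4), (0, 5, 6), (1, 3, 5),
          (1, 4, 6), (2, 3, 6), (2, 4, 5)]
incidence = [[1 if p in b else 0 for b in blocks] for p in range(7)]
complement = [[1 - x for x in row] for row in incidence]

def codevector(j):
    """Fingerprinting codevector for user j (column j of the complement)."""
    return tuple(complement[p][j] for p in range(7))

def collude(i, j):
    """Logical AND models two users averaging/colluding their marks."""
    return tuple(a & b for a, b in zip(codevector(i), codevector(j)))
```

Because every pair of users produces a distinct AND pattern, a colluded mark can be traced back to the pair by codebook lookup, which is the property the abstract's codebook-search argument relies on.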

A System for Automatic Classification of Traditional Culture Texts (전통문화 콘텐츠 표준체계를 활용한 자동 텍스트 분류 시스템)

  • Hur, YunA;Lee, DongYub;Kim, Kuekyeng;Yu, Wonhee;Lim, HeuiSeok
    • Journal of the Korea Convergence Society
    • /
    • v.8 no.12
    • /
    • pp.39-47
    • /
    • 2017
  • The Internet has increased the number of digital web documents related to the history and traditions of Korean culture. However, users who search for creators or materials related to traditional culture cannot get the information they want, and the results are insufficient. Document classification is required to access this information effectively. In the past, documents had to be classified manually, which costs a great deal of time and money. Therefore, this paper develops an automatic text classification model for traditional cultural contents based on data from the Korean information culture field, which is composed of a systematic classification of traditional cultural contents. This study applied the TF-IDF model, the Bag-of-Words model, and a combined TF-IDF/Bag-of-Words model to extract word frequencies from the 'Korea Traditional Culture' data, and developed the automatic text classification model using the Support Vector Machine classification algorithm.
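The feature pipeline described above can be sketched as follows: TF-IDF and Bag-of-Words features are computed per document and concatenated. To keep the sketch dependency-free, the final SVM step is replaced by a nearest-centroid classifier; the toy corpus and class names are invented.

```python
# TF-IDF + Bag-of-Words combined features, with a nearest-centroid stand-in
# for the paper's SVM classifier (assumption: a simplified substitute).
import math
from collections import Counter

def fit_idf(docs):
    n = len(docs)
    df = Counter(t for d in docs for t in set(d))
    vocab = sorted(df)
    idf = {t: math.log((1 + n) / (1 + df[t])) + 1 for t in vocab}
    return vocab, idf

def features(doc, vocab, idf):
    tf = Counter(doc)
    bow = [tf[t] for t in vocab]             # Bag-of-Words counts
    tfidf = [tf[t] * idf[t] for t in vocab]  # TF-IDF weights
    return bow + tfidf                       # combined model

def centroid_classify(train, labels, query):
    vocab, idf = fit_idf(train)
    by_class = {}
    for doc, y in zip(train, labels):
        by_class.setdefault(y, []).append(features(doc, vocab, idf))
    # mean feature vector per class
    cents = {y: [sum(col) / len(v) for col in zip(*v)] for y, v in by_class.items()}
    q = features(query, vocab, idf)
    def cos(a, b):
        num = sum(x * y for x, y in zip(a, b))
        den = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
        return num / den if den else 0.0
    return max(cents, key=lambda y: cos(cents[y], q))
```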

Dynamic Frequency Reuse Scheme Based on Traffic Load Ratio for Heterogeneous Cellular Networks (이종 셀룰러 네트워크 환경에서 트래픽 비율에 따른 동적 주파수 재사용 기법)

  • Chung, Sungmoon
    • The Journal of Korean Institute of Communications and Information Sciences
    • /
    • v.40 no.12
    • /
    • pp.2539-2548
    • /
    • 2015
  • Overcoming inter-cell interference (ICI) and spectrum scarcity are major issues in heterogeneous cellular networks. Static frequency reuse schemes have been proposed as an effective way to manage the spectrum and reduce ICI in cellular networks. In static frequency reuse schemes, the allocations of transmission power and subcarriers in each cell are fixed prior to system deployment, which limits their potential performance. Also, most dynamic frequency reuse schemes do not consider small cells, or network environments in which the traffic load of each cell is heavy and non-uniform. In this paper, we propose an inter-cell resource allocation algorithm that dynamically optimizes subcarrier allocations for multi-cell heterogeneous networks. The proposed dynamic frequency reuse scheme first finds the subcarrier usage at each cell edge by exhaustive search and allocates subcarriers for all cells except small cells. After that, it allocates subcarriers for the small cells and then iteratively repeats the process. The proposed scheme outperforms previous frequency reuse schemes in terms of throughput by improving spectral efficiency, because it can adapt to the network environment immediately when the traffic load of each cell is heavy and non-uniform.
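The macro-first, small-cell-second allocation loop described above can be approximated with a simple load-proportional split. The proportional rule, cell names, and all numbers are assumptions for illustration; the paper's exhaustive-search step over cell-edge subcarrier usage is not reproduced.

```python
# Illustrative load-driven subcarrier split: macro cells are served first in
# proportion to their traffic load; small cells share the remainder.
def allocate_subcarriers(total, macro_loads, small_loads):
    """Return per-cell subcarrier counts proportional to traffic-load ratio."""
    alloc = {}
    all_total = sum(macro_loads.values()) + sum(small_loads.values())
    used = 0
    # macro cells first: share of the band proportional to their load ratio
    for cell, load in sorted(macro_loads.items()):
        n = int(total * load / all_total)
        alloc[cell] = n
        used += n
    # small cells share whatever remains, again by load ratio
    remaining = total - used
    small_total = sum(small_loads.values()) or 1
    for cell, load in sorted(small_loads.items()):
        alloc[cell] = int(remaining * load / small_total)
    return alloc
```

Re-running this allocation whenever per-cell loads change is what makes the scheme "dynamic" in contrast to a deployment-time static split.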

Finding the time sensitive frequent itemsets based on data mining technique in data streams (데이터 스트림에서 데이터 마이닝 기법 기반의 시간을 고려한 상대적인 빈발항목 탐색)

  • Park, Tae-Su;Chun, Seok-Ju;Lee, Ju-Hong;Kang, Yun-Hee;Choi, Bum-Ghi
    • Journal of The Korean Association of Information Education
    • /
    • v.9 no.3
    • /
    • pp.453-462
    • /
    • 2005
  • Recently, due to technical improvements in storage devices and networks, the amount of data has increased rapidly. In addition, it is necessary to find the knowledge embedded in a data stream as fast as possible. The huge amounts of data in a data stream are created continuously and change quickly. Various algorithms for finding frequent itemsets in a data stream have been actively proposed, but current research does not offer an appropriate method to find frequent itemsets in which the flow of time is reflected; it provides only frequent items based on total aggregation values. In this paper we propose a novel algorithm for finding relative frequent itemsets according to time in a data stream. We also propose a method to store frequent and sub-frequent items in order to take limited memory into account, and a method to update time-variant frequent items. The performance of the proposed method is analyzed through a series of experiments. The proposed method can find both frequent itemsets and relative frequent itemsets using only the activity patterns of students in each time slot, and can thus enhance the effectiveness of learning and support planning for individual learning.
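The frequent/sub-frequent bookkeeping described above can be sketched as per-slot counting with pruning: items below a sub-frequent threshold are dropped to bound memory, and frequency is judged over the current window of slots. The thresholds, window size, and pruning rule are illustrative assumptions, not the paper's exact method.

```python
# Time-slot frequent-item tracking with sub-frequent pruning (a sketch).
from collections import Counter, deque

class SlotFrequentItems:
    def __init__(self, window_slots=3, freq=0.2, sub_freq=0.1):
        self.window = deque(maxlen=window_slots)  # old slots expire automatically
        self.freq = freq          # threshold for "frequent" over the window
        self.sub_freq = sub_freq  # per-slot threshold for keeping an item at all

    def add_slot(self, items):
        """items: iterable of item occurrences observed in one time slot."""
        counts = Counter(items)
        n = sum(counts.values())
        # prune items below the sub-frequent threshold to bound memory
        kept = Counter({i: c for i, c in counts.items() if c / n >= self.sub_freq})
        self.window.append((kept, n))

    def frequent(self):
        """Items whose relative frequency over the current window >= freq."""
        total = sum(n for _, n in self.window)
        merged = Counter()
        for counts, _ in self.window:
            merged.update(counts)
        return {i for i, c in merged.items() if c / total >= self.freq}
```

Because counts are kept per slot, the same structure answers both "frequent overall" and "relatively frequent in this time slot" queries.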


Dynamic modeling of LD converter processes

  • Yun, Sang Yeop;Jung, Ho Chul;Lee, In-Beum;Chang, Kun Soo
    • Institute of Control, Robotics and Systems: Conference Proceedings
    • /
    • 1991.10b
    • /
    • pp.1639-1645
    • /
    • 1991
  • Because of the important role LD converters play in the production of high-quality steel, various dynamic models have been attempted in the past by many researchers, not only to understand the complex chemical reactions that take place in the converter process but also to assist converter operation itself using computers. Yet no single dynamic model has been found completely satisfactory because of the complexity of the process, which involves dynamic energy and mass balances at high temperatures accompanied by complex chemical reactions and transport phenomena in the molten state. In the present study, a mathematical model describing the dynamic behavior of the LD converter process has been developed. The model describes the time behavior of the temperature and the concentrations of chemical species in the hot metal bath and slag. The analysis was greatly facilitated by dividing the entire process into three zones according to physical boundaries and reaction mechanisms: hot metal (zone 1), slag (zone 2), and emulsion (zone 3). The removal rates of Si, C, Mn, and P, the rate of Fe oxidation in the hot metal bath, and the change of composition in the slag were obtained as functions of time, operating conditions, and kinetic parameters. The temperature behavior in the metal bath and the slag was also obtained by considering the heat transfer between the mixing and slag zones and the heat generated from chemical reactions involving oxygen blowing. To identify the unknown parameters in the equations and simulate the dynamic model, the Hooke and Jeeves pattern search and Runge-Kutta integration algorithms were used. By testing and fitting the model against data from the operation of the POSCO #2 steelmaking plant, the dynamic model was able to predict the characteristics of the main components in the LD converter, and it was possible to predict the optimum CO gas recovery by computer simulation.
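The parameter-identification step above relies on Hooke and Jeeves pattern search, a derivative-free minimizer; a compact version is sketched here on a toy objective (the converter model itself is not reproduced, and the step sizes are illustrative).

```python
# Hooke-Jeeves pattern search: exploratory coordinate moves plus pattern
# (extrapolation) moves, shrinking the step when no move improves.
def hooke_jeeves(f, x0, step=0.5, shrink=0.5, tol=1e-8, max_iter=10000):
    def explore(x, s):
        best = list(x)
        for i in range(len(best)):
            for d in (s, -s):
                trial = list(best)
                trial[i] += d
                if f(trial) < f(best):
                    best = trial
                    break
        return best

    base = list(x0)
    while step > tol and max_iter > 0:
        max_iter -= 1
        cand = explore(base, step)
        if f(cand) < f(base):
            # pattern move: keep jumping along the improving direction
            while True:
                pattern = [2 * c - b for c, b in zip(cand, base)]
                nxt = explore(pattern, step)
                if f(nxt) < f(cand):
                    base, cand = cand, nxt
                else:
                    break
            base = cand
        else:
            step *= shrink  # no improvement: refine the mesh
    return base
```

In the paper's setting, `f` would be the misfit between the model's simulated trajectories (integrated with Runge-Kutta) and plant measurements, with `x0` the kinetic parameters.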


Cloud P2P OLAP: Query Processing Method and Index structure for Peer-to-Peer OLAP on Cloud Computing (Cloud P2P OLAP: 클라우드 컴퓨팅 환경에서의 Peer-to-Peer OLAP 질의처리기법 및 인덱스 구조)

  • Joo, Kil-Hong;Kim, Hun-Dong;Lee, Won-Suk
    • Journal of Internet Computing and Services
    • /
    • v.12 no.4
    • /
    • pp.157-172
    • /
    • 2011
  • The latest studies on distributed OLAP for distributed environments mainly focus on DHT P2P OLAP and Grid OLAP. However, these approaches have weak points: structured P2P OLAP is limited in handling multidimensional range queries in a cloud computing environment, while Grid OLAP disregards adjacency and time series, focusing only on its own subset lookup algorithm. To overcome these limits, this paper proposes an efficient centrally managed P2P approach for a cloud computing environment. When a multi-level hybrid P2P method is combined with an index load distribution scheme, the performance of multi-dimensional range queries is enhanced. The proposed scheme allows one user's OLAP query results to be reused by other users' volatile cube searches. For this purpose, this paper examines the combination of an aggregation cube hierarchy tree, a quad-tree, and an interval-tree as an efficient index structure. As a result, the proposed cloud P2P OLAP scheme can manage the adjacency and time-series factors of an OLAP query. The performance of the proposed scheme is analyzed through a series of experiments to identify its various characteristics.
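Of the three index structures named above, the quad-tree is the one handling spatial adjacency; a minimal point quad-tree with rectangular range search is sketched below (the aggregation-cube hierarchy and interval-tree layers are omitted, and the capacity parameter is an assumption).

```python
# Minimal point quad-tree: leaves hold up to `cap` points, then split into
# four quadrants; range queries prune subtrees whose bounds miss the query.
class QuadTree:
    def __init__(self, x0, y0, x1, y1, cap=4):
        self.bounds = (x0, y0, x1, y1)
        self.cap = cap
        self.points = []
        self.children = None

    def insert(self, x, y):
        x0, y0, x1, y1 = self.bounds
        if not (x0 <= x <= x1 and y0 <= y <= y1):
            return False
        if self.children is None:
            if len(self.points) < self.cap:
                self.points.append((x, y))
                return True
            self._split()
        return any(c.insert(x, y) for c in self.children)

    def _split(self):
        x0, y0, x1, y1 = self.bounds
        mx, my = (x0 + x1) / 2, (y0 + y1) / 2
        self.children = [QuadTree(x0, y0, mx, my, self.cap),
                         QuadTree(mx, y0, x1, my, self.cap),
                         QuadTree(x0, my, mx, y1, self.cap),
                         QuadTree(mx, my, x1, y1, self.cap)]
        for p in self.points:
            any(c.insert(*p) for c in self.children)
        self.points = []

    def query(self, qx0, qy0, qx1, qy1):
        x0, y0, x1, y1 = self.bounds
        if qx1 < x0 or x1 < qx0 or qy1 < y0 or y1 < qy0:
            return []  # no overlap with this node's region
        hits = [p for p in self.points
                if qx0 <= p[0] <= qx1 and qy0 <= p[1] <= qy1]
        for c in self.children or []:
            hits += c.query(qx0, qy0, qx1, qy1)
        return hits
```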

Development of GML Map Visualization Service and POI Management Tool using Tagging (GML 지도 가시화 서비스 및 태깅을 이용한 POI 관리 도구 개발)

  • Park, Yong-Jin;Song, Eun-Ha;Jeong, Young-Sik
    • Journal of Internet Computing and Services
    • /
    • v.9 no.3
    • /
    • pp.141-158
    • /
    • 2008
  • In this paper, we developed a GML Map Server that visualizes maps based on GML, the international standard for exchanging a common map format and for interoperability of GIS information, and transmits GML maps efficiently to mobile devices using dynamic map partitioning and caching. It manages partitions based on the visible area of a mobile device in order to visualize the map on the device in real time, and serializes each partition for transmission. The received partition is assembled on the mobile device and visualized by being re-partitioned into four visible areas based on the device's display. The received map data are then managed by a caching algorithm that considers the repetitiveness of map requests, for efficient use of resources. In addition, to prevent transmission delays in instance-dense areas of the map, an adaptive map partitioning mechanism is proposed to keep the transmission time regular. The GML Map Server can trace the positions of mobile devices in a WIPI environment. A field emulator can create virtual mobile devices, move them, and trace their positions in place of real-world devices. We also developed POIM (POI Management) for hierarchical management of POI information and efficient POI search, using individual tagging technology with a visual interface.
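The partition-caching step described above can be approximated with a small LRU cache keyed by partition coordinates; the eviction policy, capacity, and key scheme are assumptions for illustration, not the paper's exact algorithm.

```python
# LRU cache for serialized map partitions, keyed by (row, col) grid position.
from collections import OrderedDict

class PartitionCache:
    def __init__(self, capacity=8):
        self.capacity = capacity
        self.cache = OrderedDict()   # (row, col) -> serialized partition

    def get(self, key):
        if key not in self.cache:
            return None              # miss: caller fetches from the map server
        self.cache.move_to_end(key)  # mark as recently used
        return self.cache[key]

    def put(self, key, partition):
        if key in self.cache:
            self.cache.move_to_end(key)
        self.cache[key] = partition
        if len(self.cache) > self.capacity:
            self.cache.popitem(last=False)  # evict least recently used
```

As the visible area pans across the map, repeatedly viewed partitions stay cached while rarely revisited ones are evicted, which is the repetitiveness consideration the abstract mentions.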


Development of Route Planning System for Intermodal Transportation Based on an Agent Collecting Schedule Information (운송스케줄 정보수집 에이전트 기반 복합운송 경로계획 시스템)

  • Choi, Hyung-Rim;Kim, Hyun-Soo;Park, Byung-Joo;Kang, Moo-Hong
    • Information Systems Review
    • /
    • v.10 no.1
    • /
    • pp.115-133
    • /
    • 2008
  • The third-party logistics industry mainly delivers goods from a departure point to an arrival point on behalf of the freight owner. To handle this work, logistics companies need a transportation route, including the transportation equipment between the departure and arrival points, schedule information for departure/arrival, and transportation cost. In practice, automatically searching for an optimal transportation route that considers departure and arrival points for intermodal transportation is not a simple problem. To search for transportation routes efficiently, collecting schedule information for intermodal transportation and generating transportation routes have become critical issues for logistics companies. Usually, companies plan transportation routes manually based on experience, which limits their capacity when cargo volumes and transaction counts are large. Furthermore, their dependence on this conventional way of doing business causes inefficient selection of transporters or transportation routes, fails to provide customers with diverse alternative routes, and as a result increases logistics costs. To solve these problems, this study aims to develop an agent-based route planning system that can collect schedule information scattered across the Web. The route planning system also includes an algorithm for generating transportation routes for intermodal transportation.
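The route-generation step can be sketched by treating the schedules collected by the agent as timed legs and running an earliest-arrival search: a leg is usable only if it departs after the traveler arrives at its origin. The schedule data, field layout, and port names are invented for illustration; the paper's actual algorithm may differ.

```python
# Earliest-arrival route search over timed intermodal legs
# (frm, to, depart, arrive), using a Dijkstra-style priority queue.
import heapq

def earliest_arrival(schedules, origin, dest, start_time):
    """Return (arrival_time, route) for the earliest feasible route."""
    best = {origin: start_time}
    heap = [(start_time, origin, [origin])]
    while heap:
        t, node, route = heapq.heappop(heap)
        if node == dest:
            return t, route
        if t > best.get(node, float("inf")):
            continue  # stale queue entry
        for frm, to, dep, arr in schedules:
            # a leg is usable only if it departs at or after our arrival at frm
            if frm == node and dep >= t and arr < best.get(to, float("inf")):
                best[to] = arr
                heapq.heappush(heap, (arr, to, route + [to]))
    return None, []
```

Extending the leg tuples with cost and mode fields would let the same search rank alternatives by a weighted objective instead of arrival time alone.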