• Title/Summary/Keyword: $A^*$ search algorithm


A CF-based Health Functional Recommender System using Extended User Similarity Measure (확장된 사용자 유사도를 이용한 CF-기반 건강기능식품 추천 시스템)

  • Sein Hong;Euiju Jeong;Jaekyeong Kim
    • Journal of Intelligence and Information Systems / v.29 no.3 / pp.1-17 / 2023
  • With the recent rapid development of ICT (Information and Communication Technology) and the popularization of digital devices, the online market continues to grow, and customers face an information overload problem that requires considerable time and money to select products. Personalized recommender systems have therefore become an essential methodology for addressing this issue, and Collaborative Filtering (CF) is the most widely used approach. Traditional recommender systems rely mainly on quantitative data such as rating values, which cannot fully reflect user preferences and thus limits recommendation accuracy. To address this, studies that also reflect qualitative data, such as review contents, have recently been actively conducted; in this study, text mining was used to quantify user review contents. General CF consists of three steps: user-item matrix generation, Top-N neighborhood group search, and Top-K recommendation list generation. We propose a recommendation algorithm that applies an extended similarity measure, which utilizes quantified review contents in addition to user rating values. After calculating review similarity by applying TF-IDF, Word2Vec, and Doc2Vec techniques to the review contents, the extended similarity is created by combining it with user rating similarity. For verification, we used user ratings and review data from the "Health and Personal Care" category of the e-commerce site Amazon. The proposed recommendation model using the extended similarity measure outperformed the traditional model using only a rating-based similarity measure. In addition, among the text mining techniques, the similarity obtained with TF-IDF performed best in the neighborhood group search and recommendation list generation steps.
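The extended similarity step can be illustrated with a small sketch, assuming a simple weighted combination of rating-based and TF-IDF review-based cosine similarities (the weight `alpha` and the plain whitespace tokenizer are illustrative assumptions, not the paper's exact formulation):

```python
import math
from collections import Counter

def cosine(u, v):
    # cosine similarity between two sparse vectors stored as dicts
    num = sum(u[k] * v[k] for k in u if k in v)
    du = math.sqrt(sum(x * x for x in u.values()))
    dv = math.sqrt(sum(x * x for x in v.values()))
    return num / (du * dv) if du and dv else 0.0

def tfidf_vectors(docs):
    # docs: {user: review text}; plain TF-IDF over whitespace tokens
    tokenized = {u: t.lower().split() for u, t in docs.items()}
    n = len(docs)
    df = Counter()
    for toks in tokenized.values():
        df.update(set(toks))
    vecs = {}
    for u, toks in tokenized.items():
        tf = Counter(toks)
        vecs[u] = {w: (c / len(toks)) * math.log(n / df[w]) for w, c in tf.items()}
    return vecs

def extended_similarity(ratings_a, ratings_b, rev_a, rev_b, alpha=0.5):
    # extended similarity: blend rating similarity with quantified review
    # similarity; alpha is a free weight, not a value from the paper
    return alpha * cosine(ratings_a, ratings_b) + (1 - alpha) * cosine(rev_a, rev_b)
```

A user pair sharing both rating patterns and review vocabulary scores higher than a pair sharing neither, which is exactly what the Top-N neighborhood search exploits.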

Finding the time sensitive frequent itemsets based on data mining technique in data streams (데이터 스트림에서 데이터 마이닝 기법 기반의 시간을 고려한 상대적인 빈발항목 탐색)

  • Park, Tae-Su;Chun, Seok-Ju;Lee, Ju-Hong;Kang, Yun-Hee;Choi, Bum-Ghi
    • Journal of The Korean Association of Information Education / v.9 no.3 / pp.453-462 / 2005
  • Recently, owing to technical improvements in storage devices and networks, the amount of data is increasing rapidly, and it is required to find the knowledge embedded in a data stream as quickly as possible. The huge volume of data in a stream is created continuously and changes fast. Various algorithms for finding frequent itemsets in a data stream have been actively proposed. However, current research does not offer an appropriate method for finding frequent itemsets in which the flow of time is reflected; it provides only frequent items based on total aggregation values. In this paper, we propose a novel algorithm for finding relative frequent itemsets according to time in a data stream. We also propose a method for saving frequent and sub-frequent items that takes limited memory into account, and a method for updating time-variant frequent items. The performance of the proposed method is analyzed through a series of experiments. The proposed method can search both frequent itemsets and relative frequent itemsets using only the action patterns of students in each time slot; thus, it can enhance the effectiveness of learning and support the best plan for individual learning.
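The idea of time-sensitive (relative) frequency, as opposed to totals aggregated over the whole stream, can be sketched with a sliding window of time slots (an illustrative toy, not the paper's memory-bounded frequent/sub-frequent scheme):

```python
from collections import Counter, deque

class SlidingWindowCounter:
    """Toy sketch of time-sensitive frequent-item search in a stream:
    counts are kept per time slot and expired slots are dropped, so an
    item's support reflects the recent window rather than the whole stream."""

    def __init__(self, window_slots):
        # deque with maxlen drops the oldest slot automatically
        self.window = deque(maxlen=window_slots)

    def new_slot(self, items):
        # start a new time slot with the items observed in it
        self.window.append(Counter(items))

    def frequent(self, min_support):
        # support = fraction of all item occurrences in the current window
        total = sum(sum(c.values()) for c in self.window)
        merged = Counter()
        for c in self.window:
            merged.update(c)
        return {i for i, n in merged.items() if total and n / total >= min_support}
```

An item that was frequent early in the stream falls out of the result set once its slots expire, which is the "relative to time" behavior total-aggregation counting cannot capture.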


Identification and Biochemical Characterization of Xylanase-producing Streptomyces glaucescens subsp. WJ-1 Isolated from Soil in Jeju Island, Korea (제주도 토양에서 분리한 xylanase 생산균주 Streptomyces glaucescens subsp. WJ-1의 동정 및 효소의 생화학적 특성 연구)

  • Kim, Da Som;Jung, Sung Cheol;Bae, Chang Hwan;Chi, Won-Jae
    • Microbiology and Biotechnology Letters / v.45 no.1 / pp.43-50 / 2017
  • A xylan-degrading bacterium (strain WJ-1) was isolated from soil collected from Jeju Island, Republic of Korea. Strain WJ-1 was characterized as a gram-positive, aerobic, and spore-forming bacterium. The predominant fatty acid in this bacterium was anteiso-$C_{15:0}$ (42.99%). A similarity search based on 16S rRNA gene sequences suggested that the strain belonged to the genus Streptomyces. Further, strain WJ-1 shared the highest sequence similarity with the type strains Streptomyces spinoverrucosus NBRC 14228, S. minutiscleroticus NBRC 13000, and S. glaucescens NBRC 12774. Together, they formed a coherent cluster in a phylogenetic tree based on the neighbor-joining algorithm. The DNA G+C content of strain WJ-1 was 74.7 mol%. The level of DNA-DNA relatedness between strain WJ-1 and the closest related species S. glaucescens NBRC 12774 was 85.7%. DNA-DNA hybridization, 16S rRNA gene sequence similarity, and the phenotypic and chemotaxonomic characteristics suggest that strain WJ-1 constitutes a novel subspecies of S. glaucescens. Thus, the strain was designated as S. glaucescens subsp. WJ-1 (Korean Agricultural Culture Collection [KACC] accession number 92086). Additionally, strain WJ-1 secreted thermostable endo-type xylanases that converted xylan to xylooligosaccharides such as xylotriose and xylotetraose. The enzymes exhibited optimal activity at pH 7.0 and $55^{\circ}C$.

Water leakage accident analysis of water supply networks using big data analysis technique (R기반 빅데이터 분석기법을 활용한 상수도시스템 누수사고 분석)

  • Hong, Sung-Jin;Yoo, Do-Guen
    • Journal of Korea Water Resources Association / v.55 no.spc1 / pp.1261-1270 / 2022
  • The purpose of this study is to collect and analyze information related to water leakage accidents, which is otherwise difficult to access, by using news search results that anyone can easily reach. We applied a web crawling technique to extract big-data news on water leakage accidents in the water supply system and present a procedural algorithm for obtaining accurate leak accident news. In addition, a data analysis technique suitable for leakage accident information was developed so that additional attributes, such as the date and time of occurrence, cause, location, damaged facilities, and damage effects, can be extracted. The primary goal of the big data-based leak analysis proposed in this study is to extract meaningful values through comparison with existing waterworks statistics. The proposed method can also be used to respond effectively to consumers or to determine the service level of water supply networks. In other words, the analysis results suggest the need to keep the public better informed about such accidents, and they can be used in preparing an information dissemination and response system that enables a quick response in case of an accident.
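The attribute-extraction step after crawling can be illustrated with a small sketch; the regular-expression patterns and field names below are hypothetical assumptions for English-language text, not the study's actual extraction rules:

```python
import re

def parse_leak_article(text):
    """Toy sketch of pulling structured fields (date, location, cause)
    out of a leak-accident news snippet with regular expressions.
    Patterns and field names are illustrative assumptions only."""
    fields = {}
    m = re.search(r"\b(\d{4}-\d{2}-\d{2})\b", text)      # ISO-style date
    if m:
        fields["date"] = m.group(1)
    m = re.search(r"in ([A-Z][a-z]+(?: [A-Z][a-z]+)*)", text)  # capitalized place name
    if m:
        fields["location"] = m.group(1)
    m = re.search(r"caused by ([a-z ]+?)[.,]", text)     # short cause phrase
    if m:
        fields["cause"] = m.group(1).strip()
    return fields
```

Fields extracted this way can then be tabulated and compared against waterworks statistics, as the study proposes.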

Optimization of Support Vector Machines for Financial Forecasting (재무예측을 위한 Support Vector Machine의 최적화)

  • Kim, Kyoung-Jae;Ahn, Hyun-Chul
    • Journal of Intelligence and Information Systems / v.17 no.4 / pp.241-254 / 2011
  • Financial time-series forecasting is one of the most important issues because it is essential for the risk management of financial institutions. Researchers have therefore tried to forecast financial time series using various data mining techniques such as regression, artificial neural networks, decision trees, and k-nearest neighbor. Recently, support vector machines (SVMs) have been popularly applied to this research area because they do not require huge training data and have a low possibility of overfitting. However, a user must determine several design factors by heuristics in order to use an SVM, such as the selection of an appropriate kernel function and its parameters, and proper feature subset selection. Beyond these factors, proper selection of an instance subset may also improve forecasting performance by eliminating irrelevant and distorting training instances. Nonetheless, few studies have applied instance selection to SVMs, especially in the domain of stock market prediction. Instance selection chooses a proper instance subset from the original training data; it may be considered a method of knowledge refinement that maintains the instance base. This study proposes a novel instance selection algorithm for SVMs. The proposed technique uses a genetic algorithm (GA) to optimize the instance selection process and the kernel parameters simultaneously; we call this model ISVM (SVM with Instance selection). The GA searches for optimal or near-optimal values of the kernel parameters and the relevant instances for the SVM, so each chromosome contains two sets of codes: one for the kernel parameters and one for instance selection. For the controlling parameters of the GA search, the population size is set at 50 organisms, the crossover rate at 0.7, and the mutation rate at 0.1; as the stopping condition, 50 generations are permitted. The application data consist of technical indicators and the direction of change in the daily Korea stock price index (KOSPI), with a total of 2,218 trading days. We separate the whole data set into training, test, and hold-out subsets of 1,056, 581, and 581 samples, respectively. This study compares ISVM with several comparative models, including logistic regression (Logit), backpropagation neural networks (ANN), nearest neighbor (1-NN), conventional SVM (SVM), and SVM with GA-optimized kernel parameters (PSVM). The experimental results show that ISVM outperforms 1-NN by 15.32%, ANN by 6.89%, Logit and SVM by 5.34%, and PSVM by 4.82% on the hold-out data, using only 556 of the 1,056 original training instances. In addition, a two-sample test for proportions shows that ISVM significantly outperforms ANN and 1-NN at the 1% statistical significance level, and Logit, SVM, and PSVM at the 5% level.
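The GA-based instance selection idea can be sketched as follows; this toy uses a 1-NN fitness instead of an SVM (so no external libraries are needed), and the GA operators and parameters are simplified assumptions rather than the paper's exact ISVM setup:

```python
import random

def knn_predict(train, x):
    # 1-NN on the selected training instances; train: list of (features, label)
    return min(train, key=lambda t: sum((a - b) ** 2 for a, b in zip(t[0], x)))[1]

def fitness(mask, train, valid):
    # validation accuracy of a 1-NN built from the instances the mask keeps
    sel = [t for t, keep in zip(train, mask) if keep]
    if not sel:
        return 0.0
    hits = sum(knn_predict(sel, x) == y for x, y in valid)
    return hits / len(valid)

def ga_instance_selection(train, valid, pop=20, gens=15, cx=0.7, mut=0.1, seed=1):
    # GA over instance-selection bitmasks (a toy analogue of the ISVM
    # chromosome; the kernel-parameter genes are omitted for brevity)
    rng = random.Random(seed)
    n = len(train)
    population = [[rng.random() < 0.5 for _ in range(n)] for _ in range(pop)]
    for _ in range(gens):
        scored = sorted(population, key=lambda m: fitness(m, train, valid), reverse=True)
        nxt = scored[:2]                       # elitism: keep the two best masks
        while len(nxt) < pop:
            a, b = rng.sample(scored[:10], 2)  # truncation-style parent selection
            child = [(x if rng.random() < cx else y) for x, y in zip(a, b)]  # uniform crossover
            child = [(not g if rng.random() < mut else g) for g in child]    # bit-flip mutation
            nxt.append(child)
        population = nxt
    return max(population, key=lambda m: fitness(m, train, valid))
```

On a toy set containing two mislabeled instances, the evolved mask tends to drop them, mirroring how ISVM discards distorting training instances.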

Metagenomic analysis of bacterial community structure and diversity of lignocellulolytic bacteria in Vietnamese native goat rumen

  • Do, Thi Huyen;Dao, Trong Khoa;Nguyen, Khanh Hoang Viet;Le, Ngoc Giang;Nguyen, Thi Mai Phuong;Le, Tung Lam;Phung, Thu Nguyet;Straalen, Nico M. van;Roelofs, Dick;Truong, Nam Hai
    • Asian-Australasian Journal of Animal Sciences / v.31 no.5 / pp.738-747 / 2018
  • Objective: In a previous study, analysis of Illumina-sequenced metagenomic DNA data of bacteria in the Vietnamese goat rumen showed a high diversity of putative lignocellulolytic genes. In this study, the taxonomy of the microbial community and the lignocellulolytic bacterial population in the rumen was inferred to elucidate the role of bacterial community structure in the effective degradation of plant materials. Methods: The metagenomic data had been subjected to the Basic Local Alignment Search Tool (BLASTX) algorithm against the National Center for Biotechnology Information non-redundant sequence database. Here, the BLASTX hits were further processed by the Metagenome Analyzer program to statistically analyze the abundance of taxa. Results: The microbial community in the rumen is defined by a dominance of Bacteroidetes over Firmicutes, with a Firmicutes-to-Bacteroidetes ratio of 0.36:1. An abundance of Synergistetes, uniquely identified in the goat microbiome, may be shaped by host genotype. With regard to bacterial lignocellulose degraders, the ratio of lignocellulolytic genes affiliated with Firmicutes to those linked to Bacteroidetes was 0.11:1; the genes encoding putative hemicellulases, carbohydrate esterases, and polysaccharide lyases originating from Bacteroidetes were 14 to 20 times more numerous than those from Firmicutes. Firmicutes seem to possess more cellulose hydrolysis capacity, showing a Firmicutes-to-Bacteroidetes ratio of 0.35:1. Analysis of potential lignocellulolytic degraders shows that four species belonged to the phylum Bacteroidetes, while two species belonging to the phylum Firmicutes harboured at least 12 different catalytic domains covering lignocellulose pretreatment as well as cellulose and hemicellulose saccharification. Conclusion: Based on these findings, we speculate that increasing the members of Bacteroidetes, so as to keep the Firmicutes-to-Bacteroidetes ratio in the goat rumen low, has most likely resulted in increased lignocellulose digestion.

A Boundary Matching and Post-processing Method for the Temporal Error Concealment in H.264/AVC (H.264/AVC의 시간적 오류 은닉을 위한 경계 정합과 후처리 방법)

  • Lee, Jun-Woo;Na, Sang-Il;Won, In-Su;Lim, Dae-Kyu;Jeong, Dong-Seok
    • Journal of Korea Multimedia Society / v.12 no.11 / pp.1563-1571 / 2009
  • In this paper, we propose a new boundary matching method for temporal error concealment and a post-processing algorithm for perceptual quality improvement of the concealed frame. Temporal error concealment substitutes erroneous blocks with similar blocks from the reference frame. The conventional H.264/AVC method compares the outside pixels of an erroneous block with the inside pixels of a reference block to find the most similar block. However, because it compares only a narrow spatial range of pixels, it is quite possible that the conventional method substitutes the erroneous block with a wrong one. To substitute erroneous blocks with more accurate ones, we propose an enhanced boundary matching method that compares both the inside and outside pixels of the reference block with the outside pixels of the erroneous block, and that sets up additional candidate motion vectors in a fixed search range based on the maximum and minimum values of the candidate motion vectors. Furthermore, we propose a post-processing method that smooths the edges between concealed blocks and correctly decoded blocks using a modified deblocking filter. Experiments show that the proposed method improves quality by about 0.9 dB over conventional boundary matching methods.
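The boundary matching idea can be sketched as follows; for brevity this toy compares only the rows above and below the lost block (real implementations also use the left/right columns), and it is not the H.264/AVC reference code:

```python
def boundary_sad(frame, ref, bx, by, bs, mvx, mvy):
    """Sum of absolute differences between the pixels just outside the lost
    block in the damaged frame and the boundary rows of the motion-shifted
    candidate block in the reference frame. Toy sketch of boundary matching."""
    sad = 0
    for dx in range(bs):
        # row above the lost block vs. candidate block's top row
        sad += abs(frame[by - 1][bx + dx] - ref[by + mvy][bx + mvx + dx])
        # row below the lost block vs. candidate block's bottom row
        sad += abs(frame[by + bs][bx + dx] - ref[by + mvy + bs - 1][bx + mvx + dx])
    return sad

def conceal(frame, ref, bx, by, bs, candidates):
    # pick the candidate motion vector minimizing the boundary SAD,
    # then copy the matching reference block over the lost block
    best = min(candidates, key=lambda mv: boundary_sad(frame, ref, bx, by, bs, *mv))
    for dy in range(bs):
        for dx in range(bs):
            frame[by + dy][bx + dx] = ref[by + best[1] + dy][bx + best[0] + dx]
    return best
```

On a frame whose motion is zero, the (0, 0) candidate yields the smallest boundary mismatch and the block is restored exactly.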


Artificial Intelligence Algorithms, Model-Based Social Data Collection and Content Exploration (소셜데이터 분석 및 인공지능 알고리즘 기반 범죄 수사 기법 연구)

  • An, Dong-Uk;Leem, Choon Seong
    • The Journal of Bigdata / v.4 no.2 / pp.23-34 / 2019
  • Recently, crime that exploits digital platforms has been continuously increasing: about 140,000 cases occurred in 2015 and about 150,000 in 2016. Old-fashioned investigation techniques are therefore considered insufficient for handling such online crimes. Investigators' manual online searches and cognitive investigation methods, which are broadly used today, cannot proactively cope with rapidly changing cyber crimes, and the fact that content is posted to unspecified users of social media makes investigation even more difficult. Considering the characteristics of the online media where infringement crimes occur, this study suggests site-based collection and the Open API among web content collection methods. Since illegal content is published and deleted quickly, and new words and variants are generated rapidly and in great variety, it is difficult to recognize them quickly with dictionary-based morphological analysis whose entries are registered manually. To solve this problem, we propose a tokenizing method based on the Word Piece Model (WPM), a data preprocessing method, on top of the existing dictionary-based morphological analysis, for quickly recognizing and responding to illegal content posted in online infringement crimes. In the data analysis, optimal precision is verified through a vote-based ensemble of supervised classification learning models for the investigation of illegal content. This study applies the classification model to illegal multi-level marketing cases to proactively recognize crimes that infringe on the public economy, and presents an empirical study on effectively handling social data collection and content investigation.
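The WPM-style subword tokenization that makes newly coined words recognizable can be sketched with a greedy longest-match-first routine (the toy vocabulary and the `##` continuation convention follow common WordPiece practice; this is not the study's trained model):

```python
def wordpiece_tokenize(word, vocab):
    """Greedy longest-match-first WordPiece-style tokenization: unseen
    spellings and newly coined words are broken into known subword units
    instead of collapsing to a single out-of-vocabulary token."""
    tokens, start = [], 0
    while start < len(word):
        end = len(word)
        while end > start:
            piece = word[start:end]
            if start > 0:
                piece = "##" + piece  # mark word-internal continuation pieces
            if piece in vocab:
                tokens.append(piece)
                start = end
                break
            end -= 1
        else:
            return ["[UNK]"]  # no known piece covers this position
    return tokens
```

A coined variant such as "spamming" still decomposes into known units even if the full form was never registered in a dictionary.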


High Bit-Rates Quantization of the First-Order Markov Process Based on a Codebook-Constrained Sample-Adaptive Product Quantizers (부호책 제한을 가지는 표본 적응 프로덕트 양자기를 이용한 1차 마르코프 과정의 고 전송률 양자화)

  • Kim, Dong-Sik
    • Journal of the Institute of Electronics Engineers of Korea SP / v.49 no.1 / pp.19-30 / 2012
  • For digital data compression, quantization is the main part of lossy source coding. To improve quantization performance, a vector quantizer (VQ) can be employed; however, the encoding complexity increases exponentially as the vector dimension or bit rate grows. Much research has been conducted to alleviate these problems of VQ. Especially for high bit rates, a constrained VQ called the sample-adaptive product quantizer (SAPQ) has been proposed to reduce the huge encoding complexity of regular VQs. SAPQ has a structure very similar to that of the product VQ (PQ), yet its performance can be better than the PQ case, while its encoding complexity and codebook memory requirement are lower than those of the regular full-search VQ. Among SAPQs, 1-SAPQ has a simple quantizer structure in which each product codebook is symmetric with respect to the diagonal line of the underlying vector space, and it is known to perform well for i.i.d. sources. In this paper, we study the design of 1-SAPQ for the first-order Markov process. For an efficient design of 1-SAPQ, an algorithm for the initial codebook is proposed, and numerical analysis shows that 1-SAPQ yields lower quantizer distortion than a VQ of similar encoding complexity, and distortion close to that of the DPCM (differential pulse-coded modulation) scheme with the Lloyd-Max quantizer.
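The Lloyd-Max baseline mentioned above can be sketched with the classic Lloyd iteration for a scalar quantizer (this illustrates the comparison point, not the 1-SAPQ design itself; the initialization is a naive assumption):

```python
def lloyd_max(samples, levels, iters=50):
    """Lloyd's algorithm for a scalar quantizer: alternate between
    nearest-codeword assignment and centroid update until the codebook
    settles. This is the classic Lloyd-Max baseline, not SAPQ."""
    codebook = sorted(samples[:levels])  # naive initialization from the data
    for _ in range(iters):
        cells = [[] for _ in codebook]
        for x in samples:
            # assign each sample to its nearest codeword (quantization cell)
            i = min(range(len(codebook)), key=lambda j: (x - codebook[j]) ** 2)
            cells[i].append(x)
        # move each codeword to the centroid of its cell (keep it if empty)
        codebook = [sum(c) / len(c) if c else cw for c, cw in zip(cells, codebook)]
    return codebook

def quantize(x, codebook):
    # encode a sample as its nearest codeword
    return min(codebook, key=lambda c: (x - c) ** 2)
```

For two well-separated clusters of samples, the two codewords converge to the cluster means, which is the distortion-minimizing solution.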

Treatment Strategies for Psychotic Depression (정신병적 우울증의 치료 전략)

  • Lee, Soyoung Irene;Jung, Han-Yong
    • Korean Journal of Biological Psychiatry / v.13 no.4 / pp.234-243 / 2006
  • Objectives: Several factors, such as biological markers, clinical correlates, and the course of illness, differ between depressive disorders with and without psychotic symptoms. Therefore, specifying a treatment algorithm for depressive disorder with psychotic symptoms is warranted. This article provides a systematic review of somatic treatments for depressive disorder with psychotic symptoms. Methods: Following the search strategy of the Clinical Research Center for Depression of the Korean Health 21 R&D Project, PubMed and EMBASE were first searched using terms related to the treatment of depressive disorders with psychotic symptoms (until July 2006). Reference lists of related reviews and studies were searched, and relevant practice guidelines were also retrieved through PubMed. All identified clinical literature was reviewed and summarized in a narrative manner. Results: Treatment options, such as the combination of an antidepressant and an antipsychotic versus an antidepressant or an antipsychotic alone, are summarized. In addition, issues regarding electroconvulsive therapy (ECT), combination therapy, and maintenance treatment are discussed. Conclusion: Formerly, the combination of an antidepressant and an antipsychotic, or ECT, was recommended as the first-line treatment for depressive disorder with psychotic symptoms. Recently, however, it has been suggested that there is no conclusive evidence that the combination of an antidepressant and an antipsychotic is more effective than an antidepressant alone. More evidence regarding the pharmacological treatment of depressive disorder with psychotic symptoms is needed.
