• Title/Summary/Keyword: 2 step cluster analysis

Cloning, Sequencing and Expression of apxIA, IIA, IIIA of Actinobacillus pleuropneumoniae Isolated in Korea (국내 분리 흉막폐렴균의 apxIA, IIA, IIIA 유전자 Cloning, 염기서열 분석 및 단백질 발현)

  • Shin, Sung-jae;Cho, Young-wook;Yoo, Han-sang
    • Korean Journal of Veterinary Research / v.43 no.2 / pp.247-253 / 2003
  • Actinobacillus pleuropneumoniae causes a highly contagious pleuropneumonia in swine. The bacterium produces several virulence factors, including exotoxin, LPS, and capsular polysaccharide. Among them, the exotoxin, called Apx, has received the most attention as the major virulence factor; the toxin is encoded by a cluster of four genes, apxCABD. apxA is the structural gene of the toxin and has four different types: I, II, III, and IV. As the first step in developing a new subunit vaccine, three types of the apxA gene were amplified by PCR from A. pleuropneumoniae isolated in Korea, using primers designed from the N- and C-terminal regions of the toxin. The sizes of apxIA, IIA and IIIA were 3,073, 2,971 and 3,159 bp, respectively. Comparison of the whole DNA sequences of the apxIA, IIA and IIIA genes with those of the reference strain showed 98%, 99% and 98% homology, respectively. In addition, phylogenetic analysis was performed on the amino acid sequences, compared with 12 different members of the RTX toxin family, using the neighbor-joining method. The ApxA proteins of the Korean isolates were identical to those of the reference strains in this study. All ApxA proteins were expressed in E. coli with the pQE expression vector and identified by Western blot with polyclonal antibodies against culture supernatants of A. pleuropneumoniae serotype 2 or 5. The expressed ApxA proteins were about 120, 110 and 125 kDa, respectively. These results could support future work on developing a new vaccine against porcine pleuropneumonia.

Query Processing Model Using Two-level Fuzzy Knowledge Base (2단계 퍼지 지식베이스를 이용한 질의 처리 모델)

  • Lee, Ki-Young;Kim, Young-Un
    • Journal of the Korea Society of Computer and Information / v.10 no.4 s.36 / pp.1-16 / 2005
  • When Web-based specialized retrieval systems for scientific fields severely restrict how users can express their information requests, the process of analyzing information content and the process of acquiring information become inconsistent. Accordingly, this study proposes a re-ranking retrieval model that reflects the content-based similarity between the user's query terms and index words by capturing the knowledge structure of the documents. To this end, the model first constructs a thesaurus and a similarity relation matrix to provide a subject analysis mechanism, and then applies an algorithm that builds a search model, such as query expansion, to analyze the user's demands. The proposed algorithm, which exploits the information structure of the retrieval system, can therefore serve as a content-based retrieval mechanism that establishes a 2-step search model to preserve recall and improve accuracy, which was a weak point of the previous fuzzy retrieval model.

Influence of Limerence and Ruminative Response on Dating Violence in Romantic Relationship (연인관계에서의 집착과 반추적 반응이 데이트 폭력에 미치는 영향)

  • Jeong, Goo-Churl
    • The Journal of the Korea Contents Association / v.17 no.11 / pp.479-490 / 2017
  • This study analyzed the relationship between dating violence and limerence and ruminative response in romantic relationships. The subjects were 205 college students with dating experience; their mean age was 22.1 years. The analysis methods were correlation analysis, ANOVA, two-step cluster analysis, and multinomial logistic regression analysis. The results are as follows. First, self-reproach ruminative response was significantly higher in the victim group and the perpetrator-victim group than in the general group. Second, all sub-factors of ruminative response were significantly higher in the victim group and the perpetrator-victim group than in the general group. Third, self-reproach ruminative response was a significant positive explanatory variable for dating violence. Fifth, the experience of being a victim of limerence significantly increased the odds of belonging to the victim group of dating violence by 3.3 times, and to the perpetrator-victim group by 10.9 times. Based on these findings, the study discusses the importance of dating violence and of limerence and rumination.

Difference of Collaboration·Empathy Skill and Adaptation of School Life according to School Bullying Types (집단따돌림 유형에 따른 협동 및 공감기술과 학교생활적응의 차이)

  • Park, Wan-Sung;Jeong, Goo-Churl
    • The Journal of the Korea Contents Association / v.16 no.11 / pp.399-408 / 2016
  • This research analyzed the relationships among school bullying types, collaboration and empathy skills, and adaptation to school life. A survey was administered to 213 adolescents in middle and high schools in the capital area (middle school: 106, high school: 107). Two-step cluster analysis was used to classify the types of bullying, and multiple logistic regression analysis was used to examine the explanatory power of the predictor variables across groups. The results are as follows. First, experience of perpetrating or suffering school bullying was negatively correlated with collaboration and empathy skills, and also with school life adaptation. Second, membership in the perpetrator group and the victim group of school bullying was related to a lack of collaboration skill, and also to empathy skill. Third, collaboration and empathy skills were an influential factor in adaptation to school life. These results indicate that collaboration and empathy skills reduce the experience of bullying and have a positive impact on adaptation to school life. The study confirms the need for a social skills training program and discusses the implications.

Comparison between Planned and Actual Data of Block Assembly Process using Process Mining in Shipyards (조선 산업에서 프로세스 마이닝을 이용한 블록 조립 프로세스의 계획 및 실적 비교 분석)

  • Lee, Dongha;Park, Jae Hun;Bae, Hyerim
    • The Journal of Society for e-Business Studies / v.18 no.4 / pp.145-167 / 2013
  • This paper proposes a method for comparing planned processes with actual processes of block assembly operations in the shipbuilding industry. Process models can be discovered with process mining techniques from both planned and actual log data; this paper focuses on comparing the planned and actual processes. The analysis procedure consists of five steps: 1) data pre-processing, 2) definition of the analysis level, 3) clustering of assembly blocks, 4) discovery of a process model per cluster, and 5) comparison between planned and actual processes per cluster. In step 5, the processes are compared from several perspectives: process model, task, process instance, and fitness. Comparison factors are defined for each perspective. In the fitness perspective in particular, cross fitness is proposed, which measures the fitness between a process model discovered from one data set and the other data set (for example, the fitness of the planned model to the actual data, and the fitness of the actual model to the planned data). The effectiveness of the proposed method was verified in a case study using planned data from the block assembly planning system (BAPS) and actual data generated by the block assembly monitoring system (BAMS) of a top-ranked shipbuilding company in Korea.

Genetic Diversity and Population Structure of Korean Soybean Landrace [Glycine max(L.) Merr.]

  • Cho, Gyu-Taek;Lee, Jeong-Ran;Moon, Jung-Kyung;Yoon, Mun-Sup;Baek, Hyung-Jin;Kang, Jung-Hoon;Kim, Tae-San;Paek, Nam-Chon
    • Journal of Crop Science and Biotechnology / v.11 no.2 / pp.83-90 / 2008
  • Two hundred and sixty Korean soybean landrace accessions were analyzed for polymorphism at 92 simple sequence repeat (SSR) loci. The 995 identified alleles served as raw data for estimating genetic diversity and population structure. The number of alleles per locus ranged from 3 to 27, with a mean of 10.4. $F_{ST}$ values estimated by analysis of molecular variance (AMOVA) using the SSR data set were 0.018, 0.027, and 0.016 for usage, collection site, and maturity groups, respectively, indicating little genetic differentiation. Model-based clustering analysis placed the accessions into three clusters (K = 3) with an $F_{ST}$ of 0.0503, indicating moderate genetic differentiation. Duncan's Multiple Range Test at K = 3 on 18 quantitative traits revealed that one cluster was differentiated from the other two mainly by seed-related traits, and that the other two clusters were differentiated from each other by biochemical traits. The genetic structure of Korean soybean landraces was thus differentiated by model-based clustering and partly supported by their phenotypic traits. This preliminary study could be a first step towards more efficient germplasm management and utilization of soybean landraces, and could help association studies between genotypic and phenotypic traits in Korean soybean landraces.

A Study on the Seafood Consumer's Value Analysis and Market Segmentation (수산물 소비에 대한 가치체계 분석과 시장세분화에 관한 연구)

  • Zhang, Chun-Feng;Jang, Young-Soo
    • The Journal of Fisheries Business Administration / v.42 no.2 / pp.47-68 / 2011
  • Values are lasting beliefs that lie at the center of human behavior and do not change often. Different values produce different behaviors, and similar values produce similar behaviors. Consumers' values affect not only cognitive processes but also behaviors in a powerful and comprehensive way. Many studies have addressed the prediction of consumption patterns and the identification and measurement of values, because an accurately measured value system can be used in many areas of marketing, such as market segmentation, new product development, and advertising. For seafood as well, marketing strategies need to be built by segmenting consumers based on their value systems. The objectives of this study are as follows. First, applying means-end chain theory with the laddering method, it traces the links from the seafood product attributes that consumers consider important, to the benefits, and finally to the values they pursue. Second, using two-step cluster analysis, it segments seafood markets based on consumers' values and investigates the characteristics of the resulting segments. The study is expected to provide information on seafood consumers and to help producers and distributors establish seafood marketing strategies. The means-end chain analysis of the value system identified seven complete links, that is, ladders, for fresh seafood products, and nine complete ladders for processed seafood products. The empirical market segmentation by values divided the fresh seafood market into three groups and the processed seafood market into two groups.

Analysis of the Influence Factors of Data Loading Performance Using Apache Sqoop (아파치 스쿱을 사용한 하둡의 데이터 적재 성능 영향 요인 분석)

  • Chen, Liu;Ko, Junghyun;Yeo, Jeongmo
    • KIPS Transactions on Software and Data Engineering / v.4 no.2 / pp.77-82 / 2015
  • Big Data technology has attracted much attention for its fast data processing. Research is also under way on applying Big Data technology to process large-scale structured data in relational databases (RDB) much faster. Although there are many studies measuring analysis performance, studies of structured data loading performance, the step that precedes analysis, are very rare. In this study, therefore, we test the performance of loading structured data from an RDB into the distributed processing platform Hadoop using Apache Sqoop. To analyze the factors that influence data loading, the tests were repeated with different data loading options, and loading performance was compared among RDB-based servers. Although the data loading performance of Apache Sqoop was low in the test environment, much better performance can be expected in a large-scale Hadoop cluster environment, where more hardware resources are available. This work is expected to serve as a basis for studies on improving data loading performance and on the performance of all steps of analyzing structured data on the Hadoop platform.

Multi-Vector Document Embedding Using Semantic Decomposition of Complex Documents (복합 문서의 의미적 분해를 통한 다중 벡터 문서 임베딩 방법론)

  • Park, Jongin;Kim, Namgyu
    • Journal of Intelligence and Information Systems / v.25 no.3 / pp.19-41 / 2019
  • With the rapidly increasing demand for text data analysis, research and investment in text mining are being actively conducted not only in academia but also in various industries. Text mining is generally conducted in two steps. In the first step, the text of the collected documents is tokenized and structured to convert the original documents into a computer-readable form. In the second step, tasks such as document classification, clustering, and topic modeling are conducted according to the purpose of the analysis. Until recently, text mining studies focused on applications of the second step. However, with the discovery that the text structuring process substantially influences the quality of the analysis results, various embedding methods have been actively studied to improve that quality by preserving the meaning of words and documents when representing text data as vectors. Unlike structured data, which can be directly used in a variety of operations and traditional analysis techniques, unstructured text must first undergo a structuring task that transforms the original document into a form the computer can understand. Mapping arbitrary objects into a space of a specific dimension while maintaining algebraic properties, in order to structure text data, is called "embedding". Recently, attempts have been made to embed not only words but also sentences, paragraphs, and entire documents. In particular, as the demand for document embedding has increased rapidly, many algorithms have been developed to support it. Among them, doc2Vec, which extends word2Vec and embeds each document into one vector, is the most widely used. However, traditional document embedding methods, represented by doc2Vec, generate a vector for each document using all the words included in the document. As a result, the document vector is affected not only by core words but also by miscellaneous words. Additionally, traditional document embedding schemes usually map each document to a single vector, so it is difficult to represent a complex document with multiple subjects accurately. In this paper, we propose a new multi-vector document embedding method to overcome these limitations. This study targets documents that explicitly separate body content and keywords. For a document without keywords, the method can be applied after extracting keywords through various analysis methods; however, since this is not the core subject of the proposed method, we describe its application to documents with predefined keywords. The proposed method consists of (1) Parsing, (2) Word Embedding, (3) Keyword Vector Extraction, (4) Keyword Clustering, and (5) Multiple-Vector Generation. The specific process is as follows. First, all text in a document is tokenized, and each token is represented as an N-dimensional real-valued vector through word embedding. Then, to overcome the limitation that traditional document embedding is affected by miscellaneous words as well as core words, the vectors corresponding to the keywords of each document are extracted to form a set of keyword vectors for that document. Next, clustering is conducted on the set of keyword vectors of each document to identify the multiple subjects it contains. Finally, a multi-vector is generated from the keyword vectors constituting each cluster. Experiments on 3,147 academic papers revealed that the traditional single-vector approach cannot properly map complex documents because of interference among subjects in each vector. With the proposed multi-vector method, we confirmed that complex documents can be vectorized more accurately by eliminating this interference.
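
The keyword-clustering and multiple-vector-generation steps described in this abstract can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the toy 2-dimensional word vectors, the choice of k-means with farthest-point initialization as the clustering algorithm, the use of a cluster mean as the per-subject vector, and the function names `farthest_point_init`, `kmeans`, and `multi_vector_embedding`.

```python
import math


def farthest_point_init(vectors, k):
    """Deterministic seeding: start from the first vector, then repeatedly
    add the vector farthest from all centroids chosen so far."""
    centroids = [list(vectors[0])]
    while len(centroids) < k:
        nxt = max(vectors, key=lambda v: min(math.dist(v, c) for c in centroids))
        centroids.append(list(nxt))
    return centroids


def kmeans(vectors, k, iters=20):
    """Plain k-means on lists of floats; returns (centroids, labels)."""
    centroids = farthest_point_init(vectors, k)
    labels = [0] * len(vectors)
    for _ in range(iters):
        # Assignment step: each vector goes to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(v, centroids[c]))
            for v in vectors
        ]
        # Update step: each centroid becomes the mean of its members.
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return centroids, labels


def multi_vector_embedding(keywords, word_vectors, k):
    """Return up to k vectors for one document, one per keyword cluster.

    Assumption: a cluster's mean vector stands in for the paper's
    'Multiple-Vector Generation' step, whose exact rule is not given here.
    """
    vecs = [word_vectors[w] for w in keywords if w in word_vectors]
    k = min(k, len(vecs))
    centroids, _ = kmeans(vecs, k)
    return centroids


# Hypothetical word embeddings: two keywords per subject.
toy_vectors = {
    "clustering": [1.0, 0.1],
    "embedding": [0.9, 0.0],
    "soybean": [0.1, 1.0],
    "genotype": [0.0, 0.9],
}

doc_vectors = multi_vector_embedding(
    ["clustering", "embedding", "soybean", "genotype"], toy_vectors, k=2
)
print(doc_vectors)
```

In this toy setup a single averaged document vector would sit between the two subjects, whereas the two cluster means stay close to their own subject, which is the interference effect the abstract describes.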

Trip Generation Model based on Geographically Weighted Regression (공간가중회귀분석을 이용한 통행발생모형)

  • Kim, Jin-Hui;Park, Il-Seop;Jeong, Jin-Hyeok
    • Journal of Korean Society of Transportation / v.29 no.2 / pp.101-109 / 2011
  • In most urbanized cities, socio-economic attributes tend to cluster in space as patterns of similarity, a phenomenon known as spatial autocorrelation, because of agglomeration forces. The classical linear regression model, the one most frequently adopted in the trip generation step, cannot sufficiently represent this effect. To account for it properly, we need a model that adequately handles spatial dependence patterns. In this study, the Geographically Weighted Regression (GWR) model is adopted as an alternative method for the local analysis of relationships in multivariate data sets; GWR extends the traditional regression framework by estimating local rather than global parameters. Using travel data collected in the Daegu metropolitan area, this study demonstrates through the GWR model that spatial effects exist in the production and attraction of home-based and non-home-based trips. Furthermore, LISA is employed to verify that local spatial autocorrelation exists.
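
The idea of estimating local rather than global parameters can be sketched as a weighted least-squares fit repeated at each location. The sketch below is illustrative only and makes several assumptions that do not come from the paper: a single predictor plus intercept, a Gaussian distance kernel with a fixed bandwidth, made-up zone coordinates and trip counts, and the hypothetical function name `gwr_fit_at`.

```python
import math


def gwr_fit_at(point, coords, x, y, bandwidth):
    """Fit y = intercept + slope * x around `point`, weighting each
    observation by a Gaussian kernel of its distance to `point`."""
    w = [math.exp(-0.5 * (math.dist(point, c) / bandwidth) ** 2) for c in coords]
    sw = sum(w)
    # Weighted means of predictor and response.
    xbar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    ybar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    # Weighted covariance and variance give the local slope.
    sxx = sum(wi * (xi - xbar) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - xbar) * (yi - ybar) for wi, xi, yi in zip(w, x, y))
    slope = sxy / sxx
    intercept = ybar - slope * xbar
    return intercept, slope


# Hypothetical zones: three in the west, three in the east.
coords = [(0, 0), (0, 1), (1, 0), (10, 0), (10, 1), (11, 0)]
households = [1, 2, 3, 1, 2, 3]   # predictor (arbitrary units)
trips = [2, 4, 6, 5, 10, 15]      # west follows y = 2x, east follows y = 5x

west = gwr_fit_at((0, 0), coords, households, trips, bandwidth=1.0)
east = gwr_fit_at((10, 0), coords, households, trips, bandwidth=1.0)
pooled = gwr_fit_at((0, 0), coords, households, trips, bandwidth=1e6)
print(west, east, pooled)
```

With a small bandwidth the fit at each location is dominated by nearby zones, so the two regions recover different slopes; with a very large bandwidth all weights approach 1 and the fit collapses to the single global regression that the abstract contrasts GWR with.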