• Title/Summary/Keyword: Sequence Mining

Search Result 164, Processing Time 0.024 seconds

Assessment of geometric nonlinear behavior in composite beams with partial shear interaction

  • Jie Wen;Abdul Hamid Sheikh;Md. Alhaz Uddin;A.B.M. Saiful Islam;Md. Arifuzzaman
    • Steel and Composite Structures / v.48 no.6 / pp.693-708 / 2023
  • Composite beams, in which two materials are joined together, have become more common in structural engineering over the past few decades because they have better mechanical and structural properties. The shear connectors between their layers exhibit some deformability with finite stiffness, resulting in interfacial shear slip, a phenomenon known as partial shear interaction. Such partial shear interaction contributes significantly to the behavior of composite beams. To provide precise predictions of the geometric nonlinear behavior of two-layered composite beams with interfacial shear slip, a robust analytical model has been developed that incorporates the influence of large displacements. The application of a higher-order beam theory to the two material layers results in a third-order variation of the longitudinal displacement within each layer along the depth of the beam. The partial shear interaction is represented by a sequence of deformable shear connectors evenly distributed at the interface along the beam's length. Geometric nonlinearity is incorporated into the governing equations through the von Kármán theory of large deflection, and the equations are then solved analytically using the Navier solution technique. The suggested model exhibits a notable level of agreement with published findings and with numerical outputs derived from a finite element (FE) model. Large displacement substantially reduces deflection, interfacial shear slip, and stress values. Geometric nonlinearity has a significant impact on beams with a larger span-to-depth ratio and a greater degree of shear connector deformability. The analytical model can potentially predict the geometric nonlinear responses of composite beams accurately. The model has a high degree of generality, which might aid in the numerical solution of composite beams with varying configurations and shear criteria.

Network Anomaly Traffic Detection Using WGAN-CNN-BiLSTM in Big Data Cloud-Edge Collaborative Computing Environment

  • Yue Wang
    • Journal of Information Processing Systems / v.20 no.3 / pp.375-390 / 2024
  • Edge computing architecture has effectively alleviated the computing pressure on cloud platforms, reduced network bandwidth consumption, and improved the quality of service for users; however, it has also introduced new security issues. Existing anomaly detection methods in big data scenarios with cloud-edge computing collaboration face several challenges, such as sample imbalance, difficulty in dealing with complex network traffic attacks, and difficulty in effectively training on large-scale data or with overly complex deep-learning network models. A lightweight deep-learning model was proposed to address these challenges. First, normalization on the user side was used to preprocess the traffic data. Second, on the edge side, a trained Wasserstein generative adversarial network (WGAN) was used to supplement the data samples, which effectively alleviates the imbalance caused by sample types with few instances while occupying only a small amount of edge-computing resources. Finally, a trained lightweight deep-learning network model was deployed on the edge side, and the preprocessed and expanded local data were used to fine-tune the trained model. This ensures that the data of each edge node are more consistent with the local characteristics, effectively improving the system's detection ability. In the designed lightweight deep-learning network model, two sets of convolutional pooling layers of convolutional neural networks (CNN) were used to extract spatial features. The bidirectional long short-term memory network (BiLSTM) was used to collect time sequence features, and the weight of traffic features was adjusted through the attention mechanism, improving the model's ability to identify abnormal traffic features. The proposed model was experimentally evaluated using the NSL-KDD, UNSW-NB15, and CIC-IDS2018 datasets. The accuracies of the proposed model on the three datasets were as high as 0.974, 0.925, and 0.953, respectively, showing superior accuracy to other comparative models. The proposed lightweight deep-learning network model has good application prospects for anomaly traffic detection in cloud-edge collaborative computing architectures.

A Topic Modeling-based Recommender System Considering Changes in User Preferences (고객 선호 변화를 고려한 토픽 모델링 기반 추천 시스템)

  • Kang, So Young;Kim, Jae Kyeong;Choi, Il Young;Kang, Chang Dong
    • Journal of Intelligence and Information Systems / v.26 no.2 / pp.43-56 / 2020
  • Recommender systems help users make the best choice among various options. They play a particularly important role on internet sites, where vast amounts of digital information are generated every second. Many studies on recommender systems have focused on accurate recommendation. However, some problems must be overcome for a recommender system to be commercially successful. First, recommender systems lack transparency. That is, users cannot know why products are recommended. Second, recommender systems cannot immediately reflect changes in user preferences. That is, although a user's product preferences change over time, the recommender system must rebuild its model to reflect them. Therefore, in this study, we proposed a recommendation methodology that uses topic modeling and sequential association rule mining on review data to solve these problems. Product reviews provide useful information for recommendations because they include not only the rating of the product but also various content such as user experiences and emotional state. Reviews therefore imply the user's preference for the product, and topic modeling is useful for explaining why items are recommended to users. In addition, sequential association rule mining is useful for identifying changes in user preferences. The proposed methodology is largely divided into two phases. The first phase creates a user profile based on topic modeling: after topics are extracted from user reviews of products, a user profile over topics is created. The second phase recommends products using sequential rules that appear in the buying behaviors of users as time passes. The buying behaviors are derived from changes in the topics of each user. A collaborative filtering-based recommendation system was developed as a benchmark system, and we compared the performance of the proposed methodology with that of the benchmark using Amazon's review dataset. Accuracy, recall, precision, and F1 were used as evaluation metrics. For topic modeling, collapsed Gibbs sampling was conducted, and 15 topics were extracted. Among the main topics, topic 1, topic 3, topic 4, topic 7, topic 9, topic 13, and topic 14 are related to "comedy shows", "high-teen drama series", "crime investigation drama", "horror theme", "British drama", "medical drama", and "science fiction drama", respectively. In the comparative analysis, the proposed methodology outperformed the collaborative filtering-based recommendation system. From the results, we found that the time just prior to the recommendation was very important for inferring changes in user preference. Therefore, the proposed methodology can not only secure the transparency of the recommender system but also reflect user preferences that change over time. However, the proposed methodology has some limitations. It cannot make fine-grained product recommendations if the number of products included in a topic is large. In addition, the number of sequential patterns is small because the number of topics is too small. Future research needs to consider these limitations.
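The second phase described above (recommending from sequential rules over each user's topic changes) can be illustrated with a deliberately simplified sketch. It is not the authors' algorithm: it assumes each user is already reduced to a time-ordered list of topic ids, mines only first-order rules "previous topic → next topic", and uses hypothetical names (`mine_sequential_rules`, `recommend_next_topic`) and a hypothetical `min_confidence` threshold.

```python
from collections import Counter, defaultdict


def mine_sequential_rules(sequences, min_confidence=0.5):
    """Mine first-order rules prev_topic -> next_topic with their confidence.

    sequences: one time-ordered list of topic ids per user (assumed input).
    """
    pair_counts = Counter()        # how often prev is immediately followed by nxt
    antecedent_counts = Counter()  # how often prev has any successor at all
    for seq in sequences:
        for prev, nxt in zip(seq, seq[1:]):
            pair_counts[(prev, nxt)] += 1
            antecedent_counts[prev] += 1

    rules = defaultdict(list)
    for (prev, nxt), count in pair_counts.items():
        confidence = count / antecedent_counts[prev]
        if confidence >= min_confidence:
            rules[prev].append((nxt, confidence))
    for prev in rules:
        # Highest-confidence consequent first.
        rules[prev].sort(key=lambda rule: -rule[1])
    return dict(rules)


def recommend_next_topic(rules, history):
    """Use only the most recent topic in the user's history to pick the next one."""
    if not history:
        return None
    candidates = rules.get(history[-1], [])
    return candidates[0][0] if candidates else None


# Toy topic sequences for three users (illustrative data, not from the paper).
topic_sequences = [[1, 3, 4], [1, 3, 7], [3, 4, 9]]
rules = mine_sequential_rules(topic_sequences)
print(recommend_next_topic(rules, [1, 3]))  # prints 4
```

Conditioning only on the most recent topic mirrors the abstract's observation that the period just before the recommendation matters most. Items would then be drawn from the predicted topic, a step this sketch leaves out.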

Index-based Searching on Timestamped Event Sequences (타임스탬프를 갖는 이벤트 시퀀스의 인덱스 기반 검색)

  • 박상현;원정임;윤지희;김상욱
    • Journal of KIISE:Databases / v.31 no.5 / pp.468-478 / 2004
  • It is essential in various application areas of data mining and bioinformatics to effectively retrieve the occurrences of interesting patterns from sequence databases. For example, consider a network event management system that records the types and timestamp values of events that occurred in a specific network component (e.g., a router). A typical query to find the temporal causal relationships among the network events is as follows: 'Find all occurrences of CiscoDCDLinkUp that are followed by MLMStatusUP that are subsequently followed by TCPConnectionClose, under the constraint that the interval between the first two events is not larger than 20 seconds, and the interval between the first and third events is not larger than 40 seconds.' This paper proposes an indexing method that answers such queries efficiently. Unlike previous methods that rely on inefficient sequential scans or on data structures not easily supported by DBMSs, the proposed method uses a multi-dimensional spatial index, which is proven to be efficient in both storage and search, to find the answers quickly without false dismissals. Given a sliding window W, the input to the multi-dimensional spatial index is an n-dimensional vector whose i-th element is the interval between the first event of W and the first occurrence of the event type Ei in W. Here, n is the number of event types that can occur in the system of interest. The 'curse of dimensionality' problem may arise when n is large. Therefore, we use dimension selection or event type grouping to avoid this problem. The experimental results reveal that our proposed technique can be a few orders of magnitude faster than the sequential scan and ISO-Depth index methods.
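The window-to-vector mapping described above can be sketched in a few lines. This is an illustration under stated assumptions, not the paper's implementation: the window is taken to be time-based with a hypothetical length `window_span`, event types absent from a window are encoded as infinity, and a plain linear filter stands in for the multi-dimensional spatial index. The filter checks only the per-type interval bounds, so any ordering between the later events would still need post-processing.

```python
from math import inf


def window_vectors(events, event_types, window_span):
    """Build one n-dimensional vector per window start.

    events: list of (timestamp, event_type) sorted by timestamp.
    Element i of a vector is the interval between the window's first event
    and the first occurrence of event_types[i] inside the window
    (inf if that type does not occur; this encoding is an assumption).
    """
    index_of = {etype: i for i, etype in enumerate(event_types)}
    vectors = []
    for start, (t0, _) in enumerate(events):
        vec = [inf] * len(event_types)
        for t, etype in events[start:]:
            if t - t0 > window_span:
                break  # event lies outside the sliding window
            i = index_of[etype]
            if vec[i] == inf:
                vec[i] = t - t0  # record the first occurrence only
        vectors.append((t0, vec))
    return vectors


def range_filter(vectors, event_types, first_type, max_gap):
    """Linear-scan stand-in for the spatial index range query.

    Returns start timestamps of windows that begin with first_type and in
    which each type in max_gap first occurs within its allowed interval.
    """
    index_of = {etype: i for i, etype in enumerate(event_types)}
    hits = []
    for t0, vec in vectors:
        if vec[index_of[first_type]] != 0:
            continue
        if all(vec[index_of[etype]] <= gap for etype, gap in max_gap.items()):
            hits.append(t0)
    return hits


# Toy log with abbreviated event types (illustrative data only).
types = ["A", "B", "C"]
log = [(0, "A"), (5, "B"), (12, "C"), (30, "A"), (70, "B")]
vectors = window_vectors(log, types, window_span=40)
print(range_filter(vectors, types, "A", {"B": 20, "C": 40}))  # prints [0]
```

In the approach the abstract describes, the vectors would be stored in a spatial index so that the interval constraints become a range query instead of a scan.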

An Iterative, Interactive and Unified Seismic Velocity Analysis (반복적 대화식 통합 탄성파 속도분석)

  • Suh Sayng-Yong;Chung Bu-Heung;Jang Seong-Hyung
    • Geophysics and Geophysical Exploration / v.2 no.1 / pp.26-32 / 1999
  • Among the various seismic data processing sequences, velocity analysis is the most time-consuming and man-hour-intensive processing step. Production seismic data processing requires a good velocity analysis tool as well as a high-performance computer. The tool must give fast and accurate velocity analysis. There are two different approaches to velocity analysis: batch and interactive. In batch processing, a velocity plot is made at every analysis point. Generally, the plot consists of a semblance contour, a super gather, and a stack panel. The interpreter chooses the velocity function by analyzing the velocity plot. The technique is highly dependent on the interpreter's skill and requires human effort. As high-speed graphic workstations have become more popular, various interactive velocity analysis programs have been developed. Although these programs enable faster picking of the velocity nodes using a mouse, their main improvement is simply the replacement of the paper plot by the graphic screen. The velocity spectrum is highly sensitive to the presence of noise, especially the coherent noise often found in the shallow region of marine seismic data. For accurate velocity analysis, this noise must be removed before the spectrum is computed. Also, velocity analysis must be carried out by carefully choosing the location of the analysis point and accurately computing the spectrum. The analyzed velocity function must be verified by the mute and stack, and the sequence must usually be repeated. Therefore, an iterative, interactive, and unified velocity analysis tool is highly required. An interactive velocity analysis program, xva (X-Window based Velocity Analysis), was developed. The program handles all processes required in velocity analysis, such as composing the super gather, computing the velocity spectrum, NMO correction, mute, and stack. Most parameter changes produce the final stack with a few mouse clicks, thereby enabling iterative and interactive processing. A simple trace indexing scheme is introduced, and a program to make the index of the Geobit seismic disk file was developed. The index is used to reference the original input, i.e., the CDP sort, directly. A transformation technique for the mute function between the T-X domain and the NMOC domain is introduced and adopted in the program. The result of the transform is similar to the remove-NMO technique in suppressing shallow noise such as the direct wave and refracted wave. However, it has two improvements, i.e., no interpolation error and very fast computation. With this technique, the mute times can be easily designed in the NMOC domain and applied to the super gather in the T-X domain, thereby producing a more accurate velocity spectrum interactively. The xva program consists of 28 files, 12,029 lines, 34,990 words, and 304,073 characters. The program references the Geobit utility libraries and can be installed in an environment where Geobit is preinstalled. The program runs in the X-Window/Motif environment. The program menu is designed according to the Motif style guide. The usage of the program is briefly discussed. The program allows fast and accurate seismic velocity analysis, which is necessary for computing the AVO (Amplitude Versus Offset) based DHI (Direct Hydrocarbon Indicator) and for making high-quality seismic sections.
Cytochrome P450 monooxygenase analysis in free-living and symbiotic microalgae Coccomyxa sp. C-169 and Chlorella sp. NC64A

  • Mthakathi, Ntsane Trevor;Kgosiemang, Ipeleng Kopano Rosinah;Chen, Wanping;Mohlatsane, Molikeng Eric;Mojahi, Thebeyapelo Jacob;Yu, Jae-Hyuk;Mashele, Samson Sitheni;Syed, Khajamohiddin
    • ALGAE / v.30 no.3 / pp.233-239 / 2015
  • Microalgae research is gaining momentum because of the potential biotechnological applications of microalgae, including the generation of biofuels. Genome sequencing analysis of two model microalgal species, polar free-living Coccomyxa sp. C-169 and symbiotic Chlorella sp. NC64A, revealed insights into the factors responsible for their lifestyle and unravelled biotechnologically valuable proteins. However, genome sequence analysis under-explored cytochrome P450 monooxygenases (P450s), heme-thiolate proteins ubiquitously present in species belonging to different biological kingdoms. In this study we performed genome data-mining, annotation and comparative analysis of P450s in these two model algal species. Sixty-nine P450s were found in the two algal species: Coccomyxa sp. showed 40 P450s and Chlorella sp. showed 29 P450s in their genomes. Sixty-eight P450s (>100 amino acids in length) were grouped into 32 P450 families and 46 P450 subfamilies. Among the P450 families, 27 were novel and not found in other biological kingdoms. The new P450 families are CYP745-CYP747, CYP845-CYP863, and CYP904-CYP908. Five P450 families, CYP51, CYP97, CYP710, CYP745, and CYP746, were common to the two algal species, and 16 and 11 P450 families were unique to Coccomyxa sp. and Chlorella sp., respectively. Synteny analysis and gene-structure analysis revealed P450 duplications in both species. Functional analysis based on homolog P450s suggested that CYP51 and CYP710 family members are involved in membrane ergosterol biosynthesis. CYP55 and CYP97 family members are involved in nitric oxide reduction and biosynthesis of carotenoids, respectively. This is the first report on comparative analysis of P450s in the microalgal species Coccomyxa sp. C-169 and Chlorella sp. NC64A.

Evaluation of Novel Constitutive Expression Vectors Equipped with Mined Promoters from Metagenome (메타게놈에서 발굴한 프로모터를 장착한 새로운 항시발현 벡터의 가치평가)

  • Han, Sang-Soo;Kim, Geun-Joong
    • Microbiology and Biotechnology Letters / v.36 no.4 / pp.260-267 / 2008
  • The choice of expression vector is very important for the industrial production of proteins. Therefore, the systematic mining of promoters over a wider range of genetic resources and/or hosts is required. We previously reported a novel bidirectional reporting system (pBGR) for the isolation of promoters from the metagenome and screened useful promoters that functioned constitutively in E. coli under general culture conditions. Among them, three promoter sequences, each including its upstream region, were amplified by PCR and used to construct new expression vectors. To facilitate subcloning, a multi-cloning site was incorporated into the downstream region of the reverse primer sequence. GFP, esterase and $\beta$-glucosidase were subcloned at these sites, and the constitutive expression ability of each new promoter was analyzed in terms of protein solubility and expression level. As a result, these vectors expressed the proteins constitutively to a level of $2{\sim}3\%$ of the total cell protein in the soluble fraction (>80%). This study suggests that the excavation of metagenomic promoters for the construction of expression vectors in a given strain could provide a way to develop expression systems.

Noise Control Boundary Image Matching Using Time-Series Moving Average Transform (시계열 이동평균 변환을 이용한 노이즈 제어 윤곽선 이미지 매칭)

  • Kim, Bum-Soo;Moon, Yang-Sae;Kim, Jin-Ho
    • Journal of KIISE:Databases / v.36 no.4 / pp.327-340 / 2009
  • To achieve the noise reduction effect in boundary image matching, we use the moving average transform of time-series matching. Our motivation is based on an intuition that using the moving average transform we may exploit the noise reduction effect in boundary image matching as in time-series matching. To confirm this simple intuition, we first propose $\kappa$-order image matching, which applies the moving average transform to boundary image matching. A boundary image can be represented as a sequence in the time-series domain, and our $\kappa$-order image matching identifies similar images in this time-series domain by comparing the $\kappa$-moving average transformed sequences. Next, we propose an index-based matching method that efficiently performs $\kappa$-order image matching on a large volume of image databases, and formally prove the correctness of the index-based method. Moreover, we formally analyze the relationship between an order $\kappa$ and its matching result, and present a systematic way of controlling the noise reduction effect by changing the order $\kappa$. Experimental results show that our $\kappa$-order image matching exploits the noise reduction effect, and our index-based matching method outperforms the sequential scan by one or two orders of magnitude.
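The effect of the $\kappa$-moving average transform can be seen on a toy sequence. The sketch below makes several assumptions that the abstract does not state: the boundary image has already been converted to a numeric sequence, the moving average uses plain non-circular windows, sequences are compared by Euclidean distance, and the function names are hypothetical.

```python
import math


def moving_average(seq, k):
    """k-moving average transform: the mean of every k consecutive values."""
    if not 1 <= k <= len(seq):
        raise ValueError("k must be between 1 and len(seq)")
    window_sum = sum(seq[:k])
    out = [window_sum / k]
    for i in range(k, len(seq)):
        # Slide the window by one position: add the new value, drop the oldest.
        window_sum += seq[i] - seq[i - k]
        out.append(window_sum / k)
    return out


def k_order_distance(a, b, k):
    """Euclidean distance between the k-moving-average transforms of a and b."""
    ma, mb = moving_average(a, k), moving_average(b, k)
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(ma, mb)))


# A flat boundary sequence and a noisy version of it (illustrative data only).
clean = [1, 1, 1, 1, 1, 1]
noisy = [1, 2, 0, 1, 2, 0]
print(k_order_distance(clean, noisy, 1))  # prints 2.0 (k = 1 leaves the noise in)
print(k_order_distance(clean, noisy, 3))  # prints 0.0 (k = 3 averages it out)
```

A larger order smooths away more noise but also blurs genuine shape differences, which is the trade-off the abstract says can be controlled by changing $\kappa$.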

Gramene database: A resource for comparative plant genomics, pathways and phylogenomics analyses

  • Tello-Ruiz, Marcela K.;Stein, Joshua;Wei, Sharon;Preece, Justin;Naithani, Sushma;Olson, Andrew;Jiao, Yinping;Gupta, Parul;Kumari, Sunita;Chougule, Kapeel;Elser, Justin;Wang, Bo;Thomason, James;Zhang, Lifang;D'Eustachio, Peter;Petryszak, Robert;Kersey, Paul;Lee, Young Koung;Jaiswal, Pankaj;Ware, Doreen
    • Proceedings of the Korean Society of Crop Science Conference / 2017.06a / pp.135-135 / 2017
  • The Gramene database (http://www.gramene.org) is a powerful online resource for agricultural researchers, plant breeders and educators that provides easy access to reference data, visualizations and analytical tools for conducting cross-species comparisons. Learn the benefits of using Gramene to enrich your lectures, accelerate your research goals, and respond to your organismal community needs. Gramene's genomes portal hosts browsers for 44 complete reference genomes, including crops and model organisms, each displaying functional annotations, gene-trees with orthologous and paralogous gene classification, and whole-genome alignments. SNP and structural diversity data, available for 11 species, are displayed in the context of gene annotation, protein domains and functional consequences on transcript structure (e.g., missense variant). Browsers from multiple species can be viewed simultaneously with links to community-driven organismal databases. Thus, while hosting the underlying data for comparative studies, the portal also provides unified access to diverse plant community resources, and the ability for communities to upload and display private data sets in multiple standard formats. Our BioMart data mining interface enables complex queries and bulk download of sequence, annotation, homology and variation data. Gramene's pathway portal, the Plant Reactome, hosts over 240 pathways curated in rice and inferred in 66 additional plant species by orthology projection. Users may compare pathways across species, query and visualize curated expression data from EMBL-EBI's Expression Atlas in the context of pathways, analyze genome-scale expression data, and conduct pathway enrichment analysis. Our integrated search database and modern user interface leverage these diverse annotations to facilitate finding genes through selecting auto-suggested filters with interactive views of the results.
Reliability Analysis of VOC Data for Opinion Mining (오피니언 마이닝을 위한 VOC 데이타의 신뢰성 분석)

  • Kim, Dongwon;Yu, Song Jin
    • Journal of Intelligence and Information Systems / v.22 no.4 / pp.217-245 / 2016
  • The purpose of this study is to verify how seven sentiment domains extracted through sentiment analysis of social media influence business performance. It consists of three phases. In phase I, we constructed the sentiment lexicon after crawling 45,447 pieces of VOC (Voice of the Customer) on 26 auto companies from the car community and extracting the POS information, and built seven sentiment domains. In phase II, to ensure the reliability of the experimental data, we performed auto-correlation analysis and PCA. In phase III, we investigated how the seven domains affect the market share of three major auto companies (GM, FCA, and VOLKSWAGEN) by using linear regression analysis. The findings from the auto-correlation analysis confirmed the auto-correlation and the sequence of the sentiments, and the results from PCA showed that the seven sentiments are connected with positivity, negativity and neutrality. From the linear regression analysis on model 1, we identified that the sentiment factors have a significant influence on the actual market share. In particular, not only the positive and negative sentiment domains but also the neutral sentiment had a significant impact on auto market share. By applying the available data to the market and taking advantage of the auto-correlation between market-related information and sentiment, the findings can contribute substantially to further research on sentiment analysis as well as to actual business performance.