• Title/Summary/Keyword: sequence-to-sequence model

Search Result 695, Processing Time 0.028 seconds

Processing Method of Mass Small File Using Hadoop Platform (하둡 플랫폼을 이용한 대량의 스몰파일 처리방법)

  • Kim, Chang-Bok;Chung, Jae-Pil
    • Journal of Advanced Navigation Technology / v.18 no.4 / pp.401-408 / 2014
  • Hadoop consists of the MapReduce programming model for distributed processing and the HDFS distributed file system. Hadoop is a suitable framework for big data processing, but processing a large number of small files poses many problems. When Hadoop processes many small files, it creates one mapper per file, and the namenode needs a large amount of memory to store the files' meta information. This paper compares and evaluates various methods for processing many small files on the Hadoop platform. Processing files in general compression formats is inadequate because each file is handled by a single mapper regardless of data size. Sequence files and Hadoop archive files remove the namenode memory problem by compressing and combining small files. Hadoop archive files are faster than sequence files in terms of the time needed to combine small files. Processing with the CombineFileInputFormat class does not require combining the small files in advance, and its speed is similar to that of the big data processing method.
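
The mapper-count problem described in this abstract can be illustrated with a rough sketch. It assumes a simplified model in which every file produces at least one input split (and therefore one mapper), while a combining input format packs whole files greedily up to a maximum split size. The function names, the 1 MB and 128 MB figures, and the greedy packing are illustrative assumptions; the real CombineFileInputFormat also takes node and rack locality into account.

```python
import math

def splits_one_per_file(file_sizes, block_size):
    """Each file yields ceil(size / block_size) splits, and never fewer than one."""
    return sum(max(1, math.ceil(size / block_size)) for size in file_sizes)

def splits_combined(file_sizes, max_split_size):
    """Greedily pack whole files into splits of at most max_split_size bytes.

    Simplification: a file larger than max_split_size still becomes one split.
    """
    splits, current = 0, 0
    for size in file_sizes:
        # Close the current split if adding this file would overflow it.
        if current and current + size > max_split_size:
            splits += 1
            current = 0
        current += size
    return splits + (1 if current else 0)

MB = 1024 * 1024
small_files = [1 * MB] * 10_000  # hypothetical workload: 10,000 files of 1 MB each

print(splits_one_per_file(small_files, 128 * MB))  # one mapper per small file
print(splits_combined(small_files, 128 * MB))      # far fewer mappers after packing
```

Under these assumptions the per-file approach launches as many mappers as there are files, while the combined approach needs only about one mapper per 128 files.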

Numerical Study of Structural Behavior of Underground Silo Structures for Low-and-Intermediate-level Radioactive Waste Disposal Facility (중저준위 방폐물 처분 사일로 구조물의 구조거동 수치해석 연구)

  • Kim, Sun-Hoon;Kim, Kwang-Jin
    • Journal of the Computational Structural Engineering Institute of Korea / v.35 no.3 / pp.183-190 / 2022
  • The construction of an underground silo structure was the first stage of erecting the Gyeongju low-and-intermediate-level radioactive waste disposal facility. The facility, completed in 2014, has a scale of 100 000 drums and is currently in operation. The underground silo structure, 25 and 50 m in diameter and height, respectively, consists of cylindrical (for storing waste packages) and dome parts. The dome is divided into lower (connected to the operation tunnel) and upper parts. The wall of the underground silo structure is an approximately 1-m-thick reinforced concrete liner. In this study, finite element analysis was performed for each phase of the construction sequence and operation of the underground silo structure. Two-dimensional axial symmetric finite element analysis was implemented using the SMAP-3D program. Three-dimensional finite element analysis was also performed to examine the reliability of the two-dimensional axial symmetric finite element model. The structural behavior of the underground silo structure was predicted, and its structural safety was examined.

Statistical Modeling on the Sorption of Heavy Metals by Clay Minerals (점토의 중금속 흡착에 대한 통계모델링)

  • 정찬호;김수진
    • The Journal of Engineering Geology / v.13 no.3 / pp.369-378 / 2003
  • Statistical modeling was introduced to cover various experimental conditions for the sorption of heavy metals (Pb, Cu, Cd, and Zn) by clay minerals, i.e., kaolinite, illite, and chlorite. A statistically designed Box-Behnken model was applied to determine the relative impact of three variables: pH, HCO3 (or K) concentration, and the initial concentration of heavy metals. The SAS program was used to obtain the statistical solution by response surface analysis. The results of the statistical sorption modeling indicated that pH has a strong impact among the variables influencing the sorption of heavy metals. The relative effect of the initial heavy metal concentration versus the bicarbonate (or K) concentration depends on the solution conditions. The sorption edge of heavy metals as a function of pH shows a sigmoidal curve, with a large increase in the range of pH 6~8. The sorption sequence among the heavy metals is Cu>Pb>>Zn>Cd. The solution chemistry exerts a greater influence on the sorption of heavy metals than the crystal chemistry of the clay minerals. Potassium has some effect through sorption competition with the heavy metals. The research suggests that statistical modeling is an effective method for presenting sorption results in three dimensions and for reducing the effort of batch sorption experiments.
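
For readers unfamiliar with the Box-Behnken design mentioned in this abstract, the following minimal sketch generates the coded design points for three factors: each pair of factors is run at its low and high levels while the third factor is held at its center level, and center runs are added. The function name, the factor order, and the single center point are assumptions for illustration; the abstract does not state how many center runs the authors used.

```python
from itertools import combinations, product

def box_behnken_3(n_center=1):
    """Return coded three-factor Box-Behnken points (-1, 0, +1) as tuples."""
    n_factors = 3
    points = []
    # For each pair of factors, take all low/high combinations; the third stays at 0.
    for i, j in combinations(range(n_factors), 2):
        for a, b in product((-1, 1), repeat=2):
            point = [0] * n_factors
            point[i], point[j] = a, b
            points.append(tuple(point))
    # Append the center runs.
    points.extend([(0, 0, 0)] * n_center)
    return points

# Assumed factor order: (pH, HCO3 or K concentration, initial metal concentration)
design = box_behnken_3()
for run in design:
    print(run)
```

With one center point this gives 13 runs, compared with 27 for a full three-level factorial, which is consistent with the abstract's point about reducing the effort of batch experiments.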

A Eukaryotic Gene Structure Prediction Program Using Duration HMM (Duration HMM을 이용한 진핵생물 유전자 예측 프로그램 개발)

  • Tae, Hong-Seok;Park, Gi-Jeong
    • Korean Journal of Microbiology / v.39 no.4 / pp.207-215 / 2003
  • Gene structure prediction, which predicts protein-coding regions in a given nucleotide sequence, is the most important process in annotating genes and greatly affects gene analysis and genome annotation. Because eukaryotic genes have more complicated structures in DNA sequences than prokaryotic genes, analysis programs for eukaryotic gene structure prediction use more diverse and more complicated computational models. We have developed EGSP, a eukaryotic gene structure program, using a duration hidden Markov model. The program consists of two major processes: a training process that produces parameter values from training data sets, and a prediction process that predicts protein-coding regions based on those parameter values. The program predicts multiple genes rather than a single gene from a DNA sequence. A few computational models were implemented to detect signal patterns, and their scanning efficiency was tested. Prediction performance was calculated and compared with that of a few commonly used programs, GenScan, GeneID, and Morgan, based on a few criteria. The results show that the program can be used in practice both as a stand-alone program and as a module in a larger system. For gene prediction in eukaryotic microbial genomes, training and prediction analysis was done with Saccharomyces chromosomes, and the result shows that the program is practically applicable to real eukaryotic microbial genomes.

Methanol Extraction of Nitrogen Heterocyclic Compound Contained in Model Coal Tar Fraction of Nine Components System (9성분계 모델 콜타르 흡수유 중에 함유된 질소고리화합물의 메탄올 추출)

  • Kim, Su Jin;Kang, Ho Cheol
    • Applied Chemistry for Engineering / v.25 no.2 / pp.142-146 / 2014
  • The separation of nitrogen heterocyclic compounds (NHCs) contained in a nine-component model coal tar fraction was investigated by distribution equilibrium. The raw material was a model coal tar fraction comprising the NHC group (NHCs; indole (In), quinoline (Q), iso-quinoline (iQ), quinaldine (Qu)), the bicyclic aromatic compound group (BACs; 1-methylnaphthalene (1MN), 2-methylnaphthalene (2MN), dimethylnaphthalene (DMN)), biphenyl (Bp), and phenyl ether (Pe), and the solvent was aqueous methanol. A batch-stirred tank was used as the liquid-liquid contact unit. The distribution coefficient of NHCs increased with increasing equilibrium operation temperature, whereas the selectivity of NHCs with respect to BACs decreased. Decreasing the initial volume ratio of water to the solvent lowered the selectivity of NHCs with respect to BACs but improved the distribution coefficients of NHCs. Under fixed experimental conditions, the distribution coefficients of the groups followed the order NHCs > Bp > BACs > Pe, and the selectivities with respect to BACs followed the order NHCs > Bp > Pe. The distribution coefficients of the individual compounds followed the order In > iQ = Q > Qu > Bp > 1MN = 2MN > Pe > DMN. The maximum yield of NHCs and the selectivity of NHCs based on BACs obtained by methanol extraction were 94 and 23%, respectively. Furthermore, a recovery process for NHCs from coal tar was studied using the experimental results of this work.

Strength Design of Lightweight Composite Bicycle Frame (복합재료 라미네이트 경량화 자전거 프레임의 강도 설계)

  • Lee, Jin Ah;Hong, Hyoung Taek;Chun, Heung Jae
    • Transactions of the Korean Society of Mechanical Engineers A / v.37 no.2 / pp.265-270 / 2013
  • Strength design for a lightweight bicycle frame made of carbon/epoxy composite laminates was studied using Tsai-Wu's failure criterion. For the design of bicycle frames, reducing the weight of the frame is of great importance. Furthermore, the frame should satisfy the required strength under specific loading cases. In accordance with the European EN 14764 standard for bicycle frames, three loading cases (pedaling, vertical, and level loadings) were investigated in this study. Because of the anisotropic characteristics of composite materials, it is important to decide the appropriate stacking sequence and the number of layers to be used in the composite bicycle frame. From finite element analysis results, the most suitable stacking sequence of the fiber orientation and the number of layers were determined. The stacking sequences of $[0]_{8n}$, $[90]_{8n}$, $[0/90]_{2ns}$, $[{\pm}45]_{2ns}$, $[0/{\pm}45/90]_{ns}$ (n = 1, 2, 3, 4) were used in the analysis. The results indicated that the $[0/{\pm}45/90]_{3s}$ lay-up model was suitable for a composite bicycle frame. Furthermore, the weakest point and layer were investigated.
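
As background for the Tsai-Wu criterion named in this abstract, the sketch below evaluates the plane-stress Tsai-Wu failure index for a single ply in its material axes; failure is predicted when the index reaches 1. The function name, the ply strength values, and the choice of the common approximation F12 = -0.5 * sqrt(F11 * F22) for the interaction term are assumptions for illustration and are not taken from the paper.

```python
import math

def tsai_wu_index(s1, s2, t12, Xt, Xc, Yt, Yc, S):
    """Plane-stress Tsai-Wu failure index for one ply (failure when index >= 1).

    s1, s2, t12: in-plane stresses in the ply's material axes.
    Xt, Xc: longitudinal tensile/compressive strengths (Xc as a positive magnitude).
    Yt, Yc: transverse tensile/compressive strengths (Yc as a positive magnitude).
    S: in-plane shear strength.
    """
    F1 = 1 / Xt - 1 / Xc
    F2 = 1 / Yt - 1 / Yc
    F11 = 1 / (Xt * Xc)
    F22 = 1 / (Yt * Yc)
    F66 = 1 / S**2
    F12 = -0.5 * math.sqrt(F11 * F22)  # assumed approximation for the interaction term
    return (F1 * s1 + F2 * s2
            + F11 * s1**2 + F22 * s2**2 + F66 * t12**2
            + 2 * F12 * s1 * s2)

# Hypothetical carbon/epoxy ply strengths in MPa (not from the paper)
ply = dict(Xt=1500.0, Xc=1200.0, Yt=50.0, Yc=200.0, S=70.0)
print(tsai_wu_index(600.0, 20.0, 30.0, **ply))
```

In a laminate analysis such an index would be evaluated ply by ply for each stacking sequence and loading case, and the ply with the highest index marks the weakest layer.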

The Application of an HMM-based Clustering Method to Speaker Independent Word Recognition (HMM을 기본으로한 집단화 방법의 불특정화자 단어 인식에 응용)

  • Lim, H.;Park, S.-Y.;Park, M.-W.
    • The Journal of the Acoustical Society of Korea / v.14 no.5 / pp.5-10 / 1995
  • In this paper we present a clustering procedure based on HMMs that produces multiple statistical models able to absorb the variation among speakers who say words in different ways. The HMM-clustered models obtained with the developed technique are applied to speaker-independent isolated word recognition. The HMM clustering method splits off from the training set all observation sequences whose likelihood scores fall below a threshold and creates a new model from the observation sequences in the new cluster. Clustering is iterated by classifying each observation sequence as belonging to the cluster whose model gives the maximum likelihood score. If any cluster has changed from the previous iteration, the model in that cluster is reestimated using the Baum-Welch reestimation procedure. This method is therefore more efficient than the conventional template-based clustering technique because it integrates the clustering procedure with parameter estimation. Experimental data show that the HMM-based clustering procedure leads to a $1.43\%$ performance improvement over the conventional template-based clustering method and a $2.08\%$ improvement over the single HMM method for recognition of isolated Korean digits.
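
The clustering loop described in this abstract can be sketched as follows. To keep the example short and runnable, each cluster's HMM is replaced by a simple stand-in model (a single Gaussian over the samples of a sequence), and `fit` takes the place of Baum-Welch reestimation; these substitutions, the function names, the threshold, and the toy data are all assumptions, not the authors' implementation.

```python
import math

def fit(seqs):
    """Stand-in for Baum-Welch reestimation: fit one Gaussian to all samples."""
    xs = [x for s in seqs for x in s]
    mean = sum(xs) / len(xs)
    var = max(sum((x - mean) ** 2 for x in xs) / len(xs), 1e-3)  # floor the variance
    return mean, var

def score(model, seq):
    """Average per-sample log-likelihood of a sequence under the stand-in model."""
    mean, var = model
    total = sum(-0.5 * (math.log(2 * math.pi * var) + (x - mean) ** 2 / var) for x in seq)
    return total / len(seq)

def cluster(seqs, threshold, max_iter=20):
    # Start from a single model trained on every sequence.
    models = [fit(seqs)]
    # Split off sequences scoring below the threshold into a new cluster.
    poor = [s for s in seqs if score(models[0], s) < threshold]
    good = [s for s in seqs if score(models[0], s) >= threshold]
    if poor and good:
        models = [fit(good), fit(poor)]
    labels = None
    for _ in range(max_iter):
        # Assign each sequence to the cluster whose model scores it highest.
        new_labels = [max(range(len(models)), key=lambda k: score(models[k], s))
                      for s in seqs]
        if new_labels == labels:
            break
        # Only clusters whose membership changed are reestimated.
        if labels is None:
            changed = set(range(len(models)))
        else:
            changed = {k for old, new in zip(labels, new_labels) if old != new
                       for k in (old, new)}
        labels = new_labels
        for k in changed:
            members = [s for s, l in zip(seqs, labels) if l == k]
            if members:
                models[k] = fit(members)
    return labels, models

# Toy data: three similar sequences and one outlier (hypothetical values)
data = [[0.0, 0.1, -0.1], [0.05, -0.05, 0.0], [0.1, 0.0, -0.05], [5.0, 5.1, 4.9]]
labels, models = cluster(data, threshold=-2.5)
print(labels)
```

This sketch performs a single split; the procedure in the abstract could repeat the split whenever sequences in a cluster still score below the threshold.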

Design of the Business Management System Based on .NET Using UML (UML을 이용한 닷넷기반의 경영관리시스템에 대한 설계)

  • Lee Sang-Young
    • Journal of the Korea Computer Industry Society / v.7 no.3 / pp.179-184 / 2006
  • Many corporations introduce information systems as a way to maximize profit and reduce expenses amid the flood of information that accompanies the development of information technology. In this paper, the AS-IS process is analyzed using UML, an object-oriented design tool, and, through TO-BE process model design, use case, sequence, and component diagrams are drawn up on that basis. The aim is to raise component reusability and software development productivity by implementing, on Microsoft's .NET, a business management system for smaller enterprises that can be extended on a .NET basis.

Design of Multiple-Purpose Protocol Test System (다기능 프로토콜 시험시스템 설계)

  • 최양희
    • The Journal of Korean Institute of Communications and Information Sciences / v.15 no.5 / pp.434-445 / 1990
  • Protocol testing techniques have expanded from traditional simple function testing based on the OSI model to sophisticated performance testing, conformance testing, and interoperability testing. In addition, both point-to-point and point-to-multipoint protocols must be covered. This paper presents a new multiple-purpose protocol test system in which the common platform includes test sequence generation and test result analysis, and the modular test execution part is selectively adjusted according to the test purposes and the protocols under test. The paper also describes a test system for a network routing protocol and a test system for a transport protocol, both designed on the ideas of the multiple-purpose protocol test system.

Token-Based Classification and Dataset Construction for Detecting Modified Profanity (변형된 비속어 탐지를 위한 토큰 기반의 분류 및 데이터셋)

  • Sungmin Ko;Youhyun Shin
    • The Transactions of the Korea Information Processing Society / v.13 no.4 / pp.181-188 / 2024
  • Traditional profanity detection methods have limitations in identifying intentionally altered profanities. This paper introduces a new method based on Named Entity Recognition, a subfield of Natural Language Processing. We developed a profanity detection technique using sequence labeling, for which we constructed a dataset by labeling some profanities in Korean malicious comments and conducted experiments. Additionally, to enhance the model's performance, we augmented the dataset by labeling parts of a Korean hate speech dataset using one of the large language models, ChatGPT, and conducted training. During this process, we confirmed that human filtering alone of the dataset created by the large language model could improve performance. This suggests that human oversight is still necessary in the dataset augmentation process.
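
The token-level labeling idea in this abstract can be illustrated with a small sketch that turns annotated spans into per-token tags, as is common in Named Entity Recognition. The BIO tagging scheme, the `PROF` label name, the function name, and the placeholder tokens are assumptions for illustration; the abstract does not specify the tagging scheme the authors used.

```python
def bio_tags(tokens, spans, label="PROF"):
    """Convert token-index spans [start, end) into BIO tags, one tag per token."""
    tags = ["O"] * len(tokens)
    for start, end in spans:
        tags[start] = f"B-{label}"          # first token of the span
        for i in range(start + 1, end):
            tags[i] = f"I-{label}"          # remaining tokens inside the span
    return tags

# Placeholder tokens standing in for a comment with an altered profanity
tokens = ["you", "are", "such", "a", "b@d", "w0rd"]
spans = [(4, 6)]  # hypothetical annotation: tokens 4 and 5 form one profanity

for token, tag in zip(tokens, bio_tags(tokens, spans)):
    print(token, tag)
```

A sequence labeling model trained on such tags predicts one tag per token, so it can flag an altered profanity by its position in the sentence rather than by matching a fixed word list.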