• Title/Summary/Keyword: Labeled Data

Automatic Training Corpus Generation Method of Named Entity Recognition Using Knowledge-Bases (개체명 인식 코퍼스 생성을 위한 지식베이스 활용 기법)

  • Park, Youngmin;Kim, Yejin;Kang, Sangwoo;Seo, Jungyun
    • Korean Journal of Cognitive Science / v.27 no.1 / pp.27-41 / 2016
  • Named entity recognition classifies elements in text into predefined categories and is used in various fields that receive natural language input. In this paper, we propose a method that automatically generates a named entity training corpus using knowledge bases. We apply two different methods to generate corpora, depending on the knowledge base. One method attaches named entity labels to text data using Wikipedia. The other crawls data from the web and labels named entities in the web text using Freebase. We conduct two experiments to evaluate the corpus quality and our proposed method for automatically generating a named entity recognition corpus. We randomly extract sentences from the two corpora, called the Wikipedia corpus and the Web corpus, and label them to validate both automatically labeled corpora. We also report the performance of a named entity recognizer trained on the corpus generated by our proposed method. The results show that our proposed method adapts well to new corpora that reflect diverse sentence structures and the newest entities.
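
The knowledge-base labeling idea in this abstract can be illustrated with a minimal sketch. It assumes the knowledge base has already been reduced to a small lookup table from entity names to types; the `label_with_gazetteer` function, the toy `gazetteer`, and the BIO tag scheme are hypothetical choices for illustration and are not taken from the paper.

```python
from typing import Dict, List, Tuple

def label_with_gazetteer(
    tokens: List[str],
    gazetteer: Dict[Tuple[str, ...], str],
    max_len: int = 4,
) -> List[Tuple[str, str]]:
    """Attach BIO labels to tokens by longest-match lookup in a gazetteer.

    The gazetteer maps lowercased token tuples to an entity type. It stands in
    for entries drawn from a knowledge base (an assumption of this sketch).
    """
    labels = ["O"] * len(tokens)
    i = 0
    while i < len(tokens):
        matched = False
        # Try the longest candidate span first so multi-word names win.
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            span = tuple(t.lower() for t in tokens[i:i + n])
            if span in gazetteer:
                entity_type = gazetteer[span]
                labels[i] = "B-" + entity_type
                for j in range(i + 1, i + n):
                    labels[j] = "I-" + entity_type
                i += n
                matched = True
                break
        if not matched:
            i += 1
    return list(zip(tokens, labels))

# Toy knowledge-base entries (hypothetical).
gazetteer = {("barack", "obama"): "PER", ("seoul",): "LOC"}
labeled = label_with_gazetteer("Barack Obama visited Seoul".split(), gazetteer)
```

Longest-match lookup is only one possible design; it keeps multi-word names from being split into shorter entries, but it cannot resolve ambiguous names.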

Object Detection Based on Hellinger Distance IoU and Objectron Application (Hellinger 거리 IoU와 Objectron 적용을 기반으로 하는 객체 감지)

  • Kim, Yong-Gil;Moon, Kyung-Il
    • The Journal of the Institute of Internet, Broadcasting and Communication / v.22 no.2 / pp.63-70 / 2022
  • Although 2D object detection has improved greatly in recent years with advances in deep learning and the use of large labeled image datasets, 3D object detection from 2D imagery remains a challenging problem in applications such as robotics, owing to the lack of data and the diversity of appearances and shapes of objects within a category. Google has recently announced Objectron, which has a novel data pipeline using mobile augmented reality session data. However, it is also a 2D-driven 3D object detection technique. This study explores a more mature 2D object detection method and applies its 2D projection to the Objectron 3D lifting system. Most object detection methods use bounding boxes to encode and represent object shape and location. In this work, we explore a stochastic representation of object regions using Gaussian distributions. We also present a similarity measure for the Gaussian distributions based on the Hellinger distance, which can be viewed as a stochastic Intersection-over-Union. Our experimental results show that the proposed Gaussian representations are closer to the annotated segmentation masks in available datasets. Thus, the low accuracy that is one of several limitations of Objectron can be mitigated.
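
A Hellinger-distance similarity between Gaussian object regions, as described in this abstract, might be sketched as follows. The sketch assumes axis-aligned boxes converted to Gaussians with diagonal covariance, using the variance of a uniform distribution over the box (w²/12, h²/12). That conversion, the function names, and the choice of 1 − H as the similarity are assumptions for illustration, not details confirmed by the paper.

```python
import math

def box_to_gaussian(x1, y1, x2, y2):
    """Represent a box as (mean, diagonal variances) of a 2D Gaussian.

    Assumption: variances are those of a uniform distribution over the box.
    """
    mean = ((x1 + x2) / 2.0, (y1 + y2) / 2.0)
    w, h = x2 - x1, y2 - y1
    return mean, (w * w / 12.0, h * h / 12.0)

def hellinger_distance(g1, g2):
    """Hellinger distance between two Gaussians with diagonal covariance."""
    (mx1, my1), (vx1, vy1) = g1
    (mx2, my2), (vx2, vy2) = g2
    # Averaged covariance (still diagonal).
    sx, sy = (vx1 + vx2) / 2.0, (vy1 + vy2) / 2.0
    det1, det2, det = vx1 * vy1, vx2 * vy2, sx * sy
    # Squared Mahalanobis distance between the means under the averaged covariance.
    maha = (mx1 - mx2) ** 2 / sx + (my1 - my2) ** 2 / sy
    # Bhattacharyya coefficient; H^2 = 1 - BC.
    bc = (det1 * det2) ** 0.25 / math.sqrt(det) * math.exp(-maha / 8.0)
    return math.sqrt(max(0.0, 1.0 - bc))

def hellinger_similarity(box_a, box_b):
    """IoU-like score in [0, 1]: 1 for identical boxes, near 0 for distant ones."""
    return 1.0 - hellinger_distance(box_to_gaussian(*box_a), box_to_gaussian(*box_b))

same = hellinger_similarity((0, 0, 10, 10), (0, 0, 10, 10))
far = hellinger_similarity((0, 0, 10, 10), (100, 100, 110, 110))
```

Unlike the standard IoU, this score stays nonzero and smooth for boxes that do not overlap, which is one reason a distribution-based similarity can be attractive.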

Enhancement of Buckling Characteristics for Composite Square Tube by Load Type Analysis (하중유형 분석을 통한 좌굴에 강한 복합재료 사각관 설계에 관한 연구)

  • Seokwoo Ham;Seungmin Ji;Seong S. Cheon
    • Composites Research / v.36 no.1 / pp.53-58 / 2023
  • The PIC design method assigns a different stacking sequence to each shell element based on a preliminary FE analysis. In a previous study, machine learning was applied to the PIC design method to assign regions efficiently, and the training data were labeled by dividing each region into tension, compression, and shear according to the preliminary FE analysis results. However, since buckling was not considered, regions could not be assigned an appropriate loading type when buckling occurred. The present study proposes PIC-NTL (PIC design using a novel technique for analyzing load type), which applies a novel load-type analysis technique that considers buckling to the conventional PIC design. The stress triaxiality of each ply was analyzed for the buckling analysis, and the representative loading type was designated from the loading types determined within each decision area, where each element is divided into two regions of equal size in the thickness direction. The inputs and labels of the training data consisted of the element coordinates and the representative loading type of each decision area, respectively. A machine learning model was trained on the training data, and the hyperparameters that affect its performance were tuned to optimal values through a Bayesian algorithm. Among the tuned machine learning models, the SVM model showed the highest performance. The most effective stacking sequences were mapped onto the PIC tube based on the trained SVM model. FE analysis results show that the design method proposed in this study has superior external loading resistance and energy absorption compared to the previous study.
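
To make the role of stress triaxiality concrete, here is a minimal sketch that computes it for a plane-stress state and maps it to a loading type. Stress triaxiality is the mean stress divided by the von Mises equivalent stress, giving 1/3 for uniaxial tension, -1/3 for uniaxial compression, and 0 for pure shear. The plane-stress assumption and the ±1/6 classification band are hypothetical choices for this example; the abstract does not state the paper's actual thresholds.

```python
import math

def stress_triaxiality(sx: float, sy: float, txy: float) -> float:
    """Mean stress over von Mises stress for a plane-stress state (sz = 0)."""
    mean = (sx + sy) / 3.0
    von_mises = math.sqrt(sx * sx - sx * sy + sy * sy + 3.0 * txy * txy)
    if von_mises == 0.0:
        raise ValueError("zero stress state: triaxiality is undefined")
    return mean / von_mises

def classify_loading(eta: float, band: float = 1.0 / 6.0) -> str:
    """Map triaxiality to a loading type; the band is a hypothetical threshold."""
    if eta > band:
        return "tension"
    if eta < -band:
        return "compression"
    return "shear"

tension = classify_loading(stress_triaxiality(100.0, 0.0, 0.0))
compression = classify_loading(stress_triaxiality(-100.0, 0.0, 0.0))
shear = classify_loading(stress_triaxiality(0.0, 0.0, 50.0))
```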

Detecting Common Weakness Enumeration(CWE) Based on the Transfer Learning of CodeBERT Model (CodeBERT 모델의 전이 학습 기반 코드 공통 취약점 탐색)

  • Chansol Park;So Young Moon;R. Young Chul Kim
    • KIPS Transactions on Software and Data Engineering / v.12 no.10 / pp.431-436 / 2023
  • Recently, the incorporation of artificial intelligence approaches into software engineering has been a major topic. Research is actively proceeding worldwide in two directions: 1) software engineering for artificial intelligence and 2) artificial intelligence for software engineering. We attempt to apply artificial intelligence to software engineering to identify and refactor bad code module areas. To learn the patterns of bad code elements well, this task requires many datasets in which bad code elements are labeled correctly. The current problems are that datasets for learning are insufficient and that the accuracy of the collected datasets cannot be guaranteed. To solve these problems, when collecting code data, we collect bad code data only from code module areas with high complexity rather than from the entire code. We propose a method for detecting common weakness enumeration (CWE) entries by training on the collected dataset through transfer learning of the CodeBERT model. The CodeBERT model learns more about common weakness patterns in code from this dataset. With this approach, we expect to identify common weakness patterns more accurately than with traditional software engineering approaches.

Studies of the Effects of Acupuncture Stimulation at Huatuo Jiaji(EX B2) Points on Axonal Regeneration of Injured Sciatic Nerve in the Rats (화타협척혈 침자극에 의한 손상 말초신경의 재생효과에 관한 연구)

  • Kim, Dae-Feel;Park, Young-Hoi;Keum, Dong-Ho
    • Journal of Korean Medicine Rehabilitation / v.18 no.4 / pp.39-61 / 2008
  • Objectives: The present study investigated whether acupuncture stimulation in rats affects the regeneration properties of the injured sciatic nerve. We compared the differential effects of acupuncture stimulation at a point near the spinal nerve root controlling sciatic nerve activity and at points in the peripheral area innervated by the injured nerve. Materials and Methods: Rat sciatic nerves were injured by crush, and the effects of acupuncture stimulation at two different regions on axonal regeneration of the injured sciatic nerves were evaluated. In the proximal acupuncture stimulation group, acupuncture stimulation was performed on Huatuo Jiaji (EX B2) points located from the L5 to S1 vertebral levels to stimulate the nearest spinal nerve root that innervates the sciatic nerve. In the distal acupuncture stimulation group, acupuncture stimulation was performed on Zusanli (ST 36) and Weizhong (BL 40) points to stimulate the peripheral area innervated by the injured sciatic nerve. Acupuncture stimulation was given every other day for 1 or 2 weeks. Sciatic nerve tissues collected from the acupuncture stimulation groups, the injury control group, and the intact animal group were used for protein analysis by Western blotting or for Hoechst nuclear staining. To determine axonal regeneration, DiI fluorescent dye was injected into the sciatic nerve 0.5 cm distal to the injury site in each animal group, and DiI-labeled cells identified by retrograde tracing were counted in the DRG at lumbar 5 or in the spinal cord. DRG sensory neurons prepared from each animal group were used to measure the extent of neurite outgrowth and for immunofluorescence staining with anti-GAP-43 antibody. Results: Animal groups given proximal or distal acupuncture stimulation showed upregulation of GAP-43 and Cdc2 protein levels in the sciatic nerve at 7 days after injury. Cdk2 protein levels were strongly induced by nerve injury but were not changed by acupuncture stimulation. Phospho-Erk1/2 protein levels were elevated by acupuncture stimulation above those in the injury control animals. These increases in regeneration-associated protein levels appeared to be related to increased cell proliferation in the injured sciatic nerves. Hoechst 33258 staining of sciatic nerve tissue to visualize the nuclei of individual cells showed an increased Schwann cell number in the distal portion of the injured nerve at 7 and 14 days after injury, with further increases after acupuncture stimulation, particularly at the proximal position. Measurement of axonal regeneration by retrograde tracing showed significantly more DiI-labeled cells in the proximal acupuncture stimulation group than in the distal acupuncture stimulation group and the injury control group. Finally, an evaluation of axonal regeneration by retrograde tracing showed an increased number of DiI-labeled cells in the DRG at lumbar 5 or in the ventral horn of the spinal cord at the lower thoracic level at 7 days after nerve injury. Conclusions: The present data show that proximal acupuncture stimulation at the Huatuo Jiaji (EX B2) points governing the injured sciatic nerve was more effective for axonal regeneration than distal acupuncture stimulation. Further studies on functional recovery and the associated molecular mechanisms will be critical for developing animal models and clinical applications.

A Nucleotide Exchange Factor, BAP, dissociated Protein-Molecular Chaperone Complex in vitro (In vitro에서 핵산치환인자 BAP이 단백질-분자 샤페론 복합체 해리에 미치는 영향)

  • Lee Myoung-Joo;Kim Dong-Eun;Lee Tae-Ho;Jeong Yong-Kee;Kim Young-Hee;Chung Kyung-Tae
    • Journal of Life Science / v.16 no.3 s.76 / pp.409-414 / 2006
  • Molecular chaperones and folding enzymes in the endoplasmic reticulum (ER) associate with newly synthesized proteins to prevent their aggregation and help them fold and assemble correctly. The chaperone function of BiP, an Hsp70 homologue in the ER, is controlled by its N-terminal ATPase domain. The ATPase activity of this domain is affected by regulatory factors. BAP was identified as a nucleotide exchange factor of BiP (Grp78), which exchanges ADP for ATP in the ATPase domain of BiP. This study examines whether BAP can influence the folding of a protein, the immunoglobulin heavy chain, that is tightly bound to BiP. We first examined which nucleotide, ADP or ATP, affects BAP binding to BiP. The data showed that endogenous BAP of HEK293 cells prefers ADP for binding to BiP in vitro, suggesting that BAP first releases ADP from the ATPase domain in order to exchange it for ATP. Immunoglobulin heavy chain, an unfolded protein substrate, was released from BiP in the presence of BAP but not in the presence of ERdj3, another regulatory factor for BiP that accelerates the rate of ATP hydrolysis by BiP. The ADP-releasing function of BAP was, therefore, believed to be responsible for immunoglobulin heavy chain release from BiP. Grp170, another Hsp70 homologue in the ER, did not co-precipitate with BAP from $[^{35}S]$ metabolically labeled HEK293 lysate containing both overexpressed Grp170 and BAP. These data suggest that BAP has no specificity for Grp170, although the ATPase domains of Grp170 and BiP are homologous to each other.

Change of Recommended Energy Intake for Korea (한국인의 에너지 섭취권장량 변화)

  • Na, Hyeon-Ju;Kim, Mi-Jeong;Kim, Young-Nam
    • Journal of Korean Home Economics Education Association / v.23 no.3 / pp.121-138 / 2011
  • This study examined changes in the amounts and calculation methods of the recommended energy intake (REI) from the 1962 recommended dietary intakes for Koreans to the 2010 dietary reference intakes for Koreans. REI is composed of three factors: basal metabolic rate (or resting energy expenditure, REE), activity energy, and the thermogenic effect of food. The first REI formula, in 1962, was weight based; the 1995 formula was the weight-based REE multiplied by an activity coefficient; and the recent 2005 formula (Estimated Energy Requirement, EER) applies age, height, weight, and activity level and was derived from energy expenditure data obtained by the doubly labeled water (DLW) technique. Over these 50 or so years, REIs were reduced in all age groups as the intensity and hours of activity (labor) decreased. The individual REI calculation method was introduced in 1995, and individual REI calculation has been recommended since then to prevent obesity. For better REI estimation for Koreans, an REI formula derived from DLW energy expenditure data of Koreans is required.

Developing a Korean Standard Brain Atlas on the basis of Statistical and Probabilistic Approach and Visualization tool for Functional image analysis (확률 및 통계적 개념에 근거한 한국인 표준 뇌 지도 작성 및 기능 영상 분석을 위한 가시화 방법에 관한 연구)

  • Koo, B.B.;Lee, J.M.;Kim, J.S.;Lee, J.S.;Kim, I.Y.;Kim, J.J.;Lee, D.S.;Kwon, J.S.;Kim, S.I.
    • The Korean Journal of Nuclear Medicine / v.37 no.3 / pp.162-170 / 2003
  • Probabilistic anatomical maps are used to localize functional neuroimages and to characterize morphological variability. A quantitative indicator is very important for determining the anatomical position of an activated region, because functional image data have low resolution and no inherent anatomical information. Although the previously developed MNI probabilistic anatomical map was sufficient to localize the data, it was not suitable for Korean brains because of morphological differences between Occidental and Oriental brains. In this study, we develop a probabilistic anatomical map for the normal Korean brain. T1-weighted spoiled gradient echo magnetic resonance images of 75 normal brains were acquired on a 1.5-T GE SIGNA scanner. Then, a standard brain was selected from the group by a clinician who searched for a brain with average properties in the Talairach coordinate system. On the standard brain, an anatomist delineated 89 regions of interest (ROIs) parcellating cortical and subcortical areas. The parcellated ROIs of the standard brain were warped and overlapped onto each brain by maximizing intensity similarity, and every brain was automatically labeled with the registered ROIs. Each same-labeled region was linearly normalized to the standard brain, and the occurrence of each region was counted. Finally, 89 probabilistic ROI volumes were generated. This paper presents a probabilistic anatomical map for localizing functional and structural analyses of the normal Korean brain. In the future, we will develop group-specific probabilistic anatomical maps for OCD and schizophrenia.
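
The final counting step described in this abstract, where the occurrence of each labeled region is counted across brains, could look roughly like the sketch below. It assumes every subject's label volume has already been registered to the standard brain and flattened to a list of integer ROI labels of equal length; the `probabilistic_maps` function and the toy data are hypothetical.

```python
from typing import Dict, List

def probabilistic_maps(label_volumes: List[List[int]]) -> Dict[int, List[float]]:
    """Per-voxel probability of each label across spatially normalized subjects.

    Each volume is a flat list of ROI labels, one per voxel, all the same
    length (an assumption of this sketch; registration is done elsewhere).
    """
    n_subjects = len(label_volumes)
    n_voxels = len(label_volumes[0])
    counts: Dict[int, List[int]] = {}
    for volume in label_volumes:
        if len(volume) != n_voxels:
            raise ValueError("all label volumes must have the same size")
        for voxel, label in enumerate(volume):
            counts.setdefault(label, [0] * n_voxels)[voxel] += 1
    # Convert occurrence counts to probabilities.
    return {label: [c / n_subjects for c in per_voxel]
            for label, per_voxel in counts.items()}

# Three toy subjects with four voxels each; labels 0, 1, 2 are hypothetical ROIs.
subjects = [[1, 1, 2, 0], [1, 2, 2, 0], [1, 1, 0, 0]]
maps = probabilistic_maps(subjects)
```

With 89 ROIs, the same procedure would yield 89 probability volumes, one per region, and the values at each voxel sum to 1 across regions.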

Frequently Occurred Information Extraction from a Collection of Labeled Trees (라벨 트리 데이터의 빈번하게 발생하는 정보 추출)

  • Paik, Ju-Ryon;Nam, Jung-Hyun;Ahn, Sung-Joon;Kim, Ung-Mo
    • Journal of Internet Computing and Services / v.10 no.5 / pp.65-78 / 2009
  • The most commonly adopted approach to finding valuable information in tree data is to extract frequently occurring subtree patterns. Because mining frequent tree patterns has a wide range of applications, such as XML mining, web usage mining, bioinformatics, and network multicast routing, many algorithms have recently been proposed to find such patterns. However, existing tree mining algorithms suffer from several serious pitfalls in finding frequent tree patterns in massive tree datasets. Some of the major problems are due to (1) modeling data as a hierarchical tree structure, (2) the high computational cost of candidate maintenance, (3) repeated scans of the input dataset, and (4) high memory dependency. These problems arise because most of these algorithms are based on the well-known Apriori algorithm and use the anti-monotone property for candidate generation and frequency counting. To solve these problems, we adopt a pattern-growth approach rather than the Apriori approach and choose to extract maximal frequent subtree patterns instead of all frequent subtree patterns. The proposed method not only removes the process of pruning infrequent subtrees but also completely eliminates the problem of generating candidate subtrees. Hence, it significantly improves the whole mining process.

Automated Vehicle Research by Recognizing Maneuvering Modes using LSTM Model (LSTM 모델 기반 주행 모드 인식을 통한 자율 주행에 관한 연구)

  • Kim, Eunhui;Oh, Alice
    • The Journal of The Korea Institute of Intelligent Transport Systems / v.16 no.4 / pp.153-163 / 2017
  • This research builds on previous research showing that personally preferred safe distance, rotation angle, and speed differ among drivers. Thus, we use machine learning models for recognizing maneuvering modes, trained per person or per group with similar driving patterns, and we evaluate automated driving according to the maneuvering modes. Using driving knowledge, we defined 8 longitudinal modes and 4 lateral modes, and by combining the longitudinal and lateral modes we built 21 maneuvering modes. We train supervised deep learning models (RNN, LSTM, and Bi-LSTM) on the data set labeled per time stamp from drivers' trips, and evaluate the maneuvering modes of automated driving on the test data set. The evaluation dataset was aggregated from real-world trips of a population of 3,000 people collected by VTTI in the USA over 3 years; we use 1,500 trips of 22 people, with a training, validation, and test ratio of 80%, 10%, and 10%, respectively. For recognizing the 8 longitudinal maneuvering modes, RNN achieves better accuracy than LSTM and Bi-LSTM. However, for recognizing the 21 combined longitudinal and lateral maneuvering modes, Bi-LSTM improves accuracy over RNN and LSTM by 1.54% and 0.47%, respectively.
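
The 80%/10%/10% division mentioned in this abstract can be sketched as a simple shuffled split. The sketch assumes the split is made at the level of whole trips, which the abstract does not confirm; the `split_trips` function, the integer trip IDs, and the fixed seed are hypothetical.

```python
import random
from typing import List, Sequence, Tuple

def split_trips(
    trip_ids: Sequence[int],
    ratios: Tuple[float, float, float] = (0.8, 0.1, 0.1),
    seed: int = 0,
) -> Tuple[List[int], List[int], List[int]]:
    """Shuffle trip IDs reproducibly and split into train/validation/test."""
    ids = list(trip_ids)
    random.Random(seed).shuffle(ids)
    n_train = round(len(ids) * ratios[0])
    n_val = round(len(ids) * ratios[1])
    train = ids[:n_train]
    val = ids[n_train:n_train + n_val]
    test = ids[n_train + n_val:]  # remainder goes to the test set
    return train, val, test

# 1,500 trips, as in the abstract; the IDs themselves are placeholders.
train, val, test = split_trips(range(1500))
```

Splitting by trip rather than by time stamp keeps all samples of one trip in the same subset, which avoids leaking near-identical neighboring time steps between training and test data.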