• Title/Abstract/Keywords: extracting methods

948 search results, processing time 0.023 seconds

Comparing Data Access Methods in Statistical Packages

  • 강근석
    • Communications for Statistical Applications and Methods
    • /
    • Vol. 16, No. 3
    • /
    • pp.437-447
    • /
    • 2009
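  • Statistical professionals in industry, beyond analyzing data with various statistical techniques, now frequently face the problem of assembling data suited to the purpose of the analysis by extracting or generating it from many different kinds of data storage. This paper examines the data access methods provided by several statistical packages in common use and compares their features. A precise understanding of these methods will reduce the cost and time lost to difficulties in data processing, especially when analyzing large volumes of data as in data mining, and will allow statisticians to devote more of their effort to statistical analysis.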

A Study on Extracting Ideas from Documents and Webpages in the Field of Idea Mining

  • 이태영
    • 정보관리학회지
    • /
    • Vol. 29, No. 1
    • /
    • pp.25-43
    • /
    • 2012
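  • Idea mining techniques were applied to find ideas and quasi-ideas that aid creation in general literature, documents, and webpages. Ideas contained in webpages, literature, and documents were extracted using the extraction techniques employed in idea mining, opinion mining, and topic signal mining. The extraction techniques were organized into (1) decisive cue phrases, (2) cue multimedia, (3) contextual signals, and (4) discourse passages, and were tested on seven idea types: thought, plan, opinion, writing, picture, sound, and formula. The effectiveness of each technique was judged by the F-measure, which combines recall and precision, and methods (1), (3), and (4) received generally positive evaluations. In particular, decisive cue phrases were judged effective for extracting ideas, and contextual signals for extracting quasi-ideas.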
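
The F-measure mentioned in the abstract above can be sketched as follows. This is a minimal illustration, not the authors' evaluation code: the abstract does not say how recall and precision were weighted, so the balanced default `beta = 1.0` and the function name `f_measure` are assumptions.

```python
def f_measure(precision: float, recall: float, beta: float = 1.0) -> float:
    """Weighted harmonic mean of precision and recall (F-beta).

    beta = 1.0 weights both equally (the usual F1 score); this default is
    an assumption, since the abstract does not state the weighting used.
    """
    if precision < 0 or recall < 0:
        raise ValueError("precision and recall must be non-negative")
    denominator = beta ** 2 * precision + recall
    if denominator == 0:
        # Both precision and recall are zero: define the score as zero.
        return 0.0
    return (1 + beta ** 2) * precision * recall / denominator


if __name__ == "__main__":
    # Hypothetical scores for one extraction technique.
    print(round(f_measure(0.8, 0.6), 4))
```

Because the harmonic mean is dominated by the smaller of the two values, a technique cannot score well by maximizing only recall or only precision.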

Review of the Extraction Methods of Soil Extracts, Soil Elutriates, and Soil Suspensions for Ecotoxicity Assessments

  • 남선화;안윤주
    • 한국지하수토양환경학회지:지하수토양환경
    • /
    • Vol. 19, No. 3
    • /
    • pp.15-24
    • /
    • 2014
  • Soil pollution has been recognized as a serious problem because it causes groundwater pollution through contact between media. Although the concentration of an individual chemical can be measured relatively easily by physico-chemical analysis, it is not easy to account for bioavailability to edaphic receptors living in soil or groundwater. To measure the toxicity of soil, soil extracts (in other words, soil elutriates or soil suspensions) are often used because of the difficulty of extracting soil pore water. In this study, we reviewed 15 toxicity test methods found in the literature to analyze the details of each extraction method and to recommend the most frequently used extraction methods. The most commonly used extraction conditions identified are as follows: a 1:4 soil:water ratio, a 24-hour shaking time, room temperature, darkness, and separation of the supernatant using a $0.45\,\mu\mathrm{m}$ pore size filter.

The Sequence Labeling Approach for Text Alignment of Plagiarism Detection

  • Kong, Leilei;Han, Zhongyuan;Qi, Haoliang
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • Vol. 13, No. 9
    • /
    • pp.4814-4832
    • /
    • 2019
  • Plagiarism detection increasingly relies on text alignment. Text alignment involves extracting the plagiarized passages from a pair consisting of a suspicious document and its source document. Heuristic methods have achieved excellent performance in text alignment. However, further improvement of the heuristic methods depends mainly on the experience of experts, which leaves the heuristics without the capacity for continuous improvement. Machine learning may be a suitable way to address this problem. Considering the positional relations and the context of pairs of text segments, we formalize the text alignment task as a sequence labeling problem, improving the current methods at the model level. Specifically, this paper proposes using a probabilistic graphical model to tag the observed sequence of pairs of text segments. We therefore present a sequence labeling approach for text alignment in plagiarism detection based on Conditional Random Fields. The proposed approach is evaluated on the PAN@CLEF 2012 artificial high obfuscation plagiarism corpus and the simulated paraphrase plagiarism corpus, and compared with the methods that achieved the best performance in PAN@CLEF 2012, 2013 and 2014. Experimental results demonstrate that the proposed approach significantly outperforms the state-of-the-art methods.

New Breast Measurement Technique and Bra Sizing System Based on 3D Body Scan Data

  • Oh, Seolyoung;Chun, Jongsuk
    • 대한인간공학회지
    • /
    • Vol. 33, No. 4
    • /
    • pp.299-311
    • /
    • 2014
  • Objective: The aim of this study was to develop a method for measuring breast size from three-dimensional (3D) body scan image data. Background: Previous bra studies established reference points by directly contacting the subject's naked skin to determine the boundary of the breast, but some subjects were uncomfortable with these types of measurements. This study examined noncontact methods of extracting breast reference points from 3D body scan data that were collected while subjects were wearing standardized soft bras. Method: 3D body scan data of 32 Korean women were analyzed. The subjects were selected from the Size Korea 2010 study. The breast landmarks were identified by graphic analyses of slicing contour lines on the 3D body scan data. Results: Three methods of determining bra cup size were compared. The M1 and M2 methods determined cup size by calculating the difference between bust girth and under-bust girth. The M3 method determined bra cup size by measuring breast arc length. Conclusion: The researchers proposed an anthropometric bra cup sizing system based on the breast arc length (M3 method), which was measured from the geometrically defined landmarks on the 3D body scan slicing contour lines. The new bra cup size was highly correlated with breast depth. Application: The noncontact measuring method used in this study can be applied to ergonomic studies that measure sensitive body parts.

A Collaborative Framework for Discovering the Organizational Structure of Social Networks Using NER Based on NLP

  • 프랭크 엘리호데;양현호;이재완
    • 인터넷정보학회논문지
    • /
    • Vol. 13, No. 2
    • /
    • pp.99-108
    • /
    • 2012
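  • Many methods have been developed to improve the accuracy of information extraction from vast amounts of data. In this paper, text was analyzed by integrating several natural language processing tasks such as NER (named entity recognition), sentence extraction, and part-of-speech tagging. The data consisted of text collected from the web using domain-specific data extraction agents, and a framework was developed that extracts information from unstructured data using the natural language processing tasks mentioned above. The performance of the approach was analyzed by simulation from the perspective of text extraction and analysis for discovering organizational structure, and the simulation results showed better information extraction performance than other NER analyzers such as MUC and CoNLL.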

Adaptive Korean Continuous Speech Recognizer to Speech Rate

  • 김재범;박찬규;한미성;이정현
    • 한국정보처리학회논문지
    • /
    • Vol. 4, No. 6
    • /
    • pp.1531-1540
    • /
    • 1997
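  • This paper proposes a Korean continuous speech recognition system whose performance is improved by measuring speech rate and compensating for it. Continuous speech recognition is harder than isolated word recognition because of various coarticulation phenomena and variations in speech rate. Continuous speech recognition therefore requires methods that can model coarticulation and speech-rate variation. In this paper, speech rate was measured as the rate of change of the formants, and this information was used to compensate by generating relatively more feature vectors for fast speech. In addition, to model coarticulation, a set of 514 Korean diphones was defined, and the ETRI 445-word DB was used as the speech DB for training. A Korean continuous speech recognizer combining these methods was implemented with a DHMM (Discrete Hidden Markov Model) and showed an improved recognition rate.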

Natural Dyeing of Cotton Fabrics with Rumex crispus L. Root

  • 한미란;이정숙
    • 한국의류학회지
    • /
    • Vol. 33, No. 2
    • /
    • pp.222-229
    • /
    • 2009
  • The natural dyeing of cotton fabrics with Rumex crispus L. root extract was investigated. The dyeability of the extract was evaluated with respect to concentration, temperature, time, number of repeated dyeings, pH, mordant type, mordanting method, color fastness, and antibacterial activity. In the UV-visible spectrum, the maximum absorption bands of the Rumex crispus L. extract appeared at 274 nm and 336 nm. The amount of dye extracted increased with extraction concentration, temperature, and time. The K/S value increased with increasing dyeing concentration and number of repeated dyeings. The K/S value also increased with increasing dyeing temperature and time, and the exhaustion was saturated at $90^{\circ}\mathrm{C}$ and 80 min, respectively. Surface colors of fabrics dyed with the extract at pH 3, 7, and 11 were in the RP-R-YR-Y range. Light fastness and washing fastness were good in Fe-mordanted fabrics. The dry cleaning fastness was above grade 4. Rubbing fastness was better under dry conditions than under wet conditions. In the antibacterial activity test, the reduction rate against Staphylococcus aureus was 9.9% for the dyed cotton fabric.

A Formulation of NDIF Method to the Algebraic Eigenvalue Problem for Efficiently Extracting Natural Frequencies of Arbitrarily Shaped Plates with the Simply Supported Boundary Condition

  • 강상욱;김진곤
    • 한국소음진동공학회논문집
    • /
    • Vol. 19, No. 6
    • /
    • pp.607-613
    • /
    • 2009
  • A new formulation of the NDIF method as an algebraic eigenvalue problem is introduced to efficiently extract natural frequencies of arbitrarily shaped plates with the simply supported boundary condition. The NDIF method, which was developed by the authors for the free vibration analysis of arbitrarily shaped membranes and plates, yields highly accurate natural frequencies compared with other analytical or numerical methods (FEM and BEM). However, the NDIF method has the weakness that it requires an inefficient procedure of searching for natural frequencies by plotting the values of the determinant of a system matrix over the frequency range of interest. The new formulation developed in the paper does not require this procedure, and natural frequencies can be obtained efficiently by solving a typical algebraic eigenvalue problem. Finally, the validity of the proposed method is shown in several case studies, which indicate that the natural frequencies given by the proposed method are very accurate compared to other exact, analytical, or numerical methods.
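
The contrast drawn in the abstract above, between searching for zeros of a determinant over a range and solving an eigenvalue problem directly, can be illustrated with a toy sketch. This is not the NDIF method: the 2x2 matrix `K`, the scan range, and all function names are hypothetical, chosen only so that both routes can be compared in a few lines of standard-library Python.

```python
import math

# Hypothetical 2x2 symmetric matrix standing in for a system matrix.
K = [[2.0, -1.0], [-1.0, 2.0]]


def det_shifted(lam: float) -> float:
    """Determinant of (K - lam * I) for the 2x2 toy matrix."""
    return (K[0][0] - lam) * (K[1][1] - lam) - K[0][1] * K[1][0]


def roots_by_scanning(lo: float, hi: float, steps: int = 50,
                      tol: float = 1e-10) -> list:
    """Scan the determinant on a grid and bisect wherever its sign changes.

    Roots that fall exactly on a grid point or between two grid points
    without a sign change are missed, which is the weakness of this route.
    """
    roots = []
    step = (hi - lo) / steps
    a, fa = lo, det_shifted(lo)
    for i in range(1, steps + 1):
        b = lo + i * step
        fb = det_shifted(b)
        if fa * fb < 0:
            left, right, f_left = a, b, fa
            while right - left > tol:
                mid = 0.5 * (left + right)
                f_mid = det_shifted(mid)
                if f_left * f_mid <= 0:
                    right = mid
                else:
                    left, f_left = mid, f_mid
            roots.append(0.5 * (left + right))
        a, fa = b, fb
    return roots


def eigenvalues_direct() -> list:
    """Solve the 2x2 eigenvalue problem in closed form (no search needed)."""
    trace = K[0][0] + K[1][1]
    det = K[0][0] * K[1][1] - K[0][1] * K[1][0]
    disc = math.sqrt(trace * trace - 4.0 * det)
    return [(trace - disc) / 2.0, (trace + disc) / 2.0]


if __name__ == "__main__":
    print(roots_by_scanning(0.0, 4.0))
    print(eigenvalues_direct())
```

Both routes give the same values for this toy matrix, but the scan needs a chosen range, a grid, and many determinant evaluations, while the direct solve needs none of these.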

The Microwave-Assisted Extraction of Fats from Irradiated Meat Products for the Detection of Radiation-Induced Hydrocarbons

  • Kwon, Joong-Ho;Kausar, Tusneem;Lee, Jeong-Eun;Kim, Hyun-Ku;Ahn, Dong-U
    • Food Science and Biotechnology
    • /
    • Vol. 16, No. 1
    • /
    • pp.150-153
    • /
    • 2007
  • Hydrocarbons have been successfully used as chemical markers to distinguish irradiated from non-irradiated foods. The method for determining hydrocarbons consists of extraction of fats, followed by separation of hydrocarbons by florisil column chromatography, and then identification of hydrocarbons by GC/MS. The current solvent extraction method for fats has limitations with regard to extraction time and solvent consumption. Commercial hams and sausages were irradiated at 0 and 5 kGy, and the efficiency of microwave-assisted extraction (MAE) and conventional solvent extraction (CSE) in extracting radiation-induced hydrocarbons from the meat products was compared. Significant levels of hydrocarbons, mainly composed of 1,7-hexadecadiene, 1,7,10-hexadecatriene, and 6,9-heptadecadiene, were detected in the extracts from irradiated hams and sausages by both the CSE and MAE methods. Both methods were acceptable for extracting hydrocarbons from the samples, but the MAE method reduced the amount of solvent required from 150 mL (CSE) to 50 mL and the extraction time from 23 min (CSE) to 5 min.