• Title/Abstract/Keywords: Data Set Records

197 search results (processing time 0.024 seconds)

Diagnosing Vocal Disorders using Cobweb Clustering of the Jitter, Shimmer, and Harmonics-to-Noise Ratio

  • Lee, Keonsoo;Moon, Chanki;Nam, Yunyoung
    • KSII Transactions on Internet and Information Systems (TIIS) / Vol. 12, No. 11 / pp.5541-5554 / 2018
  • The voice is one of the most significant non-verbal elements of communication. Disorders of the vocal organs, or habitual muscular settings for articulation, cause vocal disorders. Therefore, by analyzing vocal disorders, it is possible to predict vocal diseases. In this paper, a method of predicting vocal disorders using the jitter, shimmer, and harmonics-to-noise ratio (HNR) extracted from voice recordings is proposed. To extract jitter, shimmer, and HNR, one-second voice signals are recorded at 44.1 kHz. In an experiment, 151 voice records were collected. The collected data set was clustered using the Cobweb clustering method, which produced 21 classes with 12 leaves. According to the semantics of jitter, shimmer, and HNR, the class whose centroid has the lowest jitter and shimmer and the highest HNR becomes the normal vocal group. The risk of vocal disorders can be predicted by measuring the distance and direction between the centroids.
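The centroid comparison described in this abstract can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not say which distance measure is used, so Euclidean distance is assumed here, and the function names (`find_normal`, `offset_from`) and the centroid values are hypothetical, not taken from the paper.

```python
import math

FEATURES = ("jitter", "shimmer", "hnr")

def find_normal(centroids):
    """Return the name of the centroid with the lowest jitter and shimmer
    and the highest HNR, or None if no single centroid meets all three."""
    min_jitter = min(c["jitter"] for c in centroids.values())
    min_shimmer = min(c["shimmer"] for c in centroids.values())
    max_hnr = max(c["hnr"] for c in centroids.values())
    for name, c in centroids.items():
        if (c["jitter"] == min_jitter and c["shimmer"] == min_shimmer
                and c["hnr"] == max_hnr):
            return name
    return None

def offset_from(reference, other):
    """Return (distance, direction) from the reference centroid to another.
    Euclidean distance is an assumption; the abstract names no metric."""
    direction = tuple(other[f] - reference[f] for f in FEATURES)
    distance = math.sqrt(sum(d * d for d in direction))
    return distance, direction

# Hypothetical centroids, for illustration only.
centroids = {
    "A": {"jitter": 0.5, "shimmer": 2.0, "hnr": 24.0},
    "B": {"jitter": 1.5, "shimmer": 5.0, "hnr": 20.0},
    "C": {"jitter": 2.5, "shimmer": 6.0, "hnr": 12.0},
}

normal = find_normal(centroids)
distance, direction = offset_from(centroids[normal], centroids["B"])
```

A larger distance from the normal centroid, in the direction of higher jitter and shimmer and lower HNR, would then be read as higher risk. In practice the three features have different scales, so some normalization would likely be needed before distances are compared.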

EDMS and Life-cycle of Records

  • 김익한
    • 기록학연구 / No. 5 / pp.3-37 / 2002
  • Today the EDMS market is estimated at more than 100 billion won, which signals the coming of age of electronic records. Traditional archival theories, which are based on paper records, are confronted with a new challenge. In some countries that lead archival studies, distinguished specialists such as Bearman and Hedstrom have been trying to reorient archives management for the past 10 years. As a consequence, a new paradigm of archival theory has been developed. In Korea, too, this new paradigm has been introduced by experts such as Lee, Sang-Min, Sul, Moon-won, and Lee, Seung-Eok. However, their arguments are too general to offer a concrete clue to the new paradigm. Faced with the new age of electronic records, it is important to start discussing reasonable methods of electronic records management at once. The part of records management most drastically changed by electronic technology is the life-cycle of records. The commonly practiced three-stage life-cycle is to be reduced to a two-stage life-cycle, and the concept of the spatial movement of records is to change. It can also be pointed out that the public emerges as a user from the early creation stage of records, beyond time and space. Thus it can be said that the management method becomes dynamic and cohesive. The method of appraisal must also be changed and reworked so that it reflects the various levels involved in the dynamics of electronic records. Presumably, it will be a core factor driving the change of methodology in records management along with the change in life-cycle theory. It must be noted that various subjects will be involved in the work of classification and description over time and space, and that feedback between them is important. Description also tends to be made at the creation stage of records and structured dynamically. This results from the change in the life-cycle and the introduction of the concept of the continuum.
This trend allows us to start discussions on the assumption that descriptions by both creators and archival professionals play an important role together. Of course, this is linked with a methodology in which most descriptions are made automatically at the early stage of drafting the structure. Metadata is formed on the assumption that there should be feedback among automatic description, description by creators, and description by archival professionals. The most important thing in description is to develop a suitable way to structure it. An alternative must be offered for managing data sets. As iweb, operated by Myongji University, shows, records created in daily business are managed not as electronic records but as a database. This is because they exist outside the repository in the EDMS. Since a data set often has various sources, an alternative for classification needs to be developed. For now, the database is likely to be filed by year of creation and transferred automatically to the repository. In the long term, the total management of databases, electronic records, and electronic information will become a topic. The right direction for the new paradigm will be found for both iweb and E-government when practice and theoretical studies are combined and interact.

Different penalty methods for assessing interval from first to successful insemination in Japanese Black heifers

  • Setiaji, Asep;Oikawa, Takuro
    • Asian-Australasian Journal of Animal Sciences / Vol. 32, No. 9 / pp.1349-1354 / 2019
  • Objective: The objective of this study was to determine the best approach for handling missing records of first to successful insemination (FS) in Japanese Black heifers. Methods: A total of 2,367 records of heifers born between 2003 and 2015 were used, of which 206 (8.7%), from open heifers, were missing. Four penalty methods based on the number of inseminations were set as follows: C1, FS average according to the number of inseminations; C2, constant number of days, 359; C3, maximum number of FS days for each insemination; and C4, average of FS at the last insemination and FS of C2. C5 was generated by adding a constant number (21 d) to the highest number of FS days in each contemporary group. The bootstrap method was used to compare the 5 methods in terms of bias, mean squared error (MSE), and coefficient of correlation between the estimated breeding value (EBV) of non-censored data and censored data. Three percentages of missing records (5%, 10%, and 15%) were investigated using the random censoring scheme. The univariate animal model was used to conduct genetic analysis. Results: Heritability of FS in non-censored data was $0.012 \pm 0.016$, slightly lower than the average estimate from the five penalty methods. C1, C2, and C3 showed lower standard errors of estimated heritability but demonstrated inconsistent results for different percentages of missing records. C4 showed moderate standard errors that were more stable across all percentages of missing records, whereas C5 showed the highest standard errors compared with non-censored data. The MSE of C4 heritability was $0.633 \times 10^{-4}$, $0.879 \times 10^{-4}$, $0.876 \times 10^{-4}$, and $0.866 \times 10^{-4}$ for 5%, 8.7%, 10%, and 15% of missing records, respectively. Thus, C4 showed the lowest and most stable MSE of heritability; the coefficient of correlation for EBV was 0.88, 0.93, and 0.90 for heifer, sire, and dam, respectively.
Conclusion: C4 demonstrated the highest positive correlation with the non-censored data set and was consistent across different percentages of missing records. We concluded that C4 was the best penalty method for missing records because of the stable values of the estimated parameters and the highest coefficient of correlation.
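The five penalty methods listed under Methods can be sketched roughly as below. This is one reading of the abstract, not the authors' implementation: the function name `penalties`, its arguments, and the sample numbers are hypothetical, and the interpretation of C1 and C3 as the mean and maximum FS among heifers with the same number of inseminations is an assumption.

```python
from statistics import mean

C2_DAYS = 359    # constant penalty stated in the abstract (C2)
C5_OFFSET = 21   # days added to the group maximum, per the abstract (C5)

def penalties(n_insem, fs_at_last, observed, group_fs):
    """Return a candidate penalty value (in days) for each method C1..C5.

    n_insem    -- number of inseminations of the heifer with a missing FS
    fs_at_last -- days from first to last insemination for that heifer
    observed   -- (n_inseminations, fs_days) pairs from complete records
    group_fs   -- observed FS days in the heifer's contemporary group
    Raises an error if no complete record shares n_insem.
    """
    same_n = [fs for n, fs in observed if n == n_insem]
    return {
        "C1": mean(same_n),                  # assumed: mean FS for same count
        "C2": C2_DAYS,                       # constant number of days
        "C3": max(same_n),                   # assumed: max FS for same count
        "C4": (fs_at_last + C2_DAYS) / 2,    # average of last FS and C2
        "C5": max(group_fs) + C5_OFFSET,     # group maximum plus 21 d
    }

# Hypothetical records, for illustration only.
observed = [(2, 20), (2, 40), (3, 63)]
result = penalties(n_insem=2, fs_at_last=21, observed=observed,
                   group_fs=[0, 40, 63])
```

With these made-up inputs, C4 averages 21 and 359 days, which shows how it sits between the heifer's own last observed interval and the fixed C2 penalty.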

A Study on the Transfer Process and Method for Administrative Information System Records: Focusing on the Case of the Nuclear Safety and Security Commission's MIDAS RASIS RI/RG Business Records

  • 황진현;박종연;이태훈;임진희
    • 한국기록관리학회지 / Vol. 14, No. 3 / pp.7-32 / 2014
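  • The purpose of this study is to find a way to preserve and manage digital information that takes the form of data sets. To this end, the case of RI/RG business records produced in MIDAS RASIS of the Nuclear Safety and Security Commission was analyzed. For the analysis of MIDAS RASIS, a checklist of records management functional requirements was drawn up; based on the checklist results, the records management functions of MIDAS RASIS were reviewed and a transfer process was proposed. In addition, a records management module DB for managing MIDAS RASIS records was designed, and a method of transferring the records to the standard records management system was proposed.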

An Efficient Multi-Layer Encryption Framework with Authentication for EHR in Mobile Crowd Computing

  • kumar, Rethina;Ganapathy, Gopinath;Kang, GeonUk
    • International journal of advanced smart convergence / Vol. 8, No. 2 / pp.204-210 / 2019
  • Mobile Crowd Computing is one of the most efficient and effective ways to collect electronic health records and to process them intelligently. Mobile Crowd Computing can handle, analyze, and process huge volumes of Electronic Health Records (EHR) from the high-performance cloud environment. Electronic health records are very sensitive, so they need to be secured, authenticated, and processed efficiently. However, the security, privacy, and authentication of Electronic Health Records (EHR) and Patient Health Records (PHR) in the Mobile Crowd Computing environment have become a critical issue that restricts many healthcare services from using crowd computing services. Our proposed Efficient Multi-layer Encryption Framework (MLEF) applies a set of multiple security algorithms to provide cost-efficient access control, integrity, confidentiality, privacy, and authentication for EHR and PHR. Our system provides an efficient way to create an environment capable of capturing, storing, searching, sharing, analyzing, and authenticating electronic healthcare records, so that the right intervention reaches the right patient at the right time in the Mobile Crowd Computing environment.

Evaluating Records and Their Descriptive Elements in the Records Management of Korea on the Basis of the Characteristics of a Record and Recordkeeping Metadata Standards

  • 김익한
    • 기록학연구 / No. 10 / pp.3-26 / 2004
  • ISO 15489:2001 addresses the principles and requirements with which organizations, both public and private, should comply in managing their records to ensure that adequate records are created, captured, and managed. The standard defines the characteristics that a record should have within a records management system as follows: authenticity, reliability, integrity, and usability. Authenticity means that a record can be proven to be what it purports to be, to have been created or sent by the person purported to have created or sent it, and to have been created or sent at the time purported. Reliability means that the contents of a record can be trusted as a full and accurate representation of the transactions, activities, or facts to which they attest and can be depended upon in the course of subsequent transactions or activities. Integrity refers to ensuring that a record is complete and unaltered. Usability means that records can be located, retrieved, presented, and interpreted. In order to have these characteristics, a record should be persistently linked to the metadata necessary to document a transaction. Metadata is "data describing context, content and structure of records and their management through time." Metadata ensure the creation and maintenance of authentic, reliable, and usable records and the protection of the integrity of those records. This can be implemented by creating and capturing records management metadata in the systems that create and manage records. There have been several projects and standards initiatives to identify a core set of records management metadata. These include the Australian Recordkeeping Metadata Standard and the British Metadata Standard, which is part of the Requirements for Electronic Records Management System. ISO/TS 23081-1 was recently published to implement metadata requirements within the framework of ISO 15489.
The public records management system in Korea is governed by the Act on the Management of Archives by Public Agencies and the Administrative Records Management Regulation. This article evaluates records and their descriptive elements captured and maintained by the records management system in Korea on the basis of the international metadata standards.

A Study on the Present State and Improvement of National Museum Records Management System

  • 장현종
    • 한국기록관리학회지 / Vol. 8, No. 2 / pp.153-179 / 2008
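  • This study starts from the recognition that museum records need systematic management and preservation from the creation stage. It examines the current state of records management in national museums, identifies problems, and proposes desirable approaches to records management. A case study was conducted on the National Museum of Korea and the 11 regional national museums under its supervision, and the necessary data were collected through a literature review, interviews with the staff in charge, and a questionnaire survey. The current state of records management was examined in five stages: creation and registration, classification and arrangement, appraisal and disposal, transfer, and use. The study also looked in detail at records management personnel, perceptions of records management work, and the state of records management facilities and equipment. The analysis confirmed that desirable management of museum records requires assigning professional staff and changing perceptions, expanding facilities and equipment, establishing classification criteria suited to museum work, arranging unprocessed records and checking their condition, and preparing preservation measures.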

Families of Distributions Arising from Distributions of Ordered Data

  • Ahmadi, Mosayeb;Razmkhah, M.;Mohtashami Borzadaran, G.R.
    • Communications for Statistical Applications and Methods / Vol. 22, No. 2 / pp.105-120 / 2015
  • A large family of distributions arising from distributions of ordered data is proposed, which contains other models studied in the literature. This extension subsumes many cases of weighted random variables, such as order statistics, records, k-records, and many others. Such a distribution can be used for modeling data that are not identically distributed. Some properties of the theoretical model, such as moments, mean deviation, entropy criteria, symmetry, and unimodality, are derived. The problem of parameter estimation is also studied, and maximum likelihood estimators are derived for a weighted gamma distribution. Finally, it is shown that the proposed model is the best among the previously introduced distributions for modeling a real data set.

A Hybrid SVM Classifier for Imbalanced Data Sets

  • 이재식;권종구
    • 지능정보연구 / Vol. 19, No. 2 / pp.125-140 / 2013
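  • A data set in which the number of records belonging to one class is much larger than the number of records belonging to the other classes is called an 'imbalanced data set'. Many techniques used for data classification perform poorly on such imbalanced data. When evaluating the performance of a technique, sensitivity and specificity should be measured along with accuracy. In a customer churn prediction problem, 'retention' records form the majority class and 'churn' records form the minority class. Sensitivity is the proportion of records that are actually 'retention' and are predicted as 'retention', and specificity is the proportion of records that are actually 'churn' and are predicted as 'churn'. Many data mining techniques perform poorly on imbalanced data precisely because specificity, the accuracy on the minority class, is low. Among previous studies addressing imbalanced data sets, some oversample the minority class to create a balanced data set and then apply data mining techniques. Predicting with such a balanced data set improves specificity somewhat, but sensitivity drops in exchange. In this study, we developed a model that improves specificity while maintaining sensitivity. The developed model is a hybrid model composed of a Support Vector Machine (SVM), an artificial neural network (ANN), and a decision tree, and is named the Hybrid SVM Model. The construction and prediction processes are as follows. The SVM_I Model and the ANN_I Model are built from the original imbalanced data set. A balanced data set is generated by oversampling from the imbalanced data set, and the SVM_B Model is built from it. The SVM_I Model is superior in sensitivity, and the SVM_B Model is superior in specificity. If SVM_I and SVM_B produce the same prediction for an input record, that prediction is taken as the final solution. For records on which SVM_I and SVM_B produce different predictions, the final solution is determined through a discrimination process aided by the ANN and the decision tree. For these records, a decision tree model is built with the ANN_I output value as the input attribute and the actual churn status as the target attribute. As a result, the following two discrimination rules were obtained: 'IF ANN_I output value < 0.285, THEN Final Solution = Retention' and 'IF ANN_I output value $\geq 0.285$, THEN Final Solution = Churn'. The threshold value of 0.285 in these rules was derived by optimizing for the data used in this study. What this study presents is the structure of the Hybrid SVM Model, not a specific threshold value, so the threshold may vary freely depending on the target data. The performance of the Hybrid SVM Model was evaluated using the Churn data set provided by the UCI Machine Learning Repository. The accuracy of the Hybrid SVM Model was 91.08%, higher than that of the SVM_I Model or the SVM_B Model. The sensitivity of the Hybrid SVM Model was 95.02% and its specificity was 69.24%. The sensitivity of the SVM_I Model was 94.65%, and the specificity of the SVM_B Model was 67.00%. Therefore, the Hybrid SVM Model developed in this study maintained the sensitivity level of the SVM_I Model while outperforming the SVM_B Model in specificity.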
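The final decision step of the Hybrid SVM Model, together with the sensitivity and specificity definitions used in this abstract, can be sketched as follows. The three underlying models are not trained here: their outputs are passed in as plain values, and the function names are hypothetical. The threshold 0.285 is the value reported for the authors' data and, as the abstract notes, would differ for other data.

```python
RETENTION = "Retention"
CHURN = "Churn"

def hybrid_predict(svm_i_label, svm_b_label, ann_i_output, threshold=0.285):
    """Combine the two SVM predictions as described in the abstract.

    If SVM_I and SVM_B agree, their shared label is the final solution.
    Otherwise the decision-tree rule on the ANN_I output decides.
    """
    if svm_i_label == svm_b_label:
        return svm_i_label
    return RETENTION if ann_i_output < threshold else CHURN

def sensitivity_specificity(actual, predicted):
    """Sensitivity: share of actual 'Retention' predicted as 'Retention'.
    Specificity: share of actual 'Churn' predicted as 'Churn'."""
    pairs = list(zip(actual, predicted))
    retained = [p for a, p in pairs if a == RETENTION]
    churned = [p for a, p in pairs if a == CHURN]
    sensitivity = retained.count(RETENTION) / len(retained)
    specificity = churned.count(CHURN) / len(churned)
    return sensitivity, specificity

# Hypothetical labels, for illustration only.
actual = [RETENTION, RETENTION, RETENTION, CHURN]
predicted = [
    hybrid_predict(RETENTION, RETENTION, 0.90),  # models agree
    hybrid_predict(RETENTION, CHURN, 0.10),      # disagree, below threshold
    hybrid_predict(CHURN, RETENTION, 0.40),      # disagree, above threshold
    hybrid_predict(CHURN, CHURN, 0.05),          # models agree
]
sens, spec = sensitivity_specificity(actual, predicted)
```

Note that the majority class ('Retention') is treated as the positive class here, following the abstract's definitions, which is the reverse of the convention in many churn studies.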

A Study on the Development of FRBR Algorithm for KORMARC Bibliographic Record

  • 김정현;이성숙;이유정
    • 한국도서관정보학회지 / Vol. 46, No. 1 / pp.1-23 / 2015
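  • The purpose of this study is to analyze KORMARC bibliographic records by type and to develop a retrieval algorithm that applies FRBR. To this end, the strengths and weaknesses of FRBR implementation algorithms developed in Korea and abroad, including those of OCLC and LC, were compared and used as the basis for this study, and KORMARC experimental data extracted from the national bibliographic records of the National Library of Korea were analyzed to extract identifying elements for the four FRBR bibliographic entities. To cluster related works by work set, an algorithm was designed that can create authorized access points for works, such as 'author + title'. Based on the analysis of the national bibliographic records, the study also proposes that, in order to apply the FRBR algorithm to KORMARC bibliographic records, existing bibliographic records should be cleaned up and the input level of records should be strengthened overall.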
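The idea of clustering records into work sets through an 'author + title' access point can be illustrated with a small sketch. It is not the algorithm developed in the study: the record fields (`id`, `author`, `title`), the normalization rule, and the sample records are hypothetical, and real KORMARC data would require reading the relevant MARC fields and handling variant forms of names and titles.

```python
from collections import defaultdict

def work_key(author, title):
    """Build a simple 'author+title' key (an assumed normalization:
    lowercase and collapse runs of whitespace)."""
    def normalize(text):
        return " ".join(text.lower().split())
    return f"{normalize(author)}+{normalize(title)}"

def cluster_by_work(records):
    """Group record ids that share the same 'author+title' key."""
    works = defaultdict(list)
    for record in records:
        works[work_key(record["author"], record["title"])].append(record["id"])
    return dict(works)

# Hypothetical bibliographic records, for illustration only.
records = [
    {"id": 1, "author": "Doe, Jane", "title": "An Example Work"},
    {"id": 2, "author": "doe, jane", "title": "An  Example Work"},
    {"id": 3, "author": "Roe, Richard", "title": "Another Work"},
]
work_sets = cluster_by_work(records)
```

Records 1 and 2 fall into the same work set despite differences in case and spacing, which hints at why the abstract calls for cleaning up existing records before applying such an algorithm.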