• Title/Abstract/Keywords: Data Cleaning

422 search results (processing time 0.022 seconds)

Exposure assessment of musculoskeletal disorder risk factors in non-routinized work: An application of PATH-KOSHA observational tool to hospital workers

  • 박정근;한영선
    • 한국산업보건학회지 / Vol. 19, No. 4 / pp.412-422 / 2009
  • This study assessed exposure to musculoskeletal disorder (MSD) risk factors in hospital personnel who performed non-routinized work tasks. A tool (the "PATH-KOSHA" version) was newly revised from the PATH (Posture, Activity, Tools and Handling) method and loaded onto a personal digital assistant (PDA). This version was used, on the basis of direct observation, to collect PATH data at two hospital settings in different regions. Job analysis was also performed to obtain further information (e.g., work and rest time, task type). The collected data were visually checked for data cleaning and stored for later analysis. A total of 1,992 PATH observations were made for 37 hospital workers. Exposure levels varied across 18 items of the MSD risk factors. The highest percent time spent in non-neutral postures was 53% for wrist deviation, followed by 47% (pinch grip), 35% (trunk posture), 23% (neck posture), and 20% (shoulder/arm posture). The highest percent time among hand activity level (HAL) variables was 55% for HAL-cat2 (HAL: 3.3 - <6.7). The percent time for both loads of more than 5 kg and contact stress was less than 4%. The study workers were not exposed to vibration. Various aspects of the findings are discussed. The results showed that wrist deviation had the highest percent time among awkward postures, while HAL-cat2 was highest for hand repetition. The study suggests that distal upper extremity posture and HAL should be the primary targets for control in non-routinized work, including hospital settings.
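The percent-time figures in this abstract come from work sampling: each PATH observation is a snapshot, so the share of observations falling in a category estimates the share of work time spent in it. A minimal sketch of that tally follows; the function name, category labels, and sample observations are hypothetical and are not data from the study.

```python
from collections import Counter

def percent_time(observations):
    """Return the percentage of observations that fall in each category."""
    if not observations:
        raise ValueError("at least one observation is required")
    counts = Counter(observations)
    total = len(observations)
    return {category: 100.0 * n / total for category, n in counts.items()}

# Hypothetical wrist-posture codes from 8 snapshots of one worker.
wrist = ["deviated", "neutral", "deviated", "deviated",
         "neutral", "deviated", "neutral", "neutral"]
shares = percent_time(wrist)
```

With 1,992 observations, as reported, the same tally would be run once per risk-factor item.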

Estimation of Completeness of Cancer Registration for Patients Referred to Shiraz Selected Centers through a Two Source Capture Re-capture Method, 2009 Data

  • Sharifian, Roxana;SedaghatNia, Mohammad Hossein;Nematolahi, Mohtram;Zare, Najaf;Barzegari, Saeed
    • Asian Pacific Journal of Cancer Prevention / Vol. 16, No. 13 / pp.5549-5556 / 2015
  • Background: Cancer has important social consequences, and cancer registration is the basis for moving towards prevention. The present study aimed to estimate the completeness of registration of the ten most common cancers in patients referred to selected hospitals in Shiraz, Iran, using the capture-recapture method. Materials and Methods: This cross-sectional analytical study was performed in 2014 on 2009 data covering a total of 4,388 registered cancer patients. After the data from the two sources were cleaned, cases common to both sources were identified using capture-recapture. The percentage completeness of cancer registration was then estimated using the Chapman and Chao methods. Finally, the effects of demographic and treatment variables on the completeness of cancer registration were investigated. Results: The percentages of completeness of cancer registration in the selected hospitals of Shiraz were 58.6% and 58.4%, and were influenced by different variables. Registration was highest in the 40-49 age group and lowest in the under-20 age group. Breast cancer had the highest registration level, followed by thyroid and lung cancers, while colorectal cancer had the lowest. Conclusions: According to the results, the number of cancers registered was very low, and it seems that factors such as inadequate knowledge among some doctors, imprecise diagnosis of cancer types, incorrectly completed medical documents, and insufficient accuracy in recording data on the computer cause errors and defects in cancer registration. This suggests a need to train doctors and other medical workers in documenting cancer-related information and to take additional measures to improve the cancer registration system.
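For readers unfamiliar with two-source capture-recapture, the sketch below shows Chapman's estimator, N = (n1 + 1)(n2 + 1) / (m + 1) - 1, where n1 and n2 are the case counts in each source and m is the number of cases found in both. Completeness is then the number of distinct observed cases divided by the estimated total. The counts used are hypothetical, not the study's data, and Chao's estimator, which the abstract also mentions, is not shown.

```python
def chapman_estimate(n1, n2, m):
    """Chapman's two-source capture-recapture estimate of the total case count."""
    if min(n1, n2, m) < 0 or m > min(n1, n2):
        raise ValueError("counts must be non-negative and m <= min(n1, n2)")
    return (n1 + 1) * (n2 + 1) / (m + 1) - 1

def completeness(n1, n2, m):
    """Share of the estimated total that appears in at least one source."""
    observed = n1 + n2 - m  # distinct cases across both sources
    return observed / chapman_estimate(n1, n2, m)

# Hypothetical counts: 300 cases in source A, 200 in source B, 100 in both.
total = chapman_estimate(300, 200, 100)
share = completeness(300, 200, 100)
```

The matching step that produces m depends on clean identifiers in both sources, which is why data cleaning precedes the estimate.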

Pre-Processing of Query Logs in Web Usage Mining

  • Abdullah, Norhaiza Ya;Husin, Husna Sarirah;Ramadhani, Herny;Nadarajan, Shanmuga Vivekanada
    • Industrial Engineering and Management Systems / Vol. 11, No. 1 / pp.82-86 / 2012
  • For the past few years, query log data have been collected to study users' behavior on a site. Many studies have used query logs to extract user preferences, recommend personalization, improve caching and pre-fetching of Web objects, build better adaptive user interfaces, and improve Web search for search engine applications. A query log contains data such as the client's IP address, the time and date of the request, the resource or page requested, the status of the request, the HTTP method used, and the type of browser and operating system. A query log can offer valuable insight into web site usage. Proper compilation and interpretation of query logs can provide baseline statistics that indicate the usage levels of a website and can serve as a tool to assist decision making in management activities. In this paper we discuss the tasks performed on query logs in the pre-processing stage of web usage mining. We use query logs from an online newspaper company. The query logs undergo a pre-processing stage in which the clickstream data is cleaned and partitioned into a set of user interactions that represent the activities of each user during their visits to the site. The essential pre-processing tasks are data cleaning and user identification.
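The two pre-processing tasks named in this abstract, data cleaning and user identification, can be sketched as follows. The abstract does not give the log layout, so the Combined Log Format is assumed here; the filtering rules (successful GET requests only, static resources dropped) and the (IP, user agent) grouping key are common heuristics, not necessarily the authors' choices. The sample lines are invented.

```python
import re
from collections import defaultdict

# Assumed Combined Log Format; the paper does not specify the log layout.
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<url>\S+) [^"]*" (?P<status>\d{3}) \S+ '
    r'"[^"]*" "(?P<agent>[^"]*)"'
)
STATIC_SUFFIXES = (".gif", ".jpg", ".jpeg", ".png", ".css", ".js", ".ico")

def clean(lines):
    """Yield successful GET page requests; skip malformed lines and static files."""
    for line in lines:
        match = LOG_PATTERN.match(line)
        if match is None:
            continue
        rec = match.groupdict()
        if rec["method"] != "GET" or rec["status"] != "200":
            continue
        if rec["url"].lower().split("?")[0].endswith(STATIC_SUFFIXES):
            continue
        yield rec

def identify_users(records):
    """Group requested URLs by (IP, user agent) as a stand-in for user identity."""
    users = defaultdict(list)
    for rec in records:
        users[(rec["ip"], rec["agent"])].append(rec["url"])
    return dict(users)

sample = [
    '10.0.0.1 - - [01/Jan/2012:10:00:00 +0800] "GET /news/1.html HTTP/1.1" 200 512 "-" "Mozilla/5.0"',
    '10.0.0.1 - - [01/Jan/2012:10:00:01 +0800] "GET /img/logo.png HTTP/1.1" 200 900 "-" "Mozilla/5.0"',
    '10.0.0.1 - - [01/Jan/2012:10:00:05 +0800] "GET /news/2.html HTTP/1.1" 200 700 "-" "Opera/9.80"',
    '10.0.0.2 - - [01/Jan/2012:10:01:00 +0800] "GET /missing HTTP/1.1" 404 0 "-" "Mozilla/5.0"',
]
users = identify_users(clean(sample))
```

Here the same IP address with two different browsers is treated as two users, and the image request and the 404 response are discarded during cleaning.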

Estimation of Water Quality of Fish Farms using Multivariate Statistical Analysis

  • Ceong, Hee-Taek;Kim, Hae-Ran
    • Journal of information and communication convergence engineering / Vol. 9, No. 4 / pp.475-482 / 2011
  • In this research, we attempted to estimate the water quality of fish farms in terms of parameters such as water temperature, dissolved oxygen, pH, and salinity by employing observational data obtained from a coastal ocean observatory of a national institution located close to the fish farm. We requested and received marine data comprising nine factors, including water temperature, from the Korea Hydrographic and Oceanographic Administration. To verify our results, we also established an experimental fish farm in which we placed a self-cleaning optical sensor module, the YSI-6920V2, directly inside the fish tanks and used the data measured and recorded by an environment monitoring system that communicated serially with the sensor module. We investigated the differences in water temperature and salinity among three areas: Goheung Balpo, Yeosu Odongdo, and the experimental fish farm, Keumho. Water temperature did not exhibit significant differences, but there was a difference in salinity (significance <5%). Further, multiple regression analysis was performed to estimate the water quality of the fish farm at Keumho based on the data of Goheung Balpo. The water temperature and dissolved-oxygen estimations had multiple linear regression relationships with coefficients of determination of 98% and 89%, respectively. However, for the pH and salinity estimated using the oceanic environment with nine factors, the adjusted coefficient of determination was very low, at less than 10%, and it was therefore difficult to predict the values. We plotted the predicted and measured values by employing the estimated regression equation and found them to fit very well; the values were close to the regression line. We have demonstrated that if well-fitting statistical model equations are used, the expense of fish-farm sensor and system installation, maintenance, and repair, which is a major issue with existing environmental information monitoring systems in marine farming areas, can be reduced, thereby making it easier for fish farmers to monitor aquaculture and mariculture environments.
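The coefficients of determination quoted in this abstract measure how closely predicted values track measured ones. The sketch below computes R^2 and its adjusted form from paired values in plain Python; the function names and sample numbers are illustrative assumptions, not the study's data or code.

```python
def r_squared(measured, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(measured) != len(predicted) or len(measured) < 2:
        raise ValueError("need two equal-length series with at least 2 points")
    mean = sum(measured) / len(measured)
    ss_res = sum((m - p) ** 2 for m, p in zip(measured, predicted))
    ss_tot = sum((m - mean) ** 2 for m in measured)
    return 1.0 - ss_res / ss_tot

def adjusted_r_squared(r2, n_samples, n_predictors):
    """Penalize R^2 for the number of predictors used in the regression."""
    dof = n_samples - n_predictors - 1
    if dof <= 0:
        raise ValueError("need more samples than predictors + 1")
    return 1.0 - (1.0 - r2) * (n_samples - 1) / dof

# Hypothetical measured and predicted water temperatures.
measured = [1.0, 2.0, 3.0, 4.0]
predicted = [1.1, 1.9, 3.2, 3.8]
r2 = r_squared(measured, predicted)
adj = adjusted_r_squared(r2, n_samples=len(measured), n_predictors=1)
```

With nine predictors, as in the study, the adjustment matters more: the penalty grows as the number of predictors approaches the number of samples.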

Screening of Workers with Presumed Occupational Methanol Poisoning: The Applicability of a National Active Occupational Disease Surveillance System

  • Eom, Huisu;Lee, Jihye;Kim, Eun-A
    • Safety and Health at Work / Vol. 10, No. 3 / pp.265-274 / 2019
  • Background: Methyl alcohol poisoning in mobile phone-manufacturing factories during 2015-2016 was caused by methyl alcohol use for cleaning in computerized numerical control (CNC) processes. To determine whether there were health complications in other workers involved in similar processes, the Occupational Safety and Health Research Institute conducted a survey. Methods: We established a national active surveillance system by collaborating with the Ministry of Employment and Labor and National Health Insurance Service. Employment and national health insurance data were used. Overall, 12,048 employees of major domestic mobile phone companies and CNC process dispatch workers were surveyed from 2016 to 2017. We investigated methyl alcohol poisoning by using the national health insurance data. Questionnaires were used to investigate diseases due to methyl alcohol poisoning. Results: Overall, 24.9% of dispatched workers were employed in at least five companies, and 23.9% of dispatched workers had missing employment insurance history data. The prevalence of blindness including visual impairment, optic neuritis, visual disturbances, and alcohol toxicity in the study participants was higher than that reported in the national health insurance database (0.02%, 0.07%, 0.23%, and 0.03% versus 0.01%, 0.07%, 0.13%, and 0.01%, respectively, in 2015). Moreover, 430 suspicious workers were identified; 415 of these provided an address and phone number, of whom 48 responded (response rate, 11.6%). Among the 48 workers, 10 had diseases at the time of the survey, of whom 3 workers were believed to have diseases related to methyl alcohol exposure. Conclusion: This study revealed that active surveillance data can be used to assess health problems related to methyl alcohol poisoning in CNC processes and dispatch workers.

A Study on Energy Usage Monitoring and Saving Method in the Sewage Treatment Plant

  • 김종락;이가희;유광태;김동윤;이호식
    • 한국물환경학회지 / Vol. 36, No. 6 / pp.535-545 / 2020
  • This study aims to monitor and conserve energy use in public sewage treatment plants by utilizing data from the SCADA system and by controlling the aeration rate required for maintaining effluent water quality. Power consumption in the sewage treatment process was predicted using the equipment's uptime, efficiency, and inherent power consumption. The predicted energy consumption was calibrated against measured data. Additionally, energy efficiency indicators were proposed based on statistical data for energy use, capacity, and effluent quality. In one case study, a sewage treatment plant operated via the SBR process used ~30% of its energy to maintain the bioreactors and treated water tanks (including the decanting pump and cleaning systems). Energy consumption analysis with the K-ECO Tool-kit was conducted for each unit process. The results showed that about 58.7% of total energy consumed was used by rotating equipment in preliminary and biological treatment, such as the blower and pump. The remaining shares, in decreasing order, were 19.2% for the phosphorus removal process, 16.0% for sludge treatment, and 6.1% for disinfection and discharge. In terms of equipment energy usage, feeding and decanting pumps accounted for 40% of total energy consumed, followed by 27% for blowers. By controlling the aeration rate with the proposed feedback control system, the DO concentration was reduced by 56% compared with pre-control levels and the aeration amount decreased by 28%. The overall power consumption of the plant was reduced by 6% via aeration control.

Status and Quality Analysis on the Biodiversity Data of East Asian Vascular Plants Mobilized through the Global Biodiversity Information Facility (GBIF)

  • 장진성;권신영;김휘
    • 한국산림과학회지 / Vol. 110, No. 2 / pp.179-188 / 2021
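  • Biodiversity informatics applies information science to biodiversity information; it builds and uses primary species occurrence data on the basis of species information, including scientific names comprising accepted names and synonyms. In this study, the quality of Northeast Asian data published to the Global Biodiversity Information Facility (GBIF) was evaluated with the BRAHMS program against the criterion of fitness for use of biodiversity information, which confirmed the need for biodiversity data cleaning. Data published by Korean biodiversity institutions such as the National Institute of Biological Resources, the National Institute of Ecology, and the Korea National Arboretum, as well as data published from Japan, China, and Taiwan, contain errors in scientific names, geographic information, collectors, and dates owing to problems in the data cleaning process. These errors in basic attribute data arise because biodiversity management institutions in East Asia use unstructured databases and flat, spreadsheet-style information. Given the nature of biodiversity information, when the varied information is not structured, data integrity for scientific names, personal names, place names, literature, and ecological information cannot be ensured. To overcome the problems of biodiversity information management in East Asia, it is necessary to structure the data, improve understanding of data cleaning, and train professional taxonomists who serve as long-term data managers for error correction. Biodiversity information managers need measures to prevent future errors, such as revising and adding to documented management guidelines based on analysis of error causes, and must apply these measures to their systems. All of these processes should be carried out and recorded on the basis of a database. Rather than the current simple data structures, biodiversity publishers in East Asia need to use specialized advanced programs for biodiversity information management, or to develop databases of an equivalent level of sophistication.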
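The error classes named in this abstract (scientific name, geographic information, collector, date) lend themselves to simple record-level checks. The sketch below is only an illustration: the study used the BRAHMS program, not this code. The field names follow Darwin Core terms, the rules are deliberately crude assumptions, and the function name `find_issues` and the sample record are hypothetical.

```python
from datetime import date

def find_issues(record):
    """Return a list of basic quality problems found in one occurrence record."""
    issues = []
    name = (record.get("scientificName") or "").strip()
    if len(name.split()) < 2:  # crude check: expect at least genus + epithet
        issues.append("scientificName is not a binomial")
    try:
        lat = float(record["decimalLatitude"])
        lon = float(record["decimalLongitude"])
        if not (-90 <= lat <= 90 and -180 <= lon <= 180):
            issues.append("coordinates out of range")
    except (KeyError, TypeError, ValueError):
        issues.append("coordinates missing or not numeric")
    if not (record.get("recordedBy") or "").strip():
        issues.append("recordedBy missing")
    try:
        date.fromisoformat(record.get("eventDate") or "")
    except (TypeError, ValueError):
        issues.append("eventDate is not an ISO date")
    return issues

# Hypothetical record with one problem in each checked field.
bad = {"scientificName": "Pinus", "decimalLatitude": "137.5",
       "decimalLongitude": "127.0", "recordedBy": "",
       "eventDate": "2020/05/01"}
problems = find_issues(bad)
```

Checks like these catch format errors only; resolving synonyms or verifying collectors still needs structured reference data and taxonomic expertise, which is the abstract's main point.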

Direction of Program Development for Supporting U-turn Farmers' Rural Settlement

  • 김성수;정지웅;임형백;고운미;김정태;이성
    • 농촌지도와개발 / Vol. 11, No. 1 / pp.53-65 / 2004
  • The purpose of this study was to provide information for developing educational programs for U-turn farmers based on their rural settlement needs. The specific objectives of the study were 1) to survey the general characteristics of U-turn farmers, including motives, preparation, and education; 2) to investigate the problems and difficulties of U-turn farmers in rural settlement; 3) to identify the reasons for success and failure in U-turn farming; and 4) to provide information for developing programs for U-turn farmers. Data were collected from 526 U-turn farmers throughout the country, and after data cleaning, 494 questionnaires were used for data analysis. Based on the results, the following are recommended for further development of U-turn farming programs: 1) continuous surveys on the motives, preparation, education, and information of U-turn farmers should be facilitated and expanded to keep important information on U-turn farming current; 2) further examination of the problems and difficulties of U-turn farmers is necessary to develop appropriate policies and educational programs for U-turn farming; 3) continuous investigation of the reasons for success and failure in U-turn farming is necessary to develop appropriate agricultural policies; 4) for more effective educational programs for U-turn farmers, the selection of educators, institutions, curricula, timing, etc. should be carefully designed to meet the practical needs of U-turn farmers; and 5) more research should be encouraged to improve program development and implementation for U-turn farming.


A Study on Analysis of the Trend of Blockchain by Key Words Network Analysis

  • 조성환
    • 한국정보전자통신기술학회논문지 / Vol. 11, No. 5 / pp.550-555 / 2018
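  • Using text mining and semantic network analysis, the methods used in keyword network analysis, this study compared news articles that mentioned 'finance', 'energy', and 'logistics', the industrial application fields of blockchain cited in press and government announcements. The aim was to identify and compare differences in article content and keywords across the industrial fields in which blockchain application was mentioned. A total of 43,093 articles reported by the press from January 2017 to July 2018 were collected from Naver News using Python BeautifulSoup, and a cleaning step was carried out to remove mutual duplicates among the three terms. Text mining and semantic network analysis were then performed on the three terms using Textom and UCINET for network analysis between keywords. The analysis showed that all three terms shared similar words on the 'technology' side but differed in content on issues such as 'government policy' and 'industry'. Differences in frequency and centrality were also confirmed.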
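A keyword network of the kind analyzed in this study can be sketched in a few lines: keywords that appear in the same article are linked, link weights count co-occurrences, and degree centrality is the share of other keywords a keyword is linked to. The study itself used Textom and UCINET; the plain-Python version below is an illustrative assumption, and the function names and keyword lists are invented.

```python
from collections import Counter
from itertools import combinations

def cooccurrence_edges(documents):
    """Count how many documents each unordered keyword pair shares."""
    edges = Counter()
    for keywords in documents:
        for a, b in combinations(sorted(set(keywords)), 2):
            edges[(a, b)] += 1
    return edges

def degree_centrality(edges):
    """Fraction of the other keywords that each keyword is linked to."""
    neighbours = {}
    for a, b in edges:
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)
    n = len(neighbours)
    if n < 2:
        return {}
    return {k: len(v) / (n - 1) for k, v in neighbours.items()}

# Hypothetical keyword lists extracted from three articles.
articles = [
    ["blockchain", "finance", "technology"],
    ["blockchain", "energy", "technology"],
    ["blockchain", "logistics"],
]
edges = cooccurrence_edges(articles)
centrality = degree_centrality(edges)
```

Duplicate articles would inflate the edge weights, which is one reason a de-duplication step precedes the network analysis.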

Study on the Speed-Power Characteristics Through a Speed Trial of a Large Container Vessel During a Commercial Voyage Part I

  • 김호;이준형;장진호;안해성;강대열;변상수
    • 대한조선학회논문집 / Vol. 58, No. 6 / pp.366-374 / 2021
  • This paper analyzes speed-power performance at sea using data from a large container vessel provided as a test bed by a shipping company. To perform a speed trial during a commercial voyage, an on-board measuring device and various operation data acquisition systems were mounted on the vessel for long-term performance monitoring, and a voyage operated under a container loading condition close to the design draft was selected. The work consists of Part I and Part II. Part I, this paper, presents the speed trial method and the analysis results for the operating vessel. Part II analyzes how the speed-power characteristics change over time and before and after hull cleaning, using operation data measured on voyages operated under conditions similar to the speed trial.