Search | Korea Science

Hwang, Seung-Yeon;Park, Ji-Hun;Youn, Ha-Young;Kwak, Kwang-Jin;Park, Jeong-Min;Kim, Jeong-Joon
- The Journal of the Institute of Internet, Broadcasting and Communication, v.19 no.1, pp.187-195, 2019
Recent advances in big data technology have made it possible to collect, store, process, and analyze data generated in many fields. Applying these technologies to clinical results analysis and to the optimization of clinical trial design can reduce the costs associated with health care. In this paper, we therefore analyze clinical results and present guidelines that can reduce the duration and cost of clinical trials. First, we use Sqoop to collect clinical results data from relational databases and store it in HDFS, then use Hive, a Hadoop-based processing tool, to process the data. Finally, we use R, a big data analysis tool widely used in fields such as the public sector and business, to analyze associations.
https://doi.org/10.7236/JIIBC.2019.19.1.187

Choi, Byung-Kwan;Choi, Eun-A;Nam, Moon-Hee
- Journal of Digital Convergence, v.20 no.5, pp.681-693, 2022
The purpose of this study is to suggest a plan for using unstructured data in the health care field by inferring standard medical terms related to the musculoskeletal system through keyword network analysis of the medical records of patients hospitalized for musculoskeletal disorders. The analysis covered 145 discharge summaries for musculoskeletal disorders from 2015 to 2019, which were analyzed using TEXTOM, a big data analysis solution developed by The IMC. The 177 musculoskeletal-related terms derived through the primary and secondary refining processes were then analyzed. The most frequent term was 'Metastasis'. In the results for each medical terminology system, the most frequent clinical finding was 'Metastasis', the symptom was 'Weakness', the diagnosis was 'Hepatitis', the treatment was 'Remove', and the body structure was 'Spine'. 'Oxycodone' was used the most. Based on these results, we suggest implications for the analysis, utilization, and management of unstructured medical data.
https://doi.org/10.14400/JDC.2022.20.5.681

Kim, Jin Soo;Choi, Bang Ho;Cho, Gi Hwan
- Smart Media Journal, v.8 no.1, pp.9-18, 2019
In the era of the 4th industrial revolution, the big data industry is attracting attention as a source of new business models in the public and private sectors that use information collected through the internet and mobile devices. However, even when big data integration and analysis are performed with de-identification techniques, there is still a risk that personal privacy can be exposed. Recently, many studies have sought effective methods to preserve the value of data without disclosing personal information. This paper investigates a personal information protection system intended to boost big data utilization in industrial sectors such as healthcare and agriculture. The criteria for evaluating the adequacy of de-identification and the scope of protected personal information should be applied differently for each industry. In the healthcare sector, which centers on sensitive personal information, the minimum value of k-anonymity should be set to 5 or more, which is the average value of other industrial sectors. For the agricultural sector, the paper suggests including information on companion dogs or farmland as sensitive information. It is also desirable to apply the demonstration steps to each region-specific industry.
https://doi.org/10.30693/SMJ.2019.8.1.09
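The k-anonymity threshold mentioned in the abstract above can be illustrated with a minimal sketch. This is not the paper's system: the function names, the record layout, and the choice of quasi-identifier columns (`age_band`, `region`) are hypothetical, chosen only to show what "k of 5 or more" means in practice. A dataset is k-anonymous when every combination of quasi-identifier values is shared by at least k records.

```python
from collections import Counter
from typing import Mapping, Sequence


def k_anonymity(records: Sequence[Mapping[str, object]],
                quasi_identifiers: Sequence[str]) -> int:
    """Return the size of the smallest group of records that share
    the same quasi-identifier values (0 for an empty dataset)."""
    groups = Counter(
        tuple(record[q] for q in quasi_identifiers) for record in records
    )
    return min(groups.values()) if groups else 0


def satisfies_k(records: Sequence[Mapping[str, object]],
                quasi_identifiers: Sequence[str],
                k: int = 5) -> bool:
    """Check whether every quasi-identifier group has at least k records."""
    return k_anonymity(records, quasi_identifiers) >= k


if __name__ == "__main__":
    # Hypothetical, already-generalized records (illustrative data only).
    records = (
        [{"age_band": "30-39", "region": "Seoul", "diagnosis": "A"}] * 5
        + [{"age_band": "40-49", "region": "Busan", "diagnosis": "B"}] * 2
    )
    print(k_anonymity(records, ["age_band", "region"]))
    print(satisfies_k(records, ["age_band", "region"], k=5))
```

In this sketch the second group contains only two records, so the dataset as a whole would not meet a threshold of 5 until that group is generalized further or suppressed.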

Lee, Seong-Hoon;Lee, Dong-Woo
- Journal of Digital Convergence, v.11 no.4, pp.267-271, 2013
Information technology is giving our society two emerging properties. First, the degree of convergence is accelerating. Second, the areas of convergence are expanding. For example, the smart healthcare field was created by combining IT with the medical industry. Efforts toward convergence will continue. Because of these properties, a large amount of data is generated in daily life. Devices such as smartphones, cameras, game consoles, and tablet PCs produce various types of data. In this paper, we describe the utilization of big data and analyze the big data processing workflow.
https://doi.org/10.14400/JDPM.2013.11.4.267

Kong, Seongwon;Hwang, Deokyoul
- The Journal of Bigdata, v.3 no.2, pp.11-18, 2018
This study addresses automatic domain classification for domain-based quality diagnosis, which is a key element of big data quality diagnosis. With the growing value and use of big data and the rise of the Fourth Industrial Revolution, efforts are under way worldwide to create new value by using big data in fields converged with IT, such as law, medicine, and finance. However, analysis based on low-reliability data causes critical problems in both the process and the result, and judgments based on such analysis results are difficult to trust. Although the need for highly reliable data has increased, research on data quality and its results has been insufficient. The purpose of this study is to shorten working time by using machine learning to automate the domain classification work, previously performed manually, in domain-based quality diagnosis, which is a key element of diagnostic evaluation for improving data quality. The approach extracts information about the characteristics of the data stored in the database, identifies the domain, converts this information into features, and automates domain classification using machine learning. We will use it for big data quality diagnosis and contribute to quality improvement.