• Title/Summary/Keyword: Large-scale Analysis Data

A Rolling Sampling Design for the Korea National Health and Nutrition Examination Survey (제4기 국민건강.영양조사를 위한 순환표본 설계연구)

  • Lee, Kay-O;Park, Jin-Woo
    • Survey Research / v.8 no.2 / pp.67-89 / 2007
  • The Korea National Health and Nutrition Examination Survey (KNHANES) consists of the Health Interview Survey, Health Behaviour Survey, Nutrition Survey, and Health Examination, and is designed to produce a broad range of descriptive health and nutritional statistics for sex and age subdomains of the population. These data can be used to measure and monitor the health and nutritional status of the population of Korea. The survey has been conducted three times since 1998. The Korea Centers for Disease Control and Prevention (KCDC) is preparing for the 4th survey, which is to be conducted from 2007 through 2009. This study designs a sample for the 4th survey. The main new feature of the sampling design is the use of a rolling sampling design. The rolling design is introduced because KCDC has imposed operational requirements, e.g., the need to produce annual national statistics and to collect data year-round with a regular staff. This is the first time a rolling sampling design has been applied to a nationwide large-scale survey in Korea. With rolling sampling, measurement variation due to different data collectors may be minimized.

Generating Sponsored Blog Texts through Fine-Tuning of Korean LLMs (한국어 언어모델 파인튜닝을 통한 협찬 블로그 텍스트 생성)

  • Bo Kyeong Kim;Jae Yeon Byun;Kyung-Ae Cha
    • Journal of Korea Society of Industrial Information Systems / v.29 no.3 / pp.1-12 / 2024
  • In this paper, we fine-tuned KoAlpaca, a large-scale Korean language model, and implemented a blog text generation system based on it. Blogs on social media platforms are widely used as a marketing tool for businesses. We constructed training data of positive reviews by applying sentiment analysis and refinement to collected sponsored blog texts, and applied QLoRA for lightweight training of KoAlpaca. QLoRA is a fine-tuning approach that significantly reduces the memory required for training; in experiments with a 12.8B-parameter model, it showed up to a 58.8% decrease in memory usage compared to LoRA. To evaluate the generative performance of the fine-tuned model, texts were generated from 100 inputs not included in the training data. They contained on average more than twice as many words as texts from the pre-trained model, and texts of positive sentiment appeared more than twice as often. In a survey conducted for qualitative evaluation of generative performance, responses indicated that the fine-tuned model's outputs were more relevant to the given topics on average 77.5% of the time. This demonstrates that the positive review generation model for sponsored content presented in this paper can make content creation more time-efficient and ensure consistent marketing effects. However, to reduce the generation of content that deviates from the category of positive reviews due to elements of the pre-trained model, we plan to continue fine-tuning with augmented training data.

Factors that Explain the Lag in Building High-growth Firms in Women (여성의 고성장기업 창업이 저조한 원인)

  • Chun, Hesuk
    • The Journal of the Korea Contents Association / v.16 no.7 / pp.300-308 / 2016
  • Research on the OECD and Korea has shown that high-growth startups are key to job creation and very important for economic growth. Given that the large-scale entry of women into the labor force accelerates economic growth, and that women participate in growth-oriented entrepreneurship at far lower levels than men, accelerating female entrepreneurship could have positive effects on the Korean economy. This paper uses data from several databases to compare women's and men's startups and explore the factors that explain the lag in building high-growth firms among women. Women's startups make up nearly 34% of startups (defined as firms established less than 7 years ago), but only 6% of high-growth startups. Women rarely own large businesses, reflecting their low levels of initial capital and outside financing. Regardless of gender, entrepreneurs face many of the same challenges in starting businesses, but this study identifies three primary factors that lead to fewer high-growth startups among female entrepreneurs: a greater financing gap than for men (this gap is more apparent for high-growth firms); a lack of ideas, knowledge, and experience (related to the lack of mentorship); and the difficulty of maintaining a work-life balance. The findings are very similar to those of studies in the US (financing gap, work-life balance, and lack of mentorship). Further studies are required to identify more specific factors behind the gender gap in ideas, knowledge, and experience.

Characterization of Legionella Isolated from the Water System at Public Facilities in Chungcheongnam-do Province (충남지역 다중이용시설의 환경수계에서 분리한 레지오넬라균의 특성 분석)

  • Cheon, Younghee;Lee, Hyunah;Nam, Hae-Sung;Choi, Jihye;Lee, Dayeon;Ko, Young-Eun;Park, Jongjin;Lee, Miyoung;Park, Junhyuk
    • Journal of Environmental Health Sciences / v.47 no.5 / pp.472-478 / 2021
  • Background: Legionella case detection and notification rates have increased in public artificial water environments that people visit, including large buildings, public baths, and hospitals. Objectives: In this study, the distribution of Legionella and its epidemiologic characteristics were analyzed in the water systems of public facilities in Chungcheongnam-do Province in South Korea. Methods: Culture and PCR analysis were performed on 2,991 environmental water system samples collected from 2017 to 2019, and associations with year, facility type, season, and water temperature were statistically analyzed using RStudio for Windows. Descriptive data were compared using chi-square tests and independent t-tests. Results: The detection rate of Legionella increased from 3.1% in 2017 to 10.3% in 2019, and Legionella appeared most frequently in public baths, followed by large-scale buildings, hospitals, and apartments. It was detected mainly in summer, from June to August, at over $1.0{\times}10^{3}$ CFU/L on average in 133 cases (66.5%). High bacterial counts were detected in bathtub water, cooling tower water, and warm water (p<0.001), and detection rates were higher in cities, where multipurpose facilities are concentrated, than in rural areas (p=0.018). Conclusions: This study suggests that continuous monitoring and control of Legionella are required in the water systems of high-risk facilities. Moreover, these results will help in preparing efficient management plans to prevent legionellosis in Chungcheongnam-do Province.
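
The abstract above reports chi-square tests for comparing detection rates between groups. As a rough illustration only, the following Python sketch computes Pearson's chi-square statistic for a 2×2 table. The counts are hypothetical placeholders, not values from the study, and the function name `chi_square_2x2` is our own; the authors' analysis was done in R and may have differed (for example, by using a continuity correction).

```python
def chi_square_2x2(a, b, c, d):
    """Pearson chi-square statistic (no continuity correction) for the table
    [[a, b], [c, d]], e.g. rows = two groups, columns = detected / not detected."""
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column total must be positive")
    return n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)


# Hypothetical counts (detected, not detected); NOT taken from the study.
urban = (30, 70)
rural = (10, 90)

stat = chi_square_2x2(*urban, *rural)

# With 1 degree of freedom, the critical value at alpha = 0.05 is 3.841.
significant = stat > 3.841
print(f"chi-square = {stat:.2f}, significant at 0.05: {significant}")
```

A statistic above the critical value suggests that the two detection rates differ by more than chance alone would explain; small expected cell counts would call for an exact test instead.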

Development of HLA-A, -B and -DR Typing Method Using Next-Generation Sequencing (차세대염기서열분석법을 이용한 HLA-A, -B 그리고 -DR 형별 분석법 개발)

  • Seo, Dong Hee;Lee, Jeong Min;Park, Mi Ok;Lee, Hyun Ju;Moon, Seo Yoon;Oh, Mijin;Kim, So Young;Lee, Sang-Heon;Hyeong, Ki-Eun;Hu, Hae-Jin;Cho, Dae-Yeon
    • The Korean Journal of Blood Transfusion / v.29 no.3 / pp.310-319 / 2018
  • Background: Research on next-generation sequencing (NGS)-based HLA typing is active. To resolve the phase ambiguity and long turnaround time of conventional high-resolution HLA typing, this study developed an NGS-based high-resolution HLA typing method that can handle large-scale samples within an efficient testing time. Methods: For HLA NGS, conditions for nucleic acid extraction, library construction, and the PCR mechanism, along with bioinformatics-based HLA typing, were developed. To confirm the accuracy of the NGS-based HLA typing method, the results of 192 samples typed by SSOP and 28 samples typed by SBT were compared with NGS-based HLA-A, -B and -DR typing. Results: DNA library construction through two-step PCR, NGS sequencing with MiSeq (Illumina Inc., San Diego, USA), and the data analysis platform were established. NGS-based HLA typing results were consistent with the known HLA types of the 220 blood samples. Conclusion: The NGS-based HLA typing method could handle large sample volumes with high throughput. Therefore, it would be useful for HLA typing of bone marrow donation volunteers.

Impacts of Argo temperature in East Sea Regional Ocean Model with a 3D-Var Data Assimilation (동해 해양자료동화시스템에 대한 Argo 자료동화 민감도 분석)

  • KIM, SOYEON;JO, YOUNGSOON;KIM, YOUNG-HO;LIM, BYUNGHWAN;CHANG, PIL-HUN
    • The Sea:JOURNAL OF THE KOREAN SOCIETY OF OCEANOGRAPHY / v.20 no.3 / pp.119-130 / 2015
  • The impacts of Argo temperature assimilation on the analysis fields in the East Sea are investigated using DAESROM, the East Sea Regional Ocean Model with a 3-dimensional variational assimilation module (Kim et al., 2009). Specifically, we produced analysis fields for 2009 in which temperature profiles, sea surface temperature (SST) and sea surface height (SSH) anomaly were assimilated (Exp. AllDa), and carried out an additional experiment in which the Argo temperature data were withheld (Exp. NoArgo). When both experiments are compared against the assimilated temperature profiles, the Root Mean Square Error (RMSE) of Exp. AllDa is generally lower than that of Exp. NoArgo. In particular, the Argo impacts are large in the subsurface layer, with an RMSE difference of about $0.5^{\circ}C$. Argo impacts on the current and temperature fields in the surface layer are investigated using observations from 14 surface drifters. In general, surface currents along the drifter positions are improved in Exp. AllDa, and large RMSE differences (about 2.0~6.0 cm/s) between the two experiments are found for drifters that observed for longer periods in the southern region, where Argo density was high. On the other hand, Argo impacts on the SST fields are negligible, and SST assimilation at a 1-day interval is considered to have the dominant effect. Like the surface current fields, the SSH fields also differ significantly between the two experiments in the southern East Sea, for example in the southwestern Yamato Basin, where anticyclonic circulation develops. The comparison of SSH fields implies that SSH assimilation does not correct the SSH difference caused by withholding Argo data. Thus Argo assimilation plays an important role in reproducing meso-scale circulation features in the East Sea.
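
The comparison described above rests on the RMSE of each experiment against the same observations. A minimal sketch of that calculation in Python follows, assuming model values have already been interpolated to the observation points. All numbers are hypothetical and do not come from the paper; the variable names `all_da` and `no_argo` only echo the experiment labels.

```python
import math


def rmse(model, observed):
    """Root mean square error between paired model and observed values."""
    if len(model) != len(observed) or not model:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(m - o) ** 2 for m, o in zip(model, observed)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical temperatures (deg C) at four depths; NOT data from the paper.
observed = [10.0, 8.0, 6.0, 4.0]
all_da = [10.5, 8.5, 5.5, 4.5]    # run with every data type assimilated
no_argo = [12.0, 10.0, 4.0, 2.0]  # run with Argo profiles withheld

# A positive difference means that withholding Argo data increased the error.
impact = rmse(no_argo, observed) - rmse(all_da, observed)
print(f"RMSE AllDa = {rmse(all_da, observed):.2f}, "
      f"NoArgo = {rmse(no_argo, observed):.2f}, impact = {impact:.2f}")
```

Computing the difference per depth layer, rather than over the whole profile, is what would expose a stronger subsurface impact of the kind the abstract describes.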

A Study on Space Creation and Management Plan according to Characteristics by Type in Each Small-Scale Biotope in Seoul - Base on the Amphibian Habitats - (서울시 소규모 생물서식공간 유형별 특성에 따른 조성 및 관리방안 연구 - 양서류 서식지를 중심으로 -)

  • Park, Ha-Ju;Han, Bong-Ho;Kim, Jong-Yup
    • Journal of the Korean Institute of Landscape Architecture / v.52 no.2 / pp.110-126 / 2024
  • This study classified small-scale biological habitats created in Seoul in order to analyze and synthesize the location characteristics, habitat structure, biological habitat functions, and threat factors of representative sites, and to derive creation and management problems according to their ecological characteristics. The aim was to suggest improvement measures and management items. Data collected through a field survey were used to categorize 39 locations, and 8 representative sites were selected using location, water system, and size as classification criteria. In terms of location, the sites had been created in areas unfavorable for amphibian movement because connectivity with the hinterland forest was low or broken, and the water supply was too unstable to secure a constant flow and maintain a constant water depth. In terms of habitat structure, the sites were small, had artificial structures unfavorable for amphibians, and were exposed to possible sediment inflow and damage to the revetment. In terms of biological habitat function, wetland plants were lacking and naturalized grasses were distributed; threats included hiking trails and decks established in the surrounding area and artificial disturbance from adjacent facilities. When creating habitats according to the characteristics of each type, a site should be connected to the hinterland forest to allow amphibian movement and located where water supply is possible, which requires reviewing the possibility of an artificial water supply and introducing a water system with a continuous flow. The habitat should be as large as possible, or several small-scale habitats should be connected, to create a natural waterfront structure. In addition, more wetland plants should be introduced to provide shelter for amphibians, and facilities such as walking paths should be installed away from migration routes to prevent artificial disturbance. After construction, the management plan is to maintain a range of water depths for amphibians to inhabit and spawn, stabilize slopes against sediment inflow, repair damage to revetments, and remove organic matter deposits to secure natural grasses and open water. Artificial management should be minimized. By analyzing problems with previously applied techniques, this study proposed measures to improve the function of biological habitats. It is significant in that it can provide creation and management plans for local governments, including Seoul City, that want to create small-scale biological habitat spaces suited to the urban environment.

Perception and Appraisal of Urban Park Users Using Text Mining of Google Maps Review - Cases of Seoul Forest, Boramae Park, Olympic Park - (구글맵리뷰 텍스트마이닝을 활용한 공원 이용자의 인식 및 평가 - 서울숲, 보라매공원, 올림픽공원을 대상으로 -)

  • Lee, Ju-Kyung;Son, Yong-Hoon
    • Journal of the Korean Institute of Landscape Architecture / v.49 no.4 / pp.15-29 / 2021
  • The study aims to understand the perception and appraisal of urban park users through text analysis, using review data provided by Google Maps. Google Maps Review is an online review platform that provides information evaluating locations through social media and offers an understanding of locations from the perspective of general reviewers and regional guides who are registered as members of Google Maps. The study examined whether Google Maps reviews are useful for extracting meaningful information about user perceptions and appraisals for park management plans. Three urban parks in Seoul, South Korea, were chosen: Seoul Forest, Boramae Park, and Olympic Park. Review data for each park were collected via web crawling using Python. Through text analysis, the keywords and network structure characteristics for each park were analyzed. Park ratings were analyzed alongside the text, and the reviews of residents and foreign tourists were compared. The common keywords found in the review comments for the three parks were "walking", "bicycle", "rest" and "picnic" for activities, "family", "child" and "dogs" for companions, and "playground" and "walking trail" for park facilities. Looking at the characteristics of each park, Seoul Forest shows many nature-based outdoor activities, while the lack of parking spaces and congestion on weekends negatively affected users. Boramae Park has the appearance of a city park, with various facilities providing numerous activities, but reviewers often cited the park's complexity and negative aspects related to dog walking groups. At Olympic Park, large-scale complex facilities and cultural events were frequently mentioned, emphasizing its entertainment functions. Google Maps Review can serve as useful data for identifying park users' overall experiences and general feelings. Compared to data from other social media sites, Google Maps Review data provide ratings and help identify the factors behind user satisfaction and dissatisfaction.
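
The keyword extraction step described above can be illustrated with a simple frequency count. The Python sketch below is only a minimal stand-in under stated assumptions: the three reviews and the stopword list are invented for illustration, and the study's actual pipeline (crawling, tokenization, network analysis) is not specified in the abstract.

```python
import re
from collections import Counter

# Small illustrative stopword list; a real analysis would use a fuller one.
STOPWORDS = {"the", "a", "and", "for", "with", "is", "to", "in", "of", "on", "my"}


def keyword_counts(reviews):
    """Count lowercase alphabetic tokens across reviews, skipping stopwords."""
    counts = Counter()
    for review in reviews:
        tokens = re.findall(r"[a-z]+", review.lower())
        counts.update(t for t in tokens if t not in STOPWORDS)
    return counts


# Invented example reviews; NOT data from the study.
reviews = [
    "Great park for walking with my family",
    "Nice picnic spot and a playground for the child",
    "Walking trail is crowded on weekends",
]

counts = keyword_counts(reviews)
print(counts.most_common(3))
```

Running the same count separately per park, or per reviewer group, gives the kind of side-by-side keyword comparison the abstract reports.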

Architecture and Depositional Style of Gravelly, Deep-Sea Channels: Lago Sofia Conglomerate, Southern Chile (칠레 남부 라고 소피아 (Lago Sofia) 심해저 하도 역암의 층구조와 퇴적 스타일)

  • Choe Moon Young;Jo Hyung Rae;Sohn Young Kwan;Kim Yeadong
    • The Korean Journal of Petroleum Geology / v.10 no.1_2 s.11 / pp.23-33 / 2004
  • The Lago Sofia conglomerate in southern Chile is a lenticular unit encased within mudstone-dominated, deep-sea successions (Cerro Toro Formation, Upper Cretaceous), extending from north to south for more than 120 km. The Lago Sofia conglomerate is a unique example of long, gravelly deep-sea channels, which are rare in modern environments. In the northern part (areas of Lago Pehoe and Laguna Goic), the conglomerate unit consists of 3-5 conglomerate bodies separated by mudstone sequences. Paleocurrent data from these bodies indicate sediment transport to the east, south, and southeast. The conglomerate bodies in the northern part are interpreted as tributary channels that drained down the paleoslope and converged to form N-S-trending trunk channels. In the southern part (Lago Sofia section), the conglomerate unit comprises a thick (> 300 m) conglomerate body, which probably formed in axial trunk channels of the N-S-trending foredeep trough. The well-exposed Lago Sofia section allowed for detailed investigation of the sedimentary facies and large-scale architecture of the deep-sea channel conglomerate. The conglomerate in the Lago Sofia section comprises stratified conglomerate, massive-to-graded conglomerate, and diamictite, which represent bedload deposition under turbidity currents, deposition by high-density turbidity currents, and muddy debris flows, respectively. Paleocurrent data suggest that the debris flows originated from the failure of nearby channel banks or slopes flanking the channel system, whereas the turbidity currents flowed parallel to the orientation of the overall channel system. Architectural elements produced by turbidity currents represent vertical stacking of gravel sheets, lateral accretion of gravel bars, migration of gravel dunes, and filling of channel thalwegs and scoured hollows, similar to those in terrestrial gravel-bed braided rivers. Observations of the large-scale stratal pattern reveal that the channel bodies are offset-stacked toward the east, suggesting an eastward migration of the axial trunk channel. The eastward channel migration is probably due to tectonic tilting related to the uplift of the Andean protocordillera just west of the Lago Sofia deep-sea channel system.

Spatial Analysis of Typhoon Genesis Distribution based on IPCC AR5 RCP 8.5 Scenario (IPCC AR5 RCP 8.5 시나리오 기반 태풍발생 공간분석)

  • Lee, Sungsu;Kim, Ga Young
    • Spatial Information Research / v.22 no.4 / pp.49-58 / 2014
  • Large-scale natural disasters such as typhoons, heat waves, and snowstorms have recently increased because of climate change driven by global warming, which is most likely caused by greenhouse gases in the atmosphere. The increase in greenhouse gas concentrations has raised the earth's surface temperature, which has in turn raised the frequency of extreme weather in the northern hemisphere. In this paper, we present a spatial analysis of future typhoon genesis based on the IPCC AR5 RCP 8.5 scenario, which applies the latest carbon dioxide concentration trend. For this analysis, we first calculated the genesis potential index (GPI) using RCP 8.5 monthly data for 1982~2100. By spatially comparing the monthly averaged GPIs and the typhoon genesis locations of 1982~2010, a probability density function (PDF) of typhoon genesis was estimated. Then, we defined 0.05GPI, 0.1GPI and 0.15GPI based on the GPI ranges corresponding to probability densities of 0.05, 0.1 and 0.15, respectively. Based on these PDF-related GPIs, spatial distributions of typhoon genesis probability were estimated for the periods 1982~2010, 2011~2040, 2041~2070 and 2071~2100. We also analyzed area density using historical genesis points and the spatial distributions. As a result, the area east of the Philippines, in the latitude band $10^{\circ}{\sim}20^{\circ}$, shows a high typhoon genesis probability in the future. Using this result, we expect to estimate potential regions of typhoon genesis in the future and to develop a genesis model.