• Title/Summary/Keyword: Open data mining

New economic policy uncertainty indexes for South Korea (새로운 우리나라 불확실성 지수의 작성)

  • Lee, Geung-Hee;Cho, Joo-Hee;Jo, Jin-Gyeong
    • The Korean Journal of Applied Statistics
    • /
    • v.33 no.5
    • /
    • pp.639-653
    • /
    • 2020
  • Baker et al. (Quarterly Journal of Economics, 131, 1593-1636, 2016) developed an Economic Policy Uncertainty (EPU) index for South Korea in the same way as the U.S. EPU index. However, their South Korean EPU index has limitations: its selection of keywords and newspapers does not fully reflect the South Korean situation. We develop monthly South Korean economic policy uncertainty indexes using different keywords and news media, and conduct various analyses to examine the usefulness of the newly compiled indexes.

Comparative chloroplast genomics and phylogenetic analysis of the Viburnum dilatatum complex (Adoxaceae) in Korea

  • PARK, Jongsun;XI, Hong;OH, Sang-Hun
    • Korean Journal of Plant Taxonomy
    • /
    • v.50 no.1
    • /
    • pp.8-16
    • /
    • 2020
  • Complete chloroplast genome sequences provide detailed information about any structural changes of the genome, instances of phylogenetic reconstruction, and molecular markers for fine-scale analyses. Recent developments of next-generation sequencing (NGS) tools have led to the rapid accumulation of genomic data, especially data pertaining to chloroplasts. Short reads deposited in public databases such as the Sequence Read Archive of the NCBI are open resources, and the corresponding chloroplast genomes are yet to be completed. The V. dilatatum complex in Korea consists of four morphologically similar species: V. dilatatum, V. erosum, V. japonicum, and V. wrightii. Previous molecular phylogenetic analyses based on several DNA regions did not resolve the relationship at the species level. In order to examine the level of variation of the chloroplast genome in the V. dilatatum complex, raw reads of V. dilatatum deposited in the NCBI database were used to reconstruct the whole chloroplast genome, with these results compared to the genomes of V. erosum, V. japonicum, and three other species in Viburnum. These comparative genomics results found no significant structural changes in Viburnum. The degree of interspecific variation among the species in the V. dilatatum complex is very low, suggesting that the species of the complex may have been differentiated recently. The species of the V. dilatatum complex share large unique deletions, providing evidence of close relationships among the species. A phylogenetic analysis of the entire genome of the Viburnum showed that V. dilatatum is a sister to one of two accessions of V. erosum, making V. erosum paraphyletic. Given that the overall degree of variation among the species in the V. dilatatum complex is low, the chloroplast genome may not provide a phylogenetic signal pertaining to relationships among the species.

Tectonic Structure Modeling around the Ulleung Basin and Dokdo Using Potential Data (포텐셜 자료를 이용한 울릉분지와 독도 주변 지체구조 연구)

  • Park, Gye-Soon;Park, Jun-Suk;Kwon, Byung-Doo;Kim, Chang-Hwan;Park, Chan-Hong
    • Journal of the Korean earth science society
    • /
    • v.30 no.2
    • /
    • pp.165-175
    • /
    • 2009
  • The East Sea, which includes the study area, is a typical back-arc sea located behind the Circum-Pacific volcanic and earthquake belt. Previous studies reported that the East Sea opened under tensile force and thereby took its current shape. In this study, we investigate the regional tectonic structure of the East Sea using ship-borne gravity, magnetic, and satellite gravity data. The result of three-dimensional depth inversion shows that the Moho depth of the study area is approximately 13-25 km and inversely proportional to the thickness of the crust. In addition, the crust of the Ulleung Basin (UB) becomes thinner toward the center of the basin because of the extension caused by the tensile force that opened the East Sea.

Analyzing Research Trends on Research Support Services Using Topic Modeling (토픽모델링을 활용한 국내외 연구지원서비스 연구동향 분석)

  • Ji Soo Kim;Yoo Kyung Jeong
    • Journal of the Korean Society for information Management
    • /
    • v.41 no.3
    • /
    • pp.309-330
    • /
    • 2024
  • This study aims to identify and compare the primary research topics in domestic and international research support services through topic modeling. The analysis revealed 12 major topics in domestic studies and 15 in international ones. The findings highlight the need for in-depth research on digital technology in open access, data management, research data management in university libraries, and digital research support services. Furthermore, the need for further research has been identified to analyze specific types of digital research support services and to explore the evolving role of information professionals in research data management. This study is significant in that it comprehensively analyzes existing research and provides guidance for future research directions.

An Efficient Estimation of Place Brand Image Power Based on Text Mining Technology (텍스트마이닝 기반의 효율적인 장소 브랜드 이미지 강도 측정 방법)

  • Choi, Sukjae;Jeon, Jongshik;Subrata, Biswas;Kwon, Ohbyung
    • Journal of Intelligence and Information Systems
    • /
    • v.21 no.2
    • /
    • pp.113-129
    • /
    • 2015
  • Location branding is an important income-generating activity: it gives special meaning to a specific location while producing identity and communal value grounded in an understanding of the place-branding concept and methodology. Many other fields, such as marketing, architecture, and city construction, contribute to creating an impressive brand image. A place brand that is well recognized by both South Koreans and foreigners creates significant economic effects. There has been research on building a strategic and detailed place brand image; the representative study was carried out by Anholt, who surveyed two million people from 50 different countries. However, such investigations, including survey research, require a great deal of labor and significant expense. As a result, more affordable, objective, and effective research methods are needed. The purpose of this paper is to find a way to measure the intensity of a brand image objectively and at low cost through text mining. The proposed method extracts keywords and the factors that constitute the location brand image from related web documents. In this way, we can measure the brand image intensity of a specific location. The performance of the proposed methodology was verified through comparison with Anholt's 50-city image consistency index ranking around the world. Four methods are applied in the test. First, the RANDOM method artificially ranks the cities included in the experiment. Second, the HUMAN method prepares a questionnaire and selects 9 volunteers who are well acquainted with both brand management and the cities to be evaluated; they are asked to rank the cities, and their rankings are compared with Anholt's evaluation results. Third, the TM method applies the proposed method to evaluate the cities with all evaluation criteria. Fourth, TM-LEARN, an extension of TM, selects significant evaluation items from the items in every criterion and then evaluates the cities with the selected items. RMSE is used as the metric to compare the evaluation results. The experimental results are as follows. First, the proposed method was more accurate than the evaluation method that targets ordinary people. Second, compared with the traditional survey method, it takes much less time and cost because it uses automated means. Third, the proposed methodology is timely because the evaluation can be repeated at any time. Fourth, whereas Anholt's method evaluates only pre-specified cities, the proposed methodology is applicable to any location. Finally, the proposed methodology has relatively high objectivity because the research was based on open source data. As a result, our text mining approach to city image evaluation is valid in terms of accuracy, cost-effectiveness, timeliness, scalability, and reliability. The proposed method provides managers in the public and private sectors with clear guidelines for brand management. In the public sector, local officers could use the proposed method to formulate strategies and enhance the image of their places efficiently. Rather than conducting heavy questionnaires, local officers could quickly monitor the current place image beforehand and then decide to proceed with a formal place image test only if the results from the proposed method are out of the ordinary, whether they indicate an opportunity or a threat to the place. Moreover, by combining the proposed method with morphological analysis, extraction of meaningful facets of the place brand from text, sentiment analysis, and more, marketing strategy planners or civil engineering professionals may obtain deeper and richer insights for better place brand images. In the future, a prototype system will be implemented to show the feasibility of the idea proposed in this paper.
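
The abstract above compares each method's city ranking against a reference ranking using RMSE. A minimal sketch of that comparison follows, assuming RMSE is taken over the rank positions of the same set of cities; the city names, the two example rankings, and the function name `rank_rmse` are hypothetical and do not come from the paper.

```python
import math


def rank_rmse(reference, candidate):
    """Root-mean-square error between the rank positions that two
    rankings assign to the same set of items (1 = top rank)."""
    if set(reference) != set(candidate) or len(reference) != len(candidate):
        raise ValueError("both rankings must cover the same items exactly once")
    ref_pos = {item: i + 1 for i, item in enumerate(reference)}
    cand_pos = {item: i + 1 for i, item in enumerate(candidate)}
    squared = [(ref_pos[item] - cand_pos[item]) ** 2 for item in reference]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical rankings, for illustration only.
reference = ["CityA", "CityB", "CityC", "CityD"]
method_x = ["CityA", "CityC", "CityB", "CityD"]   # two neighbours swapped
method_y = ["CityD", "CityC", "CityB", "CityA"]   # fully reversed

print(rank_rmse(reference, method_x))
print(rank_rmse(reference, method_y))
```

Under this reading, a lower value means the method's ranking sits closer to the reference ranking.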

Twitter Issue Tracking System by Topic Modeling Techniques (토픽 모델링을 이용한 트위터 이슈 트래킹 시스템)

  • Bae, Jung-Hwan;Han, Nam-Gi;Song, Min
    • Journal of Intelligence and Information Systems
    • /
    • v.20 no.2
    • /
    • pp.109-122
    • /
    • 2014
  • People nowadays create a tremendous amount of data on Social Network Services (SNS). In particular, the incorporation of SNS into mobile devices has resulted in massive data generation, thereby greatly influencing society. This phenomenon is unmatched in history, and we now live in the Age of Big Data. SNS data qualifies as Big Data because it satisfies the conditions of data amount (volume), data input and output speed (velocity), and variety of data types (variety). Trends of issues discovered in SNS Big Data can serve as an important new source for creating value because this information covers the whole of society. In this study, a Twitter Issue Tracking System (TITS) is designed and built to meet the need for analyzing SNS Big Data. TITS extracts issues from Twitter texts and visualizes them on the web. The proposed system provides the following four functions: (1) provide the topic keyword set that corresponds to the daily ranking; (2) visualize the daily time-series graph of a topic over a month; (3) show the importance of a topic through a treemap based on the score system and frequency; and (4) visualize the daily time-series graph of a keyword found by keyword search. The present study analyzes the Big Data generated by SNS in real time. SNS Big Data analysis requires various natural language processing techniques, including stop-word removal and noun extraction, to process various unrefined forms of unstructured data. In addition, such analysis requires the latest big data technology, such as the Hadoop distributed system or NoSQL, an alternative to relational databases, to rapidly process a large amount of real-time data. We built TITS on Hadoop to optimize the processing of big data because Hadoop is designed to scale up from a single node to thousands of machines. Furthermore, we use MongoDB, which is classified as a NoSQL database. MongoDB is an open source, document-oriented database that provides high performance, high availability, and automatic scaling. Unlike existing relational databases, MongoDB has no schemas or tables, and its most important goals are data accessibility and data processing performance. In the Age of Big Data, visualization is attractive to the Big Data community because it helps analysts examine such data easily and clearly. Therefore, TITS uses the d3.js library as a visualization tool. This library is designed to create Data-Driven Documents that bind the document object model (DOM) to arbitrary data; interaction with the data is easy, and the library is useful for managing real-time data streams with smooth animation. In addition, TITS uses Bootstrap, which consists of pre-configured plug-in style sheets and JavaScript libraries, to build the web system. The TITS Graphical User Interface (GUI) is designed using these libraries and can detect issues on Twitter in an easy and intuitive manner. The proposed work demonstrates the superiority of our issue detection techniques by matching detected issues with corresponding online news articles. The contributions of the present study are threefold. First, we suggest an alternative approach to real-time big data analysis, which has become an extremely important issue. Second, we apply a topic modeling technique that is used in various research areas, including Library and Information Science (LIS). Based on this, we can confirm the utility of storytelling and time series analysis. Third, we develop a web-based system and make it available for the real-time discovery of topics. The present study conducted experiments with nearly 150 million tweets in Korea during March 2013.
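
The preprocessing and daily keyword ranking mentioned in the abstract above can be illustrated with a small sketch. It is not the authors' pipeline: it swaps topic modeling for a plain frequency count, uses whitespace tokenization instead of noun extraction, and the stop-word list, sample tweets, and function name `daily_top_keywords` are all hypothetical.

```python
from collections import Counter, defaultdict

# Illustrative stop-word list only; a real system would use a
# language-specific list and a morphological analyzer for noun extraction.
STOP_WORDS = {"the", "a", "is", "on", "in", "of", "and", "to"}


def daily_top_keywords(tweets, k=3):
    """tweets: iterable of (date_string, text) pairs.
    Returns {date_string: [(keyword, count), ...]} with the k most
    frequent non-stop-word tokens for each day."""
    counts = defaultdict(Counter)
    for day, text in tweets:
        tokens = [t for t in text.lower().split()
                  if t.isalpha() and t not in STOP_WORDS]
        counts[day].update(tokens)
    return {day: counter.most_common(k) for day, counter in counts.items()}


# Hypothetical sample data.
sample = [
    ("2013-03-01", "the election debate is on tonight"),
    ("2013-03-01", "election results and debate reactions"),
    ("2013-03-02", "baseball season opens"),
]
top = daily_top_keywords(sample, k=2)
print(top)
```

The per-day counts produced this way are the kind of series that a daily time-series graph of a keyword would plot.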

3D Modeling Approaches in Estimation of Resource and Production of Musan Iron Mine, North Korea (3차원 모델링을 활용한 북한 무산광산일대의 자원량 및 생산량 추정)

  • Bae, Sungji;Yu, Jaehyung;Koh, Sang-Mo;Heo, Chul-Ho
    • Economic and Environmental Geology
    • /
    • v.48 no.5
    • /
    • pp.391-400
    • /
    • 2015
  • Korea is a global steel producer and a major consumer, while its iron ore production is very low compared to demand. North Korea, on the other hand, holds a tremendous amount of iron reserves, but its production rate is limited. Moreover, data on the mineral resources of North Korea are very limited and uncertain because of political isolation. This study estimated the iron ore resources and production of the Musan iron mine, the world-known open-pit mine of North Korea, using satellite imagery (Landsat MSS, ASTER) and digital maps from 1976 to 2007. As a result, the mining area of the Musan mine increased by 6.1 km² during the 30 years, and the mining sector was estimated as 4.9 km². Based on 3D modeling and the average iron ore density of the Anshan formation in China, we estimated the iron resources and production at 0.7 and 0.2 billion metric tons, respectively. This corresponds to an average annual production of 8.1 million tons, which coincides well with previous reports. We expect this study to be useful for inter-Korean exchange programs by providing reliable preliminary data.

A Delphi Study on Competencies of Future Green Architectural Engineer (근미래 친환경 건축분야 엔지니어에게 필요한 역량에 대한 델파이 연구)

  • Kang, So Yeon;Kim, Taeyeon;Lee, Jungwoo
    • Journal of Engineering Education Research
    • /
    • v.21 no.3
    • /
    • pp.56-65
    • /
    • 2018
  • With the rapid advance of technologies, including information and communication technologies, jobs are evolving faster than ever. Architectural engineering is no exception, and green architectural engineering is emerging fast as a promising new field. In this study, a Delphi study of expert architectural engineers is conducted to identify (1) near-future prospects of the field, (2) emerging near-future jobs, (3) competencies needed for these jobs, and (4) educational content necessary to build these competencies in green architectural engineering. An initial Delphi survey consisting of open-ended questions in these four areas was conducted and yielded 65 items after duplicate removal and semantic refinement. Further refinement through the second and third Delphi rounds resulted in 40 items on which the 13 architectural engineering experts largely agree as future prospects for green architectural engineering. Findings indicate that the demand for green architectural engineering and the need for automatic energy control systems are expected to increase. Also, collaboration with other fields is becoming more and more important in green architectural engineering. The need for professional work management skills, such as knowledge convergence, problem solving, collaboration, and creativity in linking components from various related areas, also seems to be increasing. The critical skills for the near future are found to be building environment control techniques (thermal, light, sound, and air), data processing techniques such as data mining, energy monitoring, and the control and utilization of environmental analysis software. Experts also agree that a new curriculum for green building architecture should be developed, with more converging subjects across disciplines, to build future-ready professional skills and experience. Major topics to be covered in the near future include building environment studies, building energy management, energy reduction systems, indoor air quality, global environment and natural phenomena, and machinery and electrical facilities. The architectural engineering community should be concerned with building up the competencies identified in this Delphi study in preparation for a fast-advancing future.

Analysis of Behavior of Seoullo 7017 Visitors - With a Focus on Text Mining and Social Network Analysis - (서울로 7017 방문자들의 이용행태 분석 -텍스트 마이닝과 소셜 네트워크 분석을 중심으로-)

  • Woo, Kyung-Sook;Suh, Joo-Hwan
    • Journal of the Korean Institute of Landscape Architecture
    • /
    • v.48 no.6
    • /
    • pp.16-24
    • /
    • 2020
  • The purpose of this study is to analyze the usage behavior of visitors to Seoullo 7017, the first public garden in Korea, to understand its usage status by analyzing blogs, and to present usage behavior and improvement plans for Seoullo 7017. Text data from NAVER and DAUM blog posts containing 'Seoullo 7017' in the title or content, posted from June 2017 to May 2020 after Seoullo 7017 opened to citizens, were analyzed using text mining, a Big Data technique, and social network analysis. The research results are summarized as follows. First, the ratio of men to women searching for Seoullo 7017 online is similar, the regions that searched most are Seoul and Gyeonggi, in that order, and those in their 40s and 50s were the most interested. In other words, there is a lack of interest in regions other than Seoul and Gyeonggi and among those in their 10s, 20s, and 30s. The main behaviors at Seoullo 7017 are 'night view' and 'walking', and the factors that affect culture and art are elements related to culture and art. If various programs and festivals are held and actively promoted, the main behaviors will become more varied. On the other hand, the main behavior that users of Seoullo 7017 want is 'sit', a static behavior, but the physical conditions are not sufficient for this behavior to occur. Therefore, facilities that enable sitting, such as shades and benches, must be improved to meet the needs of visitors. A notable change in behavior at Seoullo 7017 is that, as group activities in public multi-use facilities are restricted due to COVID-19, it is recognized as a good place to travel alone and to walk alone. Accordingly, in a situation like the COVID-19 pandemic, more diverse behaviors can be derived in facilities where people can take a walk, and the variety of attractions and the satisfaction of users can be increased. Seoullo 7017, as Korea's first public pedestrian area, was created for urban regeneration and the efficient use of urban resources, goes beyond the meaning of a public space, and is a place with various values such as history, nature, welfare, culture, and tourism. However, the usage behavior analysis showed that diverse behaviors did not occur in Seoullo 7017 as expected, and elements that hinder the major behaviors were identified. Based on these results, it is necessary to understand the usage behavior of Seoullo 7017 and to establish a plan for improving its spatial system and facilities, so that Seoullo 7017 can be an important place for urban residents and a driving force to revitalize the city.

Estimates of the Number of Workers Exposed to Diesel Engine Exhaust in South Korea from 1993 to 2013

  • Choi, Sangjun;Park, Donguk;Kim, Seung Won;Ha, Kwonchul;Jung, Hyejung;Yi, Gwangyong;Koh, Dong-Hee;Park, Deokmook;Sun, Oknam;Uuksulainen, Sanni
    • Safety and Health at Work
    • /
    • v.7 no.4
    • /
    • pp.372-380
    • /
    • 2016
  • Background: The aim of this study was to estimate the number of workers exposed to diesel engine exhaust (DEE) by industry and year in the Republic of Korea. Method: The estimates of workers potentially exposed to DEE in the Republic of Korea were calculated by industry on the basis of the carcinogen exposure (CAREX) surveillance system. The data on the labor force employed in DEE exposure industries were obtained from the Census on Establishments conducted by the Korea National Statistical Office from 1993 to 2013. The mean values of prevalence rates adopted by EU15 countries were used as the primary exposure prevalence rates. We also investigated the exposure prevalence rates and exposure characteristics of DEE in 359 workplaces representing 11 industries. Results: The total number of workers exposed to DEE was estimated as 270,014 in 1993 and 417,034 in 2013 (2.2% of the total labor force). As of 2013, the industry categorized as "Land transport" showed the highest number of workers exposed to DEE with 174,359, followed by "Personal and household services" with 70,298, "Construction" with 45,555, "Wholesale and retail trade and restaurants and hotels" with 44,005, and "Sanitation and similar services" with 12,584. These five industries, each with more than 10,000 workers exposed to DEE, accounted for 83% of the total DEE-exposed workers. Comparing the primary prevalence rates used for preliminary estimation among 49 industries, "Metal ore mining" had the highest rate at 52.6%, followed by "Other mining" with 50.0% and "Land transport" with 23.6%. Conclusion: The DEE prevalence rates we surveyed (1.3-19.8%) were higher than the primary prevalence rates. The most common emission sources of DEE were diesel engine vehicles such as forklifts, trucks, and vans. Our estimated numbers of workers exposed to DEE can be used to identify industries with workers requiring protection from potential exposure to DEE in the Republic of Korea.
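
The arithmetic behind the figures in the abstract above can be checked with a short sketch. It assumes the CAREX-style estimate for an industry is simply the number of employed workers multiplied by an exposure prevalence rate; the helper `estimate_exposed` and its example inputs are hypothetical, while the five industry counts and the 2013 total are the values reported in the abstract.

```python
def estimate_exposed(employed, prevalence):
    """CAREX-style estimate: employed workers times exposure prevalence.
    Both arguments here are illustrative inputs, not census data."""
    return round(employed * prevalence)


# Exposed-worker counts for 2013 as reported in the abstract.
reported_2013 = {
    "Land transport": 174_359,
    "Personal and household services": 70_298,
    "Construction": 45_555,
    "Wholesale and retail trade and restaurants and hotels": 44_005,
    "Sanitation and similar services": 12_584,
}
total_2013 = 417_034

top_five = sum(reported_2013.values())
share = top_five / total_2013
print(f"top five industries: {top_five:,} workers, {share:.1%} of the total")

# Hypothetical industry with 1,000 employees at a 23.6% prevalence rate.
print(estimate_exposed(1_000, 0.236))
```

The five reported counts sum to a share that rounds to the 83% stated in the abstract.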