• Title/Summary/Keyword: Text data

Development of a gridded crop growth simulation system for the DSSAT model using script languages (스크립트 언어를 사용한 DSSAT 모델 기반 격자형 작물 생육 모의 시스템 개발)

  • Yoo, Byoung Hyun;Kim, Kwang Soo;Ban, Ho-Young
    • Korean Journal of Agricultural and Forest Meteorology / v.20 no.3 / pp.243-251 / 2018
  • Gridded simulation of crop growth, which would be useful for stakeholders and policy makers, often requires specialized computation to prepare weather input data and run a given crop model. Here we developed an automated system for crop growth simulation over a region using the DSSAT (Decision Support System for Agrotechnology Transfer) model. The system consists of modules implemented in R and shell script. One module creates weather input files in plain text format for each grid cell. Another module, written in R, handles GIS data processing and parallel computing. A third module, implemented as a shell script, launches the crop model automatically. As a case study, the automated system was used to determine the maximum soybean yield for a given set of management options in the state of Illinois in the US. The AgMERRA dataset, a reanalysis dataset for agricultural models, was used to prepare weather input files for 1981-2005. It took 7.38 hours to create 1,859 weather input files for one year of soybean growth simulation in Illinois using a single CPU core. In contrast, the processing time decreased considerably, to 35 minutes, when 16 CPU cores were used. The automated system produced a raster map of the maturity group and planting date that resulted in the maximum yield. Our results indicate that the automated system for the DSSAT model would help spatial assessments of crop yield at a regional scale.
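
The timing figures in the abstract above imply a parallel speedup that can be worked out directly. The following is a minimal sketch of that arithmetic only; the function name `parallel_speedup` is hypothetical and is not part of the system the authors describe.

```python
def parallel_speedup(serial_hours, parallel_minutes, cores):
    """Return (speedup, efficiency) for a job timed on one core and on several cores."""
    serial_minutes = serial_hours * 60.0
    speedup = serial_minutes / parallel_minutes   # how many times faster the parallel run is
    efficiency = speedup / cores                  # fraction of ideal linear scaling
    return speedup, efficiency


# Figures reported in the abstract: 7.38 hours on one core, 35 minutes on 16 cores.
speedup, efficiency = parallel_speedup(7.38, 35.0, 16)
print(f"speedup: {speedup:.1f}x, parallel efficiency: {efficiency:.0%}")
```

Under these reported numbers the 16-core run is roughly 12.7 times faster than the single-core run, or about 79% of ideal linear scaling.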

Pattern Analysis for Civil Complaints of Local Governments Using a Text Mining (텍스트마이닝에 의한 지자체 민원청구 패턴 분석)

  • Won, Tae Hong;Yoo, Hwan Hee
    • Journal of the Korean Society of Surveying, Geodesy, Photogrammetry and Cartography / v.34 no.3 / pp.319-327 / 2016
  • Korea faces a wide range of problems in areas such as safety, environment, and traffic due to rapid economic development and urbanization. Despite local governments' efforts to handle electronic civil complaints and solve urban problems, civil complaints have increased year by year. In this study, we collected civil complaint data over the last six years from a small and medium-sized city, Jinju-si. To analyze the spatial distribution pattern, we classified the complaints by cause, geocoded their locations, and extracted the locations where complaints occurred in order to analyze the correlation between electronic civil complaints and land use. The results showed that electronic civil complaints in Jinju-si were clustered in residential, central commercial, and residential-industrial mixed-use areas, that is, areas within the city center where land development had been completed. Analysis of the complaints by land use revealed that complaints about illegal parking were the most frequent. Analysis of the facility distribution within a 50 m radius of the complaint locations showed that civil complaints occurred frequently in detached housing areas located within the commercial and residential-industrial mixed-use areas. In the residential areas (old downtown), civil complaints were concentrated in areas with many ordinary restaurants. This research explored civil complaints in terms of urban space and is expected to be useful in finding solutions to civil complaints.
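
The 50 m radius analysis mentioned in the abstract above can be illustrated with a small sketch. This is not the authors' GIS workflow: it assumes coordinates are already in a projected system with metre units, and the function name `facilities_within` and all sample points are hypothetical.

```python
import math
from collections import Counter


def facilities_within(complaint_xy, facilities, radius_m=50.0):
    """Count facility types lying within radius_m of one complaint location.

    Coordinates are assumed to be projected (x, y) pairs in metres,
    so plain Euclidean distance is adequate at this scale.
    """
    cx, cy = complaint_xy
    counts = Counter()
    for kind, (fx, fy) in facilities:
        if math.hypot(fx - cx, fy - cy) <= radius_m:
            counts[kind] += 1
    return counts


# Hypothetical sample data, for illustration only.
complaint = (1000.0, 2000.0)
facilities = [
    ("restaurant", (1010.0, 2020.0)),      # about 22 m away
    ("restaurant", (1040.0, 2030.0)),      # exactly 50 m away (on the boundary)
    ("detached_house", (1000.0, 2049.0)),  # 49 m away
    ("school", (1100.0, 2000.0)),          # 100 m away, excluded
]
nearby = facilities_within(complaint, facilities)
print(nearby.most_common())
```

Summing such counts over all geocoded complaint points would give the kind of facility profile the abstract refers to; with latitude/longitude input, a geodesic distance would be needed instead.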

Analysis of Voice Color Similarity for the development of HMM Based Emotional Text to Speech Synthesis (HMM 기반 감정 음성 합성기 개발을 위한 감정 음성 데이터의 음색 유사도 분석)

  • Min, So-Yeon;Na, Deok-Su
    • Journal of the Korea Academia-Industrial cooperation Society / v.15 no.9 / pp.5763-5768 / 2014
  • Maintaining a consistent voice color is important when a normal voice and various emotional voices are combined in a single synthesizer. When a synthesizer is developed using recordings in which the emotions are expressed too strongly, the voice color cannot be maintained and each synthesized utterance can sound like the voice of a different speaker. In this paper, speech data was recorded and the change in voice color was analyzed in order to develop an emotional HMM-based speech synthesizer. To build a speech synthesizer, a voice is recorded and a database is constructed. The recording process is very important, particularly for an emotional speech synthesizer. Monitoring is needed because it is quite difficult to define an emotion and maintain it at a particular level. The synthesizer used a normal voice and three emotional voices (Happiness, Sadness, Anger), and each emotional voice had two levels, High and Low. To analyze the voice color of the normal and emotional voices, the average spectrum, measured as the accumulated spectrum of vowels, was used, and the F1 (first formant) calculated from the average spectrum was compared. The voice similarity of the Low-level emotional data was higher than that of the High-level emotional data, and the proposed method makes it possible to monitor recordings through the change in voice similarity.

'Elderly image' Analysis Using Big Data and Social Networking Techniques (빅데이터와 사회연결망 기법을 이용한 '노인 이미지' 분석)

  • Han, Sun-Bo;Lee, Hyun-Sim
    • The Journal of the Korea Contents Association / v.16 no.11 / pp.253-263 / 2016
  • We analyzed the social issue of the 'image of the elderly' using Big Data and Social Network Analysis. First, we analyzed the words extracted by text mining for the keyword 'elderly'. The analysis showed that 'Senior' was the word used most often in media that reflect public trends, such as online cafes and blogs. Expressed with the ten most frequent words, the image of the elderly is: "The elderly are 'Seniors' who are respected by society, who organize to earn money, gain qualifications, and look after their health, and who desire to work in good health up to 100 years old". This study differs from existing analysis methods in that it analyzes the macro-level image of the elderly, including the social discourse, by collecting a vast amount of data and analyzing it with social network techniques. Since the image of the elderly that the public perceives is positively expressed as 'Senior', the direction of the current policy for the elderly can be said to be evaluated as desirable. At the same time, the results conveyed the public's 'desire' to be evaluated in this way. Therefore, future policy for the elderly should enable the elderly to be perceived as a 'Necessary existence' in society by taking on social roles. In addition, we proposed implementing policies for the elderly that reflect priorities such as job creation, welfare, and relief from alienation, so that the elderly can stay active and maintain their health.
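
The frequency step described in the abstract above (ranking extracted words and taking the top 10) can be sketched as follows. This is only an illustration: the function name `top_words`, the stopword list, and the sample posts are hypothetical, and whitespace tokenization is an assumption that suits English text; Korean text would normally need a morphological analyzer instead.

```python
from collections import Counter


def top_words(documents, stopwords=frozenset(), n=10):
    """Return the n most frequent words across documents as (word, count) pairs."""
    counts = Counter()
    for doc in documents:
        for token in doc.lower().split():
            word = token.strip(".,!?'\"()")   # drop surrounding punctuation
            if word and word not in stopwords:
                counts[word] += 1
    return counts.most_common(n)


# Hypothetical posts standing in for collected cafe and blog text.
posts = [
    "Senior jobs and senior health programs",
    "A senior earns a new qualification",
    "Health and work for every senior",
]
ranking = top_words(posts, stopwords={"a", "and", "for", "every", "new"}, n=3)
print(ranking)
```

In a social network analysis, the same counts would typically be extended to word co-occurrence pairs, which then serve as edge weights between keywords.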

Design of Heterogeneous Content Linkage Method by Analyzing Genbank (Genbank 분석을 통한 이종의 콘텐츠 연계 방안 설계)

  • Ahn, Bu-Young;Lee, Myung-Sun;Kim, Ji-Young;Oh, Chung-Shick
    • The Journal of the Korea Contents Association / v.10 no.6 / pp.49-54 / 2010
  • As information on gene sequences is not only diverse but also extremely large in volume, high-performance computing and information technology are required to build and analyze gene sequence databases. This has given rise to the discipline of bioinformatics, a field of research in which computers are used to collect, manage, store, evaluate, and analyze biological data. In line with the continued development of bioinformatics, the Korea Institute of Science and Technology Information (KISTI) has built an IT-based infrastructure for biological information and provided that information to bioscience researchers. This paper analyzes the reference fields of Genbank, the gene database most frequently used by researchers worldwide among the life science databases, and proposes a method for interfacing it with NDSL, the integrated science and technology information service provided by KISTI. To this end, after collecting Genbank data from the NCBI FTP site, we rebuilt the database by separating the Genbank text files into basic gene data and reference data. New tables were then generated by extracting paper and patent information from the Genbank reference fields. Finally, we suggest a method for connecting these tables with the paper DB and the patent DB operated by KISTI.
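
The separation of a Genbank text record into basic gene data and reference data, as described in the abstract above, might look like the following minimal sketch. It is not the authors' implementation. It assumes the standard GenBank flat-file layout (top-level keywords at column 0, values starting at column 13, records ending with `//`) and that patent references carry a `JOURNAL` line beginning with `Patent:`. The function name `split_genbank_record` and the sample record are hypothetical.

```python
def split_genbank_record(text):
    """Split one GenBank flat-file record into (basic fields, list of reference entries)."""
    basic, references = {}, []
    target = None   # dict currently receiving values, or None while skipping a section
    key = None      # field that continuation lines are appended to
    for line in text.splitlines():
        if line.startswith("//"):            # end-of-record marker
            break
        if not line.strip():
            continue
        keyword, value = line[:12].strip(), line[12:].strip()
        if line[0] != " ":                   # top-level keyword at column 0
            if keyword in ("FEATURES", "ORIGIN"):
                target = None                # annotation and sequence are skipped here
            elif keyword == "REFERENCE":
                target = {"REFERENCE": value}
                references.append(target)
                key = "REFERENCE"
            else:
                target, key = basic, keyword
                basic[key] = value
        elif target is None:
            continue
        elif keyword:                        # indented sub-keyword such as AUTHORS or TITLE
            key = keyword
            target[key] = value
        else:                                # continuation of the previous field
            target[key] = (target[key] + " " + value).strip()
    return basic, references


# Hypothetical record, laid out like a GenBank flat file.
SAMPLE = """\
LOCUS       XX000001     120 bp    DNA     linear   SYN 01-JAN-2000
DEFINITION  Hypothetical example record.
ACCESSION   XX000001
REFERENCE   1  (bases 1 to 120)
  AUTHORS   Doe,J. and Roe,R.
  TITLE     A made-up title that continues
            on a second line
  JOURNAL   Example J. 1 (1), 1-10 (2000)
REFERENCE   2  (bases 1 to 120)
  AUTHORS   Doe,J.
  TITLE     Hypothetical patent entry
  JOURNAL   Patent: XX 0000000-A 1 01-JAN-2000;
FEATURES             Location/Qualifiers
     source          1..120
ORIGIN
        1 acgtacgtac
//
"""

basic, references = split_genbank_record(SAMPLE)
patents = [r for r in references if r.get("JOURNAL", "").startswith("Patent:")]
papers = [r for r in references if r not in patents]
print(basic["ACCESSION"], len(papers), "paper(s),", len(patents), "patent(s)")
```

Keeping the accession number alongside each reference entry is what would let the paper and patent tables be joined back to the gene record and then linked to external paper and patent databases.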

A Study on the Contemporary Definition of 'GARDEN' - Keyword Analysis used Literature Research and Big Data - ('정원'의 시대적 정의에 관한 연구 - 문헌연구와 빅데이터를 활용한 키워드 분석을 중심으로-)

  • Woo, Kyungsook;Suh, Joo Hwan
    • Journal of the Korean Institute of Landscape Architecture / v.44 no.5 / pp.1-11 / 2016
  • Interest in gardens and garden design has been growing in Korea recently. However, the usage of the term 'garden' is extremely varied and complex, and there has been very little academic research on the meaning of garden. Therefore, this research attempts to investigate current ideas of the garden and to elucidate how they have changed by means of extensive literature research and big data analysis. In the past, the notion of garden was broad, including not only private spaces such as Madang(마당) and Teul(뜰) but also fields and grassland as public outdoor space. Yet the meaning has narrowed to merely private space as dwelling systems changed with the rapid industrial development of the 20th century. Furthermore, the introduction of urban parks as an interactive space between nature and humans, a spatial function similar to that of gardens, has blurred the boundary between garden and park, which has created confusion in understanding the concept of a garden. After all, a garden is a subject for humans. The meanings of garden need to be recognized from various points of view, since a garden is a creation that draws on diverse fields such as the natural and social sciences as well as cultural studies. This discussion of the meaning of garden in the present day will give a conceptual foundation for future research on gardens and garden design. Also, the big data analysis employed here as a research method can help research on similar topics, particularly semantics in landscape architecture.

A Multicast Middleware for the Remote Educational Systems (원격 교육 시스템을 위한 멀티캐스트 미들웨어)

  • Byun, Sang-Seon;Jin, Hyun-Wook;Yoo, Hyuck
    • Journal of KIISE:Computing Practices and Letters / v.9 no.1 / pp.100-108 / 2003
  • By choosing multicast for the transmission of educational content in a Remote Educational System, we can reduce the server load and increase network bandwidth utilization. In this paper, we design and implement a Multicast Middleware for the Remote Educational System. This Multicast Middleware has three characteristics: 1) Through Centralized Multicast Group Management for passive members, it allows a host, called the Group Maker, to create a multicast group composed of receivers, called Group Members, who are chosen by that host. Because all groups are created by the Group Maker in Centralized Group Management, a Group Member's join action is passive. 2) Multicast group information is maintained and recovered in order to restore the system after exceptions and crashes; the maintenance and recovery mechanism of the Group Maker is distinct from that of a Group Member. 3) A mechanism transmits large multimedia data through multicasting and removes the additional copy operation by means of a shared buffer. Fragmentation and de-fragmentation for large data delivery normally result in an additional copy operation at user level, but with a user-level shared buffer they can be done without a user-level copy operation. By applying the middleware to a Remote Educational environment consisting of 30 PCs and Fast Ethernet, we examined its efficiency: it can transmit an 18 frames/sec movie with a resolution of 320 × 120 pixels, 128 Kbps encoded sound data, and some text data.
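
The third characteristic in the abstract above, fragmenting large data without an extra user-level copy, can be illustrated by analogy. The sketch below is not the paper's middleware: it uses Python's `memoryview` to slice a payload without copying and a single preallocated buffer for reassembly, as a stand-in for the shared-buffer idea. The function names, the 1,400-byte fragment size, and the sample frame are all assumptions.

```python
def fragment(payload, mtu):
    """Yield (offset, chunk) pairs; each chunk is a memoryview slice, not a copy."""
    view = memoryview(payload)
    for offset in range(0, len(view), mtu):
        yield offset, view[offset:offset + mtu]


def reassemble(fragments, total_size):
    """Write each fragment straight into one preallocated buffer, in any arrival order."""
    buffer = bytearray(total_size)
    for offset, chunk in fragments:
        buffer[offset:offset + len(chunk)] = chunk
    return buffer


# Hypothetical 5,120-byte payload standing in for one video frame.
frame = bytes(range(256)) * 20
pieces = list(fragment(frame, mtu=1400))

# Fragments may arrive out of order; offsets place them correctly.
rebuilt = reassemble(reversed(pieces), len(frame))
print(len(pieces), bytes(rebuilt) == frame)
```

Carrying the offset with each fragment is the design choice that lets the receiver place data directly at its final position instead of staging it in an intermediate buffer first.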

A Study on the Development of GIS based Integrated Information System for Water Quality Management of Yeongsan River Estuary (영산강 하구역 수질환경 관리를 위한 GIS기반 통합정보시스템 개발에 관한 연구)

  • Lee, Sung Joo;Kim, Kye Hyun;Park, Young Gil;Lee, Geon Hwi;Yoo, Jea Hyun
    • Journal of Wetlands Research / v.16 no.1 / pp.73-83 / 2014
  • The government has recently carried out monitoring and modeling to better understand the current situation and predict future events pertaining to water quality in the estuarine area of the Yeongsan River. However, many users have noted difficulty in understanding and using the results because most monitoring and model data consist of figures and text. The aim of this study is to develop a GIS-based integrated information system to support understanding of the current situation and prediction of future events concerning water quality in the estuarine area of the Yeongsan River. To achieve this, a monitoring DB was assembled, the linkage between models was defined, a GUI was composed, and the system development environment and system composition were defined. The monitoring data consisted of observation data from 2010 to 2012 in the estuarine area of the Yeongsan River. The models used in the study are HSPF (Hydrological Simulation Program-Fortran) for simulation of the basin and EFDC (Environmental Fluid Dynamics Code) for simulation of the estuary and river. Ultimately, a GIS-based system was presented for using and displaying monitoring and model data. The system supports quantitative prediction of the ecological environment of the estuarine area and displays document-type model simulation results in a map-based environment to enhance the user's spatial understanding. In future work, the system will be updated to include a decision support system capable of handling estuary environment issues and supporting environmental assessment and the development of related policies.

A Study on the Value Factors of Culture Consumers for Corporate Culture Marketing through Big Data Techniques (빅데이터 기법을 통한 기업 문화마케팅을 위한 문화소비자의 가치 요소 연구)

  • Oh, Se Jong
    • The Journal of the Convergence on Culture Technology / v.6 no.1 / pp.31-36 / 2020
  • Corporate Culture Marketing is a marketing tool that enhances a company's cultural image or conveys its image through culture. Culture consumer value analysis provides important predictive data for identifying the values and life goals behind individual consumption behavior, explains the choice behavior of culture consumers, and serves as a basis for decision making. The research method combined the text mining and opinion mining techniques of big data analysis to extract positive, negative, and neutral words. The analysis targets culture consumers attending concerts at Hyundai Card's 'Super Concert', which is aimed at domestic consumers, and CJ ENM's 'KCON', which is aimed at foreign consumers. The culture consumer value elements of corporate culture marketing are its basic conditions, and they were derived as 'Consensus Communication (Expression of Sensibility)', 'Participation Sharing (VIP Belonging)', 'Social Change Issue', 'Differentiating Services', 'Price Discount Benefit' and 'Location Quality'. In the future, 'Culture Technology Marketers' will need to be fostered and deployed in areas such as arts management planning, cultural investment, cultural distribution, cultural space, Corporate Culture, CSR, and K-pop marketing to enhance corporate interests and brand value.

Establishment of Strategy for Management of Technology Using Data Mining Technique (데이터 마이닝을 통한 기술경영 전략 수립에 관한 연구)

  • Lee, Junseok;Lee, Joonhyuck;Kim, Gabjo;Park, Sangsung;Jang, Dongsik
    • Journal of the Korean Institute of Intelligent Systems / v.25 no.2 / pp.126-132 / 2015
  • Technology forecasting is about understanding the future status of a specific technology based on current data about that technology. It is useful when planning technology management strategies. These days, it is common for countries, companies, and researchers to establish R&D directions and strategies by drawing on experts' opinions. However, this qualitative method of technology forecasting is costly and time consuming, since it requires collecting a variety of opinions and analyses from many experts. To address these limitations, quantitative methods of technology forecasting are being studied to secure objective forecast results and support the R&D decision-making process. This paper suggests a methodology for technology forecasting based on quantitative analysis. The methodology consists of data collection, principal component analysis, and technology forecasting by logistic regression, which is one of the data mining techniques. In this research, patent documents related to autonomous vehicles are collected. Then, the text of the patent documents is extracted by a text mining technique and converted into a form suitable for analysis. After principal component analysis, logistic regression is performed using the principal component scores. On the basis of this result, it is possible to analyze the state of R&D and forecast the technology.