• Title/Summary/Keyword: Modeling-based learning

Subject-Balanced Intelligent Text Summarization Scheme (주제 균형 지능형 텍스트 요약 기법)

  • Yun, Yeoil;Ko, Eunjung;Kim, Namgyu
    • Journal of Intelligence and Information Systems / v.25 no.2 / pp.141-166 / 2019
  • Channels such as social media and SNS now generate enormous amounts of data, and the share of unstructured text data has grown geometrically. Because it is difficult to check all of this text, it is important to access it quickly and grasp its key points. To meet this need for efficient understanding, many text summarization studies have been proposed for handling and using large volumes of text. In particular, many recent methods use machine learning and artificial intelligence algorithms to generate summaries objectively and effectively, an approach called "automatic summarization". However, almost all text summarization methods proposed to date construct summaries based on the frequency of contents in the original documents. Such summaries are limited in their ability to include low-weight subjects that are mentioned less often in the original text. If a summary includes only the contents of major subjects, bias occurs and information is lost, making it hard to ascertain every subject the documents contain. To avoid this bias, one can summarize with a view to balance among the topics in a document so that all subjects can be ascertained, but the distribution among those subjects still remains unbalanced. To retain the balance of subjects in a summary, it is necessary to consider the proportion of every subject in the original documents and also to allocate the portions of subjects equally, so that even sentences on minor subjects are sufficiently included. In this study, we propose a "subject-balanced" text summarization method that secures balance among all subjects and minimizes the omission of low-frequency subjects. For subject-balanced summarization, we use two summary evaluation metrics, "completeness" and "succinctness". 
Completeness means that a summary should fully include the contents of the original documents, and succinctness means that a summary has minimal duplication within itself. The proposed method has three phases. The first phase constructs subject term dictionaries. Topic modeling is used to calculate topic-term weights, which indicate the degree to which each term is related to each topic. From the derived weights, highly related terms can be identified for every topic, and the subjects of the documents can be found from topics composed of terms with similar meanings. A few terms that represent each subject well are then selected; in this method they are called "seed terms". However, the seed terms are too few to explain each subject adequately, so a sufficient number of terms similar to the seed terms is needed for a well-constructed subject dictionary. Word2Vec is used for this word expansion, finding terms similar to the seed terms. Word vectors are created by Word2Vec modeling, and the similarity between all terms can be derived from those vectors using cosine similarity. The higher the cosine similarity between two terms, the stronger the relationship between them is taken to be. Terms with high similarity to the seed terms of each subject are therefore selected, and the subject dictionary is finally constructed by filtering those expanded terms. The next phase allocates subjects to every sentence in the original documents. To grasp the contents of all sentences, frequency analysis is first conducted on the terms that make up the subject dictionaries. The TF-IDF weight of each subject is calculated after frequency analysis, which shows how much each sentence explains each subject. However, TF-IDF weights are unbounded above, so the weights of every subject in each sentence are normalized to values between 0 and 1. 
Each sentence is then allocated to the subject with the maximum TF-IDF weight, and a sentence group is finally constructed for each subject. The last phase is summary generation. Sen2Vec is used to compute the similarity between subject sentences, from which a similarity matrix is formed. By selecting sentences iteratively, it is possible to generate a summary that fully includes the contents of the original documents and minimizes duplication within itself. For evaluation, 50,000 TripAdvisor reviews were used to construct the subject dictionaries and 23,087 reviews were used to generate summaries. A comparison between summaries from the proposed method and frequency-based summaries verified that the proposed method better retains the balance of all subjects originally present in the documents.
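
The dictionary expansion and subject allocation steps described in this abstract can be illustrated with a minimal, dependency-free sketch. It is not the authors' implementation. The function names, the similarity threshold of 0.8, and the choice of min-max scaling for normalization are assumptions made for illustration; the abstract states only that cosine similarity is used and that weights are normalized to values between 0 and 1.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length word vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def expand_seed_terms(seed_terms, vectors, threshold=0.8):
    """Add every term whose vector is close to at least one seed term.

    `vectors` maps term -> vector; `threshold` is a hypothetical cutoff.
    """
    expanded = set(seed_terms)
    for term, vec in vectors.items():
        if any(cosine_similarity(vec, vectors[s]) >= threshold for s in seed_terms):
            expanded.add(term)
    return expanded


def min_max_normalize(weights):
    """Rescale unbounded weights to [0, 1] (min-max scaling is an assumption)."""
    lo, hi = min(weights), max(weights)
    if hi == lo:
        return [0.0 for _ in weights]
    return [(w - lo) / (hi - lo) for w in weights]


def allocate_subject(subject_weights):
    """Return the subject with the largest normalized weight for one sentence."""
    return max(subject_weights, key=subject_weights.get)
```

With toy vectors such as `{"hotel": [1.0, 0.0], "room": [0.9, 0.1], "pasta": [0.0, 1.0]}` and the seed term `"hotel"`, the expansion keeps `"room"` and drops `"pasta"`. In practice the vectors would come from a trained Word2Vec model.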

Vegetation classification based on remote sensing data for river management (하천 관리를 위한 원격탐사 자료 기반 식생 분류 기법)

  • Lee, Chanjoo;Rogers, Christine;Geerling, Gertjan;Pennin, Ellis
    • Proceedings of the Korea Water Resources Association Conference / 2021.06a / pp.6-7 / 2021
  • Vegetation development in rivers is an important issue not only in academic fields such as geomorphology, ecology, and hydraulics, but also in river management practice. The problem of river vegetation is directly tied to reconciling the conflicting values of flood management and ecosystem conservation. In Korea, since the 2000s, the issue of river vegetation and land formation has been raised continuously under various conditions, such as regulated rivers downstream of dams, small eutrophicated tributaries, and floodplain sites of the four major rivers project. Against this background, this study proposes a method for classifying the distribution of vegetation in rivers based on remote sensing data and presents the results of applying it to the Naeseong Stream. The Naeseong Stream is a representative example of a river landscape that has changed due to vegetation development from 2014 to the present. The remote sensing data used in the study are images from the Sentinel-1 and Sentinel-2 satellites, which are operated by the European Space Agency (ESA), accessed through Google Earth Engine. For the ground truth, a manually classified dataset of the Naeseong Stream surface in 2016 was used, in which the area is divided into eight types, including water, sand, and herbaceous and woody vegetation. Classification used a random forest, a machine learning algorithm. 1,000 samples were extracted from 10 pre-selected polygon regions, with half used as training data and half as verification data. The accuracy on the verification data was 82-85%. The trained model was also applied to images from 2016 to 2020, and the year-by-year changes in vegetation zones are presented. The technical limitations of the approach and possible improvements are also discussed. 
By providing quantitative information on vegetation distribution, this technique is expected to be useful in practical vegetation management, such as thinning and rejuvenation of river vegetation, as well as in technical fields such as flood level calculation and flow-vegetation coupled modeling in rivers.
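
The half-and-half split into training and verification data, and the accuracy figure computed on the verification half, can be sketched as follows. To keep the example free of external dependencies, a simple nearest-centroid classifier stands in for the random forest used in the study, so this shows only the evaluation protocol. All function names and the two-band feature layout are hypothetical.

```python
import random


def split_half(samples, seed=0):
    """Shuffle labeled samples and split them into two equal halves."""
    rng = random.Random(seed)
    shuffled = samples[:]
    rng.shuffle(shuffled)
    mid = len(shuffled) // 2
    return shuffled[:mid], shuffled[mid:]


def fit_centroids(train):
    """Stand-in classifier: mean feature vector per class (not a random forest)."""
    sums, counts = {}, {}
    for features, label in train:
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, x in enumerate(features):
            acc[i] += x
        counts[label] = counts.get(label, 0) + 1
    return {label: [s / counts[label] for s in acc] for label, acc in sums.items()}


def predict(centroids, features):
    """Assign the class whose centroid is nearest in squared Euclidean distance."""
    def sq_dist(center):
        return sum((a - b) ** 2 for a, b in zip(features, center))
    return min(centroids, key=lambda label: sq_dist(centroids[label]))


def accuracy(centroids, verification):
    """Fraction of verification samples whose predicted class matches the label."""
    correct = sum(1 for f, y in verification if predict(centroids, f) == y)
    return correct / len(verification)
```

A typical call sequence would be `train, verify = split_half(samples)`, then `model = fit_centroids(train)`, then `accuracy(model, verify)`. Swapping the stand-in for a real random forest changes only the fit and predict steps.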

Modeling Study of Development of Dying Well Education Program for the Medical Personnel in Korea (의료진 대상 웰 다잉 교육프로그램 개발을 위한 모델링에 관한 연구)

  • Kim, Kwang-Hwan;Kim, Yong-Ha;Ahn, Sang-Yoon;Lee, Chong Hyung;Hwang, Hye-Jeong;Lee, Moo-Sik;Kim, Moon-Joon;Park, Arma;Shim, Moon-Sook;Song, Hyeon-Dong
    • Journal of the Korea Academia-Industrial cooperation Society / v.15 no.10 / pp.6234-6241 / 2014
  • The purpose of this study was to examine the stress of medical staff and how they cope with the death of patients in a hospital setting, in order to provide basic information for developing a death education program for medical personnel. A survey of 353 medical personnel at K university hospital, located in Daejeon metropolitan city, was conducted from April 1 to April 30, 2014. Frequency analysis, the chi-square test, and the independent t-test were used to analyze the data. The results showed that 'understanding the value of time and preparing for a meaningful future' was the most important perspective on the content of death education (p<0.05); 'changing perceptions and attitudes toward death positively' was the most important reason for requiring death education (p<0.05); 'case-based teaching and problem-based learning' was the most effective method of death education (p<0.05); and 'negative or hostile responses of a patient's guardian toward medical personnel' was the greatest stress that medical personnel face upon witnessing a death (p<0.05). An understanding of the death of patients by medical personnel and an awareness of the need for death education will help improve the understanding of the patient, their guardian, and medical personnel themselves. The main findings will contribute to the development of a specific death education program for medical personnel in a hospital setting.

The Dynamics of CO2 Budget in Gwangneung Deciduous Old-growth Forest: Lessons from the 15 years of Monitoring (광릉 낙엽활엽수 노령림의 CO2 수지 역학: 15년 관측으로부터의 교훈)

  • Yang, Hyunyoung;Kang, Minseok;Kim, Joon;Ryu, Daun;Kim, Su-Jin;Chun, Jung-Hwa;Lim, Jong-Hwan;Park, Chan Woo;Yun, Soon Jin
    • Korean Journal of Agricultural and Forest Meteorology / v.23 no.4 / pp.198-221 / 2021
  • After large-scale reforestation in the 1960s and 1970s, forests in Korea have gradually been aging. The net ecosystem CO2 exchange of old-growth forests is theoretically near zero; however, such forests can be a CO2 sink or source depending on disturbance or management. In this study, we report the CO2 budget dynamics of the Gwangneung deciduous old-growth forest (GDK) in Korea and examine two questions: (1) is the preserved GDK indeed CO2 neutral, as theory suggests? and (2) can the dynamics of its CO2 budget be explained by the common mechanisms reported in the literature? To answer them, we analyzed 15 years of CO2 flux data measured by the eddy covariance technique, along with other biometeorological data, at the KoFlux GDK site from 2006 to 2020. The results showed that (1) GDK switched back and forth between CO2 sink and source but on average was a weak CO2 source (turning into a moderate CO2 source over the most recent five years), and (2) the interannual variability of solar radiation, growing season length, and leaf area index was positively correlated with that of gross primary production (GPP) (R2 = 0.32-0.45), whereas the interannual variability of air and surface temperature was not significantly correlated with that of ecosystem respiration (RE). Furthermore, a machine learning-based model trained on the dataset from the early monitoring period (the first 10 years) failed to reproduce the observed interannual variations of GPP and RE over the most recent five years. Biomass data analysis suggests that carbon emissions from coarse woody debris may have contributed in part to the shift to a moderate CO2 source. To properly understand and interpret the long-term CO2 budget dynamics of GDK, a new framework of analysis and modeling based on complex systems science is needed. It is also important to maintain flux monitoring and data quality, along with monitoring of coarse woody debris and disturbances.
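
Two of the analysis steps mentioned in this abstract lend themselves to a short sketch: holding out the most recent years when testing a model trained on the early monitoring period, and computing R2 between two annual series. The sketch assumes that R2 here is the squared Pearson correlation of the annual values, which the abstract does not state. The record layout with a `"year"` key and both function names are hypothetical.

```python
def split_by_year(records, first_test_year):
    """Temporal holdout: earlier years for training, later years for testing."""
    train = [r for r in records if r["year"] < first_test_year]
    test = [r for r in records if r["year"] >= first_test_year]
    return train, test


def pearson_r_squared(x, y):
    """Squared Pearson correlation between two equal-length annual series."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return (sxy * sxy) / (sxx * syy)
```

For records covering 2006 to 2020, `split_by_year(records, 2016)` yields ten training years and five test years, matching the split described above. A temporal holdout of this kind, unlike a random split, tests whether a model extrapolates to later conditions.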

Importance of End User's Feedback Seeking Behavior for Faithful Appropriation of Information Systems in Small and Medium Enterprises (중소기업 환경에서의 합목적적 정보시스템 활용을 위한 최종사용자 피드백 탐색행위의 중요성)

  • Shin, Young-Mee;Lee, Joo-Ryang;Lee, Ho-Geun
    • Asia pacific journal of information systems / v.17 no.4 / pp.61-95 / 2007
  • Small and medium-sized enterprises (SMEs) account for a large proportion of industry as a whole in terms of the number of enterprises and employees. However, research on information systems has so far focused on large companies, probably because SMEs were less active than larger enterprises in introducing information systems. SMEs are now increasingly adopting information systems such as ERP (Enterprise Resource Planning) systems, and some have already entered the stage of ongoing use. Accordingly, research should address the use of information systems by SMEs, which operate under different conditions from large companies. This study examined the factors and mechanism that induce faithful appropriation of information systems, in particular integrative systems such as ERP, from the viewpoint of individuals' active feedback-seeking behavior. Three factors are expected to affect end users' feedback-seeking behavior for faithful appropriation of information systems: management support, peer IT champ support, and IT staff support. The main focus of the study is on how these factors affect feedback-seeking behavior and whether feedback-seeking behavior mediates the realization of faithful appropriation of information systems by end users. To examine the research model and hypotheses, this study employed an empirical method based on a field survey. The survey mostly used measurements employed and verified in previous research, some with minor modifications for the purpose of the study. The survey respondents were individual employees of SMEs that had been using ERP for one year or longer. To prevent common method bias, the Task-Technology Fit items used as the control variable were answered by different respondents. In total, 127 pairs of valid questionnaires were collected and used for the analysis. 
The PLS (Partial Least Squares) approach to structural equation modeling (PLS-Graph v.3.0) was used for data analysis because of its ability to model both formative and reflective latent constructs with small and medium-sized samples. The analysis shows that reliability, construct validity, and discriminant validity are appropriate. The path analysis results are as follows. First, the more peer IT champ support there is, the more likely the end user is to show feedback-seeking behavior (path coefficient=0.230, t=2.28, p<0.05). In other words, if colleagues proficient in information system use recognize the importance of their help, pass on what they have found to be effective ways of using the system, or correct others' misuse, ordinary end users will be able to seek feedback on the faithfulness of their appropriation of the information system without hesitation, because they know how convenient it is to get help. Second, management support encourages ordinary end users to seek more feedback (path coefficient=0.271, t=3.06, p<0.01) by affecting the end users' perceived value of feedback (path coefficient=0.401, t=6.01, p<0.01). Management support is far more influential than the other factors: when the management of an SME understands the benefits of ERP well, promotes its faithful appropriation, and pays attention to employees' satisfaction with the system, employees will make deliberate efforts toward faithful appropriation of the system. However, the third factor, IT staff support, was not found to be conducive to feedback-seeking behavior by end users (path coefficient=0.174, t=1.83). This is partly attributable to the fact that there is little support for end users from IT staff in SMEs. Even when IT staff provide support, end users may find it less important than support from coworkers who are more familiar with the end users' jobs. 
Meanwhile, the more end users seek feedback and try to find ways of faithfully appropriating information systems, the more likely they are to use the system for the purpose it was originally meant for (path coefficient=0.35, t=2.88, p<0.01). Finally, the mediation effect analysis confirmed the mediating effect of feedback-seeking behavior. By confirming this effect, the study draws attention to the importance of feedback-seeking behavior, which has long been overlooked in research on information system use. The study also explores the factors that promote feedback-seeking behavior, which in turn could affect end users' faithful appropriation of information systems. In addition, it provides insight into which inducements or resources SMEs should offer to promote individual users' feedback-seeking behavior when formal and sufficient support from IT staff or an outside information system provider can hardly be expected. As the results show, in the business environment of SMEs, help from skilled colleagues and from management plays a critical role. Therefore, SMEs should seriously consider how to make use of skilled peer information system users, while management should pay keen attention to end users and support them in making the most of information systems.
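
The relationship between the reported t-values and their significance levels can be checked with a short sketch. It assumes a two-tailed test with a large-sample normal approximation; the abstract does not say how PLS-Graph derived its p-values, so this is a plausibility check and not the authors' procedure. The function names are hypothetical.

```python
import math


def two_tailed_p(t_value):
    """Two-tailed p-value under a standard normal approximation (an assumption)."""
    return math.erfc(abs(t_value) / math.sqrt(2.0))


def significance_label(t_value):
    """Map a t-value to the significance bands used in the abstract."""
    p = two_tailed_p(t_value)
    if p < 0.01:
        return "p<0.01"
    if p < 0.05:
        return "p<0.05"
    return "n.s."


# t-values as reported in the abstract above
reported = {
    "peer IT champ support -> feedback seeking": 2.28,
    "management support -> feedback seeking": 3.06,
    "IT staff support -> feedback seeking": 1.83,
    "feedback seeking -> faithful appropriation": 2.88,
}
labels = {path: significance_label(t) for path, t in reported.items()}
```

Under this approximation the labels agree with the abstract: the IT staff support path (t=1.83) falls short of the 0.05 level, while the other paths reach the levels reported.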

The Intelligent Determination Model of Audience Emotion for Implementing Personalized Exhibition (개인화 전시 서비스 구현을 위한 지능형 관객 감정 판단 모형)

  • Jung, Min-Kyu;Kim, Jae-Kyeong
    • Journal of Intelligence and Information Systems / v.18 no.1 / pp.39-57 / 2012
  • With the introduction of high-tech equipment, much attention has recently focused on interactive exhibits, which can double the effect of an exhibition through interaction with the audience. Interactive exhibitions also make it possible to measure a variety of audience reactions. Among these reactions, this research uses changes in facial features, which can be collected in an interactive exhibition space. It develops an artificial neural network-based model that predicts the audience's response by measuring the change in facial features when a stimulus is given to an audience in a non-excited state. To represent the emotional state of the audience, the research uses a Valence-Arousal model. The overall framework consists of six steps. The first step collects data for modeling. The data were collected from people who participated in the 2012 Seoul DMC Culture Open and were used for the experiments. The second step extracts 64 facial features from the collected data and compensates the facial feature values. The third step generates the independent and dependent variables of the artificial neural network model. The fourth step uses a statistical technique to select the independent variables that affect the dependent variable. The fifth step builds an artificial neural network model and trains it using a training set and a test set. The sixth and final step validates the prediction performance of the artificial neural network model using a validation data set. The proposed model was compared with a statistical predictive model to see whether it performed better. Although the data set in this experiment was noisy, the proposed model showed better results than a multiple regression analysis model. 
If the prediction model of audience reaction is used in a real exhibition, it will be possible to provide countermeasures and services appropriate to the audience's reaction to the exhibits. Specifically, if the audience's arousal toward an exhibit is low, action can be taken to increase it, for instance by recommending other preferred content or by using light or sound to draw attention to the exhibit. In other words, future exhibitions could be planned to satisfy various audience preferences, and the model is expected to foster a personalized environment in which visitors can concentrate on the exhibits. However, the proposed model still shows low prediction accuracy, for the following reasons. First, the data cover diverse visitors to real exhibitions, so it was difficult to control the experimental environment. As a result, the collected data contain much noise, which lowers accuracy. In further research, data will be collected in a better controlled experimental environment to increase the prediction accuracy of the model. Second, changes in facial expression alone are thought to be insufficient for extracting audience emotions. Combining facial expression with other responses, such as sound and audience behavior, would yield better results.
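
The partition into training, test, and validation sets used in the fifth and sixth steps of the framework can be sketched as below. The abstract does not give the split proportions, so the 60/20/20 ratio, the fixed random seed, and the function name are assumptions for illustration only.

```python
import random


def three_way_split(rows, train_frac=0.6, test_frac=0.2, seed=0):
    """Shuffle rows and partition them into train, test, and validation sets.

    The fractions are hypothetical; the remainder after train and test
    becomes the validation set, so no row is lost or duplicated.
    """
    rng = random.Random(seed)
    shuffled = list(rows)
    rng.shuffle(shuffled)
    n_train = round(len(shuffled) * train_frac)
    n_test = round(len(shuffled) * test_frac)
    train = shuffled[:n_train]
    test = shuffled[n_train:n_train + n_test]
    validation = shuffled[n_train + n_test:]
    return train, test, validation
```

Keeping the validation rows out of both model fitting and model selection is what allows the final step to report performance on data the network has not seen.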