• Title/Summary/Keyword: Topic map

Multi-Vector Document Embedding Using Semantic Decomposition of Complex Documents (복합 문서의 의미적 분해를 통한 다중 벡터 문서 임베딩 방법론)

  • Park, Jongin;Kim, Namgyu
    • Journal of Intelligence and Information Systems / v.25 no.3 / pp.19-41 / 2019
  • With demand for text data analysis growing rapidly, research and investment in text mining are being actively pursued not only in academia but also in various industries. Text mining is generally conducted in two steps. In the first step, the text of the collected documents is tokenized and structured to convert the original documents into a computer-readable form. In the second step, tasks such as document classification, clustering, and topic modeling are conducted according to the purpose of analysis. Until recently, text mining studies have focused on the second step. However, with the discovery that the text structuring process substantially influences the quality of the analysis results, various embedding methods have been actively studied to improve that quality by preserving the meaning of words and documents when text data are represented as vectors. Unlike structured data, which can be used directly in a variety of operations and traditional analysis techniques, unstructured text must first undergo a structuring task that transforms the original document into a form that the computer can understand. Mapping arbitrary objects into a space of a specific dimension while maintaining their algebraic properties, in order to structure the text data, is called "embedding". Recently, attempts have been made to embed not only words but also sentences, paragraphs, and entire documents. In particular, as the demand for document embedding has increased rapidly, many algorithms have been developed to support it. Among them, doc2Vec, which extends word2Vec and embeds each document into one vector, is the most widely used. However, the traditional document embedding method represented by doc2Vec generates a vector for each document using all the words included in the document. 
As a result, the document vector is affected not only by core words but also by miscellaneous words. Additionally, traditional document embedding schemes usually map each document to a single vector. Therefore, it is difficult to accurately represent a complex document with multiple subjects as a single vector using the traditional approach. In this paper, we propose a new multi-vector document embedding method to overcome these limitations. This study targets documents that explicitly separate body content and keywords. For a document without keywords, the method can be applied after extracting keywords through various analysis methods. However, since keyword extraction is not the core subject of the proposed method, we describe the process of applying the method to documents with predefined keywords. The proposed method consists of (1) Parsing, (2) Word Embedding, (3) Keyword Vector Extraction, (4) Keyword Clustering, and (5) Multiple-Vector Generation. The specific process is as follows. First, all text in a document is tokenized and each token is represented as an N-dimensional real-valued vector through word embedding. After that, to overcome the limitation that traditional document embedding is affected by miscellaneous words as well as core words, the vectors corresponding to the keywords of each document are extracted to form a set of keyword vectors for each document. Next, clustering is conducted on the set of keyword vectors of each document to identify the multiple subjects included in the document. Finally, a multi-vector is generated from the vectors of the keywords constituting each cluster. Experiments on 3,147 academic papers revealed that the traditional single-vector approach cannot properly map complex documents because of interference among subjects in each vector. 
With the proposed multi-vector based method, we ascertained that complex documents can be vectorized more accurately by eliminating the interference among subjects.
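
Steps (3) to (5) of the method described above can be illustrated with a minimal sketch. It is not the authors' implementation: the two-dimensional toy word vectors, the function names, and the plain k-means with deterministic initialization are assumptions made for illustration. The abstract does not say which clustering algorithm is used or how the vectors in a cluster are combined, so averaging is assumed here.

```python
from math import dist

# Toy 2-D word vectors (assumption); a real system would use trained embeddings.
WORD_VECTORS = {
    "embedding": [0.9, 0.1],
    "ramen": [0.1, 0.9],
    "doc2vec": [0.8, 0.2],
    "noodle": [0.2, 0.8],
}


def mean_vector(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def kmeans(points, k, iters=20):
    """Plain k-means; the first k points serve as initial centroids."""
    centroids = [list(p) for p in points[:k]]
    labels = None
    for _ in range(iters):
        new_labels = [
            min(range(k), key=lambda c: dist(p, centroids[c])) for p in points
        ]
        if new_labels == labels:  # assignments stopped changing
            break
        labels = new_labels
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = mean_vector(members)
    return labels, centroids


def multi_vector_embedding(keywords, word_vectors, k):
    """Keyword vector extraction, keyword clustering, one vector per cluster."""
    keyword_vectors = [word_vectors[w] for w in keywords]
    return kmeans(keyword_vectors, k)


keywords = ["embedding", "ramen", "doc2vec", "noodle"]
labels, multi = multi_vector_embedding(keywords, WORD_VECTORS, k=2)
single = mean_vector([WORD_VECTORS[w] for w in keywords])  # single-vector baseline
```

On this toy input the single averaged vector lands midway between the two subjects, while each cluster mean stays close to its own keywords, which is the kind of interference among subjects that the abstract describes.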

Visualizing the Results of Opinion Mining from Social Media Contents: Case Study of a Noodle Company (소셜미디어 콘텐츠의 오피니언 마이닝결과 시각화: N라면 사례 분석 연구)

  • Kim, Yoosin;Kwon, Do Young;Jeong, Seung Ryul
    • Journal of Intelligence and Information Systems / v.20 no.4 / pp.89-105 / 2014
  • Since the emergence of the Internet, social media with highly interactive Web 2.0 applications has provided a very user-friendly means for consumers and companies to communicate with each other. Users routinely publish content involving their opinions and interests in social media such as blogs, forums, chat rooms, and discussion boards, and the content is released in real time on the Internet. For that reason, many researchers and marketers regard social media content as a source of information for business analytics to develop business insights, and many studies have reported results on mining business intelligence from social media content. In particular, opinion mining and sentiment analysis, as techniques to extract, classify, understand, and assess the opinions implicit in text content, are frequently applied to social media content analysis because they emphasize determining sentiment polarity and extracting authors' opinions. A number of frameworks, methods, techniques, and tools have been presented by these researchers. However, we have found some weaknesses in their methods, which are often technically complicated and not sufficiently user-friendly for supporting business decisions and planning. In this study, we attempted to formulate a more comprehensive and practical approach to conducting opinion mining with visual deliverables. First, we described the entire cycle of practical opinion mining using social media content, from the initial data gathering stage to the final presentation session. Our proposed approach to opinion mining consists of four phases: collecting, qualifying, analyzing, and visualizing. In the first phase, analysts have to choose the target social media. Each target medium requires a different way for analysts to gain access, such as open APIs, search tools, DB2DB interfaces, or purchasing content. The second phase is pre-processing to generate useful materials for meaningful analysis. 
If garbage data are not removed, the results of social media analysis will not provide meaningful and useful business insights. To clean social media data, natural language processing techniques should be applied. The next step is the opinion mining phase, in which the cleansed social media content set is analyzed. The qualified data set includes not only user-generated content but also content identification information such as creation date, author name, user id, content id, hit counts, review or reply, favorite, etc. Depending on the purpose of the analysis, researchers or data analysts can select a suitable mining tool. Topic extraction and buzz analysis are usually related to market trend analysis, while sentiment analysis is used to conduct reputation analysis. There are also various other applications, such as stock prediction, product recommendation, and sales forecasting. The last phase is visualization and presentation of the analysis results. The major focus and purpose of this phase are to explain the results of the analysis and help users comprehend their meaning. Therefore, to the extent possible, deliverables from this phase should be made simple, clear, and easy to understand, rather than complex and flashy. To illustrate our approach, we conducted a case study on a leading Korean instant noodle company. We targeted the leading company, NS Food, with 66.5% of market share; the firm has held the No. 1 position in the Korean "Ramen" business for several decades. We collected a total of 11,869 pieces of content, including blogs, forum posts, and news articles. After collecting the social media content data, we generated language resources specific to the instant noodle business for data manipulation and analysis using natural language processing. In addition, we tried to classify the content into more detailed categories such as marketing features, environment, reputation, etc. 
In these phases, we used freeware such as the TM, KoNLP, ggplot2, and plyr packages of the R project. As a result, we presented several useful visualization outputs, such as domain-specific lexicons, volume and sentiment graphs, topic word clouds, heat maps, valence tree maps, and other visualized images, to provide vivid, full-colored examples using open library software packages of the R project. Business actors can quickly detect, at a glance, areas that are weak, strong, positive, negative, quiet, or loud. A heat map can explain the movement of sentiment or volume in a category-by-time matrix, in which color density shows the level for each time period. The valence tree map, one of the most comprehensive and holistic visualization models, should be very helpful for analysts and decision makers to quickly understand the "big picture" business situation with a hierarchical structure, since a tree map can present buzz volume and sentiment for a certain period in a single visualized result. This case study offers real-world business insights from market sensing, which demonstrate to practical-minded business users how they can use these types of results for timely decision making in response to ongoing changes in the market. We believe our approach can provide a practical and reliable guide to opinion mining with visualized results that are immediately useful, not just in the food industry but in other industries as well.
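
The analyzing and visualizing phases can be sketched as a lexicon-based polarity count aggregated into the category-by-time cells that a heat map would color. This is only an illustration under stated assumptions: the study used R packages, whereas this sketch is in Python, and the lexicon, the posts, the dates, and the function names are made up for the example.

```python
from collections import defaultdict

# Hypothetical domain lexicon (assumption): word -> polarity score.
LEXICON = {"tasty": 1, "love": 1, "salty": -1, "expensive": -1}


def sentiment(text):
    """Sum lexicon polarities over lowercase, whitespace-separated tokens."""
    return sum(LEXICON.get(token, 0) for token in text.lower().split())


def heat_map_cells(posts):
    """Aggregate posts into (category, month) -> [volume, total sentiment]."""
    cells = defaultdict(lambda: [0, 0])
    for month, category, text in posts:
        cell = cells[(category, month)]
        cell[0] += 1                # buzz volume
        cell[1] += sentiment(text)  # net sentiment
    return dict(cells)


# Made-up toy posts: (month, category, text).
posts = [
    ("2013-01", "taste", "Love this tasty ramen"),
    ("2013-01", "taste", "Too salty for me"),
    ("2013-02", "price", "Getting expensive"),
]
cells = heat_map_cells(posts)
```

Each cell holds the two quantities the abstract mentions for heat maps and tree maps, volume and sentiment, so a plotting library could map one to area or position and the other to color.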

The Effect of the Use of Concept Mapping on Science Achievement and the Scientific Attitude in Ocean Units of Earth Science (해양단원 개념도 활용 수업이 과학성취도 및 태도에 미치는 효과)

  • Han, Jung-Hwa;Kim, Kwang-Hui;Park, Soo-Kyong
    • Journal of the Korean earth science society / v.23 no.6 / pp.461-473 / 2002
  • Concept mapping is a device for representing the conceptual structure of a subject discipline in a two-dimensional form analogous to a road map. In the teaching and learning of earth science, each concept depends on its relationships to many others for meaning. Using concept mapping in teaching helps teachers and students become more aware of the key concepts and the relationships among them. The purpose of this study is to investigate the effect of the use of concept mapping on science achievement and scientific attitude in the ocean units of earth science. The results of this study are as follows. First, the science achievement of the group taught with concept mapping is significantly higher than that of the group taught traditionally. Also, when achievement levels are compared among different cognitive ability groups, the effect is more significant in mid- or lower-level student groups than in high-level groups. The use of concept mapping is more effective when the concepts have a distinct concept hierarchy. Second, the scores on the tests of ‘attitude toward scientific inquiry’ and ‘application of scientific attitude’ of the concept mapping group are significantly higher than those of the traditional teaching group, whereas the scores on the test of ‘interest in science learning’ do not differ between the two groups. Third, the survey on the use of concept mapping shows a positive response across the tested groups. The use of concept mapping is more beneficial in fostering comprehension of the topic. A concept map of a student's own construction facilitates the assessment of learning, which suggests the usefulness of concept mapping as a means of evaluation. With regard to retention, concept mapping is considered more effective in confirming and remembering the topic, while less effective in the aspects of activity and interest. 
In conclusion, the use of concept maps makes learning an active, meaningful process and improves students' academic achievement and scientific attitude. If concept mapping is used more effectively as an active teaching strategy, more meaningful learning will be attained.

Development and Application of the Educational Program to Increase High School Students' Systems Thinking Skills - Focus on Global Warming - (고등학생들의 시스템 사고 향상을 위한 교육프로그램 개발 및 적용 - 지구온난화를 중심으로 -)

  • Lee, Hyo-Nyong;Kwon, Yong-Ju;Oh, Hee-Jin;Lee, Hyun-Dong
    • Journal of the Korean earth science society / v.32 no.7 / pp.784-797 / 2011
  • The purposes of this study are (1) to develop an educational program designed to improve high school students' knowledge integration and systems thinking skills about global warming and (2) to identify changes in the students' level of systems thinking. The developed program was implemented with twenty-seven high school students, and six students, three high and three low performers, were selected to analyze their level of systems thinking. Word association, causal maps, and drawings were used to measure and identify any significant change. As a result, the low-level systems thinking group improved their systems thinking skills regarding global warming and the earth and its sub-systems after the intervention. However, participants' misconceptions remained the same. The high-level systems thinking group showed more organized systems thinking skills about the global warming topic. It is suggested that more educational programs be developed on various topics so that high school students can improve their systems thinking skills as well as their knowledge integration of earth systems and the earth environment in the school curriculum.

A Depth-based Disocclusion Filling Method for Virtual Viewpoint Image Synthesis (가상 시점 영상 합성을 위한 깊이 기반 가려짐 영역 메움법)

  • Ahn, Il-Koo;Kim, Chang-Ick
    • Journal of the Institute of Electronics Engineers of Korea SP / v.48 no.6 / pp.48-60 / 2011
  • The 3D community is actively researching 3D imaging and free-viewpoint video (FVV). Free-viewpoint rendering in multi-view video, which lets viewers move virtually through a scene by creating different viewpoints, has become a popular topic in 3D research that can lead to various applications. However, multi-view video is restricted by cost and by the large bandwidth it occupies in video transmission. An alternative that addresses this problem is to generate virtual views using a single texture image and a corresponding depth image. A critical issue in generating virtual views is that regions occluded by foreground (FG) objects in the original views may become visible in the synthesized views. Filling these disocclusions (holes) in a visually plausible manner determines the quality of the synthesis results. In this paper, a new approach for handling disocclusions in synthesized views using a depth-based inpainting algorithm is presented. Patch-based non-parametric texture synthesis, which shows excellent performance, has two critical elements: determining where to fill first and determining which patch to copy. In this work, a noise-robust filling priority using the structure tensor of the Hessian matrix is proposed. Moreover, a patch matching algorithm that excludes the foreground region using the depth map and considers the epipolar line is proposed. The superiority of the proposed method over existing methods is demonstrated by comparing experimental results.
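
The idea of excluding foreground candidates during patch matching can be sketched on one-dimensional toy patches. This is a simplified illustration, not the paper's algorithm: it omits the filling priority and the epipolar constraint, and the function names, the threshold, and the convention that larger depth values are nearer to the camera are all assumptions.

```python
def ssd_known(target, candidate):
    """Sum of squared differences over the target's known pixels (not None)."""
    return sum((t - c) ** 2 for t, c in zip(target, candidate) if t is not None)


def best_background_patch(target, candidates, fg_depth_threshold):
    """Return the lowest-SSD candidate that contains no foreground pixel.

    Each candidate is (pixels, depths). A depth value at or above the
    threshold is treated as foreground (assumed convention: larger = nearer).
    """
    best, best_cost = None, float("inf")
    for pixels, depths in candidates:
        if max(depths) >= fg_depth_threshold:
            continue  # skip patches that overlap a foreground object
        cost = ssd_known(target, pixels)
        if cost < best_cost:
            best, best_cost = pixels, cost
    return best


def fill_holes(target, source):
    """Copy source pixels into the unknown (None) positions of the target."""
    return [s if t is None else t for t, s in zip(target, source)]


target = [10, 12, None, None]  # None marks disoccluded pixels
candidates = [
    ([10, 12, 11, 13], [200, 200, 200, 200]),  # perfect match, but foreground
    ([11, 12, 14, 15], [50, 50, 50, 50]),      # background, close match
    ([30, 31, 32, 33], [40, 40, 40, 40]),      # background, poor match
]
source = best_background_patch(target, candidates, fg_depth_threshold=128)
filled = fill_holes(target, source)
```

The first candidate matches the known pixels exactly but is rejected because of its depth, so the hole is filled from background texture rather than from the occluding object.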

Present States, Methodological Features, and an Exemplar Study of the Research on Learning Progressions (학습 발달과정 연구의 현황, 방법론적 특징 및 연구 사례)

  • Maeng, Seungho;Seong, Yeonseon;Jang, Shinho
    • Journal of The Korean Association For Science Education / v.33 no.1 / pp.161-180 / 2013
  • The purpose of this paper is to introduce current studies and research methods on learning progressions, which have spread to several countries, including the U.S., since 2006. It also provides a methodological basis for investigating learning progressions in science by introducing a case study of a learning progression conducted in Korea. For this study, we described several features of current studies on learning progressions in the U.S. and reported the common methods and sequences employed in examining learning progressions, especially with respect to assessment for learning. Learning progressions are descriptions of developmental pathways for learning a topic, in which science knowledge is used as students engage in science practices. Each learning progression consists of an upper anchor, a lower anchor, and intermediate steps that connect the two anchors. In investigating a learning progression, researchers usually use Wilson's four building blocks of an assessment system, which are based on the assessment triangle. This method was also applied in this study to investigate the learning progression for the water cycle. We discussed implications and considerations for future research on learning progressions in science in Korea.

Mining Intellectual History Using Unstructured Data Analytics to Classify Thoughts for Digital Humanities (디지털 인문학에서 비정형 데이터 분석을 이용한 사조 분류 방법)

  • Seo, Hansol;Kwon, Ohbyung
    • Journal of Intelligence and Information Systems / v.24 no.1 / pp.141-166 / 2018
  • Information technology improves the efficiency of humanities research. In humanities research, information technology can be used to analyze a given topic or document automatically, facilitate connections to other ideas, and increase our understanding of intellectual history. We suggest a method to identify and automatically analyze the relationships between arguments contained in unstructured data collected from humanities writings such as books, papers, and articles. Our method, which is called history mining, reveals influential relationships between arguments and the philosophers who present them. We utilize several classification algorithms, including a deep learning method. To verify the performance of the proposed methodology, philosophers associated with empiricism and rationalism were selected, and their related writings and articles accessible on the internet were collected. The performance of the classification algorithms was measured by Recall, Precision, F-Score, and Elapsed Time. DNN, Random Forest, and Ensemble showed better performance than the other algorithms. Using the selected classification algorithm, we classified the writings of specific philosophers as rationalist or empiricist and generated a history map that takes into account each philosopher's years of activity.
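
For readers unfamiliar with the reported metrics, the following sketch computes Precision, Recall, and F-Score for one class from true and predicted labels. The labels are made-up toy data, not results from the study, and the function name is hypothetical; F-Score is taken here to be the balanced F1 score, which is an assumption.

```python
def precision_recall_f1(y_true, y_pred, positive):
    """Precision, recall, and F1 for the class named by `positive`."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Made-up labels for five documents.
y_true = ["rationalism", "rationalism", "empiricism", "empiricism", "rationalism"]
y_pred = ["rationalism", "empiricism", "empiricism", "rationalism", "rationalism"]
precision, recall, f1 = precision_recall_f1(y_true, y_pred, "rationalism")
```

Here two rationalist documents are found correctly, one is missed, and one empiricist document is mislabeled, so all three scores come out to two thirds.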

Graph Interpretation Ability and Perception of High School Students and Preservice Secondary Teachers in Earth Science (고등학생들과 예비교사들의 지구과학 그래프 해석 능력 및 인식)

  • Lee, Jin-Bong;Lee, Ki-Young;Park, Young-Shin
    • Journal of the Korean earth science society / v.31 no.4 / pp.378-391 / 2010
  • The purpose of this study was to investigate the graph interpretation ability and perceptions of high school students and preservice secondary teachers in Earth science. We developed two instruments: a graph interpretation ability inventory consisting of 18 items covering 9 graph types, and two questionnaires exploring the participants' perceptions of Earth science-related graphs. The results of this study are as follows. First, high school students and preservice secondary teachers demonstrated remarkable ability in interpreting line graphs but limited ability with overlapped graphs and directional change graphs, which means that graph interpretation ability was affected by graph type. Second, the two groups that participated in this study showed a considerable difference in graph interpretation ability depending on grade level. Third, preservice teachers were superior to high school students in discriminating between two graphs that represent the same topic with different representation methods. Fourth, many participants in both groups considered the properties of Earth science graphs to be considerably different from those of graphs in other science subjects, especially for directional change graphs, scatter graphs, contour maps, and domain graphs. The results suggest that effective graph instruction strategies should be developed for Earth science learning.

Factors Influencing Entrepreneurs' Well-Being : Positive Psychological Capital and Antecedents (창업가의 웰빙에 미치는 영향요인 : 긍정심리자본과 선행요인)

  • Kim, Hyeong Min;Kim, Jin Soo
    • Asia-Pacific Journal of Business Venturing and Entrepreneurship / v.15 no.5 / pp.203-220 / 2020
  • The purpose of this study is to examine the importance of entrepreneurs' well-being and to propose several ways to improve it. First, the factors influencing entrepreneurs' well-being were explored through a review of the previous literature, and positive psychological capital, based on positive psychology, was selected as the main factor. In addition, hypotheses were formulated and a research model was constructed by selecting authentic leadership, mastery goal orientation, and social support as the antecedents expected to be associated with positive psychological capital. To empirically analyze the proposed model, a questionnaire survey was conducted among entrepreneurs who had completed the training/coaching/consulting programs offered by the Youth Startup Academy. Using the 133 responses collected, PLS-Structural Equation Analysis, PLS-Multi Group Analysis (MGA), and PLS-Importance-Performance Map Analysis (IPMA) were performed. As a result, it was found that positive psychological capital had a high causal relationship with the well-being of entrepreneurs, and that authentic leadership and mastery goal orientation had a positive effect on positive psychological capital. However, the statistical significance of the impact of social support was not verified. While differences in causality between early and growing stages of start-ups only acted as sub-factors of positive psychological capital, the aforementioned three antecedents showed a significant difference for hope and optimism, respectively. This study expanded the topic of entrepreneurship-related research, which mostly focuses on identifying the performance factors of start-ups. It is meaningful in that it empirically verified the causality model constructed by exploring the factors with respect to the mental well-being of entrepreneurs, which is a main theme in entrepreneurship-related research. 
The results of this study will offer practical insights to entrepreneurs, researchers, and educators.

A Study on the relationship in spatial structure of senior Center in Seoul (서울시 노인종합복지관의 공간 구조적 연결 관계에 관한 연구)

  • Kim, Jin-A;Byun, Dae-Joong
    • Korean Institute of Interior Design Journal / v.21 no.3 / pp.182-193 / 2012
  • The percentage of senior citizens in Korea is increasing, and the country is expected to become an "aging society". Problems facing the elderly, such as physical and mental illness, job loss, and difficulties at home, are becoming a major concern. However, the so-called silver generation has changed a lot in recent years. With the aid of medical advances, the elderly live longer and have become more independent and active. Senior welfare centers are places where the elderly can spend their golden years comfortably and meaningfully. Senior welfare centers now provide many different programs, which naturally leads to an increase in elderly users. With the rise in welfare centers and users, research on the subject has also grown. As this topic has only recently become an issue, there have not been many spatial structure studies that consider elderly movement. Therefore, spatial structure research is needed that considers older users' spatial awareness and how it can be managed effectively. The goal of this study is to present basic resources for providing comfortable senior welfare centers for the elderly. This will be based on quantitative analysis derived from spatial structure research along with spatial construction characteristics based on each institution's general plan. As a research method, senior welfare centers are categorized into corridor, hall, and hybrid types, which are then represented as j-graphs. Based on this, spatial structure characteristics and connection links are identified. The connection links are then analyzed using the space syntax results calculated for each type: integration, connectivity, control value, and intelligibility. The analysis shows that the average arrangement of the senior welfare center j-graphs is ordered hybrid > corridor > hall type. 
Elderly users with reduced awareness need easily perceivable spatial structures, and the hall type would be the best choice for increasing their awareness because it has high articulation. However, the hall type becomes difficult to construct as building size increases, so the hybrid type would be the next logical solution. In hybrid types, spaces with relatively high articulation will need to be planned so that rest areas can be created within the halls of the welfare center in connection with its corridors.
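
Some of the space syntax measures named above can be computed from a j-graph given as an adjacency list. The abstract does not state the formulas used, so this sketch assumes the definitions commonly used in space syntax analysis: connectivity is the number of directly connected spaces, control value is the sum of 1/connectivity over a space's neighbors, and relative asymmetry is 2(MD - 1)/(k - 2) for mean depth MD and k spaces. Integration is usually derived from relative asymmetry (lower asymmetry means higher integration), so the sketch stops there. The five-space hall-type layout is a made-up example.

```python
from collections import deque


def depths_from(graph, root):
    """Breadth-first step depth from root to every reachable space."""
    depth = {root: 0}
    queue = deque([root])
    while queue:
        node = queue.popleft()
        for neighbor in graph[node]:
            if neighbor not in depth:
                depth[neighbor] = depth[node] + 1
                queue.append(neighbor)
    return depth


def space_syntax_measures(graph):
    """Per-space measures; assumes a connected graph with more than 2 spaces."""
    k = len(graph)
    measures = {}
    for node, neighbors in graph.items():
        mean_depth = sum(depths_from(graph, node).values()) / (k - 1)
        measures[node] = {
            "connectivity": len(neighbors),
            "control": sum(1 / len(graph[n]) for n in neighbors),
            "mean_depth": mean_depth,
            "relative_asymmetry": 2 * (mean_depth - 1) / (k - 2),
        }
    return measures


# Made-up hall-type layout: four rooms opening onto one central hall.
hall_type = {
    "hall": ["room_a", "room_b", "room_c", "room_d"],
    "room_a": ["hall"],
    "room_b": ["hall"],
    "room_c": ["hall"],
    "room_d": ["hall"],
}
measures = space_syntax_measures(hall_type)
```

In this layout the hall has zero relative asymmetry and a control value of 4, while every room has a control value of 0.25, which matches the intuition that a central hall dominates access to the spaces around it.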
