• Title/Summary/Keyword: Text Construction

Construction of Event Networks from Large News Data Using Text Mining Techniques (텍스트 마이닝 기법을 적용한 뉴스 데이터에서의 사건 네트워크 구축)

  • Lee, Minchul;Kim, Hea-Jin
    • Journal of Intelligence and Information Systems
    • /
    • v.24 no.1
    • /
    • pp.183-203
    • /
    • 2018
  • News articles are the most suitable medium for examining events occurring at home and abroad. In particular, as advances in information and communication technology have produced many kinds of online news media, the volume of news about events in society has grown greatly. Automatically summarizing key events from massive amounts of news data therefore helps users survey many events at a glance. In addition, an event network built on the relevance between events can greatly help readers understand current events. In this study, we propose a method for extracting event networks from large news text data. To this end, we first collected Korean political and social articles from March 2016 to March 2017, then kept only meaningful words and merged synonyms through preprocessing with NPMI and Word2Vec. Latent Dirichlet allocation (LDA) topic modeling was used to calculate the topic distribution by date, and events were detected by finding the peaks of that distribution. A total of 32 topics were extracted, and the time at which each event occurred was inferred from the point at which the corresponding topic distribution surged. A total of 85 events were detected in this way, of which 16 final events were retained after filtering with Gaussian smoothing. We also calculated a relevance score between the detected events to construct the event network: the relevance was computed as the cosine coefficient between co-occurring events, and related events were connected. Finally, we built the event network with each event as a vertex and the relevance score between events as the weight of the edge connecting the corresponding vertices.
The event network constructed with our method helped us sort the major political and social events that occurred in Korea over the past year into chronological order and, at the same time, identify which events are related to which. Our approach differs from existing event detection methods in that LDA topic modeling makes it easy to analyze large amounts of data and to identify relevance between events that was difficult to detect with existing methods. In text preprocessing we applied various text mining techniques together with Word2Vec to improve the accuracy of extracting proper nouns and compound nouns, which have been difficult to handle in existing Korean text analysis. The event detection and network construction techniques of this study have the following advantages in practical application. First, LDA topic modeling, being unsupervised, can easily extract topics, topic words, and their distributions from huge amounts of data. Also, using the date information of the collected news articles, the distribution of each topic can be expressed as a time series. Second, by calculating relevance scores from the co-occurrence of topics and constructing an event network, we can present connections between events in a summarized form, which is difficult to grasp with existing event detection. This is supported by the fact that the relevance-based event network proposed in this study was in fact constructed in order of occurrence time. The event network also makes it possible to identify which event was the starting point of a series of events. A limitation of this study is that LDA topic modeling gives different results depending on the initial parameters and the number of topics, and that the topic and event names in the analysis results must be assigned by the subjective judgment of the researcher. Also, since each topic is assumed to be exclusive and independent, the relevance between topics is not taken into account. Subsequent studies need to calculate the relevance between events that are not covered in this study or that belong to the same topic.
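
The pipeline described in this abstract (smooth each topic's daily share, treat surges as events, then link events by cosine similarity) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the `sigma`, `threshold`, and `min_score` parameters, and the idea of representing each event as a plain numeric vector are all assumptions made for illustration only.

```python
import math


def gaussian_smooth(series, sigma=1.0):
    """Smooth a 1-D series with a truncated, renormalised Gaussian kernel."""
    radius = max(1, int(3 * sigma))
    smoothed = []
    for i in range(len(series)):
        num = 0.0
        den = 0.0
        lo = max(0, i - radius)
        hi = min(len(series), i + radius + 1)
        for j in range(lo, hi):
            w = math.exp(-((i - j) ** 2) / (2 * sigma ** 2))
            num += w * series[j]
            den += w
        smoothed.append(num / den)
    return smoothed


def detect_peaks(series, threshold):
    """Return interior indices that are local maxima above `threshold`."""
    return [
        i
        for i in range(1, len(series) - 1)
        if series[i] > threshold
        and series[i] > series[i - 1]
        and series[i] >= series[i + 1]
    ]


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def build_event_network(event_vectors, min_score=0.5):
    """Map each event pair to its cosine score, keeping pairs >= min_score.

    `event_vectors` is a hypothetical dict: event name -> numeric vector.
    Events are vertices; the returned dict holds the weighted edges.
    """
    names = list(event_vectors)
    edges = {}
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            score = cosine(event_vectors[a], event_vectors[b])
            if score >= min_score:
                edges[(a, b)] = score
    return edges


# Toy data: one topic's daily share with a single surge on day 3.
daily_share = [0, 0, 0, 9, 0, 0, 0]
event_days = detect_peaks(gaussian_smooth(daily_share), threshold=1.0)

# Toy event vectors: A and B co-occur, C does not.
network = build_event_network({"A": [1, 0, 1], "B": [1, 0, 1], "C": [0, 1, 0]})
```

Smoothing before peak detection is what lets small, noisy spikes fall below the threshold, which matches the abstract's reduction from many candidate events to a shorter final list; the threshold itself is a free parameter in this sketch.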

A Study on the Yousang-Dae Goksuro(Curve-Waterway) in Gangneung, Yungok-Myun, Yoodung Ri (강릉 연곡면 유등리 '유상대(流觴臺)' 곡수로(曲水路)의 조명(照明))

  • Rho, Jae-Hyun;Shin, Sang-Sup;Lee, Jung-Han;Huh, Jun;Park, Joo-Sung
    • Journal of the Korean Institute of Traditional Landscape Architecture
    • /
    • v.30 no.1
    • /
    • pp.14-21
    • /
    • 2012
  • The objects of this study, Yousang-Dae(流觴臺) and the engraved Go board text on the flat rock in Gangneung-si Yungok-myun Yoodung-ri Baemgol, reveal that the place was used for appreciating arts such as Yusang Goksu and Taoist hermits' games. Three detailed reconnaissance surveys produced the following results. The text Manwolsan(滿月山) Baegundongcheon(白雲洞天) is engraved on a rock at Baegunsa(白雲寺), a temple that was built by Doun in the first year of King Hungang (875) of Unified Shilla, fell into ruin in the middle of the Joseon period, and was rebuilt in 1954. The text is invaluable evidence that the culture of Taoist hermits and Sunbee (classical scholars) took root in Baemgol Valley. The second volume of Donghoseungram(東湖勝覽), the chronicle of Gangneung published by Choi Baeksoon in 1934, records that 'Baegunsa in Namjeonhyeon is the classroom where famous teachers like Yulgok Lee Yi or Seongje Choi Ok were teaching', which verifies the historic character of the place. In addition, the management of Nujeong(樓亭) and Dongcheon can be traced through Baegunjeong(白雲亭), constructed by Kim Yoonkyung(金潤卿) in the Muo year, the 9th year of Cheoljong (1858), according to Donghoseungram and the completed version of Jeungboyimyoungji(增補臨瀛誌). Also, Baegundongdongcheon(白雲亭洞天), the text engraved on the standing stone across the stream from the Yousang-Dae stone, was created three years after the construction of Baegunjeong, in the 12th year of Cheoljong (1861), and serves as a symbolic sign closely related to Yousang-Dae. Based on these premises and circumstances, by carefully studying the remains of the 'Yusang-dae' Goksuro, we discovered the Sebun-seok(細分石), which controlled the amount and speed of the flowing water, and the remaining furrows of the Keumbae-seok(擒盃石) and Yubae-gong(留盃孔), which held the water stream carrying the cups, in the mountain stream and rocks around Yusang-Dae.
In addition, the discovery of 21 people's names engraved under the statement 'Oh-Seong(午星)' on the bottom of the rock clearly confirms that the place was one of the main cultural footholds for appreciating the arts in the manner of Yu-Sang-Gok-Su-Yeon(流觴曲水宴) until the middle of the 20th century. It implies that the Sunbees' culture of art appreciation was handed down, centered on Yusang-dae, in this particular place until the middle of the 20th century. The place needs to be studied in depth because it is a historic and unique cultural site where 'Confucianism, Buddhism, and Zen' were combined. Based on the results of this study, the identification of the 23 people, as well as of the writer of the Yusang-Dae text, should be studied carefully and in depth in terms of the characteristics of the place by gathering data on the appreciation of arts such as Yusanggoksu. Likewise, we should make efforts to discover the chess board engraved on the rock that is described in the documents, and we should consider establishing plans to recover the original shape of the place, for example by breaking up the cement pavement of the road, carrying out additional excavation, changing the existing route, and so forth.

Design of Large-set Off-line Handwritten Hangul Database Construction (대용량 오프라인 한글 글씨 데이타베이스의 설계)

  • Lee, S.W.;Song, H.H.;Kim, J.S.;Lee, E.J.;Park, H.S.
    • Annual Conference on Human and Language Technology
    • /
    • 1995.10a
    • /
    • pp.131-136
    • /
    • 1995
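  • Research on off-line handwritten Hangul recognition, which aims to automate information input by recognizing naturally written Hangul, has recently been active. A large-scale off-line handwritten Hangul database is a research environment that such work must have, and this paper briefly introduces the current state of construction of the off-line handwritten Hangul database being pursued as part of the Korean language information base development project of the Korean Language Engineering Center at the Systems Engineering Research Institute. Construction of the database consists of four stages: database design, handwriting data collection, form scanning and character-level segmentation, and database verification. In this work, collecting handwriting with diverse variations was treated as the most important factor in building the database, and much time was devoted to the design and verification stages in order to build a high-quality, consistent handwriting database. Finally, a convenient user interface was implemented using the HTML (Hyper Text Markup Language) of the WWW (World Wide Web), so that users can easily search Hangul handwriting images and also obtain files in a form usable for developing recognition algorithms. At present, 1,000 sets of handwriting for the 520 most frequently used of the 2,350 KS C precomposed (Wansung) Hangul characters have been collected and a gray-scale image database is being built; handwriting data for the remaining 1,830 characters will be collected over the next two years to complete the database. The database will soon be provided to researchers of off-line handwritten Hangul recognition in Korea as important experimental data for developing superior recognition algorithms, and it is expected to contribute greatly to objective performance evaluation of the recognition systems developed and to stimulate research on off-line handwritten Hangul recognition in Korea.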

Educational Web Design Taking Usability into Consideration (Focused on VRML Educational Web Page) (유저빌러티를 고려한 교육용 웹 디자인 (VRML교육용 웹 페이지 중심으로))

  • Kim, Nam-Hee;Kim, Tae-Wan
    • Journal of Korea Game Society
    • /
    • v.2 no.1
    • /
    • pp.16-22
    • /
    • 2002
  • Since the Internet was introduced to Korea, Web pages have developed from text-centered to graphic-centered in their first stage, and they are now moving toward user-oriented design. Furthermore, with the accelerated construction of the information superhighway and the spread of basic multimedia technology, the educational environment has shifted to a learner-focused, Internet-based service that transcends time and space. Consequently, the structure of education is changing from instructor-led, one-way teaching to student-centered learning. In addition, the sharing and digitalization of information improve the quality of education and reduce its cost. Although there are plenty of educational Web pages on the Internet, many of them are inconvenient for users to put into practice. The reason is that many experts overlook a fundamental point: Web design must be accompanied by basic design skills and an understanding of the Web. Therefore, this thesis identifies the points that should be considered for the use of Web pages and implements an educational Web page for VRML training that reflects them.

Analysis on Triangle Determination and Congruence (삼각형의 결정과 합동의 분석)

  • Kim, Su-Hyun;Choi, Yoon-Sang
    • Journal of the Korean School Mathematics Society
    • /
    • v.10 no.3
    • /
    • pp.341-351
    • /
    • 2007
  • The primary purpose of this treatise is to suggest solutions for the errors concerning triangle determination and congruence found in every Korean mathematics textbook for 7th graders, as follows: showing that SsA, along with SSS, SAS, and ASA, should also be included as a condition for triangle determination, congruence, and similarity; proving that, contrary to what has been believed, minimality applies only to congruence and similarity but not to determination; examining related Euclidean propositions; discussing the confusion about the characteristics of determination and congruence; and considering the negative effects of giving definite figures in construction education. The secondary purpose is to analyze the significance of triangle determination, which is not dealt with in either Euclid's Elements or textbooks in the U.S. or Japan, and to suggest a way to deal effectively with triangle determination and congruence in education.
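
As a worked illustration of why SsA can serve as a determination condition, the following Python sketch solves the law of cosines for the unknown third side. It is not taken from the paper: the function name `third_sides` and its argument convention (side `a` lies opposite the given angle, side `b` is adjacent to it) are assumptions chosen for this example. From a² = b² + c² − 2bc·cos A, the third side is c = b·cos A ± √(a² − b²·sin² A), and only positive roots correspond to triangles.

```python
import math


def third_sides(a, b, angle_a_deg):
    """Return every possible third side c, given sides a, b and angle A.

    Hypothetical convention: `a` is the side opposite angle A and `b` is
    the side adjacent to it. Solves a^2 = b^2 + c^2 - 2*b*c*cos(A) for c.
    """
    if a <= 0 or b <= 0 or not 0 < angle_a_deg < 180:
        raise ValueError("sides must be positive and 0 < A < 180 degrees")
    angle = math.radians(angle_a_deg)
    disc = a * a - (b * math.sin(angle)) ** 2
    if disc < 0:
        return []  # side a is too short to reach the opposite ray
    root = math.sqrt(disc)
    candidates = {b * math.cos(angle) + root, b * math.cos(angle) - root}
    return sorted(c for c in candidates if c > 1e-12)


# SsA: the given angle is opposite the LONGER side, so c is unique.
unique = third_sides(5, 3, 90)

# Angle opposite the SHORTER side: two different triangles fit the data.
ambiguous = third_sides(3, 5, 30)
```

When `a > b` the second root is always negative, so exactly one triangle exists; when the angle is opposite the shorter side, both roots can be positive, which is the classic ambiguous case that plain SSA does not exclude.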

A Study on the Green Design for a Drink Vending Machine (음료자동판매기의 그린디자인에 관한 연구)

  • 문금희
    • Archives of design research
    • /
    • no.18
    • /
    • pp.177-186
    • /
    • 1996
  • With changes in the patterns and environment of the national standard of living, the problem of environmental pollution has become increasingly serious. Because of the enormous increase in various kinds of used and (after utilization) useless articles, efforts to save resources and the environment and the promotion of repeated utilization and recycling are unavoidable. The recognition of environmental and health problems and the desire for non-pollution have created a desire for environment-friendly products in order to avoid environmentally harmful consumerism. Drink vending machines that use vessels only once are closely related to the environmental problem. It is therefore necessary to develop an ecologically designed vending machine. In this study, the background and concepts of green design and the classification, construction, and environment of drink vending machines are analyzed. From this starting point, a design for a drink vending machine is developed along two concepts: Type A (separated-gathering type) and Type B (recycling type). Then three different types of vending machines are introduced: a wall-adherable type, a center-establishable type, and a desktop type. The conclusion of the text is threefold: there is a need for ecological design of vending machines, for ergonomic considerations, and for harmonization of the style (appearance) of the machine with its surroundings.

Compilation of the Yonsei English Learner Corpus (YELC) 2011 and Its Use for Understanding Current Usage of English by Korean Pre-university Students (한국 예비 대학생의 영어 사용 특성 파악을 위한 대규모 공개 영어 학습자 코퍼스 구축 및 분석)

  • Rhee, Seok-Chae;Jung, Chae Kwan
    • The Journal of the Korea Contents Association
    • /
    • v.14 no.11
    • /
    • pp.1019-1029
    • /
    • 2014
  • In recent years, researchers have become increasingly interested in the creation and pedagogical use of English learner corpora. Many studies have shown that learner corpora can not only make a significant contribution to second language acquisition research but also contribute to the construction and evaluation of language tests by advancing our understanding of English learners. So far, however, little attention has been paid to the Korean EFL (English as a foreign language) learners' corpus. The Yonsei English Learner Corpus (YELC 2011) is a specialized, monolingual, and synchronic Korean EFL learner corpus that was developed by Yonsei University from 2011 to 2012. Over 3,000 Korean high school graduates (or equivalents) who were accepted by Yonsei University for their further studies participated in this project. It consists of 6,572 written texts (1,085,828 words) at nine different English proficiency levels. In this paper, we describe its compilation, and more specifically, how we have corpusized from a text archive to a corpus. After introducing the process of corpusization, we report arresting insights into the specific linguistic features that different proficiency levels of Korean learners of English have. This study also discusses the potential use of the YELC 2011 which is now freely available for research purposes.

A Fast stream cipher Canon (고속 스트림 암호 Canon)

  • Kim, Gil-Ho
    • Journal of Korea Society of Industrial Information Systems
    • /
    • v.17 no.7
    • /
    • pp.71-79
    • /
    • 2012
  • We propose the stream cipher Canon, which provides the confidentiality and integrity needed when building wireless sensor networks. Canon generates a 128-bit keystream from a 128-bit secret key and a 128-bit IV, and produces 128-bit ciphertext through a whitening process that combines the generated keystream with 128-bit plaintext. For easy hardware implementation and fast software execution, the algorithm consists only of simple logic operations. In particular, because it does not use S-boxes for its non-linear operations, hardware implementation is very easy. In speed tests the proposed cipher ran faster than AES and Salsa20, and its gate count is smaller than that of Trivium. Canon is intended as a 128-bit stream cipher that is easy to implement in both software and hardware for applications with very constrained physical environments, such as mobile phones, wireless Internet environments, DRM (Digital Rights Management), wireless sensor networks, and RFID.
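
The data flow in this abstract (128-bit key and 128-bit IV in, 128-bit keystream out, keystream combined with a 128-bit plaintext block) can be sketched in Python as below. This is not Canon: the abstract does not give the cipher's internals, so `placeholder_keystream` is a clearly labeled stand-in built on SHA-256, and treating the whitening step as a plain XOR is an assumption rather than something the abstract confirms. The sketch is for illustration only and is not suitable for real use.

```python
import hashlib

BLOCK_BYTES = 16  # 128 bits


def placeholder_keystream(key: bytes, iv: bytes) -> bytes:
    """Derive one 128-bit keystream block from a 128-bit key and IV.

    Stand-in only: this is NOT the Canon keystream generator, whose
    design is not described in the abstract.
    """
    if len(key) != BLOCK_BYTES or len(iv) != BLOCK_BYTES:
        raise ValueError("key and IV must each be 16 bytes (128 bits)")
    return hashlib.sha256(key + iv).digest()[:BLOCK_BYTES]


def xor_block(block: bytes, keystream: bytes) -> bytes:
    """Combine a block with the keystream by XOR (assumed whitening)."""
    if len(block) != len(keystream):
        raise ValueError("block and keystream must have equal length")
    return bytes(p ^ k for p, k in zip(block, keystream))


key = bytes(range(16))
iv = bytes(16)
plaintext = b"sixteen byte msg"

keystream = placeholder_keystream(key, iv)
ciphertext = xor_block(plaintext, keystream)
recovered = xor_block(ciphertext, keystream)  # XOR is its own inverse
```

Because XOR is self-inverse, the same function serves for encryption and decryption, which is one reason stream ciphers of this shape need only simple logic operations on the data path.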

Construction of Multimedia Information System to Guide Urban Information - at the city of Chin-ju - (도시정보안내를 위한 멀티미디어 정보시스템구축 - 진주시를 중심으로 -)

  • 유환희;조해용;김성우
    • Journal of the Korean Society of Surveying, Geodesy, Photogrammetry and Cartography
    • /
    • v.15 no.1
    • /
    • pp.63-73
    • /
    • 1997
  • The objective of the informatization plan that the government is carrying out is to modernize the information service system so that it is diverse and speedy. With the increasing variety and volume of information available in the city, it has become necessary to develop a more efficient system that not only stores and manages the information but also offers various displays using computer graphics and multimedia functions. The multimedia urban information system that we developed was designed to furnish various kinds of city information to citizens more efficiently, using Visual BASIC on an inexpensive personal computer. Text, voice, and dynamic image data were integrated in this system with multimedia tools. In addition, a database was established to provide expert data, such as peak-hour traffic volume, traffic accidents, and road information, as well as general urban information.

Documentation of the History of Ok-Cheon Catholic Church by standardized 2D CAD and 3D Digital Modeling (표준화된 2D CAD와 3D Digital Modeling을 이용한 옥천천주교회의 연혁 기록)

  • Kim, Myung-Sun;Choi, Soon-Yong
    • Journal of the Korea Academia-Industrial cooperation Society
    • /
    • v.12 no.1
    • /
    • pp.523-528
    • /
    • 2011
  • Ok-Cheon Catholic Church has been altered four times since its first construction in 1955. The first three changes were minor ones to windows, doors, roof finish, and the like, but the last alteration extended its plan from a 一 shape to a long cross shape, and with it the size, structure, and form of the building changed. This history of the church has not been recorded in drawings but only in text, with indistinct features left undocumented. This study produces new 2D CAD files, using layers matched to the changes, and 3D digital models; these hold not only present information but also information on the changes to the church. They are useful data for effective management, conservation, restoration, or possible reuse of the building.