• Title/Summary/Keyword: Text Index

Search Result 268

A Primary Study on Building the Secondary Legal Information Full-Text Databases (2차 법률정보 전문데이터베이스 구축을 위한 기초 연구)

  • Kweon Kie-Won;Roh Jeong-Ran
    • Journal of the Korean Society for Library and Information Science / v.32 no.3 / pp.281-296 / 1998
  • This study argues that an information system becomes more essential and intelligent when it builds in the characteristic knowledge that information experts recognize, that is, the experiential and inherent knowledge only human beings possess, rather than approaching the system through linguistic, statistical, or structuralist methods alone. The study shows that, because of the characteristics of legal information, the primary legal information cited within secondary legal information functions as an index that represents the contents of the text, so automatic indexing of secondary legal full-text databases is possible without the assistance of experts. When a law is enacted, amended, or repealed, index terms can be updated by revising the legal text cited in the secondary legal information full-text databases. Automatic indexing is also possible even when the full text of retrospective documents is not entered, and expert knowledge and integrated databases can be built and put into practice for such documents.


About the Post-Cinematic Characteristics and Desire Shown in a Film (영화 <파란만장>에 나타난 욕망과 포스트시네마적인 특성에 대하여)

  • Son, Seong-Woo
    • Journal of Korea Entertainment Industry Association / v.13 no.3 / pp.121-129 / 2019
  • This study focuses on text analysis, the production methods of the text, and the reproduction of those production methods, based on the film 파란만장 (2010), which was shot on mobile devices. As a digital film in which objects and images have no indexical character, the work has post-cinematic attributes in terms of how consumers receive it. The study attends to the interaction between essential change and production/consumption across film culture as a whole, viewed from the side of reception. Just as the main character in the film is a mediating shaman, the film itself occupies a mediating position for cinematic possibility. The film involves several kinds of mediation: the mediation of the shaman inside the text, the mediation of the film in the relationship between text and consumers, and, outside the text, consumers' instrumental desire for others' tools. Outside the text, the film stimulates the imitative desire of consumer subjects as others. In other words, it connects to the desire of consumers who aim to create a digital film with mobile devices as authors. This in turn connects to Simondon's thinking, in which such technical objects not only generate new relationships but also become a revolutionary seed that newly collectivizes human society.

A Basic Study on the Application of Text-Mining Method for Qualitative Evaluation through Barrier Free Certification in School Facilities (학교시설의 장애물 없는 생활환경(Barrier Free) 인증 사례를 통한 정성평가 텍스트마이닝 기법 적용에 관한 기초연구)

  • Yun, Pyeong-Se;Lee, Jong-Kuk
    • The Journal of Sustainable Design and Educational Environment Research / v.19 no.1 / pp.25-35 / 2020
  • Since BF certification was introduced, a total of 6,432 certificates had been issued as of February 2020. Educational research facilities account for 1,091 cases (754 preliminary certifications and 337 main certifications) out of 6,237 buildings, which corresponds to about 20% of BF certifications. Qualitative evaluation is conducted with a focus on three items of the BF-certified building evaluation index, namely medium facilities, internal facilities, and sanitary facilities, and major keywords are then derived through text mining analysis of the results. The analysis found problems with access paths in the case of the facilities, and showed that assessment indicators for users are needed among the assessment steps for internal facilities. Finally, for sanitary facilities, toilets installed in residential development facilities were found to need improvement. Based on these results, the study suggests directions for improving the evaluation index required for BF-certified school facilities.

Inverted Index based Modified Version of K-Means Algorithm for Text Clustering

  • Jo, Tae-Ho
    • Journal of Information Processing Systems / v.4 no.2 / pp.67-76 / 2008
  • This research proposes a new strategy for text clustering in which documents are encoded into string vectors and the k-means algorithm is modified to operate on string vectors. Traditionally, when the k-means algorithm is used for pattern classification, raw data must be encoded into numerical vectors. This encoding may be difficult, depending on the application area. For example, in text clustering, encoding full texts given as raw data into numerical vectors leads to two main problems: huge dimensionality and sparse distribution. In this research, we encode full texts into string vectors and adapt the k-means algorithm to string vectors for text clustering.
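
The contrast between numerical vectors and string vectors described in this abstract can be illustrated with a small sketch. The abstract does not say how string vectors are built, so the encoding here (keep the most frequent words of each document, in rank order) is an assumption made only for illustration, and the function names `numerical_vector` and `string_vector` are hypothetical.

```python
from collections import Counter


def tokenize(text):
    """Lowercase whitespace tokenizer; a stand-in for real preprocessing."""
    return text.lower().split()


def numerical_vector(tokens, vocabulary):
    """Bag-of-words encoding: one dimension per vocabulary word."""
    counts = Counter(tokens)
    return [counts[word] for word in vocabulary]


def string_vector(tokens, size):
    """Assumed string-vector encoding: the `size` most frequent words.

    Ties are broken alphabetically so the result is deterministic.
    """
    counts = Counter(tokens)
    ranked = sorted(counts, key=lambda word: (-counts[word], word))
    return ranked[:size]


docs = [
    "index text index search",
    "cluster text cluster vector",
    "legal database legal index",
]
tokenized = [tokenize(d) for d in docs]

# The numerical encoding needs one dimension per distinct word in the corpus.
vocabulary = sorted({word for tokens in tokenized for word in tokens})
dense = [numerical_vector(tokens, vocabulary) for tokens in tokenized]
zero_cells = sum(value == 0 for row in dense for value in row)
sparsity = zero_cells / (len(dense) * len(vocabulary))

# The string encoding keeps a short, fixed-length list of words instead.
strings = [string_vector(tokens, 2) for tokens in tokenized]

print(len(vocabulary), round(sparsity, 2))
print(strings)
```

Even in this toy corpus more than half of the numerical cells are zero, and the vocabulary grows with every new document, while each string vector stays at the chosen length.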

Inverted Index based Modified Version of KNN for Text Categorization

  • Jo, Tae-Ho
    • Journal of Information Processing Systems / v.4 no.1 / pp.17-26 / 2008
  • This research proposes a new strategy for text categorization in which documents are encoded into string vectors and KNN is modified to operate on string vectors. Traditionally, when KNN is used for pattern classification, raw data must be encoded into numerical vectors. This encoding may be difficult, depending on the application area. For example, in text categorization, encoding full texts given as raw data into numerical vectors leads to two main problems: huge dimensionality and sparse distribution. In this research, we encode full texts into string vectors and adapt the supervised learning algorithm to string vectors for text categorization.
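
A KNN classifier over string vectors might look roughly like the sketch below. The abstract does not define the similarity measure, so two assumptions are made here and should not be read as the paper's method: word similarity is the Jaccard overlap of the words' posting sets in an inverted index, and string-vector similarity is the average best-match word similarity. All function names are hypothetical.

```python
from collections import Counter, defaultdict


def build_inverted_index(corpus):
    """Map each word to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in enumerate(corpus):
        for word in text.lower().split():
            index[word].add(doc_id)
    return index


def word_similarity(a, b, index):
    """Assumed measure: Jaccard overlap of the two words' posting sets."""
    postings_a = index.get(a, set())
    postings_b = index.get(b, set())
    union = postings_a | postings_b
    return len(postings_a & postings_b) / len(union) if union else 0.0


def vector_similarity(u, v, index):
    """Assumed measure: average, over words in u, of the best match in v."""
    if not u or not v:
        return 0.0
    total = sum(max(word_similarity(a, b, index) for b in v) for a in u)
    return total / len(u)


def knn_classify(query, training, index, k=3):
    """Majority vote among the k training string vectors most similar to query."""
    ranked = sorted(
        training,
        key=lambda item: vector_similarity(query, item[0], index),
        reverse=True,
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


corpus = [
    "court law statute",
    "law statute ruling",
    "film camera scene",
    "film scene director",
]
index = build_inverted_index(corpus)
training = [
    (["court", "law"], "legal"),
    (["statute", "ruling"], "legal"),
    (["film", "camera"], "cinema"),
    (["scene", "director"], "cinema"),
]

print(knn_classify(["law", "statute"], training, index, k=3))
print(knn_classify(["film", "director"], training, index, k=3))
```

No numerical document vectors are built at any point; the only numbers involved are the similarity scores derived from the inverted index.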

Analysis of Business Performance of Local SMEs Based on Various Alternative Information and Corporate SCORE Index

  • HWANG, Sun Hee;KIM, Hee Jae;KWAK, Dong Chul
    • The Journal of Economics, Marketing and Management / v.10 no.3 / pp.21-36 / 2022
  • Purpose: The purpose of this study is to compare and analyze enterprises' score indexes calculated from atypical data and corrected data. Research design, data, and methodology: News articles, which are non-financial, qualitative information, were collected for 2,432 SMEs extracted by "square proportional stratification" from 18,910 enterprises with fixed data, and each enterprise's score index was compared and analyzed using text mining. Result: The analysis showed that qualitative data can be quantitatively evaluated by region, industry, and period by collecting news about SMEs, and that there are concerns that it could be an element of alternative credit evaluation. Conclusion: News data cannot be collected when a small business is self-employed or has little or no news coverage. Data normalization or standardization should be considered to overcome differences in scores caused by the amount of reference material. Furthermore, since keyword sentiment analysis may yield different results depending on the researcher's point of view, deep learning sentiment analysis conducted at the sentence level should also be considered.

Multi-Dimensional Keyword Search and Analysis of Hotel Review Data Using Multi-Dimensional Text Cubes (다차원 텍스트 큐브를 이용한 호텔 리뷰 데이터의 다차원 키워드 검색 및 분석)

  • Kim, Namsoo;Lee, Suan;Jo, Sunhwa;Kim, Jinho
    • Journal of Information Technology and Architecture / v.11 no.1 / pp.63-73 / 2014
  • With the advance of the WWW, unstructured data, including text, is attracting more and more user interest. Because this unstructured data created by WWW users represents their subjective opinions, appropriate analysis can yield very useful information such as users' personal tastes or perspectives. In this paper, we provide efficient analyses of unstructured text documents by taking advantage of OLAP (On-Line Analytical Processing) multidimensional cube technology. OLAP cubes have been widely used for multidimensional analysis of structured data such as simple alphabetic and numeric data, but they have not been used for unstructured data consisting of long texts. To provide multidimensional analysis of unstructured text data, the Text Cube model has recently been proposed. It incorporates term frequency and the inverted index, which play key roles in information retrieval, as measures for searching and analyzing text databases. The primary goal of this paper is to apply the text cube model to a real data set from an Internet site that shares hotel information and to provide multidimensional analysis of users' hotel reviews written as text. To achieve this goal, we first build text cubes for the hotel review data. Using the text cubes, we design and implement a system that provides multidimensional keyword search features for searching and analyzing review texts along various dimensions. This system can help users easily obtain valuable summary information reflecting guests' subjective views. Furthermore, this paper evaluates the proposed system through various experiments, which reveal its effectiveness.
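
The idea of a text cube that stores term frequency and an inverted index as cell measures can be sketched as follows. This is a minimal illustration under stated assumptions, not the system described in the paper: the class name `TextCube`, its methods, and the dimensions `city` and `rating` are all hypothetical.

```python
from collections import Counter, defaultdict


class TextCube:
    """Toy text cube: each cell holds term frequencies and an inverted index."""

    def __init__(self, dimensions):
        self.dimensions = tuple(dimensions)
        # cell coordinates -> term -> number of occurrences in that cell
        self.term_freq = defaultdict(Counter)
        # cell coordinates -> term -> ids of documents containing the term
        self.inverted = defaultdict(lambda: defaultdict(set))

    def add(self, doc_id, coords, text):
        """Place a document in the cell given by its dimension values."""
        cell = tuple(coords[d] for d in self.dimensions)
        for term in text.lower().split():
            self.term_freq[cell][term] += 1
            self.inverted[cell][term].add(doc_id)

    def _matches(self, cell, filters):
        return all(
            cell[self.dimensions.index(dim)] == value
            for dim, value in filters.items()
        )

    def search(self, keyword, **filters):
        """Keyword search restricted to cells that match the given filters."""
        hits = set()
        for cell, postings in self.inverted.items():
            if self._matches(cell, filters):
                hits |= postings.get(keyword.lower(), set())
        return sorted(hits)

    def rollup_tf(self, keyword, by):
        """Aggregate a term's frequency along one dimension (roll-up)."""
        position = self.dimensions.index(by)
        totals = Counter()
        for cell, counts in self.term_freq.items():
            totals[cell[position]] += counts[keyword.lower()]
        return dict(totals)


cube = TextCube(["city", "rating"])
cube.add(1, {"city": "Seoul", "rating": 5}, "clean room great breakfast")
cube.add(2, {"city": "Seoul", "rating": 2}, "noisy room")
cube.add(3, {"city": "Busan", "rating": 5}, "clean beach view clean lobby")

print(cube.search("clean"))
print(cube.search("room", city="Seoul"))
print(cube.rollup_tf("clean", by="city"))
```

Keeping both measures per cell means a keyword query can be sliced by any combination of dimensions without rescanning the review texts.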

A Study on Developing a Metadata Search System Based on the Text Structure of Korean Studies Research Articles (한국학 연구 논문의 텍스트 구조 기반 메타데이터 검색 시스템 개발 연구)

  • Song, Min-Sun;Ko, Young Man;Lee, Seung-Jun
    • Journal of the Korean Society for Information Management / v.33 no.3 / pp.155-176 / 2016
  • This study aims to develop a scholarly metadata information system based on the conceptual elements of the text structure of Korean studies research articles and to assess the applicability of text-structure-based metadata in comparison with an existing similar system. For the study, we constructed a database (Korean Studies Metadata Database, KMD) of text-structure-based metadata for Korean studies journal articles selected from the Korea Citation Index (KCI). We then examined the differences between the KCI system and the KMD system by comparing search results for the same keywords. The KMD system returned search results that meet users' search intentions more efficiently than the KCI system. In other words, even when the keyword combinations and conditional expressions of a search are the same, the KMD system can directly present content such as research purposes, research data, and the spatial-temporal contexts of research as search results.

Graph based KNN for Optimizing Index of News Articles

  • Jo, Taeho
    • Journal of Multimedia Information System / v.3 no.3 / pp.53-61 / 2016
  • This research frames index optimization as a classification task and applies a graph based KNN to it. Index optimization is an important task for maximizing information retrieval performance. We try to solve the problems of encoding words into numerical vectors, such as huge dimensionality and sparse distribution, by encoding words into graphs as alternative representations. In this research, index optimization is viewed as a classification task, a similarity measure between graphs is defined, the KNN is modified into a graph based version using that similarity measure, and the modified version is applied to the index optimization task. The expected benefits of modifying the KNN in this way are improved classification performance, more graphical representations of words, which are inherent in graphs, and the ability to trace the results of classifying words more easily. We empirically validate the proposed version in optimizing the index on two text collections: NewsPage.com and 20NewsGroups.