• Title/Summary/Keyword: Entity Search

Recognition of Roads and Districts from Maps (지도에서 도로와 블록 인식)

  • Jang, Kyung-Shik;Kim, Jai-Hie
    • The Transactions of the Korea Information Processing Society / v.4 no.9 / pp.2289-2298 / 1997
  • This paper proposes a new method for recognizing maps. To minimize the ripple effect of one recognition result on another, the structural information is represented with a hierarchical model, and the model is used in both the recognition and verification processes. Furthermore, lines related to an entity are searched in a reduced search space by defining relations between lines. When verification detects a misrecognition, the recognition process is retried. In the retry, an accurate result can be obtained by changing the parameter values used in the algorithm. As a result, the search space is reduced effectively, and even objects that contain broken lines and crossed lines are recognized.

Design of Unification Meta-data and Entity-Relationship Model for Educational Digital Content (교수.학습 디지털 컨텐트 통합 메타데이터 및 개체-관계 모델 설계)

  • Koo, Duk-Hoi
    • Journal of The Korean Association of Information Education / v.6 no.3 / pp.317-327 / 2002
  • The need to support ICT-based teaching and learning at elementary and secondary schools has led to various digital content service systems. These systems are designed with teachers and students as the major users. Their problem is that they do not provide services such as integrated search or a systematic interface that reflects how actual users work with teaching and learning digital content, because each system was created ad hoc as demand arose. To solve this problem, this study proposes integrated meta-data items for teaching and learning digital content that reflect Dublin Core Education, the international meta-data standard. It also designs an entity-relationship model to realize the digital content. The integrated meta-data and the entity-relationship model will serve as basic research to help users search for various teaching and learning digital content on an integrated basis and to realize a consistent user interface. Furthermore, they are expected to contribute to the development of a service system that teachers and students can make better use of.

Performance Enhancement of Fast-Moving Object by Location Scheme in FMIP (FMIP에서 위치 관리 기법을 사용한 고속 이동체의 이동 성능 개선 방법)

  • Kim, Mi-Young;Mun, Young-Song
    • Journal of Internet Computing and Services / v.9 no.5 / pp.175-183 / 2008
  • Wi-Fi defines the procedures to search for an AP, authenticate both the station and the AP, and associate with the new BSS or ESS, which enables link-layer handoff. One of the problems for Wi-Fi hotspot service is the "passing-object" problem. Wi-Fi describes the message exchanges between two neighboring APs in a BSS or ESS. If the station passes through the neighboring APs before completing the link-layer handoff, the path along which the tunneled packets should be sent is lost. In this paper, we propose integrating a positioning entity into the service domain to keep track of high-speed movement.

Development of Tourism Information Named Entity Recognition Datasets for the Fine-tune KoBERT-CRF Model

  • Jwa, Myeong-Cheol;Jwa, Jeong-Woo
    • International Journal of Internet, Broadcasting and Communication / v.14 no.2 / pp.55-62 / 2022
  • A smart tourism chatbot is needed as a user interface to efficiently provide tourists with smart tourism services such as recommended travel products, tourist information, a personal travel itinerary, and a tour guide service. We have developed a smart tourism app and a smart tourism information system that provide smart tourism services to tourists. We also developed a smart tourism chatbot service consisting of the khaiii morpheme analyzer, rule-based intention classification, and a tourism information knowledge base built on the Neo4j graph database. In this paper, we develop the Korean and English smart tourism Named Entity (NE) datasets required to develop an NER model using pre-trained language models (PLMs) for the smart tourism chatbot system. We create the tourism information NER datasets by collecting source data through the smart tourism app, the visitJeju website of the Jeju Tourism Organization (JTO), and web search, and by preprocessing it using Korean and English tourism information named entity dictionaries. We train the KoBERT-CRF NER model on the developed Korean and English tourism information NER datasets. The weight-averaged precision, recall, and F1 scores are 0.94, 0.92, and 0.94 on the Korean and English tourism information NER datasets.
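
The weight-averaged scores reported above can be illustrated with a minimal sketch. It assumes token-level tags and weights each label's score by its share of the gold labels; the paper may instead score whole entity spans, and the tags and data below are hypothetical, not taken from its datasets.

```python
from collections import Counter


def weighted_prf(gold, pred):
    """Support-weighted precision, recall, and F1 over the labels in `gold`."""
    support = Counter(gold)
    total = len(gold)
    p_sum = r_sum = f_sum = 0.0
    for label in sorted(support):
        tp = sum(1 for g, p in zip(gold, pred) if g == label and p == label)
        fp = sum(1 for g, p in zip(gold, pred) if g != label and p == label)
        fn = sum(1 for g, p in zip(gold, pred) if g == label and p != label)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        weight = support[label] / total  # share of gold tokens with this label
        p_sum += weight * precision
        r_sum += weight * recall
        f_sum += weight * f1
    return p_sum, r_sum, f_sum


# Hypothetical tags for illustration only.
gold = ["LOC", "LOC", "FOOD", "O", "O", "O"]
pred = ["LOC", "O", "FOOD", "O", "O", "LOC"]
precision, recall, f1 = weighted_prf(gold, pred)
```

Because each label's F1 is averaged with the same weights as its precision and recall, the weighted F1 is not the harmonic mean of the weighted precision and weighted recall.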

PubMine: An Ontology-Based Text Mining System for Deducing Relationships among Biological Entities

  • Kim, Tae-Kyung;Oh, Jeong-Su;Ko, Gun-Hwan;Cho, Wan-Sup;Hou, Bo-Kyeng;Lee, Sang-Hyuk
    • Interdisciplinary Bio Central / v.3 no.2 / pp.7.1-7.6 / 2011
  • Background: Published manuscripts are the main source of biological knowledge. Since manual examination is almost impossible given the huge volume of literature data (approximately 19 million abstracts in PubMed), intelligent text mining systems are of great utility for knowledge discovery. However, most current text mining tools have limited applicability because of i) abstract-based rather than sentence-based search, ii) improper use or lack of ontology terms, iii) a design targeted at specific subjects, or iv) slow response times that hamper web services and real-time applications. Results: We introduce an advanced text mining system called PubMine that supports intelligent knowledge discovery based on diverse bio-ontologies. PubMine improves query accuracy and flexibility with advanced search capabilities: fuzzy search, wildcard search, proximity search, range search, and Boolean combinations. Furthermore, PubMine allows users to extract multi-dimensional relationships between genes, diseases, and chemical compounds by using OLAP (On-Line Analytical Processing) techniques. The HUGO gene symbols and the MeSH ontology for diseases, chemical compounds, and anatomy are included in the current version of PubMine, which is freely available at http://pubmine.kobic.re.kr. Conclusions: PubMine is a unique bio-text mining system that provides flexible searches and analysis of biological entity relationships. We believe that PubMine can serve as a key bioinformatics utility because of its rapid response, which enables web services for the community, and its flexibility to accommodate general ontologies.
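
The fuzzy, wildcard, and proximity searches named above can be sketched with the Python standard library. This is an illustration of the three query types only, not PubMine's implementation; the vocabulary, sentence, and function names are hypothetical.

```python
import difflib
import fnmatch


def fuzzy_match(term, vocabulary, cutoff=0.7):
    """Return vocabulary entries whose similarity to `term` is at least `cutoff`."""
    return difflib.get_close_matches(term, vocabulary, n=5, cutoff=cutoff)


def wildcard_match(pattern, vocabulary):
    """Return vocabulary entries matching a shell-style pattern such as 'BRCA*'."""
    return fnmatch.filter(vocabulary, pattern)


def within_proximity(sentence, term_a, term_b, max_gap):
    """True if both terms occur in the sentence at most `max_gap` tokens apart."""
    tokens = sentence.lower().split()
    pos_a = [i for i, t in enumerate(tokens) if t == term_a.lower()]
    pos_b = [i for i, t in enumerate(tokens) if t == term_b.lower()]
    return any(abs(i - j) <= max_gap for i in pos_a for j in pos_b)


# Hypothetical gene-symbol vocabulary and sentence.
vocabulary = ["BRCA1", "BRCA2", "TP53", "EGFR"]
sentence = "BRCA1 mutations are associated with breast cancer"

fuzzy_hits = fuzzy_match("BRAC1", vocabulary)        # tolerant of a transposition typo
wildcard_hits = wildcard_match("BRCA*", vocabulary)  # prefix query
near = within_proximity(sentence, "BRCA1", "cancer", max_gap=6)
```

Running the proximity check per sentence rather than per abstract corresponds to the sentence-based search the abstract contrasts with abstract-based search.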

A Study on Knowledge Entity Extraction Method for Individual Stocks Based on Neural Tensor Network (뉴럴 텐서 네트워크 기반 주식 개별종목 지식개체명 추출 방법에 관한 연구)

  • Yang, Yunseok;Lee, Hyun Jun;Oh, Kyong Joo
    • Journal of Intelligence and Information Systems / v.25 no.2 / pp.25-38 / 2019
  • Selecting high-quality information that meets the interests and needs of users from the overflowing volume of content is becoming more important over time. Amid this flood of information, efforts are being made to better reflect the user's intention in search results, rather than treating an information request as a simple string. Large IT companies such as Google and Microsoft also focus on developing knowledge-based technologies, including search engines, that provide users with satisfaction and convenience. Finance in particular is a field where text data analysis is expected to be useful and promising, because it constantly generates new information, and the earlier the information is obtained, the more valuable it is. Automatic knowledge extraction can be effective in areas such as the financial sector, where the information flow is vast and new information continues to emerge. However, automatic knowledge extraction faces several practical difficulties. First, it is difficult to build corpora from different fields with the same algorithm and to extract good-quality triples. Second, producing labeled text data by hand becomes more difficult as the extent and scope of knowledge increase and patterns are constantly updated. Third, performance evaluation is difficult due to the characteristics of unsupervised learning. Finally, defining the problem of automatic knowledge extraction is not easy because of the ambiguous conceptual characteristics of knowledge. To overcome the limits described above and improve the semantic performance of stock-related information search, this study attempts to extract knowledge entities using a neural tensor network and to evaluate the performance. Unlike previous studies, the purpose of this study is to extract knowledge entities related to individual stock items.
Various but relatively simple data processing methods are applied in the presented model to solve the problems of previous research and to enhance the effectiveness of the model. This study makes three contributions. First, it presents a practical and simple automatic knowledge extraction method that can be readily applied. Second, it shows that performance evaluation is possible through a simple problem definition. Finally, the expressiveness of the knowledge is increased by generating input data on a sentence basis without complex morphological analysis. The results of the empirical analysis and an objective performance evaluation method are also presented. For the empirical study to confirm the usefulness of the presented model, experts' reports on 30 individual stocks, the top 30 items by frequency of publication from May 30, 2017 to May 21, 2018, are used. The total number of reports is 5,600; 3,074 reports, about 55% of the total, are designated as the training set, and the remaining 45% are designated as the testing set. Before constructing the model, all reports in the training set are classified by stock, and their entities are extracted using the KKMA named entity recognition tool. For each stock, the top 100 entities by appearance frequency are selected and vectorized using one-hot encoding. Then, using the neural tensor network, one score function is trained per stock. Thus, when a new entity from the testing set appears, its score can be calculated with every score function, and the stock whose function gives the highest score is predicted as the item related to the entity. To evaluate the presented model, we confirm its predictive power and whether the score functions are well constructed by calculating the hit ratio over all reports in the testing set.
In the empirical study, the presented model shows 69.3% hit accuracy on the testing set, which consists of 2,526 reports. This hit ratio is meaningfully high despite some constraints on conducting the research. Looking at the prediction performance of the model for each stock, only 3 stocks, LG ELECTRONICS, KiaMtr, and Mando, show far lower performance than average. This result may be due to interference from other similar items and the generation of new knowledge. In this paper, we propose a methodology to find the key entities, or combinations of them, that are needed to search for related information in accordance with the user's investment intention. Graph data is generated using only the named entity recognition tool and applied to the neural tensor network without a learning corpus or word vectors for the field. The empirical test confirms the effectiveness of the presented model as described above. However, some limits and points to improve remain. Most notably, the especially poor model performance for a few stocks shows the need for further research. Finally, through the empirical study, we confirmed that the learning method presented in this study can be used to semantically match new text information with the related stocks.
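
The prediction and evaluation steps described above (one-hot encoding, one score function per stock, picking the highest score, and computing the hit ratio) can be sketched as follows. A fixed-weight dot product stands in for each trained neural tensor network score function, so this shows only the surrounding procedure; the stock names, vocabulary, and weights are hypothetical.

```python
def one_hot(entity, vocabulary):
    """Return a one-hot vector for `entity`; all zeros if it is out of vocabulary."""
    return [1.0 if entity == v else 0.0 for v in vocabulary]


def linear_score(weights):
    """Stand-in for a trained score function: a dot product with fixed weights."""
    return lambda vec: sum(w * x for w, x in zip(weights, vec))


def predict_stock(entity_vec, score_functions):
    """Return the stock whose score function gives the highest score."""
    return max(score_functions, key=lambda stock: score_functions[stock](entity_vec))


def hit_ratio(examples, vocabulary, score_functions):
    """Fraction of (entity, true_stock) pairs whose predicted stock is correct."""
    hits = sum(
        1
        for entity, stock in examples
        if predict_stock(one_hot(entity, vocabulary), score_functions) == stock
    )
    return hits / len(examples)


# Hypothetical vocabulary, score functions, and test pairs.
vocabulary = ["battery", "smartphone", "semiconductor"]
score_functions = {
    "StockA": linear_score([0.9, 0.1, 0.2]),
    "StockB": linear_score([0.1, 0.8, 0.7]),
}
examples = [
    ("battery", "StockA"),
    ("smartphone", "StockB"),
    ("semiconductor", "StockA"),
]
ratio = hit_ratio(examples, vocabulary, score_functions)
```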

Korean-Chinese Person Name Translation for Cross Language Information Retrieval

  • Wang, Yu-Chun;Lee, Yi-Hsun;Lin, Chu-Cheng;Tsai, Richard Tzong-Han;Hsu, Wen-Lian
    • Proceedings of the Korean Society for Language and Information Conference / 2007.11a / pp.489-497 / 2007
  • Named entity translation plays an important role in many applications, such as information retrieval and machine translation. In this paper, we focus on translating person names, the most common type of named entity in Korean-Chinese cross-language information retrieval (KCIR). Unlike other languages, Chinese uses characters (ideographs), which makes person name translation difficult because one syllable may map to several Chinese characters. We propose an effective hybrid person name translation method to improve the performance of KCIR. First, we use Wikipedia as a translation tool based on the inter-language links between the Korean edition and the Chinese or English editions. Second, we adopt the Naver people search engine to find the query name's Chinese or English translation. Third, we extract Korean-English transliteration pairs from Google snippets, and then search for the English-Chinese transliteration in the database of Taiwan's Central News Agency or in Google. The performance of KCIR using our method is over five times better than that of a dictionary-based system. The mean average precision is 0.3490 and the average recall is 0.7534. The method can handle Chinese, Japanese, and Korean as well as non-CJK person names when translating from Korean to Chinese. Hence, it substantially improves the performance of KCIR.
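
The two retrieval metrics quoted above, mean average precision and average recall, can be computed as in this minimal sketch. It uses the standard textbook definitions, which may differ in detail from the evaluation setup in the paper; the document IDs and relevance judgments are hypothetical.

```python
def average_precision(ranked, relevant):
    """Average of precision values at each rank where a relevant document appears."""
    hits = 0
    precision_sum = 0.0
    for rank, doc in enumerate(ranked, start=1):
        if doc in relevant:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / len(relevant) if relevant else 0.0


def recall(ranked, relevant):
    """Fraction of relevant documents that were retrieved at all."""
    return len(set(ranked) & relevant) / len(relevant) if relevant else 0.0


def mean_over_queries(metric, runs):
    """Average a per-query metric over (ranked_list, relevant_set) pairs."""
    return sum(metric(ranked, relevant) for ranked, relevant in runs) / len(runs)


# Hypothetical results for two queries.
runs = [
    (["d1", "d2", "d3"], {"d1", "d3"}),
    (["d4", "d5"], {"d5", "d6"}),
]
map_score = mean_over_queries(average_precision, runs)
avg_recall = mean_over_queries(recall, runs)
```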

WebGen: a Template-based Web Script Generator (WebGen: 템플릿 기반 웹 스크립트 생성기)

  • Eum, Doo-Hun
    • The KIPS Transactions:PartD / v.14D no.5 / pp.509-516 / 2007
  • The demand for Web applications that run on databases has increased rapidly in every area, including business. Compared with this growing demand, it still takes much time to write and maintain Web applications. In this paper, we introduce the Web script generator WebGen, which generates the Web forms that serve as the application interface and the Web scripts that process the queries submitted through those forms against a database. WebGen generates five Web scripts (Search, Select, Edit, Information, and Action) from built-in templates, which are the frames for those scripts, by applying the declarative contents of a user-written configuration file. Each script except the Action script generates a corresponding form as its user interface. WebGen therefore enhances Web application productivity by reducing the development time and effort for Web applications. Unlike commercial Web script generators, WebGen supports easy version management because it is based on independent templates. Moreover, a WebGen-generated form includes not only the entity of interest but also the entities that are directly and indirectly related to it.
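
The general idea of filling a built-in template from a declarative configuration can be sketched with Python's `string.Template`. This is not WebGen's template or configuration format, which the abstract does not show; the template text, configuration keys, and function name are hypothetical.

```python
from string import Template

# Hypothetical built-in template for a search form.
SEARCH_TEMPLATE = Template(
    '<form action="$action" method="get">\n'
    '  <label>$label <input name="$field"></label>\n'
    '  <button type="submit">Search</button>\n'
    "</form>\n"
)


def generate_search_form(config):
    """Fill the search-form template with values from a configuration mapping."""
    return SEARCH_TEMPLATE.substitute(config)


# Hypothetical declarative configuration for one entity.
config = {"action": "/search/customer", "label": "Customer name", "field": "name"}
form = generate_search_form(config)
```

Keeping the template separate from the configuration means that a change to the template does not require editing any per-entity configuration.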

Temporal Data Migration Strategies by Time Granularity and LST-GET (시간단위와 LST-GET에 의한 시간지원 데이터의 이동 기법)

  • 윤홍원;김경석
    • Journal of Korea Multimedia Society / v.2 no.1 / pp.9-21 / 1999
  • This paper presents a time-segmented storage structure to improve search performance, together with two data migration strategies: migration by Time Granularity and migration by LST-GET. For the migration strategy by Time Granularity, we describe how to assign entity versions to the past, current, and future segments. We also describe the searching and moving processes for data validity at a granularity level. For the migration strategy by LST-GET, we describe how to compute the value of the dividing criterion. We simulate the search performance of the proposed segmented storage structure in comparison with the conventional storage structure in a relational database system. Finally, extensive simulation studies are performed to compare the search performance of the migration strategies with the time-segmented storage structure.
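
A minimal sketch of assigning entity versions to past, current, and future segments follows. The abstract does not state the assignment rule, so the rule here (compare each version's valid-time interval with a reference date) is an assumption, and the Time Granularity and LST-GET criteria themselves are not modeled; all names and dates are hypothetical.

```python
from datetime import date


def assign_segment(valid_from, valid_to, today):
    """Classify one entity version by its valid-time interval [valid_from, valid_to)."""
    if valid_to <= today:
        return "past"      # validity ended on or before the reference date
    if valid_from > today:
        return "future"    # validity has not started yet
    return "current"       # the reference date falls inside the interval


# Hypothetical versions of one entity.
versions = [
    ("v1", date(1997, 1, 1), date(1998, 1, 1)),
    ("v2", date(1998, 1, 1), date(2000, 1, 1)),
    ("v3", date(2000, 1, 1), date(2001, 1, 1)),
]
today = date(1999, 1, 1)
segments = {name: assign_segment(start, end, today) for name, start, end in versions}
```

With versions separated this way, a query about the present only needs to scan the current segment.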

Ontology Knowledge based Information Retrieval for User Query Interpretation (사용자 질의 의미 해석을 위한 온톨로지 지식 기반 검색)

  • Kim, Nanju;Pyo, Hyejin;Jeong, Hoon;Choi, Euiin
    • Journal of Digital Convergence / v.12 no.6 / pp.245-252 / 2014
  • Semantic search promises more accurate results than present-day keyword matching-based search by using a logically represented knowledge base. However, ordinary users are not familiar with the complex formal query language and schema of the knowledge base, so the system should interpret the meaning of the user's keywords. In this paper, we describe a user query interpretation system for the semantic retrieval of multimedia content. Our system is ontological knowledge base-driven in the sense that the interpretation process is integrated into a unified structure around a knowledge base, which is built on domain ontologies.