• Title/Summary/Keyword: Data dictionary


A Repository for Workflow Management on Distributed Object Environment (분산객체 환경에서의 워크플로우 관리를 위한 정보저장소)

  • Yeom, Tae-Jin;Park, Jae-Hyung;Ri, Ja;Kim, Ki-Bong;Jin, Sung-Il
    • The Journal of Society for e-Business Studies / v.4 no.1 / pp.1-19 / 1999
  • A workflow management system automates job processes by maintaining shared information about the schedules of various job processes and the people involved in them. Existing workflow management systems store the information they generate in files or databases. However, a file or database system can manage only the simple information of the workflow itself, not an enterprise's information resources, which are complex and come in various formats. We therefore need a data management system that can control those information resources and manage data distributed across several geographic locations. An information repository can meet these requirements: it integrates, stores, and manages the information resources requested by application systems. There is an international standard for information repositories, the Information Resource Dictionary System (IRDS). IRDS, however, does not support distributed environments. In this paper, we design and implement an IRDS-based information repository that can operate in a distributed environment. We verify that this information repository is more effective than any other file or database system.


Reconstitution of CB Trie for the Efficient Hangul Retrieval (효율적인 한글 탐색을 위한 CB 트라이의 재구성)

  • Jung, Kyu-Cheol
    • Convergence Security Journal / v.7 no.4 / pp.29-34 / 2007
  • This paper proposes the RCB (Reduced Compact Binary) trie to correct the shortcomings of the CB (Compact Binary) trie. The CB trie was the first attempt at a compact structure, but as the amount of data increased, so did the volume of input, and insertion became difficult because of the dummy nodes used to balance the trees. The HCB trie, by contrast, is built hierarchically: to prevent the map from growing on the right, a depth limit is set, and when that depth is reached a new tree is created and linked. This made insertion and search considerably faster, but at the cost of more storage space, because it uses dummy nodes like the CB trie and also needs many links between trees. The RCB trie in this paper improves capacity by about 60% by eliminating dummy nodes entirely.


Edutech in the Era of the 4th Industrial Revolution (4차 산업혁명 시대의 에듀테크)

  • Park, Ji Su;Gil, Joon-Min
    • KIPS Transactions on Software and Data Engineering / v.9 no.11 / pp.329-331 / 2020
  • Edutech, a compound of "education" and "technology", is an educational paradigm of the era of the 4th industrial revolution. It refers to next-generation education that uses the information and communication technology (ICT) of the 4th industrial revolution, such as big data, artificial intelligence (AI), robots, and virtual reality (VR). e-Learning is already used in the form of online lectures for education in ICT, and edutech is attracting attention alongside it as non-face-to-face education has rapidly expanded due to COVID-19. This paper summarizes the reviewed papers on a blockchain-based badge service platform, a simulation-based collaborative e-Learning system, a video English dictionary, and a blockchain-based access control audit system.

Operational Experience in DB "TERMIN"

  • Shaburova, Natalya N.
    • Journal of Information Science Theory and Practice / v.7 no.3 / pp.21-30 / 2019
  • This article describes the formation and population (2014 to 2016) of a terminological dictionary on electronics and radio engineering, and the collective work (2017 to 2018) with the data bank "TERMIN". To create a tool for navigating the modern scientific and technical space, a network of terms with established semantic links is described. The links are based on an analysis of the terms' definitions: each term is checked for inclusion in the definitions of all other terms, with the definitions borrowed from reputable reference editions (encyclopedias, dictionaries, and reference books). The article also describes a model of a system that consists of different information sources, in which information is indexed by the rubrics of the Russian State Rubricator of Scientific and Technical Information and/or by keywords. The system provides search access to all of these sources. Search queries are expressed in the language of these rubrics or formulated with arbitrary terms, and the system refers to the information sources and returns the relevant information. In accordance with this model, semantic links of various types, which allow a search to be expanded under different modes of query, should be established among the data bank's terms. The resulting links should improve semantic matching, i.e., they can convey the actual meaning of the information being sought.
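
The definition-based linking step in this abstract (checking each term for inclusion in the definitions of all other terms) can be sketched roughly as follows. This is a minimal illustration, not the TERMIN implementation: the function name `build_term_links`, the whole-word matching rule, and the three sample definitions are all assumptions made for the example.

```python
import re


def build_term_links(definitions):
    """Map each term to the other terms whose definitions mention it.

    `definitions` is a dict of term -> definition text (hypothetical data).
    Matching is whole-word and case-insensitive, which is an assumption;
    inflected forms such as plurals are not matched.
    """
    links = {term: [] for term in definitions}
    for term in definitions:
        pattern = re.compile(r"\b" + re.escape(term) + r"\b", re.IGNORECASE)
        for other, text in definitions.items():
            # A term is never linked to its own definition.
            if other != term and pattern.search(text):
                links[term].append(other)
    return links


# Made-up sample entries, for illustration only.
definitions = {
    "diode": "A semiconductor device that conducts current in one direction.",
    "rectifier": "A circuit, often built from a diode, that converts AC to DC.",
    "semiconductor": "A material with conductivity between a conductor and an insulator.",
}
links = build_term_links(definitions)
```

Here `links["diode"]` lists `rectifier`, because the rectifier definition mentions a diode. A real term bank would also need to handle multi-word terms and morphology, which this sketch leaves out.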

Study on Effective Extraction of New Coined Vocabulary from Political Domain Article and News Comment (정치 도메인에서 신조어휘의 효과적인 추출 및 의미 분석에 대한 연구)

  • Lee, Jihyun;Kim, Jaehong;Cho, Yesung;Lee, Mingu;Choi, Hyebong
    • The Journal of the Convergence on Culture Technology / v.7 no.2 / pp.149-156 / 2021
  • Text mining is a useful tool for discovering public opinion and perception regarding political issues from big data. Users of social media very commonly express their opinions with newly coined words, such as slang, and with emoji. However, those new words are not effectively captured by traditional text mining methods that process text data using a language dictionary. In this study, we propose effective methods to extract newly coined words that connote the political stance and opinion of users. Using various text mining techniques, we attempt to discover the context and the political meaning of the new words.

Statistical Approach to Sentiment Classification using MapReduce (맵리듀스를 이용한 통계적 접근의 감성 분류)

  • Kang, Mun-Su;Baek, Seung-Hee;Choi, Young-Sik
    • Science of Emotion and Sensibility / v.15 no.4 / pp.425-440 / 2012
  • As the internet grows, the amount of subjective data increases, and with it the need to classify subjective data automatically. Sentiment classification is the classification of subjective data by type of sentiment. Previous research on sentiment classification has focused on NLP (Natural Language Processing) and sentiment word dictionaries, and it has two critical problems. First, the performance of morpheme analysis in NLP has fallen short of expectations. Second, it is not easy to choose sentiment words and to determine how much sentiment a word carries. To solve these problems, this paper proposes combining web-scale data with a statistical approach to sentiment classification. The proposed method uses word statistics from web-scale data rather than trying to find the meaning of a word. Unlike previous research, which depends on NLP algorithms, this approach focuses on data. Hadoop and MapReduce are used to handle the web-scale data.
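
The general idea of gathering word statistics with a map step and a reduce step can be sketched in plain Python, as below. The abstract does not say which statistic the authors compute, so the smoothed positive ratio in `word_score` is an assumption, as are the function names, the `"pos"`/`"neg"` labels, and the toy documents. The sketch runs in a single process and does not use Hadoop.

```python
from collections import defaultdict


def map_phase(label, text):
    """Emit ((word, label), 1) for every word of one labelled document."""
    for word in text.lower().split():
        yield (word, label), 1


def reduce_phase(pairs):
    """Sum the emitted counts per (word, label) key."""
    counts = defaultdict(int)
    for key, value in pairs:
        counts[key] += value
    return counts


def word_score(counts, word):
    """Share of a word's occurrences that are positive, with add-one smoothing.

    This particular statistic is an assumption made for illustration.
    """
    pos = counts.get((word, "pos"), 0)
    neg = counts.get((word, "neg"), 0)
    return (pos + 1) / (pos + neg + 2)


# Toy labelled documents, invented for this example.
docs = [
    ("pos", "great room great view"),
    ("neg", "dirty room"),
    ("pos", "great staff"),
]
emitted = [pair for label, text in docs for pair in map_phase(label, text)]
counts = reduce_phase(emitted)
```

With these toy documents, `word_score(counts, "great")` comes out above 0.5 and `word_score(counts, "room")` is exactly 0.5, since "room" appears once under each label. In a real MapReduce job the framework would group the keys between the two phases; here `reduce_phase` does the grouping itself.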


The Big Data Analysis and Medical Quality Management for Wellness (웰니스를 위한 빅데이터 분석과 의료 질 관리)

  • Cho, Young-Bok;Woo, Sung-Hee;Lee, Sang-Ho
    • Journal of the Korea Society of Computer and Information / v.19 no.12 / pp.101-109 / 2014
  • With advances in medical technology and rising income levels, interest in a "long and healthy life = wellness", and in actively promoting and maintaining health, has grown. In addition, demand for personalized health care services is growing, and medicine is moving broadly toward big data and disease prevention. This paper focuses on wellness, a major interest in the market, with the aim of supporting big-data-driven healthcare quality through patient-centered medical services. For patients, the aim is not drug-dependent treatment or dieting but improved disease prevention and treatment based on big data analysis. Daily information from tweets is analyzed with a purpose-built dictionary for wellness-oriented disease prevention and treatment. To evaluate the efficiency of the big data analysis, processing time was measured while increasing the number of nodes. Going from one node to three nodes improved total access time by 26%, data storage by 63%, and data aggregation by 18%.

Identification of sentiment keywords association-based hotel network of hotel review using mapper method in topological data analysis (Topological Data Analysis 기법을 활용한 호텔 리뷰데이터의 감성 키워드 기반 호텔 관계망 구축)

  • Jeon, Ye-Seul;Kim, Jeong-Jae
    • The Korean Journal of Applied Statistics / v.33 no.1 / pp.75-86 / 2020
  • Various information can be extracted from hotel review data, including the purchasing factors that lead to consumption and the advantages and disadvantages of hotels. In particular, the sentiment keywords in review data help consumers understand the pros and cons of hotels. However, it is not efficient for consumers to read a large number of reviews, so it is necessary to offer customers a summary. In this study, we suggest providing summary information on sentiment keyword associations, as well as a network of hotels based on sentiment keywords. Sentiment keyword associations are extracted using a sentiment keyword dictionary and are then used to construct the hotel network with the mapper method of topological data analysis. This hotel network allows a consumer to find hotels associated with specific sentiment keywords and recommends related hotels. This summary information gives users a condensed emotional assessment of hotels and helps hotel marketing teams understand consumers' perceptions of their hotel.

Design of DatawareHouse Real-Time Cleansing System using XMDR (XMDR을 이용한 데이터웨어하우스 실시간 데이터 정제 시스템 설계)

  • Song, Hong-Youl;Jung, Kye-Dong;Choi, Young-Keum
    • Journal of the Korea Institute of Information and Communication Engineering / v.14 no.8 / pp.1861-1867 / 2010
  • A data warehouse is generally used in organizations for decision and policy making. In a distributed environment, adding a new system takes considerable time and cost because of the differences between systems. This paper addresses the problem in two ways. First, heterogeneous data structures are handled by creating abstract queries according to the standard schema and separating the queries using XMDR. Second, a metadata dictionary that defines metadata synonyms and data representation methods is used to overcome differences in how data are defined and expressed. In particular, the work presented here uses XMDR to provide standardized information for data integration and to minimize the effects of integration on local systems in discrete environments, so that data warehouse information can be created in real time.
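
The role of a metadata dictionary that maps synonymous field names onto a standard schema can be illustrated with a small sketch. This is not XMDR and not the authors' design: the `SYNONYMS` table, the field names, and the `cleanse` function are hypothetical, chosen only to show the lookup idea.

```python
# Hypothetical metadata dictionary: standard field -> names used by local systems.
SYNONYMS = {
    "customer_name": {"customer_name", "cust_nm", "client"},
    "birth_date": {"birth_date", "dob", "birthday"},
}


def build_lookup(synonyms):
    """Invert the dictionary so any local name resolves to its standard name."""
    lookup = {}
    for standard, names in synonyms.items():
        for name in names:
            lookup[name.lower()] = standard
    return lookup


def cleanse(record, lookup):
    """Rename a record's fields to the standard schema.

    Returns the cleaned record and the list of fields that had no mapping,
    so that unknown names are reported instead of silently dropped.
    """
    cleaned, unmapped = {}, []
    for field, value in record.items():
        standard = lookup.get(field.lower())
        if standard is None:
            unmapped.append(field)
        else:
            cleaned[standard] = value
    return cleaned, unmapped


lookup = build_lookup(SYNONYMS)
cleaned, unmapped = cleanse(
    {"CUST_NM": "Kim", "dob": "1990-01-02", "fax": "n/a"}, lookup
)
```

In this example the two recognized fields are renamed to `customer_name` and `birth_date`, and `fax` is returned in `unmapped`. Differences in value representation (date formats, units) would need a second mapping layer that the sketch does not cover.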

An Algorithm for Finding a Relationship Between Entities: Semi-Automated Schema Integration Approach (엔티티 간의 관계명을 생성하는 알고리즘: 반자동화된 스키마 통합)

  • Kim, Yongchan;Park, Jinsoo;Suh, Jihae
    • Journal of Intelligence and Information Systems / v.24 no.3 / pp.243-262 / 2018
  • Database schema integration is a significant issue in information systems. Because schema integration is a time-consuming and labor-intensive task, many studies have attempted to automate it. Researchers typically use XML as the source schema and leave much of the work to DBA intervention. For example, schema integration involves various naming conflicts related to relationship names, and in the past the DBA had to intervene to resolve them. In this paper, we introduce an algorithm that automatically generates relationship names to resolve the relationship name conflicts that occur during schema integration. The algorithm is based on an Internet collocation and English sentence example dictionary. The relationship between two entities is generated by using natural language processing to analyze examples extracted from the dictionary data. By building a semi-automated schema integration system and testing the algorithm, we found that it showed about 90% accuracy. Using this algorithm, the naming conflicts that occur during schema integration can be resolved automatically, without DBA intervention.