• Title/Summary/Keyword: 워크?

Search Result 22,217, Processing Time 0.051 seconds

Improved Social Network Analysis Method in SNS (SNS에서의 개선된 소셜 네트워크 분석 방법)

  • Sohn, Jong-Soo;Cho, Soo-Whan;Kwon, Kyung-Lag;Chung, In-Jeong
    • Journal of Intelligence and Information Systems
    • /
    • v.18 no.4
    • /
    • pp.117-127
    • /
    • 2012
  • Due to the recent expansion of the Web 2.0 -based services, along with the widespread of smartphones, online social network services are being popularized among users. Online social network services are the online community services which enable users to communicate each other, share information and expand human relationships. In the social network services, each relation between users is represented by a graph consisting of nodes and links. As the users of online social network services are increasing rapidly, the SNS are actively utilized in enterprise marketing, analysis of social phenomenon and so on. Social Network Analysis (SNA) is the systematic way to analyze social relationships among the members of the social network using the network theory. In general social network theory consists of nodes and arcs, and it is often depicted in a social network diagram. In a social network diagram, nodes represent individual actors within the network and arcs represent relationships between the nodes. With SNA, we can measure relationships among the people such as degree of intimacy, intensity of connection and classification of the groups. Ever since Social Networking Services (SNS) have drawn increasing attention from millions of users, numerous researches have made to analyze their user relationships and messages. There are typical representative SNA methods: degree centrality, betweenness centrality and closeness centrality. In the degree of centrality analysis, the shortest path between nodes is not considered. However, it is used as a crucial factor in betweenness centrality, closeness centrality and other SNA methods. In previous researches in SNA, the computation time was not too expensive since the size of social network was small. Unfortunately, most SNA methods require significant time to process relevant data, and it makes difficult to apply the ever increasing SNS data in social network studies. For instance, if the number of nodes in online social network is n, the maximum number of link in social network is n(n-1)/2. It means that it is too expensive to analyze the social network, for example, if the number of nodes is 10,000 the number of links is 49,995,000. Therefore, we propose a heuristic-based method for finding the shortest path among users in the SNS user graph. Through the shortest path finding method, we will show how efficient our proposed approach may be by conducting betweenness centrality analysis and closeness centrality analysis, both of which are widely used in social network studies. Moreover, we devised an enhanced method with addition of best-first-search method and preprocessing step for the reduction of computation time and rapid search of the shortest paths in a huge size of online social network. Best-first-search method finds the shortest path heuristically, which generalizes human experiences. As large number of links is shared by only a few nodes in online social networks, most nods have relatively few connections. As a result, a node with multiple connections functions as a hub node. When searching for a particular node, looking for users with numerous links instead of searching all users indiscriminately has a better chance of finding the desired node more quickly. In this paper, we employ the degree of user node vn as heuristic evaluation function in a graph G = (N, E), where N is a set of vertices, and E is a set of links between two different nodes. As the heuristic evaluation function is used, the worst case could happen when the target node is situated in the bottom of skewed tree. In order to remove such a target node, the preprocessing step is conducted. Next, we find the shortest path between two nodes in social network efficiently and then analyze the social network. For the verification of the proposed method, we crawled 160,000 people from online and then constructed social network. Then we compared with previous methods, which are best-first-search and breath-first-search, in time for searching and analyzing. The suggested method takes 240 seconds to search nodes where breath-first-search based method takes 1,781 seconds (7.4 times faster). Moreover, for social network analysis, the suggested method is 6.8 times and 1.8 times faster than betweenness centrality analysis and closeness centrality analysis, respectively. The proposed method in this paper shows the possibility to analyze a large size of social network with the better performance in time. As a result, our method would improve the efficiency of social network analysis, making it particularly useful in studying social trends or phenomena.

A Folksonomy Ranking Framework: A Semantic Graph-based Approach (폭소노미 사이트를 위한 랭킹 프레임워크 설계: 시맨틱 그래프기반 접근)

  • Park, Hyun-Jung;Rho, Sang-Kyu
    • Asia pacific journal of information systems
    • /
    • v.21 no.2
    • /
    • pp.89-116
    • /
    • 2011
  • In collaborative tagging systems such as Delicious.com and Flickr.com, users assign keywords or tags to their uploaded resources, such as bookmarks and pictures, for their future use or sharing purposes. The collection of resources and tags generated by a user is called a personomy, and the collection of all personomies constitutes the folksonomy. The most significant need of the folksonomy users Is to efficiently find useful resources or experts on specific topics. An excellent ranking algorithm would assign higher ranking to more useful resources or experts. What resources are considered useful In a folksonomic system? Does a standard superior to frequency or freshness exist? The resource recommended by more users with mere expertise should be worthy of attention. This ranking paradigm can be implemented through a graph-based ranking algorithm. Two well-known representatives of such a paradigm are Page Rank by Google and HITS(Hypertext Induced Topic Selection) by Kleinberg. Both Page Rank and HITS assign a higher evaluation score to pages linked to more higher-scored pages. HITS differs from PageRank in that it utilizes two kinds of scores: authority and hub scores. The ranking objects of these pages are limited to Web pages, whereas the ranking objects of a folksonomic system are somewhat heterogeneous(i.e., users, resources, and tags). Therefore, uniform application of the voting notion of PageRank and HITS based on the links to a folksonomy would be unreasonable, In a folksonomic system, each link corresponding to a property can have an opposite direction, depending on whether the property is an active or a passive voice. The current research stems from the Idea that a graph-based ranking algorithm could be applied to the folksonomic system using the concept of mutual Interactions between entitles, rather than the voting notion of PageRank or HITS. The concept of mutual interactions, proposed for ranking the Semantic Web resources, enables the calculation of importance scores of various resources unaffected by link directions. The weights of a property representing the mutual interaction between classes are assigned depending on the relative significance of the property to the resource importance of each class. This class-oriented approach is based on the fact that, in the Semantic Web, there are many heterogeneous classes; thus, applying a different appraisal standard for each class is more reasonable. This is similar to the evaluation method of humans, where different items are assigned specific weights, which are then summed up to determine the weighted average. We can check for missing properties more easily with this approach than with other predicate-oriented approaches. A user of a tagging system usually assigns more than one tags to the same resource, and there can be more than one tags with the same subjectivity and objectivity. In the case that many users assign similar tags to the same resource, grading the users differently depending on the assignment order becomes necessary. This idea comes from the studies in psychology wherein expertise involves the ability to select the most relevant information for achieving a goal. An expert should be someone who not only has a large collection of documents annotated with a particular tag, but also tends to add documents of high quality to his/her collections. Such documents are identified by the number, as well as the expertise, of users who have the same documents in their collections. In other words, there is a relationship of mutual reinforcement between the expertise of a user and the quality of a document. In addition, there is a need to rank entities related more closely to a certain entity. Considering the property of social media that ensures the popularity of a topic is temporary, recent data should have more weight than old data. We propose a comprehensive folksonomy ranking framework in which all these considerations are dealt with and that can be easily customized to each folksonomy site for ranking purposes. To examine the validity of our ranking algorithm and show the mechanism of adjusting property, time, and expertise weights, we first use a dataset designed for analyzing the effect of each ranking factor independently. We then show the ranking results of a real folksonomy site, with the ranking factors combined. Because the ground truth of a given dataset is not known when it comes to ranking, we inject simulated data whose ranking results can be predicted into the real dataset and compare the ranking results of our algorithm with that of a previous HITS-based algorithm. Our semantic ranking algorithm based on the concept of mutual interaction seems to be preferable to the HITS-based algorithm as a flexible folksonomy ranking framework. Some concrete points of difference are as follows. First, with the time concept applied to the property weights, our algorithm shows superior performance in lowering the scores of older data and raising the scores of newer data. Second, applying the time concept to the expertise weights, as well as to the property weights, our algorithm controls the conflicting influence of expertise weights and enhances overall consistency of time-valued ranking. The expertise weights of the previous study can act as an obstacle to the time-valued ranking because the number of followers increases as time goes on. Third, many new properties and classes can be included in our framework. The previous HITS-based algorithm, based on the voting notion, loses ground in the situation where the domain consists of more than two classes, or where other important properties, such as "sent through twitter" or "registered as a friend," are added to the domain. Forth, there is a big difference in the calculation time and memory use between the two kinds of algorithms. While the matrix multiplication of two matrices, has to be executed twice for the previous HITS-based algorithm, this is unnecessary with our algorithm. In our ranking framework, various folksonomy ranking policies can be expressed with the ranking factors combined and our approach can work, even if the folksonomy site is not implemented with Semantic Web languages. Above all, the time weight proposed in this paper will be applicable to various domains, including social media, where time value is considered important.

Medical Information Dynamic Access System in Smart Mobile Environments (스마트 모바일 환경에서 의료정보 동적접근 시스템)

  • Jeong, Chang Won;Kim, Woo Hong;Yoon, Kwon Ha;Joo, Su Chong
    • Journal of Internet Computing and Services
    • /
    • v.16 no.1
    • /
    • pp.47-55
    • /
    • 2015
  • Recently, the environment of a hospital information system is a trend to combine various SMART technologies. Accordingly, various smart devices, such as a smart phone, Tablet PC is utilized in the medical information system. Also, these environments consist of various applications executing on heterogeneous sensors, devices, systems and networks. In these hospital information system environment, applying a security service by traditional access control method cause a problems. Most of the existing security system uses the access control list structure. It is only permitted access defined by an access control matrix such as client name, service object method name. The major problem with the static approach cannot quickly adapt to changed situations. Hence, we needs to new security mechanisms which provides more flexible and can be easily adapted to various environments with very different security requirements. In addition, for addressing the changing of service medical treatment of the patient, the researching is needed. In this paper, we suggest a dynamic approach to medical information systems in smart mobile environments. We focus on how to access medical information systems according to dynamic access control methods based on the existence of the hospital's information system environments. The physical environments consist of a mobile x-ray imaging devices, dedicated mobile/general smart devices, PACS, EMR server and authorization server. The software environment was developed based on the .Net Framework for synchronization and monitoring services based on mobile X-ray imaging equipment Windows7 OS. And dedicated a smart device application, we implemented a dynamic access services through JSP and Java SDK is based on the Android OS. PACS and mobile X-ray image devices in hospital, medical information between the dedicated smart devices are based on the DICOM medical image standard information. In addition, EMR information is based on H7. In order to providing dynamic access control service, we classify the context of the patients according to conditions of bio-information such as oxygen saturation, heart rate, BP and body temperature etc. It shows event trace diagrams which divided into two parts like general situation, emergency situation. And, we designed the dynamic approach of the medical care information by authentication method. The authentication Information are contained ID/PWD, the roles, position and working hours, emergency certification codes for emergency patients. General situations of dynamic access control method may have access to medical information by the value of the authentication information. In the case of an emergency, was to have access to medical information by an emergency code, without the authentication information. And, we constructed the medical information integration database scheme that is consist medical information, patient, medical staff and medical image information according to medical information standards.y Finally, we show the usefulness of the dynamic access application service based on the smart devices for execution results of the proposed system according to patient contexts such as general and emergency situation. Especially, the proposed systems are providing effective medical information services with smart devices in emergency situation by dynamic access control methods. As results, we expect the proposed systems to be useful for u-hospital information systems and services.

Image Watermarking for Copyright Protection of Images on Shopping Mall (쇼핑몰 이미지 저작권보호를 위한 영상 워터마킹)

  • Bae, Kyoung-Yul
    • Journal of Intelligence and Information Systems
    • /
    • v.19 no.4
    • /
    • pp.147-157
    • /
    • 2013
  • With the advent of the digital environment that can be accessed anytime, anywhere with the introduction of high-speed network, the free distribution and use of digital content were made possible. Ironically this environment is raising a variety of copyright infringement, and product images used in the online shopping mall are pirated frequently. There are many controversial issues whether shopping mall images are creative works or not. According to Supreme Court's decision in 2001, to ad pictures taken with ham products is simply a clone of the appearance of objects to deliver nothing but the decision was not only creative expression. But for the photographer's losses recognized in the advertising photo shoot takes the typical cost was estimated damages. According to Seoul District Court precedents in 2003, if there are the photographer's personality and creativity in the selection of the subject, the composition of the set, the direction and amount of light control, set the angle of the camera, shutter speed, shutter chance, other shooting methods for capturing, developing and printing process, the works should be protected by copyright law by the Court's sentence. In order to receive copyright protection of the shopping mall images by the law, it is simply not to convey the status of the product, the photographer's personality and creativity can be recognized that it requires effort. Accordingly, the cost of making the mall image increases, and the necessity for copyright protection becomes higher. The product images of the online shopping mall have a very unique configuration unlike the general pictures such as portraits and landscape photos and, therefore, the general image watermarking technique can not satisfy the requirements of the image watermarking. Because background of product images commonly used in shopping malls is white or black, or gray scale (gradient) color, it is difficult to utilize the space to embed a watermark and the area is very sensitive even a slight change. In this paper, the characteristics of images used in shopping malls are analyzed and a watermarking technology which is suitable to the shopping mall images is proposed. The proposed image watermarking technology divide a product image into smaller blocks, and the corresponding blocks are transformed by DCT (Discrete Cosine Transform), and then the watermark information was inserted into images using quantization of DCT coefficients. Because uniform treatment of the DCT coefficients for quantization cause visual blocking artifacts, the proposed algorithm used weighted mask which quantizes finely the coefficients located block boundaries and coarsely the coefficients located center area of the block. This mask improves subjective visual quality as well as the objective quality of the images. In addition, in order to improve the safety of the algorithm, the blocks which is embedded the watermark are randomly selected and the turbo code is used to reduce the BER when extracting the watermark. The PSNR(Peak Signal to Noise Ratio) of the shopping mall image watermarked by the proposed algorithm is 40.7~48.5[dB] and BER(Bit Error Rate) after JPEG with QF = 70 is 0. This means the watermarked image is high quality and the algorithm is robust to JPEG compression that is used generally at the online shopping malls. Also, for 40% change in size and 40 degrees of rotation, the BER is 0. In general, the shopping malls are used compressed images with QF which is higher than 90. Because the pirated image is used to replicate from original image, the proposed algorithm can identify the copyright infringement in the most cases. As shown the experimental results, the proposed algorithm is suitable to the shopping mall images with simple background. However, the future study should be carried out to enhance the robustness of the proposed algorithm because the robustness loss is occurred after mask process.

Evaluating Reverse Logistics Networks with Centralized Centers : Hybrid Genetic Algorithm Approach (집중형센터를 가진 역물류네트워크 평가 : 혼합형 유전알고리즘 접근법)

  • Yun, YoungSu
    • Journal of Intelligence and Information Systems
    • /
    • v.19 no.4
    • /
    • pp.55-79
    • /
    • 2013
  • In this paper, we propose a hybrid genetic algorithm (HGA) approach to effectively solve the reverse logistics network with centralized centers (RLNCC). For the proposed HGA approach, genetic algorithm (GA) is used as a main algorithm. For implementing GA, a new bit-string representation scheme using 0 and 1 values is suggested, which can easily make initial population of GA. As genetic operators, the elitist strategy in enlarged sampling space developed by Gen and Chang (1997), a new two-point crossover operator, and a new random mutation operator are used for selection, crossover and mutation, respectively. For hybrid concept of GA, an iterative hill climbing method (IHCM) developed by Michalewicz (1994) is inserted into HGA search loop. The IHCM is one of local search techniques and precisely explores the space converged by GA search. The RLNCC is composed of collection centers, remanufacturing centers, redistribution centers, and secondary markets in reverse logistics networks. Of the centers and secondary markets, only one collection center, remanufacturing center, redistribution center, and secondary market should be opened in reverse logistics networks. Some assumptions are considered for effectively implementing the RLNCC The RLNCC is represented by a mixed integer programming (MIP) model using indexes, parameters and decision variables. The objective function of the MIP model is to minimize the total cost which is consisted of transportation cost, fixed cost, and handling cost. The transportation cost is obtained by transporting the returned products between each centers and secondary markets. The fixed cost is calculated by opening or closing decision at each center and secondary markets. That is, if there are three collection centers (the opening costs of collection center 1 2, and 3 are 10.5, 12.1, 8.9, respectively), and the collection center 1 is opened and the remainders are all closed, then the fixed cost is 10.5. The handling cost means the cost of treating the products returned from customers at each center and secondary markets which are opened at each RLNCC stage. The RLNCC is solved by the proposed HGA approach. In numerical experiment, the proposed HGA and a conventional competing approach is compared with each other using various measures of performance. For the conventional competing approach, the GA approach by Yun (2013) is used. The GA approach has not any local search technique such as the IHCM proposed the HGA approach. As measures of performance, CPU time, optimal solution, and optimal setting are used. Two types of the RLNCC with different numbers of customers, collection centers, remanufacturing centers, redistribution centers and secondary markets are presented for comparing the performances of the HGA and GA approaches. The MIP models using the two types of the RLNCC are programmed by Visual Basic Version 6.0, and the computer implementing environment is the IBM compatible PC with 3.06Ghz CPU speed and 1GB RAM on Windows XP. The parameters used in the HGA and GA approaches are that the total number of generations is 10,000, population size 20, crossover rate 0.5, mutation rate 0.1, and the search range for the IHCM is 2.0. Total 20 iterations are made for eliminating the randomness of the searches of the HGA and GA approaches. With performance comparisons, network representations by opening/closing decision, and convergence processes using two types of the RLNCCs, the experimental result shows that the HGA has significantly better performance in terms of the optimal solution than the GA, though the GA is slightly quicker than the HGA in terms of the CPU time. Finally, it has been proved that the proposed HGA approach is more efficient than conventional GA approach in two types of the RLNCC since the former has a GA search process as well as a local search process for additional search scheme, while the latter has a GA search process alone. For a future study, much more large-sized RLNCCs will be tested for robustness of our approach.

Intelligent VOC Analyzing System Using Opinion Mining (오피니언 마이닝을 이용한 지능형 VOC 분석시스템)

  • Kim, Yoosin;Jeong, Seung Ryul
    • Journal of Intelligence and Information Systems
    • /
    • v.19 no.3
    • /
    • pp.113-125
    • /
    • 2013
  • Every company wants to know customer's requirement and makes an effort to meet them. Cause that, communication between customer and company became core competition of business and that important is increasing continuously. There are several strategies to find customer's needs, but VOC (Voice of customer) is one of most powerful communication tools and VOC gathering by several channels as telephone, post, e-mail, website and so on is so meaningful. So, almost company is gathering VOC and operating VOC system. VOC is important not only to business organization but also public organization such as government, education institute, and medical center that should drive up public service quality and customer satisfaction. Accordingly, they make a VOC gathering and analyzing System and then use for making a new product and service, and upgrade. In recent years, innovations in internet and ICT have made diverse channels such as SNS, mobile, website and call-center to collect VOC data. Although a lot of VOC data is collected through diverse channel, the proper utilization is still difficult. It is because the VOC data is made of very emotional contents by voice or text of informal style and the volume of the VOC data are so big. These unstructured big data make a difficult to store and analyze for use by human. So that, the organization need to automatic collecting, storing, classifying and analyzing system for unstructured big VOC data. This study propose an intelligent VOC analyzing system based on opinion mining to classify the unstructured VOC data automatically and determine the polarity as well as the type of VOC. And then, the basis of the VOC opinion analyzing system, called domain-oriented sentiment dictionary is created and corresponding stages are presented in detail. The experiment is conducted with 4,300 VOC data collected from a medical website to measure the effectiveness of the proposed system and utilized them to develop the sensitive data dictionary by determining the special sentiment vocabulary and their polarity value in a medical domain. Through the experiment, it comes out that positive terms such as "칭찬, 친절함, 감사, 무사히, 잘해, 감동, 미소" have high positive opinion value, and negative terms such as "퉁명, 뭡니까, 말하더군요, 무시하는" have strong negative opinion. These terms are in general use and the experiment result seems to be a high probability of opinion polarity. Furthermore, the accuracy of proposed VOC classification model has been compared and the highest classification accuracy of 77.8% is conformed at threshold with -0.50 of opinion classification of VOC. Through the proposed intelligent VOC analyzing system, the real time opinion classification and response priority of VOC can be predicted. Ultimately the positive effectiveness is expected to catch the customer complains at early stage and deal with it quickly with the lower number of staff to operate the VOC system. It can be made available human resource and time of customer service part. Above all, this study is new try to automatic analyzing the unstructured VOC data using opinion mining, and shows that the system could be used as variable to classify the positive or negative polarity of VOC opinion. It is expected to suggest practical framework of the VOC analysis to diverse use and the model can be used as real VOC analyzing system if it is implemented as system. Despite experiment results and expectation, this study has several limits. First of all, the sample data is only collected from a hospital web-site. It means that the sentimental dictionary made by sample data can be lean too much towards on that hospital and web-site. Therefore, next research has to take several channels such as call-center and SNS, and other domain like government, financial company, and education institute.

Research Trend Analysis Using Bibliographic Information and Citations of Cloud Computing Articles: Application of Social Network Analysis (클라우드 컴퓨팅 관련 논문의 서지정보 및 인용정보를 활용한 연구 동향 분석: 사회 네트워크 분석의 활용)

  • Kim, Dongsung;Kim, Jongwoo
    • Journal of Intelligence and Information Systems
    • /
    • v.20 no.1
    • /
    • pp.195-211
    • /
    • 2014
  • Cloud computing services provide IT resources as services on demand. This is considered a key concept, which will lead a shift from an ownership-based paradigm to a new pay-for-use paradigm, which can reduce the fixed cost for IT resources, and improve flexibility and scalability. As IT services, cloud services have evolved from early similar computing concepts such as network computing, utility computing, server-based computing, and grid computing. So research into cloud computing is highly related to and combined with various relevant computing research areas. To seek promising research issues and topics in cloud computing, it is necessary to understand the research trends in cloud computing more comprehensively. In this study, we collect bibliographic information and citation information for cloud computing related research papers published in major international journals from 1994 to 2012, and analyzes macroscopic trends and network changes to citation relationships among papers and the co-occurrence relationships of key words by utilizing social network analysis measures. Through the analysis, we can identify the relationships and connections among research topics in cloud computing related areas, and highlight new potential research topics. In addition, we visualize dynamic changes of research topics relating to cloud computing using a proposed cloud computing "research trend map." A research trend map visualizes positions of research topics in two-dimensional space. Frequencies of key words (X-axis) and the rates of increase in the degree centrality of key words (Y-axis) are used as the two dimensions of the research trend map. Based on the values of the two dimensions, the two dimensional space of a research map is divided into four areas: maturation, growth, promising, and decline. An area with high keyword frequency, but low rates of increase of degree centrality is defined as a mature technology area; the area where both keyword frequency and the increase rate of degree centrality are high is defined as a growth technology area; the area where the keyword frequency is low, but the rate of increase in the degree centrality is high is defined as a promising technology area; and the area where both keyword frequency and the rate of degree centrality are low is defined as a declining technology area. Based on this method, cloud computing research trend maps make it possible to easily grasp the main research trends in cloud computing, and to explain the evolution of research topics. According to the results of an analysis of citation relationships, research papers on security, distributed processing, and optical networking for cloud computing are on the top based on the page-rank measure. From the analysis of key words in research papers, cloud computing and grid computing showed high centrality in 2009, and key words dealing with main elemental technologies such as data outsourcing, error detection methods, and infrastructure construction showed high centrality in 2010~2011. In 2012, security, virtualization, and resource management showed high centrality. Moreover, it was found that the interest in the technical issues of cloud computing increases gradually. From annual cloud computing research trend maps, it was verified that security is located in the promising area, virtualization has moved from the promising area to the growth area, and grid computing and distributed system has moved to the declining area. The study results indicate that distributed systems and grid computing received a lot of attention as similar computing paradigms in the early stage of cloud computing research. The early stage of cloud computing was a period focused on understanding and investigating cloud computing as an emergent technology, linking to relevant established computing concepts. After the early stage, security and virtualization technologies became main issues in cloud computing, which is reflected in the movement of security and virtualization technologies from the promising area to the growth area in the cloud computing research trend maps. Moreover, this study revealed that current research in cloud computing has rapidly transferred from a focus on technical issues to for a focus on application issues, such as SLAs (Service Level Agreements).

Directions of Implementing Documentation Strategies for Local Regions (지역 기록화를 위한 도큐멘테이션 전략의 적용)

  • Seol, Moon-Won
    • The Korean Journal of Archival Studies
    • /
    • no.26
    • /
    • pp.103-149
    • /
    • 2010
  • Documentation strategy has been experimented in various subject areas and local regions since late 1980's when it was proposed as archival appraisal and selection methods by archival communities in the United States. Though it was criticized to be too ideal, it needs to shed new light on the potentialities of the strategy for documenting local regions in digital environment. The purpose of this study is to analyse the implementation issues of documentation strategy and to suggest the directions for documenting local regions of Korea through the application of the strategy. The documentation strategy which was developed more than twenty years ago in mostly western countries gives us some implications for documenting local regions even in current digital environments. They are as follows; Firstly, documentation strategy can enhance the value of archivists as well as archives in local regions because archivist should be active shaper of history rather than passive receiver of archives according to the strategy. It can also be a solution for overcoming poor conditions of local archives management in Korea. Secondly, the strategy can encourage cooperation between collecting institutions including museums, libraries, archives, cultural centers, history institutions, etc. in each local region. In the networked environment the cooperation can be achieved more effectively than in traditional environment where the heavy workload of cooperative institutions is needed. Thirdly, the strategy can facilitate solidarity of various groups in local region. According to the analysis of the strategy projects, it is essential to collect their knowledge, passion, and enthusiasm of related groups to effectively implement the strategy. It can also provide a methodology for minor groups of society to document their memories. This study suggests the directions of documenting local regions in consideration of current archival infrastructure of Korean as follows; Firstly, very selective and intensive documentation should be pursued rather than comprehensive one for documenting local regions. Though it is a very political problem to decide what subject has priority for documentation, interests of local community members as well as professional groups should be considered in the decision-making process seriously. Secondly, it is effective to plan integrated representation of local history in the distributed custody of local archives. It would be desirable to implement archival gateway for integrated search and representation of local archives regardless of the location of archives. Thirdly, it is necessary to try digital documentation using Web 2.0 technologies. Documentation strategy as the methodology of selecting and acquiring archives can not avoid subjectivity and prejudices of appraiser completely. To mitigate the problems, open documentation system should be prepared for reflecting different interests of different groups. Fourth, it is desirable to apply a conspectus model used in cooperative collection management of libraries to document local regions digitally. Conspectus can show existing documentation strength and future documentation intensity for each participating institution. Using this, documentation level of each subject area can be set up cooperatively and effectively in the local regions.

A Study on Knowledge Entity Extraction Method for Individual Stocks Based on Neural Tensor Network (뉴럴 텐서 네트워크 기반 주식 개별종목 지식개체명 추출 방법에 관한 연구)

  • Yang, Yunseok;Lee, Hyun Jun;Oh, Kyong Joo
    • Journal of Intelligence and Information Systems
    • /
    • v.25 no.2
    • /
    • pp.25-38
    • /
    • 2019
  • Selecting high-quality information that meets the interests and needs of users among the overflowing contents is becoming more important as the generation continues. In the flood of information, efforts to reflect the intention of the user in the search result better are being tried, rather than recognizing the information request as a simple string. Also, large IT companies such as Google and Microsoft focus on developing knowledge-based technologies including search engines which provide users with satisfaction and convenience. Especially, the finance is one of the fields expected to have the usefulness and potential of text data analysis because it's constantly generating new information, and the earlier the information is, the more valuable it is. Automatic knowledge extraction can be effective in areas where information flow is vast, such as financial sector, and new information continues to emerge. However, there are several practical difficulties faced by automatic knowledge extraction. First, there are difficulties in making corpus from different fields with same algorithm, and it is difficult to extract good quality triple. Second, it becomes more difficult to produce labeled text data by people if the extent and scope of knowledge increases and patterns are constantly updated. Third, performance evaluation is difficult due to the characteristics of unsupervised learning. Finally, problem definition for automatic knowledge extraction is not easy because of ambiguous conceptual characteristics of knowledge. So, in order to overcome limits described above and improve the semantic performance of stock-related information searching, this study attempts to extract the knowledge entity by using neural tensor network and evaluate the performance of them. Different from other references, the purpose of this study is to extract knowledge entity which is related to individual stock items. Various but relatively simple data processing methods are applied in the presented model to solve the problems of previous researches and to enhance the effectiveness of the model. From these processes, this study has the following three significances. First, A practical and simple automatic knowledge extraction method that can be applied. Second, the possibility of performance evaluation is presented through simple problem definition. Finally, the expressiveness of the knowledge increased by generating input data on a sentence basis without complex morphological analysis. The results of the empirical analysis and objective performance evaluation method are also presented. The empirical study to confirm the usefulness of the presented model, experts' reports about individual 30 stocks which are top 30 items based on frequency of publication from May 30, 2017 to May 21, 2018 are used. the total number of reports are 5,600, and 3,074 reports, which accounts about 55% of the total, is designated as a training set, and other 45% of reports are designated as a testing set. Before constructing the model, all reports of a training set are classified by stocks, and their entities are extracted using named entity recognition tool which is the KKMA. for each stocks, top 100 entities based on appearance frequency are selected, and become vectorized using one-hot encoding. After that, by using neural tensor network, the same number of score functions as stocks are trained. Thus, if a new entity from a testing set appears, we can try to calculate the score by putting it into every single score function, and the stock of the function with the highest score is predicted as the related item with the entity. To evaluate presented models, we confirm prediction power and determining whether the score functions are well constructed by calculating hit ratio for all reports of testing set. As a result of the empirical study, the presented model shows 69.3% hit accuracy for testing set which consists of 2,526 reports. this hit ratio is meaningfully high despite of some constraints for conducting research. Looking at the prediction performance of the model for each stocks, only 3 stocks, which are LG ELECTRONICS, KiaMtr, and Mando, show extremely low performance than average. this result maybe due to the interference effect with other similar items and generation of new knowledge. In this paper, we propose a methodology to find out key entities or their combinations which are necessary to search related information in accordance with the user's investment intention. Graph data is generated by using only the named entity recognition tool and applied to the neural tensor network without learning corpus or word vectors for the field. From the empirical test, we confirm the effectiveness of the presented model as described above. However, there also exist some limits and things to complement. Representatively, the phenomenon that the model performance is especially bad for only some stocks shows the need for further researches. Finally, through the empirical study, we confirmed that the learning method presented in this study can be used for the purpose of matching the new text information semantically with the related stocks.

Development of Beauty Experience Pattern Map Based on Consumer Emotions: Focusing on Cosmetics (소비자 감성 기반 뷰티 경험 패턴 맵 개발: 화장품을 중심으로)

  • Seo, Bong-Goon;Kim, Keon-Woo;Park, Do-Hyung
    • Journal of Intelligence and Information Systems
    • /
    • v.25 no.1
    • /
    • pp.179-196
    • /
    • 2019
  • Recently, the "Smart Consumer" has been emerging. He or she is increasingly inclined to search for and purchase products by taking into account personal judgment or expert reviews rather than by relying on information delivered through manufacturers' advertising. This is especially true when purchasing cosmetics. Because cosmetics act directly on the skin, consumers respond seriously to dangerous chemical elements they contain or to skin problems they may cause. Above all, cosmetics should fit well with the purchaser's skin type. In addition, changes in global cosmetics consumer trends make it necessary to study this field. The desire to find one's own individualized cosmetics is being revealed to consumers around the world and is known as "Finding the Holy Grail." Many consumers show a deep interest in customized cosmetics with the cultural boom known as "K-Beauty" (an aspect of "Han-Ryu"), the growth of personal grooming, and the emergence of "self-culture" that includes "self-beauty" and "self-interior." These trends have led to the explosive popularity of cosmetics made in Korea in the Chinese and Southeast Asian markets. In order to meet the customized cosmetics needs of consumers, cosmetics manufacturers and related companies are responding by concentrating on delivering premium services through the convergence of ICT(Information, Communication and Technology). Despite the evolution of companies' responses regarding market trends toward customized cosmetics, there is no "Intelligent Data Platform" that deals holistically with consumers' skin condition experience and thus attaches emotions to products and services. To find the Holy Grail of customized cosmetics, it is important to acquire and analyze consumer data on what they want in order to address their experiences and emotions. The emotions consumers are addressing when purchasing cosmetics varies by their age, sex, skin type, and specific skin issues and influences what price is considered reasonable. Therefore, it is necessary to classify emotions regarding cosmetics by individual consumer. Because of its importance, consumer emotion analysis has been used for both services and products. Given the trends identified above, we judge that consumer emotion analysis can be used in our study. Therefore, we collected and indexed data on consumers' emotions regarding their cosmetics experiences focusing on consumers' language. We crawled the cosmetics emotion data from SNS (blog and Twitter) according to sales ranking ($1^{st}$ to $99^{th}$), focusing on the ample/serum category. A total of 357 emotional adjectives were collected, and we combined and abstracted similar or duplicate emotional adjectives. We conducted a "Consumer Sentiment Journey" workshop to build a "Consumer Sentiment Dictionary," and this resulted in a total of 76 emotional adjectives regarding cosmetics consumer experience. Using these 76 emotional adjectives, we performed clustering with the Self-Organizing Map (SOM) method. As a result of the analysis, we derived eight final clusters of cosmetics consumer sentiments. Using the vector values of each node for each cluster, the characteristics of each cluster were derived based on the top ten most frequently appearing consumer sentiments. Different characteristics were found in consumer sentiments in each cluster. We also developed a cosmetics experience pattern map. The study results confirmed that recommendation and classification systems that consider consumer emotions and sentiments are needed because each consumer differs in what he or she pursues and prefers. Furthermore, this study reaffirms that the application of emotion and sentiment analysis can be extended to various fields other than cosmetics, and it implies that consumer insights can be derived using these methods. They can be used not only to build a specialized sentiment dictionary using scientific processes and "Design Thinking Methodology," but we also expect that these methods can help us to understand consumers' psychological reactions and cognitive behaviors. If this study is further developed, we believe that it will be able to provide solutions based on consumer experience, and therefore that it can be developed as an aspect of marketing intelligence.