• Title/Summary/Keyword: large-volume data (대용량 데이터)


Stand-alone Real-time Healthcare Monitoring Driven by Integration of Both Triboelectric and Electro-magnetic Effects (실시간 헬스케어 모니터링의 독립 구동을 위한 접촉대전 발전과 전자기 발전 원리의 융합)

  • Cho, Sumin;Joung, Yoonsu;Kim, Hyeonsu;Park, Minseok;Lee, Donghan;Kam, Dongik;Jang, Sunmin;Ra, Yoonsang;Cha, Kyoung Je;Kim, Hyung Woo;Seo, Kyoung Duck;Choi, Dongwhi
    • Korean Chemical Engineering Research / v.60 no.1 / pp.86-92 / 2022
  • The bio-healthcare market has recently been expanding worldwide for various reasons, including the COVID-19 pandemic. Within this market, biometric measurement and analysis technology is expected to bring about technological innovation and socio-economic ripple effects. Existing systems require a large-capacity battery to drive signal processing, wireless transmission, and an operating system. However, limited battery capacity places spatio-temporal limits on the use of the device. These limits can interrupt the data required for monitoring the user's health, which makes them one of the major obstacles for healthcare devices. In this study, we report the concept of a standalone healthcare monitoring module that is based on both triboelectric and electromagnetic effects and converts biomechanical energy into usable electric energy. The proposed system can operate independently without an external power source. In particular, the wireless foot pressure monitoring system, built around a rationally designed triboelectric sensor (TES), can recognize the user's walking habits through foot pressure measurement. By applying the triboelectric effect to the contact-separation behavior that occurs during walking, an effective foot pressure sensor was made; its performance was verified through the electrical output signal as a function of pressure, and its dynamic behavior was measured through a signal processing circuit using a capacitor. In addition, the biomechanical energy dissipated during walking is harvested as electrical energy through electromagnetic induction and used as a power source for wireless transmission and signal processing. Therefore, the proposed system has great potential to reduce the inconvenience of charging caused by limited battery capacity and to overcome the problem of data disconnection.

The Effect of Domain Specificity on the Performance of Domain-Specific Pre-Trained Language Models (도메인 특수성이 도메인 특화 사전학습 언어모델의 성능에 미치는 영향)

  • Han, Minah;Kim, Younha;Kim, Namgyu
    • Journal of Intelligence and Information Systems / v.28 no.4 / pp.251-273 / 2022
  • Research on applying deep learning to text analysis has steadily continued. In particular, studies have been actively conducted on understanding the meaning of words and performing tasks such as summarization and sentiment classification with pre-trained language models that learn from large datasets. However, existing pre-trained language models are limited in that they do not understand specific domains well. Therefore, in recent years, research has shifted toward creating language models specialized for a particular domain. Domain-specific pre-trained language models understand the knowledge of a particular domain better and show performance improvements on various tasks in that field. However, domain-specific further pre-training is expensive because corpus data of the target domain must be acquired. Furthermore, many cases have been reported in which the performance improvement after further pre-training is insignificant in some domains. As such, it is difficult to decide to develop a domain-specific pre-trained language model when it is not clear whether performance will improve substantially. In this paper, we present a way to check the expected performance improvement from further pre-training in a domain before actually performing it. Specifically, after selecting three domains, we measured the increase in classification accuracy obtained through further pre-training in each domain. We also developed new indicators that estimate the specificity of a domain based on the normalized frequency of the keywords used in that domain. Finally, we conducted classification using a pre-trained language model and a domain-specific pre-trained language model for each of the three domains. As a result, we confirmed that the higher the domain specificity index, the greater the performance improvement through further pre-training.
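
The abstract does not give the formula for its domain specificity indicators, so the following is only a minimal sketch of one way an index could be built from normalized keyword frequencies. The function names, the `top_k` parameter, and the "excess frequency over a general corpus" definition are all assumptions made for illustration, not the authors' method.

```python
from collections import Counter


def normalized_freq(tokens):
    """Return each token's relative frequency (its share of all tokens)."""
    counts = Counter(tokens)
    total = sum(counts.values())
    return {word: count / total for word, count in counts.items()}


def specificity_index(domain_tokens, general_tokens, top_k=3):
    """Hypothetical index: mean excess relative frequency of the domain's
    top_k most frequent keywords over their frequency in a general corpus."""
    dom = normalized_freq(domain_tokens)
    gen = normalized_freq(general_tokens)
    keywords = sorted(dom, key=dom.get, reverse=True)[:top_k]
    excess = [max(0.0, dom[w] - gen.get(w, 0.0)) for w in keywords]
    return sum(excess) / len(excess)


# Toy corpora (invented for this sketch).
general = "the market price rose and the patient went home".split()
medical = "patient dose patient tumor dose patient".split()
finance = "the market price and the market rose".split()

print("medical:", specificity_index(medical, general))
print("finance:", specificity_index(finance, general))
```

Under this assumed definition, a corpus whose most frequent words rarely occur in general text scores higher, which is the kind of signal the abstract relates to the gain from further pre-training.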

Efficient Topic Modeling by Mapping Global and Local Topics (전역 토픽의 지역 매핑을 통한 효율적 토픽 모델링 방안)

  • Choi, Hochang;Kim, Namgyu
    • Journal of Intelligence and Information Systems / v.23 no.3 / pp.69-94 / 2017
  • The growing demand for big data analysis has been driving the vigorous development of related technologies and tools. In addition, advances in IT and the increased penetration of smart devices are producing large amounts of data. As a result, data analysis technology is rapidly becoming popular, and attempts to gain insights through data analysis have been continuously increasing. This means that big data analysis will become more important in various industries for the foreseeable future. Big data analysis has generally been performed by a small number of experts and delivered to those who request the analysis. However, growing interest in big data analysis has stimulated computer programming education and the development of many data analysis programs. Accordingly, the entry barriers to big data analysis are gradually lowering and data analysis technology is spreading. As a result, big data analysis is expected to be performed by the requesters of the analysis themselves. Along with this, interest in various kinds of unstructured data is continually increasing, and particular attention is focused on text data. The emergence of new web-based platforms and techniques has brought about the mass production of text data and active attempts to analyze it. Furthermore, the results of text analysis have been utilized in various fields. Text mining is a concept that embraces various theories and techniques for text analysis. Among the many text mining techniques utilized for various research purposes, topic modeling is one of the most widely used and studied. Topic modeling is a technique that extracts the major issues from a large number of documents, identifies the documents that correspond to each issue, and provides the identified documents as a cluster. It is regarded as a very useful technique in that it reflects the semantic elements of the documents.
Traditional topic modeling is based on the distribution of key terms across the entire document set. Thus, the entire document set must be analyzed at once to identify the topic of each document. This requirement makes the analysis slow when topic modeling is applied to a large number of documents. In addition, it causes a scalability problem: processing time increases exponentially as the number of analysis objects increases. This problem is particularly noticeable when the documents are distributed across multiple systems or regions. To overcome these problems, a divide-and-conquer approach can be applied to topic modeling. This means dividing a large number of documents into sub-units and deriving topics by repeating topic modeling on each unit. This method makes topic modeling on a large number of documents possible with limited system resources and can improve the processing speed of topic modeling. It can also significantly reduce analysis time and cost, because documents can be analyzed in each location without first combining them. However, despite these advantages, the method has two major problems. First, the relationship between the local topics derived from each unit and the global topics derived from the entire document set is unclear. That is, local topics can be identified for each document, but global topics cannot. Second, a method for measuring the accuracy of such a methodology needs to be established. That is to say, taking the global topics as the ideal answer, the deviation of a local topic from a global topic needs to be measured. Because of these difficulties, this approach has not been studied sufficiently compared with other work on topic modeling. In this paper, we propose a topic modeling approach that solves the above two problems.
First, we divide the entire document cluster (global set) into sub-clusters (local sets) and generate a reduced entire document cluster (RGS, reduced global set) that consists of representative documents extracted from each local set. We address the first problem by mapping RGS topics to local topics. Along with this, we verify the accuracy of the proposed methodology by checking whether each document is assigned to the same topic in the global and local results. Using 24,000 news articles, we conduct experiments to evaluate the practical applicability of the proposed methodology. In addition, through an additional experiment, we confirmed that the proposed methodology can provide results similar to topic modeling on the entire set. We also proposed a reasonable method for comparing the results of both methods.
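
The mapping between RGS topics and local topics can be pictured with a small sketch. The abstract does not state which similarity measure is used, so cosine similarity over topic-word weights is an assumption here, as are the function names and the toy topics.

```python
import math


def cosine(a, b):
    """Cosine similarity between two sparse word-weight dictionaries."""
    dot = sum(weight * b.get(word, 0.0) for word, weight in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def map_local_to_rgs(local_topics, rgs_topics):
    """Map each local topic to the most similar RGS topic (assumed rule)."""
    mapping = {}
    for local_name, local_vec in local_topics.items():
        mapping[local_name] = max(
            rgs_topics, key=lambda g: cosine(local_vec, rgs_topics[g])
        )
    return mapping


# Toy topic-word weights (invented for this sketch).
rgs_topics = {
    "G_sports": {"game": 0.6, "team": 0.4},
    "G_economy": {"market": 0.7, "stock": 0.3},
}
local_topics = {
    "L1_a": {"team": 0.5, "game": 0.3, "coach": 0.2},
    "L1_b": {"stock": 0.6, "market": 0.4},
}

print(map_local_to_rgs(local_topics, rgs_topics))
```

Once every local topic carries an RGS label, documents from different local sets can be compared on a common topic vocabulary, which is what makes the accuracy check described in the abstract possible.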

Sentiment Analysis of Movie Review Using Integrated CNN-LSTM Model (CNN-LSTM 조합모델을 이용한 영화리뷰 감성분석)

  • Park, Ho-yeon;Kim, Kyoung-jae
    • Journal of Intelligence and Information Systems / v.25 no.4 / pp.141-154 / 2019
  • Internet technology and social media are growing rapidly. Data mining technology has evolved to enable unstructured document representations in a variety of applications. Sentiment analysis is an important technology that can distinguish low-quality from high-quality content through text data about products, and it has proliferated within text mining. Sentiment analysis mainly analyzes people's opinions in text data by assigning predefined categories such as positive and negative. It has been studied in various directions in terms of accuracy, from simple rule-based approaches to dictionary-based approaches using predefined labels. In fact, sentiment analysis is one of the most active research areas in natural language processing and is widely studied in text mining. Online reviews are not only easy to collect openly, but they also affect business. In marketing, real-world information from customers is gathered from websites rather than surveys. Whether a website's posts are positive or negative is reflected in sales, so companies try to identify this information. However, many reviews on a website are not always good and are difficult to identify. Earlier studies in this research area used review data from the Amazon.com shopping mall, whereas recent studies use data on stock market trends, blogs, news articles, weather forecasts, IMDB, Facebook, and so on. However, a lack of accuracy is recognized because sentiment calculations change according to the subject, paragraph, sentiment lexicon direction, and sentence strength. This study aims to classify the polarity of sentiment into positive and negative categories and to increase the prediction accuracy of the polarity analysis using the pretrained IMDB review data set.
First, as comparative models, the study adopts popular machine learning algorithms for text classification related to sentiment analysis, such as NB (naive Bayes), SVM (support vector machines), XGBoost, RF (random forests), and Gradient Boost. Second, deep learning has demonstrated the ability to extract complex, discriminative features from data. Representative algorithms are CNN (convolutional neural networks), RNN (recurrent neural networks), and LSTM (long short-term memory). CNN can be used similarly to BoW when processing a sentence in vector format, but it does not consider the sequential attributes of the data. RNN handles ordered data well because it takes the temporal information of the data into account, but it suffers from the long-term dependency problem. LSTM is used to solve the long-term dependency problem. For the comparison, CNN and LSTM were chosen as simple deep learning models. In addition to the classical machine learning algorithms, CNN, LSTM, and the integrated models were analyzed. Although the algorithms have many parameters, we examined the relationship between parameter values and precision to find the optimal combination. We also tried to determine how well the models work for sentiment analysis and how they work. This study proposes an integrated CNN and LSTM algorithm to extract the positive and negative features of text. The reasons for combining these two algorithms are as follows. CNN can automatically extract features for classification by applying convolution layers and massively parallel processing. LSTM is not capable of highly parallel processing. Like faucets, the input, output, and forget gates of the LSTM can be opened and controlled at a desired time. These gates have the advantage of placing memory blocks on hidden nodes. The memory block of the LSTM may not store all the data, but it can address the long-term dependency problem.
Furthermore, when LSTM is used after CNN's pooling layer, the model has an end-to-end structure, so that spatial and temporal features can be learned simultaneously. The combined CNN-LSTM model achieved 90.33% accuracy. It is slower than CNN but faster than LSTM. The presented model was more accurate than the other models. In addition, each word embedding layer can be improved when training the kernel step by step. CNN-LSTM can compensate for the weaknesses of each model, and the end-to-end structure of LSTM has the advantage of improving learning layer by layer. For these reasons, this study tries to enhance the classification accuracy of movie reviews using the integrated CNN-LSTM model.
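
To make the "convolution, then pooling, then LSTM" ordering concrete, here is a toy forward pass in plain Python. It is a scalar sketch with made-up weights, not the paper's architecture or hyperparameters: the kernel, the weight dictionary, and the function names are all assumptions, and nothing here is trained.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def conv1d_relu(seq, kernel, bias=0.0):
    """Valid 1-D convolution (cross-correlation) followed by ReLU."""
    k = len(kernel)
    return [
        max(0.0, sum(seq[i + j] * kernel[j] for j in range(k)) + bias)
        for i in range(len(seq) - k + 1)
    ]


def max_pool(seq, size=2):
    """Non-overlapping max pooling; a trailing partial window is dropped."""
    return [max(seq[i:i + size]) for i in range(0, len(seq) - size + 1, size)]


def lstm_step(x, h, c, w):
    """One scalar LSTM step with input (i), forget (f), and output (o) gates."""
    i = sigmoid(w["wi"] * x + w["ui"] * h + w["bi"])
    f = sigmoid(w["wf"] * x + w["uf"] * h + w["bf"])
    o = sigmoid(w["wo"] * x + w["uo"] * h + w["bo"])
    g = math.tanh(w["wg"] * x + w["ug"] * h + w["bg"])  # candidate memory
    c_new = f * c + i * g          # forget part of the old memory, add new
    h_new = o * math.tanh(c_new)   # expose a gated view of the memory
    return h_new, c_new


def cnn_lstm_score(seq, kernel, weights):
    """Convolve, pool, run the LSTM over pooled features, squash to (0, 1)."""
    feats = max_pool(conv1d_relu(seq, kernel))
    h = c = 0.0
    for x in feats:
        h, c = lstm_step(x, h, c, weights)
    return sigmoid(h)


# Made-up inputs: a sequence of scalar "word embeddings" and untrained weights.
weights = {name: 0.5 for name in
           ("wi", "ui", "bi", "wf", "uf", "bf", "wo", "uo", "bo", "wg", "ug", "bg")}
score = cnn_lstm_score([0.2, -0.1, 0.7, 0.4, -0.3, 0.9], [0.5, -0.5], weights)
print("positive-class score:", score)
```

The convolution and pooling stages process every window independently, while the LSTM loop must run step by step, which is consistent with the abstract's remark that the combined model is slower than CNN alone.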

Low price type inspection and monitoring system of lithium ion batteries for hybrid vessels (하이브리드 선박용 리튬 배터리의 저가형 감시시스템 구현)

  • Kwon, Hyuk-joo;Kim, Min-kwon;Lee, Sung-geun
    • Journal of Advanced Marine Engineering and Technology / v.40 no.1 / pp.28-33 / 2016
  • Batteries are used as the main power source in fields such as mobile devices, electric vehicles, and unmanned submarines, as starter and lamp drivers in general automobiles, and as emergency power sources in ships. These days, lead-acid and lithium ion batteries are increasingly used as secondary batteries. The lead-acid battery is comparatively inexpensive and safe. The lithium ion battery has a high energy density, excellent output characteristics, and a long life, but it carries the risk of explosion by reacting with moisture in the air. Recently, however, thanks to the development of waterproof, fireproof, and dustproof technology, lithium batteries have come into wide use. In particular, because their use has expanded to power sources for hybrid ships and electric propulsion ships, they need to be managed more strictly. A hybrid ship has power supply units connected in packs to provide a large power source of more than 500 kWh. Therefore, a wired inspection and monitoring system (WIIMS) that transmits the detected voltage, current, and temperature data required for battery management to a monitoring server needs a large number of communication modules and wires. This paper implements a low-price wireless inspection and monitoring system (WILIMS) for the lithium ion batteries of hybrid vessels using BLE wireless communication modules and a power line modem (PLM), which offer the advantages of low price and no additional electric lines compared with serial communication inspection systems (SCIS). The inspected quantities include the state of charge (SOC) and state of health (SOH), and the proposed system can help prevent safety accidents because checking them at regular intervals makes it possible to predict battery lifetime and perform preventive maintenance.

A Case Study: ICT and the Region-based Sharing Economy of a Start-up Social Enterprise (ICT 기반 지역 공유경제형 사회적 기업 사례 연구)

  • Roh, Taehyup
    • Information Systems Review / v.18 no.1 / pp.157-175 / 2016
  • Under the market economy of capitalism, several limitations reveal the inequity and redistribution problem of wealth, inefficiency of over-manufacturing and over-consumption, pollution of the natural environment, and the constraint of human liberty and dignity. The new challenge of symbiotic relationships that encourage individual corporations coincides with the need to practice social responsibility and share values to overcome these limitations. Social economy and the social enterprises that simultaneously pursue the making of corporate private profits and the realization of social values have been suggested and disseminated as alternative social value creators. Furthermore, the concept of a sharing economy, which refers to the sharing of things rather than owning them, is gaining traction as a new paradigm of capitalism. However, these efforts of social enterprises have fallen short against the conflicts between private profit and social values. This study deals with the case of a start-up social corporation, "Purun Bike Sharing Inc.," which is based on a regional sharing economy business model for bike rental services that use Information and Communication Technology (ICT). This corporation pursues harmonic management to achieve a balance between private profit and social value. Its corporate mission is to achieve sharing, coexistence, and contribution for public welfare. This mission is a possible idea for use in the local community network as a core key for sustainable social enterprises. The model can also be an alternative approach to overcome the structural friction in the social corporation. This study considers the case of Purun Bike Sharing as a sustainable way to practice a sharing economy business model based on a regional cooperation network, which can be combined with social value, and to apply ICT to a sharing economy system.
It also examines the definition and current state of social enterprises and the sharing economy, and the cases of the sharing economy business model for the review of prior research.

A Study on Network Effect by using System Dynamics Analysis: A Case of Cyworld (시스템 다이내믹스 기법을 이용한 네트워크 효과 분석: 싸이월드 사례)

  • Kim, Ga-Hye;Yang, Hee-Dong
    • Information Systems Review / v.11 no.1 / pp.161-179 / 2009
  • An increasing number of Internet users are running individual websites such as blogs or Cyworld. As this type of personal media has a great influence on communication among people, businesses have come to care about Network Effect, Network Software, and Social Network. For instance, Cyworld created a web service called 'Minihompy' for individual web logs and acquired 2.4 million users in 2007. Although many people assumed that the popularity of Minihompy or blogs would be a passing fad, Cyworld has improved its service and expanded its Network with various contents. This kind of expansion reflects efforts to survive the intense competition among ISPs (Internet Service Providers), with a focus on enhancing usability for users. However, Cyworld's Network Effect has been gradually diminishing. The low production costs of service vendors and the low searching/conversing costs of users combine to make it hard for ISPs to sustain their market share. To overcome this lackluster trend, Cyworld has adopted new strategies and tries to lock users into its service. The outcome of the various efforts to sustain and expand the Network Effect remains unclear and uncertain. If we understand beforehand how a service would improve the Network Effect, and which service could bring a greater effect, ISPs can get substantial help in launching their new business strategies. Despite many diverse ideas for increasing their users' time online, ISPs cannot guarantee how the new service strategies will translate into profitability. Therefore, this research studies the Network Effect of Cyworld's 'Minihompy' using the System Dynamics method, which can analyze the dynamic relation between users and ISPs. Furthermore, the research aims to predict changes in the Network Effect based on the strategy of a new service. 'Page View' and 'Duration Time' can be enhanced in the short term because the new services enhance service functionality. However, these services cannot increase the Network in the long run.
Limitations of this research include that we predict the future based only on limited data. We also limit the independent variables affecting the Network Effect to the following two: increasing the number of users and increasing the Service Functionality. Despite these limitations, this study may offer some insights to policy makers and others facing stiff competition in the network business.

Design of a Crowd-Sourced Fingerprint Mapping and Localization System (군중-제공 신호지도 작성 및 위치 추적 시스템의 설계)

  • Choi, Eun-Mi;Kim, In-Cheol
    • KIPS Transactions on Software and Data Engineering / v.2 no.9 / pp.595-602 / 2013
  • WiFi fingerprinting is well known as an effective localization technique for indoor environments. However, this technique requires a large number of pre-built fingerprint maps covering the entire space. Moreover, due to environmental changes, these maps have to be rebuilt or updated periodically by experts. As a way to avoid this problem, crowd-sourced fingerprint mapping has attracted considerable interest from researchers. This approach lets many volunteer users share the WiFi fingerprints they collect in a specific environment. Therefore, crowd-sourced fingerprinting can keep fingerprint maps up to date automatically. In most previous systems, however, individual users were asked to enter their positions manually to build their local fingerprint maps. Moreover, these systems do not have any principled mechanism to keep fingerprint maps clean by detecting and filtering out erroneous fingerprints collected from multiple users. In this paper, we present the design of a crowd-sourced fingerprint mapping and localization (CMAL) system. The proposed system can not only automatically build and update WiFi fingerprint maps from fingerprint collections provided by multiple smartphone users, but also simultaneously track their positions using the up-to-date maps. The CMAL system consists of multiple clients that run on individual smartphones to collect fingerprints and a central server that maintains a database of fingerprint maps. Each client contains a particle filter-based WiFi SLAM engine that tracks the smartphone user's position and builds a local fingerprint map. The server adopts a Gaussian interpolation-based error filtering algorithm to maintain the integrity of the fingerprint maps. Through various experiments, we show the high performance of our system.
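
For readers unfamiliar with fingerprint-based localization, the basic lookup step can be sketched as a nearest-neighbor search in signal space. This is a generic illustration of WiFi fingerprinting, not the CMAL system's particle filter SLAM engine or its Gaussian interpolation filter; the map layout, the `missing` RSSI default, and the function names are assumptions.

```python
import math


def rssi_distance(a, b, missing=-100.0):
    """Euclidean distance between two fingerprints (dicts of AP -> RSSI in dBm).
    An access point absent from one fingerprint is treated as very weak."""
    access_points = set(a) | set(b)
    return math.sqrt(sum(
        (a.get(ap, missing) - b.get(ap, missing)) ** 2 for ap in access_points
    ))


def locate(observed, fingerprint_map, k=2):
    """Estimate position as the mean of the k map points whose stored
    fingerprints are closest to the observed one."""
    nearest = sorted(
        fingerprint_map, key=lambda entry: rssi_distance(observed, entry[1])
    )[:k]
    x = sum(pos[0] for pos, _ in nearest) / len(nearest)
    y = sum(pos[1] for pos, _ in nearest) / len(nearest)
    return (x, y)


# Toy fingerprint map: ((x, y) position, {access point: RSSI}) pairs.
fingerprint_map = [
    ((0, 0), {"ap1": -40, "ap2": -70}),
    ((10, 0), {"ap1": -70, "ap2": -40}),
    ((0, 10), {"ap1": -55, "ap2": -80}),
]
observed = {"ap1": -42, "ap2": -69}

print(locate(observed, fingerprint_map, k=1))
print(locate(observed, fingerprint_map, k=2))
```

A lookup like this is only as good as the stored map, which is why the abstract stresses filtering out erroneous crowd-sourced fingerprints on the server.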

A Spatio-Temporal Clustering Technique for the Moving Object Path Search (이동 객체 경로 탐색을 위한 시공간 클러스터링 기법)

  • Lee, Ki-Young;Kang, Hong-Koo;Yun, Jae-Kwan;Han, Ki-Joon
    • Journal of Korea Spatial Information System Society / v.7 no.3 s.15 / pp.67-81 / 2005
  • With the development of Geographic Information Systems, interest in and research on new application services such as Location Based Service and Telematics, which provide emergency services, neighborhood information search, and route search, have been increasing. A user's search in the spatio-temporal database used in Location Based Service or Telematics usually fixes the current time on the time axis and queries the spatial and aspatial attributes. Thus, if the query range on the time axis is extensive, it is difficult to process the search operation efficiently. To solve this problem, the snapshot, a method that summarizes the location data of moving objects, was introduced. However, if the range of data to store is wide, more storage space is required. In addition, snapshots are created even for unnecessary space that is not frequently searched. Thus, the snapshot method generally wastes storage space and memory. Therefore, in this paper, we suggest the Hash-based Spatio-Temporal Clustering Algorithm (H-STCA), which extends the two-dimensional spatial hash algorithm previously used for spatial clustering to a three-dimensional spatial hash algorithm in order to overcome the disadvantages of the snapshot method. This paper also suggests a knowledge extraction algorithm that extracts knowledge for the path search of moving objects from past location data based on the suggested H-STCA. Moreover, in the performance evaluation on a huge amount of moving object data, the snapshot clustering method using H-STCA outperformed the spatio-temporal index methods and the original snapshot method in search time, storage structure construction time, and optimal path search time.
In particular, the advantage of the snapshot clustering method using H-STCA over the existing spatio-temporal index methods and the original snapshot method grew as the number of moving objects increased.
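
The idea of extending a two-dimensional spatial hash with a time axis can be sketched as follows. The abstract does not describe H-STCA's actual hash function or bucket structure, so the grid-cell key, the `cell_size` and `time_slot` parameters, and the function names below are illustrative assumptions only.

```python
from collections import defaultdict


def cell_key(x, y, t, cell_size, time_slot):
    """Hash a location sample to a 3-D grid cell: (x cell, y cell, time slot)."""
    return (int(x // cell_size), int(y // cell_size), int(t // time_slot))


def build_clusters(points, cell_size=100.0, time_slot=60.0):
    """Group (object id, x, y, t) samples by their spatio-temporal cell.
    Only cells that actually contain samples are stored."""
    buckets = defaultdict(list)
    for obj_id, x, y, t in points:
        buckets[cell_key(x, y, t, cell_size, time_slot)].append(obj_id)
    return buckets


# Toy samples: object id, x (m), y (m), t (s).
points = [
    ("a", 10, 20, 5),
    ("b", 90, 99, 30),
    ("c", 150, 20, 5),
    ("a", 10, 20, 65),
]
clusters = build_clusters(points)
for key in sorted(clusters):
    print(key, clusters[key])
```

Because empty cells never get an entry, a structure like this avoids allocating storage for regions that are rarely visited, which is the weakness of the snapshot method that the abstract points out.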


Job Preference Analysis and Job Matching System Development for the Middle Aged Class (중장년층 일자리 요구사항 분석 및 인력 고용 매칭 시스템 개발)

  • Kim, Seongchan;Jang, Jincheul;Kim, Seong Jung;Chin, Hyojin;Yi, Mun Yong
    • Journal of Intelligence and Information Systems / v.22 no.4 / pp.247-264 / 2016
  • With the rapid acceleration of the low birth rate and population aging, the employment of neglected groups, including the middle aged class, is a crucial issue in South Korea. In particular, in the 2010s, the number of middle aged people who want to find a new job after retirement age is increasing significantly as the baby boom generation (born 1955-1963) reaches retirement. Despite the importance of matching jobs to this emerging middle aged class, neither private job portals nor the Korean government provide any online job service tailored to them. A gigantic amount of job information is available online; however, current recruiting systems do not meet the demand of the middle aged class because their primary targets are young workers. A recruiting system specially designed for the middle aged is urgently needed. Meanwhile, when users search for desired occupations on the Worknet website, provided by the Korean Ministry of Employment and Labor, they find it inconvenient to search for similar jobs because Worknet filters search results on the basis of exact matches with a preferred job code. Besides, according to our Worknet data analysis, only about 24% of job seekers landed a position consistent with their initial preferred job code, while the rest landed a position different from their initial preference. To improve the situation, particularly for the middle aged class, we investigate a soft job matching technique by performing the following: 1) we review the user behavior logs of Worknet, which is a public job recruiting system set up by the Korean government, and point out key system design implications for the middle aged.
Specifically, we analyze the job postings that include preferential tags for the middle aged in order to reveal what types of jobs favor the middle aged; 2) we develop a new occupation classification scheme for the middle aged, Korea Occupation Classification for the Middle-aged (KOCM), based on the similarity between jobs, by reorganizing and modifying a general occupation classification scheme. From the perspective of job placement, an occupation classification scheme is a way to connect enterprises and job seekers and a basic mechanism for job placement. The key features of KOCM include establishing the Simple Labor category, which is the category most requested by enterprises; and 3) we design MOMA (Middle-aged Occupation Matching Algorithm), a hybrid job matching algorithm comprising constraint-based reasoning and case-based reasoning. MOMA incorporates KOCM to expand a query so that similar jobs in the database are also searched. MOMA uses the cosine similarity between the user requirement and each job posting to rank a set of postings in terms of preferred job code, salary, distance, and job type. The developed system using MOMA demonstrates about a 20-fold improvement over hard matching performance. In implementing the algorithm as a web-based recruiting application for the middle aged, we also considered usability, which is especially important for this particular class of users. That is, we aimed to make the job search process easier for middle aged users by asking them to enter only a few simple, core pieces of information, such as preferred job (job code), salary, and (allowable) distance to the workplace, enabling them to find a job suited to their needs efficiently. The Web site implemented with MOMA should contribute to improving job search for the middle aged class.
We also expect the overall approach to be applicable to other groups of people for the improvement of job matching results.
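
The combination of query expansion, constraint filtering, and cosine ranking can be sketched in a few lines. This is an illustration under stated assumptions, not MOMA itself: the `SIMILAR_CODES` table stands in for KOCM, the feature encoding and the 0.5 weight for similar codes are invented, and the case-based reasoning component is not modeled.

```python
import math

# Hypothetical stand-in for KOCM: each job code maps to codes treated as similar.
SIMILAR_CODES = {
    "guard": {"guard", "facility_mgmt"},
    "driver": {"driver", "delivery"},
}


def cosine(u, v):
    """Cosine similarity between two equal-length numeric vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def features(code_match, salary, distance_km, max_salary=400.0, max_distance=50.0):
    """Assumed encoding: code match, scaled salary, and scaled closeness."""
    return [code_match, salary / max_salary, 1.0 - distance_km / max_distance]


def match_jobs(request, postings):
    """Expand the job code, drop postings that violate hard constraints,
    then rank the rest by cosine similarity to the request."""
    codes = SIMILAR_CODES.get(request["code"], {request["code"]})
    candidates = [
        p for p in postings
        if p["code"] in codes
        and p["salary"] >= request["min_salary"]
        and p["distance_km"] <= request["max_distance_km"]
    ]
    ideal = features(1.0, request["min_salary"], 0.0)

    def score(p):
        code_match = 1.0 if p["code"] == request["code"] else 0.5
        return cosine(ideal, features(code_match, p["salary"], p["distance_km"]))

    return sorted(candidates, key=score, reverse=True)


# Toy request and postings (invented for this sketch).
request = {"code": "guard", "min_salary": 200, "max_distance_km": 20}
postings = [
    {"id": "A", "code": "guard", "salary": 220, "distance_km": 5},
    {"id": "B", "code": "facility_mgmt", "salary": 220, "distance_km": 5},
    {"id": "C", "code": "driver", "salary": 300, "distance_km": 2},
    {"id": "D", "code": "guard", "salary": 150, "distance_km": 3},
]
print([p["id"] for p in match_jobs(request, postings)])
```

With exact code matching only, posting B would never be shown; expanding the query to similar codes is what lets a soft matcher surface it, which is the behavior the abstract contrasts with hard matching.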