• Title/Summary/Keyword: Large Complex Systems

A Construction of the C_MDR(Component_MetaData Registry) for the Environment of Exchanging the Component (컴포넌트 유통환경을 위한 컴포넌트 메타데이타 레지스트리 구축 : C_MDR)

  • Song, Chee-Yang;Yim, Sung-Bin;Baik, Doo-Kwon;Kim, Chul-Hong
    • Journal of KIISE: Computing Practices and Letters / v.7 no.6 / pp.614-629 / 2001
  • As the global-Internet-based information society of the 21st century advances, software is becoming larger and more complex, and demand for it is growing rapidly. Promoting reuse through the development and exchange of standardized components has therefore become an important issue in both academia and industry. Foreign marketplaces currently provide information services for commercial components as vendor-specific products, but the components serviced in each marketplace are inconsistent, insufficient, and unstandardized; that is, no component data registry based on ISO 11179 has been constructed. The national government has accordingly stepped up its plan to release public components in 2001, so tools for sharing and exchanging data must support standardized component meta-information. In this paper, we propose the C_MDR system, a tool for registering and managing standardized meta-information for commercial common components based on ISO 11179. The purpose of the system is to share and exchange data systematically and thereby accelerate component reuse. We present a specification framework for component meta-information, define the meta-information according to this framework, and represent it in XML to enhance interoperability with other systems. We also show that a three-layered representation keeps the modeling simple and understandable. As an implementation, we construct a prototype component meta-information system on the Web, using ASP as the development language and the Oracle RDBMS for PC. We expect this work to contribute to the standardization of exchanged component metadata and to be applicable to component-exchange reuse tools.
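As a sketch of how a registered component's meta-information might be serialized to XML for exchange, the snippet below builds a single registry entry. The element names and fields are hypothetical, chosen only to echo the ISO 11179 data-element style; they are not taken from the C_MDR specification.

```python
# Illustrative sketch only: hypothetical element names, not the C_MDR schema.
import xml.etree.ElementTree as ET

def build_component_record(name, version, vendor, description):
    # One registry entry, loosely following ISO 11179's data-element style.
    comp = ET.Element("Component")
    ET.SubElement(comp, "Name").text = name
    ET.SubElement(comp, "Version").text = version
    ET.SubElement(comp, "Vendor").text = vendor
    ET.SubElement(comp, "Description").text = description
    return comp

record = build_component_record("ShoppingCart", "1.2", "AcmeSoft",
                                "Reusable cart component for e-commerce sites")
print(ET.tostring(record, encoding="unicode"))
```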

Financial Fraud Detection using Text Mining Analysis against Municipal Cybercriminality (지자체 사이버 공간 안전을 위한 금융사기 탐지 텍스트 마이닝 방법)

  • Choi, Sukjae;Lee, Jungwon;Kwon, Ohbyung
    • Journal of Intelligence and Information Systems / v.23 no.3 / pp.119-138 / 2017
  • Recently, SNS has become an important channel not only for personal communication but also for marketing. However, cybercrime has evolved along with information and communication technology, and illegal advertising is distributed over SNS in large quantities; as a result, personal information is leaked and monetary damages occur ever more frequently. In this study, we propose a method for analyzing which sentences and documents posted to SNS are related to financial fraud. As a conceptual framework, we first developed a matrix relating the characteristics of cybercriminality on SNS to emergency management, and suggested an emergency management process consisting of pre-cybercriminality steps (e.g., risk identification) and post-cybercriminality steps; this paper focuses on risk identification. The main process consists of data collection, preprocessing, and analysis. First, we selected the two words 'daechul' (loan) and 'sachae' (private loan) as seed words and collected data containing them from SNS such as Twitter. Two researchers examined the collected data and judged whether each item was related to cybercriminality, particularly financial fraud; we then selected keywords from the vocabulary related to nominals and symbols. With the selected keywords, we searched and collected more than 820,000 articles from web sources such as Twitter, news sites, and blogs. The collected articles were refined through preprocessing and turned into learning data. Preprocessing is divided into three steps: morphological analysis, stop-word removal, and valid part-of-speech selection. In the morphological analysis step, a complex sentence is decomposed into morpheme units to enable mechanical analysis. In the stop-word removal step, non-lexical elements such as numbers, punctuation marks, and double spaces are removed from the text. In the part-of-speech selection step, only nouns and symbols are kept: nouns express the intent of a message better than other parts of speech, and the more illegal a text is, the more frequently symbols tend to appear. Each selected item is then labeled 'legal' or 'illegal', which is necessary to turn the selected data into learning data. The processed data are converted into a corpus and a document-term matrix. Finally, the 'legal' and 'illegal' files were mixed and randomly divided into a learning set (70%) and a test set (30%). SVM was used as the discrimination algorithm. Since SVM requires gamma and cost values as its main parameters, we set gamma to 0.5 and cost to 10 based on the optimal value function; the cost is set higher than in typical cases. To show the feasibility of the proposed method, we compared it with MLE (Maximum Likelihood Estimation), term frequency, and collective intelligence methods, using overall accuracy as the metric. The overall accuracy of the proposed method was 92.41% for illegal loan advertisements and 77.75% for illegal door-to-door sales, clearly superior to term frequency, MLE, and the other baselines. Hence, the results suggest that the proposed method is valid and practically usable.
In this paper, we propose a framework for managing crises caused by anomalies in unstructured data sources such as SNS. We hope this study contributes to academia by identifying what to consider when applying an SVM-like discrimination algorithm to text analysis, and to practitioners in the fields of brand management and opinion mining.
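The classification pipeline described above can be sketched with scikit-learn as follows. The toy texts, the CountVectorizer defaults, and the RBF kernel are assumptions; gamma=0.5, cost (C)=10, and the 70/30 split follow the abstract.

```python
# Minimal sketch of the paper's SVM pipeline; toy data and RBF kernel assumed.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score

texts  = ["cheap daechul call now!!", "city festival this weekend",
          "sachae same-day approval $$$", "library opening hours changed",
          "no-document daechul 100% !!", "weather warning for tomorrow"]
labels = ["illegal", "legal", "illegal", "legal", "illegal", "legal"]

# Document-term matrix built from the (already preprocessed) texts.
dtm = CountVectorizer().fit_transform(texts)

X_train, X_test, y_train, y_test = train_test_split(
    dtm, labels, test_size=0.3, random_state=42, stratify=labels)

# gamma and cost values follow those reported in the abstract.
clf = SVC(kernel="rbf", gamma=0.5, C=10).fit(X_train, y_train)
print("accuracy:", accuracy_score(y_test, clf.predict(X_test)))
```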

An Integrated Model of Land/Transportation System

  • 이상용
    • Proceedings of the KOR-KST Conference / 1995.12a / pp.45-73 / 1995
  • This paper presents a system dynamics model that generates land use and transportation system performance simultaneously. The model system consists of seven submodels (population, migration of population, household, job growth-employment-land availability, housing development, travel demand, and traffic congestion level), each designed on the basis of causal functions and feedback-loop structures among a large number of physical, socio-economic, and policy variables. The system dynamics model has three important advantages. First, it can address the complex interactions between land use and transportation system performance dynamically, so it can be an effective tool for evaluating the year-by-year effects of a policy over long time horizons. Second, unlike conventional models, it does not rely on the assumption that urban systems are in equilibrium, since it determines the state of the model components directly through dynamic system simulation. Third, because it consists of many separate equations, it is very flexible in incorporating new features, such as a policy, a phenomenon that has not existed in the past, a special event, or a useful concept from another methodology. Chapters I, II, and III discuss the overall approach and structure of the model system with causal-loop diagrams and major equations. In Chapter V, the developed model is applied to analyzing the impact of highway capacity expansion on land use in Montgomery County, MD; the year-by-year impacts of the expansion on congestion level and land use are analyzed under several possible expansion scenarios. This is the first comprehensive attempt to use dynamic system simulation modeling to treat land use and transportation system interactions simultaneously. The model structure is not very elaborate, mainly because of limited availability of behavioral data, but the performance results indicate that the proposed approach is a promising way to deal comprehensively with complicated urban land use/transportation systems.
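To make the feedback-loop idea concrete, here is a deliberately tiny stock-and-flow simulation in Python, not the paper's seven-submodel system: housing development raises travel demand, demand raises congestion, and congestion feeds back to dampen further development. All coefficients and initial values are invented for illustration.

```python
# Toy stock-and-flow loop: development -> demand -> congestion -> development.
capacity = 100.0          # fixed highway capacity (vehicle units), invented
households = 50.0         # stock: number of households, invented
for year in range(1, 11):
    demand = 1.5 * households            # travel demand driven by households
    congestion = demand / capacity       # volume/capacity ratio
    # Development slows as congestion grows: a simple negative feedback.
    growth = 5.0 * max(0.0, 1.0 - congestion)
    households += growth
    print(f"year {year:2d}: households={households:6.1f}  v/c={congestion:.2f}")
```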

A Study on the Determinants of Demand for Visiting Department Stores Using Big Data (POS) (빅데이터(POS)를 활용한 백화점 방문수요 결정요인에 관한 연구)

  • Shin, Seong Youn;Park, Jung A
    • Land and Housing Review / v.13 no.4 / pp.55-71 / 2022
  • Recently, the domestic department store industry has been growing into complex shopping and cultural spaces, advanced and differentiated by changes in consumption patterns, and competition is intensifying across the 70 stores operated by five large companies. This study investigates the determinants of visits to department stores using the automatic vehicle access system (POS), a big-data source, and proposes ways to strengthen the competitiveness of the department store industry. We use negative binomial regression to predict the frequency of visits to 67 branches, excluding three branches whose annual sales data were incomplete because they newly opened in 2021. The results show that demand for visiting department stores is positively associated with airports, terminals, and train stations; land area; parking lots; the number of VIP lounges; the ratio of luxury stores; the number of F&B stores; non-commercial areas; and hotels. We suggest four strategies for enhancing the competitiveness of domestic department stores. First, department store consumers have a strong preference for luxury brands, so department stores need to form their own overseas buyer teams to discover and attract new luxury brands and the customers who demand them; moreover, attracting consumers with high purchasing power and loyalty requires more differentiated products and services for VIP customers than before. Second, it is desirable to focus on transportation hubs such as train stations, airports, and terminals in Gyeonggi and Incheon. Third, department stores should attract tenants who can satisfy customers, given that key tenants are an important component of advanced shopping centers. Finally, the department store, as a top-end shopping center, should be developed into a space with differentiated shopping, culture, dining, and leisure services, such as "The Hyundai", which opened in 2021, to secure future growth potential.
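A minimal sketch of the negative binomial regression step, using statsmodels on synthetic data; the two predictors shown here stand in for the study's much richer covariate set over the 67 branches.

```python
# Negative binomial regression sketch; data and predictors are placeholders.
import numpy as np
import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n = 67                                    # one row per branch, as in the study
df = pd.DataFrame({
    "parking": rng.integers(200, 2000, n),       # parking lot size (invented)
    "luxury_ratio": rng.uniform(0.0, 0.4, n),    # share of luxury stores
})
mu = np.exp(4 + 0.0005 * df.parking + 2.0 * df.luxury_ratio)
df["visits"] = rng.poisson(mu)                   # synthetic visit counts

model = smf.glm("visits ~ parking + luxury_ratio", data=df,
                family=sm.families.NegativeBinomial()).fit()
print(model.summary().tables[1])
```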

The Dynamics of CO2 Budget in Gwangneung Deciduous Old-growth Forest: Lessons from the 15 years of Monitoring (광릉 낙엽활엽수 노령림의 CO2 수지 역학: 15년 관측으로부터의 교훈)

  • Yang, Hyunyoung;Kang, Minseok;Kim, Joon;Ryu, Daun;Kim, Su-Jin;Chun, Jung-Hwa;Lim, Jong-Hwan;Park, Chan Woo;Yun, Soon Jin
    • Korean Journal of Agricultural and Forest Meteorology / v.23 no.4 / pp.198-221 / 2021
  • After large-scale reforestation in the 1960s and 1970s, forests in Korea have gradually been aging. The net ecosystem CO2 exchange of an old-growth forest is theoretically near zero; however, it can be a CO2 sink or source depending on disturbance or management. In this study, we report the CO2 budget dynamics of the Gwangneung deciduous old-growth forest (GDK) in Korea and examine two questions: (1) is the preserved GDK indeed CO2 neutral, as theory suggests? and (2) can the dynamics of its CO2 budget be explained by the mechanisms commonly reported in the literature? To answer these questions, we analyzed 15 years of CO2 flux data measured by the eddy covariance technique, along with other biometeorological data, at the KoFlux GDK site from 2006 to 2020. The results showed that (1) GDK switched back and forth between being a sink and a source of CO2 but on average was a weak CO2 source (turning into a moderate CO2 source over the most recent five years), and (2) the interannual variability of solar radiation, growing season length, and leaf area index was positively correlated with that of gross primary production (GPP) (R2 = 0.32~0.45), whereas the interannual variability of air and surface temperature was not significantly correlated with that of ecosystem respiration (RE). Furthermore, a machine learning model trained on the first 10 years of monitoring failed to reproduce the observed interannual variations of GPP and RE for the most recent five years. Biomass data analysis suggests that carbon emissions from coarse woody debris may have contributed in part to the conversion to a moderate CO2 source. To properly understand and interpret the long-term CO2 budget dynamics of GDK, a new framework of analysis and modeling based on complex systems science is needed, and it is important to maintain the flux monitoring and data quality along with monitoring of coarse woody debris and disturbances.
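The train-early/test-late evaluation mentioned above can be illustrated schematically: fit a model on the first ten years of (synthetic) driver data and check whether it tracks the last five. The variables and the imposed regime shift are placeholders, not the KoFlux GDK data.

```python
# Schematic train-early/test-late check; all values are synthetic stand-ins.
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(1)
years = 15
radiation = rng.normal(15, 2, years)      # proxy driver, e.g. solar radiation
lai = rng.normal(4, 0.5, years)           # proxy leaf area index
gpp = 50 * radiation + 100 * lai + rng.normal(0, 20, years)
gpp[10:] -= 300                           # a regime shift in the last 5 years

X = np.column_stack([radiation, lai])
model = RandomForestRegressor(random_state=0).fit(X[:10], gpp[:10])
pred = model.predict(X[10:])
print("observed :", gpp[10:].round(0))
print("predicted:", pred.round(0))  # misses a shift the drivers don't encode
```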

Multi-Variate Tabular Data Processing and Visualization Scheme for Machine Learning based Analysis: A Case Study using Titanic Dataset (기계 학습 기반 분석을 위한 다변량 정형 데이터 처리 및 시각화 방법: Titanic 데이터셋 적용 사례 연구)

  • Juhyoung Sung;Kiwon Kwon;Kyoungwon Park;Byoungchul Song
    • Journal of Internet Computing and Services / v.25 no.4 / pp.121-130 / 2024
  • As information and communication technology (ICT) improves exponentially, the types and amount of available data also increase. Although data analysis, including statistics, is essential for utilizing this large amount of data, there are inevitable limits to processing diverse and complex data in conventional ways. Meanwhile, with enhanced computational performance and growing demand for autonomous systems, there have been many attempts to apply machine learning (ML) to problems in various fields. In particular, processing the data fed to a model and designing the model to solve the objective function are critical to achieving good model performance. Data processing methods appropriate to data types and properties have been presented in many studies, and ML performance varies greatly with the method chosen. Nevertheless, it is difficult to decide which data processing method to use, since the types and characteristics of data have become ever more diverse; in particular, multi-variate data processing is essential for solving non-linear problems with ML. In this paper, we present a multi-variate tabular data processing scheme for ML-aided data analysis, using the Titanic dataset from Kaggle, which contains various kinds of data. We present methods such as input variable filtering based on statistical analysis and normalization according to data properties, and we analyze the data structure using visualization. Lastly, we design an ML model, train it with the proposed multi-variate data processing, and analyze the trained model's performance in predicting passenger survival. We expect the proposed multi-variate data processing and visualization to extend to various environments for ML-based analysis.
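A condensed sketch of this kind of multi-variate processing, using the Titanic data bundled with seaborn (the paper used the Kaggle release, whose columns differ slightly); the imputation, encoding, and model choices here are illustrative, not the authors' exact recipe.

```python
# Illustrative Titanic processing: impute, encode, normalize, fit, score.
import seaborn as sns
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression

df = sns.load_dataset("titanic")[["survived", "pclass", "sex", "age", "fare"]]
df["age"] = df["age"].fillna(df["age"].median())   # impute missing ages
df["sex"] = (df["sex"] == "female").astype(int)    # encode categorical input

X = StandardScaler().fit_transform(df[["pclass", "sex", "age", "fare"]])
X_tr, X_te, y_tr, y_te = train_test_split(X, df["survived"], test_size=0.2,
                                          random_state=0)
clf = LogisticRegression().fit(X_tr, y_tr)
print("survival prediction accuracy:", round(clf.score(X_te, y_te), 3))
```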

Different Look, Different Feel: Social Robot Design Evaluation Model Based on ABOT Attributes and Consumer Emotions (각인각색, 각봇각색: ABOT 속성과 소비자 감성 기반 소셜로봇 디자인평가 모형 개발)

  • Ha, Sangjip;Lee, Junsik;Yoo, In-Jin;Park, Do-Hyung
    • Journal of Intelligence and Information Systems / v.27 no.2 / pp.55-78 / 2021
  • To solve complex and diverse social problems and ensure individuals' quality of life, social robots that can interact with humans are attracting attention. In the past, robots were seen as providers of labor, deployed at industrial sites on behalf of humans. With the advent of smart technology, an important driver in most industries, the concept of the robot has been extended to social robots that coexist with humans and enable social interaction: service robots that respond to customers, robots built for edutainment, and emotional robots that can interact with humans intimately. However, robots have not yet become widespread, despite the modern ICT service environment and the 4th industrial revolution. Considering that social interaction with users is a key function of social robots, factors other than the underlying technology must also be considered; in fact, a robot's design elements matter more than any other factor in leading consumers to purchase a social robot. Existing studies on social robots either propose "robot development methodologies" or test, piecemeal, the effects social robots have on users. Meanwhile, the consumer emotions evoked by a robot's appearance strongly influence how users form perceptions, inferences, evaluations, and expectations, and can affect attitudes toward robots, liking, performance inferences, and so on. Therefore, this study verifies the effect of a social robot's appearance and the associated consumer emotions on consumers' attitudes toward the robot. To do so, we construct a social robot design evaluation model by combining heterogeneous data from different sources. Specifically, the model includes three quantitative indicators of social robot appearance from the ABOT Database. Consumer emotions about social robot design were collected from (1) the existing design evaluation literature, (2) online buzz such as product reviews and blogs, and (3) qualitative interviews on social robot design. We then collected consumer emotion and attitude scores for various social robots through a large-scale consumer survey. First, we derived six major dimensions of consumer emotion from 23 detailed emotions through dimension reduction. Then, statistical analysis verified the effect of the derived consumer emotions on attitudes toward social robots. Finally, moderated regression analysis verified how the quantitative indicators of social robot appearance affect the relationship between consumer emotions and attitudes toward social robots. Interestingly, several significant moderation effects were identified; these are visualized as two-way interaction effects and interpreted from multidisciplinary perspectives. Theoretically, this study empirically verifies every stage from technical properties to consumers' emotions and attitudes toward social robots by linking data from heterogeneous sources. Practically, the results help develop consumer-emotion-based design guidelines for the design stage of social robot development.
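The moderated regression step might look like the following statsmodels sketch on synthetic data, where a made-up "human-likeness" appearance score (in the spirit of the ABOT indices) moderates the effect of a consumer-emotion score on attitude.

```python
# Moderated regression sketch; all data below are synthetic placeholders.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(7)
n = 300
df = pd.DataFrame({
    "emotion": rng.normal(0, 1, n),        # consumer-emotion dimension score
    "humanlike": rng.uniform(0, 1, n),     # appearance attribute (moderator)
})
df["attitude"] = (0.5 * df.emotion + 0.3 * df.humanlike
                  + 0.4 * df.emotion * df.humanlike + rng.normal(0, 1, n))

# "emotion * humanlike" expands to both main effects plus the interaction.
fit = smf.ols("attitude ~ emotion * humanlike", data=df).fit()
print(fit.params.round(3))
```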

Ensemble of Nested Dichotomies for Activity Recognition Using Accelerometer Data on Smartphone (Ensemble of Nested Dichotomies 기법을 이용한 스마트폰 가속도 센서 데이터 기반의 동작 인지)

  • Ha, Eu Tteum;Kim, Jeongmin;Ryu, Kwang Ryel
    • Journal of Intelligence and Information Systems / v.19 no.4 / pp.123-132 / 2013
  • As smartphones are equipped with various sensors such as the accelerometer, GPS, gravity sensor, gyroscope, ambient light sensor, and proximity sensor, there has been much research on using these sensors to create valuable applications. Human activity recognition is one such application, motivated by welfare uses such as support for the elderly, measurement of calorie consumption, and analysis of lifestyles and exercise patterns. One challenge in using smartphone sensors for activity recognition is that the number of sensors should be minimized to save battery power. When the number of sensors is restricted, it is difficult to build a highly accurate activity recognizer, or classifier, because subtly different activities are hard to distinguish with only limited information; the difficulty becomes especially severe when the number of activity classes to be distinguished is large. In this paper, we show that a fairly accurate classifier distinguishing ten different activities can be built from a single sensor, the smartphone accelerometer. Our approach to this ten-class problem is the ensemble of nested dichotomies (END) method, which transforms a multi-class problem into multiple two-class problems. END builds a committee of binary classifiers nested in a binary tree. At the root, the set of all classes is split into two subsets by a binary classifier; at each child node, a subset of classes is again split into two smaller subsets by another binary classifier; continuing in this way yields a binary tree in which each leaf contains a single class. Such a tree can be viewed as a nested dichotomy that makes multi-class predictions. Depending on how the classes are split at each node, the resulting tree differs, and because some classes may be correlated, a particular tree may perform better than others; however, the best tree can hardly be identified without deep domain knowledge. END copes with this by building multiple dichotomy trees randomly during learning and combining their predictions during classification, and it is generally known to perform well even when the base learner cannot model complex decision boundaries. As the base classifier at each node of the dichotomy, we used another ensemble classifier, the random forest. A random forest is built by repeatedly growing a decision tree, each time with a different random subset of features, on a bootstrap sample; by combining bagging with random feature subset selection, it enjoys more diverse ensemble members than simple bagging. Overall, our ensemble of nested dichotomies can be seen as a committee of committees of decision trees that handles a multi-class problem with high accuracy. The ten activity classes distinguished in this paper are 'Sitting', 'Standing', 'Walking', 'Running', 'Walking Uphill', 'Walking Downhill', 'Running Uphill', 'Running Downhill', 'Falling', and 'Hobbling'.
The features used for classification include not only the magnitude of the acceleration vector at each time point but also the maximum, minimum, and standard deviation of the vector magnitude within a time window covering the last 2 seconds. For experiments comparing the performance of END with other methods, accelerometer data were collected every 0.1 second for 2 minutes per activity from 5 volunteers. Of the 5,900 (= 5 × (60 × 2 − 2) / 0.1) data points collected per activity (the first 2 seconds of each recording are discarded because they lack time-window data), 4,700 were used for training and the rest for testing. Although 'Walking Uphill' is often confused with similar activities, END classified all ten activities with a fairly high accuracy of 98.4%; by comparison, a decision tree, a k-nearest-neighbor classifier, and a one-versus-rest support vector machine achieved 97.6%, 96.5%, and 97.6%, respectively.
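A compact from-scratch sketch of an ensemble of nested dichotomies with a random forest base classifier, in the spirit of the method described above; the synthetic five-class data and all hyperparameters are placeholders rather than the paper's setup.

```python
# END sketch: random nested dichotomy trees with random-forest binary nodes.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

def fit_dichotomy(X, y, classes, rng):
    if len(classes) == 1:
        return classes[0]                        # leaf: a single class
    split = rng.permutation(classes)
    left = set(split[: len(classes) // 2])       # random split of the classes
    mask = np.isin(y, list(left))
    clf = RandomForestClassifier(n_estimators=50, random_state=0)
    clf.fit(X, mask.astype(int))                 # binary: left vs right subset
    return {"clf": clf,
            "left": fit_dichotomy(X[mask], y[mask], sorted(left), rng),
            "right": fit_dichotomy(X[~mask], y[~mask],
                                   sorted(set(classes) - left), rng)}

def predict_one(node, x):
    while isinstance(node, dict):                # walk the dichotomy tree
        branch = node["clf"].predict(x.reshape(1, -1))[0]
        node = node["left"] if branch == 1 else node["right"]
    return node

X, y = make_classification(n_samples=600, n_classes=5, n_informative=8,
                           random_state=0)
rng = np.random.default_rng(0)
trees = [fit_dichotomy(X[:500], y[:500], sorted(set(y)), rng) for _ in range(5)]
# Majority vote across the five random dichotomy trees.
votes = [[predict_one(t, x) for t in trees] for x in X[500:]]
pred = [max(set(v), key=v.count) for v in votes]
print("accuracy:", np.mean(np.array(pred) == y[500:]).round(3))
```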

Investigation on a Way to Maximize the Productivity in Poultry Industry (양계산업에 있어서 생산성 향상방안에 대한 조사 연구)

  • 오세정
    • Korean Journal of Poultry Science / v.16 no.2 / pp.105-127 / 1989
  • Although the poultry industry in Japan has developed considerably in recent years, it still has room for development compared with other advanced countries. Since the poultry market in Korea is expected to open in the near future, it is necessary to maximize productivity, reduce production costs, and develop the scientific technologies and management organization systems needed to improve the quality of poultry production. The following summarizes the poultry industry in Japan. 1. The poultry industry in Japan is largely specialized and commercialized, and its management system is integrated, cooperative, and developed into an industrialized, intensive style; it is therefore competitive in international poultry markets. 2. Average egg weight is 48-50 g per day (max. 54 g), and the feed requirement is 2.1-2.3. 3. The management organization is specialized: small-scale farmers form complexes, and large-scale farmers are integrated.

A Study on Knowledge Entity Extraction Method for Individual Stocks Based on Neural Tensor Network (뉴럴 텐서 네트워크 기반 주식 개별종목 지식개체명 추출 방법에 관한 연구)

  • Yang, Yunseok;Lee, Hyun Jun;Oh, Kyong Joo
    • Journal of Intelligence and Information Systems / v.25 no.2 / pp.25-38 / 2019
  • As content continues to proliferate, selecting high-quality information that meets users' interests and needs from the overflow is becoming ever more important. In this flood of information, search systems increasingly try to reflect the user's intention in the results rather than treating an information request as a simple string, and large IT companies such as Google and Microsoft focus on knowledge-based technologies, including search engines, that provide users with satisfaction and convenience. Finance in particular is a field where text data analysis is expected to be useful, because new information is constantly generated and the fresher the information, the more valuable it is. Automatic knowledge extraction can be effective in such areas, where the information flow is vast and new information keeps emerging. However, automatic knowledge extraction faces several practical difficulties. First, it is difficult to build corpora from different fields with the same algorithm and to extract good-quality triples. Second, producing labeled text data manually becomes harder as the extent and scope of knowledge grow and patterns are constantly updated. Third, performance evaluation is difficult because of the characteristics of unsupervised learning. Finally, defining the problem of automatic knowledge extraction is not easy because of the ambiguous conceptual characteristics of knowledge. To overcome these limits and improve the semantic performance of stock-related information search, this study extracts knowledge entities using a neural tensor network and evaluates their quality. Unlike previous work, the purpose of this study is to extract knowledge entities related to individual stock items. Various but relatively simple data processing methods are applied in the proposed model to address the problems of previous research and enhance the model's effectiveness. The study thus has three significances: first, a practical and simple automatic knowledge extraction method that can actually be applied; second, the possibility of performance evaluation through a simple problem definition; and finally, increased expressiveness of the knowledge, achieved by generating input data on a sentence basis without complex morphological analysis. The results of the empirical analysis and an objective performance evaluation method are also presented. The empirical study uses experts' reports on the 30 individual stocks with the highest publication frequency between May 30, 2017 and May 21, 2018. Of the 5,600 reports in total, 3,074 (about 55%) form the training set and the remaining 45% the test set. Before constructing the model, all reports in the training set are classified by stock and their entities are extracted using the KKMA named entity recognition tool. For each stock, the top 100 entities by appearance frequency are selected and vectorized using one-hot encoding. Then, using the neural tensor network, one score function per stock is trained.
Thus, when a new entity from the test set appears, its score can be computed by feeding it into every score function, and the stock whose function yields the highest score is predicted as the item related to that entity. To evaluate the model, we measure its predictive power and check whether the score functions are well constructed by computing the hit ratio over all reports in the test set. The model achieves a hit accuracy of 69.3% on the test set of 2,526 reports, which is meaningfully high given the constraints under which the research was conducted. Looking at per-stock prediction performance, only three stocks, LG ELECTRONICS, KiaMtr, and Mando, perform far below average; this may be due to interference from other similar items and the generation of new knowledge. In this paper, we propose a methodology for finding the key entities, or combinations of entities, needed to retrieve information relevant to a user's investment intention. Graph data are generated using only the named entity recognition tool and fed to the neural tensor network without a learning corpus or field-specific word vectors. The empirical test confirms the effectiveness of the model as described above. Some limits remain, however; notably, the model's especially poor performance on a few stocks calls for further research. Finally, the empirical study confirms that the learning method presented here can be used to match new text information semantically with the related stocks.
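The per-stock scoring idea can be sketched with a neural-tensor-style module in PyTorch; the dimensions, the single-entity simplification, and the untrained parameters below are assumptions made for brevity. In the paper, one such score function per stock would be trained on the one-hot entity vectors.

```python
# Toy neural-tensor-style scorer: one score function per stock, highest wins.
import torch

class NTNScore(torch.nn.Module):
    def __init__(self, dim, k=4):
        super().__init__()
        self.W = torch.nn.Parameter(torch.randn(k, dim, dim) * 0.1)  # tensor slices
        self.V = torch.nn.Parameter(torch.randn(k, dim) * 0.1)       # linear part
        self.b = torch.nn.Parameter(torch.zeros(k))
        self.u = torch.nn.Parameter(torch.randn(k) * 0.1)            # output layer

    def forward(self, e):
        # e^T W_k e for each of the k tensor slices, then the standard NTN form.
        bilinear = torch.einsum("i,kij,j->k", e, self.W, e)
        return self.u @ torch.tanh(bilinear + self.V @ e + self.b)

dim, n_stocks = 100, 3                  # e.g. top-100 entities per stock
# In practice each scorer's parameters would be trained; shown untrained here.
scorers = [NTNScore(dim) for _ in range(n_stocks)]
entity = torch.zeros(dim); entity[17] = 1.0       # a one-hot entity vector
scores = [s(entity).item() for s in scorers]
print("predicted stock:", scores.index(max(scores)))
```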