• Title/Abstract/Keyword: 노력 (effort)

Search Results: 15,767

Construction and Application of Intelligent Decision Support System through Defense Ontology - Application example of Air Force Logistics Situation Management System (국방 온톨로지를 통한 지능형 의사결정지원시스템 구축 및 활용 - 공군 군수상황관리체계 적용 사례)

  • Jo, Wongi;Kim, Hak-Jin
    • Journal of Intelligence and Information Systems
    • /
    • v.25 no.2
    • /
    • pp.77-97
    • /
    • 2019
  • The large amount of data emerging from the hyper-connected environment of the Fourth Industrial Revolution is a major factor distinguishing it from existing production environments. This environment is two-sided: it produces data while consuming it, and the data so produced creates further value. Because of this massive scale, future information systems must process far more data than existing systems in terms of quantity; in terms of quality, they must also be able to extract only the necessary information from that large volume. In a small information system, a person can understand the system accurately and obtain the necessary information, but in complex systems that are difficult to understand fully, acquiring the desired information becomes increasingly hard. In other words, more accurate processing of large amounts of data has become a basic requirement for future information systems. This problem of efficient information-system performance can be addressed by building a semantic web, which enables varied information processing by expressing the collected data as an ontology understandable to computers as well as to people. The military, like most other organizations, has introduced IT, and most work is now done through information systems. As these systems come to contain ever larger amounts of data, efforts are needed to make them easier to use through better data utilization. An ontology-based system forms a large semantic network of data through connections with other systems, offers a wide range of usable databases, and has the advantage of more precise and faster search through the relationships between predefined concepts.
In this paper, we propose a defense ontology as a method for effective data management and decision support. To judge its applicability and effectiveness in an actual system, we reconstructed the existing Air Force logistics situation management system as an ontology-based system. The original system was built to strengthen commanders' and practitioners' management and control of the logistics situation by providing real-time information on maintenance and distribution, as the complicated logistics information system with its large amount of data had become difficult to use. However, because it takes pre-specified information from the existing logistics system and displays it as web pages, little can be checked beyond the few items specified in advance, extending it with additional functions is time-consuming, and it is organized by category with no search function. Like the existing system, therefore, it can be used easily only by someone who already knows the system well. The ontology-based logistics situation management system is designed to visualize the complex information of the existing logistics information system intuitively through the ontology. To construct it, the data of the existing logistics system were expressed as an ontology, and useful functions such as performance-based logistics support contract management and a component dictionary were additionally identified and included. To confirm that the constructed ontology can support decision making, we implemented meaningful analysis functions such as calculating aircraft utilization rates and querying performance-based logistics contracts.
In particular, unlike past ontology studies that built static ontology databases, this study expresses time-series data whose values change over time, such as the daily status of each aircraft, as an ontology, and confirms through the constructed ontology that utilization rates can be calculated not only in the standard way but also against various other criteria. In addition, data related to performance-based logistics (PBL) contracts, introduced as a new maintenance method for aircraft and other munitions, can be queried in various ways, and the performance indexes used in PBL contracts are easy to calculate through reasoning and functions. We also propose a new performance index that complements the limitations of the currently applied indicators and calculate it through the ontology, confirming the usability of the constructed ontology. Finally, the failure rate and reliability of each component can be calculated, including MTBF (mean time between failures) figures for selected items based on actual part consumption, from which mission reliability and system reliability are derived. To confirm the usability of the constructed ontology-based logistics situation management system, we evaluated it with the Technology Acceptance Model (TAM), a representative model for measuring technology acceptance, and found the proposed system more useful and convenient than the existing system.
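The utilization-rate calculation described above can be sketched in a few lines. This is a minimal stand-in, not the paper's implementation: the record structure, aircraft IDs, and status labels below are hypothetical, and in the paper these daily states would be individuals in the defense ontology rather than Python tuples.

```python
# Hypothetical daily status records (date, aircraft_id, status); the paper
# models these as time-series individuals in the defense ontology.
records = [
    ("2019-05-01", "AC-101", "operational"),
    ("2019-05-01", "AC-102", "maintenance"),
    ("2019-05-02", "AC-101", "operational"),
    ("2019-05-02", "AC-102", "operational"),
]

def utilization_rate(records, criterion=lambda r: True):
    """Share of aircraft-days in 'operational' status, computed over the
    records matching an arbitrary criterion (a date range, one airframe,
    one unit, ...), mirroring the 'various criteria' idea in the abstract."""
    selected = [r for r in records if criterion(r)]
    if not selected:
        return 0.0
    operational = sum(1 for r in selected if r[2] == "operational")
    return operational / len(selected)

fleet_rate = utilization_rate(records)                           # whole fleet
ac101_rate = utilization_rate(records, lambda r: r[1] == "AC-101")  # one aircraft
```

Because the criterion is just a predicate over the records, the same function covers fleet-wide, per-aircraft, or per-period rates, which is the flexibility the ontology-based approach claims.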

A Study on Knowledge Entity Extraction Method for Individual Stocks Based on Neural Tensor Network (뉴럴 텐서 네트워크 기반 주식 개별종목 지식개체명 추출 방법에 관한 연구)

  • Yang, Yunseok;Lee, Hyun Jun;Oh, Kyong Joo
    • Journal of Intelligence and Information Systems
    • /
    • v.25 no.2
    • /
    • pp.25-38
    • /
    • 2019
  • Selecting high-quality information that meets users' interests and needs from the flood of content is becoming ever more important as content generation continues. Amid this flood, attempts are being made to reflect the user's intention in search results rather than treating an information request as a simple string, and large IT companies such as Google and Microsoft are focusing on knowledge-based technologies, including search engines, that provide users with satisfaction and convenience. Finance in particular is a field where text data analysis is expected to be useful and promising, because new information is generated constantly and the earlier the information, the more valuable it is. Automatic knowledge extraction can be effective in areas like the financial sector where the information flow is vast and new information keeps emerging. However, it faces several practical difficulties. First, it is hard to build corpora from different fields with the same algorithm and to extract good-quality triples. Second, producing human-labeled text data becomes harder as the extent and scope of knowledge grow and patterns keep being updated. Third, performance evaluation is difficult because of the characteristics of unsupervised learning. Finally, defining the problem of automatic knowledge extraction is not easy because of the ambiguous conceptual characteristics of knowledge. To overcome these limits and improve the semantic performance of stock-related information search, this study extracts knowledge entities using a neural tensor network and evaluates their quality. Unlike other studies, its purpose is to extract knowledge entities related to individual stock items.
Various but relatively simple data-processing methods are applied in the presented model to solve the problems of previous research and enhance the model's effectiveness. From these processes, the study has three significances. First, it presents a practical and simple automatic knowledge extraction method that can be applied in practice. Second, it shows that performance evaluation is possible through a simple problem definition. Finally, it increases the expressiveness of the extracted knowledge by generating input data on a sentence basis without complex morphological analysis. An empirical analysis and an objective performance evaluation method are also presented. For the empirical study, experts' reports on 30 individual stocks, the top 30 items by publication frequency from May 30, 2017 to May 21, 2018, are used. Of the 5,600 reports in total, 3,074 (about 55%) are designated as the training set and the remaining 45% as the testing set. Before constructing the model, all reports in the training set are classified by stock and their entities are extracted using the KKMA named entity recognition tool. For each stock, the top 100 entities by appearance frequency are selected and vectorized using one-hot encoding. Then, using a neural tensor network, one score function per stock is trained. When a new entity from the testing set appears, its score is calculated with every score function, and the stock whose function yields the highest score is predicted as the item related to the entity. To evaluate the presented models, we confirm their predictive power, and whether the score functions are well constructed, by calculating the hit ratio over all reports in the testing set.
As a result of the empirical study, the presented model shows 69.3% hit accuracy on the testing set of 2,526 reports. This hit ratio is meaningfully high despite the constraints under which the research was conducted. Looking at prediction performance by stock, only three stocks, LG ELECTRONICS, KiaMtr, and Mando, show markedly lower performance than average; this result may be due to interference from other similar items and the generation of new knowledge. In this paper we propose a methodology for finding the key entities, or combinations of them, needed to search for information relevant to a user's investment intention. Graph data are generated using only the named entity recognition tool and applied to the neural tensor network without learning a field-specific corpus or word vectors. The empirical test confirms the effectiveness of the presented model as described above. Some limits remain, however; most notably, the especially poor performance on only some stocks shows the need for further research. Finally, the empirical study confirmed that the learning method presented here can be used to match new text information semantically with related stocks.
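The argmax-over-score-functions prediction and the hit-ratio evaluation can be illustrated with a toy sketch. The linear scorer below is a drastically simplified stand-in for the neural tensor network, and every vocabulary entry, weight, and stock pairing is hypothetical; only the evaluation protocol mirrors the abstract.

```python
# Toy stand-in for per-stock score functions: each stock s has a weight
# vector w_s over the entity vocabulary, and score(e, s) = <one_hot(e), w_s>.
# The paper trains a neural tensor network per stock instead of this
# linear scorer; the argmax prediction and hit ratio work the same way.
vocab = ["OLED", "battery", "ADAS", "display"]

def one_hot(entity):
    return [1.0 if v == entity else 0.0 for v in vocab]

weights = {  # hypothetical trained weights per stock
    "LG_Display": [0.9, 0.1, 0.0, 0.8],
    "Mando":      [0.0, 0.2, 0.9, 0.1],
}

def predict_stock(entity):
    """Score the entity with every stock's function; highest score wins."""
    vec = one_hot(entity)
    scores = {s: sum(a * b for a, b in zip(vec, w)) for s, w in weights.items()}
    return max(scores, key=scores.get)

def hit_ratio(pairs):
    """pairs: list of (entity, true_stock) drawn from the testing set."""
    hits = sum(1 for entity, stock in pairs if predict_stock(entity) == stock)
    return hits / len(pairs)

test_pairs = [("OLED", "LG_Display"), ("ADAS", "Mando"), ("battery", "Mando")]
ratio = hit_ratio(test_pairs)
```

Replacing `predict_stock`'s dot product with a trained bilinear tensor score while keeping `hit_ratio` unchanged recovers the structure of the paper's evaluation.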

Animal Infectious Diseases Prevention through Big Data and Deep Learning (빅데이터와 딥러닝을 활용한 동물 감염병 확산 차단)

  • Kim, Sung Hyun;Choi, Joon Ki;Kim, Jae Seok;Jang, Ah Reum;Lee, Jae Ho;Cha, Kyung Jin;Lee, Sang Won
    • Journal of Intelligence and Information Systems
    • /
    • v.24 no.4
    • /
    • pp.137-154
    • /
    • 2018
  • Animal infectious diseases, such as avian influenza and foot-and-mouth disease, occur almost every year and cause huge economic and social damage to the country. To prevent this, the quarantine authorities have made various human and material efforts, but the diseases have continued to occur. Avian influenza was first identified in 1878 and rose to a national issue because of its high lethality. Foot-and-mouth disease is considered the most critical animal infectious disease internationally; in nations free of it, it is recognized as an economic or even political disease, because it restricts international trade by complicating the import of processed and unprocessed livestock, and because quarantine is costly. In a society where the whole nation is connected as a single living zone, there is no way to prevent the spread of infectious disease completely. Hence, there is a need to detect an outbreak and act before the disease spreads. Upon confirmation of either a human or an animal infectious disease, an epidemiological investigation of the confirmed cases is carried out, and measures are taken to prevent further spread according to its results. The foundation of an epidemiological investigation is figuring out where a subject has been and whom he or she has met. From a data perspective, this can be defined as predicting the cause of an outbreak, its location, and future infections by collecting and analyzing geographic and relational data. Recently, attempts have been made to develop infectious disease prediction models using Big Data and deep learning technology, but there is little active research on model building or case reports.
KT and the Ministry of Science and ICT have been carrying out big data projects since 2014, as part of national R&D projects, to analyze and predict the routes of livestock-related vehicles. To prevent animal infectious diseases, the researchers first developed a prediction model based on regression analysis using vehicle movement data. A more accurate prediction model was then constructed using machine learning algorithms such as logistic regression, Lasso, support vector machines, and random forests. In particular, the 2017 model added the risk of diffusion to facilities, and its performance was improved by tuning the modeling hyper-parameters in various ways. The confusion matrix and ROC curve show that the 2017 model is superior to the earlier machine learning models. The differences between the 2016 and 2017 models are that the later model also used visit information on facilities such as feed factories and slaughterhouses, and that the bird livestock covered, previously limited to chickens and ducks, was expanded to geese and quail. In addition, in 2017 an explanation of the results was added to help the authorities make decisions and to establish a basis for persuading stakeholders. This study reports an animal infectious disease prevention system constructed on the basis of Big Data on hazardous vehicle movements, farms, and the environment. Its significance is that it describes the evolution of a prediction model using Big Data in the field; the model is expected to become more complete if the form of the viruses is also taken into consideration. This will contribute to data utilization and analysis-model development in the related field. We also expect the system constructed in this study to provide more preventive and effective protection.
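The abstract compares models via a confusion matrix and ROC curve. A minimal sketch of the confusion-matrix side is below; the label vectors are hypothetical farm-level predictions (1 = outbreak risk, 0 = no risk), not the project's data.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, fn, tn) for binary labels, the four cells of the
    confusion matrix used to compare the 2016 and 2017 models."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return tp, fp, fn, tn

# Hypothetical labels: 1 = farm at outbreak risk, 0 = no risk.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

tp, fp, fn, tn = confusion_counts(y_true, y_pred)
recall = tp / (tp + fn)       # share of real outbreaks the model caught
precision = tp / (tp + fp)    # share of alarms that were real outbreaks
```

For disease surveillance, recall is usually the metric to maximize, since a missed outbreak (a false negative) is far costlier than an unnecessary inspection; sweeping the decision threshold and plotting true- versus false-positive rates yields the ROC curve the abstract mentions.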

An Essay on the Change of Jinju Sword Dance after being designated as an Important Intangible Cultural Asset (<진주검무> 중요무형문화재 지정 이후의 변화에 관한 소고)

  • Lee, Jong Sook
    • Korean Journal of Heritage: History & Science
    • /
    • v.49 no.1
    • /
    • pp.4-21
    • /
    • 2016
  • The purpose of this study is to investigate the changes in Jinju Sword Dance, the characteristics of those changes, and the current condition of its preservation and transmission after its designation as Important Intangible Cultural Property no. 12 on January 16th, 1967. In other words, this study traces how the present state was established through changes across generations of holders. As of 2015, three generations of holders have been approved since 1967. In 1967, eight 1st-generation holders were selected from among the gisaengs of the Gwonbeon. However, succession training was incomplete due to conflicts among the holders, the deaths of some, and the holders' individual economic activities. As the need for a pivot for succession training and activities rose, Seong, Gye-Ok was additionally approved as a 2nd-generation holder on June 21st, 1978. Seong, Gye-Ok, who had never been a gisaeng, changed the dance dramatically with many new attempts. After her death in 2009, Kim, Tae-Yeon and Yu, Yeong-Hee were approved as 3rd-generation holders in February 2010. Based on resources including the "Cultural Research Reports of Important Intangible Cultural Properties" of 1966 and videos up to 2014, the changes in the dance and its surroundings are as follows. 1. The formation of the musical accompaniment changed over the three generations. In the 1st-generation video (1970) the performance lasted about 15 minutes, whereas in the 2nd-generation video it lasted 25 minutes: the Yumbuldoduri rhythm was treated as Ginyumbul (Sangryeongsan) and played more slowly, and the original dance of only 15 rhythmic cycles was extended to 39 to lengthen the performance. In the 3rd generation, the dance recovered the 15 cycles while keeping the term Ginyumbul.
That Yumbul was played for 3 minutes in the 1st generation but for 5 minutes in the 3rd shows a tendency toward slowness beginning in the 2nd generation. 2. In composition, the performance included an additional 20 cycles of Ginyumbul and an Ah(亞)-shaped formation from the 2nd generation; from the 3rd generation, the formation, which had no traditional basis, was excluded. In movement, the bridge poses of Ggakjittegi and Bangsukdoli have become visibly inflexible, and the extension of the time value of one beat made the dance less vibrant. 3. At the time of designation (1967), swords with rotatable necks were used, whereas dancers have used swords with non-rotatable necks since the late 1970s, when the 2nd-generation holder began to use them. The swords in the "Research Reports" (1966) were pointed and semilunar, whereas straight swords are used currently; their use can be confirmed in the videos from 1970 onward. 4. There is no change in the wearing of Jeonlib, Jeonbok, and Hansam, but the arrangement of the Saekdong of the Hansam differs from that shown in the "Research Reports". Dancers are also thought to have begun wearing navy skirts when the swords with non-rotatable necks came into use. These results show that Jinju Sword Dance changed actively over the 50 years after its designation, with the 2nd-generation holder, Seong, Gye-Ok, at the pivot of the changes. However, Jinju Sword Dance, already designated an important intangible cultural property, appears to have been merely a victim of experimental change in the project to restore Gyobang culture in Jinju, and studies with historical legitimacy should be conducted as a priority. Above all, the slowing beat should be emphasized as the main factor reducing both the liveliness and the dynamic beauty of the dance.

A Study on the Cognition and Investigation of Silla Tumuli under Japanese Imperial Rule (일제강점기의 신라고분조사연구에 대한 검토)

  • Cha, Soon Chul
    • Korean Journal of Heritage: History & Science
    • /
    • v.39
    • /
    • pp.95-130
    • /
    • 2006
  • Japanese government and college researchers, including Sekino Tadashi(關野貞), conducted research and collected data on Korean cultural relics overall, as well as on Silla tumuli(新羅古墳), in the early modern period under Japanese imperial rule. They were supported by the Meiji government in the early stage of research and, after Korea was colonized, by the Chosun government-general and its related organizations, in carrying out investigations of Korean antiquities, fine arts, architecture, anthropology, folklore, and so on. Their purpose in inquiring into Korean cultural relics, including Silla tumuli, may be attributed to the intent to find data needed as a theoretical foundation to justify the colonization of Korea, and for that reason their views were often locally biased or distorted. Investigations and surveys were carried out incessantly from 1886 by Japanese scholars who took a keen interest in Korean tumuli and excavated relics. The 'Korea Architecture Survey Reports' produced by Sekino in 1904 give a brief introduction to the contents of Korean tumuli, including the Five Royal Mausoleums(五陵). In 1906, Imanishi Ryu(今西龍) launched the first excavation surveys, of Buksan Tumulus(北山古墳) in Sogeumgangsan(小金剛山) and of 'Namchong(南塚)' in Hwangnam-dong, which contributed greatly to a basic understanding of wooden chamber tombs with stone mound(積石木槨墳) and stone chambers with tunnel entrance(橫穴式石室墳). The ground plan and cross section of the stone chambers drawn in 1909 by Yazui Seiyichi(谷井第一), who had majored in architecture, during his excavation survey of Seokchimchong(石枕塚) were the first measured drawings produced in an excavation survey in Korea, sharply distinguished by their numerical expression from the earlier sketches; this kind of drawing continued in the following excavation surveys.
Imanishi and Yazui elucidated that wooden chamber tombs with stone mound differ chronologically from stone chambers with tunnel entrance, on the basis of surveys of the locational characteristics of Silla tumuli, the forms and sizes of tomb entrances, excavated relics, and so forth. The government-general put into force 'the Historic Spots and Relics Preservation Rules' and 'the Historic Spots Survey Council Regulations' in 1916, establishing the Historic Spots Survey Council and the Museum Conference. When the museums began their activities, they exhibited relics excavated from tumuli and conducted surveys with the permission of the Chosun government-general. The Gold Crown Tomb(金冠塚) was excavated and surveyed in 1921 and the Seobong Tomb(瑞鳳塚) in 1927, and with this, large wooden chamber tombs with stone mound attracted strong public attention. A variety of surveys of spots throughout the country were also carried out, but reports on the tumuli went unpublished; recently, some researchers' endeavors have led to the publication of previously unpublished reports. The reason reports on such significant tumuli as the Seobong Tomb were never published may be ascribed to the limitations of those days. The Gyeongju Tumuli Distribution Chart made by Nomori Ken(野守健) on the basis of the land register in the late 1920s is of much significance in that it specifies the size and locations of 155 tumuli and shows the overall shape of the tumuli groups within the city, as used in today's distribution charts. In the 1930s, Arimitsu Kyoichi(有光敎一) and Saito Tadashi(齋藤忠) identified, through excavation surveys of many wooden chamber tombs with stone mound and stone chambers with tunnel entrance, that a tomb system contained several forms of tombs.
In particular, their excavation surveys of wooden chamber tombs with stone mound exposed in complicated, overlapping forms show features more developed than those of the preceding excavation surveys and report publications. This paper reviews the contents of the many historic spots surveyed at that time; such a reexamination is considered a significant project in organizing the history of archaeology in Korea.

A Study on the Present Condition and Improvement of Cultural Heritage Management in Seoul - Based on the Results of Regular Surveys (2016~2018) - (서울특별시 지정문화재 관리 현황 진단 및 개선방안 연구 - 정기조사(2016~2018) 결과를 중심으로 -)

  • Cho, Hong-seok;Suh, Hyun-jung;Kim, Ye-rin;Kim, Dong-cheon
    • Korean Journal of Heritage: History & Science
    • /
    • v.52 no.2
    • /
    • pp.80-105
    • /
    • 2019
  • With the increasing complexity and irregularity of disaster types, and with many cultural properties destroyed or damaged by various natural and human factors, the need for proactive preservation and management of cultural assets has grown. In consideration of these circumstances, the Cultural Heritage Administration enacted an Act in December 2005 mandating regular surveys for the systematic preservation and management of cultural assets; through a recent revision of this Act, the survey cycle has been reduced from five years to three, and the scope of regular inspections has been expanded to cover registered cultural properties. Under the ordinance, periodic surveys of city- or province-designated heritage are carried out mainly by metropolitan and provincial governments. The Seoul Metropolitan Government prepared a legal basis for regular surveys under the Seoul Special City Cultural Properties Protection Ordinance of 2008 and, recognizing the importance of preventive management given the large number of cultural assets in the city center and the high demand for visits, conducted regular surveys of all city-designated cultural assets from 2016 to 2018. With the first round completed, it was considered necessary to review the policy effectiveness of the system and to review the survey results comprehensively in order to enhance the management of cultural assets. The present study therefore examined the comprehensive management status of the cultural assets designated by the Seoul Metropolitan Government over the three years (2016-2018), assessing achievements and identifying limitations; it also sought ways to improve management and proposed a database construction plan for the integrated management system pursued by the Seoul Metropolitan Government.
Specifically, the survey forms followed the Guidelines for the Operation of Periodic Surveys of National Designated Cultural Assets, but the form types were reclassified and further subdivided to reflect the characteristics of the designated cultural assets, and manuals were developed so that the scope and manner of the survey would be recorded consistently and specifically. The analysis confirmed that 401 of 521 cases (77.0%) were generally well preserved, while 102 cases (19.6%) required special measures such as monitoring, precision diagnosis, or repair. Meanwhile, 18 cases (3.4%) could not be surveyed, being inaccessible for reasons such as unknown location or closure to the public. By type, among the 171 real-estate cultural properties, 63 cases (36.8%) showed structural damage caused by the failure or loss of members, and 73 cases (42.7%) showed surface damage resulting from biological agents; plants and geological and scenic sites were almost all well preserved. Among the 350 movable cultural assets, 25 cases (7.1%) were found to have changed location, and structural and surface damage varied with the specific material properties, ceramics excepted. In particular, paper, textile, and leather goods, whose materials are vulnerable to damage, showed greater damage than those of other materials because they are owned and managed by individuals and temples; it has thus been confirmed that more proactive management of them is needed. Accordingly, an action plan for comprehensive preservation and management checks should be developed according to management status and urgency, and project promotion plans and priority management targets should be selected and managed first.
In particular, for movable cultural assets there have been cases in which new locations went unreported after changes of ownership (management); a new system is therefore required to strengthen the obligation to report changes of ownership (management) or location. Based on this status diagnosis and these improvement measures, the foundation of a proactive and efficient cultural asset management system is expected to be laid through the effective mid- to long-term database construction of the integrated management system pursued by the Seoul Metropolitan Government.

An Analytical Approach Using Topic Mining for Improving the Service Quality of Hotels (호텔 산업의 서비스 품질 향상을 위한 토픽 마이닝 기반 분석 방법)

  • Moon, Hyun Sil;Sung, David;Kim, Jae Kyeong
    • Journal of Intelligence and Information Systems
    • /
    • v.25 no.1
    • /
    • pp.21-41
    • /
    • 2019
  • Thanks to the rapid development of information technologies, the data available on the Internet has grown rapidly. In this era of big data, many studies have attempted to draw insights from, and demonstrate the effects of, data analysis. In the tourism and hospitality industry, many firms and studies have paid attention to online reviews on social media because of their large influence over customers; as tourism is an information-intensive industry, the effect of these information networks on social media platforms is more remarkable than in any other type of media. However, there are limits to the service-quality improvements that can be made from opinions on social media platforms. Users express their opinions as text, images, and so on, so the raw data sets from these reviews are unstructured; moreover, they are too big for humans to extract new information and hidden knowledge from. To use them for business intelligence and analytics applications, proper big data techniques such as natural language processing and data mining are needed. This study suggests an analytical approach that yields insights directly from these reviews to improve the service quality of hotels. Our proposed approach consists of topic mining, to extract the topics contained in the reviews, and decision tree modeling, to explain the relationship between topics and ratings. Topic mining is a method for finding groups of words in a collection of documents that represent each document. Among several topic mining methods, we adopted the Latent Dirichlet Allocation (LDA) algorithm, the most widely used. However, LDA alone cannot find insights that improve service quality, because it cannot relate topics to ratings; to overcome this limitation, we also use the Classification and Regression Tree (CART) method, a kind of decision tree technique.
Through the CART method, we can find which topics are related to positive or negative ratings of a hotel and visualize the results. This study thus aims to present an analytical approach for improving hotel service quality from unstructured review data. Through experiments on four hotels in Hong Kong, we identify the strengths and weaknesses of each hotel's services and suggest improvements to aid customer satisfaction. From positive reviews in particular, we find what each hotel should maintain; for example, compared with the other hotels, one hotel's positive reviews highlight its good location and room condition. From negative reviews, conversely, we find what should be modified; for example, one hotel should improve the soundproofing of its rooms. These results indicate that our approach is useful for finding insights into hotel service quality: from an enormous volume of review data, it can provide practical suggestions for hotel managers. In the past, studies on improving service quality relied on customer surveys or interviews, which are often costly and time-consuming, and whose results may be distorted by biased sampling or untrustworthy answers. The proposed approach obtains honest feedback directly from customers' online reviews and draws insights through big data analysis, so it is a useful tool for overcoming the limitations of surveys and interviews. Moreover, our approach easily obtains service-quality information for other hotels or services in the tourism industry, because it needs only open online reviews and ratings as input data; its performance should improve further if other structured and unstructured data sources are added.
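The topic-to-rating step can be illustrated with a toy. The sketch below skips LDA entirely and starts from hypothetical per-review topic proportions; it then fits a one-level regression split (a "decision stump", the simplest case of CART) that chooses the topic and threshold best separating low from high ratings. Topic labels and numbers are invented for illustration.

```python
# Each review: ([proportion of topic 0 "room condition",
#                proportion of topic 1 "noise complaints"], star rating).
# In the paper these proportions would come from LDA over real reviews.
reviews = [
    ([0.1, 0.9], 2), ([0.2, 0.8], 1), ([0.8, 0.2], 5),
    ([0.7, 0.3], 4), ([0.9, 0.1], 5), ([0.3, 0.7], 2),
]

def sse(ys):
    """Sum of squared errors around the mean: CART's regression impurity."""
    if not ys:
        return 0.0
    mean = sum(ys) / len(ys)
    return sum((y - mean) ** 2 for y in ys)

def best_stump(data):
    """Try every (topic, threshold) split; keep the one minimizing the
    combined rating impurity of the two branches."""
    best = None  # (cost, topic_index, threshold)
    n_topics = len(data[0][0])
    for t in range(n_topics):
        for features, _ in data:
            thr = features[t]
            left = [y for f, y in data if f[t] <= thr]
            right = [y for f, y in data if f[t] > thr]
            cost = sse(left) + sse(right)
            if best is None or cost < best[0]:
                best = (cost, t, thr)
    return best[1], best[2]

topic, threshold = best_stump(reviews)
# Reviews with topic-0 proportion above the threshold average high ratings:
# "room condition" is associated with positive reviews in this toy data.
```

A full CART tree applies this split recursively to each branch; even the single split already yields the kind of topic-rating statement the abstract draws from its trees.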

A Study on the Development Trend of Artificial Intelligence Using Text Mining Technique: Focused on Open Source Software Projects on Github (텍스트 마이닝 기법을 활용한 인공지능 기술개발 동향 분석 연구: 깃허브 상의 오픈 소스 소프트웨어 프로젝트를 대상으로)

  • Chong, JiSeon;Kim, Dongsung;Lee, Hong Joo;Kim, Jong Woo
    • Journal of Intelligence and Information Systems
    • /
    • v.25 no.1
    • /
    • pp.1-19
    • /
    • 2019
  • Artificial intelligence (AI) is one of the main driving forces leading the Fourth Industrial Revolution. The technologies associated with AI have already shown abilities equal to or better than those of people in many fields, including image and speech recognition. In particular, much effort has been devoted to identifying current technology trends and analyzing their development directions, because AI technologies can be utilized in a wide range of fields, including medicine, finance, manufacturing, services, and education. Major platforms that can develop complex AI algorithms for learning, reasoning, and recognition have been opened to the public as open source projects, and as a result, technologies and services that utilize them have increased rapidly. This has been confirmed as one of the major reasons for the fast development of AI technologies. Additionally, the spread of the technology owes a great deal to open source software, developed by major global companies, that supports natural language recognition, speech recognition, and image recognition. Therefore, this study aimed to identify the practical trend of AI technology development by analyzing open source software (OSS) projects associated with AI, which have been developed through the online collaboration of many parties. This study searched for and collected a list of major AI-related projects created on GitHub from 2009 to July 2018. It examined the development trends of major technologies in detail by applying text mining techniques to topic information, which indicates the characteristics and technical fields of the collected projects. The results showed that the number of software development projects per year was fewer than 100 until 2013. However, it increased to 229 projects in 2014 and 597 projects in 2015. In particular, the number of open source projects related to AI increased rapidly in 2016 (2,559 OSS projects). 
The number of projects initiated in 2017 was 14,213, almost four times the total number of projects created from 2009 to 2016 (3,555 projects); the number of projects initiated from January to July 2018 was 8,737. The development trend of AI-related technologies was evaluated by dividing the study period into three phases. The appearance frequency of topics indicates the technology trends of AI-related OSS projects. The results showed that natural language processing remained at the top across all years, implying that related OSS has been developed continuously. Until 2015, the programming languages Python, C++, and Java were among the ten most frequent topics. After 2016, however, programming languages other than Python disappeared from the top ten; in their place, platforms supporting the development of AI algorithms, such as TensorFlow and Keras, show high appearance frequency. Additionally, reinforcement learning algorithms and convolutional neural networks, which have been used in various fields, appeared frequently. The results of the topic network analysis showed that the most important topics by degree centrality were similar to those by appearance frequency. The main difference was that visualization and medical imaging topics were found at the top of the list, although they had not been from 2009 to 2012; this indicates that OSS was developed in the medical field in order to utilize AI technology. Moreover, although computer vision was in the top ten by appearance frequency from 2013 to 2015, it was not in the top ten by degree centrality. The topics at the top of the degree centrality list were similar to those at the top of the appearance frequency list, though the ranks of convolutional neural networks and reinforcement learning changed slightly. 
The trend of technology development was examined using the appearance frequency of topics and their degree centrality. The results showed that machine learning had the highest frequency and the highest degree centrality in all years. Moreover, it is noteworthy that, although the deep learning topic showed low frequency and low degree centrality between 2009 and 2012, its rank rose abruptly between 2013 and 2015; in recent years both machine learning and deep learning have had high appearance frequency and degree centrality. TensorFlow first appeared during the 2013-2015 phase, and its appearance frequency and degree centrality soared between 2016 and 2018, placing it at the top of the lists after deep learning and Python. Computer vision and reinforcement learning did not show abrupt increases or decreases, and they had relatively low appearance frequency and degree centrality compared with the above-mentioned topics. Based on these analysis results, it is possible to identify the fields in which AI technologies are being actively developed. The results of this study can be used as a baseline dataset for more empirical analysis of future technology trends and their convergence.
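The two measures used in this abstract, topic appearance frequency and degree centrality over a topic co-occurrence network, can be sketched as follows. The project topic lists below are hypothetical stand-ins for crawled GitHub data, and degree centrality is computed directly on an adjacency structure rather than with a graph library.

```python
# Sketch of topic appearance frequency and degree centrality over a topic
# co-occurrence network, assuming hypothetical GitHub project topic lists.
from collections import Counter, defaultdict
from itertools import combinations

projects = [
    ["machine-learning", "deep-learning", "tensorflow"],
    ["machine-learning", "nlp", "python"],
    ["deep-learning", "computer-vision", "tensorflow"],
    ["nlp", "machine-learning", "deep-learning"],
]

# Appearance frequency: how often each topic occurs across projects.
freq = Counter(topic for topics in projects for topic in topics)

# Co-occurrence network: two topics are linked if they share a project.
neighbors = defaultdict(set)
for topics in projects:
    for a, b in combinations(topics, 2):
        neighbors[a].add(b)
        neighbors[b].add(a)

# Degree centrality: degree divided by the maximum possible degree (n - 1).
n = len(neighbors)
centrality = {t: len(ns) / (n - 1) for t, ns in neighbors.items()}

print(freq.most_common(2))
print(sorted(centrality, key=centrality.get, reverse=True)[:2])
```

Comparing the two rankings, as the study does, highlights topics that are frequent but peripheral (or rare but well connected) in the co-occurrence network.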

An Essay in a Research on Gwonwu Hong Chan-yu's Poetic Literature - Focussing on Classical Chinese Poems in Gwonwujip (권우(卷宇) 홍찬유(洪贊裕) 시문학(詩文學) 연구(硏究) 시론(試論) - 『권우집(卷宇集)』 소재(所載) 한시(漢詩)를 중심(中心)으로 -)

  • Yoon, Jaehwan
    • (The)Study of the Eastern Classic
    • /
    • no.50
    • /
    • pp.55-88
    • /
    • 2013
  • Gwonwu Hong Chan-yu is one of the modern and contemporary Korean scholars of Sino-Korean literature and one of the literati of his era, and is thus respected as a guiding light by his academic descendants. Gwonwu was a teacher of his era who experienced all the turbulence of Korean society, such as the forced Japanese occupation, the Korean War, the military dictatorship, and the struggle for democracy, and who educated and led the young scholars of his time. However, academia has not paid attention to his life and achievements since his death. This paper examines the poetry of Gwonwu Hong Chan-yu, one of the representative modern and contemporary scholars of Sino-Korean literature, which has not yet been discussed by academia. The minimal significance of this paper is that it is the first work based on his anthology, which has not been discussed by academia, and the first full-scale study of Gwonwu Hong Chan-yu. For this reason, this paper aims at a detailed inspection of the poetic pieces recorded in his anthology. Nonetheless, despite such intentions, some limits cannot be avoided here and there in this paper, owing to the insufficient knowledge and academic capability of its writer and to the lack of academic sources. Gwonwu's poetry, examined through his anthology, is characterized by a focus on exposing his own internal emotions. This characteristic suggests that his idea of poetic literature paid more attention to individuality, that is, the exposition of private emotions, than to the social utility of poems. This idea of poetic literature can be affirmed throughout Gwonwu's poetry. Accordingly, Gwonwu preferred classical Chinese poems to archaistic poems, and single poems to serial poems, and avoided writing poems within social relations, such as farewell poems, bestowal poems, and mourning poems. When the characteristics of Gwonwu's poetic literature are summarized in this way, however, some questions remain. 
The first question is whether the poems in his anthology constitute the whole of his poetry. Although all of Gwonwu's poetic pieces that the writer of this paper has been able to verify so far are in his anthology, it is very questionable whether his poetry can be summed up by these poems alone. The next question is what writing method he used to bring joy, sentiment, and full-heartedness into his poems, given that his poems focus on exposing his internal emotions, and that poems exposing joy and poems exposing sentiment and full-heartedness appear coherently across various spaces and circumstances of writing. The final question is what the meanings of Gwonwu's poems are, if the poetry found in his anthology directly shows either the reality carried in his poems or the reality of a period of his life. The questions listed above may be resolved by a process of stereoscopic inquiry into both Gwonwu as an individual and the era in which he lived. In particular, spurring deeper research in a new direction regarding Gwonwu's poetry is important for constructing a complete modern and contemporary history of Sino-Korean literature and for securing continued research on Sino-Korean literature and its history. For this reason, more effort from researchers is required.

A Study on the Effect of Network Centralities on Recommendation Performance (네트워크 중심성 척도가 추천 성능에 미치는 영향에 대한 연구)

  • Lee, Dongwon
    • Journal of Intelligence and Information Systems
    • /
    • v.27 no.1
    • /
    • pp.23-46
    • /
    • 2021
  • Collaborative filtering, which is often used in personalized recommendation, is recognized as a very useful technique for finding similar customers and recommending products to them based on their purchase history. However, the traditional collaborative filtering technique has difficulty calculating similarity for new customers or products, because it calculates similarities based on direct connections and common features among customers. For this reason, hybrid techniques have been designed that use content-based filtering together with collaborative filtering. At the same time, efforts have been made to solve these problems by applying the structural characteristics of social networks. This involves calculating similarities indirectly, through the similar customers placed between two customers: a customer network is created from purchase data, and the similarity between two customers is calculated from the features of the network paths that indirectly connect them. Such similarity can be used as a measure to predict whether a target customer will accept a recommendation. The centrality metrics of networks can be utilized to calculate these similarities. Different centrality metrics are important in that they may have different effects on recommendation performance; furthermore, in this study, the effect of these centrality metrics on recommendation performance may vary depending on the recommender algorithm. In addition, recommendation techniques using network analysis can be expected to increase recommendation performance not only for new customers or products but also across all customers and products. By regarding a customer's purchase of an item as a link created between the customer and the item on the network, the prediction of user acceptance of a recommendation is reformulated as the prediction of whether a new link will be created between them. 
As classification models fit the purpose of solving the binary problem of whether a link is created or not, decision tree, k-nearest neighbors (KNN), logistic regression, artificial neural network, and support vector machine (SVM) models were selected for the research. The data for performance evaluation consisted of order data collected from an online shopping mall over four years and two months. The first three years and eight months of data were used to construct the social network, and the records from the remaining four months were used to train and evaluate the recommender models. Experiments applying the centrality metrics to each model show that the recommendation acceptance rates of the centrality metrics differ across algorithms at a meaningful level. In this work, we analyzed only four commonly used centrality metrics: degree centrality, betweenness centrality, closeness centrality, and eigenvector centrality. Eigenvector centrality records the lowest performance in all models except the support vector machine. Closeness centrality and betweenness centrality show similar performance across all models. Degree centrality ranks moderately across the models, while betweenness centrality always ranks higher than degree centrality. Finally, closeness centrality is characterized by distinct differences in performance depending on the model: it ranks first in the logistic regression, artificial neural network, and decision tree models with numerically high performance, but records very low rankings, with low performance, in the support vector machine and k-nearest neighbors models. As the experimental results reveal, in a classification model, network centrality metrics over a subnetwork connecting two nodes can effectively predict the connectivity between those nodes in a social network. Furthermore, each metric performs differently depending on the type of classification model. 
This result implies that choosing appropriate metrics for each algorithm can lead to higher recommendation performance. In general, betweenness centrality can guarantee a high level of performance in any model, and introducing closeness centrality could be considered to obtain even higher performance in certain models.
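The link-prediction framing described in this abstract can be sketched as follows. The purchase records are hypothetical, and only degree centrality with a logistic regression classifier is shown, rather than the paper's full set of four metrics and five models.

```python
# Sketch: network centrality as features for predicting whether a
# customer-item link (i.e., an accepted recommendation) will form.
# The purchase data and the single metric/model are illustrative choices.
from collections import defaultdict
from sklearn.linear_model import LogisticRegression

purchases = {("c1", "i1"), ("c1", "i2"), ("c1", "i3"),
             ("c2", "i1"), ("c3", "i2")}

# Degree centrality on the bipartite customer-item purchase network.
degree = defaultdict(int)
for customer, item in purchases:
    degree[customer] += 1
    degree[item] += 1
n = len(degree)
centrality = {node: d / (n - 1) for node, d in degree.items()}

# Each candidate pair becomes a feature vector of the two centralities;
# the label is whether the link already exists in the purchase data.
pairs = [(c, i) for c in ("c1", "c2", "c3") for i in ("i1", "i2", "i3")]
X = [[centrality[c], centrality[i]] for c, i in pairs]
y = [1 if pair in purchases else 0 for pair in pairs]

clf = LogisticRegression().fit(X, y)
probs = clf.predict_proba(X)[:, 1]  # predicted link-acceptance probability
```

Swapping the centrality computation or the classifier reproduces the study's comparison grid of metrics against models.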