• Title/Summary/Keyword: Language network analysis


A Korean Community-based Question Answering System Using Multiple Machine Learning Methods (다중 기계학습 방법을 이용한 한국어 커뮤니티 기반 질의-응답 시스템)

  • Kwon, Sunjae; Kim, Juae; Kang, Sangwoo; Seo, Jungyun
    • Journal of KIISE / v.43 no.10 / pp.1085-1093 / 2016
  • A community-based question answering system provides answers to questions using documents uploaded to web communities. To enhance question analysis, previous approaches either developed rules tailored to a specific target domain or applied machine learning to only part of the process. However, these methods incur excessive cost when expanding to new fields, or produce systems that are overfitted to a specific field. This paper proposes a multiple machine learning method that automates the overall process by applying an appropriate machine learning technique at each step of a community-based question answering system. The system is divided into a question analysis part and an answer selection part. The question analysis part consists of a question focus extractor, which identifies the focused phrases in questions using conditional random fields, and a question type classifier, which classifies question topics using a support vector machine. In the answer selection part, the weights used by the similarity estimation models are trained through an artificial neural network. In addition, there are many cases in which the results of morphological analysis are unreliable for data uploaded to web communities. We therefore suggest a method that minimizes the impact of morphological analysis by using character features in the question analysis stage. The proposed system outperforms the previous system, achieving a Mean Average Precision of 0.765 and an R-Precision of 0.872.
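
The abstract above pairs a CRF focus extractor with an SVM question-type classifier and relies on character features so that noisy community text does not have to pass through a morphological analyzer. The paper gives no implementation details; the sketch below is only a minimal illustration of that last idea using scikit-learn (an assumption, not the authors' code), classifying question topics from character n-grams. The questions and labels are toy examples.

```python
# Minimal sketch: SVM question-type classification on character n-grams,
# avoiding a morphological-analysis step. Data and labels are hypothetical.
from sklearn.pipeline import make_pipeline
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.svm import LinearSVC

questions = [
    "노트북 배터리 교체 어디서 하나요",      # where to replace a laptop battery
    "서울에서 부산까지 KTX 요금 얼마예요",    # KTX fare from Seoul to Busan
    "파이썬 설치 방법 알려주세요",            # how to install Python
    "제주도 숙소 추천해 주세요",              # recommend lodging on Jeju
]
labels = ["IT", "travel", "IT", "travel"]

clf = make_pipeline(
    # character 2- to 4-grams instead of morpheme features
    TfidfVectorizer(analyzer="char", ngram_range=(2, 4)),
    LinearSVC(),
)
clf.fit(questions, labels)
print(clf.predict(["맥북 배터리 수명 늘리는 방법"]))  # expected: ['IT']
```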

A Suggestion for Spatiotemporal Analysis Model of Complaints on Officially Assessed Land Price by Big Data Mining (빅데이터 마이닝에 의한 공시지가 민원의 시공간적 분석모델 제시)

  • Cho, Tae In; Choi, Byoung Gil; Na, Young Woo; Moon, Young Seob; Kim, Se Hun
    • Journal of Cadastre & Land InformatiX / v.48 no.2 / pp.79-98 / 2018
  • The purpose of this study is to suggest a model for analysing the spatio-temporal characteristics of civil complaints about the officially assessed land price, based on big data mining. Specifically, the underlying reasons for the complaints were sought from a spatio-temporal perspective rather than from institutional factors, and a model was suggested for monitoring trends in the occurrence of such complaints. The official documents of 6,481 civil complaints about the officially assessed land price in the district of Jung-gu, Incheon Metropolitan City, over the period from 2006 to 2015, together with their temporal and spatial properties, were collected and used for the analysis. Frequencies of major keywords were examined using a text mining method, and correlations among major keywords were studied through social network analysis. By calculating term frequency (TF) and term frequency-inverse document frequency (TF-IDF), which serve as keyword weights, we identified the major keywords associated with the occurrence of complaints about the officially assessed land price. The spatio-temporal characteristics of the complaints were then examined by hot spot analysis based on the Getis-Ord Gi* statistic. It was found that the characteristics of the complaints change over time, forming spatio-temporally linked clusters. Using text mining and social network analysis, we could show that the reasons for the occurrence of complaints about the officially assessed land price can be identified quantitatively from natural language. TF and TF-IDF, the keyword weights, can be used as main explanatory variables for analysing the spatio-temporal characteristics of these complaints, since the statistics differ over time and across regions.
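
TF and TF-IDF are used above as keyword weights. As a reminder of how those weights are computed (the paper's actual preprocessing of the complaint documents is not described, so the tokens below are hypothetical), a short sketch:

```python
# TF and TF-IDF for keywords in hypothetical tokenized complaint documents.
import math
from collections import Counter

docs = [
    ["land", "price", "appraisal", "objection", "price"],
    ["road", "access", "land", "price"],
    ["appraisal", "standard", "lot", "shape"],
]

def tf(term, doc):
    """Raw term frequency of `term` within one document."""
    return Counter(doc)[term]

def tfidf(term, doc, docs):
    """TF-IDF = TF * log(N / DF): keywords frequent in this document but
    rare across documents get the largest weights."""
    df = sum(1 for d in docs if term in d)
    return tf(term, doc) * math.log(len(docs) / df)

print(tf("price", docs[0]))                         # 2
print(round(tfidf("price", docs[0], docs), 3))      # 2 * ln(3/2) ≈ 0.811
print(round(tfidf("objection", docs[0], docs), 3))  # 1 * ln(3/1) ≈ 1.099
```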

Smart Browser based on Semantic Web using RFID Technology (RFID 기술을 이용한 시맨틱 웹 기반 스마트 브라우저)

  • Song, Chang-Woo; Lee, Jung-Hyun
    • The Journal of the Korea Contents Association / v.8 no.12 / pp.37-44 / 2008
  • Data stored in RFID tags are used to reduce costs and enhance competitiveness in applications across various industrial areas. RFID readers identify and search hundreds of objects, i.e., tags. RFID technology, which identifies objects on request for dynamic linking and tracking, is composed of application components that support the information infrastructure. Despite their many advantages, existing applications do not consider real-time data communication among remote RFID devices and therefore cannot effectively support connections among heterogeneous devices. Because different network devices are installed separately in each application and go through different query analysis processes, monitoring delays and data conversion errors occur. This study implements an RFID database handling system in a semantic web environment for the integrated management of information extracted from RFID tags, regardless of the application. A user's RFID tag is identified by an RFID reader mounted on an application, the data are sent to the RFID database processing system, and the system converts the information into a semantic web language. Data transmitted in this standardized semantic web form are interpreted by a smart browser and displayed on the screen. The use of a semantic web language enables reasoning over meaningful relations, which in turn makes it easy to extend functionality by adding modules.
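
The system described above converts RFID tag readings into a semantic web language before the smart browser renders them. The paper does not specify its vocabulary or serialization; the sketch below uses rdflib with a made-up namespace, property names, and tag ID purely to illustrate the tag-to-RDF step.

```python
# Illustrative only: turn one RFID tag reading into RDF triples with rdflib.
# Namespace, properties, and tag ID are assumptions, not the paper's schema.
from rdflib import Graph, Literal, Namespace, RDF, URIRef
from rdflib.namespace import XSD

EX = Namespace("http://example.org/rfid#")

g = Graph()
g.bind("ex", EX)

tag = URIRef("http://example.org/rfid/tag/E2000017221101441890A1B2")
g.add((tag, RDF.type, EX.Tag))
g.add((tag, EX.readBy, EX.reader01))
g.add((tag, EX.readAt, Literal("2008-11-03T10:15:00", datatype=XSD.dateTime)))
g.add((tag, EX.attachedTo, EX.productA))

# Serialized RDF (Turtle) is what the smart browser would consume.
print(g.serialize(format="turtle"))
```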

Design and Implementation of Lesson Plan System for teacher-student based on XML (XML 기반 교수-학생 학습지도 시스템의 설계 및 구현)

  • Choi, Mun-Kyoung; Kim, Haeng-Kon
    • The KIPS Transactions: Part D / v.9D no.6 / pp.1055-1062 / 2002
  • Recently, the lesson plan documents used in the educational field have not been linked systematically to educational information, and it is not easy for teachers to compose them, so producing lesson plans takes additional time and effort. As distributed networks spread, a web-based lesson plan system is needed throughout the education field. We therefore need lesson plans that can satisfy teachers' various requirements by supporting the creation, retrieval, and reuse of documents on the web using standard XML. In this paper, we developed a system that creates a common DTD (Document Type Definition) based on an analysis of lesson plans and provides standard XML documents through that common DTD. The system provides an editor for composing lesson plans and supports search functions that improve the reusability of existing lesson plans; we designed structure-based, facet, and keyword searches. The composed lesson plans are stored in and retrieved from a database. Consequently, we can share information on the web by composing lesson plans in XML, save time and cost by writing lesson plans directly on the web, and provide an improved learning environment.
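
The system above centers on a common DTD that standardizes XML lesson plan documents. The actual DTD is not given in the abstract, so the fragment below is a hypothetical, much-simplified lesson-plan DTD, validated with lxml only to show the document/DTD relationship.

```python
# Hypothetical, simplified lesson-plan DTD and document; lxml demonstrates
# validating an XML lesson plan against a shared DTD.
from io import StringIO
from lxml import etree

dtd = etree.DTD(StringIO("""
<!ELEMENT lessonPlan (subject, grade, objective, activity+)>
<!ELEMENT subject   (#PCDATA)>
<!ELEMENT grade     (#PCDATA)>
<!ELEMENT objective (#PCDATA)>
<!ELEMENT activity  (#PCDATA)>
"""))

doc = etree.XML("""
<lessonPlan>
  <subject>Science</subject>
  <grade>5</grade>
  <objective>Explain the water cycle</objective>
  <activity>Group experiment: evaporation</activity>
  <activity>Worksheet and discussion</activity>
</lessonPlan>
""")

print(dtd.validate(doc))  # True if the document conforms to the common DTD
```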

Development of a Remotely Sensed Image Processing/Analysis System : GeoPixel Ver. 1.0 (JAVA를 이용한 위성영상처리/분석 시스템 개발 : GeoPixel Ver. 1.0)

  • 안충현; 신대혁
    • Korean Journal of Remote Sensing / v.13 no.1 / pp.13-30 / 1997
  • Recent improvements in satellite remote sensing sensors, represented by hyperspectral imaging sensors and high spatial resolution sensors, produce large amounts of data, typically several hundred megabytes per scene. Moreover, increasing information exchange via the internet and the information superhighway requires more active service systems for processing and analysing remote sensing data in order to provide value-added products. In this sense, an advanced satellite data processing system is being developed to achieve high computing speed and efficiency in processing huge volumes of data, and to enable network computing and easy improvement, upgrading, and management of the system. The Java internet programming language provides several advantages for developing such software, including object-oriented programming, multi-threading, and robust memory management. Using these features, a satellite data processing system named GeoPixel has been developed in Java. GeoPixel adopts newly developed techniques, including an object-pipe connection method between processes and a multi-threaded structure. In other words, the system is platform-independent and handles huge volumes of remote sensing data efficiently and robustly. In the evaluation of its data processing capability, satisfactory results were obtained in terms of computer resource utilization (CPU and memory) and processing speed.
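
GeoPixel connects processing stages through object pipes and multi-threading so that very large scenes can stream through the system rather than being loaded whole. The original is a Java system whose internals are not published here; as a loose analogue only (not the GeoPixel design), the sketch below pipes image tiles through worker threads connected by bounded queues.

```python
# Loose analogue of a piped, multi-threaded image-processing chain:
# tiles flow read -> enhance -> write through bounded queues, so a huge
# scene never sits in memory all at once. Not the GeoPixel code.
import queue
import threading

import numpy as np

SENTINEL = None
raw_tiles = queue.Queue(maxsize=4)
enhanced_tiles = queue.Queue(maxsize=4)

def reader(n_tiles=8, tile_size=256):
    # stand-in for reading tiles from a large satellite scene
    for i in range(n_tiles):
        tile = np.random.randint(0, 255, (tile_size, tile_size), dtype=np.uint8)
        raw_tiles.put((i, tile))
    raw_tiles.put(SENTINEL)

def enhancer():
    # simple contrast stretch as a placeholder processing step
    while (item := raw_tiles.get()) is not SENTINEL:
        idx, tile = item
        stretched = np.interp(tile, (tile.min(), tile.max()), (0, 255)).astype(np.uint8)
        enhanced_tiles.put((idx, stretched))
    enhanced_tiles.put(SENTINEL)

def writer():
    while (item := enhanced_tiles.get()) is not SENTINEL:
        idx, tile = item
        print(f"tile {idx}: min={tile.min()} max={tile.max()}")

threads = [threading.Thread(target=f) for f in (reader, enhancer, writer)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```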

A Study on Artificial Intelligence Ethics Perceptions of University Students by Text Mining (텍스트 마이닝으로 살펴본 대학생들의 인공지능 윤리 인식 연구)

  • Yoo, Sujin; Jang, YunJae
    • Journal of The Korean Association of Information Education / v.25 no.6 / pp.947-960 / 2021
  • In this study, we examine university students' perceptions of AI ethics in order to explore directions for AI ethics education. For this, 83 students wrote their thoughts on 5 discussion topics on an online bulletin board, and we analyzed the texts using language network analysis, one of the text mining techniques. First, 62.5% of students spoke positively about the future of the AI society. Second, 39.2% of students thought that, at the current level of autonomous driving, responsibility for a self-driving car accident lies with the vehicle owner. Third, invasion of privacy, abuse of technology, and unbalanced information acquisition were cited as dysfunctions of AI development; students mentioned that ethics education for both AI users and developers is required to minimize these dysfunctions, and that institutional preparations should proceed in parallel. Fourth, only 19.2% of students expressed a positive opinion about a society in which face recognition technology is universal. Finally, there was a common opinion that when collecting data that include personal information, only data for which consent has been given should be used. Regarding the use of AI without moral standards, students emphasized the ethical literacy of both users and developers. This study is meaningful in that it provides information needed to design the content of AI ethics education in liberal arts education.
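
The responses above are analyzed with language network analysis, the keyword of this listing. The students' actual texts are not reproduced here, so the sketch below builds a small word co-occurrence network from placeholder responses with networkx; it shows the generic form of the technique rather than the authors' exact pipeline.

```python
# Generic language-network sketch: words that appear in the same response
# are linked, and edge weights count how often they co-occur.
from itertools import combinations

import networkx as nx

responses = [                      # hypothetical tokenized responses
    ["ai", "privacy", "regulation"],
    ["ai", "developer", "ethics"],
    ["privacy", "consent", "data"],
    ["ai", "ethics", "education"],
]

G = nx.Graph()
for tokens in responses:
    for u, v in combinations(sorted(set(tokens)), 2):
        if G.has_edge(u, v):
            G[u][v]["weight"] += 1
        else:
            G.add_edge(u, v, weight=1)

# Central words in the network approximate the themes students stressed.
centrality = nx.degree_centrality(G)
print(sorted(centrality.items(), key=lambda kv: -kv[1])[:3])
```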

Global Citizenship Education in the Primary Geography Curriculum of the Republic of Korea: Content Analysis Focusing on the Semantic Structure of 2009 Revised School Curriculum (초등지리 교육과정에 반영된 세계시민교육 관련 요소의 구조적 특성에 관한 연구: 2009 개정 교육과정 성취기준에 대한 내용분석을 중심으로)

  • Lee, Dong-Min
    • Journal of the Korean Geographical Society / v.49 no.6 / pp.949-969 / 2014
  • The purpose of this study is to analyze the share of global citizenship education in the 2009 Revised Social Studies (geography area) School Curriculum of the Republic of Korea. I selected the achievement standards of the geography domain in the fifth and sixth grades as the subjects of analysis and examined them using content analysis: KrKwic, a Korean-language content analysis tool, was used to analyze the content, and a semantic network of the results was drawn with UciNet/NetDraw. I found that the geography domain of the 2009 Revised Primary School Curriculum includes the concepts and factors of global citizenship education. However, global citizenship education does not account for a major portion of the curriculum, and the curriculum achievement standards are noticeably nation-state centered. Global citizenship education factors were not closely associated with other related factors; in fact, they even showed an isolated pattern. These findings suggest that the inclusion of global citizenship education in primary geography education is limited, because the connections between global citizenship education and related content, such as the environment, sustainable development, conflict, and cooperation, are likely to be impeded. Globalization accompanies the transformation of territories, identities, and the relations between nation-states and the world, although nation-states continue to play a significant role in the globalized world. Therefore global citizenship education, an educational trend focusing on the global community, is particularly important and is required in the geography curriculum of the global era. I expect the examination undertaken in this study to contribute to future curriculum revisions regarding globalization and global citizenship.


Analysis of ICT Education Trends using Keyword Occurrence Frequency Analysis and CONCOR Technique (키워드 출현 빈도 분석과 CONCOR 기법을 이용한 ICT 교육 동향 분석)

  • Youngseok Lee
    • Journal of Industrial Convergence / v.21 no.1 / pp.187-192 / 2023
  • In this study, trends in ICT education were investigated through machine-learning-based analysis of keyword appearance frequencies and the CONCOR (convergence of iterated correlations) technique. A total of 304 papers published from 2018 to the present in registered journals were retrieved from Google Scholar using "ICT education" as the keyword, and 60 papers pertaining to ICT education were selected through a systematic literature review. Keywords were then extracted from the titles and abstracts of these papers. For word frequency and indicator data, 49 high-frequency keywords were extracted by analyzing term frequencies with the term frequency-inverse document frequency (TF-IDF) technique from natural language processing, together with co-occurrence frequencies. Relationships were verified by analyzing the connection structure and degree centrality between words, and clusters of similar words were derived via CONCOR analysis. First, "education," "research," "result," "utilization," and "analysis" emerged as the main keywords. Second, an N-gram network graph with "education" as the keyword showed that "curriculum" and "utilization" have the highest correlation with it. Third, cluster analysis with "education" as the keyword produced five groups: "curriculum," "programming," "student," "improvement," and "information." These results indicate that practical research needed for ICT education can be guided by analyzing and identifying ICT education trends.
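
CONCOR repeatedly correlates the columns of a word co-occurrence matrix until every entry converges to +1 or -1, and the resulting sign pattern splits the keywords into blocks. The matrix below is made up, and this is only a bare first-level split, not the paper's full multi-level analysis.

```python
# Bare CONCOR first-level split on a toy keyword co-occurrence matrix.
import numpy as np

words = ["education", "curriculum", "programming", "student", "analysis", "result"]
cooc = np.array([
    [0, 4, 1, 3, 1, 1],
    [4, 0, 1, 2, 0, 1],
    [1, 1, 0, 1, 3, 2],
    [3, 2, 1, 0, 1, 0],
    [1, 0, 3, 1, 0, 4],
    [1, 1, 2, 0, 4, 0],
], dtype=float)

m = np.corrcoef(cooc, rowvar=False)   # correlations between keyword columns
for _ in range(100):                  # iterate correlations until convergence
    new = np.corrcoef(m, rowvar=False)
    if np.max(np.abs(new - m)) < 1e-9:
        m = new
        break
    m = new

# keywords whose converged correlation with the first word is positive
# form one block; the rest form the other
block_a = [w for w, positive in zip(words, m[0] > 0) if positive]
block_b = [w for w, positive in zip(words, m[0] > 0) if not positive]
print(block_a, block_b)
```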

Feasibility of Deep Learning Algorithms for Binary Classification Problems (이진 분류문제에서의 딥러닝 알고리즘의 활용 가능성 평가)

  • Kim, Kitae; Lee, Bomi; Kim, Jong Woo
    • Journal of Intelligence and Information Systems / v.23 no.1 / pp.95-108 / 2017
  • Recently AlphaGo, the Go (Baduk) artificial intelligence program by Google DeepMind, won a decisive victory against Lee Sedol. Many people thought that machines could not beat a human at Go because, unlike chess, the number of possible game paths exceeds the number of atoms in the universe, but the result was the opposite of what people predicted. After the match, artificial intelligence was highlighted as a core technology of the fourth industrial revolution and attracted attention from various application domains. In particular, deep learning, the core artificial intelligence technique used in the AlphaGo algorithm, has drawn attention. Deep learning is already being applied to many problems and shows especially good performance in image recognition. It also performs well on high-dimensional data such as voice, images, and natural language, where existing machine learning techniques struggled. In contrast, it is difficult to find deep learning research on traditional business data and structured data analysis. In this study, we investigated whether the deep learning techniques developed so far can be used not only for recognizing high-dimensional data but also for binary classification problems in traditional business data analysis, such as customer churn analysis, marketing response prediction, and default prediction, and we compared their performance with that of traditional artificial neural network models. The experimental data are the telemarketing response data of a bank in Portugal; the input variables include age, occupation, loan status, and the number of previous telemarketing contacts, and the binary target variable records whether the customer opened an account. To evaluate the applicability of deep learning to binary classification, we compared models using CNN, LSTM, and dropout, which are widely used deep learning algorithms and techniques, with MLP models, a traditional type of artificial neural network. However, since all network design alternatives cannot be tested given the nature of artificial neural networks, the experiment used restricted settings for the number of hidden layers, the number of neurons per hidden layer, the number of output filters, and the conditions for applying dropout. The F1 score was used to evaluate the models, to show how well they classify the class of interest rather than overall accuracy. The detailed methods for applying each deep learning technique were as follows. The CNN algorithm reads adjacent values and recognizes local features, but for business data the distance between fields does not matter because the fields are usually independent. In this experiment, we therefore set the CNN filter size to the number of fields, so that the whole record is learned at once, and added a hidden layer to make decisions based on the extracted features. For the model with two LSTM layers, the input direction of the second layer was reversed with respect to the first layer in order to reduce the influence of field position.
In the case of the dropout technique, neurons were set to drop out with a probability of 0.5 in each hidden layer. The experimental results show that the model with the highest F1 score was the CNN model with dropout, followed by the MLP model with two hidden layers and dropout. Several findings emerged from the experiments. First, models using dropout make slightly more conservative predictions than those without it and generally classify better. Second, CNN models show better classification performance than MLP models; this is interesting because the CNN performed well on a binary classification problem to which it has rarely been applied, as well as in the fields where its effectiveness is already proven. Third, the LSTM algorithm appears unsuitable for binary classification problems because its training time is too long relative to the performance improvement. From these results, we can confirm that some deep learning algorithms can be applied to business binary classification problems.
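
The best model reported above is a CNN with dropout whose filter width equals the number of input fields, so one convolution sees the whole record at once. The paper's exact layer sizes and training settings are not given; the sketch below reproduces only that structural idea with tf.keras on synthetic data. The filter count, hidden layer size, and epochs are assumptions, and F1 is computed with scikit-learn (the metric named in the abstract; the library choice is mine).

```python
# Sketch of the abstract's CNN-with-dropout idea on synthetic tabular data.
# Only the structure (kernel width = number of fields, dropout 0.5, F1
# evaluation) follows the text; everything else is an assumption.
import numpy as np
import tensorflow as tf
from sklearn.metrics import f1_score

n_samples, n_fields = 2000, 12
rng = np.random.default_rng(0)
X = rng.normal(size=(n_samples, n_fields, 1)).astype("float32")
y = (X[:, :, 0].sum(axis=1) + rng.normal(scale=0.5, size=n_samples) > 0).astype("int32")

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(n_fields, 1)),
    # kernel_size equal to the number of fields: one filter spans the whole record
    tf.keras.layers.Conv1D(filters=32, kernel_size=n_fields, activation="relu"),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(16, activation="relu"),
    tf.keras.layers.Dropout(0.5),          # dropout probability 0.5, as described
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy")
model.fit(X[:1600], y[:1600], epochs=5, batch_size=64, verbose=0)

pred = (model.predict(X[1600:], verbose=0) > 0.5).astype("int32").ravel()
print("F1:", round(f1_score(y[1600:], pred), 3))
```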

A Study on the Performance of Priority Mechanisms in ATM Multiplexer (ATM 멀티플렉서에서의 우선순위 메카니즘에 관한 연구)

  • 윤성호; 박광채; 이재호
    • The Journal of Korean Institute of Communications and Information Sciences / v.18 no.6 / pp.779-792 / 1993
  • In a switching node or an ATM multiplexer of an ATM network, good bandwidth utilization can be achieved by priority control using the 1-bit CLP (Cell Loss Priority) field in the ATM cell header. In this paper, a mixed mechanism is proposed to overcome the shortcomings of existing space priority control mechanisms and to decrease the loss probability of high-priority cells, and its performance is analyzed in terms of cell loss probability. To evaluate the proposed mixed mechanism, its cell loss probability is compared with those of the non-priority, push-out, and partial buffer sharing mechanisms. The cell loss probability is analyzed using an M/D/1/N model and a 2-state MMPP/D/1/N model, and the two models are compared. To verify the numerical analysis, computer simulations are performed for each mechanism using the simulation language SIMSCRIPT II.5.
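
The cell loss probabilities above come from M/D/1/N and MMPP/D/1/N models checked by simulation. As a rough illustration of the simpler case only (the MMPP arrival process and the priority mechanisms are not reproduced), a slotted M/D/1/N simulation can be written in a few lines; the loads and buffer size below are arbitrary.

```python
# Rough slotted M/D/1/N simulation: Poisson arrivals per service slot,
# one deterministic departure per slot, finite buffer of N cells.
# Estimates the overall cell loss probability only; no priority handling.
import numpy as np

def md1n_loss_probability(load, buffer_size, n_slots=200_000, seed=0):
    rng = np.random.default_rng(seed)
    arrivals = rng.poisson(load, n_slots)   # mean `load` cells arrive per slot
    queue_len, lost = 0, 0
    for a in arrivals:
        # cells beyond the remaining buffer space are dropped
        lost += max(0, a - (buffer_size - queue_len))
        queue_len = min(buffer_size, queue_len + a)
        if queue_len > 0:                    # deterministic service: one cell per slot
            queue_len -= 1
    return lost / arrivals.sum()

print(md1n_loss_probability(load=0.80, buffer_size=20))  # very low loss
print(md1n_loss_probability(load=0.95, buffer_size=20))  # loss grows near saturation
```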
