

An Analysis of the Landscape Cognitive Characteristics of 'Gugok Streams' in the First Half of the 18th Century Based on the Comparison of China's 『Wuyi-Gugok Painting』 (중국 『무이구곡도』 3폭(幅)의 비교 분석을 통해 본 18세기 무이산 구곡계(九曲溪)의 경물 인지특성)

  • Cheng, Zhao-Xia;Rho, Jae-Hyun;Jiang, Cheng
    • Journal of the Korean Institute of Traditional Landscape Architecture, v.37 no.3, pp.62-82, 2019
  • Taking as research objects the three Wuyi-Gugok drawings produced in the mid-Qing Dynasty, 『A Picture Showing the Boundary Between Mountains and Rivers: A』, 『Landscape of the Jiuqu River in the Wuyi Mountain: B』 and 『Eighteen Sceneries of Wuyi Mountain: C』, and after investigating the names recorded in the paintings, this paper analyzes the scenic spots, scene types and images identified in the literature survey. In addition, based on the number of scenic types and scenic names in each Gok, and on the landscape richness (LR) and landscape similarity (LS) of the Gugok scenic spots, the cognitive characteristics of the landscape in the 18th century were carefully observed. The results are as follows. Firstly, according to the descriptive statistics of scenic spot types in the Wuyi Mountain Chronicle, there were 41 scenery-name descriptions in the three paintings, among which rocks, peaks and stones accounted for the majority. According to the data, rocks, peaks and stones made up more than half of the Wuyi-Gugok landscape, reflecting its geological character, such as the Danxia landform. Secondly, the landscape of the Gugok Stream (九曲溪) was diverse and full of images. The 1st Gok Daewangbong (大王峰) and Manjeongbong (幔亭峰), the 2nd Gok Oknyeobong (玉女峰), the 3rd Gok Sojangbong (小藏峰), the 4th Gok Daejangbong (大藏峰), the 5th Gok Daeeunbyeong (大隱屛) and Muijeongsa (武夷精舍), and the 6th Gok Seonjangbong (仙掌峰) and Cheonyubong (天游峰) each presented an outstanding landscape, whereas the landscape features of the 7th~9th Gok were relatively weak. Thirdly, according to the landscape image survey of each Gok, the image of the Gugok cultural landscape originates from the specificity of the myths and legends related to Wuyi Mountain, and the landscape is highly well known; owing to this specificity, landscape recognition was very high. In particular, the 1st and 5th Gok were closely related to the Taoist culture based on Muigun, as well as to the stone-carving and boat-tour cultures associated with the Neo-Confucian culture of Zhu Xi. Fourthly, according to the analysis of the landscape similarity of the 41 landscape types shown in the figures, the similarity between A and C was very high: the morphological descriptions and the rendering of distance were very similar, so it could be judged that one painting clearly influenced the other. As a whole, the scenery names depicted in the three paintings were formed at the latest in the first half of the 18th century through a long history of inheritance and accumulated myths and legends. The order of the scenery names differed somewhat among the three drawings, but among the names appearing in all three there were 21 stones, 20 rocks and 17 peaks; stones, rocks and peaks defined the landscape of the Gugok Stream in Wuyi Mountain. Fifthly, Seonjodae (仙釣臺) was depicted in the 4th Gok in A and C, but, notably, it is known in Korea as a scenery name of the 3rd Gok. In addition, Seungjindong (升眞洞) in the 1st Gok and Seokdangsa (石堂寺) in the 7th Gok were not depicted in drawings A, B or C. This is a special point that needs to be studied in the future.
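
The landscape similarity (LS) comparison above can be illustrated with a set-based measure. A minimal sketch in Python, assuming hypothetical scenery-name sets for paintings A, B and C; the paper's own LS index may be defined differently, so Jaccard similarity stands in here:

    def jaccard(a, b):
        """Share of scenery names two paintings have in common."""
        return len(a & b) / len(a | b)

    # Hypothetical scenery-name sets, not the 41 names from the paper.
    names_A = {"Daewangbong", "Oknyeobong", "Seonjodae", "Cheonyubong"}
    names_B = {"Daewangbong", "Oknyeobong", "Daejangbong"}
    names_C = {"Daewangbong", "Oknyeobong", "Seonjodae", "Cheonyubong"}

    print("A-B:", jaccard(names_A, names_B))  # low overlap
    print("A-C:", jaccard(names_A, names_C))  # a high A-C score mirrors the
                                              # finding that A and C are close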

The Requirement and Effect of the Document of Carriage in Respect of the International Carriage of Cargo by Air (국제항공화물운송에 관한 운송증서의 요건 및 효력)

  • Lee, Kang-Bin
    • The Korean Journal of Air & Space Law and Policy, v.23 no.2, pp.67-92, 2008
  • The purpose of this paper is to research the requirements and effects of the document of carriage in respect of the carriage of cargo by air under the Montreal Convention of 1999, the IATA Conditions of Carriage for Cargo, and the judicial precedents of Korea and foreign countries. Under Article 4 of the Montreal Convention, in respect of the carriage of cargo, an air waybill shall be delivered. If any other means which preserves a record of the carriage is used, the carrier shall, if so requested by the consignor, deliver to the consignor a cargo receipt. Under Article 7 of the Montreal Convention, the air waybill shall be made out by the consignor. If, at the request of the consignor, the carrier makes it out, the carrier shall be deemed to have done so on behalf of the consignor. The air waybill shall be made out in three original parts. The first part shall be marked "for the carrier" and shall be signed by the consignor. The second part shall be marked "for the consignee" and shall be signed by the consignor and by the carrier. The third part shall be signed by the carrier, who shall hand it to the consignor after the goods have been accepted. Under Article 5 of the Montreal Convention, the air waybill or the cargo receipt shall include (a) an indication of the places of departure and destination, (b) an indication of at least one agreed stopping place, where applicable, and (c) an indication of the weight of the consignment. Under Article 10 of the Montreal Convention, the consignor shall indemnify the carrier against all damage suffered by the carrier, or by any other person to whom the carrier is liable, by reason of the irregularity, incorrectness or incompleteness of the particulars and statements furnished by the consignor or on its behalf. Under Article 9 of the Montreal Convention, non-compliance with Articles 4 to 8 of the Montreal Convention shall not affect the existence or the validity of the contract, which shall remain subject to the rules of the Montreal Convention, including those relating to limitation of liability. The air waybill is not a document of title or a negotiable instrument. Under Article 11 of the Montreal Convention, the air waybill or the cargo receipt is prima facie evidence of the conclusion of the contract, of the acceptance of the cargo and of the conditions of carriage. Under Article 12 of the Montreal Convention, if the carrier carries out the instructions of the consignor for the disposition of the cargo without requiring the production of the part of the air waybill or the cargo receipt, the carrier will be liable for any damage which may be caused thereby to any person who is lawfully in possession of that part of the air waybill or the cargo receipt. According to the precedent of the Korea Supreme Court decided on 22 July 2004, a freight forwarder acting as carrier was not liable for the illegal delivery of cargo to the notify party (the actual importer) on the air waybill by the operator of the bonded warehouse, because the forwarder did not designate the bonded warehouse and did not hold the position of employer with respect to the operator of the bonded warehouse. In conclusion, as the Korean customs authorities will drive the e-Freight project for the carriage of cargo by air, carriers and freight forwarders should pay attention to the requirements and legal effects of the electronic documentation of the carriage of cargo by air.
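
The documentary requirements summarized above lend themselves to a simple data check. A minimal sketch, assuming a hypothetical AirWaybill record; the field and part names paraphrase Articles 5 and 7 as described in the abstract, not any carrier's actual document format:

    from dataclasses import dataclass
    from typing import Optional

    @dataclass
    class AirWaybill:
        place_of_departure: str
        place_of_destination: str
        agreed_stopping_place: Optional[str]  # at least one, where applicable
        consignment_weight_kg: float
        # Article 7: three original parts and their signatories.
        parts: tuple = (
            ("for the carrier", "signed by consignor"),
            ("for the consignee", "signed by consignor and carrier"),
            ("handed to consignor", "signed by carrier"),
        )

    def meets_article_5(awb: AirWaybill) -> bool:
        """Check the particulars Article 5 requires on the waybill/receipt."""
        return bool(awb.place_of_departure
                    and awb.place_of_destination
                    and awb.consignment_weight_kg > 0)

    awb = AirWaybill("ICN", "JFK", None, 350.0)  # hypothetical consignment
    print(meets_article_5(awb))  # True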


The micro-tensile bond strength of two-step self-etch adhesive to ground enamel with and without prior acid-etching (산부식 전처리에 따른 2단계 자가부식 접착제의 연마 법랑질에 대한 미세인장결합강도)

  • Kim, You-Lee;Kim, Jee-Hwan;Shim, June-Sung;Kim, Kwang-Mahn;Lee, Keun-Woo
    • The Journal of Korean Academy of Prosthodontics, v.46 no.2, pp.148-156, 2008
  • Statement of problem: Self-etch adhesives offer clinical benefits such as ease of manipulation and reduced technique sensitivity. Nevertheless, some concern remains regarding the bonding effectiveness of self-etch adhesives to enamel, particularly when so-called 'mild' self-etch adhesives are employed. This study compared the microtensile bond strength to ground enamel of the two-step self-etch adhesive Clearfil SE Bond (Kuraray) with those of the three-step etch-and-rinse adhesive Scotchbond Multi-Purpose (3M ESPE) and the one-step self-etch adhesive iBond (Heraeus Kulzer). Purpose: The purpose of this study was to determine the effect of a preceding phosphoric acid conditioning step on the bonding effectiveness of a two-step self-etch adhesive to ground enamel. Material and methods: The experimental groups were the two-step self-etch adhesive Clearfil SE Bond without etching (non-etch group), Clearfil SE Bond with prior 35% phosphoric acid etching (etch group), and the one-step self-etch adhesive iBond. The three-step etch-and-rinse adhesive Scotchbond Multi-Purpose served as the control group. The facial surfaces of bovine incisors were divided cruciformly into four equal parts and randomly distributed among the groups. The facial surface of each incisor was ground with 800-grit silicon carbide paper. Each adhesive was applied to the ground enamel according to the manufacturer's instructions, after which the surface was built up using Light-Core (Bisco). After storage in distilled water at 37°C for 1 week, the restored teeth were sectioned into enamel beams approximately 0.8 × 0.8 mm in cross-section using a low-speed precision diamond saw (TOPMET Metsaw-LS). After storage in distilled water at 37°C for 1 month and for 3 months, microtensile bond strength evaluations were performed on the microspecimens. The microtensile bond strength (MPa) was derived by dividing the imposed force (N) at the time of fracture by the bond area (mm²). The mode of failure at the interface was determined with a binocular microscope (Nikon). The microtensile bond strength data were statistically analyzed using one-way ANOVA, followed by the Least Significant Difference post hoc test at a significance level of 5%. Results: The mean microtensile bond strengths after 1 month of storage showed no statistically significant differences among the adhesive groups (P>0.05). After 3 months of storage, the adhesion to ground enamel of iBond was not significantly different from that of Clearfil SE Bond etch (P>0.05), while Clearfil SE Bond non-etch and Scotchbond Multi-Purpose demonstrated significantly lower bond strengths (P<0.05), with no significant difference between these two adhesives. Conclusion: In this study, the microtensile bond strength to ground enamel of the two-step self-etch adhesive Clearfil SE Bond was not significantly different from that of the three-step etch-and-rinse adhesive Scotchbond Multi-Purpose, and prior etching with 35% phosphoric acid significantly increased the bonding effectiveness of Clearfil SE Bond to enamel at 3 months.
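
The bond-strength formula and the statistical test reported above can be illustrated numerically. A minimal worked example, assuming hypothetical force, area and group measurements; scipy's one-way ANOVA stands in for the analysis described:

    from scipy import stats

    # MTBS (MPa) = fracture force (N) / bond area (mm^2), per the abstract.
    force_n = 18.4                 # hypothetical force at fracture (N)
    area_mm2 = 0.8 * 0.8           # beam cross-section of 0.8 x 0.8 mm
    mtbs_mpa = force_n / area_mm2  # -> 28.75 MPa

    # One-way ANOVA across hypothetical adhesive groups at the 5% level.
    group_a = [28.7, 30.1, 27.5]   # e.g. Clearfil SE Bond etch
    group_b = [22.4, 24.0, 23.1]   # e.g. Clearfil SE Bond non-etch
    group_c = [23.0, 22.1, 24.3]   # e.g. Scotchbond Multi-Purpose
    f_stat, p_value = stats.f_oneway(group_a, group_b, group_c)
    print(mtbs_mpa, p_value < 0.05)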

Pareto Ratio and Inequality Level of Knowledge Sharing in Virtual Knowledge Collaboration: Analysis of Behaviors on Wikipedia (지식 공유의 파레토 비율 및 불평등 정도와 가상 지식 협업: 위키피디아 행위 데이터 분석)

  • Park, Hyun-Jung;Shin, Kyung-Shik
    • Journal of Intelligence and Information Systems, v.20 no.3, pp.19-43, 2014
  • The Pareto principle, also known as the 80-20 rule, states that roughly 80% of the effects come from 20% of the causes for many events, including natural phenomena. It has been recognized as a golden rule in business, with wide application of such findings as 20 percent of customers accounting for 80 percent of total sales. On the other hand, the Long Tail theory, which points out that "the trivial many" produce more value than "the vital few," has gained popularity in recent times thanks to a tremendous reduction of distribution and inventory costs through the development of ICT (Information and Communication Technology). This study set out to illuminate how these two primary business paradigms, the Pareto principle and the Long Tail theory, relate to the success of virtual knowledge collaboration. The importance of virtual knowledge collaboration is soaring in this era of globalization and virtualization that transcends geographical and temporal constraints. Many previous studies on knowledge sharing have focused on the factors that affect knowledge sharing, seeking to boost individual knowledge sharing and to resolve the social dilemma caused by the fact that rational individuals are likely to consume rather than contribute knowledge. Knowledge collaboration can be defined as the creation of knowledge not only by sharing knowledge, but also by transforming and integrating it. From this perspective, the relative distribution of knowledge sharing among participants can count as much as the absolute amount of individual knowledge sharing. In particular, whether a greater contribution by the upper 20 percent of participants enhances the efficiency of overall knowledge collaboration is an issue of interest. This study examines the effect of this distribution of knowledge sharing on the efficiency of knowledge collaboration and extends the analysis to reflect work characteristics. All analyses were conducted on actual behavioral data instead of self-reported questionnaire surveys. More specifically, we analyzed the collaborative behaviors of the editors of 2,978 English Wikipedia featured articles, the highest quality grade of articles in English Wikipedia. We adopted the Pareto ratio, the ratio of the number of knowledge contributions made by the upper 20 percent of participants to the total number of knowledge contributions made by all participants of an article group, to examine the effect of the Pareto principle. In addition, the Gini coefficient, which represents the inequality of income within a group of people, was applied to reveal the effect of inequality of knowledge contribution. Hypotheses were set up on the assumption that a higher ratio of knowledge contribution by more highly motivated participants leads to higher collaboration efficiency, but that if the ratio gets too high, collaboration efficiency deteriorates because overall informational diversity is threatened and the knowledge contribution of less motivated participants is discouraged. Cox regression models were formulated for each of the focal variables, the Pareto ratio and the Gini coefficient, with seven control variables such as the number of editors involved in an article, the average time between successive edits of an article, and the number of sections a featured article has. The dependent variable of the Cox models is the time from article initiation to promotion to featured-article status, indicating the efficiency of knowledge collaboration. To examine whether the effects of the focal variables vary depending on the characteristics of the group task, we classified the 2,978 featured articles into two categories, academic and non-academic, where an academic article refers to at least one paper published in an SCI, SSCI, A&HCI, or SCIE journal. We assumed that academic articles are more complex, entail more information processing and problem solving, and thus require more skill variety and expertise. The analysis results indicate the following. First, the Pareto ratio and the inequality of knowledge sharing relate in a curvilinear fashion to collaboration efficiency in an online community, promoting it up to an optimal point and undermining it thereafter. Second, this curvilinear effect on collaboration efficiency is more pronounced for the more academic tasks.
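
The two focal variables can be computed directly from per-editor contribution counts. A minimal sketch, assuming a hypothetical list of edit counts for one article; the paper derives these from actual Wikipedia edit histories:

    import numpy as np

    def pareto_ratio(contribs):
        """Share of all contributions made by the top 20% of contributors."""
        x = np.sort(np.asarray(contribs))[::-1]
        top = max(1, int(np.ceil(0.2 * len(x))))
        return x[:top].sum() / x.sum()

    def gini(contribs):
        """Gini coefficient of contribution counts (0 = equal, 1 = maximal)."""
        x = np.sort(np.asarray(contribs, dtype=float))
        n = len(x)
        return (2 * np.arange(1, n + 1) @ x) / (n * x.sum()) - (n + 1) / n

    edits = [120, 45, 10, 8, 5, 3, 2, 2, 1, 1]  # hypothetical edit counts
    print(pareto_ratio(edits), gini(edits))     # ~0.84, high inequality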

Efficient Topic Modeling by Mapping Global and Local Topics (전역 토픽의 지역 매핑을 통한 효율적 토픽 모델링 방안)

  • Choi, Hochang;Kim, Namgyu
    • Journal of Intelligence and Information Systems
    • /
    • v.23 no.3
    • /
    • pp.69-94
    • /
    • 2017
  • Recently, the increasing demand for big data analysis has been driving the vigorous development of related technologies and tools. In addition, the development of IT and the increased penetration rate of smart devices are producing a large amount of data. As a result, data analysis technology is rapidly becoming popular, and attempts to acquire insights through data analysis have been continuously increasing. This means that big data analysis will become more important in various industries for the foreseeable future. Big data analysis is generally performed by a small number of experts and delivered to those who demand the analysis. However, the growing interest in big data analysis has spurred computer programming education and the development of many programs for data analysis. Accordingly, the entry barriers to big data analysis are gradually falling and data analysis technology is spreading. As a result, big data analysis is expected to be performed by the demanders of the analysis themselves. Along with this, interest in various kinds of unstructured data is continually increasing, and a great deal of attention is focused on using text data. The emergence of new web-based platforms and techniques has brought about the mass production of text data and active attempts to analyze it, and the results of text analysis have been utilized in various fields. Text mining is a concept that embraces various theories and techniques for text analysis. Among the many text mining techniques utilized for various research purposes, topic modeling is one of the most widely used and studied. Topic modeling is a technique that extracts the major issues from a large set of documents, identifies the documents that correspond to each issue, and provides the identified documents as a cluster. It is regarded as very useful in that it reflects the semantic elements of documents. Traditional topic modeling is based on the distribution of key terms across the entire document set; thus, it is essential to analyze the entire set at once to identify the topic of each document. This makes the analysis time-consuming when topic modeling is applied to a large number of documents. In addition, it suffers from a scalability problem: processing time increases exponentially with the number of analysis objects. This problem is particularly noticeable when the documents are distributed across multiple systems or regions. To overcome these problems, a divide-and-conquer approach can be applied to topic modeling: a large number of documents is divided into sub-units, and topics are derived by repeatedly applying topic modeling to each unit. This method allows topic modeling on a large number of documents with limited system resources and can improve the processing speed of topic modeling. It can also significantly reduce analysis time and cost, since documents can be analyzed in each location or place without gathering all of them together. However, despite these many advantages, this method has two major problems. First, the relationship between the local topics derived from each unit and the global topics derived from the entire document set is unclear: local topics can be identified in each unit, but global topics cannot. Second, a method for measuring the accuracy of the proposed methodology must be established; that is, assuming that the global topics are the ideal answer, the deviation of the local topics from the global topics needs to be measured. Because of these difficulties, this approach has not been studied as thoroughly as other topic modeling approaches. In this paper, we propose a topic modeling approach that solves these two problems. First of all, we divide the entire document cluster (global set) into sub-clusters (local sets) and generate a reduced entire document cluster (RGS, reduced global set) that consists of delegate documents extracted from each local set. We address the first problem by mapping RGS topics to local topics. Along with this, we verify the accuracy of the proposed methodology by detecting whether documents are discerned as belonging to the same topic in the global and local results. Using 24,000 news articles, we conducted experiments to evaluate the practical applicability of the proposed methodology. In addition, through a further experiment, we confirmed that the proposed methodology can provide results similar to topic modeling on the entire corpus, and we proposed a reasonable method for comparing the results of both methods.
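
The local-to-global topic mapping can be sketched with an off-the-shelf LDA implementation. A minimal illustration, assuming scikit-learn, a toy corpus, and cosine similarity as the mapping criterion; the paper's RGS construction and mapping method may differ:

    import numpy as np
    from sklearn.feature_extraction.text import CountVectorizer
    from sklearn.decomposition import LatentDirichletAllocation

    corpus = [  # toy global set standing in for the 24,000 articles
        "stock market prices rise on earnings news",
        "investors watch market index and trading volume",
        "central bank policy moves interest rates",
        "deep learning models improve image recognition",
        "neural networks learn features from image data",
        "convolutional networks dominate vision benchmarks",
    ]
    vectorizer = CountVectorizer().fit(corpus)

    def topic_word_dist(docs, n_topics):
        """Fit LDA and return the normalized topic-word distributions."""
        lda = LatentDirichletAllocation(n_components=n_topics, random_state=0)
        lda.fit(vectorizer.transform(docs))
        return lda.components_ / lda.components_.sum(axis=1, keepdims=True)

    global_topics = topic_word_dist(corpus, n_topics=2)

    # Map each local topic to its nearest global topic by cosine similarity.
    for local_docs in (corpus[:3], corpus[3:]):
        local_topics = topic_word_dist(local_docs, n_topics=2)
        sim = (local_topics @ global_topics.T) / (
            np.linalg.norm(local_topics, axis=1, keepdims=True)
            * np.linalg.norm(global_topics, axis=1))
        print("local -> global mapping:", sim.argmax(axis=1))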

A Study on the Establishment of Comparison System between the Statement of Military Reports and Related Laws (군(軍) 보고서 등장 문장과 관련 법령 간 비교 시스템 구축 방안 연구)

  • Jung, Jiin;Kim, Mintae;Kim, Wooju
    • Journal of Intelligence and Information Systems, v.26 no.3, pp.109-125, 2020
  • The Ministry of National Defense is pushing the Defense Acquisition Program to build strong defense capabilities, and it spends more than 10 trillion won annually on defense improvement. As the Defense Acquisition Program is directly related to the security of the nation as well as the lives and property of the people, it must be carried out very transparently and efficiently by experts. However, the excessive diversification of laws and regulations related to the Defense Acquisition Program has made it challenging for many working-level officials to carry out the program smoothly; many officials reportedly discover relevant regulations they were unaware of only after they have already pushed ahead with their work. In addition, statutory statements related to the Defense Acquisition Program tend to cause serious issues even if only a single expression within a sentence is wrong. Despite this, efforts to establish a sentence comparison system to correct such issues in real time have been minimal. Therefore, this paper proposes an implementation plan for a "Comparison System between the Statements of Military Reports and Related Laws" that uses a Siamese Network-based artificial neural network, a model from the field of natural language processing (NLP), to measure the similarity between sentences that are likely to appear in Defense Acquisition Program related documents and those from related statutory provisions, in order to determine and classify the risk of illegality and to make users aware of the consequences. Various artificial neural network models (Bi-LSTM, Self-Attention, D_Bi-LSTM) were studied using 3,442 pairs of an "Original Sentence" (described in actual statutes) and an "Edited Sentence" (a sentence derived by editing the "Original Sentence"). Among the many Defense Acquisition Program related statutes, the DEFENSE ACQUISITION PROGRAM ACT, the ENFORCEMENT RULE OF THE DEFENSE ACQUISITION PROGRAM ACT, and the ENFORCEMENT DECREE OF THE DEFENSE ACQUISITION PROGRAM ACT were selected. The "Original Sentence" set consists of the 83 clauses that actually appear in these statutes and are most frequently handled by working-level officials in their work. The "Edited Sentence" set comprises 30 to 50 similar sentences per clause that are likely to appear, in modified form, in military reports. During the creation of the edited sentences, the original sentences were modified using 12 predefined rules, and the edited sentences were produced in proportion to the number of applicable rules. After conducting 1:1 sentence similarity evaluation experiments, the models could classify each "Edited Sentence" as legal or illegal with considerable accuracy. The "Edited Sentence" dataset used to train the neural network models contains a variety of actual statutory statements ("Original Sentences") characterized by the 12 rules. On the other hand, when only the "Original Sentence" and "Edited Sentence" dataset had been fed to them, the models were not able to effectively classify other sentences that appear in actual military reports; the dataset is not ample enough for the models to recognize new incoming sentences. Hence, the performance of the models was reassessed using an additional 120 newly written sentences that more closely resemble those in actual military reports while remaining associated with the original sentences. Thereafter, we confirmed that the models' performance surpassed a certain level even when they were trained merely with the "Original Sentence" and "Edited Sentence" data. If sufficient model learning is achieved through the improvement and expansion of the training data with the addition of sentences that actually appear in reports, the models will be able to better classify other sentences from military reports as legal or illegal. Based on the experimental results, this study confirms the possibility and value of building a "Real-Time Automated Comparison System between Military Documents and Related Laws". The approach developed in this experiment can identify which specific clause, among the several that appear in related laws, is most similar to a sentence that appears in Defense Acquisition Program related military reports, which helps determine whether the contents of the report sentences carry a risk of illegality when compared with the law clauses.
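
The sentence-pair similarity model can be sketched as a Siamese Bi-LSTM. A minimal PyTorch illustration, assuming hypothetical vocabulary size, dimensions, and tokenization; the paper's actual configurations (Bi-LSTM, Self-Attention, D_Bi-LSTM) are not reproduced here:

    import torch
    import torch.nn as nn

    class SiameseBiLSTM(nn.Module):
        def __init__(self, vocab_size=10000, embed_dim=128, hidden_dim=64):
            super().__init__()
            self.embed = nn.Embedding(vocab_size, embed_dim, padding_idx=0)
            self.encoder = nn.LSTM(embed_dim, hidden_dim, batch_first=True,
                                   bidirectional=True)

        def encode(self, token_ids):
            # Encode one sentence; mean-pool the Bi-LSTM outputs over time.
            out, _ = self.encoder(self.embed(token_ids))
            return out.mean(dim=1)

        def forward(self, sent_a, sent_b):
            # Both sentences pass through the same (shared-weight) encoder;
            # cosine similarity of the encodings scores the pair in [-1, 1].
            va, vb = self.encode(sent_a), self.encode(sent_b)
            return nn.functional.cosine_similarity(va, vb, dim=1)

    model = SiameseBiLSTM()
    report_sent = torch.randint(1, 10000, (1, 20))   # tokenized report sentence
    statute_sent = torch.randint(1, 10000, (1, 20))  # tokenized statute clause
    similarity = model(report_sent, statute_sent)    # high score -> likely legal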

Ensemble Learning with Support Vector Machines for Bond Rating (회사채 신용등급 예측을 위한 SVM 앙상블학습)

  • Kim, Myoung-Jong
    • Journal of Intelligence and Information Systems, v.18 no.2, pp.29-45, 2012
  • Bond rating is regarded as an important event for measuring the financial risk of companies and for determining the returns of investors. As a result, predicting companies' credit ratings by applying statistical and machine learning techniques has been a popular research topic. Statistical techniques, including multiple regression, multiple discriminant analysis (MDA), logistic models (LOGIT), and probit analysis, have traditionally been used in bond rating. However, one major drawback is that they rest on strict assumptions: linearity, normality, independence among predictor variables, and pre-existing functional forms relating the criterion variables and the predictor variables. These strict assumptions of traditional statistics have limited their application to the real world. Machine learning techniques used in bond rating prediction models include decision trees (DT), neural networks (NN), and the Support Vector Machine (SVM). In particular, SVM is recognized as a new and promising classification and regression method. SVM learns a separating hyperplane that maximizes the margin between two categories. It is simple enough to be analyzed mathematically and leads to high performance in practical applications. SVM implements the structural risk minimization principle and searches for a minimal upper bound of the generalization error. In addition, the solution of SVM may be a global optimum, so overfitting is unlikely to occur. SVM also does not require many training samples, since it builds prediction models using only the representative samples near the boundaries, called support vectors. A number of experimental studies have indicated that SVM has been successfully applied in a variety of pattern recognition fields. However, there are three major drawbacks that can degrade SVM's performance. First, SVM was originally proposed for solving binary-class classification problems. Methods for combining SVMs for multi-class classification, such as One-Against-One and One-Against-All, have been proposed, but they do not improve performance in multi-class problems as much as SVM does in binary-class classification. Second, approximation algorithms (e.g., decomposition methods and the sequential minimal optimization algorithm) can be used for effective multi-class computation to reduce computation time, but they can deteriorate classification performance. Third, the difficulty in multi-class prediction problems lies in the data imbalance problem, which occurs when the number of instances in one class greatly outnumbers that in another class. Such data sets often cause a default classifier to be built due to a skewed boundary, reducing the classification accuracy. SVM ensemble learning is one machine learning method for coping with the above drawbacks. Ensemble learning is a method for improving the performance of classification and prediction algorithms. AdaBoost is one of the most widely used ensemble learning techniques: it constructs a composite classifier by sequentially training classifiers while increasing the weight on misclassified observations through iterations, so that observations incorrectly predicted by previous classifiers are chosen more often than correctly predicted ones. Boosting thus attempts to produce new classifiers that better predict the examples for which the current ensemble's performance is poor, and in this way it can reinforce the training of misclassified observations of the minority class. This paper proposes multiclass Geometric Mean-based Boosting (MGM-Boost) to resolve the multiclass prediction problem. Since MGM-Boost introduces the notion of the geometric mean into AdaBoost, it can perform the learning process considering the geometric mean-based accuracy and errors across classes. This study applies MGM-Boost to a real-world bond rating case for Korean companies to examine its feasibility. 10-fold cross-validation was performed three times with different random seeds in order to ensure that the comparison among the three classifiers does not happen by chance. For each 10-fold cross-validation, the entire data set is first partitioned into ten equal-sized sets, and each set is in turn used as the test set while the classifier trains on the other nine sets; that is, the cross-validated folds were tested independently for each algorithm. Through these steps, we obtained results for the classifiers on each of the 30 experiments. In the comparison of arithmetic mean-based prediction accuracy between individual classifiers, MGM-Boost (52.95%) shows higher prediction accuracy than both AdaBoost (51.69%) and SVM (49.47%). MGM-Boost (28.12%) also shows higher prediction accuracy than AdaBoost (24.65%) and SVM (15.42%) in terms of geometric mean-based prediction accuracy. A t-test was used to examine whether the performance of each classifier over the 30 folds is significantly different. The results indicate that the performance of MGM-Boost is significantly different from that of the AdaBoost and SVM classifiers at the 1% level. These results mean that MGM-Boost can provide robust and stable solutions to multi-class problems such as bond rating.
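
The geometric-mean notion at the heart of MGM-Boost can be illustrated with a small example. A minimal sketch, assuming hypothetical labels; it shows only the geometric mean-based accuracy measure, not the paper's full boosting algorithm:

    import numpy as np
    from sklearn.metrics import confusion_matrix

    def geometric_mean_accuracy(y_true, y_pred):
        """Geometric mean of per-class recalls; a neglected minority class
        drags the score down far more than it drags arithmetic accuracy."""
        cm = confusion_matrix(y_true, y_pred)
        recalls = np.diag(cm) / cm.sum(axis=1)
        return recalls.prod() ** (1.0 / len(recalls))

    # Imbalanced 3-class toy data: class 0 dominates, class 1 is half-missed.
    y_true = np.array([0, 0, 0, 0, 0, 0, 1, 1, 2, 2])
    y_pred = np.array([0, 0, 0, 0, 0, 0, 1, 0, 2, 2])
    print(geometric_mean_accuracy(y_true, y_pred))  # ~0.79
    print((y_true == y_pred).mean())                # 0.90 arithmetic accuracy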

Automatic gasometer reading system using selective optical character recognition (관심 문자열 인식 기술을 이용한 가스계량기 자동 검침 시스템)

  • Lee, Kyohyuk;Kim, Taeyeon;Kim, Wooju
    • Journal of Intelligence and Information Systems, v.26 no.2, pp.1-25, 2020
  • In this paper, we suggest an application system architecture that provides an accurate, fast, and efficient automatic gasometer reading function. The system captures a gasometer image using a mobile device camera, transmits the image to a cloud server over a private LTE network, and analyzes the image to extract the character information of the device ID and gas usage amount by selective optical character recognition based on deep learning technology. In general, an image contains many types of characters, and optical character recognition technology extracts all of the character information in the image, but some applications need to ignore character types that are not of interest and focus only on specific types of characters. For example, an automatic gasometer reading system only needs to extract the device ID and gas usage amount from gasometer images to send bills to users; character strings that are not of interest, such as the device type, manufacturer, manufacturing date, and specification, are not valuable information for the application. Thus, the application has to analyze only the point-of-interest regions and specific types of characters to extract the valuable information. We adopted CNN (Convolutional Neural Network) based object detection and CRNN (Convolutional Recurrent Neural Network) technology for selective optical character recognition, which analyzes only the point-of-interest regions for selective character information extraction. We built three neural networks for the application system: the first is a convolutional neural network that detects the point-of-interest regions containing the gas usage amount and device ID character strings; the second is another convolutional neural network that transforms the spatial information of a point-of-interest region into spatial sequential feature vectors; and the third is a bi-directional long short-term memory network that converts the spatial sequential information into character strings using time-series analysis, mapping feature vectors to character strings. In this research, the point-of-interest character strings are the device ID and the gas usage amount: the device ID consists of 12 Arabic numeral characters and the gas usage amount consists of 4 to 5 Arabic numeral characters. All system components are implemented in the Amazon Web Services cloud with Intel Xeon E5-2686 v4 CPUs and an NVIDIA Tesla V100 GPU. The system architecture adopts a master-slave processing structure for efficient and fast parallel processing, coping with about 700,000 requests per day. A mobile device captures the gasometer image and transmits it to the master process in the AWS cloud. The master process runs on the Intel Xeon CPU and pushes the reading request from the mobile device onto an input queue with a FIFO (First In First Out) structure. The slave process consists of the three types of deep neural networks that conduct the character recognition process and runs on the NVIDIA GPU module. The slave process continuously polls the input queue for recognition requests; if there are requests from the master process in the input queue, the slave process converts the image in the input queue into the device ID character string, the gas usage amount character string, and the position information of the strings, returns the information to the output queue, and switches to idle mode to poll the input queue. The master process gets the final information from the output queue and delivers it to the mobile device. We used a total of 27,120 gasometer images for training, validation, and testing of the three types of deep neural networks. 22,985 images were used for training and validation, and 4,135 images were used for testing. We randomly split the 22,985 images with an 8:2 ratio into training and validation sets for each training epoch. The 4,135 test images were categorized into 5 types (normal, noise, reflex, scale, and slant): normal data are clean images, noise means an image with noise signals, reflex means an image with light reflection in the gasometer region, scale means an image with a small object size due to long-distance capturing, and slant means an image that is not horizontally level. The final character string recognition accuracies for the device ID and gas usage amount on normal data are 0.960 and 0.864, respectively.
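
The recognition stage (the second and third networks) can be sketched as a CRNN. A minimal PyTorch illustration, assuming hypothetical input sizes and layer dimensions rather than the system's actual configuration: a CNN turns the cropped point-of-interest region into a horizontal feature sequence, and a Bi-LSTM maps that sequence to per-step character logits.

    import torch
    import torch.nn as nn

    class CRNN(nn.Module):
        def __init__(self, n_classes=11):  # 10 digits + 1 blank (CTC-style)
            super().__init__()
            self.cnn = nn.Sequential(
                nn.Conv2d(1, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
                nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            )
            self.rnn = nn.LSTM(64 * 8, 128, batch_first=True,
                               bidirectional=True)
            self.fc = nn.Linear(2 * 128, n_classes)

        def forward(self, x):                     # x: (batch, 1, 32, width)
            f = self.cnn(x)                       # (batch, 64, 8, width/4)
            f = f.permute(0, 3, 1, 2).flatten(2)  # feature sequence over width
            out, _ = self.rnn(f)                  # (batch, width/4, 256)
            return self.fc(out)                   # per-step character logits

    # e.g. a cropped gas-usage region resized to 32 x 128 pixels.
    logits = CRNN()(torch.randn(1, 1, 32, 128))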

Literature Analysis of Radiotherapy in Uterine Cervix Cancer for the Processing of the Patterns of Care Study in Korea (한국에서 자궁경부암 방사선치료의 Patterns of Care Study 진행을 위한 문헌 비교 연구)

  • Choi Doo Ho;Kim Eun Seog;Kim Yong Ho;Kim Jin Hee;Yang Dae Sik;Kang Seung Hee;Wu Hong Gyun;Kim Il Han
    • Radiation Oncology Journal, v.23 no.2, pp.61-70, 2005
  • Purpose: Uterine cervix cancer is one of the most prevalent cancers among women in Korea. We analysed papers published in Korea by comparing them with Patterns of Care Study (PCS) articles from the United States and Japan for the purpose of developing and processing a Korean PCS. Materials and Methods: We searched PCS-related foreign papers on the PCS homepage (212 articles and abstracts) and in PubMed to identify the Structure and Process of the PCS. To compare these studies with Korean papers, we used the internet site 'Korean PubMed' to search 99 articles regarding uterine cervix cancer and radiation therapy. We analysed the Korean papers by comparing them with selected PCS papers regarding Structure, Process, and Outcome, and compared their items between the 1980s and the 1990s. Results: The evaluable papers treating cervix PCS items were 28 from the United States, 10 from Japan, and 73 from Korea. PCS papers from the United States and Japan commonly stratified facilities into 3~4 categories on the basis of the scale of the facilities and the numbers of patients and doctors, and researchers restricted eligible patients strictly. For the process of the study, they analysed factors regarding pretreatment staging in chronological order, treatment-related factors, factors in addition to FIGO staging, and the treatment machine. Papers from the United States dealt with racial characteristics, socioeconomic characteristics of the patients, tumor size (6), and bilaterality of parametrial or pelvic side wall invasion (5), whereas papers from Japan dealt with tumor markers. The common trend in the process of the staging work-up was the decreased use of lymphangiogram and barium enema and the increased use of CT and MRI over time. The recent subjects of the Korean papers were concurrent chemoradiotherapy (9 papers), treatment duration (4), tumor markers (B), and unconventional fractionation. Conclusion: By comparing papers among the 3 nations, we collected items for a Korean uterine cervix cancer PCS. Through consensus meetings and close communication, survey items for the cervix cancer PCS were developed to measure the structure, process, and outcome of the radiation treatment of cervix cancer. Subsequent future research will focus on the use of brachytherapy and its impact on outcome, including complications. These findings and future PCS studies will direct the development of educational programs aimed at correcting identified deficits in care.

Predicting the Direction of the Stock Index by Using a Domain-Specific Sentiment Dictionary (주가지수 방향성 예측을 위한 주제지향 감성사전 구축 방안)

  • Yu, Eunji;Kim, Yoosin;Kim, Namgyu;Jeong, Seung Ryul
    • Journal of Intelligence and Information Systems, v.19 no.1, pp.95-110, 2013
  • Recently, the amount of unstructured data being generated through a variety of social media has been increasing rapidly, resulting in an increasing need to collect, store, search for, analyze, and visualize this data. Such data cannot be handled appropriately by the traditional methodologies usually used for analyzing structured data because of its vast volume and unstructured nature. In this situation, many attempts are being made to analyze unstructured data, such as text files and log files, through various commercial and noncommercial analytical tools. Among the various contemporary issues dealt with in the literature on unstructured text data analysis, the concepts and techniques of opinion mining have been attracting much attention from pioneering researchers and business practitioners. Opinion mining, or sentiment analysis, refers to a series of processes that analyze participants' opinions, sentiments, evaluations, attitudes, and emotions about selected products, services, organizations, social issues, and so on. In other words, many attempts based on various opinion mining techniques are being made to resolve complicated issues that could not have been solved by existing traditional approaches. One of the most representative attempts using opinion mining may be the recent research that proposed an intelligent model for predicting the direction of the stock index. This model works mainly on the basis of opinions extracted from an overwhelming number of economic news reports. News content published in various media is obviously a traditional example of unstructured text data: every day, a large volume of new content is created, digitalized, and subsequently distributed to us via online or offline channels. Many studies have revealed that we make better decisions on political, economic, and social issues by analyzing news and other related information. In this sense, we expect to predict the fluctuation of stock markets partly by analyzing the relationship between economic news reports and the pattern of stock prices. So far, in the literature on opinion mining, most studies, including ours, have utilized a sentiment dictionary to elicit sentiment polarity or sentiment values from a large number of documents. A sentiment dictionary consists of pairs of selected words and their sentiment values; sentiment classifiers refer to the dictionary to formulate the sentiment polarity of words, of sentences in a document, and of the whole document. However, most traditional approaches share a common limitation in that they do not consider the flexibility of sentiment polarity: the sentiment polarity or sentiment value of a word is fixed and cannot be changed in a traditional sentiment dictionary. In the real world, however, the sentiment polarity of a word can vary depending on the time, situation, and purpose of the analysis, and it can even be contradictory in nature. This flexibility of sentiment polarity motivated us to conduct this study. In this paper, we argue that sentiment polarity should be assigned not merely on the basis of the inherent meaning of a word but on the basis of its ad hoc meaning within a particular context. To implement our idea, we present an intelligent investment decision-support model based on opinion mining that performs the scraping and parsing of massive volumes of economic news on the web, tags sentiment words, classifies the sentiment polarity of the news, and finally predicts the direction of the next day's stock index. In addition, we applied a domain-specific sentiment dictionary instead of a general-purpose one to classify each piece of news as either positive or negative. For performance evaluation, we conducted intensive experiments and investigated the prediction accuracy of our model. For the experiments to predict the direction of the stock index, we gathered and analyzed 1,072 articles about stock markets published by "M" and "E" media between July 2011 and September 2011.
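
The dictionary-based classification step can be sketched as follows. A minimal illustration, assuming a hypothetical domain-specific dictionary; the study's actual dictionary and tagging pipeline are more elaborate:

    # Hypothetical stock-market dictionary: each entry pairs a word with a
    # sentiment value assigned for this domain, not its general meaning.
    domain_dictionary = {
        "surge": +2, "rally": +2, "beat": +1,     # positive in this domain
        "plunge": -2, "default": -2, "miss": -1,  # negative in this domain
    }

    def classify_news(tokens, dictionary):
        """Sum sentiment values of matched tokens; the sign of the total
        gives the document-level polarity."""
        score = sum(dictionary.get(t.lower(), 0) for t in tokens)
        return "positive" if score > 0 else "negative" if score < 0 else "neutral"

    headline = "Chip stocks surge as exports beat forecasts"
    print(classify_news(headline.split(), domain_dictionary))  # -> positive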