Resolving the 'Gray sheep' Problem Using Social Network Analysis (SNA) in Collaborative Filtering (CF) Recommender Systems (소셜 네트워크 분석 기법을 활용한 협업필터링의 특이취향 사용자(Gray Sheep) 문제 해결)

  • Kim, Minsung;Im, Il
    • Journal of Intelligence and Information Systems / v.20 no.2 / pp.137-148 / 2014
  • Recommender systems have become one of the most important technologies in e-commerce. For many consumers, the ultimate reason to shop online is to reduce the effort of searching for information and making purchases, and recommender systems are a key technology for serving this need. Many past studies of recommender systems have been devoted to developing and improving recommendation algorithms, and collaborative filtering (CF) is known to be the most successful of them. Despite its success, however, CF has several shortcomings, such as the cold-start, sparsity, and gray sheep problems. To generate recommendations, ordinary CF algorithms require evaluations or preference information directly from users. For new users who have no evaluations or preference information, therefore, CF cannot produce recommendations (cold-start problem). As the numbers of products and customers increase, the scale of the data grows rapidly and most of the data cells are empty. Such a sparse dataset makes computing recommendations extremely hard (sparsity problem). Since CF is based on the assumption that there are groups of users sharing common preferences or tastes, CF becomes inaccurate if there are many users with rare and unique tastes (gray sheep problem). This study proposes a new algorithm that uses Social Network Analysis (SNA) techniques to resolve the gray sheep problem. We use 'degree centrality' in SNA to identify users with unique preferences (gray sheep). Degree centrality in SNA refers to the number of direct links to and from a node. In a network of users connected through common preferences or tastes, those with unique tastes have fewer links to other users (nodes) and are isolated from them. Gray sheep can therefore be identified by calculating the degree centrality of each node. We divide the dataset into two parts, gray sheep and others, based on the degree centrality of the users.
Then, different similarity measures and recommendation methods are applied to these two datasets. The algorithm in more detail is as follows. Step 1: Convert the initial data, which is a two-mode network (user to item), into a one-mode network (user to user). Step 2: Calculate the degree centrality of each node and separate out the nodes whose degree centrality is lower than a preset threshold. The threshold value is determined by simulation such that the accuracy of CF on the remaining dataset is maximized. Step 3: An ordinary CF algorithm is applied to the remaining dataset. Step 4: Since the separated dataset consists of users with unique tastes, an ordinary CF algorithm cannot generate recommendations for them, so a 'popular item' method is used to generate recommendations for these users. The F measures of the two datasets are weighted by their numbers of nodes and summed to give the final performance metric. To test the performance improvement from the new algorithm, an empirical study was conducted using a publicly available dataset, the MovieLens data from the GroupLens research team. We used 100,000 evaluations by 943 users of 1,682 movies. The proposed algorithm was compared with an ordinary CF algorithm using the 'Best-N-neighbors' and 'Cosine' similarity methods. The empirical results show that the F measure improved by about 11% on average when the proposed algorithm was used. Past studies to improve CF performance typically used additional information other than users' evaluations, such as demographic data. Some studies applied SNA techniques as a new similarity metric. This study is novel in that it uses SNA to separate the dataset. It shows that the performance of CF can be improved, without any additional information, when SNA techniques are used as proposed. This study has several theoretical and practical implications. It shows empirically that the characteristics of a dataset can affect the performance of CF recommender systems, which helps researchers understand the factors affecting CF performance. It also opens the door to future studies that apply SNA to CF in order to analyze dataset characteristics. In practice, this study provides guidelines for improving the performance of CF recommender systems with a simple modification.
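The four steps described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. Several details are assumptions made only for illustration: ratings are reduced to a set of liked items per user, two users are linked when they share at least `min_shared` items, the weighted F measure is normalized by the total number of users, and the ordinary CF step (Step 3) is left out. All function names and the toy data are hypothetical.

```python
from collections import Counter
from itertools import combinations


def user_projection(ratings, min_shared=2):
    """Step 1: project the user-item (two-mode) data onto a user-user network.

    Assumed link rule: two users are connected when they share at least
    `min_shared` liked items.
    """
    users = sorted(ratings)
    links = {u: set() for u in users}
    for u, v in combinations(users, 2):
        if len(ratings[u] & ratings[v]) >= min_shared:
            links[u].add(v)
            links[v].add(u)
    return links


def split_gray_sheep(links, threshold):
    """Step 2: users whose degree centrality is below `threshold` are gray sheep."""
    gray = {u for u, neighbours in links.items() if len(neighbours) < threshold}
    return gray, set(links) - gray


def popular_items(ratings, exclude, n):
    """Step 4: recommend the n most frequently liked items not already in `exclude`."""
    counts = Counter(item for items in ratings.values() for item in items)
    ranked = sorted(counts, key=lambda item: (-counts[item], item))
    return [item for item in ranked if item not in exclude][:n]


def weighted_f(f_main, n_main, f_gray, n_gray):
    """Combine the two F measures, weighted by group size (normalization assumed)."""
    return (f_main * n_main + f_gray * n_gray) / (n_main + n_gray)


# Hypothetical toy data: u4 shares no items with anyone else.
ratings = {
    "u1": {"a", "b", "c"},
    "u2": {"a", "b", "d"},
    "u3": {"b", "c", "d"},
    "u4": {"x", "y"},
}
links = user_projection(ratings)
gray, others = split_gray_sheep(links, threshold=1)
recs_for_u4 = popular_items(ratings, exclude=ratings["u4"], n=2)
```

In this toy network u4 has degree 0 and is separated as a gray sheep, so it receives popular items rather than neighbour-based recommendations. In the study the threshold is tuned by simulation; here it is simply fixed at 1.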

  • The Study about 「The Discourse on the Constitutional Symptoms and Diseases」 of Sasangin on the 『Dongyi Suse Bowon』 (『동의수세보원(東醫壽世保元)』 태소음양인(太少陰陽人)의 「병증론(病證論)」에 관(關)한 연구(硏究))

    • Lee, Su-kyung;Song, Il-byung
      • Journal of Sasang Constitutional Medicine / v.11 no.2 / pp.1-26 / 1999
    • This paper was written to understand the constitutional symptoms and diseases of each constitution from two aspects. The first was to trace how the constitutional symptoms and diseases developed from those of Oriental medicine, through "Dongyi Bogam" and original texts such as "Shanghanlun". The second was to analyze the constitutional diseases in light of Lee Je-ma's own understanding of human beings and society, on which the "Dongyi Suse Bowon" is based. The original concepts of 'The Interior Disease' and 'The Exterior Disease' were based on the Nature and the Emotion, the Environmental Frames and the Human Affairs, and the Ears Eyes Nose Mouth and the Lung Spleen Liver Kidney. The exterior diseases were caused by the abilities of the ears to listen, the eyes to see, the nose to smell, and the mouth to taste with respect to the environmental frames, which relate to one's perception of society. The interior diseases were caused by the abilities of the lung to study, the spleen to ask, the liver to think, and the kidney to judge with respect to human affairs, which concern the relationship between oneself and others. The titles of the constitutional diseases were therefore named from these views in his first writing of "Dongyi Suse Bowon" in 1894. The titles of the Taeyangin diseases, 'The Lumbar Vertebrae Disease Induced by Exopathogen' and 'The Small Intestine Disease Induced by Endopathogen', still remain as in the first writing. But the titles of the constitutional diseases were rewritten into their present form in 1900. To express the pathology and mechanism of the constitutional diseases exactly, he rewrote the titles to include the sites where the diseases manifest, the febrile and cold symptoms, and the different congenital formations of the organs. The exterior and interior diseases had three characteristics. The first was that the exterior disease, injured by the nature, tended to progress slowly, whereas the interior disease, injured by the emotion, tended to progress rapidly.
The second was that the interior and exterior diseases were not separate; rather, one influenced the other, and they appeared together as one disease when the diseases continued for a long time. The third was that even when the diseases were contracted together, the combined disease included the initial disease. The symptoms in ordinary times were the origin of, and the clue to recognizing, the constitutional symptoms and diseases. They made it possible to establish a constitutional medicine that treats patients differently according to constitution. It differed from Traditional Chinese Medicine in the presentation of diseases in the following ways. The first was that disease progressed to the next stage from the symptoms in ordinary times. The second was that, under the same disease, each constitution had different symptoms arising from its symptoms in ordinary times. The third was that, within the same constitution, the manifestations of disease differed from the symptoms in ordinary times. Most importantly, Lee Je-ma classified these symptoms in ordinary times into four categories and presented constitutional symptoms and constitutional diseases. For him, the four categories were the method for understanding human beings and diseases. When the symptoms and diseases of Sasang Constitutional Medicine were compared with those of Traditional Chinese Medicine, the constitutional diseases of "Dongyi Suse Bowon" could be classified into two groups. The first group comprised unique diseases and symptoms, absent from Traditional Chinese Medicine, that were established by Lee Je-ma. These included the diseases of taeyangin, the exterior disease of taeumin, and the exterior disease of soyangin. The second group used unique methods of treatment, absent from Traditional Chinese Medicine, that were established by Lee Je-ma.
This included the interior disease of taeumin, the delirium diseases from the MangYin of soyangin, and the treatment to help the Yang-Qi ascend and to supplement the qi in the exterior disease of soeumin. In particular, the diseases of taeyangin and taeumin caused by metabolic disorders of Qi-Yack(氣液) were a great achievement in establishing the constitutional symptoms and diseases. The discourse on taeyangin diseases presented his original approach to recognizing symptoms and diseases through the Shin Gi Hyul Jeong(神氣血精) and the Qi-Yack. The discourse on taeumin diseases presented the dispersal of Qi-Yack through the forward and backward of sweat. The discourse on soyangin diseases presented the sweat of the hands and feet, which showed that the yin qi of the spleen descended to the yin qi of the kidney, and the bowel movement, which showed that the yang qi of the large intestine ascended to the head, face, and four extremities. The discourse on soeumin diseases presented the Jueyin syndrome without abdominal pain and diarrhea as an exterior disease and emphasized the nervous mind. Finally, the classification into exterior and interior diseases was based not on pharmacology but on the symptoms and diseases of each constitution.

    Development of Quantification Methods for the Myocardial Blood Flow Using Ensemble Independent Component Analysis for Dynamic $H_2^{15}O$ PET (동적 $H_2^{15}O$ PET에서 앙상블 독립성분분석법을 이용한 심근 혈류 정량화 방법 개발)

    • Lee, Byeong-Il;Lee, Jae-Sung;Lee, Dong-Soo;Kang, Won-Jun;Lee, Jong-Jin;Kim, Soo-Jin;Choi, Seung-Jin;Chung, June-Key;Lee, Myung-Chul
      • The Korean Journal of Nuclear Medicine / v.38 no.6 / pp.486-491 / 2004
    • Purpose: Factor analysis and independent component analysis (ICA) have been used for handling dynamic image sequences. The theoretical advantages of a newly suggested ICA method, ensemble ICA, led us to consider applying it to the analysis of dynamic myocardial $H_2^{15}O$ PET data. In this study, we quantified patients' blood flow using the ensemble ICA method. Materials and Methods: Twenty subjects underwent $H_2^{15}O$ PET scans using an ECAT EXACT 47 scanner and myocardial perfusion SPECT using a Vertex scanner. After transmission scanning, dynamic emission scans were initiated simultaneously with the injection of $555{\sim}740$ MBq $H_2^{15}O$. Hidden independent components can be extracted from the observed mixed data (PET image) by means of ICA algorithms. Ensemble learning is a variational Bayesian method that provides an analytical approximation to the parameter posterior using a tractable distribution. The variational approximation forms a lower bound on the ensemble likelihood, and the lower bound is maximized by minimizing the Kullback-Leibler divergence between the true posterior and the variational posterior. In this study, the posterior pdf was approximated by a rectified Gaussian distribution to incorporate a non-negativity constraint, which suits dynamic images in nuclear medicine. Blood flow was measured in 9 regions: the apex, four areas in the mid wall, and four areas in the base wall. Myocardial perfusion SPECT scores and angiography results were compared with the regional blood flow. Results: Major cardiac components were separated successfully by the ensemble ICA method, and blood flow could be estimated in 15 of the 20 patients. Mean myocardial blood flow was $1.2{\pm}0.40$ ml/min/g at rest and $1.85{\pm}1.12$ ml/min/g under stress. Blood flow values obtained by one operator on two different occasions were highly correlated (r=0.99).
In the myocardium component image, the image contrast between the left ventricle and the myocardium was 1:2.7 on average. Perfusion reserve differed significantly between regions with and without stenosis detected by coronary angiography (P<0.01). In the 66 segments with stenosis confirmed by angiography, the segments with a reversible perfusion decrease on perfusion SPECT showed lower perfusion reserve values on $H_2^{15}O$ PET. Conclusions: Myocardial blood flow could be estimated using an ICA method with ensemble learning. We suggest that ensemble ICA incorporating a non-negativity constraint is a feasible method for handling dynamic image sequences obtained by nuclear medicine techniques.
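The relation between the lower bound and the Kullback-Leibler divergence mentioned in this abstract follows from a standard variational identity. The notation below is not in the original and is assumed only for illustration: $X$ is the observed dynamic PET data, $\theta$ collects the unknown sources and parameters, and $q(\theta)$ is the tractable variational posterior.

$$\log p(X) = \mathcal{L}(q) + \mathrm{KL}\big(q(\theta)\,\|\,p(\theta \mid X)\big), \qquad \mathcal{L}(q) = \int q(\theta)\,\log\frac{p(X,\theta)}{q(\theta)}\,d\theta$$

Because the KL term is non-negative, $\mathcal{L}(q) \le \log p(X)$. Because $\log p(X)$ does not depend on $q$, any increase in $\mathcal{L}(q)$ is matched by an equal decrease in the divergence.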

    Anti-climacterium Effects of Gagamguibiondam-tang in Ovariectomized Rats (난소적출로 유발된 랫트 갱년기 장애에 대한 가감귀비온담탕의 생리활성 효과 평가)

    • Han, Sang-Gyeom;Kim, Dong-Chul
      • The Journal of Korean Obstetrics and Gynecology / v.30 no.4 / pp.18-44 / 2017
    • Purpose: The objective of this study was to observe the anti-climacterium activity of Gagamguibiondam-tang (GGOT) in ovariectomized (OVX) rats, a well-documented rodent model that resembles the postmenopausal climacterium symptoms of women, including cardiovascular diseases, obesity, hyperlipidemia, osteoporosis, organ steatosis, and mental disorders. Methods: Anti-climacteric effects were evaluated in three categories: 1) anti-obese, 2) anti-uterine atrophy, and 3) anti-osteoporotic effects. Five groups of 8 rats each were used: sham control, OVX control, and groups administered GGOT at 500, 250, and 125 mg/kg. Twenty-eight days after bilateral OVX surgery, GGOT was administered orally once a day for 84 days. Changes in body weight and weight gain during the experimental period, serum estradiol levels, abdominal fat pad and uterus weights, and the histopathology of the abdominal fat pads (total thickness and mean adipocyte diameter) and uterus (total, epithelial, and mucosal thickness, and percentage of uterine gland regions) were then observed to assess anti-obese and estrogenic effects. In addition, the wet, dry, and ash weights of the femur, tibia, and fourth or fifth lumbar vertebra (L4 or L5), bone mineral density (BMD), bone strength (failure load), serum osteocalcin and bone-specific alkaline phosphatase (bALP) contents, and histological and histomorphometrical measures of bone mass and structure and of bone resorption were monitored to assess anti-osteoporotic activity. Results: As a result of OVX, noticeable increases in body weight and weight gain, food and water consumption, the weight of abdominal fat pads deposited in the dorsal abdominal cavity, and serum osteocalcin levels were observed, together with decreases in uterus, femur, tibia, and L5 weights and in serum bALP and estradiol levels.
In addition, marked hypertrophic changes of adipocytes in the deposited abdominal fat pads, disuse atrophic changes of the uterus, and decreases in the bone mass and structure of the femur, tibia, and L4 were observed in OVX control rats compared with sham-operated control rats, along with dramatic increases in the bone resorption markers Ocn and OS/BS on histopathological and histomorphometrical analysis. These findings suggest that the estrogen-deficient climacterium symptoms, obesity and osteoporosis, were induced by OVX. However, these symptoms were significantly inhibited by 84 days of continuous oral treatment with GGOT at 500, 250, and 125 mg/kg, and the inhibitory activity on the OVX-induced climacterium signs was clearly dose-dependent. Conclusion: The results suggest that oral administration of GGOT at 500, 250, and 125 mg/kg has clear dose-dependent, favorable anti-climacterium effects (estrogenic, anti-obese, and anti-osteoporotic activities) in OVX rats.

    Effects of High Glucose and Advanced Glycosylation Endproducts(AGE) on the in vitro Permeability Model (당과 후기당화합물의 생체 외 사구체여과율 모델에 대한 역할)

    • Lee Jun-Ho;Ha Tae-Sun
      • Childhood Kidney Diseases / v.10 no.1 / pp.8-17 / 2006
    • Purpose: We describe the changes in rat glomerular epithelial cells exposed to high levels of glucose and advanced glycosylation endproducts (AGE) in an in vitro diabetic condition. We expected morphological alterations of the glomerular epithelial cells and changes in permeability, and sought to relate the results to a mechanism of proteinuria in diabetes mellitus (DM). Methods: To prepare AGE, we mixed 0.2 M glucose-6-phosphate solution with PBS (pH 7.4) containing 50 mg/mL BSA and protease inhibitor. BSA was used as the control. We prepared and labeled five culture dishes as follows: B5, normal glucose (5 mM) + BSA; B30, high glucose (30 mM) + BSA; A5, normal glucose (5 mM) + AGE; A30, high glucose (30 mM) + AGE; A/B 25, normal glucose (5 mM) + 25 mM mannitol (osmotic control). After incubation periods of two days and seven days, we measured the amount of heparan sulfate proteoglycan (HSPG) in each dish by ELISA and compared it with that of the B5 dish on the 2nd and 7th incubation days. We observed the morphological changes of the epithelial cells in each culture dish using scanning electron microscopy (SEM). We performed a permeability assay of glomerular epithelial cells on a cellulose semi-permeable membrane, measuring by sandwich ELISA the amount of BSA filtered through the apical chamber over 2 hours. Results: On the 2nd incubation day, there was no significant difference in the amount of HSPG among the 5 culture dishes. On the 7th incubation day, however, the amount of HSPG had increased by 10% compared with the B5 dish on the 2nd day in all dishes except A30 (P<0.05). Compared with the B5 dish on the 7th day, the amount of HSPG in the A30 and B30 dishes decreased to 77.8% and 95.3% of baseline, respectively (P>0.05). In the osmotic control group (A/B 25), no significant correlation was observed. On SEM, we saw separated intercellular junctions and fused microvilli of glomerular epithelial cells in the culture dishes to which AGE was added.
In the permeability assay, the permeability of BSA increased by 19% only in the A30 dish on the 7th day compared with the B5 dish on the 7th day (P<0.05). Conclusion: We observed not only that a high level of glucose and AGE each decrease the production of HSPG by glomerular epithelial cells in vitro, but also that their effects are additive. However, the role of AGE is greater than that of glucose. These results seem to correlate with defects in the charge-selective barrier. The morphological changes, disruption of intercellular junctions and fused microvilli of glomerular epithelial cells, seem to correlate with defects in the size-selective barrier. Together, these findings can explain the increased permeability of glomerular epithelial units in the in vitro diabetic condition.

    Temporal and Spatial Distribution of Benthic Polychaetous Communities in Seomjin River Estuary (섬진강 하구역 저서다모류군집의 시·공간 분포)

    • Kang, Sung Hyo;Lee, Jung Ho;Park, Sung Wan;Shin, Hyun Chool
      • The Sea:JOURNAL OF THE KOREAN SOCIETY OF OCEANOGRAPHY / v.19 no.4 / pp.243-255 / 2014
    • This study investigated the relations between benthic environments and the benthic polychaetous community from April 2012 to February 2013. Twenty-four stations were selected sequentially along the Seomjin River Estuary from the northern part of Gwangyang Bay. Based on salinity, water temperature, dissolved oxygen (DO), and pH, the study area could be divided into three characteristic zones: the Saline Water Zone (SWZ), the Brackish Water Zone (BWZ), and the Fresh Water Zone (FWZ). Salinity was above 30.0 psu in SWZ, decreased drastically toward inland in BWZ, and was nearly zero psu in FWZ. SWZ had distinct environmental characteristics: water temperature showed little seasonal change, DO was the lowest among the three zones, and pH remained consistent without seasonal fluctuation. In FWZ, on the other hand, water temperature showed high seasonal fluctuation, DO was the highest among the three zones, and pH fluctuated greatly. In the sedimentary environment, mud, sand, and sand/gravel were the dominant deposits in SWZ, BWZ, and FWZ, respectively. Organic matter content and AVS in surface sediment were high in SWZ, while Chl-a content was high in FWZ. The study area thus showed a marked environmental contrast: FWZ had coarse sediment, low salinity, low organic matter content, and low AVS, whereas SWZ had fine sediment, high salinity, and high organic matter content and AVS. The species number and mean density of the benthic polychaete community were highest in SWZ, decreased drastically in BWZ, and were lowest in FWZ. Six taxa were dominant polychaetes, each accounting for more than 5.0% of individuals. Lumbrineris longifolia, Prionospio cirrifera, and Tharyx sp. occurred as the main dominant species throughout the study period, while Hediste sp., Praxillella affinis, and Tylorrhynchus sp. were dominant in some seasons.
The habitats of the dominant species were clearly separated. Representative species in SWZ were Lumbrineris longifolia, Tharyx sp., and Mediomastus sp. Species occurring widely across SWZ and BWZ were Prionospio cirrifera, Heteromastus filiformis, and Aricidea sp. Characteristic species in FWZ were Tylorrhynchus sp. and Hediste sp. Cluster analysis and nMDS based on the species composition of the polychaetous community identified distinct station groups in SWZ and FWZ, while stations in BWZ were subdivided into several groups by season. Pearson's correlation analysis and PCA between the benthic environments and the ecological characteristics of the polychaetous community showed that salinity, sediment composition, organic content, and dissolved oxygen played a role in determining the temporal and spatial distribution of ecological characteristics such as species number, mean density, abundance of main species, and ecological indices.
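The dominance criterion used in this abstract (taxa making up more than 5.0% of all individuals) is a simple share calculation. The Python sketch below illustrates it. The function name and the counts are hypothetical and are not data from the study.

```python
def dominant_taxa(counts, threshold=0.05):
    """Return each taxon whose share of all individuals exceeds `threshold`."""
    total = sum(counts.values())
    shares = {taxon: n / total for taxon, n in counts.items()}
    return {taxon: share for taxon, share in shares.items() if share > threshold}


# Hypothetical individual counts pooled over stations (illustration only).
counts = {
    "Lumbrineris longifolia": 300,
    "Tharyx sp.": 150,
    "Hediste sp.": 40,
    "rare taxon": 10,
}
dominant = dominant_taxa(counts)
```

With these made-up counts the first three taxa pass the 5% cut-off and the last does not.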

    Chromaticity and Brown Pigment Patterns of Soy Sauce and UHYUKJANG, Korean Traditional Fermented Soy Sauce (간장과 어육장의 색도 및 갈색색소 패턴)

    • Kim, Ji-Sang;Moon, Gap-Soon;Lee, Young-Soon
      • Korean journal of food and cookery science / v.22 no.5 s.95 / pp.642-649 / 2006
    • The browning of soy sauce is caused by the amino-carbonyl reaction between amino compounds and reducing sugars. Only a few studies have investigated the formation of melanoidins in UHYUKJANG. The objectives of this study were to analyze the brown pigment of UHYUKJANG and to investigate the characteristics of UHYUKJANG in comparison with soy sauce and model melanoidins. The samples were ripened for 0, 60, 120, 180, 240, 300 and 360 days at 4$^{\circ}C$ and 20$^{\circ}C$. The pH was measured, along with the absorbance at 420 nm, the absorbance ratio of 400 to 500 nm, and UV-VIS spectra as indices of color intensity. Additionally, the L, a and b values of the samples and the amount of 3-deoxyglucosone (3DG) in the samples were measured. The pH of both soy sauce (from 6.26 to 5.52) and UHYUKJANG (from 6.13 to 5.11) decreased rapidly during the first 60 days of aging and was also affected by storage temperature. The absorbance of the samples at 420 nm increased during aging and reached its maximum after 180 days, regardless of sample and temperature. The absorbance ratio, on the other hand, indicated that the intensity of brown color increased with aging period (soy sauce: 1.37 to 5.29, UHYUKJANG: 1.37 to 5.02). The L value of soy sauce increased during aging, peaked after 240 days at 4$^{\circ}C$ and 180 days at 20$^{\circ}C$, and decreased thereafter. The L value of UHYUKJANG did not differ significantly with aging period or temperature. The b value showed no significant change during aging, but the a value increased until 120 days of aging in all samples except UHYUKJANG at 20$^{\circ}C$. The average amount of 3DG separated from soy sauce was 5.65 mg%, and that from UHYUKJANG was 3.74 mg%. These results indicate that the browning of UHYUKJANG was also caused by melanoidins produced by the amino-carbonyl reaction during fermentation.

    Development of Information Extraction System from Multi Source Unstructured Documents for Knowledge Base Expansion (지식베이스 확장을 위한 멀티소스 비정형 문서에서의 정보 추출 시스템의 개발)

    • Choi, Hyunseung;Kim, Mintae;Kim, Wooju;Shin, Dongwook;Lee, Yong Hun
      • Journal of Intelligence and Information Systems / v.24 no.4 / pp.111-136 / 2018
    • In this paper, we propose a methodology for extracting answer information for queries from various types of unstructured documents collected from multiple sources on the web, in order to expand a knowledge base. The proposed methodology consists of the following steps. 1) Collect relevant documents from Wikipedia, Naver encyclopedia, and Naver news for queries separated into "subject-predicate" form, and classify the suitable documents. 2) Determine whether each sentence is suitable for information extraction and derive its confidence. 3) Based on the predicate feature, extract the information from the suitable sentences and derive the overall confidence of the extraction result. To evaluate the performance of the information extraction system, we selected 400 queries from SK Telecom's artificial intelligence speaker. The proposed model showed higher performance indices than the baseline model. The contribution of this study is a sequence tagging model based on bi-directional LSTM-CRF that uses the predicate feature of the query; with it, we developed a robust model that maintains high recall even on various types of unstructured documents collected from multiple sources. Information extraction for knowledge base expansion must take into account the heterogeneous characteristics of source-specific document types. The proposed methodology extracted information effectively from various types of unstructured documents compared with the baseline model. Previous research has the limitation that performance is poor when extracting information from document types different from the training data.
In addition, by predicting the suitability of documents and sentences for information extraction before the extraction step, this study can prevent unnecessary extraction attempts on documents that do not contain the answer. It is meaningful in that it provides a method by which precision can be maintained even in an actual web environment. Information extraction for knowledge base expansion targets unstructured documents on the real web, so there is no guarantee that a document contains the correct answer. When question answering is performed on the real web, previous machine reading comprehension studies show low precision because they frequently attempt to extract an answer even from documents that contain no correct answer. The policy of predicting the suitability of documents and sentences for information extraction is meaningful in that it helps maintain extraction performance even in a real web environment. The limitations of this study and future research directions are as follows. The first concerns data preprocessing. In this study, the unit of knowledge extraction is determined through morphological analysis based on the open-source KoNLPy Python package, and extraction can fail when the morphological analysis is not performed properly. To improve the extraction results, a more advanced morpheme analyzer needs to be developed. The second concerns entity ambiguity. The information extraction system of this study cannot distinguish identical names that refer to different entities. If several people with the same name appear in the news, the system may not extract information about the person intended by the query.
Future research needs to take measures to distinguish people with the same name. The third concerns the evaluation query data. In this study, we selected 400 user queries collected from SK Telecom's interactive artificial intelligence speaker to evaluate the performance of the information extraction system. We developed an evaluation dataset of 2,800 documents (400 questions × 7 articles per question: 1 Wikipedia, 3 Naver encyclopedia, 3 Naver news) by judging whether each document contains a correct answer. To ensure the external validity of the study, it is desirable to evaluate the system on more queries, although this is a costly activity that must be done manually, and future research needs to do so. It is also necessary to develop a Korean benchmark dataset for information extraction for queries over multi-source web documents, to build an environment in which results can be evaluated more objectively.
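The suitability check that precedes extraction in this abstract can be pictured as a gated pipeline. The Python sketch below is a rough illustration only, not the authors' system. The scoring functions are passed in as placeholders for the trained models, the 0.5 thresholds are arbitrary, and combining the three confidences by multiplication is an assumption. All names are hypothetical.

```python
def extract_answers(documents, doc_score, sent_score, extract,
                    doc_min=0.5, sent_min=0.5):
    """Run extraction only on documents and sentences judged suitable.

    documents:  list of documents, each a list of sentence strings
    doc_score:  callable, document -> suitability in [0, 1]
    sent_score: callable, sentence -> suitability in [0, 1]
    extract:    callable, sentence -> (answer or None, confidence)
    """
    results = []
    for doc in documents:
        d_conf = doc_score(doc)
        if d_conf < doc_min:          # skip documents unlikely to hold the answer
            continue
        for sentence in doc:
            s_conf = sent_score(sentence)
            if s_conf < sent_min:     # skip unsuitable sentences
                continue
            answer, a_conf = extract(sentence)
            if answer is not None:
                # Assumed combination rule: product of the three confidences.
                results.append((answer, d_conf * s_conf * a_conf))
    return sorted(results, key=lambda r: -r[1])


# Toy stand-ins for the trained models (illustration only).
docs = [["Kim was born in Seoul", "Kim likes tea"], ["Weather is sunny"]]
doc_score = lambda doc: 1.0 if any("Kim" in s for s in doc) else 0.0
sent_score = lambda s: 0.9 if "born" in s else 0.1
extract = lambda s: (s.split()[-1], 0.8)

answers = extract_answers(docs, doc_score, sent_score, extract)
```

In this toy run the second document and the second sentence are never passed to the extractor. That is the mechanism the abstract credits with keeping precision up when many retrieved documents contain no answer.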

    A Study on the Basic Planning of the Nam-Hae Sin-Sa Architecture (남해신사 기본계획에 따른 신당건축 고찰)

    • Kim, Sang Tae;Jang, Hun Duc
      • Korean Journal of Heritage: History & Science / v.42 no.2 / pp.62-85 / 2009
    • The Nam-Hae Sin-sa, the South Sea shrine in Yeong-Am, Korea, was a national institution for public peace and bliss. It was excavated in 2000, and the shrine and the 3-way-gate were reconstructed in 2001. The Hae Sin-sa, or Sea shrines, are places for religious service comprising the Nam-Hae Sin-sa, the Dong-Hae Myo, and the Seo-Hae Dan. The Dong-Hae Myo has been reconstructed, but the restored shrine and 3-way-gate of the Nam-Hae Sin-sa do not fully match the 2000 excavation plan. A new reconstruction was therefore studied through the related literature, analysis of historical maps and excavation results, interviews with the people concerned, and case studies. This research summarizes its analysis of the Plan of the Nam-Hae Sin-sa Reconstruction as follows. 1. The Nam-Hae Sin-sa was an institution for religious service under direct national management and represents the shrines for public peace and bliss dedicated to the Mountain, the Sea, and the River. In particular, the Nam-Hae Sin-sa occupied an important position at the pivot of international trade with China and Japan, and served as a main shrine together with another on Mt. Ji-ri San. 2. The Sea shrines were called Nam-Hae Sin-sa (the South Sea shrine), Dong-Hae Myo (the East Sea shrine), and Seo-Hae Dan (the West Sea shrine). The related literature and historical maps show, however, that the name of the South Sea shrine changed from Nam-Hae Sin-sa in the early Chosun period to Nam-Hae Dang in the later Chosun period. Like the Seo-Hae Dan, it was built as a Dan, a flat raised floor without buildings, and changed to the Sa-Dang type with the addition of buildings. 3. The historical maps of the Hae Sin-sa show the roof types: the Mat-bae roof was used at the Dong-Hae Myo, while the Pal-jak roof appears at the Seo-Hae Dan and the Nam-Hae Sin-sa. 4.
According to the analysis of Yong-Ch'uck the unit length, Nam-Hae Sin-sa was reconstructed in the period of Koryo on large scale, but it was restored in the Chosun on middle scale. And the Unit of Yong Ch'uck was changed into Yeong-jo Ch'uck in the period of Chosun. 5. As the results, The Plan of the Nam-Hae Sin-sa Reconstruction designed the new shrine into the 3 Kan front and the 2 Kan side with 3:2 scale. An-ch'o-gong with Yong-du and Yong Mi the ornaments represents head and tail of dragon, the Un-gong and the ornament of Pa-ryun-dae-gong in the building, and the Ch'ung-ryang of the Yong-du show the image of the institution for religious service for the god of the sea who look like dragon. The inner gate building and the main entrance were designed as same plan and scale as Hyang-gyo, the Korean Traditional School and Shrine of Confucianism, on the basis of results of excavation. Raise the 3-tall gate of the main entrance with harmony of the scale and the shape, because the side of gate building has the Mat-bae roof. 6. This research shows that Plan of the Nam-Hae Sin-sa Reconstruction is composed into shrine space and reservation space from the main entrance to inner gate and shrine like Jung-ak Dan in the Mt. Gye-ryong San, and it also informs the well in the west side of Sin-sa is an important factor of the plan of shrine architecture.

    Query-based Answer Extraction using Korean Dependency Parsing (의존 구문 분석을 이용한 질의 기반 정답 추출)

    • Lee, Dokyoung;Kim, Mintae;Kim, Wooju
      • Journal of Intelligence and Information Systems
      • /
      • v.25 no.3
      • /
      • pp.161-177
      • /
      • 2019
    • In this paper, we study how to improve answer extraction in a Question-Answering (QA) system by using the results of sentence dependency parsing. A QA system consists of query analysis, which analyzes the user's query, and answer extraction, which extracts appropriate answers from documents, and various studies have been conducted on both. To improve the performance of answer extraction, the grammatical information of sentences must be reflected accurately. Because Korean has free word order and frequently omits sentence components, dependency parsing is a good way to analyze Korean syntax. In this study, we therefore improved answer extraction by adding features generated from dependency parsing to the inputs of the answer extraction model (Bidirectional LSTM-CRF). We compared the performance of the model when given only basic word features generated without dependency parsing against its performance when the Eojeol tag feature and the dependency graph embedding feature were added. Since dependency parsing takes the Eojeol, a sentence component delimited by spaces, as its basic unit, the tag information of each Eojeol is obtained as a result of the parsing. The Eojeol tag feature is this tag information. Generating the dependency graph embedding consists of two steps: generating the dependency graph from the dependency parsing result and learning the embedding of the graph. From the dependency parsing result, a graph is generated by mapping each Eojeol to a node, each dependency between Eojeol to an edge, and each Eojeol tag to a node label. Depending on whether the direction of the dependency relation is considered, either a directed or an undirected graph is generated. To obtain the embedding of the graph, we used Graph2Vec, which finds the embedding of a graph from the subgraphs that constitute it. The maximum path length between nodes can be specified when finding the subgraphs of a graph. If the maximum path length is 1, the graph embedding is generated only from direct dependencies between Eojeol; as the maximum path length grows, the embedding also includes indirect dependencies. In the experiment, the maximum path length was varied from 1 to 3, both with and without considering the direction of dependency, and the performance of answer extraction was measured. The experimental results show that both the Eojeol tag feature and the dependency graph embedding feature improve the performance of answer extraction. In particular, the highest performance was obtained when the direction of the dependency relation was considered and the dependency graph embedding generated with a maximum path length of 1 in Graph2Vec's subgraph extraction was used as input to the model. From these experiments, we concluded that it is better to take the direction of dependency into account and to consider only direct connections rather than indirect dependencies between words. The significance of this study is as follows. First, we improved the performance of answer extraction by adding features based on dependency parsing results, taking into account the characteristics of Korean, namely free word order and frequent omission of sentence components. Second, we generated features from the dependency parsing result with a learning-based graph embedding method, without defining patterns of dependency between Eojeol. Future research directions are as follows. In this study, the features generated from dependency parsing were applied only to the answer extraction model in order to grasp meaning. If their performance is confirmed in the future by applying them to various natural language processing models, such as sentiment analysis or named entity recognition, the validity of the features can be verified more accurately.
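
    The graph construction and subgraph extraction described in this abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the tuple layout of the parse, the tag names (NP_SBJ, NP_AJT, VP), the romanized example sentence, the dependent-to-head edge direction, and both function names are hypothetical. The relabeling is a simplified Weisfeiler-Lehman-style scheme in the spirit of Graph2Vec's rooted-subgraph extraction, and the embedding-learning step is omitted.

```python
def build_dependency_graph(parse, directed=True):
    """Build a labeled graph from a dependency parse.

    parse: list of (index, eojeol, tag, head_index) tuples, where
    head_index 0 marks the root (an assumed, CoNLL-like layout).
    Returns (labels, adjacency): node -> tag, node -> set of neighbours.
    """
    labels = {idx: tag for idx, _eojeol, tag, _head in parse}
    adjacency = {idx: set() for idx in labels}
    for idx, _eojeol, _tag, head in parse:
        if head == 0:
            continue  # the root has no outgoing dependency edge
        adjacency[idx].add(head)  # assumed direction: dependent -> head
        if not directed:
            adjacency[head].add(idx)  # undirected: add the reverse edge too
    return labels, adjacency


def rooted_subgraph_labels(labels, adjacency, max_depth):
    """Collect rooted-subgraph labels for depths 0..max_depth.

    At each round a node's label is extended with the sorted labels of
    its neighbours, so depth d summarises dependencies up to d hops away.
    """
    current = dict(labels)
    features = list(current.values())  # depth 0: the Eojeol tags themselves
    for _ in range(max_depth):
        relabeled = {}
        for node, neighbours in adjacency.items():
            neighbour_labels = sorted(current[n] for n in neighbours)
            relabeled[node] = current[node] + "(" + ",".join(neighbour_labels) + ")"
        current = relabeled
        features.extend(current.values())
    return features


# Hypothetical three-Eojeol sentence; tags are illustrative only.
example_parse = [
    (1, "na-neun", "NP_SBJ", 3),
    (2, "hakgyo-e", "NP_AJT", 3),
    (3, "ganda", "VP", 0),
]

if __name__ == "__main__":
    for directed in (True, False):
        tags, graph = build_dependency_graph(example_parse, directed=directed)
        print(directed, rooted_subgraph_labels(tags, graph, max_depth=1))
```

    With max_depth=1 each label sees only direct neighbours, which loosely corresponds to a maximum path length of 1 in the abstract; larger depths fold in indirect dependencies. In a full pipeline, the collected labels would serve as the "words" of a graph-level document passed to a doc2vec-style model to learn the embedding.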


    (34141) Korea Institute of Science and Technology Information, 245, Daehak-ro, Yuseong-gu, Daejeon
    Copyright (C) KISTI. All Rights Reserved.