• Title/Summary/Keyword: Index level

Development of Information Extraction System from Multi Source Unstructured Documents for Knowledge Base Expansion (지식베이스 확장을 위한 멀티소스 비정형 문서에서의 정보 추출 시스템의 개발)

  • Choi, Hyunseung;Kim, Mintae;Kim, Wooju;Shin, Dongwook;Lee, Yong Hun
    • Journal of Intelligence and Information Systems
    • /
    • v.24 no.4
    • /
    • pp.111-136
    • /
    • 2018
  • In this paper, we propose a methodology for extracting answers to queries from various types of unstructured documents collected from multiple web sources, in order to expand a knowledge base. The proposed methodology consists of the following steps. 1) Collect relevant documents from Wikipedia, Naver encyclopedia, and Naver news for queries separated into "subject-predicate" form, and classify the documents that are suitable. 2) Determine whether each sentence is suitable for information extraction and derive a confidence score. 3) Based on the predicate feature, extract the information from the suitable sentences and derive the overall confidence of the extraction result. To evaluate the performance of the information extraction system, we selected 400 queries from SK Telecom's artificial intelligence speaker. The proposed system showed higher performance than the baseline model. The contribution of this study is a sequence tagging model based on a bi-directional LSTM-CRF that uses the predicate feature of the query; with it, we developed a robust model that maintains high recall even on the various types of unstructured documents collected from multiple sources. Information extraction for knowledge base expansion must take into account the heterogeneous characteristics of source-specific document types. The proposed methodology extracted information from various types of unstructured documents more effectively than the baseline model. Previous research has the limitation that performance is poor when information is extracted from document types that differ from the training data. 
In addition, by predicting the suitability of documents and sentences for information extraction before the extraction step, this study prevents unnecessary extraction attempts on documents that do not contain the answer. This is meaningful in that it provides a way to maintain precision even in an actual web environment. Information extraction for knowledge base expansion targets unstructured documents on the real web, so it cannot be guaranteed that a document contains the correct answer. When question answering is performed on the real web, previous machine reading comprehension studies show low precision because they frequently attempt to extract an answer even from documents that contain no correct answer. The policy of predicting the suitability of documents and sentences for information extraction therefore contributes to maintaining extraction performance in a real web environment. The limitations of this study and future research directions are as follows. The first concerns data preprocessing. In this study, the unit of knowledge extraction is determined through morphological analysis based on the open source KoNLPy Python package, and extraction can fail when the morphological analysis is incorrect. To improve information extraction results, a more advanced morphological analyzer needs to be developed. The second is entity ambiguity. The information extraction system in this study cannot distinguish identical names that refer to different entities. If several people with the same name appear in the news, the system may not extract information about the person the query intends. 
Future research needs measures to distinguish people who share the same name. The third concerns the evaluation query data. In this study, we selected 400 user queries collected from SK Telecom's interactive artificial intelligence speaker to evaluate the performance of the information extraction system. We built the evaluation data set from 2,800 documents (400 questions × 7 articles per question: 1 from Wikipedia, 3 from Naver encyclopedia, and 3 from Naver news) by judging whether each document contains a correct answer. To ensure the external validity of the study, it is desirable to use more queries to measure the performance of the system, but this is a costly activity that must be done manually. Future research needs to evaluate the system on more queries. It is also necessary to develop a Korean benchmark data set for information extraction from multi-source web documents, so that results can be evaluated more objectively.
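
The three steps described in the abstract above (document suitability, sentence suitability, then extraction with an overall confidence) can be sketched as a gated pipeline. This is only an illustrative outline, not the authors' implementation: the scoring functions, the 0.5 thresholds, and the product used to combine the two scores are all assumptions, and `tag_answer` is a hypothetical stand-in for the bi-directional LSTM-CRF tagger.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Extraction:
    answer: str
    confidence: float  # overall confidence of the extraction result


def extract_answer(
    documents: List[List[str]],                 # each document is a list of sentences
    doc_score: Callable[[List[str]], float],    # hypothetical document suitability model
    sent_score: Callable[[str], float],         # hypothetical sentence suitability model
    tag_answer: Callable[[str], Optional[str]], # hypothetical stand-in for the tagger
    doc_threshold: float = 0.5,                 # assumed cut-off, not from the paper
    sent_threshold: float = 0.5,                # assumed cut-off, not from the paper
) -> Optional[Extraction]:
    """Return the highest-confidence answer, or None if nothing is suitable."""
    best: Optional[Extraction] = None
    for sentences in documents:
        d = doc_score(sentences)
        if d < doc_threshold:
            continue  # step 1: skip documents judged unsuitable
        for sentence in sentences:
            s = sent_score(sentence)
            if s < sent_threshold:
                continue  # step 2: skip sentences judged unsuitable
            answer = tag_answer(sentence)  # step 3: extract from the sentence
            if answer is None:
                continue
            confidence = d * s  # assumption: combine the scores by product
            if best is None or confidence > best.confidence:
                best = Extraction(answer, confidence)
    return best
```

Returning `None` when every document or sentence is filtered out mirrors the point made in the abstract: the system declines to extract rather than forcing an answer from a document that does not contain one.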

Studies on Neck Blast Infection of Rice Plant (벼 이삭목도열병(病)의 감염(感染)에 관(關)한 연구(硏究))

  • Kim, Hong Gi;Park, Jong Seong
    • Korean Journal of Agricultural Science
    • /
    • v.12 no.2
    • /
    • pp.206-241
    • /
    • 1985
  • Attempts were made to determine the infection period and the infection speed in the tissue for neck blast of rice plants, the location of the inoculum source, and the effects of several conditions in the leaf sheath on neck blast incidence. 1. The most infectious period for neck blast was the booting stage just before the heading date, and most necks were infected during the booting stage and on the heading date. However, $Indica{\times}Japonica$ hybrid varieties always showed a high possibility of infection after the booting stage. 2. The incubation period for neck blast under natural conditions was rather long, ranging from 10 to 22 days. Under artificial inoculation, the incubation period was shorter in young panicles than in old panicles. Panicles that had emerged from the sheath of the flag leaf had a long incubation period and a low infection rate, and they also showed a slow infection speed in the tissue. 3. Considering the incubation period of neck blast, we assumed that the most effective application periods for chemicals are 5-10 days before heading for immediately effective chemicals and 10-15 days before heading for slowly effective chemicals. 4. Conidia infiltrated the leaf sheath with water, through saturation, by way of the suture of the upper three leaves. The number of conidia observed in the leaf sheath was higher during the booting stage than during other stages. The ligule protected the leaf sheath against infiltration of conidia. 5. When conidia infiltrated the leaf sheath, the highest numbers of attached conidia were observed on the panicle base, on the panicle axis with hairs, and on degenerated panicles, which seemed to promote infection with neck blast. 6. The lowest spore concentration required for neck blast incidence varied with the rice varietal group. In particular, $Indica{\times}Japonica$ hybrid varieties were infected more easily than the Japonica type varieties. 
The number of spores required for neck blast incidence in $Indica{\times}Japonica$ hybrid varieties was less than 100, and the disease index was also higher in $Indica{\times}Japonica$ hybrid varieties than in Japonica type varieties. 7. Nitrogen and silicate contents in the necks of rice plants, which changed with the growth stage, were related to blast incidence. Nitrogen content increased from the booting stage to the heading date and then decreased gradually over time. Silicate content increased over time from the booting stage through the period after heading. These changes in content promoted neck blast infection. 8. Conidia moved to rice plants by ascending and descending dispersal and then attached to the plants. Horizontal transfer of conidia was found to be negligible. We therefore presumed that the infection rate of neck blast was very low after the panicle base emerged from the leaf sheath. An ascending air current caused by the temperature difference between the upper and lower parts of the rice plant also seemed to increase the liberation of spores. 9. The number of conidia of the blast fungus collected just before and after the heading date was closely related to neck blast incidence. Lesions on the top three leaves were closely related to neck blast incidence because they had a high potential for conidia formation and were direct inoculum sources for neck blast. 10. Conditions inside the leaf sheath were very favorable for neck blast, and neck blast incidence in the leaf sheath increased with the level of fertilizer applied. Therefore, the infection rate of neck blast on all panicle parts inside the leaf sheath, such as the panicle base, panicle branches, spikelets, nodes, and internodes, did not differ with varietal resistance or fertilizer applied. 11. Among the dominant species of fungi in the leaf sheath, only Gerlachia oryzae appeared to promote the incidence of neck blast. 
It was assumed that the number of days to heading of a variety was related to neck blast incidence.
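
Item 3 of the abstract above gives application periods as day ranges before heading. As a small worked example, the ranges can be turned into calendar dates. The function name and the heading date in the usage note are hypothetical; only the day ranges come from the abstract.

```python
from datetime import date, timedelta
from typing import Dict, Tuple

# Days before heading, taken from item 3 of the abstract: (nearest, farthest).
WINDOWS: Dict[str, Tuple[int, int]] = {
    "immediate": (5, 10),  # immediately effective chemicals
    "slow": (10, 15),      # slowly effective chemicals
}


def application_window(heading: date, kind: str) -> Tuple[date, date]:
    """Return (earliest, latest) application dates for a given heading date."""
    nearest, farthest = WINDOWS[kind]
    return heading - timedelta(days=farthest), heading - timedelta(days=nearest)
```

For an assumed heading date of 15 August, `application_window(date(2024, 8, 15), "slow")` spans 31 July to 5 August.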

Studies on the Natural Distribution and Ecology of Ilex cornuta Lindley et Pax. in Korea (호랑가시나무의 천연분포(天然分布)와 군낙생태(群落生態)에 관한 연구(研究))

  • Lee, Jeong Seok
    • Journal of Korean Society of Forest Science
    • /
    • v.62 no.1
    • /
    • pp.24-42
    • /
    • 1983
  • To develop Ilex cornuta, which grows naturally in the southwestern coastal district, as a new ornamental tree, the author sampled I. cornuta growing in four natural communities and plants cultivated in Kwangju city, and investigated its ecology, morphology, and characteristics. The results are summarized as follows. 1) The natural distribution of I. cornuta reaches $35^{\circ}$43'N and $126^{\circ}$44'E in the southwestern part of Korea and $33^{\circ}$20'N and $126^{\circ}$15'E on Jejoo island. This area meets the following conditions necessary for Ilex cornuta: an annual average temperature above $12^{\circ}C$, a coldness index below $-12.7^{\circ}C$, an annual average relative humidity of 75-80%, and 20-25 days of snow cover a year. The sites lie within 20 km of the coastline and within 100 m above sea level, mainly at the foot of mountains facing southeast. 2) The vegetation of the I. cornuta community can be divided into an upper layer composed of Pinus thunbergii and P. densiflora, a middle layer of Eurya japonica var. montana, Ilex cornuta, and Vaccinium bracteatum, and ground vegetation composed of Carex lanceolata and Arundinella hirta var. ciliare. The community has high species diversity, which indicates that it is at a developing stage. Although I. cornuta is a species of the southern temperate zone, where coniferous and broad-leaved evergreen trees grow together, it occasionally grows in the subtropical zone. 3) The parent rock is gneiss, rhyolite, etc., the soil is acidic (about pH 4.5-5.0), and the content of available phosphorus is low. 4) At maturity, height growth averaged $10.48{\pm}0.23cm$ a year and diameter growth 0.43 cm a year, and the annual rings were not clear. The mean leaf number was 11.34. There is a significant positive correlation between twig elongation and leaf number. 5) A one-year-old seedling grows to 10.66 cm (max. 18.2 cm, min. 4.0 cm) in shoot height, with a leaf number of 12.1 (max. 18, min), a basal diameter of 2.24 mm (max. 4.0 mm, min. 1.0 mm), and shows rhythmical growth in the high-temperature period. There were significant positive correlations between stalk height and leaf number, between stalk height and basal diameter, and between leaf number and basal diameter. 6) The flowering time ranged from the end of April to the beginning of May, and the flower has a tetramerous corolla and a yellowish-green corymb. The species bears bisexual flowers and shows dioecism, with a sex ratio of 1:1. 7) The fruit, after fertilization, grows 0.87 cm long (0.61-1.31 cm) and 0.8 cm wide (0.62-1.05 cm) by the beginning of May. Fruits begin to turn red and continue to ripen until the end of October or the beginning of November, and they keep their color until the end of the following May. At the beginning of June, with a partial change in color to dark brown, the fruits begin to fall, but some remain even after three years. 8) The seed acquisition ratio is 24.7% by weight, the number of seeds per fruit averages 3.9, the seed weight per liter is 114.2 grams, and the average weight of 1,000 seeds is 24.56 grams. 9) Seeds buried underground at a fixed temperature and humidity after complete removal of the sarcocarp began to develop roots in October, a year later, and germinated the following April. Under sunlight or drought, however, the dormant state may continue.
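
The seed figures in item 8 can be combined into a few derived quantities. The sketch below is simple arithmetic on the reported averages; the derived numbers are not reported by the author, and it assumes that the seed acquisition ratio means cleaned seed weight divided by fruit weight.

```python
from typing import Tuple

# Averages reported in item 8 of the abstract.
SEED_RATIO_BY_WEIGHT = 0.247      # seed acquisition ratio (assumed: seed weight / fruit weight)
SEEDS_PER_FRUIT = 3.9             # seeds per fruit
SEED_WEIGHT_PER_LITER_G = 114.2   # grams of seed per liter
THOUSAND_SEED_WEIGHT_G = 24.56    # grams per 1,000 seeds


def seeds_per_liter() -> float:
    """Approximate number of seeds in one liter of seed."""
    return SEED_WEIGHT_PER_LITER_G / THOUSAND_SEED_WEIGHT_G * 1000.0


def fruit_needed(n_seeds: int) -> Tuple[float, float]:
    """Return (number of fruits, fruit weight in grams) needed for n_seeds."""
    fruit_count = n_seeds / SEEDS_PER_FRUIT
    seed_weight_g = n_seeds * THOUSAND_SEED_WEIGHT_G / 1000.0
    fruit_weight_g = seed_weight_g / SEED_RATIO_BY_WEIGHT
    return fruit_count, fruit_weight_g
```

Under these assumptions, one liter holds roughly 4,650 seeds, and 1,000 seeds correspond to about 256 fruits weighing close to 99 g in total.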

Thermal Environments of Children's Parks during Heat Wave Period (폭염 시 어린이공원의 온열환경)

  • Ryu, Nam-Hyong;Lee, Chun-Seok
    • Journal of the Korean Institute of Landscape Architecture
    • /
    • v.44 no.6
    • /
    • pp.84-97
    • /
    • 2016
  • This study investigated the thermal environments experienced by users of children's parks according to pavement and sunshade types during heat waves. Measurements were conducted at the sand pits, rubber chip pavement, shelters, and green-shaded ground of two children's parks located in Jinju, Korea (Chilam: $N\;35^{\circ}11^{\prime}1.4^{\prime\prime}$, $E\;128^{\circ}5^{\prime}31.7^{\prime\prime}$, elevation 38 m; Gaho: $N\;35^{\circ}09^{\prime}56.8^{\prime\prime}$, $E\;128^{\circ}6^{\prime}41.1^{\prime\prime}$, elevation 24 m) over three days, 11-13 August 2016. The highest ambient air temperatures at the Jinju Meteorological Office during the three measurement days were $35.9{\sim}36.8^{\circ}C$, which corresponds to extremely hot weather. A series of experiments measured air temperature, relative humidity, wind velocity, black globe temperature, and long-wave and short-wave radiation from six directions at 0.6 m above ground level. The wet bulb globe temperature (WBGT) and the universal thermal climate index (UTCI) were used to evaluate thermal stress. Surface temperature images of the play equipment were also taken using infrared thermography. Surface temperatures of the play equipment and grounds were used to evaluate the risk of burns through contact with playground materials. The results were as follows. The maximum air temperatures averaged over a 1-hour period for the three days were $36.6{\sim}39.4^{\circ}C$. The sun shades reduced those temperatures by up to $2.8^{\circ}C$ (green shade) and $1.0^{\circ}C/2.3^{\circ}C$ (shelters). The minimum relative humidity values averaged over a 1-hour period for the three days were 44~50%. The sun shades increased those humidity values by up to 6% (green shade) and 4%/6% (shelters). The risk of heat-related illness at the measurement sites was extreme and high in the daytime hours. 
The maximum WBGT values averaged over a 30-minute period for the three days were $31.2{\sim}33.6^{\circ}C$. The sun shades reduced those WBGT values by up to $2.4^{\circ}C$ (green shade) and $0.5^{\circ}C/2.1^{\circ}C$ (shelters) compared to the sand pits, but could not remove the risk of heat-related illness in the daytime hours. The heat stress categories at the measurement sites were extreme and very strong in the daytime hours. The maximum UTCI values averaged over a 30-minute period for the three days were $39.9{\sim}48.1^{\circ}C$. The sun shades reduced those UTCI values by up to $7.8^{\circ}C$ (green shade) and $4.1^{\circ}C/8.2^{\circ}C$ (shelters) compared to the sand pits, but could not lower the heat stress category from extreme and very strong to strong and moderate in the daytime hours. According to the burn threshold criteria for skin contact with playground materials, the maximum surface temperature of the stainless steel ($70.8^{\circ}C$) surpassed the three-second $60^{\circ}C$ threshold for uncoated steel, that of the rubber chip ($76.5^{\circ}C$) surpassed the five-second $74^{\circ}C$ threshold for plastic, and those of the plastic slide ($68.5^{\circ}C$) and seats ($71.0^{\circ}C$) surpassed the one-minute $60^{\circ}C$ threshold for plastic. The surface temperatures of shaded play equipment were approximately $20^{\circ}C$ lower than those of play equipment exposed to the sun. Therefore, sun shades can block the risk of burns in the daytime hours. Because of the extreme and high risk of heat-related illness and the extreme and high heat stress at children's parks during heat waves, parents and administrators must keep children from using the playgrounds. The risk of burns from contact with play equipment and grounds at children's parks during heat waves was very high. Sun shades are essential to block the risk of burns from play equipment and grounds at children's parks during heat waves.
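
The burn-risk comparison in the abstract above amounts to checking each measured maximum surface temperature against its contact threshold. The sketch below uses only the temperatures and thresholds quoted in the abstract. Applying the approximate 20 °C shade reduction uniformly to every surface is an assumption made for illustration, and the names `Reading` and `exceedances` are hypothetical.

```python
from typing import List, NamedTuple, Tuple


class Reading(NamedTuple):
    surface: str
    max_temp_c: float    # maximum surface temperature from the abstract
    threshold_c: float   # burn threshold from the abstract
    contact_time: str    # contact duration the threshold applies to


READINGS = [
    Reading("stainless steel", 70.8, 60.0, "3 s"),
    Reading("rubber chip", 76.5, 74.0, "5 s"),
    Reading("plastic slide", 68.5, 60.0, "1 min"),
    Reading("plastic seat", 71.0, 60.0, "1 min"),
]


def exceedances(
    readings: List[Reading], shade_reduction_c: float = 0.0
) -> List[Tuple[str, float]]:
    """List (surface, margin in deg C) for surfaces above their burn threshold."""
    over = []
    for r in readings:
        # Margin by which the (optionally shaded) surface exceeds its threshold.
        margin = round(r.max_temp_c - shade_reduction_c - r.threshold_c, 1)
        if margin > 0:
            over.append((r.surface, margin))
    return over
```

With no shade, all four surfaces exceed their thresholds. With an assumed uniform reduction of 20 °C, `exceedances(READINGS, 20.0)` returns an empty list, which is consistent with the abstract's conclusion that sun shades can block the risk of burns.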

The Variation of Natural Population of Pinus densiflora S. et Z. in Korea (III) -Genetic Variation of the Progeny Originated from Mt. Chu-wang, An-Myon Island and Mt. O-Dae Populations- (소나무 천연집단(天然集團)의 변이(變異)에 관(關)한 연구(硏究)(III) -주왕산(周王山), 안면도(安眠島), 오대산(五臺山) 소나무집단(集團)의 차대(次代)의 유전변이(遺傳變異)-)

  • Yim, Kyong Bin;Kwon, Ki Won
    • Journal of Korean Society of Forest Science
    • /
    • v.32 no.1
    • /
    • pp.36-63
    • /
    • 1976
  • The purpose of this study is to elucidate the genetic variation of natural forests of Pinus densiflora. Three natural populations of the species, considered to be of superior quality phenotypically, were selected. The locations and conditions of the populations are shown in tables 1 and 2. The morphological traits of tree and needle and some other characteristics were already presented in the first report of this series, in which population and family differences in the observed characteristics were statistically analyzed. Twenty trees were sampled from each population, i.e., 60 trees in total. During the autumn of 1974, mature cones were collected from each tree, and open-pollinated seeds were extracted in the laboratory. Immediately after collection, the morphological characteristics of the cones were measured while they were still closed. Seed and seed-wing dimensions were also studied. In the spring of 1975, the seeds were sown in the experimental tree nursery located in Suweon. In April 1976, the 1-0 seedlings were transplanted according to the predetermined experimental design, a randomized block design with three replications. Because of differences in cone setting, the number of families from which progenies were raised was not equal among populations. The numbers of families were 20 in population 1, 18 in population 2, and 15 in population 3. Each randomized block thus contained seedlings of 53 families from the 3 populations. The present paper is mainly concerned with the variation of some characteristics of cone, seed, needle, and seedling growth performance, and with the chlorophyll and monoterpene compositions of needles. The results obtained are summarized as follows. 1. The meteorological data, obtained by averaging the records of a 30-year period observed at the station nearest to each population, are shown in Fig. 3, 4, and 5. The distributional patterns of monthly precipitation are quite similar among locations. 
However, the precipitation density in population 2, the Seosan area, during the growing season is lower than in the other two populations. Population 1, the Cheong-song area, and population 3, the Pyong-chang area, are located inland, whereas population 2 is on the western seacoast. Differences among populations in average monthly air temperature and average monthly lowest temperature can hardly be found. 2. Available information on each of the mother trees (families) studied, such as age, stem height, diameter at breast height, clear-bole length, crown condition, and others, is shown in tables 6, 7, and 8. 3. The measurements of fresh cone weight, cone length, and the widest diameter of the cone are given in Table 9. All these traits showed highly significant differences among populations and among families within populations. A population difference was also found in the cone index, that is, the length-diameter ratio. 4. Seed-wing length and seed-wing width showed population differences, and family differences were also found in both characteristics. Although not discussed in this paper, seed-wing colours and shapes show a specificity inherent to individual trees, as shown in photo 3 on page 50. The colour and shape are fully the expression of the genetic makeup of the mother tree, which is why these traits vary little. Significant differences among populations and among families were found in 1000-seed weight, seed length, seed width, and seed thickness, as shown in table 11. For all these dimensions, the values are always larger in population 1, which is younger than the other two. The population differences in cone, seed, and seed-wing sizes could partly be attributed to growth vigor. 5. The correlations between the characteristics of cone and seed are presented in table 12. 
As shown, positive correlations between cone diameter and seed-wing width were found in all populations studied. The correlation between seed-wing length and seed length was significantly positive in populations 1 and 3 but not in population 2, where the r-value was as small as 0.002. The correlation between cone length and seed-wing length was highly significant in population 1 but not in population 2. 6. Differences among progenies in growth performance, such as 1-0 and 1-1 seedling height and root collar diameter, were highly significant among populations as well as among families within populations (Table 13). 7. The narrow-sense heritability values of population characteristics were estimated on the basis of variance components. The values based on seedling height at the 1-0 and 1-1 age stages ranged from 0.146 to 0.288, and the values for root collar diameter from 0.060 to 0.130 (Table 14). These heritability values varied with characteristic and seedling age. It must be stated here that, in calculating the heritability values, the variance of population was divided by the variances of environment (error), family, and population. The present authors intend to add family-level heritability values in a coming report. As tree age increases in the future, the heritability values may change or decrease. In heritability studies previously reported by many authors for pines at ages 7 to 15, the values for height growth generally ranged from 0.2 to 0.4. The values we obtained are lower than these. 8. The correlations between seedling growth and seed characteristics were examined, and the resulting values are shown in table 16. Contrary to our hypothesis of a positive correlation between 1-0 seedling height and seed weight, no significant correlation was found. However, 1-0 seedling height correlated positively with seed length. 
Significant correlations between 1-0 and 1-1 seedling heights were also found. 9. The numbers of stomata rows, counted separately on the abaxial and adaxial sides, showed highly significant differences among populations, but serration density did not. For serration density, the differences among families within populations were highly significant (Table 17). It should be noted that the correlation between stomata rows on the abaxial side and on the adaxial side was highly significant in all populations. The correlation coefficients between progenies and parents for stomata rows on the abaxial side were non-significant in all populations studied (Table 18). 10. The content of chlorophyll b in the needles was slightly higher than that of chlorophyll a, irrespective of the population examined. The differences in chlorophyll a, b, and a plus b contents were highly significant among populations but not among families within populations, as shown in table 20. The contents of chlorophyll a and b are presented for individual trees of each population in table 21. 11. The occurrence of monoterpene components was examined by gas-liquid chromatography (Shimazu, GC-1C type) to evaluate population differences. Some papers have reported on the chemical geography of pines based on monoterpene composition. The number of populations studied here is not sufficient to address this problem. The monoterpenes observed in the needles were ${\alpha}$-pinene, camphene, ${\beta}$-pinene, myrcene, limonene, ${\beta}$-phellandrene, and terpinolene, plus two unknowns. In the analysis of monoterpene composition, the number of sample trees varied with population, i.e., 18 families for population 1, 15 for population 2, and 11 for population 3 (Tables 22, 23, and 24). The histograms (Fig. 6) of the 7 monoterpene components by population show noticeably higher percentages of ${\alpha}$-pinene irrespective of population, followed by ${\beta}$-phellandrene. 
The minor monoterpene components of Pinus densiflora, namely camphene, myrcene, limonene, and terpinolene, generally made up less than 10 percent of the total. The average coefficients of variation of ${\alpha}$-pinene and ${\beta}$-phellandrene were 11 percent. In contrast, the average coefficients of variation of camphene, limonene, and terpinolene ranged from 20 to 30 percent. Significant differences between populations were observed only in myrcene and ${\beta}$-phellandrene (Table 25).
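
Two statistics used in the abstract above, the heritability ratio of item 7 and the coefficient of variation of item 11, can be sketched as follows. The heritability function reads the abstract's description as the population variance divided by the sum of the population, family, and error variances; that reading is an assumption, and the variance values in the usage note are made up for illustration. The coefficient of variation uses the sample standard deviation, which is also an assumption.

```python
from statistics import mean, stdev
from typing import Sequence


def heritability(var_population: float, var_family: float, var_error: float) -> float:
    """Population variance as a share of the total of the three components.

    Assumed reading of the abstract; not confirmed against the full paper.
    """
    total = var_population + var_family + var_error
    if total <= 0:
        raise ValueError("variance components must sum to a positive value")
    return var_population / total


def coefficient_of_variation(values: Sequence[float]) -> float:
    """Coefficient of variation in percent, using the sample standard deviation."""
    return 100.0 * stdev(values) / mean(values)
```

For example, made-up components of 2.0, 3.0, and 5.0 give `heritability(2.0, 3.0, 5.0) == 0.2`, which falls inside the 0.146 to 0.288 range reported for seedling height.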
