• Title/Summary/Keyword: value of information


Target Word Selection Disambiguation using Untagged Text Data in English-Korean Machine Translation (영한 기계 번역에서 미가공 텍스트 데이터를 이용한 대역어 선택 중의성 해소)

  • Kim Yu-Seop;Chang Jeong-Ho
    • The KIPS Transactions:PartB
    • /
    • v.11B no.6
    • /
    • pp.749-758
    • /
    • 2004
  • In this paper, we propose a new method that uses only a raw corpus, with no additional human effort, to disambiguate target word selection in English-Korean machine translation. We use two data-driven techniques: Latent Semantic Analysis (LSA) and Probabilistic Latent Semantic Analysis (PLSA). Both can represent the complex semantic structure of a given context, such as a text passage. We construct linguistic semantic knowledge with these techniques and apply it to target word selection in English-Korean machine translation, using grammatical relationships stored in a dictionary. A k-nearest neighbor learning algorithm resolves the data sparseness problem in target word selection, with distances between instances estimated from these models. In experiments, we use the TREC AP news data to construct the latent semantic space and the Wall Street Journal corpus to evaluate target word selection. With the latent semantic analysis methods, the accuracy of target word selection improved by over 10%, and PLSA showed better accuracy than LSA. Finally, using correlation analysis, we show how the accuracy relates to two important factors: the dimensionality of the latent space and the k value of k-NN learning.
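
A rough illustration of the pipeline described above: the sketch below builds an LSA space with truncated SVD and picks a target word by k-nearest neighbors in that space. The toy corpus, candidate Korean translations, and hyperparameters are invented for illustration, and scikit-learn is assumed; the paper's own method additionally uses grammatical relations from a dictionary and a PLSA variant.

```python
# Minimal sketch of LSA-based context vectors with k-NN selection,
# assuming scikit-learn; the corpus, candidate translations, and labels
# below are illustrative, not the paper's data (TREC AP / WSJ).
import numpy as np
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import TruncatedSVD
from sklearn.neighbors import KNeighborsClassifier

# Toy corpus standing in for the raw (untagged) text the paper uses.
corpus = [
    "the bank approved the loan for the new house",
    "the river bank was covered with tall grass",
    "interest rates at the bank rose sharply",
    "they walked along the bank of the river at dusk",
]
# Labels: which Korean target word each context selects (illustrative).
labels = ["은행", "강둑", "은행", "강둑"]

# Term-document matrix -> latent semantic space via truncated SVD (LSA).
vectorizer = CountVectorizer()
X = vectorizer.fit_transform(corpus)
lsa = TruncatedSVD(n_components=2, random_state=0)
X_latent = lsa.fit_transform(X)

# k-NN in the latent space; distances between contexts stand in for the
# paper's distance estimation used to ease data sparseness.
knn = KNeighborsClassifier(n_neighbors=1)
knn.fit(X_latent, labels)

test = vectorizer.transform(["the bank charged a fee on the loan"])
print(knn.predict(lsa.transform(test)))  # predicted Korean target word for the toy test context
```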

The Effects on Dose Distribution Characteristics by Changing Beam Tuning Parameters of Digital Linear Accelerator in Medicine (의료용 디지털 선형가속기의 빔조정 인자변화가 선량분포특성에 미치는 영향)

  • 박현주;이동훈;이동한;권수일;류성렬;지영훈
    • Progress in Medical Physics
    • /
    • v.10 no.1
    • /
    • pp.17-22
    • /
    • 1999
  • INJ-I, INJ-E, PFN, BMI, and PRF were selected from among the many factors that constitute a digital linear accelerator, in order to examine their effect on the dose distribution when current and voltage are changed within the permitted range that the Mevatron maintains automatically. We measured the absorbed dose using an ion chamber, analyzed the beam output waveform using an oscilloscope, and measured symmetry and flatness using a dosimetry system. An RFA plus (Scanditronix, Sweden) was used as the dosimetry system, and a 0.6 cc ion chamber (PR06C, USA), an electrometer (Capintec 192, USA), and an oscilloscope (Tektronix, USA) were employed to measure changes in the dose distribution characteristics as the beam-tuning parameters were varied. When the currents and voltages of INJ-I, INJ-E, PFN, BMI, and PRF were modified, notable changes in the dose rate were observed, both in the output pulse on the oscilloscope and in the ion chamber measurements. However, the energy and flatness curves from the RFA plus were almost identical. The factors showed only small differences: 0.01∼0.02% in D10/D20, 0.1∼0.2% in symmetry, and 0.1∼0.4% in flatness. Since the Mevatron regulates itself automatically to keep each factor at its reference value, no large differences in the dose distribution were observed. Small differences in the dose rate distribution appeared when the voltages and currents of the digitized factors were modified; nonetheless, basic information for operational management was obtained.
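
For readers unfamiliar with the beam-quality figures quoted above, the sketch below computes flatness and symmetry from a one-dimensional dose profile using one common convention (variation within the central 80% of the field, and mirrored-point differences for symmetry); the exact protocol definitions used with the RFA plus system may differ, and the profile is synthetic.

```python
# A small sketch of how flatness and symmetry of a beam profile can be
# computed; the definitions below are one common convention, not
# necessarily the one used in the study.
import numpy as np

def flatness_symmetry(positions, dose, field_width):
    """positions in cm (centred on the beam axis), dose in relative units."""
    half = 0.4 * field_width          # central 80% of the field
    mask = np.abs(positions) <= half
    d = dose[mask]
    flatness = 100.0 * (d.max() - d.min()) / (d.max() + d.min())

    # Symmetry: largest point difference between mirrored positions,
    # evaluated on a symmetric grid inside the central 80%.
    grid = np.linspace(-half, half, 201)
    left = np.interp(-grid, positions, dose)
    right = np.interp(grid, positions, dose)
    cax = np.interp(0.0, positions, dose)
    symmetry = 100.0 * np.max(np.abs(right - left)) / cax
    return flatness, symmetry

# Toy profile: 10 cm field with a slight tilt.
x = np.linspace(-7, 7, 281)
profile = np.where(np.abs(x) <= 5, 1.0 + 0.004 * x, 0.05)
print(flatness_symmetry(x, profile, field_width=10.0))
```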


Analysis of Characteristics of Horizontal Response Spectrum of Velocity Ground Motions from 5 Macro Earthquakes (5개 중규모 지진의 속도 관측자료를 이용한 수평 응답스펙트럼 특성 분석)

  • Kim, Jun-Kyoung
    • Tunnel and Underground Space
    • /
    • v.21 no.6
    • /
    • pp.471-479
    • /
    • 2011
  • The horizontal velocity response spectra of observed ground motions from 5 recent earthquakes of magnitude 4.8 or larger around the Korean Peninsula were analysed and compared with the horizontal acceleration response spectra, with the seismic design response spectrum applied to domestic nuclear power plants (Reg. Guide 1.60), and with the Korean Standard Design Response Spectrum for general structures and buildings. 102 horizontal velocity records, including NS and EW components, were used for the velocity response spectra and normalized to the peak velocity of each record. First, the results showed that the velocity response spectra have larger values in the medium natural period range, whereas the acceleration response spectra have larger values in the short period range. Second, the velocity response spectra exceed Reg. Guide 1.60 at longer natural periods, for frequencies below 6-7 Hz. Finally, comparison with the Korean Standard Response Spectrum for 3 soil types (SC, SD, and SE) showed that the velocity response spectra have much higher values for period bands below 1.5 (SC), 2.0 (SD), and 3.0 (SE) seconds, respectively. The results suggest that the general observation that acceleration, velocity, and displacement response spectra have their largest values at short, medium, and long natural periods, respectively, also holds consistently for domestic ground motions, especially velocity ground motions. Information on the response spectrum at these medium periods can be very important, since domestic design of buildings and structures has recently emphasized medium and long natural periods over short ones because of the increasing number of super high-rise buildings.
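
The response-spectrum computation underlying the comparison above can be sketched as follows: a damped single-degree-of-freedom oscillator is integrated with the Newmark average-acceleration method for each natural period, and the peak relative velocity is recorded. The synthetic record and the 5% damping ratio are assumptions for illustration; the study itself used 102 observed horizontal components normalized to peak velocity.

```python
# Sketch: relative-velocity response spectrum of one ground-motion record
# using a Newmark average-acceleration SDOF integrator (per unit mass).
import numpy as np

def velocity_response_spectrum(acc, dt, periods, zeta=0.05):
    sv = np.zeros_like(periods)
    beta, gamma = 0.25, 0.5
    for i, T in enumerate(periods):
        w = 2.0 * np.pi / T
        k, c = w * w, 2.0 * zeta * w              # stiffness, damping per unit mass
        u, v = 0.0, 0.0
        a = -acc[0] - c * v - k * u               # initial relative acceleration
        keff = k + gamma * c / (beta * dt) + 1.0 / (beta * dt * dt)
        vmax = 0.0
        for ag in acc[1:]:
            p = (-ag
                 + u * (1.0 / (beta * dt * dt) + gamma * c / (beta * dt))
                 + v * (1.0 / (beta * dt) + c * (gamma / beta - 1.0))
                 + a * ((1.0 / (2.0 * beta) - 1.0) + c * dt * (gamma / (2.0 * beta) - 1.0)))
            u_new = p / keff
            v_new = (gamma / (beta * dt)) * (u_new - u) + v * (1.0 - gamma / beta) \
                    + a * dt * (1.0 - gamma / (2.0 * beta))
            a_new = (u_new - u) / (beta * dt * dt) - v / (beta * dt) - a * (1.0 / (2.0 * beta) - 1.0)
            u, v, a = u_new, v_new, a_new
            vmax = max(vmax, abs(v))
        sv[i] = vmax
    return sv

dt = 0.01
t = np.arange(0, 20, dt)
ground_acc = 0.3 * 9.81 * np.sin(2 * np.pi * 2.0 * t) * np.exp(-0.2 * t)  # toy record
periods = np.linspace(0.05, 5.0, 50)
spectrum = velocity_response_spectrum(ground_acc, dt, periods)
print(periods[np.argmax(spectrum)], spectrum.max())
```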

X-tree Diff: An Efficient Change Detection Algorithm for Tree-structured Data (X-tree Diff: 트리 기반 데이터를 위한 효율적인 변화 탐지 알고리즘)

  • Lee, Suk-Kyoon;Kim, Dong-Ah
    • The KIPS Transactions:PartC
    • /
    • v.10C no.6
    • /
    • pp.683-694
    • /
    • 2003
  • We present X-tree Diff, a change detection algorithm for tree-structured data. Our work is motivated by the need to monitor massive volumes of web documents and detect suspicious changes, called defacement attacks, on web sites. In this context, the algorithm must be very efficient in both speed and memory use. X-tree Diff uses a special ordered labeled tree, the X-tree, to represent XML/HTML documents. X-tree nodes have a special field, tMD, which stores a 128-bit hash value representing the structure and data of the subtree, so that identical subtrees from the old and new versions can be matched. During this process, X-tree Diff applies the Rule of Delaying Ambiguous Matchings: it performs exact matching only where a node in the old version has a one-to-one correspondence with a node in the new version, and delays all other matchings. This drastically reduces the possibility of wrong matchings. X-tree Diff propagates such exact matchings upwards in Step 2 and obtains further matchings downwards from the roots in Step 3. In Step 4, nodes to be inserted or deleted are decided. We also show that X-tree Diff runs in O(n), where n is the number of nodes in the X-trees, in the worst case as well as the average case. This result is better than that of the BULD Diff algorithm, which is O(n log(n)) in the worst case. We experimented with X-tree Diff on real data, about 11,000 home pages from about 20 web sites, rather than on synthetic documents manipulated for experimentation. Currently, the X-tree Diff algorithm is used in a commercial hacking detection system, WIDS (Web-Document Intrusion Detection System), which finds changes occurring in registered websites and reports suspicious changes to users.
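
The hash-matching first step can be illustrated with the simplified sketch below: every node carries a 128-bit digest (here MD5) of its label, text, and children's digests, and only digests with a one-to-one correspondence between the old and new trees are matched, deferring ambiguous ones in the spirit of the Rule of Delaying Ambiguous Matchings. The node structure and sample documents are invented; the real algorithm adds the upward/downward propagation of Steps 2-4.

```python
# A simplified sketch of the hash-based first step of X-tree Diff: each
# node stores a 128-bit digest (tMD) of its label, text, and children's
# digests, and subtrees whose digests correspond one-to-one between the
# old and new trees are matched exactly; non-unique digests are deferred.
import hashlib
from collections import defaultdict

class Node:
    def __init__(self, label, text="", children=None):
        self.label, self.text = label, text
        self.children = children or []
        self.tmd = None

def compute_tmd(node):
    child_digests = b"".join(compute_tmd(c) for c in node.children)
    node.tmd = hashlib.md5(node.label.encode() + node.text.encode() + child_digests).digest()
    return node.tmd

def index_by_tmd(node, table):
    table[node.tmd].append(node)
    for c in node.children:
        index_by_tmd(c, table)

def match_identical_subtrees(old_root, new_root):
    compute_tmd(old_root); compute_tmd(new_root)
    old_idx, new_idx = defaultdict(list), defaultdict(list)
    index_by_tmd(old_root, old_idx); index_by_tmd(new_root, new_idx)
    matches = []
    for tmd, old_nodes in old_idx.items():
        new_nodes = new_idx.get(tmd, [])
        if len(old_nodes) == 1 and len(new_nodes) == 1:   # unambiguous only
            matches.append((old_nodes[0], new_nodes[0]))
    return matches

old = Node("html", children=[Node("h1", "Welcome"), Node("p", "news of the day")])
new = Node("html", children=[Node("h1", "Welcome"), Node("p", "DEFACED!")])
for o, n in match_identical_subtrees(old, new):
    print(o.label, "matched")   # only the unchanged <h1> subtree matches
```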

Efficient Management and use of Records from the Truth Commissions (과거사위원회 기록의 효율적인 관리와 활용방안)

  • Lim, Hee Yeon
    • The Korean Journal of Archival Studies
    • /
    • no.17
    • /
    • pp.247-292
    • /
    • 2008
  • Investigations to set modern history and the national spirit to rights began after the Truth Commissions were established. These Commissions are established and operated with a time limit for completing their missions. They create three kinds of records: acquired materials, which are collected or donated for investigation; investigation records, such as investigation reports; and administrative records created while supporting the organization's operation. The Commissions rely on past records more than other agencies do, because their special tasks are wiping the nation's slate clean and uncovering the truth. In other words, the Commissions benefit most from well-managed records; yet their record management environment and systems are looser than those of permanent agencies. There are three reasons. First, there are no record management regulations or criteria for agencies operating under a time limit; as a result, the record management systems of the 6 Truth Commissions were built separately and at different levels. Second, members lack continuity because of frequent secondment, reinstatement, transfer, and restructuring, which causes trouble in producing and managing records. Third, the central archives pay less attention to temporary bodies such as the Truth Commissions. The Commissions in fact need more systematic control, because their records have historical value. To solve these problems, record management regulations should first be prepared that reflect the features of time-limited organizations and of the Commissions' records, such as acquired materials and investigation records. In addition, building a standard record management system for the Commissions, standardizing transfer data, assigning professional records personnel, and limiting frequent personnel changes would resolve the practical problems. Moreover, the records created to reveal the truth should be used for education and research, because the Truth Commissions were established to set an unfortunate history right and to keep it from being repeated. The records could serve as a stepping stone for establishing a Truth Record Center devoted to education, reference services, publication, and research. Such a record center would help people use the records efficiently, improve public knowledge, and lead people to recognize the importance of the records.

DEVELOPMENT OF STATEWIDE TRUCK TRAFFIC FORECASTING METHOD BY USING LIMITED O-D SURVEY DATA (한정된 O-D조사자료를 이용한 주 전체의 트럭교통예측방법 개발)

  • 박만배
    • Proceedings of the KOR-KST Conference
    • /
    • 1995.02a
    • /
    • pp.101-113
    • /
    • 1995
  • The objective of this research is to test the feasibility of developing a statewide truck traffic forecasting methodology for Wisconsin by using Origin-Destination surveys, traffic counts, classification counts, and other data that are routinely collected by the Wisconsin Department of Transportation (WisDOT). Development of a feasible model will permit estimation of future truck traffic for every major link in the network. This will provide the basis for improved estimation of future pavement deterioration. Pavement damage rises exponentially as axle weight increases, and trucks are responsible for most of the traffic-induced damage to pavement. Consequently, forecasts of truck traffic are critical to pavement management systems. The Pavement Management Decision Supporting System (PMDSS) prepared by WisDOT in May 1990 combines pavement inventory and performance data with a knowledge base consisting of rules for evaluation, problem identification, and rehabilitation recommendation. Without a reasonable truck traffic forecasting methodology, PMDSS cannot project pavement performance trends in order to make assessments and recommendations for future years. However, none of WisDOT's existing forecasting methodologies has been designed specifically for predicting truck movements on a statewide highway network. For this research, the Origin-Destination survey data available from WisDOT, covering two stateline areas, one county, and five cities, are analyzed and zone-to-zone truck trip tables are developed. The resulting Origin-Destination Trip Length Frequency (OD TLF) distributions by trip type are compared with comparable TLFs from the Gravity Model (GM). The gravity model is calibrated to obtain friction factor curves for the three trip types: Internal-Internal (I-I), Internal-External (I-E), and External-External (E-E). Both "macro-scale" and "micro-scale" calibrations are performed. The comparison of the statewide GM TLF with the OD TLF for the macro-scale calibration does not provide suitable results, because the available OD survey data do not represent an unbiased sample of statewide truck trips. For the "micro-scale" calibration, "partial" GM trip tables that correspond to the OD survey trip tables are extracted from the full statewide GM trip table. These "partial" GM trip tables are then merged, and a partial GM TLF is created. The GM friction factor curves are adjusted until the partial GM TLF matches the OD TLF. The three friction factor curves, one for each trip type, resulting from the micro-scale calibration produce a reasonable GM truck trip model. A key methodological issue for GM calibration involves the use of multiple friction factor curves versus a single friction factor curve for each trip type in order to estimate truck trips with reasonable accuracy. A single friction factor curve for each of the three trip types was found to reproduce the OD TLFs from the calibration data base. Given the very limited trip generation data available for this research, additional refinement of the gravity model using multiple friction factor curves for each trip type was not warranted. In traditional urban transportation planning studies, the zonal trip productions and attractions and region-wide OD TLFs are available. For this research, however, the information available for the development of the GM model is limited to Ground Counts (GC) and a limited set of OD TLFs.
The GM is calibrated using the limited OD data, but the OD data are not adequate to obtain good estimates of truck trip productions and attractions. Consequently, zonal productions and attractions are estimated using zonal population as a first approximation. Then, Selected Link based (SELINK) analyses are used to adjust the productions and attractions and possibly recalibrate the GM. The SELINK adjustment process involves identifying the origins and destinations of all truck trips that are assigned to a specified "selected link" as the result of a standard traffic assignment. A link adjustment factor is computed as the ratio of the actual volume for the link (ground count) to the total assigned volume. This link adjustment factor is then applied to all of the origin and destination zones of the trips using that "selected link". Selected link based analyses are conducted using both 16 selected links and 32 selected links. The SELINK analysis using 32 selected links provides the smallest %RMSE in the screenline volume analysis. In addition, the stability of the GM truck estimating model is preserved by using 32 selected links with three SELINK adjustments; that is, the GM remains calibrated despite substantial changes in the input productions and attractions. The coverage of zones provided by 32 selected links is satisfactory. Increasing the number of repetitions beyond four is not reasonable because the stability of the GM model in reproducing the OD TLF reaches its limits. The total volume of truck traffic captured by 32 selected links is 107% of total trip productions. More importantly, SELINK adjustment factors can be computed for all of the zones. Evaluation of the travel demand model resulting from the SELINK adjustments is conducted using screenline volume analysis, functional class and route specific volume analysis, area specific volume analysis, production and attraction analysis, and Vehicle Miles of Travel (VMT) analysis. Screenline volume analysis, using four screenlines with 28 check points, is used to evaluate the adequacy of the overall model. The total trucks crossing the screenlines are compared to the ground count totals. LV/GC ratios of 0.958 using 32 selected links and 1.001 using 16 selected links are obtained. The %RMSE for the four screenlines is inversely proportional to the average ground count totals by screenline. The %RMSE for the four screenlines resulting from the fourth and last GM run, using 32 and 16 selected links, is 22% and 31% respectively. These results are similar to the overall %RMSE achieved for the 32 and 16 selected links themselves, 19% and 33% respectively. This implies that the SELINK analysis results are reasonable for all sections of the state. Functional class and route specific volume analysis is possible using the available 154 classification count check points. The truck traffic crossing the Interstate highways (ISH) with 37 check points, the US highways (USH) with 50 check points, and the State highways (STH) with 67 check points is compared to the actual ground count totals. The magnitude of the overall link volume to ground count ratio by route does not show any specific pattern of over- or under-estimation. However, the %RMSE for the ISH shows the smallest value, while that for the STH shows the largest value. This pattern is consistent with the screenline analysis and the overall relationship between %RMSE and ground count volume groups.
Area specific volume analysis provides another broad statewide measure of the performance of the overall model. The truck traffic in the North area with 26 check points, the West area with 36 check points, the East area with 29 check points, and the South area with 64 check points is compared to the actual ground count totals. The four areas show similar results. No specific patterns in the LV/GC ratio by area are found. In addition, the %RMSE is computed for each of the four areas. The %RMSEs for the North, West, East, and South areas are 92%, 49%, 27%, and 35% respectively, whereas the average ground counts are 481, 1383, 1532, and 3154 respectively. As with the screenline and volume range analyses, the %RMSE is inversely related to average link volume. The SELINK adjustments of productions and attractions resulted in a very substantial reduction in the total in-state zonal productions and attractions. The initial in-state zonal trip generation model can now be revised with a new trip production rate (total adjusted productions/total population) and a new trip attraction rate. Revised zonal production and attraction adjustment factors can then be developed that reflect only the impact of the SELINK adjustments causing increases or decreases from the revised zonal estimates of productions and attractions. Analysis of the revised production adjustment factors is conducted by plotting the factors on the state map. The east area of the state, including the counties of Brown, Outagamie, Shawano, Winnebago, Fond du Lac, and Marathon, shows comparatively large values of the revised adjustment factors. Overall, both small and large values of the revised adjustment factors are scattered around Wisconsin. This suggests that independent variables beyond population alone are needed for the development of the heavy truck trip generation model. Additional independent variables, including zonal employment data (office employees and manufacturing employees) by industry type, zonal private trucks owned, and zonal income data, which are not currently available, should be considered. A plot of the frequency distribution of the in-state zones as a function of the revised production and attraction adjustment factors shows the overall adjustment resulting from the SELINK analysis process. Overall, the revised SELINK adjustments show that the productions for many zones are reduced by a factor of 0.5 to 0.8, while the productions for a relatively few zones are increased by factors from 1.1 to 4, with most of the factors in the 3.0 range. No obvious explanation for the frequency distribution could be found. The revised SELINK adjustments overall appear to be reasonable. The heavy truck VMT analysis is conducted by comparing the 1990 heavy truck VMT forecast by the GM truck forecasting model, 2.975 billion, with the WisDOT computed data. This estimate is 18.3% less than the WisDOT computation of 3.642 billion VMT. The WisDOT estimates are based on sampling the link volumes for USH, STH, and CTH, which implies potential error in sampling the average link volume. The WisDOT estimate of heavy truck VMT cannot be tabulated by the three trip types, I-I, I-E (E-I), and E-E. In contrast, the GM forecasting model shows that the proportion of E-E VMT out of total VMT is 21.24%. In addition, tabulation of heavy truck VMT by route functional class shows that the proportion of truck traffic traversing the freeways and expressways is 76.5%.
Only 14.1% of total freeway truck traffic is I-I trips, while 80% of total collector truck traffic is I-I trips. This implies that freeways are traversed mainly by I-E and E-E truck traffic, while collectors are used mainly by I-I truck traffic. Other tabulations, such as average heavy truck speed by trip type, average travel distance by trip type, and the VMT distribution by trip type, route functional class, and travel speed, are useful information for highway planners seeking to understand the characteristics of statewide heavy truck trip patterns. Heavy truck volumes for the target year 2010 are forecast using the GM truck forecasting model under four scenarios. For better forecasting, ground count-based segment adjustment factors are developed and applied. ISH 90 & 94 and USH 41 are used as example routes. The forecasting results using the ground count-based segment adjustment factors are satisfactory for long range planning purposes, but additional ground counts would be useful for USH 41. Sensitivity analysis provides estimates of the impacts of the alternative growth rates, including information about changes in the trip types using key routes. The network-based GM can easily model scenarios with different rates of growth in rural versus urban areas, small versus large cities, and in-state zones versus external stations.
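
A compact sketch of the core gravity-model step described above follows: trips are distributed in proportion to productions, attractions, and a friction-factor curve, balanced by iterative proportional fitting, and a SELINK-style link adjustment factor is formed as ground count divided by assigned volume. All numbers, the exponential friction function, and the three-zone system are illustrative assumptions, not the WisDOT data.

```python
# Doubly-constrained gravity model with a single friction-factor curve,
# plus a SELINK-style link adjustment factor (ground count / assigned volume).
import numpy as np

def gravity_model(productions, attractions, cost, friction, iters=100):
    # Seed trips proportional to P_i * A_j * F(c_ij), then balance by
    # iterative proportional fitting so row/column totals match P and A.
    T = np.outer(productions, attractions) * friction(cost)
    for _ in range(iters):
        T *= (productions / T.sum(axis=1))[:, None]   # match row totals
        T *= (attractions / T.sum(axis=0))[None, :]   # match column totals
    return T

# Example data: 3 zones, symmetric travel costs in minutes (illustrative).
P = np.array([1000.0, 800.0, 1200.0])     # zonal truck trip productions
A = np.array([900.0, 1100.0, 1000.0])     # zonal truck trip attractions
C = np.array([[5.0, 20.0, 35.0],
              [20.0, 5.0, 15.0],
              [35.0, 15.0, 5.0]])
trips = gravity_model(P, A, C, friction=lambda c: np.exp(-0.1 * c))

# SELINK-style adjustment: scale the P/A of zones whose trips use a
# selected link by (ground count / assigned volume) on that link.
assigned_volume, ground_count = 1450.0, 1390.0
adjustment = ground_count / assigned_volume
print(trips.round(1), adjustment)
```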


A study on the rock mass classification in boreholes for a tunnel design using machine learning algorithms (머신러닝 기법을 활용한 터널 설계 시 시추공 내 암반분류에 관한 연구)

  • Lee, Je-Kyum;Choi, Won-Hyuk;Kim, Yangkyun;Lee, Sean Seungwon
    • Journal of Korean Tunnelling and Underground Space Association
    • /
    • v.23 no.6
    • /
    • pp.469-484
    • /
    • 2021
  • Rock mass classification results have a great influence on construction schedule and budget as well as on tunnel stability in tunnel design. A total of 3,526 tunnels have been constructed in Korea, and the associated tunnel design and construction techniques have been continuously developed; however, few studies have examined how to assess rock mass quality and grade more accurately. As a result, many cases show large differences in the results depending on the inspector's experience and judgement. This study therefore aims to suggest a more reliable rock mass classification (RMR) model using machine learning algorithms, which are becoming increasingly available, based on analyses of the various rock and rock mass information collected from boring investigations. For this, 11 learning parameters (depth, rock type, RQD, electrical resistivity, UCS, Vp, Vs, Young's modulus, unit weight, Poisson's ratio, RMR) from 13 local tunnel cases were selected, 337 training data sets and 60 test data sets were prepared, and 6 machine learning algorithms (DT, SVM, ANN, PCA & ANN, RF, XGBoost) were tested with various hyperparameters for each algorithm. The results show that the mean absolute errors in RMR value for the five algorithms other than the Decision Tree were less than 8, and that the Support Vector Machine model performed best. The applicability of the model established through this study was confirmed, and this prediction model can be applied for more reliable rock mass classification as additional data are continuously accumulated.
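
A minimal sketch of the modelling setting, assuming scikit-learn: an SVM regressor predicts RMR from the ten borehole features named in the abstract and is scored by mean absolute error. The synthetic data and the hyperparameter values are placeholders, not the paper's 337/60 data sets or its tuned settings.

```python
# Sketch: SVM regression of RMR from borehole features, scored by MAE.
import numpy as np
from sklearn.svm import SVR
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline
from sklearn.metrics import mean_absolute_error

features = ["depth", "rock_type", "RQD", "resistivity", "UCS",
            "Vp", "Vs", "E", "unit_weight", "poisson"]
rng = np.random.default_rng(0)
X_train, X_test = rng.random((337, len(features))), rng.random((60, len(features)))
y_train = 20 + 60 * X_train[:, 2] + rng.normal(0, 3, 337)   # toy: RMR driven mostly by RQD
y_test = 20 + 60 * X_test[:, 2] + rng.normal(0, 3, 60)

model = make_pipeline(StandardScaler(), SVR(C=10.0, epsilon=1.0))  # placeholder hyperparameters
model.fit(X_train, y_train)
print("MAE:", mean_absolute_error(y_test, model.predict(X_test)))
```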

Relationship between Fertilizer Application Level and Soil Chemical Properties for Strawberry Cultivation under Greenhouse in Chungnam Province (충남지역 시설 딸기재배지 시비수준과 토양 화학성과의 관계)

  • Choi, Moon-Tae;Lee, Jin-Il;Yun, Yeo-Uk;Lee, Jong-Eun;Lee, Bong-Chun;Yang, Euy-Seog;Lee, Young-Han
    • Korean Journal of Soil Science and Fertilizer
    • /
    • v.43 no.2
    • /
    • pp.153-159
    • /
    • 2010
  • Nowadays, Korean farmers rely more on chemical fertilizers than on low-input sustainable agriculture drawn from the farm itself. In order to improve the soil nutritional imbalance and support environmentally friendly agriculture in greenhouses, we investigated the relationship between fertilizer application level and soil chemical properties for strawberry cultivation at 56 sites in Chungnam Province. The average amount of nitrogen applied as basal fertilization was 92.3 Mg $ha^{-1}$, 2.6 times the standard basal amount. Where compost was applied at more than 30 Mg $ha^{-1}$, the excess over the optimum level was 1.8-fold for EC, 3.0-fold for available phosphate, 2.6-fold for exchangeable potassium, 1.7-fold for exchangeable calcium, and 1.6-fold for exchangeable magnesium, respectively. The amount of compost applied was significantly correlated with available phosphate (r=0.370, $p{\leq}0.01$), exchangeable potassium (r=0.429, $p{\leq}0.01$), exchangeable calcium (r=0.404, $p{\leq}0.01$), exchangeable magnesium (r=0.453, $p{\leq}0.01$), and exchangeable sodium (r=0.369, $p{\leq}0.01$), respectively. Our results suggest that, for sustainable agriculture, soil nutrients should be managed by optimum fertilization based on soil testing for greenhouse strawberry cultivation.
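
The correlation analysis quoted above can be reproduced in outline as below, assuming SciPy; the compost and available-phosphate arrays are synthetic stand-ins for the 56 field observations.

```python
# Sketch of a Pearson correlation with significance, as reported above.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
compost = rng.uniform(10, 60, 56)                       # compost rate, Mg ha^-1 (synthetic)
avail_p = 50 + 2.0 * compost + rng.normal(0, 30, 56)    # available phosphate (synthetic)
r, p = stats.pearsonr(compost, avail_p)
print(f"r = {r:.3f}, p = {p:.4f}")                      # significant at the 1% level if p < 0.01
```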

Using Trophic State Index (TSI) Values to Draw Inferences Regarding Phytoplankton Limiting Factors and Seston Composition from Routine Water Quality Monitoring Data (영양상태지수 (trophic state index)를 이용한 수체 내 식물플랑크톤 제한요인 및 seston조성의 유추)

  • Havens, Karl E
    • Korean Journal of Ecology and Environment
    • /
    • v.33 no.3 s.91
    • /
    • pp.187-196
    • /
    • 2000
  • This paper describes a simple method that uses differences among Carlson's (1977) trophic state index (TSI) values based on total phosphorus (TP), chlorophyll a (CHL), and Secchi depth (SD) to draw inferences regarding the factors limiting phytoplankton growth and the composition of lake seston. Examples are provided for seasonal and spatial patterns in a large subtropical lake (Lake Okeechobee, Florida, USA) and for inter- and intra-lake variation in a multi-lake data set compiled from published studies. Once an investigator has collected routine water quality data and established TSI values based on TP, CHL, and SD, a number of inferences can be made; additional information can be gained where it is also possible to calculate a TSI based on total nitrogen (TN). Where TSI (CHL) exceeds TSI (SD), the light-attenuating particles are large (large filaments or colonies of algae), and the phytoplankton may be limited by zooplankton grazing. Other limiting conditions are inferred from different relationships between the TSI values. Results of this study indicate that the analysis is quite robust and generally agrees well with conclusions based on more direct methods (e.g., nutrient-addition bioassays, zooplankton size data, zooplankton removal experiments). The TSI approach, when validated periodically with these more costly and time-intensive methods, provides an effective, low-cost method for tracking long-term changes in pelagic structure and function, with potential value in monitoring lake ecology and responses to management.
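
For reference, the sketch below lists Carlson's (1977) TSI expressions for SD, CHL, and TP, together with a TN index commonly attributed to Kratzer and Brezonik; the sample values are illustrative, and the deviation TSI(CHL) - TSI(SD) is the kind of difference from which the inferences above are drawn.

```python
# Carlson-type trophic state indices; inputs: SD in m, CHL and TP in ug/L, TN in mg/L.
import math

def tsi_sd(sd_m):      return 60.0 - 14.41 * math.log(sd_m)          # Secchi depth
def tsi_chl(chl_ugL):  return 9.81 * math.log(chl_ugL) + 30.6        # chlorophyll a
def tsi_tp(tp_ugL):    return 14.42 * math.log(tp_ugL) + 4.15        # total phosphorus
def tsi_tn(tn_mgL):    return 54.45 + 14.43 * math.log(tn_mgL)       # total nitrogen (Kratzer & Brezonik)

sd, chl, tp = 0.5, 40.0, 90.0                                        # illustrative lake values
print(tsi_sd(sd), tsi_chl(chl), tsi_tp(tp))
print("TSI(CHL) - TSI(SD) =", tsi_chl(chl) - tsi_sd(sd))             # deviation used in the inference scheme above
```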


Research about feature selection that use heuristic function (휴리스틱 함수를 이용한 feature selection에 관한 연구)

  • Hong, Seok-Mi;Jung, Kyung-Sook;Chung, Tae-Choong
    • The KIPS Transactions:PartB
    • /
    • v.10B no.3
    • /
    • pp.281-286
    • /
    • 2003
  • A large number of features are collected for problem solving in real life, but utilizing all of the collected features is difficult. It is not easy to collect correct data for every feature, and if all collected data are used for learning, a complicated learning model is created and good performance cannot be obtained. Interrelationships and hierarchical relations also exist among the features. We can reduce the number of features by analyzing the relations among them using heuristic knowledge or statistical methods. A heuristic technique refers to learning through repeated trial and error and through experience. Experts can approach the relevant problem domain through an experience-based process of collecting opinions. These properties can be used to reduce the number of features used in learning: experts generate a new, highly abstract feature from raw data. This paper describes a machine learning model that reduces the number of features used in learning by means of a heuristic function and uses the abstracted feature as the neural network's input. We have applied this model to win/lose prediction in professional baseball games. The results show that the model combining the two techniques not only reduces the complexity of the neural network model but also significantly improves classification accuracy compared with using the neural network or the heuristic model alone.
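
A minimal sketch of the hybrid idea, assuming scikit-learn: a heuristic function collapses several raw statistics into one abstract feature, which is then fed to a small neural network classifier. The baseball-style raw features, the heuristic weights, and the network size are invented for illustration, not taken from the paper.

```python
# Sketch: heuristic feature abstraction followed by a small neural network.
import numpy as np
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(0)
raw = rng.random((200, 6))          # e.g. batting average, ERA, home/away, recent wins ... (synthetic)
y = (raw[:, 0] * 0.6 + raw[:, 1] * 0.4 + rng.normal(0, 0.1, 200) > 0.5).astype(int)

def heuristic_feature(x):
    # Expert-style weighting of a few raw features into one abstract score.
    return 0.6 * x[:, 0] + 0.4 * x[:, 1] - 0.2 * x[:, 2]

X = np.column_stack([heuristic_feature(raw), raw[:, 3]])  # reduced network input
clf = MLPClassifier(hidden_layer_sizes=(8,), max_iter=2000, random_state=0).fit(X, y)
print("train accuracy:", clf.score(X, y))
```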