• Title/Summary/Keyword: Large Data Set

Survey on the Content and Intake Pattern of Sugar from Elementary and Middle School Foodservices in Daejeon and Chungcheong Province (대전.충청지역 초.중학교 급식의 당 함량 및 급식을 통한 당류의 섭취실태 연구)

  • Park, You-Gyoung;Lee, Eun-Mi;Kim, Chang-Soo;Eom, Joon-Ho;Byun, Jung-A;Sun, Nam-Kyu;Lee, Jin-Ha;Heo, Ok-Soon
    • Journal of the Korean Society of Food Science and Nutrition / v.39 no.10 / pp.1545-1554 / 2010
  • The Korean government plans to set up a nationwide food safety system with strict control of hazardous nutrients such as sugar, fatty acids, and sodium, together with an advanced nutrition education system. In addition, because nearly all schools now provide food service, the government has had to consider more effective ways to improve the nutritional quality of school meals. The objective of this study was to provide data on the content and consumption of sugar in school meals for this nationwide project. For this purpose, we surveyed the sugar content of 842 school meal menus and their intake levels over 154 days in 8 schools in Daejeon and Chungcheong Province. Sugar content, defined as the sum of 5 sugars commonly detected in food, was analysed by HPLC-RID (refractive index detector). Sugar intake was calculated by multiplying the intake of each menu by the sugar content of that menu. Sugar content was highest in desserts, which include fruit juices, dairy products, and fruits. Among side dishes, sugar content was high in sauces and braised foods. Sugar intake per dish was high for beverages and dairy products, and one-dish meals contributed greatly to sugar intake because of the large amount consumed. The average lunch intakes of second-grade and fifth-grade elementary school students were 244 g/meal and 304 g/meal, respectively, and that of middle school students was 401 g/meal. The average sugar intake from one day's school lunch was 4.22 g (4.03 g for elementary and 5.31 g for middle school students), which is less than 10% of the daily sugar reference value for Koreans. This study provides exact data on sugar intake patterns, based on sugar content matched directly to the meals the students consumed.
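
The intake calculation described in this abstract (the amount of each menu eaten, multiplied by that menu's sugar content, summed over the meal) might be sketched as follows. The menu values are made-up placeholders rather than data from the study, and expressing sugar content per 100 g is an assumption.

```python
def sugar_intake_g(menu_items):
    """Sum sugar intake over one meal.

    Each item is (intake_g, sugar_g_per_100g); both values are hypothetical.
    """
    return sum(intake_g * sugar_per_100g / 100.0
               for intake_g, sugar_per_100g in menu_items)


# Hypothetical lunch tray: (grams eaten, grams of sugar per 100 g of that menu)
example_meal = [
    (150.0, 0.2),   # rice
    (60.0, 1.5),    # braised side dish
    (100.0, 2.0),   # dessert
]
total = sugar_intake_g(example_meal)
print(f"{total:.2f} g of sugar from this meal")
```

Keeping intake and content as separate per-menu values mirrors the study's point that sugar content is matched directly to what each student actually ate.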

Methodology for Identifying Issues of User Reviews from the Perspective of Evaluation Criteria: Focus on a Hotel Information Site (사용자 리뷰의 평가기준 별 이슈 식별 방법론: 호텔 리뷰 사이트를 중심으로)

  • Byun, Sungho;Lee, Donghoon;Kim, Namgyu
    • Journal of Intelligence and Information Systems / v.22 no.3 / pp.23-43 / 2016
  • As a result of the growth of Internet data and the rapid development of Internet technology, "big data" analysis has gained prominence as a major approach for evaluating and mining enormous data for various purposes. Especially, in recent years, people tend to share their experiences related to their leisure activities while also reviewing others' inputs concerning their activities. Therefore, by referring to others' leisure activity-related experiences, they are able to gather information that might guarantee them better leisure activities in the future. This phenomenon has appeared throughout many aspects of leisure activities such as movies, traveling, accommodation, and dining. Apart from blogs and social networking sites, many other websites provide a wealth of information related to leisure activities. Most of these websites provide information of each product in various formats depending on different purposes and perspectives. Generally, most of the websites provide the average ratings and detailed reviews of users who actually used products/services, and these ratings and reviews can actually support the decision of potential customers in purchasing the same products/services. However, the existing websites offering information on leisure activities only provide the rating and review based on one stage of a set of evaluation criteria. Therefore, to identify the main issue for each evaluation criterion as well as the characteristics of specific elements comprising each criterion, users have to read a large number of reviews. In particular, as most of the users search for the characteristics of the detailed elements for one or more specific evaluation criteria based on their priorities, they must spend a great deal of time and effort to obtain the desired information by reading more reviews and understanding the contents of such reviews. 
Although some websites break down the evaluation criteria and direct the user to input their reviews according to different levels of criteria, there exist excessive amounts of input sections that make the whole process inconvenient for the users. Further, problems may arise if a user does not follow the instructions for the input sections or fill in the wrong input sections. Finally, treating the evaluation criteria breakdown as a realistic alternative is difficult, because identifying all the detailed criteria for each evaluation criterion is a challenging task. For example, if a review about a certain hotel has been written, people tend to only write one-stage reviews for various components such as accessibility, rooms, services, or food. These might be the reviews for most frequently asked questions, such as distance between the nearest subway station or condition of the bathroom, but they still lack detailed information for these questions. In addition, in case a breakdown of the evaluation criteria was provided along with various input sections, the user might only fill in the evaluation criterion for accessibility or fill in the wrong information such as information regarding rooms in the evaluation criteria for accessibility. Thus, the reliability of the segmented review will be greatly reduced. In this study, we propose an approach to overcome the limitations of the existing leisure activity information websites, namely, (1) the reliability of reviews for each evaluation criteria and (2) the difficulty of identifying the detailed contents that make up the evaluation criteria. In our proposed methodology, we first identify the review content and construct the lexicon for each evaluation criterion by using the terms that are frequently used for each criterion. Next, the sentences in the review documents containing the terms in the constructed lexicon are decomposed into review units, which are then reconstructed by using the evaluation criteria. 
Finally, the issues of the constructed review units are derived for each evaluation criterion and the summary results are provided. In addition to the derived issues, the review units themselves are provided. This approach therefore aims to save users time and effort, because they read only the information relevant to each evaluation criterion rather than going through the entire text of every review. Our proposed methodology is based on topic modeling, which is actively used in text analysis. Each review is decomposed into sentence units rather than being treated as a single document unit. The resulting review units are then reorganized according to each evaluation criterion and used in the subsequent analysis. In this respect, this work differs considerably from existing topic modeling-based studies. In this paper, we collected 423 reviews from hotel information websites and decomposed them into 4,860 review units. We then reorganized the review units according to six evaluation criteria. Applying our methodology to these review units, we present the analysis results and demonstrate the utility of the proposed methodology.
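
The decomposition step described in this abstract (splitting reviews into sentence-level review units and regrouping them by evaluation criterion through a lexicon) could look roughly like the sketch below. The lexicon terms, criterion names, and sample reviews are hypothetical, and the topic-modeling step that derives issues from each group is not shown.

```python
import re
from collections import defaultdict

# Hypothetical lexicon: criterion -> terms frequently used for it.
LEXICON = {
    "accessibility": {"subway", "station", "airport", "walk"},
    "rooms": {"room", "bed", "bathroom"},
    "services": {"staff", "reception", "check-in"},
}


def split_into_units(review):
    """Split a review into sentence-level review units."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", review) if s.strip()]


def group_units_by_criterion(reviews, lexicon=LEXICON):
    """Assign each unit to every criterion whose lexicon terms it contains."""
    grouped = defaultdict(list)
    for review in reviews:
        for unit in split_into_units(review):
            tokens = set(re.findall(r"[a-z\-]+", unit.lower()))
            for criterion, terms in lexicon.items():
                if tokens & terms:
                    grouped[criterion].append(unit)
    return dict(grouped)


reviews = [
    "The subway station is a five minute walk away. The bathroom was spotless!",
    "Friendly staff at reception. The bed was too soft.",
]
units_by_criterion = group_units_by_criterion(reviews)
for criterion, units in units_by_criterion.items():
    print(criterion, units)
```

Because a sentence is matched against every criterion, a unit that mentions both rooms and staff would appear in both groups; whether the original method allows such overlap is not stated in the abstract.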

An Empirical Study on the Determinants of Supply Chain Management Systems Success from Vendor's Perspective (참여자관점에서 공급사슬관리 시스템의 성공에 영향을 미치는 요인에 관한 실증연구)

  • Kang, Sung-Bae;Moon, Tae-Soo;Chung, Yoon
    • Asia Pacific Journal of Information Systems / v.20 no.3 / pp.139-166 / 2010
  • Supply chain management (SCM) systems have emerged as strong managerial tools for manufacturing firms seeking to enhance competitive strength. Despite large investments in SCM systems, many companies are not fully realizing the promised benefits. A review of the literature on the adoption, implementation, and success factors of IOS (inter-organizational systems) and EDI (electronic data interchange) systems shows that this issue has been examined from multiple theoretical perspectives, and many researchers have attempted to identify the factors that influence the success of system implementation. However, the existing studies have two drawbacks in revealing the determinants of implementation success. First, previous studies raise questions about the appropriateness of the research subjects selected. Most SCM systems operate as private industrial networks whose participants consist of two distinct groups: focus companies and vendors. The focus companies are the primary actors in developing and operating the systems, while vendors are passive participants connected to the system in order to supply raw materials and parts to the focus companies. Under these circumstances, there are three ways of selecting research subjects: focus companies only, vendors only, or the two parties grouped together. Studies that use focus companies exclusively are hard to find, probably because the sample size is insufficient for statistical analysis. Most studies have used data collected from both groups. We argue that SCM success factors cannot be correctly identified in this case. The focus companies and the vendors are in different positions in many areas of system implementation, including firm size, managerial resources, bargaining power, and organizational maturity. There is no obvious reason to believe that the success factors of the two groups are identical. 
Grouping the two groups also raises questions about how system success is measured. The benefits of using the systems may not be shared equally by the two groups. One group's benefits might be realized at the expense of the other, given that vendors participating in SCM systems are under continuous pressure from the focus companies with respect to price, quality, and delivery time. Therefore, combining the system outcomes of both groups does not allow the benefits obtained by each group to be measured correctly. Second, the measures of system success adopted in previous studies have shortcomings for measuring SCM success. User satisfaction, system utilization, and user attitudes toward the systems are the most commonly used success measures in the existing studies. These measures were developed as proxy variables in studies of decision support systems (DSS), where the contribution of the systems to organizational performance is very difficult to measure. Unlike DSS, SCM systems have more specific goals, such as cost savings, inventory reduction, quality improvement, shorter lead time, and better customer service. We maintain that more specific measures can be developed instead of proxy variables in order to measure the system benefits correctly. The purpose of this study is to find the determinants of SCM system success from the perspective of vendor companies. In developing the research model, we focused on selecting success factors appropriate for vendors by reviewing past research and on developing more accurate success measures. The variables can be classified into technological, organizational, and environmental factors on the basis of the TOE (Technology-Organization-Environment) framework. 
The model consists of three independent variables (competition intensity, top management support, and information system maturity), one mediating variable (collaboration), one moderating variable (government support), and one dependent variable (system success). The system success measures were developed to reflect the operational benefits of SCM systems: improved planning and analysis capabilities, faster throughput, cost reduction, task integration, and improved product and customer service. The model was validated using survey data collected from 122 vendors participating in SCM systems in Korea. The mediating effect of collaboration was tested with hierarchical regression analysis, and the moderating effect of government support was tested with moderated multiple regression. The results show that information system maturity and top management support are the most important determinants of SCM system success. Supply chain technologies that standardize data formats and enhance information sharing may be adopted by the supply chain leader organization, because of the influence of the focal company in private industrial networks, in order to streamline transactions and improve inter-organizational communication. In particular, the need to develop and sustain information system maturity provides the focus and purpose needed to overcome information system obstacles and resistance to innovation diffusion within the supply chain network. The support of top management helps focus efforts toward the realization of inter-organizational benefits and lends credibility to the functional managers responsible for implementation. The active involvement, vision, and direction of high-level executives provide the impetus needed to sustain the implementation of SCM. The quality of collaborative relationships is also positively related to the outcome variable. 
The collaboration variable is found to mediate between the influencing factors and implementation success. Higher levels of inter-organizational collaboration behaviors, such as shared planning and flexibility in coordinating activities, were found to be strongly linked to the vendors' trust in the supply chain network. Government support moderates the effect of IS maturity, competitive intensity, and top management support on collaboration and on the implementation success of SCM. In general, vendor companies face substantially greater risks in SCM implementation than larger companies do because of severe constraints on financial and human resources and limited education on SCM systems. Beyond resources, vendors generally lack computer experience and do not have sufficient internal SCM expertise. For these reasons, the government may establish requirements for firms doing business with it or provide incentives to adopt and implement SCM systems or practices. Government support significantly improves the implementation success of SCM when IS maturity, competitive intensity, top management support, and collaboration are low. The environmental characteristic of competition intensity has no direct effect on SCM system success from the vendor perspective. However, vendors facing above-average competition intensity will have a greater need for changing technology. This suggests that companies trying to implement SCM systems should set up compatible supply chain networks and high-quality collaborative relationships for implementation and performance.
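
For reference, the two tests named in this abstract are commonly written in the textbook form below. The symbols are assumptions introduced here for illustration ($X$ an independent variable such as IS maturity, $M$ collaboration, $Z$ government support, $Y$ system success); the abstract does not give the authors' exact model specification.

```latex
% Mediation via hierarchical regression (three nested steps)
\begin{align}
  M &= a_0 + a\,X + e_1 \\
  Y &= c_0 + c\,X + e_2 \\
  Y &= b_0 + c'\,X + b\,M + e_3
\end{align}
% Moderated multiple regression: the interaction term carries the moderation
\begin{equation}
  Y = \beta_0 + \beta_1 X + \beta_2 Z + \beta_3 (X \times Z) + \varepsilon
\end{equation}
```

Under this reading, mediation is suggested when $a$ and $b$ are significant and $c'$ is smaller in magnitude than $c$, and moderation is suggested when $\beta_3$ differs significantly from zero.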

Summative Evaluation of 1993, 1994 Discussion Contest of Scientific Investigation (제 1, 2회 학생 과학 공동탐구 토론대회의 종합적 평가)

  • Kim, Eun-Sook;Yoon, Hye-Gyoung
    • Journal of The Korean Association For Science Education / v.16 no.4 / pp.376-388 / 1996
  • This study evaluated the first and second "Discussion Contest of Scientific Investigation", which was part of the 'Korean Youth Science Festival' held in 1993 and 1994. The evaluation was based on data collected from the middle school students on the final teams, their teachers, and a large number of middle school and college students who were in the audience of the final competition. Questionnaires, interviews, the final teams' reports, and videotape of the final competition were used to collect data. The study focused on three research questions. The first concerned the preparation and research process of the students on the final teams. The second concerned the format and proceedings of the Contest. The third was whether participating in the Contest was a useful experience for the students and teachers of the final teams. The first area, the students' preparation and research process, was investigated in three aspects: the level of cooperation, participation, and support and the role of teachers; information search and experiments; and report writing. The students on the final teams from both years had positive opinions about cooperation, students' active involvement, and support from family and school. Students considered their teachers to be guides or counsellors, which indicates the students' own level of active participation. On the other hand, interviews with the 1993 participants showed that there were times when teachers took a strong leading role. One can therefore conclude that students took active roles most of the time, although room for improvement remains. To search for the information they needed during the preparation period, students visited various places such as libraries, bookstores, universities, and research institutes. Their search was not limited to reading books, although books were the primary source of information. 
Students also learned how to organize the information they found and considered learning this organizing skill useful and fun. A variety of experiments was an important part of the preparation, and students had a positive opinion of it. Understanding the related theory was considered the most difficult and important task, while designing and building proper equipment was considered difficult but not important. This reflects the students' school experience, in which the equipment was all set up in advance and students were asked to confirm the theories presented in previous class hours. Regarding the reports recording the research process, students recognized the importance and necessity of the report but had difficulty writing it. Their reports tended to list everything they did without a clear connection to the problem to be solved. Most of the reports did not record references, and some confused report writing with storytelling. Therefore, most of the students need training in report writing. It is also desirable for reports to describe the process of student learning when theory or mathematics beyond the level of the middle school curriculum is used, because that learning is part of the investigation. The second area of evaluation concerned the format and proceedings of the Contest, the problems given to students, and the process of student discussion. The format of the Contest, which consisted of four parts (presentation, refutation, debate, and review), was rated well by students: according to them, it made them think more and was more demanding, but it was meaningful and helped them remember what they learned for longer. On the other hand, students said the time given to each part of the Contest was too short. The problems given to students were short and open-ended, to stimulate students' imagination and to offer various possible routes to a solution. This type of problem was very unfamiliar and gave students a great deal of difficulty. 
Students had a positive opinion of the research process they experienced but did not recognize that such a process was possible because of the openness of the task. The level of the problems was rated as too difficult by teachers and college students but as appropriate by the middle school students in the audience and the participating students. This suggests that students can turn problems into challenging and intellectually satisfying tasks appropriate for their level of understanding, even when the problems are difficult for middle school students. During the student discussions, a few problems were observed. Some were related to discussion technique, such as behavior inappropriate for the role a student was taking and answers that did not match the questions. Others were related to thinking: for example, students' thinking was skewed toward deductive reasoning, and reasoning based on experimental data was weak. The last area of evaluation was the effect of the Contest. It was measured through changes in attitude toward science and science classes and through willingness to attend the next Contest. According to the questionnaire results, no meaningful change in attitude was observed. However, the interviews revealed several students with a significant positive change in attitude and no students with a negative change. Most of the students who participated in the Contest said they would participate again or recommend that their friends participate. Most of the teachers agreed that the Contest should continue and said they would recommend that their colleagues or students participate. As described above, the "Discussion Contest of Scientific Investigation", which was developed and tried as a new kind of science contest, drew a positive response from participating students and teachers and from the audience. 
Two of the results in particular demonstrated that the goal of the Contest, an "active and cooperative science learning experience", was reached. One is that students found the experience of cooperation, discussion, information search, and a variety of experiments to be fun and valuable. The other is that students recognized that the format of the Contest, consisting of presentation, refutation, discussion, and review, required more thinking and was challenging but was more meaningful. Despite a few problems, such as unfamiliarity with discussion techniques, weakness in inductive and/or experiment-based reasoning, and difficulty in report writing, the Contest demonstrated the possibility of a new science learning environment and a new kind of science contest by offering students the chance to take on open tasks using their science knowledge and their ability to inquire and to discuss rationally and critically with other students.

Delineating Transcription Factor Networks Governing Virulence of a Global Human Meningitis Fungal Pathogen, Cryptococcus neoformans

  • Jung, Kwang-Woo;Yang, Dong-Hoon;Maeng, Shinae;Lee, Kyung-Tae;So, Yee-Seul;Hong, Joohyeon;Choi, Jaeyoung;Byun, Hyo-Jeong;Kim, Hyelim;Bang, Soohyun;Song, Min-Hee;Lee, Jang-Won;Kim, Min Su;Kim, Seo-Young;Ji, Je-Hyun;Park, Goun;Kwon, Hyojeong;Cha, Sooyeon;Meyers, Gena Lee;Wang, Li Li;Jang, Jooyoung;Janbon, Guilhem;Adedoyin, Gloria;Kim, Taeyup;Averette, Anna K.;Heitman, Joseph;Cheong, Eunji;Lee, Yong-Hwan;Lee, Yin-Won;Bahn, Yong-Sun
    • 한국균학회소식:학술대회논문집 / 2015.05a / p.59 / 2015
  • Cryptococcus neoformans causes life-threatening meningoencephalitis in humans, but the treatment of cryptococcosis remains challenging. To develop novel therapeutic targets and approaches, signaling cascades controlling the pathogenicity of C. neoformans have been extensively studied, but the underlying biological regulatory circuits remain elusive, particularly because of the presence of an evolutionarily divergent set of transcription factors (TFs) in this basidiomycetous fungus. In this study, we constructed a high-quality library of 322 signature-tagged gene deletion strains for 155 putative TF genes, which were previously predicted using the DNA-binding domain TF database (http://www.transcriptionfactor.org/). Using the 322 TF gene deletion strains, we tested in vivo and in vitro phenotypic traits under 32 distinct growth conditions. At least one phenotypic trait was exhibited by 145 out of 155 TF mutants (93%), and approximately 85% of the TFs (132/155) have been functionally characterized for the first time in this study. Through high-coverage phenome analysis, we discovered myriad novel TFs that play critical roles in growth, differentiation, virulence-factor (melanin, capsule, and urease) formation, stress responses, antifungal drug resistance, and virulence. Large-scale virulence and infectivity assays in insect (Galleria mellonella) and mouse host models identified 34 novel TFs that are critical for pathogenicity. The genotypic and phenotypic data for each TF are available in the C. neoformans TF phenome database (http://tf.cryptococcus.org). In conclusion, our phenome-based functional analysis of the C. neoformans TF mutant library provides key insights into the transcriptional networks of basidiomycetous fungi and ubiquitous human fungal pathogens.

A Study on Detection Methodology for Influential Areas in Social Network using Spatial Statistical Analysis Methods (공간통계분석기법을 이용한 소셜 네트워크 유력지역 탐색기법 연구)

  • Lee, Young Min;Park, Woo Jin;Yu, Ki Yun
    • Journal of Korean Society for Geospatial Information Science / v.22 no.4 / pp.21-30 / 2014
  • Lately, with the growth of various social media, new influentials have attracted large numbers of followers on social networks. There has been considerable research on these influential people in social networks, but it has made limited use of the location information available in location-based social network services (LBSNS). Therefore, the purpose of this study is to propose a spatial detection methodology and an application plan for influentials who comment on diverse social and cultural issues in LBSNS, using spatial statistical analysis methods. Twitter was used as the data source, and 168,040 Twitter messages were collected in Seoul over a one-month period. 'Politics,' 'economy,' and 'IT' were set as categories, and hot-issue keywords were defined for each category. An exposure index for finding influentials with respect to the hot-issue keywords was then derived, and the exposure index for each administrative unit of Seoul was calculated through a spatial join operation. Moreover, an influential index that considers the spatial dependence of the exposure index was derived in order to extract the influential areas in the top 5% of the influential index and to analyze their spatial distribution characteristics and spatial correlation. The experimental results showed that the spatial correlation coefficient was relatively high, at more than 0.3, within the same category, and that the correlation coefficient between the politics and economy categories was also more than 0.3. On the other hand, the correlation coefficient between the politics and IT categories was low at 0.18, and that between the economy and IT categories was also low at 0.15. This study is significant in that it gives influentials a concrete form from a spatial information perspective, and the results may be useful in the field of gCRM in the future.

IMAGING SIMULATIONS FOR THE KOREAN VLBI NETWORK (KVN) (한국우주전파관측망(KVN)의 영상모의실험)

  • Jung, Tae-Hyun;Rhee, Myung-Hyun;Roh, Duk-Gyoo;Kim, Hyun-Goo;Sohn, Bong-Won
    • Journal of Astronomy and Space Sciences / v.22 no.1 / pp.1-12 / 2005
  • The Korean VLBI Network (KVN) will open a new field of research in astronomy, geodesy, and earth science using three new 21 m radio telescopes. This will expand our ability to look at the Universe in the millimeter regime. The imaging capability of radio interferometry is highly dependent upon the antenna configuration and on the size, declination, and shape of the target source. In this paper, imaging simulations are carried out with the KVN system configuration. Five test images were used: a point source, multiple point sources, uniform spheres of two different sizes relative to the synthesized beam of the KVN, and a Very Large Array (VLA) image of Cygnus A. The declination for the full-time simulation was set to +60 degrees, and the observation time range was -6 to +6 hours around transit. The simulations were done at 22 GHz, one of the KVN observation frequencies. All simulations and data reduction were run with the Astronomical Image Processing System (AIPS) software package. The KVN array has a resolution of about 6 mas (milliarcseconds) at 22 GHz. When the model source is approximately the beam size or smaller, the ratio of peak intensity to RMS is about 10000:1 and 5000:1. When the model source is larger than the beam size, the ratio falls to about 115:1 and 34:1. This is due to the lack of short baselines and the small number of antennas. We compare the coordinates of the model images with those of the cleaned images. The results show nearly perfect correspondence except in the case of the 12 mas uniform sphere. Therefore, the main astronomical targets for the KVN will be compact sources, and the KVN will perform excellently in astrometry for these sources.
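
The quoted resolution of about 6 mas can be checked against the usual diffraction estimate, resolution ≈ wavelength / longest baseline. The sketch below assumes a longest baseline of roughly 480 km; that figure is an assumption for illustration and is not stated in the abstract.

```python
import math

C_M_PER_S = 299_792_458.0                           # speed of light
MAS_PER_RAD = math.degrees(1.0) * 3600.0 * 1000.0   # radians -> milliarcseconds


def resolution_mas(freq_hz, baseline_m):
    """Diffraction-limited resolution, theta ~ lambda / B, in milliarcseconds."""
    wavelength_m = C_M_PER_S / freq_hz
    return wavelength_m / baseline_m * MAS_PER_RAD


# Assumed longest baseline (~480 km); not taken from the abstract.
theta = resolution_mas(22e9, 480e3)
print(f"{theta:.1f} mas")
```

With these inputs the estimate lands close to 6 mas, which is consistent with the observing frequency being 22 GHz.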

Theoretical Study on Optimal Conditions for Absorbent Regeneration in CO2 Absorption Process (이산화탄소 흡수 공정에서 흡수액 최적 재생 조건에 대한 이론적 고찰)

  • Park, Sungyoul
    • Korean Chemical Engineering Research / v.50 no.6 / pp.1002-1007 / 2012
  • A considerable portion of energy demand has been met by the combustion of fossil fuels, and the consequent $CO_2$ emissions are considered a main cause of global warming. As a technology option for mitigating $CO_2$ emissions, absorption processes have been used to capture $CO_2$ from large-scale emission sources. Setting optimal operating parameters for the $CO_2$ absorption and solvent regeneration units is important for the performance of the whole $CO_2$ absorption plant. Optimal operating parameters are usually selected from a large amount of actual operating data. However, a theoretical approach is also useful, because arbitrary changes to process parameters are often restricted for the sake of stable process operation. In this paper, a theoretical approach based on vapor-liquid equilibrium is proposed to estimate the optimal operating conditions of a $CO_2$ absorption process. Two $CO_2$ absorption processes, one using 12 wt% aqueous $NH_3$ solution and one using 20 wt% aqueous MEA solution, were investigated. The results showed that the $CO_2$ loading of the rich absorbent should be kept below 0.4 for the 12 wt% aqueous $NH_3$ solution, whereas there was no limit on $CO_2$ loading for the 20 wt% aqueous MEA solution. The optimal regeneration temperature was determined theoretically from the $CO_2$ loadings of the rich and lean absorbent, which were set to satisfy the amount of $CO_2$ absorbed. The amount of heating medium at the optimal regeneration temperature was likewise determined to meet the difference in $CO_2$ loading between the rich and lean absorbent. It was confirmed that the theoretical approach, which accurately estimated the optimal regeneration conditions of lab-scale $CO_2$ absorption using 12 wt% aqueous $NH_3$ solution, could also estimate those for 20 wt% aqueous MEA solution and could be used for the design and operation of $CO_2$ absorption processes using chemical absorbents.

The Effect of Freshwater Inflow on the Spatio-temporal Variation of Water Quality of Yeongil Bay (영일만 수질의 시ㆍ공간 변동에 미치는 담수유입의 효과)

  • 김영숙;김영섭
    • Korean Journal of Environmental Biology / v.22 no.1 / pp.57-65 / 2004
  • In order to determine the effect of freshwater inflow from the Heongsan River on changes in water quality in Yeongil Bay (Korea), the seasonal changes of water temperature, salinity, chemical oxygen demand (COD), dissolved inorganic nitrogen (DIN), and phosphate phosphorus ($PO_4$-P) concentrations were examined using a data set obtained at five fixed points in Yeongil Bay from 1998 to 2000. The distributions and changes of COD and of the concentrations of total inorganic phosphorus (TIP) and nitrogen (TIN) at three points in the Heongsan River were also compared with those of Yeongil Bay. Based on the correlations of DIN and $PO_4$-P, it was found that the inflow of freshwater affected the water quality of Yeongil Bay. This effect was confirmed by the prominent differences in a few water quality measures between Site 1 (the innermost area) and Site 5 (the mouth of the bay). The negative correlations in $\Delta N/\Delta P$ at Sites 1, 2, and 3 in the inner part of the bay also indicated a large effect of freshwater inflow on the water quality of the bay. The extremely low average atomic ratio of 6.4 in $\Delta N/\Delta P$, compared to the Redfield ratio, suggested that DIN was depleted in the overall bay system. In contrast, it was inferred that the excess $PO_4$-P concentration was due to the inflow of freshwater from the Heongsan River.

Optimal supervised LSA method using selective feature dimension reduction (선택적 자질 차원 축소를 이용한 최적의 지도적 LSA 방법)

  • Kim, Jung-Ho;Kim, Myung-Kyu;Cha, Myung-Hoon;In, Joo-Ho;Chae, Soo-Hoan
    • Science of Emotion and Sensibility / v.13 no.1 / pp.47-60 / 2010
  • Most research on classification has used kNN (k-Nearest Neighbor) and SVM (Support Vector Machine), which are known as learning-based models, and the Bayesian classifier and NNA (Neural Network Algorithm), which are known as statistics-based methods. However, these approaches face space and time limitations when classifying the very large number of web pages on today's internet. Moreover, most classification studies use a uni-gram feature representation, which does not represent the real meaning of words well. Korean web page classification poses an additional problem because Korean words often have multiple meanings (polysemy). For these reasons, LSA (Latent Semantic Analysis) has been proposed for classification in such an environment (large data sets and polysemous words). LSA uses SVD (Singular Value Decomposition), which decomposes the original term-document matrix into three matrices and reduces their dimension. Through SVD, it is possible to create a new low-dimensional semantic space for representing vectors, which makes classification efficient and allows the latent meaning of words or documents (or web pages) to be analyzed. Although LSA is good at classification, it has a drawback. When SVD reduces the dimensions of the matrix and creates the new semantic space, it considers which dimensions represent the vectors well, not which dimensions discriminate between them well. This is why LSA does not improve classification performance as much as expected. In this paper, we propose a new LSA that selects the optimal dimensions for both discriminating and representing vectors, minimizing these drawbacks and improving performance. The proposed method shows better and more stable performance than other LSA variants in a low-dimensional space. In addition, we obtain further improvement in classification by creating and selecting features, removing stopwords and statistically weighting specific values.
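
The idea of choosing latent dimensions by how well they discriminate classes, rather than only by how well they represent the data, can be illustrated with the sketch below. The scoring rule (variance of class means divided by mean within-class variance) is an assumption chosen for illustration; the abstract does not state the authors' selection criterion. The document vectors stand in for the output of an SVD step, which is not shown.

```python
from statistics import mean, pvariance


def discriminability(values, labels):
    """Variance of the class means divided by the mean within-class variance."""
    by_class = {}
    for v, y in zip(values, labels):
        by_class.setdefault(y, []).append(v)
    between = pvariance([mean(vs) for vs in by_class.values()])
    within = mean(pvariance(vs) for vs in by_class.values())
    return between / (within + 1e-12)   # small constant avoids division by zero


def select_dimensions(vectors, labels, k):
    """Return indices of the k dimensions with the highest score."""
    n_dims = len(vectors[0])
    scores = [discriminability([v[d] for v in vectors], labels)
              for d in range(n_dims)]
    return sorted(range(n_dims), key=lambda d: scores[d], reverse=True)[:k]


# Toy document vectors in a 3-dimensional latent space (made-up values).
docs = [
    [0.9, 0.1, 0.5],
    [0.8, 0.9, 0.4],
    [0.1, 0.2, 0.6],
    [0.2, 0.8, 0.5],
]
labels = ["sports", "sports", "politics", "politics"]
selected = select_dimensions(docs, labels, k=1)
print(selected)
```

In the toy data the second dimension has the largest spread but the same mean in both classes, so a purely representation-based ranking would favor it while the discriminability score does not.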
