• Title/Summary/Keyword: principle of text distribution

Benford's Law in Linguistic Texts: Its Principle and Applications (언어 텍스트에 나타나는 벤포드 법칙: 원리와 응용)

  • Hong, Jung-Ha
    • Language and Information / v.14 no.1 / pp.145-163 / 2010
  • This paper proposes that Benford's Law, the non-uniform distribution of leading digits in lists of numbers drawn from many real-life sources, also appears in linguistic texts. The first digits in the frequency lists of morphemes from the Sejong Morphologically Analyzed Corpora are distributed non-uniformly in accordance with Benford's Law, while also showing the complexity typical of numerical data from complex systems such as earthquakes. Benford's Law in texts is a principle that reflects the regular distribution of low-frequency linguistic types, known as LNRE (large number of rare events), and it governs texts, corpora, and sample texts relatively independently of text size and the number of types. Although texts share a similar distribution pattern under Benford's Law, the non-uniform distribution varies slightly from text to text, and this variation can be used to evaluate the randomness of a text's distribution with a focus on low-frequency types.
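
As a rough illustration of the first-digit comparison described in this abstract, the sketch below computes the Benford expectation log10(1 + 1/d) for each leading digit d and compares it with the leading digits of a frequency list. The function names and the sample frequencies are hypothetical and are not taken from the paper or from the Sejong corpora.

```python
import math
from collections import Counter


def benford_expected():
    """Expected share of each leading digit 1..9 under Benford's Law."""
    return {d: math.log10(1 + 1 / d) for d in range(1, 10)}


def leading_digit_shares(frequencies):
    """Observed share of each leading digit among positive integer counts."""
    digits = [int(str(f)[0]) for f in frequencies if f > 0]
    counts = Counter(digits)
    total = len(digits)
    return {d: counts.get(d, 0) / total for d in range(1, 10)}


# Hypothetical morpheme frequencies (illustrative only, not corpus data).
sample_frequencies = [1, 1, 1, 2, 2, 3, 5, 8, 13, 21, 34, 120, 987]

expected = benford_expected()
observed = leading_digit_shares(sample_frequencies)
for d in range(1, 10):
    print(f"{d}: expected {expected[d]:.3f}, observed {observed[d]:.3f}")
```

With a real frequency list, the gap between the two columns is one simple way to see how far a text departs from the Benford pattern.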

An Analysis of the 8th Grade Probability Curriculum in Accordance with the Distribution Concepts (분포 개념의 연계성 목표 관점에 따른 중학교 확률 단원 분석)

  • Lee, Young-Ha;Huh, Ji-Young
    • Journal of Educational Research in Mathematics / v.20 no.2 / pp.163-183 / 2010
  • The meaning of probability has long been a matter of controversy. A century has passed since mathematical probability took its place at the center of the school curriculum on probability. Recently, the statistical meaning of probability has become important for various reasons. However, simply modifying the definition is not enough. Computational reasoning about probability and its practical application require didactical changes and new instructional transformations along with the modified definition. Most current textbooks introduce probability as the limit of relative frequencies, that is, as statistical probability. But when they come to computing the probability of the union of two events, or of simultaneous events, they use mathematical probability for explanation and practice. This leaves a gap in students' understanding. Probability is an intuitive concept as long as it belongs to the domain of experienced frequency, and the frequency distribution must be the instructional basis for novices in (statistical) probability. This is what we mean by probability in accordance with the distribution concepts. First of all, in order to explain the probability of the complementary event, we should first explain its empirical relative frequency. The same holds for the union of two events and for simultaneous events. Moreover, we need to provide a logic of probabilistic guesses, inferences, and decisions, which we introduce under the name “the likelihood principle”, the most famous statistical principle. We emphasize that this should be done through problems of practical decision making.

Study of Gyeongbosinpyeon, a Late Joseon Medical Records (조선 후기 의안(醫案) 『경보신편(輕寶新編)』 연구)

  • Jeon, Jongwook
    • Journal of Korean Medical Classics / v.30 no.1 / pp.185-209 / 2017
  • Objectives : This paper reviews the healing processes employed in the traditional era and identifies features unique to Korean Medicine by categorizing and analyzing the distribution of patients and the aspects and results of treatments recorded in Gyeongbosinpyeon, a historical text thought to have been authored by a regional doctor active in Joseon during the mid- to late 19th century. Methods : A table was created to give an overview of all 141 medical records in Gyeongbosinpyeon, and 7 categories were created, each containing 2 to 3 medical records with a distinctive character. The paper provides translations alongside the original texts and analyzes their medical and social significance by comparing the medical records. Results : The clinical competence displayed by this doctor, who worked in Joseon during the 19th century, was surprisingly high, and the text appears worthy of dissemination when compared with Yeogsimanpil, which has already been introduced to the world. It is highly significant how closely the principle of holistic treatment, the fundamental aspect of Joseon's medical study, was adhered to. The passages that show the author's medical activities and their unique characteristics are also worthy of attention. Conclusions : Korean medicine possesses a remarkable text called Donguibogam, but success in clinical practice is not guaranteed by textual knowledge alone. Authoritative texts of this kind and medical records of actual practice can be seen to complement each other in improving the quality of Joseon's study of medicine.

A Study on the Development Strategy of Artificial Intelligence Technology Using Multi-Attribute Weighted Average Method (다요소 가중 평균법을 이용한 인공지능 기술 개발전략 연구)

  • Chang, Hae Gak;Choi, Il Young;Kim, Jae Kyeong
    • Journal of Information Technology Services / v.19 no.2 / pp.93-107 / 2020
  • Recently, artificial intelligence (AI) technologies have been widely used in various fields such as finance and distribution. Accordingly, Korea also announced its AI R&D strategy for the realization of i-Korea 4.0 in May 2018. However, Korea's AI technology lags behind that of major competitors such as the US, Canada, and Japan. Therefore, in order to cope with the 4th industrial revolution, it is necessary to allocate AI R&D budgets efficiently through selection and concentration so as to gain a competitive advantage under a limited budget. In this study, the importance of each AI technology was evaluated in a multi-dimensional way through a questionnaire administered to an expert group, using evaluation indices derived from a literature review. The results of this study have the following implications. In order to establish AI technology development strategies successfully, it is necessary, in line with the principle of selection and concentration, to prioritize cognitive computing technology, which ranks high in market growth potential, ripple effect of technology development, and urgency of technology development. To this end, it is necessary to find creative ideas, manage assessments, converge multidisciplinary systems, and strengthen core competencies. In addition, since AI technology has a large impact on socioeconomic development, it is necessary to comprehensively grasp and manage scientific and technological regulations in order to promote AI technology development systematically.
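
The abstract does not spell out how the expert ratings were combined, so the following is only a minimal sketch of a plain weighted average across criteria. The three criterion names come from the abstract; the weights, the scores, and the second technology ("hypothetical technology B") are invented for illustration.

```python
def weighted_average(scores, weights):
    """Weighted average of criterion scores; weights need not sum to 1."""
    total_weight = sum(weights.values())
    return sum(scores[c] * weights[c] for c in weights) / total_weight


# Hypothetical criterion weights (not taken from the paper).
weights = {"market_growth": 0.4, "ripple_effect": 0.3, "urgency": 0.3}

# Hypothetical expert scores on a 1-5 scale (not taken from the paper).
technologies = {
    "cognitive computing": {"market_growth": 5, "ripple_effect": 4, "urgency": 4},
    "hypothetical technology B": {"market_growth": 3, "ripple_effect": 4, "urgency": 2},
}

# Rank technologies from highest to lowest combined score.
ranking = sorted(
    technologies,
    key=lambda name: weighted_average(technologies[name], weights),
    reverse=True,
)
for name in ranking:
    print(name, round(weighted_average(technologies[name], weights), 2))
```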

Study on Improving the System for the Revitalization and Efficient Management of the Local Commercial Area (지역상권 활성화 및 효율적 관리를 위한 제도 개선방안 연구)

  • Kim, Seung-Hee;Kim, Young-Ki
    • Journal of Distribution Science / v.11 no.5 / pp.55-62 / 2013
  • Purpose - This study aims to determine the problems and limitations of the Commercial Area Activation System, which was created by a special law for promoting traditional markets and shopping districts in order to revitalize and efficiently manage central commercial areas in different regions. We also suggest options for its improvement. Research design, data, and methodology - We look into the problems of the commercial area activation business, which is being promoted as a demonstration project, from the aspects of the legal text and guidelines. Results - The current commercial area activation system has several problems. First, the establishment of a comprehensive basic plan for commercial area activation is not a requirement. Second, the benefit principle should be established to prevent moral laxity among merchants, who play important roles as main actors in the commercial area activation business. Third, the current special law constrains the commercial management organization under the civil law, which limits its ability to find a profitable business model. Fourth, to efficiently, constructing a system that links the other central government businesses and is needed. into a regional development budget or a budget for funding small businesses that the central government can control, which is effective. Further, we offer some suggestions for medium- and long-term policies. First, an integrated coordination mechanism at the central office level should be installed while setting the basic policy to revitalize the Based on this policy, local governments need a system that exclusively based on the after establishing a comprehensive plan for urban regeneration and getting approval from the integration organization. Second, a system is needed that makes it possible to understand problems with business promotion by monitoring the procedure of supporting projects and regularly assessing business achievements. Third, a plan is needed for resolving conflicts among the various interested parties when the commercial area activation system is adopted to carry out a total redevelopment of a commercial area where small shops are densely located. A market maintenance project has been conducted as a means to revive economically depressed traditional markets and the local economy, but it is mostly conducted in the form of reconstruction or redevelopment and represents the interests of landowners and merchants. Thus, it is most likely to lead to a gradual disappearance of traditional markets. Conclusions - This study looks primarily into the problems that appear in the legal text and guidelines of the commercial area activation business, which has been running as a demonstration project since 2011, and suggests some solutions and directions for improvement.

Spatial Clustering Analysis based on Text Mining of Location-Based Social Media Data (위치기반 소셜 미디어 데이터의 텍스트 마이닝 기반 공간적 클러스터링 분석 연구)

  • Park, Woo Jin;Yu, Ki Yun
    • Journal of Korean Society for Geospatial Information Science / v.23 no.2 / pp.89-96 / 2015
  • Location-based social media data have high potential for use in various areas such as big data and location-based services. In this study, we applied a series of analysis methods to determine how important keywords in location-based social media are spatially distributed by analyzing text information. For this purpose, we collected geo-tagged tweets in the Gangnam district of Seoul and its environs for the month of August 2013. From these tweets, principal keywords were extracted. Among these, keywords in three categories (food, entertainment, and work and study) were selected and classified by category. Spatial clustering was then performed on the tweets containing keywords in each category. The clusters of each category were compared with buildings and benchmark POIs at the same locations. The comparison showed that clusters in the food category were highly consistent with large-scale commercial areas. Clusters in the entertainment category corresponded to theaters and sports complexes. Clusters in the work and study category were highly consistent with areas where private institutes and office buildings are concentrated.

A Study on the Risk Factors for Maternal and Child Health Care Program with Emphasis on Developing the Risk Score System (모자건강관리를 위한 위험요인별 감별평점분류기준 개발에 관한 연구)

  • 이광옥
    • Journal of Korean Academy of Nursing / v.13 no.1 / pp.7-21 / 1983
  • For the flexible and rational distribution of limited existing health resources based on measurements of individual risk, the so-called Risk Approach is being proposed by the World Health Organization as a managerial tool in maternal and child health care programs. This approach, in principle, requires us to develop a technique for measuring the degree of risk or for discriminating the future outcomes of pregnancy on the basis of prior information obtainable in prenatal care delivery settings. Numerous recent studies have focused on identifying relevant risk factors to serve as the prior information and on defining the adverse outcomes of pregnancy to be discriminated, and have also tried to develop a scoring system of risk factors for quantitatively assessing the factors as determinants of pregnancy outcomes. Once the scoring system is established, a technique for classifying patients into those with normal outcomes and those with adverse outcomes can easily be developed. The scoring system should meet the following four basic requirements: 1) easy to construct, 2) easy to use, 3) theoretically sound, and 4) valid. In searching for a feasible methodology that meets these requirements, the author has attempted to apply the “Likelihood Method”, one of the well-known principles in statistical analysis, to develop such a scoring system according to the following process. Step 1. Classify the patients into four groups: Group $A_1$: with adverse outcomes on the fetal (neonatal) side only; Group $A_2$: with adverse outcomes on the maternal side only; Group $A_3$: with adverse outcomes on both the maternal and fetal (neonatal) sides; Group B: with normal outcomes. Step 2. Construct the marginal tabulation of the distribution of risk factors for each group. Step 3. To calculate the risk score, take the logarithmic transformation of the relative proportions of the distribution and round them off to integers. Step 4. Test the validity of the score chart. A total of 2,282 maternity records registered during the period January 1, 1982 to December 31, 1982 at Ewha Womans University Hospital were used for this study, and the “Questionnaire for Maternity Record for Prenatal and Intrapartum High Risk Screening” developed by the Korean Institute for Population and Health was used to rearrange the information in the records into a form that is easy to analyze. The findings of the study are summarized as follows. 1) The risk score chart constructed on the basis of the “Likelihood Method” is presented in Table 4 in the main text. 2) From the analysis of the risk score chart, it was observed that a total of 24 risk factors could be identified as having significant predictive power for discriminating pregnancy outcomes into the four groups defined above. They are: (1) age (2) marital status (3) age at first pregnancy (4) medical insurance (5) number of pregnancies (6) history of Cesarean sections (7) number of living children (8) history of premature infants (9) history of overweight newborns (10) history of congenital anomalies (11) history of multiple pregnancies (12) history of abnormal presentation (13) history of obstetric abnormalities (14) past illness (15) hemoglobin level (16) blood pressure (17) heart status (18) general appearance (19) edema status (20) result of abdominal examination (21) cervix status (22) pelvis status (23) chief complaints (24) reasons for examination. 3) The validity of the score chart turned out to be as follows: a) Sensitivity: Group $A_1$: 0.75; Group $A_2$: 0.78; Group $A_3$: 0.92; all combined: 0.85. b) Specificity: 0.68. 4) The diagnosabilities of the “score chart” for a set of hypothetical prevalences of adverse outcomes were calculated as follows (the sensitivity “for all combined” was used). Hypothetical prevalence: 5%, 10%, 20%, 30%, 40%, 50%, 60%. Diagnosability: 12%, 23%, 40%, 53%, 64%, 75%, 80%.
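
The abstract does not define "diagnosability". The sketch below assumes it corresponds to the positive predictive value obtained from Bayes' rule, using the reported combined sensitivity (0.85) and specificity (0.68); the function name is hypothetical. Under that assumption the calculation reproduces most of the listed figures, although at 50% prevalence it gives about 73% rather than the 75% printed in the abstract.

```python
def positive_predictive_value(sensitivity, specificity, prevalence):
    """Share of positive classifications that are true adverse outcomes."""
    true_positive = sensitivity * prevalence
    false_positive = (1 - specificity) * (1 - prevalence)
    return true_positive / (true_positive + false_positive)


SENSITIVITY = 0.85  # "all combined" value reported in the abstract
SPECIFICITY = 0.68  # value reported in the abstract

# Hypothetical prevalences listed in the abstract.
for prevalence in (0.05, 0.10, 0.20, 0.30, 0.40, 0.50, 0.60):
    ppv = positive_predictive_value(SENSITIVITY, SPECIFICITY, prevalence)
    print(f"prevalence {prevalence:.0%}: PPV {ppv:.1%}")
```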
