• Title/Summary/Keyword: Original text


Study on Difference of Wordvectors Analysis Induced by Text Preprocessing for Deep Learning (딥러닝을 위한 텍스트 전처리에 따른 단어벡터 분석의 차이 연구)

  • Ko, Kwang-Ho
    • The Journal of the Convergence on Culture Technology / v.8 no.5 / pp.489-495 / 2022
  • The results of LSTM deep learning (D/L) for language model construction change with how the corpus is preprocessed. In this study, an LSTM model was trained on a corpus of famous literary poems (Ki Hyung-do's work). Once training completes, it yields two sets of word vectors, one for the original text and one for the text with word endings erased. The study inspects the results of similarity and analogy operations, the positions of the word vectors in a 2D plane, and the texts generated by the language models for the two corpora. The words suggested by the similarity and analogy operations differ between the corpora, but they are well related given the character of the corpus as a literary work. The positions of the word vectors also differ between the corpora, but the words retain their basic meanings; the generated texts differ as well, but they keep the flavor of the original style. The analysis results suggest that a D/L language model can be a useful tool for enjoying literature objectively and in diverse ways.
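The similarity and analogy operations mentioned in this abstract can be sketched as follows. This is a minimal illustration using cosine similarity and the usual vector-offset analogy; the toy vectors and the names `toy_vectors`, `most_similar`, and `analogy` are assumptions made for the example and do not come from the paper's trained models.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def most_similar(vectors, query, exclude=()):
    """Return the word whose vector is closest (by cosine) to `query`."""
    candidates = (w for w in vectors if w not in exclude)
    return max(candidates, key=lambda w: cosine(vectors[w], query))

def analogy(vectors, a, b, c):
    """Solve a : b :: c : ? using the vector offset b - a + c."""
    target = [vb - va + vc
              for va, vb, vc in zip(vectors[a], vectors[b], vectors[c])]
    return most_similar(vectors, target, exclude={a, b, c})

# Hypothetical 2-D toy vectors (assumption), not vectors from the study.
toy_vectors = {
    "night": [0.9, 0.1],
    "dark": [0.8, 0.2],
    "day": [0.1, 0.9],
    "bright": [0.2, 0.8],
    "rain": [0.5, 0.5],
}

nearest = most_similar(toy_vectors, toy_vectors["night"], exclude={"night"})
answer = analogy(toy_vectors, "night", "dark", "day")
```

Running the same two operations on vector sets trained from differently preprocessed corpora is one way to compare the suggested words, as the abstract describes.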

A study on Haengwuseook(杏雨書屋) Edition "Hwangjenaegyeong-Taeso(黃帝內經太素)"volume 21 and 27 (행우서옥본(杏雨書屋本) "황제내경태소(黃帝內經太素)" 권(卷)21, 권(卷)27의 출간(出刊) 의의(意義)와 그 내용에 대한 고찰(考察))

  • Kim, Jong-Hyun;Baik, You-Sang;Jang, Woo-Chang;Jeong, Chang-Hyun
    • Journal of Korean Medical classics / v.24 no.5 / pp.159-175 / 2011
  • "Hwangjenaegyeong-taeso(黃帝內經太素)" is a classic work of Yang Sang-seon(楊上善), which comprises original articles of "Hwangjenaegyeong(黃帝內經)" along with "Somun(素問)", "Yeongchu(靈樞)", and "gapeul(甲乙)", as one of the oldest annotated publications. Its significance therefore lies in being a valuable work for reconstructing the original text of "Hwangjenaegyeong(黃帝內經)" and comprehending its fundamental ideas. The only printed edition of "Hwangjenaegyeong-taeso(黃帝內經太素)" was photocopied in 1981, and is currently known as the 'Orient Edition'. While the 'Orient Edition' was used as the draft for the latest revised edition, volumes 21 and 27 were photocopied from a hand-copied edition, not the original. The original publications of the 'Orient Edition' have been held at 'Haengwuseook(杏雨書屋)' in Japan and were recently published. Hence, a comparative study between the two original volumes and the former ones was conducted. Although most of the differences were trivial, some may have led to distorted interpretation of the text. The errors of the former revised edition fall into a few specific categories, the most significant being errors made during the hand-copying procedure. There were also errors caused by the low resolution of the former draft, and simple errors made during publishing. This work presents examples of such cases and collects the results.

A Bibliographical Study on "Bonchogyeongjipju(本草經集注)" ("본초경집주(本草經集注)"에 대한 서지학적(書誌學的) 연구)

  • Kim, Yong-Joo;Baik, You-Sang;Jang, Wu-Chang;Jeong, Chang-Hyeon
    • Journal of Korean Medical classics / v.23 no.2 / pp.191-203 / 2010
  • "Bonchogyeongjipju(本草經集注)" is a pharmacological classic published in the Southern and Northern Dynasties(南北朝時代, 420-589 A.D.) in China by Dohonggyeong(陶弘景, 456-536 A.D.). In "Bonchogyeongjipju(本草經集注)", Dohonggyeong(陶弘景) edited "Sinnongbonchogyeong(神農本草經)", the earliest classical text on materia medica containing notes for 365 drugs, by adding another 365 drugs and further information from "Myeong-uibyeollok(名醫別錄)" and writing extended commentaries on them. His commentaries include changes in geographical distribution, identification of varieties, and various other special characteristics. The original text gradually disappeared after other pharmacological classics were published, such as "Sinsuboncho(新修本草)" in the Dang Dynasty(唐代) and "Gyeongsajeungryubigeupboncho(經史證類備急本草)" in the Song Dynasty(宋代). All of these books were based on "Bonchogyeongjipju(本草經集注)", so the original text can be seen indirectly through these later sources. In the early 1900s, a transcribed manuscript of the preface of "Bonchogyeongjipju(本草經集注)" was found almost wholly preserved, except for the first three lines, in the Makgo(莫高) cave of Donhwang(敦煌). Broken strips of a transcribed "Bonchogyeongjipju(本草經集注)" have also been excavated in Turfan[吐魯番], showing its original form written in red and black ink. Mayanagi Makoto[眞柳誠] researched the Donhwang(敦煌) and Turfan[吐魯番] editions, ascertained their existence, and explained their bibliographical and historical facts. Sangjigyun(尙志鈞) restored "Bonchogyeongjipju(本草經集注)" based on related sources such as the Donhwang(敦煌) and Turfan[吐魯番] editions. "Bonchogyeongjipju(本草經集注)" can be called the locus classicus(典範) of herbal medicine; that is, most of the subsequent materia medica were based on it. It made it possible to pass down "Sinnongbonchogyeong(神農本草經)" to posterity and provided a foundation for the development of herbal medicine.

Correction of Signboard Distortion by Vertical Stroke Estimation

  • Lim, Jun Sik;Na, In Seop;Kim, Soo Hyung
    • KSII Transactions on Internet and Information Systems (TIIS) / v.7 no.9 / pp.2312-2325 / 2013
  • In this paper, we propose a method that corrects the distortion of the text area in Korean signboard images as a preprocessing step to improve character recognition. Perspective distortion in Korean signboard text can cause a low recognition rate. The proposed method consists of four main steps and eight sub-steps; the main steps are potential vertical component detection, vertical component detection, text-boundary estimation, and distortion correction. First, potential vertical line component detection consists of four sub-steps: edge detection for each connected component, pixel distance normalization in the edge, dominant-point detection in the edge, and removal of horizontal components. Second, vertical line component detection is composed of removal of diagonal components and extraction of vertical line components. Third, the outline estimation step is composed of left and right boundary line detection. Finally, the distortion of the text image is corrected by a bilinear transformation based on the estimated outline. We compared OCR recognition rates before and after applying the proposed algorithm. The recognition rate of the distortion-corrected signboard images is 29.63% and 21.9% higher than that of the original images at the character and text levels, respectively.
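The final step of this abstract, correcting distortion with a bilinear transformation over an estimated outline, can be sketched as below. This is only an illustration under stated assumptions: the four outline corners are taken as already known, the image is a plain list of rows, sampling is nearest-neighbour, and the names `bilinear_point` and `rectify` are hypothetical rather than the authors' implementation.

```python
def bilinear_point(corners, u, v):
    """Map normalized (u, v) in [0, 1]^2 to a point inside a quadrilateral.

    corners = (top_left, top_right, bottom_right, bottom_left), each (x, y).
    """
    (x0, y0), (x1, y1), (x2, y2), (x3, y3) = corners
    # Interpolate along the top and bottom edges, then between them.
    top = (x0 + u * (x1 - x0), y0 + u * (y1 - y0))
    bottom = (x3 + u * (x2 - x3), y3 + u * (y2 - y3))
    return (top[0] + v * (bottom[0] - top[0]),
            top[1] + v * (bottom[1] - top[1]))

def rectify(image, corners, out_w, out_h):
    """Resample the quadrilateral region of `image` into an out_w x out_h grid."""
    height, width = len(image), len(image[0])
    out = []
    for j in range(out_h):
        v = j / (out_h - 1) if out_h > 1 else 0.0
        row = []
        for i in range(out_w):
            u = i / (out_w - 1) if out_w > 1 else 0.0
            x, y = bilinear_point(corners, u, v)
            # Nearest-neighbour sampling, clamped to the image bounds.
            sx = min(max(int(round(x)), 0), width - 1)
            sy = min(max(int(round(y)), 0), height - 1)
            row.append(image[sy][sx])
        out.append(row)
    return out

# Toy 3 x 4 "image" whose text region leans to the left going down.
image = [
    [0, 1, 2, 3],
    [10, 11, 12, 13],
    [20, 21, 22, 23],
]
outline = ((2, 0), (3, 0), (1, 2), (0, 2))  # assumed estimated corners
straightened = rectify(image, outline, out_w=2, out_h=3)
```

In practice the corners would come from the estimated left and right boundary lines, and a smoother interpolation than nearest-neighbour would normally be used.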

Skew Estimation and Correction in Text Images using Shape Moments (형태 모멘트를 이용한 텍스트 이미지 경사 측정 및 교정)

  • Choo, Moon-Won;Chin, Seong-Ah
    • The Journal of the Korea Contents Association / v.3 no.1 / pp.14-20 / 2003
  • In this paper, efficient skew estimation and correction approaches are proposed. To detect the skew of text images, a Hough transform using the perpendicular angle view property and shape moments are performed. The resulting primary text skew angle is used to align the original text. Performance evaluations of the proposed methods with respect to running time are shown.
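The moment-based part of such a skew estimate can be sketched as follows. The code uses the standard principal-axis formula from second-order central moments, theta = 0.5 * atan2(2 * mu11, mu20 - mu02); whether the paper uses exactly this formula is an assumption, the Hough transform stage is not shown, and the function names are hypothetical.

```python
import math

def skew_angle_degrees(pixels):
    """Estimate the orientation of foreground (x, y) pixels from central moments."""
    n = len(pixels)
    cx = sum(x for x, _ in pixels) / n
    cy = sum(y for _, y in pixels) / n
    mu20 = sum((x - cx) ** 2 for x, _ in pixels)
    mu02 = sum((y - cy) ** 2 for _, y in pixels)
    mu11 = sum((x - cx) * (y - cy) for x, y in pixels)
    # Angle of the principal axis relative to the x axis.
    return math.degrees(0.5 * math.atan2(2 * mu11, mu20 - mu02))

def deskew(pixels, angle_degrees):
    """Rotate the pixels about their centroid by -angle to undo the skew."""
    n = len(pixels)
    cx = sum(x for x, _ in pixels) / n
    cy = sum(y for _, y in pixels) / n
    phi = math.radians(-angle_degrees)
    cos_phi, sin_phi = math.cos(phi), math.sin(phi)
    return [
        (cx + (x - cx) * cos_phi - (y - cy) * sin_phi,
         cy + (x - cx) * sin_phi + (y - cy) * cos_phi)
        for x, y in pixels
    ]

# A toy "text line" lying along y = x, i.e. skewed by 45 degrees.
line = [(float(i), float(i)) for i in range(10)]
angle = skew_angle_degrees(line)
aligned = deskew(line, angle)
```

After `deskew`, re-running `skew_angle_degrees` on `aligned` should give an angle close to zero.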


A Symmetric Key Cryptography Algorithm by Using 3-Dimensional Matrix of Magic Squares

  • Lee, Sangho;Kim, Shiho;Jung, Kwangho
    • Proceedings of the Korea Information Processing Society Conference / 2013.11a / pp.768-770 / 2013
  • We propose a symmetric-key cryptography algorithm that encodes and decodes text data of limited length using a 3-dimensional magic square matrix. To encode a plaintext message, the input text is translated into the index of the corresponding number stored in the key matrix. A Caesar shift with a predefined constant value is then applied to complete the encryption. In the decoding process, the Caesar shift is applied first, and the generated key matrix is then used with 2D magic squares to replace the index numbers in the ciphertext and restore the original text.
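The general idea in this abstract (look up a character's number in a magic-square key, emit its index, then apply a Caesar shift) can be sketched as below. This is a simplified illustration, not the authors' algorithm: it uses a single 2D magic square built with the Siamese method instead of the 3-dimensional key matrix, and `ORDER`, `SHIFT`, and `FIRST` are hypothetical constants chosen for the example.

```python
def magic_square(n):
    """Build an odd-order magic square with the Siamese method."""
    if n % 2 == 0:
        raise ValueError("this sketch only handles odd orders")
    square = [[0] * n for _ in range(n)]
    row, col = 0, n // 2
    for value in range(1, n * n + 1):
        square[row][col] = value
        next_row, next_col = (row - 1) % n, (col + 1) % n
        if square[next_row][next_col]:
            # Cell taken: drop one row below the current cell instead.
            next_row, next_col = (row + 1) % n, col
        row, col = next_row, next_col
    return square

ORDER = 11   # assumption: 11 x 11 = 121 cells, enough for printable ASCII
SHIFT = 7    # assumption: the predefined Caesar constant
FIRST = 32   # code point of the first printable ASCII character

KEY = [value for row in magic_square(ORDER) for value in row]
INDEX_OF = {value: index for index, value in enumerate(KEY)}

def encrypt(plaintext):
    """Replace each character by the shifted index of its number in the key."""
    cipher = []
    for ch in plaintext:
        number = ord(ch) - FIRST + 1      # number stored somewhere in the key
        index = INDEX_OF[number]          # KeyError for unsupported characters
        cipher.append((index + SHIFT) % len(KEY))
    return cipher

def decrypt(cipher):
    """Undo the Caesar shift first, then read the number back from the key."""
    chars = []
    for c in cipher:
        index = (c - SHIFT) % len(KEY)
        chars.append(chr(KEY[index] - 1 + FIRST))
    return "".join(chars)

message = "Hello, world!"
restored = decrypt(encrypt(message))
```

Both sides must share the same square order and shift constant, which together play the role of the symmetric key in this sketch.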

A study on the xylographica of 「Classified Collection of Medical Prescriptions」 ("의방류취(醫方類聚)"에 대한 판본(版本) 연구)

  • Shin, Soon-Shik;Choi, Hwan-Soo
    • Korean Journal of Oriental Medicine / v.3 no.1 / pp.1-15 / 1997
  • 「Classified Collection of Medical Prescriptions」(1445) is a book that compiled the medical achievements of China and Choseon up to that time, and it is a source of pride for this country. It also deserves careful investigation because it can provide clues to the features of books now missing in China and Korea. The accuracy of the xylographica of old books determines the possibility of further in-depth study, so the authors investigated the xylographica of 「Classified Collection of Medical Prescriptions」, one of the 3 main books in Korea. The previous investigations by Miki Sakae and Kim Doo Jong are notable. On the basis of their respective works, we analyzed the 'Annals of the Choseon Dynasty' to find records related to 「Classified Collection of Medical Prescriptions」 and estimated the circumstances of its publication. We tried to work out the situation in China, Japan and Korea (including North Korea) at the time and to estimate the book's original xylographica as far as we could. By King Sejong's command, the first draft of 「Classified Collection of Medical Prescriptions」, consisting of 365 books, was made through the collaboration of civil officials and medical officers from 1443 to 1445. Then, from 1451 (first year of Moonjong's reign) to 1464 (10th year of Sejo's reign), a great deal of manpower was employed, and through countless rounds of erasure, proofreading, arrangement and rearrangement, the revised version of 「Classified Collection of Medical Prescriptions」, called the Sejo text, was completed. After 3 years of wood engraving work, the first printed form of 「Classified Collection of Medical Prescriptions」 (alternately called the Seongjong text), in folding cases and consisting of 266 chapters, 264 volumes, came into the world in 1477 (8th year of Seongjong's reign). This was 32 years after the initial completion of the edition. 「Classified Collection of Medical Prescriptions」 thus exists in three forms: the Sejong text, the Sejo text and the Seongjong text. Since those texts were plundered during the Japanese invasion of Korea in 1592, no original copy remains within Korea. The texts were moved repeatedly, to kadeungcheongieong, to Kongdeungpyeongio, Jesookoan of Edo, to East University of department of classic books, to Cheoncho archives, to the Imperial Museum, and are now kept in the royal palace (Doseoryo text, Eulhae printing type). Reduced-size republications of 「Classified Collection of Medical Prescriptions」 in wooden type were imported at the time of the 'Byeongja Korea-Japan Treaty in 1876'; of those 2 books, one copy was treasured in the Royal Household of the Yi Dynasty and then was lost during the Korean War circa 1950. The other copy has been handed down successively through Kojong's imperial grant, the royal doctor Hong Cheol Bo, Hong Taek Joo, and Hong Ik Pyo the book agent, and is now kept in the Yonsei University Library; it is the only existing copy in Korea at present. In 1965, Dongyang Medical college published a transcription version of 「Classified Collection of Medical Prescriptions」 consisting of 11 books, and in 1981, after editing and arrangement by Choonghoa(中華) publishing company, a photoprint copy of 「Classified Collection of Medical Prescriptions」 was published by Keumgang(金剛) publishing company. In October 1991, Yeokang(驛江) publishing company produced photocopies of 「Classified Collection of Medical Prescriptions」 which had previously been translated into Korean by the North Korea Institute of Oriental Medicine and then issued by a medical publishing company. In China, two institutes, Zhejiang Institute of Traditional Chinese Medicine and Huzhou Traditional Chinese Medical Hospital, cooperated to publish a revised and marked text consisting of 11 books by adding marking points to the Japanese Edohakhoondang text, which was used as a reference. Both the Korean and Chinese texts were based on the 「Classified Collection of Medical Prescriptions」 kept in the royal palace. Any further study of 「Classified Collection of Medical Prescriptions」 can attain accuracy and objectivity when the Japanese text kept in the royal palace is taken as the original copy.


Study of 'Ji-Qi-Shang-Chong' in Shang-han-lun's 15th Text (상한론(傷寒論) 15조(條)의 '기기상충(其氣上衝)'에 대한 고찰)

  • Lee, Seung-Jun;Kim, Yeong-Mok
    • Journal of Physiology & Pathology in Korean Medicine / v.25 no.6 / pp.961-967 / 2011
  • This study concerns 'Ji-Qi-Shang-Chong(其氣上衝)' in the 15th text of Shang-han-lun("傷寒論"). Shang-han-lun, written by Zhang-Zhong-Jing(張仲景), is a basic text on pathology in Traditional Korean Medicine. It contains many cases describing patients' symptoms, how to treat them, which herbal medicine to give, and the side effects of wrong treatments. Among those cases, the 15th text mentions a symptom called 'Ji-Qi-Shang-Chong(其氣上衝)' but gives no detailed description of it. This study therefore aims to determine the exact meaning of 'Ji-Qi-Shang-Chong(其氣上衝)' in the 15th text by comparing the views of historical medical practitioners and analyzing it in terms of bibliography, pathology, herbal pharmacology, herbal medicine, and pharmacology. The bibliographical analysis indicates that this sentence has been transmitted from the original Shang-han-lun written by Zhang-Zhong-Jing(張仲景). The former part of the sentence, "太陽病, 下之後, 其氣上衝者, 可與桂枝湯", corresponds most closely to Zhong-Jing(仲景)'s text, while the latter part may have been corrected later.

English Literature Education and the Status of Abridged and Abridged-Translation Editions (영문학교육과 축약.축역본의 위상)

  • Lee, Dong-Hwan
    • English Language & Literature Teaching / v.16 no.1 / pp.209-233 / 2009
  • Many difficult literary texts have been disregarded by teachers as well as students in the EFL context. The abridged version, however, has pedagogical value when viewed as an extension of the literary text, like movies and comic strips. Readable abridgments foster critical thinking among learners by increasing their involvement and encouraging them to respond more actively in each class. In addition, studying an abridged version accustoms future teachers to using it as teaching material. Abridgment is also effective in literary study, in two respects: reader-response criticism and narrative scholarship. First, the learners' creative engagement with the text encourages them to draw on their personal experiences in relation to the basic storyline. Second, a personal experience linked to the story relates to the narrative scholarship proposed in contemporary ecocriticism. Narrative scholarship is a new academic trend that merges the writer's personal experience of physical surroundings with a text that describes the same or a similar natural environment. The teacher's role is key to the success of abridged-version pedagogy. Teachers can facilitate a web of learner, text, and social context by providing a friendly atmosphere that encourages students' active participation, as well as supplementary materials on the original text.


Modern Methods of Text Analysis as an Effective Way to Combat Plagiarism

  • Myronenko, Serhii;Myronenko, Yelyzaveta
    • International Journal of Computer Science & Network Security / v.22 no.8 / pp.242-248 / 2022
  • The article analyzes modern methods for automatically comparing original and unoriginal text to detect textual plagiarism. The study covers two types of plagiarism: literal, in which plagiarists copy the text exactly without changing anything, and intelligent, which uses more sophisticated techniques that are harder to detect because the text is manipulated, for example by replacing words and signs. Standard techniques for extrinsic detection are string-based, vector space and semantic-based. First, the most common and most successful target models for detecting literal plagiarism, N-gram and Vector Space, are analyzed, and their advantages and disadvantages are evaluated. The most effective target models for detecting intelligent plagiarism, particularly for identifying paraphrases by measuring the semantic similarity of short components of the text, are then investigated. Models that use neural network architectures and are based on natural language sentence matching approaches, such as Densely Interactive Inference Network (DIIN), Bilateral Multi-Perspective Matching (BiMPM) and Bidirectional Encoder Representations from Transformers (BERT) and its family of models, are considered. The progress in improving plagiarism detection systems, techniques and related models is summarized. Relevant and urgent problems that remain unresolved in detecting intelligent plagiarism, namely effective recognition of unoriginal ideas and of well-paraphrased text, are outlined.
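The two literal-plagiarism models named in this abstract, N-gram and Vector Space, can be sketched as follows. This is a minimal illustration that assumes word trigrams compared with Jaccard overlap and raw term-frequency vectors compared with cosine similarity; the article does not specify these exact choices, and the function names are hypothetical.

```python
import math
import re
from collections import Counter

def tokens(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def ngrams(words, n=3):
    """Set of word n-grams in a token list."""
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}

def ngram_jaccard(a, b, n=3):
    """String-based score: Jaccard overlap of the two n-gram sets."""
    grams_a, grams_b = ngrams(tokens(a), n), ngrams(tokens(b), n)
    if not grams_a and not grams_b:
        return 0.0
    return len(grams_a & grams_b) / len(grams_a | grams_b)

def cosine_tf(a, b):
    """Vector-space score: cosine of raw term-frequency vectors."""
    count_a, count_b = Counter(tokens(a)), Counter(tokens(b))
    dot = sum(count_a[w] * count_b[w] for w in count_a.keys() & count_b.keys())
    norm = (math.sqrt(sum(c * c for c in count_a.values()))
            * math.sqrt(sum(c * c for c in count_b.values())))
    return dot / norm if norm else 0.0

source = "The quick brown fox jumps over the lazy dog"
literal_copy = "the quick brown fox jumps over the lazy dog."
paraphrase = "A fast brown fox leaps over a sleepy dog"

copy_score = ngram_jaccard(source, literal_copy)
paraphrase_ngram = ngram_jaccard(source, paraphrase)
paraphrase_cosine = cosine_tf(source, paraphrase)
```

On this toy input the literal copy scores 1.0, while the paraphrase shares no trigram with the source and gets only a modest cosine score, which illustrates why paraphrased text calls for the semantic models the abstract goes on to discuss.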