• Title/Summary/Keyword: Language study in Korea

A Convergence Analysis of the Ethnographic Method for Doctoral Dissertations in Korea : Focused on Research Participants, Data Collection Methods, and Trustworthiness Criteria (국내 박사학위 논문의 문화 기술적 연구방법에 대한 융복합적 분석 -연구 참여자, 자료 수집방법, 신뢰성 준거를 중심으로-)

  • Oh, Ho-young;Cho, Hong-Joong
    • Journal of the Korea Convergence Society / v.8 no.10 / pp.333-338 / 2017
  • Ethnography is concerned with the behaviors, beliefs, and learned patterns of language specific to a group, and it aims to describe and interpret them. It is therefore a classical form of qualitative research, developed by anthropologists who spent long periods conducting fieldwork within a cultural group. The results of analyzing the ethnographic research methods of doctoral dissertations in Korea are as follows. First, the number of research participants was 1-10 (32 dissertations, 44.4%), 11-20 (18, 25%), 21-30 (13, 18.1%), 31-40 (2, 2.7%), and others (7, 9.8%). Second, the data collection methods were in-depth interviews (71, 98.6%), participant observation (70, 97.2%), document data (38, 52.7%), engineering devices (12, 16.6%), and others (8, 11.1%). Data collection periods were 3-5 months (7 dissertations, 9.8%), 6-8 months (15, 20.8%), 9-11 months (14, 19.6%), 12-14 months (13, 18.1%), more than 15 months (17, 23.6%), and not reported (4, 5.4%). Third, the trustworthiness criteria were triangulation (46 dissertations, 63.9%), research participants' evaluation of study results (44, 61.1%), peer researchers' advice and comments (33, 45.8%), follow-up (25, 34.7%), use of references (20, 27.8%), reflexive subjectivity (17, 23.6%), intensive observation for a sufficient period (10, 13.9%), in-depth description (7, 9.8%), and others (7, 9.8%).

A Study on Conventional Expression of Hangul Ganchal and Email (조선시대 한글 간찰과 이메일의 상투적 표현 고찰)

  • Jeon, Byeong-yong
    • (The)Study of the Eastern Classic / no.49 / pp.431-459 / 2012
  • The purpose of this article is to compare and analyze the conventional expressions of Hangul Ganchal in the Joseon Dynasty and of email. Conventional expressions are used most notably in introductions and conclusions. In the introduction they are used for the address and the safety greetings, while in the conclusion they are used for the closing address and closing words. In the Joseon Dynasty, the envelope of a Ganchal included only the details of the receiver, because the letter was delivered in person by someone who knew both the receiver and the sender very well. The envelope of a Ganchal corresponds to the on-screen view used for email: in an email we see the name of the sender and the title of the text, and once we click the title we are able to view the text. The difference is that the Ganchal shows the receiver's details whereas the email shows the sender's details. For the address in a letter, the conventional expression uses "To~" as a humble term and "~께" as an honorific term. We confirmed that conventional expressions for seasonal greetings have not yet settled in either the Ganchal or email. The safety greetings consist of the latest news of both the sender and the receiver. In Ganchal this composition is expressed conventionally, whereas in emails only the receiver's latest news is written and the sender's latest news is rarely found in the text. In the closing section of Ganchal, the closing address and closing words were expressed conventionally; in email, however, these were again hard to find. To conclude, in Ganchal the conventional expression developed and took hold in the 16th century (Sun-eon), when there was a focus on the native language.
In the 17th century (Hyeon-eon) it stood still for some time, and it moved on to the 19th century (Jing-eon), when the strong influence of Hangul Ganchal resulted in a regression to conservative expression. In general, we are able to confirm that the conventional expression is slowly disappearing.

A Study on the Effect of Using Sentiment Lexicon in Opinion Classification (오피니언 분류의 감성사전 활용효과에 대한 연구)

  • Kim, Seungwoo;Kim, Namgyu
    • Journal of Intelligence and Information Systems / v.20 no.1 / pp.133-148 / 2014
  • Recently, with the advent of various information channels, the number of has continued to grow. The main cause of this phenomenon can be found in the significant increase of unstructured data, as the use of smart devices enables users to create data in the form of text, audio, images, and video. In various types of unstructured data, the user's opinion and a variety of information is clearly expressed in text data such as news, reports, papers, and various articles. Thus, active attempts have been made to create new value by analyzing these texts. The representative techniques used in text analysis are text mining and opinion mining. These share certain important characteristics; for example, they not only use text documents as input data, but also use many natural language processing techniques such as filtering and parsing. Therefore, opinion mining is usually recognized as a sub-concept of text mining, or, in many cases, the two terms are used interchangeably in the literature. Suppose that the purpose of a certain classification analysis is to predict a positive or negative opinion contained in some documents. If we focus on the classification process, the analysis can be regarded as a traditional text mining case. However, if we observe that the target of the analysis is a positive or negative opinion, the analysis can be regarded as a typical example of opinion mining. In other words, two methods (i.e., text mining and opinion mining) are available for opinion classification. Thus, in order to distinguish between the two, a precise definition of each method is needed. In this paper, we found that it is very difficult to distinguish between the two methods clearly with respect to the purpose of analysis and the type of results. We conclude that the most definitive criterion to distinguish text mining from opinion mining is whether an analysis utilizes any kind of sentiment lexicon. 
We first established two prediction models, one based on opinion mining and the other on text mining. Next, we compared the main processes used by the two prediction models. Finally, we compared their prediction accuracy. We then analyzed 2,000 movie reviews. The results revealed that the prediction model based on opinion mining showed higher average prediction accuracy compared to the text mining model. Moreover, in the lift chart generated by the opinion mining based model, the prediction accuracy for the documents with strong certainty was higher than that for the documents with weak certainty. Most of all, opinion mining has a meaningful advantage in that it can reduce learning time dramatically, because a sentiment lexicon generated once can be reused in a similar application domain. Additionally, the classification results can be clearly explained by using a sentiment lexicon. This study has two limitations. First, the results of the experiments cannot be generalized, mainly because the experiment is limited to a small number of movie reviews. Additionally, various parameters in the parsing and filtering steps of the text mining may have affected the accuracy of the prediction models. However, this research contributes a performance and comparison of text mining analysis and opinion mining analysis for opinion classification. In future research, a more precise evaluation of the two methods should be made through intensive experiments.
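
The abstract's criterion, that opinion mining is the approach that relies on a sentiment lexicon, can be illustrated with a minimal sketch. The lexicon entries, weights, and function names below are hypothetical and are not taken from the paper; the sketch only shows how a reusable lexicon yields both a label and an explainable score without any training step.

```python
import re

# Hypothetical toy lexicon: word -> polarity weight (not from the paper).
SENTIMENT_LEXICON = {
    "good": 1.0, "great": 2.0, "enjoyable": 1.5,
    "bad": -1.0, "boring": -1.5, "terrible": -2.0,
}

def lexicon_score(text, lexicon=SENTIMENT_LEXICON):
    """Sum the polarity weights of the lexicon words found in the text."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return sum(lexicon.get(tok, 0.0) for tok in tokens)

def classify_opinion(text, lexicon=SENTIMENT_LEXICON):
    """Label a review by the sign of its score.

    The absolute score is returned as a rough certainty, which could be
    used to rank documents from strong to weak certainty.
    """
    score = lexicon_score(text, lexicon)
    if score > 0:
        return "positive", abs(score)
    if score < 0:
        return "negative", abs(score)
    return "neutral", 0.0
```

Because the lexicon is plain data, it can be reused in a similar domain without retraining, and each decision can be traced back to the words that contributed to the score.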

A Study on Knowledge Entity Extraction Method for Individual Stocks Based on Neural Tensor Network (뉴럴 텐서 네트워크 기반 주식 개별종목 지식개체명 추출 방법에 관한 연구)

  • Yang, Yunseok;Lee, Hyun Jun;Oh, Kyong Joo
    • Journal of Intelligence and Information Systems / v.25 no.2 / pp.25-38 / 2019
  • Selecting high-quality information that meets the interests and needs of users from the overflowing content is becoming more important over time. In the flood of information, efforts are being made to better reflect the user's intention in search results, rather than treating an information request as a simple string. Large IT companies such as Google and Microsoft also focus on developing knowledge-based technologies, including search engines, that provide users with satisfaction and convenience. Finance in particular is a field where text data analysis is expected to be useful and to have potential, because it constantly generates new information, and the earlier the information is, the more valuable it is. Automatic knowledge extraction can be effective in areas such as the financial sector, where the information flow is vast and new information continues to emerge. However, automatic knowledge extraction faces several practical difficulties. First, it is difficult to build corpora from different fields with the same algorithm, and it is difficult to extract good-quality triples. Second, producing labeled text data by hand becomes more difficult as the extent and scope of knowledge increase and patterns are constantly updated. Third, performance evaluation is difficult due to the characteristics of unsupervised learning. Finally, defining the problem of automatic knowledge extraction is not easy because of the ambiguous conceptual characteristics of knowledge. To overcome the limits described above and improve the semantic performance of stock-related information search, this study attempts to extract knowledge entities using a neural tensor network and to evaluate its performance. Unlike other studies, the purpose of this study is to extract knowledge entities related to individual stock items. 
Various but relatively simple data processing methods are applied in the presented model to solve the problems of previous research and to enhance the effectiveness of the model. From these processes, this study has the following three points of significance. First, it presents a practical and simple automatic knowledge extraction method that can be applied. Second, the possibility of performance evaluation is presented through a simple problem definition. Finally, the expressiveness of the knowledge is increased by generating input data on a sentence basis without complex morphological analysis. The results of the empirical analysis and an objective performance evaluation method are also presented. For the empirical study to confirm the usefulness of the presented model, experts' reports on 30 individual stocks, the top 30 items by frequency of publication from May 30, 2017 to May 21, 2018, are used. The total number of reports is 5,600; 3,074 reports, about 55% of the total, are designated as the training set, and the remaining 45% are designated as the testing set. Before constructing the model, all reports in the training set are classified by stock, and their entities are extracted using the KKMA named entity recognition tool. For each stock, the top 100 entities by appearance frequency are selected and vectorized using one-hot encoding. After that, using a neural tensor network, the same number of score functions as stocks are trained. Thus, when a new entity from the testing set appears, we can calculate its score with every score function, and the stock whose function gives the highest score is predicted as the item related to the entity. To evaluate the presented model, we confirm its prediction power and determine whether the score functions are well constructed by calculating the hit ratio for all reports of the testing set. 
As a result of the empirical study, the presented model shows 69.3% hit accuracy for the testing set, which consists of 2,526 reports. This hit ratio is meaningfully high despite some constraints on conducting the research. Looking at the prediction performance of the model for each stock, only 3 stocks, LG ELECTRONICS, KiaMtr, and Mando, show far lower performance than average. This result may be due to interference with other similar items and the generation of new knowledge. In this paper, we propose a methodology to find the key entities, or combinations of them, that are necessary to search for related information in accordance with the user's investment intention. Graph data is generated using only the named entity recognition tool and applied to the neural tensor network without a learning corpus or word vectors for the field. From the empirical test, we confirm the effectiveness of the presented model as described above. However, some limits and points to complement remain. In particular, the fact that the model performs especially badly for only some stocks shows the need for further research. Finally, through the empirical study, we confirmed that the learning method presented in this study can be used to match new text information semantically with the related stocks.
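
The prediction step described in this abstract (score a new entity with one function per stock, pick the highest, then compute a hit ratio) can be sketched as follows. The abstract does not give the exact scoring formula or how inputs are paired, so the sketch assumes the commonly used bilinear-tensor form of a neural tensor network score; all function names, parameter shapes, and stock labels are illustrative assumptions, and nothing here is trained.

```python
import math

def ntn_score(e1, e2, W, V, b, u):
    """Neural-tensor-network style score for a pair of vectors.

    score = sum_k u[k] * tanh(e1^T W[k] e2 + V[k] . [e1; e2] + b[k])
    e1, e2: vectors of length d; W: k slices of d x d matrices;
    V: k rows of length 2d; b, u: vectors of length k.
    """
    concat = list(e1) + list(e2)
    total = 0.0
    for W_k, V_k, b_k, u_k in zip(W, V, b, u):
        bilinear = sum(e1[i] * W_k[i][j] * e2[j]
                       for i in range(len(e1)) for j in range(len(e2)))
        linear = sum(v * x for v, x in zip(V_k, concat))
        total += u_k * math.tanh(bilinear + linear + b_k)
    return total

def predict_stock(score_functions, entity_vec):
    """Return the stock whose score function rates the entity highest.

    score_functions: {stock_name: callable(entity_vec) -> float},
    one (hypothetical) function per stock.
    """
    return max(score_functions, key=lambda s: score_functions[s](entity_vec))

def hit_ratio(predictions, labels):
    """Fraction of predicted stocks that match the true stocks."""
    hits = sum(p == y for p, y in zip(predictions, labels))
    return hits / len(labels)
```

In use, each entry of `score_functions` would wrap `ntn_score` with that stock's own learned parameters, and `hit_ratio` would be computed over the predictions for the testing set.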

Effect of Mantidis Oötheca and Mori Fructus on Treatment of Osteoporosis in Ovariectomized Rats (상표소와 상심자가 난소적출로 유발된 흰쥐의 골다공증 치료효과에 미치는 영향)

  • Lee, Jae-Woo;Seo, Bu-Il;Park, Ji-Ha;Roh, Seong-Soo;Kim, Yong-Hyun;Kim, Mi-Ryeo
    • The Korea Journal of Herbology / v.24 no.1 / pp.59-71 / 2009
  • Objectives: The present study was undertaken to investigate the effects of Mantidis Oötheca and Mori Fructus on the treatment of osteoporosis in ovariectomized rats. Methods: In this experiment, the rats were ovariectomized and then administered Mantidis Oötheca and Mori Fructus. The levels of bone mineral density; osteocalcin, ALP, calcium and phosphorus in serum; calcium, phosphorus and deoxypyridinoline in urine; and calcium, phosphorus and ash weight in bone were measured. Results: 1. The levels of femoral and fibula-tibial bone mineral density were significantly increased in comparison with the OVX group at 4 and 8 weeks in the Mantidis Oötheca group, and at 8 weeks in the Mori Fructus group. 2. The levels of serum osteocalcin and ALP showed a significant decrease in comparison with the OVX group at 4 and 8 weeks in the Mantidis Oötheca and Mori Fructus groups. The levels of serum calcium showed a significant decrease in comparison with the OVX group at 4 weeks in the Mantidis Oötheca and Mori Fructus groups. The levels of serum phosphorus showed a significant decrease in comparison with the OVX group at 4 and 8 weeks in the Mantidis Oötheca and Mori Fructus groups. 3. The levels of urine calcium, phosphorus and deoxypyridinoline showed a significant decrease in comparison with the OVX group in the Mantidis Oötheca and Mori Fructus groups. 4. The levels of fibula-tibial calcium and phosphorus showed a significant increase in comparison with the OVX group in the Mantidis Oötheca group and the Mori Fructus group. The levels of femoral calcium and phosphorus showed a significant increase in comparison with the OVX group in the Mori Fructus group. The levels of femoral and fibula-tibial ash weight showed a significant increase in comparison with the OVX group in the Mantidis Oötheca group and the Mori Fructus group.
Conclusions: Reviewing these experimental results, it appears that Mantidis Oötheca and Mori Fructus were effective in the treatment of osteoporosis.

A Study on the Landscape Symbolism of Tongdo-palkyung and It's Narrative Structure (통도팔경(通度八景)의 경관상징성(景觀象徵性)과 서사구조(敍事構造))

  • Rho, Jae-Hyun
    • Journal of the Korean Institute of Traditional Landscape Architecture / v.28 no.1 / pp.27-37 / 2010
  • This study tries to illuminate the features and values of the Buddhist temple Palkyung by closely examining the forms, structures, and meanings of Tongdo-palkyung(通度八景) handed down at Tongdosa Temple, the best among Korea's Buddhist temples with its three treasures of Buddha, law of Buddha and Buddhist monks. The findings of this study can be summarized as the following. First of all, it reveals the meaning of the geographical name Yeongchuksan(靈鷲山), located to the west of Tongdosa, and a spectacular sight spread like an eagle's spread wings, as well as its location and spatial features. In particular, the arrangement features of a number of attached hermitages clearly show Yeongchuksan's world as being a temple with buddhist treasures. The multi-layered unfolding and centripetal intention of the scenery can be perceived through the shape of the Sshangryongnongju(雙龍弄珠形), around Tongdosa and the feature of the enclosed landscape encircling the steps of Hyeolcheo(穴處) Geumganggyedan. The substances and components of Tongdopalkyung include sound-based spectacles derived from Beoneumgu(梵音具) creating sounds related to religious rituals to enlighten and redeem mankind, such as Yeongji(影池: a holy pond with shadow reflections), drum sounds, and bell sounds along with physical features like pine trees, Dae(臺), waterfalls, Dongcheon (洞天), and a glow in the sky. On the other hand, Palkyung's geographical arrangements exhibit a circular spatial formation based on the main motif as Buddhist symbolism, beginning with the 'Gukjangsangseokpyo(國長生石標)' awakening the territoriality of Tongdosa and locating the first scene 'Mupunghansong(舞風寒松)' in its introductory area, with the features of water, bridge, pine grove, and Iljumun(gate) to stand for the influx. 
Six other scenes including 'Anyangdongdae(安養東臺)' are placed in the sacred precincts around Daeungjeon and Geumganggyedan while the glow of sunset at 'Danjoseong' just outside the domain closes the symbolic circular formation of the Tongdopalkyung, which coincides with the development of the Mandala figure symbolizing 'Gusanpalhae(九山八海)' centered in Sumisan(須彌山). What is more, Tongdopalkyung, while excluding primary scenic elements inside the temple, maximizes the domain of the mountain's entrance and the effects of the multi-layered mountain, mountain upon mountain, by intensifying the influx and centripetal qualities. The Tongdopalkyung analysis reveals the antithesis of four-coupled scenes conveying buddhist principles and thoughts on the basis of seasons, directions, space and time to display a narrative structural landscape when viewed from the temple's territoriality. Likewise, the characteristics and porch structures of Tongdopalkyung are tools and language of symbols to both externally strengthen the temple's territoriality and to internally, maximize the desires to the Land of Happiness as well as intensify religious wishes and the Mandala's multi-layered qualities through the meanings of time and space.

Cultural Diversity and Repression in Communities: A Study on China and Latin America (공동체에서의 문화 다양성과 억압 -중국과 라틴아메리카를 중심으로-)

  • Kim Dug-sam
    • Journal of the Daesoon Academy of Sciences / v.44 / pp.177-212 / 2023
  • This study discusses the suppression of cultural diversity in communities. First, based on the studies conducted so far and on recent changes, the oppression that exists between the Chinese government and ethnic minorities was considered. The visible suppression discussed was the expansion of Han Chinese Mandarin language education, sanctions on minority languages, and the expansion of higher education to the exclusion of minority identities. In terms of 'invisible' oppression, the items discussed were urbanization, urban development with modernization at the forefront, and the use of officials from minority ethnic groups educated by the central government. Next, the case of Latin America was examined, with particular attention to the theory of resistance against Europeans and European culture. Based on the worries and experiences of Latin American intellectuals who have undergone oppression as individuals from culturally diverse backgrounds, a mature theory was formulated that could be used to defend Chinese minorities in the future. The problem of Chinese minority communities has its own specificity. From a broader perspective, however, experience and self-critical exploration in Latin America serve as an opportunity to expand the specificity of Chinese minority communities. Their situation resembles earlier situations in Latin America, when native cultures were being eroded by Europe. Thus, as Latin American scholars argue, a shift in perception is necessary. In addition, it is likewise necessary to reflect on diversity, freedom, and mutual respect. There are proposals advocating the realization of Heyibutong (和而不同, harmony but not through sameness) based on the situation in China. In the course of this consideration, much thought was given to what the observed communities are like and what a hypothetically desirable community would be like. 
This extends not only to Chinese minority communities and native residents of Latin America, but also to Asians in the United States and foreigners in Korea. Through this, it is hoped that desirable communities characterized by cultural diversity can be skillfully pursued.

A Study on improvement of curriculum in Nursing (간호학 교과과정 개선을 위한 조사 연구)

  • 김애실
    • Journal of Korean Academy of Nursing / v.4 no.2 / pp.1-16 / 1974
  • This study involved the development of a survey form and the collection of data in an effort to provide information that can be used in the improvement of nursing curricula. The data examined were the kinds of courses currently being taught in the curricula of nursing education institutions throughout Korea, the credits required for course completion, and the year in which courses are taken. For the purposes of this study, curricula were classified into college, nursing school and vocational school categories. Courses were divided into the 3 major categories of general education courses, supporting science courses and professional education courses, and further subdivided as follows: 1) General education (following the classification of Philip H. Phenix): a) Symbolics, b) Empirics, c) Aesthetics, d) Synnoetics, e) Ethics, f) Synoptics. 2) Supporting science: a) physical science, b) biological science, c) social science, d) behavioral science, e) health science, f) education. 3) Professional education: a) basic courses, b) courses in each of the respective fields of nursing. Ⅰ. General education, aimed at developing the individual as a person and as a member of society, is relatively strong in college curricula compared with the other two. a) Courses included in the category of symbolics were Korean language, English, German, Chinese, Mathematics, Statistics, Economics and Computer. Most college curricula included 20 credits of courses in this sub-category, while nursing schools required 12 credits and vocational schools 10 units. English ordinarily receives particularly heavy emphasis. b) Research methodology, Domestic affair and women & courtney were included under the category of empirics in the college curricula; nursing and vocational schools do not offer these at all. c) Courses classified under aesthetics were physical education, drill, music, recreation and fine arts. 
Most college curricula had 4 credits in these areas, nursing schools provided for 2 credits, and most vocational schools offered 10 units. d) Synnoetics included leadership, interpersonal relationships, and communications. Most schools did not offer courses of this nature. e) The category of ethics included citizenship. 2 credits are provided in college curricula, while vocational schools require 4 units. Nursing schools do not offer these courses. f) Courses included under synoptics were Korean history, cultural history, philosophy, logic, and religion. Most college curricula provided 5 credits in these areas, nursing schools 4 credits, and vocational schools 2 units. g) Only physical education was given every year in college curricula, and only English was given in every year of the curriculum in nursing schools and vocational schools. Most of the other courses were given during the first year of the curriculum. Ⅱ. Supporting science courses are fundamental to the practice and application of nursing theory. a) Physical science courses include physics, chemistry and natural science. Most colleges and nursing schools provided for 2 credits of physical science courses in their curricula, while most vocational schools did not offer them. b) Courses included under biological science were anatomy, physiology, biology and biochemistry. Most college curricula provided for 15 credits of biological science, nursing schools for the most part provided for 11 credits, and most vocational schools provided for 8 units. c) Courses included under social science were sociology and anthropology. Most colleges provided for 1 credit in courses of this category, while most nursing schools provided for 2 credits. Most vocational schools did not provide courses of this type. d) Courses included under behavioral science were general and clinical psychology, developmental psychology, mental hygiene and guidance. Most schools did not provide for these courses. 
e) Courses included under health science were pharmacy and pharmacology, microbiology, pathology, nutrition and dietetics, parasitology, and Chinese medicine. Most college curricula provided for 11 credits, most nursing schools provided for 12 credits, and vocational schools for the most part provided 20 units of medical courses. f) Courses included under education were educational psychology, principles of education, philosophy of education, history of education, social education, educational evaluation, educational curricula, class management, guidance techniques and school & community. Most colleges offer 3 credits in courses in this category, while nursing schools provide 8 credits and vocational schools provide for 6 units. 50% of the colleges prepare their students to qualify as regular teachers of the second level, while 91% of the nursing schools and 60% of the vocational schools prepare their students to qualify as school nurses. g) The majority of colleges start supporting science courses in the first year and complete them by the second year. Nursing schools and vocational schools usually complete them in the first year. Ⅲ. Professional education courses are designed to develop professional nursing knowledge, attitudes and skills in the students. a) Basic courses include social nursing, nursing ethics, history of nursing professional control, nursing administration, social medicine, social welfare, introductory nursing, advanced nursing, medical regulations, efficient nursing, nursing English and basic nursing. College curricula devoted 13 credits to these subjects, nursing schools 14 credits, and vocational schools 26 units, indicating a marked difference in the scope of education provided. b) There was a noticeable tendency for the colleges to take a unified approach to the branches of nursing. 
60% of the schools had courses in public health nursing, 80% in pediatric nursing, 60% in obstetric nursing, 90% in psychiatric nursing and 80% in medical-surgical nursing. The greatest number of schools provided 48 credits in all of these fields combined. In most of the nursing schools, 52 credits were provided for courses divided according to disease. In the vocational schools, unified courses are provided in public health nursing, child nursing, maternal nursing, psychiatric nursing and adult nursing. In addition, one unit is provided for one hour a week of practice. The total number of units provided in the greatest number of vocational schools is thus Ⅲ units, double the number provided in nursing schools and colleges. c) In the colleges, the second year is devoted mainly to basic nursing courses, while the third and fourth years are used for advanced nursing courses. In nursing schools and vocational schools, the first year deals primarily with basic nursing and the second and third years are used to cover advanced nursing courses. The study yielded the following conclusions. 1. Instructional goals should be established for each course in line with the idea of nursing, and curriculum improvements should be made accordingly. 2. Courses that fall under the synnoetics category should be strengthened, and ways should be sought to develop the ability to cooperate with those who work for human welfare and health. 3. The ability to solve problems on the basis of scientific principles, and knowledge and understanding of man and society, should be fostered through a strengthening of courses dealing with physical sciences, social sciences and behavioral sciences and a redistribution of courses emphasizing biological and health sciences. 4. 
There should be more balanced curricula with less emphasis on courses in the major. There is a need to establish courses necessary for the individual nurse by doing away with courses centered around specific diseases and combining them in unified courses. In addition, it is possible to develop skill in dealing with people by using the social setting in comprehensive training. The most efficient ratio of study experiences should be studied to provide more effective, interesting education. Elective courses should be initiated to ensure a more flexible, responsive educational program. 5. The curriculum stipulated in the education law should be examined.

A Study on Speech Recognition Using the HM-Net Topology Design Algorithm Based on Decision Tree State-clustering (결정트리 상태 클러스터링에 의한 HM-Net 구조결정 알고리즘을 이용한 음성인식에 관한 연구)

  • 정현열;정호열;오세진;황철준;김범국
    • The Journal of the Acoustical Society of Korea / v.21 no.2 / pp.199-210 / 2002
  • This paper studies speech recognition using the HM-Net topology design algorithm based on decision tree state-clustering, with the aim of improving the performance of acoustic models. Korean has many allophonic and grammatical rules compared to other languages, so we investigate the allophonic variations that characterize Korean phonetics and construct a phoneme question set for the phonetic decision tree. The HM-Net topology design algorithm keeps the basic structure of the SSS (Successive State Splitting) algorithm and splits again the states of pre-constructed context-dependent acoustic models. That is, it generates a phonetic decision tree for each model state using the phoneme question set, and iteratively trains the state sequences of the context-dependent acoustic models using the PDT-SSS (Phonetic Decision Tree-based SSS) algorithm. To verify the effectiveness of the algorithm, we carried out speech recognition experiments on 452 words from the Center for Korean Language Engineering (KLE452) and 200 sentences from an air flight reservation task (YNU200). The results show that recognition accuracy improved progressively as the number of states changed after state splitting in the phoneme, word, and continuous speech recognition experiments. We obtained average phoneme and word recognition accuracies of 71.5% and 99.2%, respectively, with 2,000 states, and an average continuous speech recognition accuracy of 91.6% with 800 states. We also carried out word recognition experiments using HTK (HMM Toolkit), which performs state tying, for comparison with the parameter sharing of the HM-Net topology design algorithm. In these word recognition experiments, the HM-Net topology design algorithm achieved on average 4.0% higher recognition accuracy than the context-dependent acoustic models generated by HTK, which indicates its effectiveness.
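The question-based state splitting described in this abstract can be illustrated with a toy sketch. It assumes a likelihood-gain criterion with single Gaussians, which is a common choice for phonetic decision trees but is not confirmed by the abstract as the authors' criterion. The one-dimensional features, the question names (`L_Nasal`, `L_Labial`), and the function names are all hypothetical.

```python
import math

def log_likelihood(values):
    """Log-likelihood of 1-D samples under one maximum-likelihood Gaussian."""
    n = len(values)
    mean = sum(values) / n
    # Variance floor avoids log(0) when all samples are identical.
    var = max(sum((v - mean) ** 2 for v in values) / n, 1e-6)
    return -0.5 * n * (math.log(2 * math.pi * var) + 1)

def best_question(samples, questions):
    """Pick the phonetic question whose yes/no split gains the most likelihood.

    samples:   list of (left_context_phoneme, feature_value) pairs
    questions: dict mapping a question name to a set of phonemes
    Returns (question_name, gain); (None, 0.0) if no split helps.
    """
    base = log_likelihood([v for _, v in samples])
    best = (None, 0.0)
    for name, phones in questions.items():
        yes = [v for p, v in samples if p in phones]
        no = [v for p, v in samples if p not in phones]
        if not yes or not no:
            continue  # the question does not actually split this state
        gain = log_likelihood(yes) + log_likelihood(no) - base
        if gain > best[1]:
            best = (name, gain)
    return best

# Toy data: features after nasal contexts cluster near 1, after stops near 5.
samples = [("m", 1.0), ("n", 1.2), ("p", 5.0), ("t", 5.2)]
questions = {"L_Nasal": {"m", "n"}, "L_Labial": {"m", "p"}}
name, gain = best_question(samples, questions)
print(name, round(gain, 2))
```

In this toy data the nasal question separates the two clusters cleanly, so it is selected over the labial question. A real system would work with multi-dimensional acoustic features and apply the split recursively to each resulting state.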

Twitter Issue Tracking System by Topic Modeling Techniques (토픽 모델링을 이용한 트위터 이슈 트래킹 시스템)

  • Bae, Jung-Hwan;Han, Nam-Gi;Song, Min
    • Journal of Intelligence and Information Systems
    • /
    • v.20 no.2
    • /
    • pp.109-122
    • /
    • 2014
  • People now create a tremendous amount of data on Social Network Services (SNS). In particular, the incorporation of SNS into mobile devices has resulted in massive data generation, thereby greatly influencing society. This phenomenon is unmatched in history, and we now live in the Age of Big Data. SNS data qualifies as Big Data because it satisfies the conditions of data amount (volume), data input and output speed (velocity), and variety of data types (variety). Because this information covers the whole of society, discovering the trend of an issue in SNS Big Data can serve as an important new source of value creation. In this study, a Twitter Issue Tracking System (TITS) is designed and built to meet the need for analyzing SNS Big Data. TITS extracts issues from Twitter texts and visualizes them on the web. The proposed system provides the following four functions: (1) provide the topic keyword set that corresponds to the daily ranking; (2) visualize the daily time-series graph of a topic over the duration of a month; (3) provide the importance of a topic through a treemap based on the score system and frequency; (4) visualize the daily time-series graph of a keyword found by keyword search. The present study analyzes the Big Data generated by SNS in real time. SNS Big Data analysis requires various natural language processing techniques, including stop-word removal and noun extraction, to process various unrefined forms of unstructured data. In addition, such analysis requires the latest big data technology to rapidly process a large amount of real-time data, such as the Hadoop distributed system or NoSQL, which is an alternative to relational databases. We built TITS on Hadoop to optimize the processing of big data because Hadoop is designed to scale up from single-node computing to thousands of machines. Furthermore, we use MongoDB, which is classified as a NoSQL database. MongoDB is an open-source, document-oriented database that provides high performance, high availability, and automatic scaling. Unlike existing relational databases, MongoDB has no schema or tables, and its most important goals are data accessibility and data processing performance. In the Age of Big Data, visualization is attractive to the Big Data community because it helps analysts examine such data easily and clearly. Therefore, TITS uses the d3.js library as a visualization tool. This library is designed for creating Data-Driven Documents that bind the document object model (DOM) to arbitrary data; interaction with the data is easy, and the library is useful for managing real-time data streams with smooth animation. In addition, TITS uses Bootstrap, which consists of pre-configured style sheets and JavaScript plug-in libraries, to build the web system. The TITS Graphical User Interface (GUI) is designed using these libraries and can detect issues on Twitter in an easy and intuitive manner. The proposed work demonstrates the superiority of our issue detection techniques by matching detected issues with corresponding online news articles. The contributions of the present study are threefold. First, we suggest an alternative approach to real-time big data analysis, which has become an extremely important issue. Second, we apply a topic modeling technique that is used in various research areas, including Library and Information Science (LIS). Based on this, we can confirm the utility of storytelling and time series analysis. Third, we develop a web-based system and make it available for the real-time discovery of topics. The present study conducted experiments with nearly 150 million tweets in Korea during March 2013.
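The first function listed in this abstract (a keyword set per daily ranking, after stop-word removal) can be approximated with a minimal sketch. It swaps in plain frequency counting for the topic modeling the authors used and whitespace tokenization for noun extraction. The stop-word list, the function name `daily_keyword_ranking`, and the sample tweets are hypothetical.

```python
from collections import Counter, defaultdict

# Hypothetical stop-word list; a real system would use a language-specific one.
STOP_WORDS = {"the", "a", "is", "in", "of", "and", "to"}

def daily_keyword_ranking(tweets, top_n=3):
    """Rank keywords per day by raw frequency after stop-word removal.

    tweets: iterable of (date_string, text) pairs
    Returns {date: [(keyword, count), ...]} with at most top_n entries per day.
    """
    counts = defaultdict(Counter)
    for date, text in tweets:
        # Naive whitespace tokenization with punctuation stripped at the edges.
        tokens = [t.strip(".,!?#@").lower() for t in text.split()]
        counts[date].update(t for t in tokens if t and t not in STOP_WORDS)
    return {date: c.most_common(top_n) for date, c in counts.items()}

sample = [
    ("2013-03-01", "The election debate is trending"),
    ("2013-03-01", "Election results and the debate"),
    ("2013-03-02", "Heavy snow in Seoul"),
]
ranking = daily_keyword_ranking(sample, top_n=2)
for date, keywords in sorted(ranking.items()):
    print(date, keywords)
```

The per-day counts produced this way are also the raw material for a daily time-series graph of a keyword, as in function (4) of the abstract.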