• Title/Summary/Keyword: Text analysis


Color Recommendation for Text Based on Colors Associated with Words

  • Liba, Saki; Nakamura, Tetsuaki; Sakamoto, Maki
    • Journal of Korea Society of Industrial Information Systems / v.17 no.1 / pp.21-29 / 2012
  • In this paper, we propose a new method for selecting colors that represent the meaning of text content, based on the cognitive relation between words and colors. Our method builds on a previous study that revealed the existence of words crucial for estimating the colors associated with the meaning of text content. Using the associative probability of each color with a given word and the strength of the word's color association, we estimate the probability of colors associated with a given text. The goal of this study is to propose a system that recommends cognitively plausible colors for the meaning of the input text. To build a versatile and efficient database for our system, we conducted two psychological experiments using news site articles. In experiment 1, we collected 498 words that participants chose as having a strong association with color. In experiment 2, we investigated which color was associated with each word. In addition to these data, we employed estimated values of the strength of color association and of the associated colors for the words in a very large newspaper corpus (approximately 130,000 words), based on the similarity between words obtained by Latent Semantic Analysis (LSA). Our method can therefore select colors for a large variety of words and sentences. Finally, by comparing the colors answered by participants with the colors estimated by our method, we verified that our system succeeded in proposing colors associated with the meaning of the input text. Our system is expected to be useful in a variety of situations, such as data visualization, information retrieval, and art or web page design.
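
The abstract above does not give the formula that combines per-word color probabilities with association strength. One plausible reading, shown here only as a minimal sketch, is a strength-weighted average of the per-word distributions. The function name, the two lookup tables, and the toy values are all hypothetical and are not taken from the paper.

```python
from collections import defaultdict

def estimate_text_colors(words, color_probs, strength):
    """Combine per-word color distributions into one distribution for a text.

    color_probs[word] maps color -> P(color | word); strength[word] is the
    word's strength of color association. Both tables are hypothetical inputs,
    and the weighted average is an assumed combination rule.
    """
    scores = defaultdict(float)
    for word in words:
        if word not in color_probs:
            continue  # words without color data contribute nothing
        weight = strength.get(word, 0.0)
        for color, p in color_probs[word].items():
            scores[color] += weight * p
    total = sum(scores.values())
    if total == 0:
        return {}
    # Normalize so the color scores sum to 1
    return {color: s / total for color, s in scores.items()}

# Toy data, invented for illustration
color_probs = {
    "sea": {"blue": 0.8, "green": 0.2},
    "fire": {"red": 0.9, "orange": 0.1},
}
strength = {"sea": 0.5, "fire": 1.0}
result = estimate_text_colors(["the", "sea", "fire"], color_probs, strength)
```

Weighting by strength lets strongly color-associated words dominate the estimate, which matches the abstract's emphasis on "crucial words".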

Improved Text Recognition using Analysis of Illumination Component in Color Images (컬러 영상의 조명성분 분석을 통한 문자인식 성능 향상)

  • Choi, Mi-Young; Kim, Gye-Young; Choi, Hyung-Il
    • Journal of the Korea Society of Computer and Information / v.12 no.3 / pp.131-136 / 2007
  • This paper proposes a new approach that eliminates the reflectance component for the detection of text in color images. Color images produced by color printing technology normally have an illumination component as well as a reflectance component. It is well known that the reflectance component usually obstructs the detection and recognition of objects such as text in a scene, since it blurs the overall image. We have developed an approach that efficiently removes reflectance components while preserving illumination components. To determine the lighting environment, we classify an input image as Normal or Polarized using the histogram of its red component. Removing the reflection component caused by illumination changes reduces the light-induced blurring of text, which allows the text to be extracted. The experimental results show superior performance even when an image has a complex background. Text detection and recognition performance is influenced by changes in illumination conditions, and our method is robust to images with different illumination conditions.

Text Watermarking Based on Syntactic Constituent Movement (구문요소의 전치에 기반한 문서 워터마킹)

  • Kim, Mi-Young
    • The KIPS Transactions:PartB / v.16B no.1 / pp.79-84 / 2009
  • This paper explores a method of text watermarking for agglutinative languages and develops a syntactic constituent movement scheme based on syntactic trees. Agglutinative languages provide good ground for syntactic tree-based natural language watermarking because their syntactic constituent order is relatively free. Our proposed natural language watermarking method consists of seven procedures. First, we construct a syntactic dependency tree of the unmarked text. Second, we perform clausal segmentation on the syntactic tree. Third, we choose target syntactic constituents, which will move within their clauses. Fourth, we determine the movement direction of the target constituents. Fifth, we embed a watermark bit for each target constituent. Sixth, if the watermark bit does not coincide with the direction of the target constituent movement, we displace the target constituent in the syntactic tree. Finally, from the modified syntactic tree, we obtain the marked text. The experimental results show that the coverage of our method is 91.53% and the rate of unnatural sentences in the marked text is 23.16%, which is better than that of previous systems. They also show that the marked text keeps the same style and carries the same information without semantic distortion.
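
The embedding idea in the fifth and sixth procedures can be illustrated with a toy sketch. It assumes a clause is already segmented into a flat list of constituents and that the bit is encoded by whether a target constituent precedes a chosen anchor constituent. That encoding, the function names, and the example clause are hypothetical simplifications; the paper itself works on syntactic dependency trees.

```python
def read_bit(clause, target, anchor):
    """Return 0 if target precedes anchor in the clause, 1 otherwise."""
    return 0 if clause.index(target) < clause.index(anchor) else 1

def embed_bit(clause, target, anchor, bit):
    """Return a copy of clause whose target/anchor order encodes bit.

    The constituents are swapped only when the current order does not
    already match the watermark bit.
    """
    marked = list(clause)
    if read_bit(marked, target, anchor) != bit:
        i, j = marked.index(target), marked.index(anchor)
        marked[i], marked[j] = marked[j], marked[i]
    return marked

# Invented example clause with freely ordered adverbial constituents
clause = ["yesterday", "in-the-park", "Mina", "met-a-friend"]
marked = embed_bit(clause, "yesterday", "in-the-park", 1)
unchanged = embed_bit(clause, "yesterday", "in-the-park", 0)
```

A detector that knows the target and anchor can recover the bit by calling `read_bit` on the marked clause.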

Analysis of Smart Factory Research Trends Based on Big Data Analysis (빅데이터 분석을 활용한 스마트팩토리 연구 동향 분석)

  • Lee, Eun-Ji; Cho, Chul-Ho
    • Journal of Korean Society for Quality Management / v.49 no.4 / pp.551-567 / 2021
  • Purpose: The purpose of this paper is to present implications by analyzing research trends on smart factories through text analysis and visual analysis (comprehensive, by field, and by year), which are big data analyses, using data collected from previous studies on smart factories. Methods: To collect the analysis data, deep learning was used in the integrated search on the Academic Research Information Service (www.riss.kr) with "SMART FACTORY" and "Smart Factory" as search terms, and the titles and Korean abstracts of the extracted papers were scraped and organized in Excel. In the final step, the 739 resulting papers were analyzed with R x64 4.0.2 and RStudio using text mining, one of the big data analysis techniques, and a word cloud for visualization. Results: Smart factory research slowed down from 2005 to 2014 and then increased rapidly until 2019. By field, smart factories were studied most in engineering, followed by social science and complex science. Early smart factory research was concentrated in 'engineering' and later expanded to 'social science'. In particular, since 2015 it has been studied in various disciplines such as 'complex studies'. Overall, in the keyword analysis, keywords such as 'technology', 'data', and 'analysis' were the most likely to appear, with some differences by field and year. Conclusion: Government and expert support for smart factories should be promoted, and research on technology-based strategies is needed. In the future, various approaches to smart factories are necessary. If research takes the environment or energy into consideration, it is judged that greater implications can be presented.
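
The keyword frequency step behind a word cloud can be sketched as follows. The study used R and RStudio; this is an illustrative Python equivalent, and the stopword list, tokenizer, and sample titles are assumptions made up for the example.

```python
import re
from collections import Counter

# Hypothetical stopword list; a real analysis would use a fuller one
STOPWORDS = {"a", "an", "the", "of", "for", "and", "in", "on", "to", "based"}

def keyword_frequencies(documents, top_n=2):
    """Count non-stopword tokens across documents and return the top_n."""
    counts = Counter()
    for doc in documents:
        tokens = re.findall(r"[a-z]+", doc.lower())
        counts.update(t for t in tokens if t not in STOPWORDS)
    return counts.most_common(top_n)

# Invented titles standing in for the scraped titles and abstracts
docs = [
    "Data analysis technology for the smart factory",
    "Smart factory data platform",
    "Analysis of factory data",
]
top = keyword_frequencies(docs)
```

The resulting (word, count) pairs are the usual input to a word cloud renderer.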

A Study on the Perception of Fashion Platforms and Fashion Smart Factories using Big Data Analysis (빅데이터 분석을 이용한 패션 플랫폼과 패션 스마트 팩토리에 대한 인식 연구)

  • Song, Eun-young
    • Fashion & Textile Research Journal / v.23 no.6 / pp.799-809 / 2021
  • This study aimed to grasp perceptions of and trends in fashion platforms and fashion smart factories using big data analysis. As the research method, the concepts of big data analysis, fashion platforms, and smart factories were established from the literature and prior studies, and text mining analysis and network analysis were performed on text collected from the web between April 2019 and April 2021. After data cleaning with Textom, the words for fashion platform (10,591 pieces) and fashion smart factory (9,750 pieces) were used for analysis. Keywords were derived, their frequency of appearance was calculated, and the results were visualized as word clouds and N-grams. The top 70 words by frequency were used to generate a matrix, structural equivalence analysis was performed, and the results were displayed using network visualization and dendrograms. The collected data revealed that smart factories were a prominent social issue but that consumer interest and academic research were insufficient, whereas both the amount and frequency of words related to fashion platforms were high. The structural equivalence analysis found that fashion platforms, which show strong connectivity between clusters, are creating new competitiveness through service platforms that add sharing, manufacturing, and curation functions, and that fashion smart factories can be expected to grow in future value together with digital technology innovation and platforms. This study can serve as a foundation for future research on fashion platforms and smart factories.
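
The N-gram counting and the word-by-word matrix mentioned above can be sketched in a few lines. This assumes documents are already tokenized and that the matrix records document-level co-occurrence among the most frequent words; the study used the top 70 words and Textom, while the function names and the tiny `top_k=2` example here are hypothetical.

```python
from collections import Counter
from itertools import combinations

def bigram_counts(token_lists):
    """Count adjacent word pairs (2-grams) across tokenized documents."""
    counts = Counter()
    for tokens in token_lists:
        counts.update(zip(tokens, tokens[1:]))
    return counts

def cooccurrence_matrix(token_lists, top_k):
    """Build a symmetric co-occurrence matrix over the top_k frequent words."""
    freq = Counter(t for tokens in token_lists for t in tokens)
    vocab = [w for w, _ in freq.most_common(top_k)]
    index = {w: i for i, w in enumerate(vocab)}
    matrix = [[0] * len(vocab) for _ in vocab]
    for tokens in token_lists:
        # Each pair of vocabulary words counts once per document
        present = sorted({t for t in tokens if t in index})
        for a, b in combinations(present, 2):
            matrix[index[a]][index[b]] += 1
            matrix[index[b]][index[a]] += 1
    return vocab, matrix

# Invented tokenized documents
docs = [
    ["fashion", "platform", "service"],
    ["fashion", "platform", "curation"],
    ["smart", "factory", "fashion"],
]
bigrams = bigram_counts(docs)
vocab, matrix = cooccurrence_matrix(docs, top_k=2)
```

A matrix of this shape is the kind of input that structural equivalence methods and dendrogram clustering operate on.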

A Text Mining Analysis on Students' Perceptions about Capstone Design: Case of Industrial & Management Engineering (텍스트 마이닝을 활용한 캡스톤 디자인에 관한 학생 인식 탐색: 산업경영공학 사례)

  • Wi, Gwang-Ho; Kim, Yun-jin; Kim, Moon-Soo
    • Journal of Engineering Education Research / v.25 no.5 / pp.85-93 / 2022
  • Capstone Design, a project-based learning technique, is the most important course for consolidating major knowledge and cultivating the ability to apply it, through a process in which student project teams solve problems from the industrial field. Accordingly, various and extensive studies are being conducted on the successful implementation of capstone design courses. Unlike previous studies, this study aimed to quantitatively analyze written opinions recording the experiences and feelings of students who completed capstone design, using text mining methodologies such as frequency analysis, correlation analysis, topic modeling, and sentiment analysis. Examining the overall opinions of the latter period through frequency analysis and correlation analysis revealed differences in the language students used according to gender and project results. In the topic modeling analysis, 'topic selection' and 'the relationship between team members' showed increasing or high occupancy, while topics such as 'presentation', 'leadership', and 'feeling what they felt' tended toward decreasing occupancy. Lastly, sentiment analysis found that female students showed more neutral emotions than male students, and that the passed group showed more negative emotions and fewer neutral emotions than the non-passed group. Based on these findings, students' practical perception of the curriculum was considered, and implications for improving capstone design were presented.

A Big Data Analysis of Public Interest in Defense Reform 2.0 and Suggestions for Policy Completion

  • Kim, Tae Kyoung; Kang, Wonseok
    • Journal of East Asia Management / v.4 no.1 / pp.1-22 / 2023
  • This study conducted a big data analysis through text mining and semantic network analysis to explore perceptions of Defense Reform 2.0. The collected data were analyzed using the top 70 keywords as an appropriate range for network visualization. Through word frequency analysis, connection centrality analysis, and N-gram analysis, we identified issues that received much attention, such as troop reduction, the shortening of the military service period, the dismantling of border area units, and the return of wartime operational control. In particular, clustering words through CONCOR analysis showed great interest in pursuing the technical group, concerns about military capacity reduction, and the reorganization of the manpower structure. The results of the text mining analysis are as follows. First, while troop reduction in Defense Reform 2.0 received much attention, there was a lack of awareness of measures to reinforce the reduced troops. Second, active communication with local communities is necessary because of the effects of dismantling and relocating border area units, such as regional population decline and the collapse of local commercial areas. Third, barracks culture and the defense industry drew less public interest than military structure and defense operations, so it was judged necessary to show substantial results in these areas by promoting them and introducing active policies. Through this study, we analyzed the public's interest in Defense Reform 2.0, a representative defense policy, and suggested a plan to draw support for national policy.
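
"Connection centrality" in the abstract most likely corresponds to what is usually called degree centrality, although the abstract does not define it. Under that assumption, a minimal sketch over a keyword co-occurrence network looks like this; the edge list is invented for illustration.

```python
from collections import defaultdict

def degree_centrality(edges):
    """Return each node's degree divided by (number of nodes - 1)."""
    neighbors = defaultdict(set)
    for a, b in edges:
        if a != b:  # ignore self-loops
            neighbors[a].add(b)
            neighbors[b].add(a)
    n = len(neighbors)
    if n < 2:
        return {node: 0.0 for node in neighbors}
    return {node: len(adj) / (n - 1) for node, adj in neighbors.items()}

# Invented keyword pairs that co-occur in the collected text
edges = [
    ("reform", "troops"),
    ("reform", "service"),
    ("reform", "control"),
    ("troops", "service"),
]
centrality = degree_centrality(edges)
```

Keywords with values near 1.0 are linked to almost every other keyword and tend to sit at the center of the network visualization.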

Trend Analysis of FinTech and Digital Financial Services using Text Mining (텍스트마이닝을 활용한 핀테크 및 디지털 금융 서비스 트렌드 분석)

  • Kim, Do-Hee; Kim, Min-Jeong
    • Journal of Digital Convergence / v.20 no.3 / pp.131-143 / 2022
  • Focusing on FinTech keywords, this study analyzes newspaper articles and Twitter data using text mining methodology in order to understand trends in the domestic digital financial services industry. Frequency analysis was performed around four important points in the growth of the FinTech lifecycle: Mobile Payment Service, Internet Primary Bank, Data 3 Act, and MyData Businesses. Frequency analysis combining the keywords 'China', 'USA', and 'Future' with 'FinTech' was then used to predict the current and future position of the FinTech industry. Next, sentiment analysis was conducted on Twitter data to quantify consumers' expectations and concerns about FinTech services. By combining frequency analysis and sentiment analysis, this study offers a meaningful perspective in that it presents strategic directions that the government and companies can use to understand the future FinTech market.

Strategies on Text Screen Design Of The Electronic Textbook For Focused Attention Using Automatic Text Scroll (자동 스크롤 가능을 이용한 주의력 집중을 위한 웹기반 전자교과서 텍스트 화면 설계전략)

  • Kwon, Hyunggyu
    • The Journal of Korean Association of Computer Education / v.5 no.4 / pp.134-145 / 2002
  • The purpose of this study is to present functional and technical solutions for text learning in a web-based textbook in which each letter has its own focal point. The solutions help learners keep the main focus when the eye moves to the next letter or line. The text screen of the electronic textbook automatically scrolls the text up and down or left and right, in a direction preassigned by the learner, without requiring mouse or keyboard operation. The learner can change the scroll speed and type at any time during scrolling. The automatic text scroll function is a solution for controlling data and the screen to reflect personal preferences and abilities. It covers the content structure of the text (characteristics, categorizations, etc.), the appearance of the text (density, size, font, etc.), scroll options (scroll, speed, etc.), the program control type (RAM-resident program, etc.), and the application of screen design principles (legibility, etc.). To resolve these functional problems, eight technical phases are provided: environment setting, scroll option setting, copy, data analysis, scroll coding, centered focus coding, left and right focus coding, and implementation. The learner can focus on the text without distraction because the text focal points stay in a fixed area of the screen. They read the text according to their preferences for fonts, sizes, line spacing, and so on.

Bankruptcy Prediction Modeling Using Qualitative Information Based on Big Data Analytics (빅데이터 기반의 정성 정보를 활용한 부도 예측 모형 구축)

  • Jo, Nam-ok; Shin, Kyung-shik
    • Journal of Intelligence and Information Systems / v.22 no.2 / pp.33-56 / 2016
  • Many researchers have focused on developing bankruptcy prediction models using modeling techniques, such as statistical methods including multiple discriminant analysis (MDA) and logit analysis or artificial intelligence techniques including artificial neural networks (ANN), decision trees, and support vector machines (SVM), to secure enhanced performance. Most of the bankruptcy prediction models in academic studies have used financial ratios as main input variables. The bankruptcy of a firm is associated with the firm's financial state and the external economic situation. However, the inclusion of qualitative information, such as the economic atmosphere, has not been actively discussed despite the fact that exploiting only financial ratios has some drawbacks. Accounting information, such as financial ratios, is based on past data, and it is usually determined one year before bankruptcy. Thus, a time lag exists between the point of closing financial statements and the point of credit evaluation. In addition, financial ratios do not contain environmental factors, such as external economic situations. Therefore, using only financial ratios may be insufficient in constructing a bankruptcy prediction model, because they essentially reflect past corporate internal accounting information while neglecting recent information. Thus, qualitative information must be added to the conventional bankruptcy prediction model to supplement accounting information. Due to the lack of an analytic mechanism for obtaining and processing qualitative information from various information sources, previous studies have only used qualitative information. However, recently, big data analytics, such as text mining techniques, have been drawing much attention in academia and industry, with an increasing amount of unstructured text data available on the web. A few previous studies have sought to adopt big data analytics in business prediction modeling. 
Nevertheless, the use of qualitative information on the web for business prediction modeling is still deemed to be in the primary stage, restricted to limited applications, such as stock prediction and movie revenue prediction applications. Thus, it is necessary to apply big data analytics techniques, such as text mining, to various business prediction problems, including credit risk evaluation. Analytic methods are required for processing qualitative information represented in unstructured text form due to the complexity of managing and processing unstructured text data. This study proposes a bankruptcy prediction model for Korean small- and medium-sized construction firms using both quantitative information, such as financial ratios, and qualitative information acquired from economic news articles. The performance of the proposed method depends on how well information types are transformed from qualitative into quantitative information that is suitable for incorporating into the bankruptcy prediction model. We employ big data analytics techniques, especially text mining, as a mechanism for processing qualitative information. The sentiment index is provided at the industry level by extracting from a large amount of text data to quantify the external economic atmosphere represented in the media. The proposed method involves keyword-based sentiment analysis using a domain-specific sentiment lexicon to extract sentiment from economic news articles. The generated sentiment lexicon is designed to represent sentiment for the construction business by considering the relationship between the occurring term and the actual situation with respect to the economic condition of the industry rather than the inherent semantics of the term. The experimental results proved that incorporating qualitative information based on big data analytics into the traditional bankruptcy prediction model based on accounting information is effective for enhancing the predictive performance. 
The sentiment variable extracted from economic news articles had an impact on corporate bankruptcy. In particular, a negative sentiment variable improved the accuracy of corporate bankruptcy prediction because the corporate bankruptcy of construction firms is sensitive to poor economic conditions. The bankruptcy prediction model using qualitative information based on big data analytics contributes to the field, in that it reflects not only relatively recent information but also environmental factors, such as external economic conditions.
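
The keyword-based sentiment index described in this abstract can be illustrated with a minimal sketch. It assumes a domain-specific lexicon that maps terms to +1 or -1 and an industry-level index computed as the mean of per-article scores. The lexicon entries, the scoring rule, and the sample articles are hypothetical; the abstract does not specify how the authors aggregate scores.

```python
def article_sentiment(tokens, lexicon):
    """Average the polarity of lexicon terms found in one article."""
    hits = [lexicon[t] for t in tokens if t in lexicon]
    return sum(hits) / len(hits) if hits else 0.0

def industry_sentiment_index(articles, lexicon):
    """Average article-level sentiment into one industry-level index."""
    scores = [article_sentiment(a, lexicon) for a in articles]
    return sum(scores) / len(scores) if scores else 0.0

# Hypothetical construction-industry lexicon: polarity reflects the
# economic condition a term signals, not its general meaning
lexicon = {"slump": -1, "default": -1, "recovery": 1, "orders": 1}

# Invented tokenized news articles
articles = [
    ["housing", "slump", "default"],
    ["orders", "recovery", "slump"],
    ["weather", "report"],
]
index = industry_sentiment_index(articles, lexicon)
```

An index of this kind could then enter a bankruptcy prediction model as one more input variable alongside the financial ratios.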