• Title/Summary/Keyword: traditional experiments

Enhancing LoRA Fine-tuning Performance Using Curriculum Learning

  • Daegeon Kim;Namgyu Kim
    • Journal of the Korea Society of Computer and Information / v.29 no.3 / pp.43-54 / 2024
  • Language models have recently been the subject of extensive research, and Large Language Models have achieved innovative results on a variety of tasks. In practice, however, their application is limited by the resources and costs they require. Consequently, methods that use models effectively within given resources have drawn attention. Curriculum Learning, a methodology that categorizes training data by difficulty and learns from it sequentially, has been attracting attention, but its methods of measuring difficulty tend to be complex or not universally applicable. In this study, we therefore propose a Curriculum Learning methodology based on data heterogeneity that measures the difficulty of data using reliable prior information and can be applied easily across various tasks. To evaluate the proposed methodology, experiments were conducted using 5,000 specialized documents in the field of information and communication technology and 4,917 documents in the field of healthcare. The results confirm that the proposed methodology outperforms traditional fine-tuning in classification accuracy under both LoRA fine-tuning and full fine-tuning.
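
The core idea of Curriculum Learning described in this abstract, ordering training data by difficulty and presenting it in stages, can be illustrated with a minimal sketch. The abstract does not specify how the heterogeneity-based difficulty is computed, so the `difficulty` callable and the example scores below are hypothetical placeholders, and `build_curriculum` is an illustrative name rather than the authors' implementation.

```python
from typing import Callable, List, Sequence, TypeVar

T = TypeVar("T")


def build_curriculum(
    examples: Sequence[T],
    difficulty: Callable[[T], float],
    n_stages: int,
) -> List[List[T]]:
    """Sort examples from easy to hard and split them into training stages.

    `difficulty` is an assumed scoring function; the paper derives its
    scores from prior information that is not detailed in the abstract.
    """
    if n_stages < 1:
        raise ValueError("n_stages must be at least 1")
    ordered = sorted(examples, key=difficulty)  # easiest first
    if not ordered:
        return []
    stage_size = -(-len(ordered) // n_stages)  # ceiling division
    return [ordered[i:i + stage_size] for i in range(0, len(ordered), stage_size)]


# Hypothetical (document id, difficulty score) pairs.
docs = [("doc_a", 0.9), ("doc_b", 0.1), ("doc_c", 0.5), ("doc_d", 0.3)]
stages = build_curriculum(docs, difficulty=lambda d: d[1], n_stages=2)
```

A fine-tuning loop would then iterate over `stages` in order, training on the easier stage before the harder one.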

A Real-Time Stock Market Prediction Using Knowledge Accumulation (지식 누적을 이용한 실시간 주식시장 예측)

  • Kim, Jin-Hwa;Hong, Kwang-Hun;Min, Jin-Young
    • Journal of Intelligence and Information Systems / v.17 no.4 / pp.109-130 / 2011
  • One of the major problems in data mining is the size of the data, as most data sets are now very large. Streams of data are normally accumulated in data storage or databases, and transactions on the Internet, on mobile devices, and in ubiquitous environments produce such streams continuously. Some data sets are left unused inside huge data stores because of their size, and others are lost as soon as they are created because, for various reasons, they are never saved. How to use such large data and how to use stream data efficiently are challenging questions in data mining. Stream data is a data set that is accumulated into data storage continuously from a data source, and in many cases its size grows increasingly large over time. Mining information from this massive data takes too many resources, such as storage, money, and time. These characteristics make it difficult and expensive to store all the stream data accumulated over time. On the other hand, if one uses only recent or partial data to mine information or patterns, valuable information can be lost. To avoid these problems, this study suggests a method that efficiently accumulates information or patterns in the form of a rule set over time. A rule set is mined from a data set in the stream and is accumulated into a master rule set storage, which also serves as a model for real-time decision making. One of the main advantages of this method is that it takes much less storage space than the traditional method, which saves the whole data set. Another advantage is that the accumulated rule set is used as a prediction model. Because the rule set is ready to be used for decisions at any time, user requests can be answered promptly. This makes real-time decision making possible, which is the greatest advantage of this method. According to theories of ensemble approaches, a combination of many different models can produce a prediction model with better performance. The consolidated rule set covers the whole data set, while the traditional sampling approach covers only part of it. This study uses stock market data, which is heterogeneous because the characteristics of the data vary over time. The indexes in stock market data can fluctuate whenever an event influences the stock market. The variance of the values in each variable is therefore large compared with that of a homogeneous data set, and prediction with a heterogeneous data set is naturally much more difficult. This study tests two general mining approaches and compares their prediction performance with that of the suggested method. The first approach induces a rule set from the recent data set to predict a new data set. The second induces a rule set from all the data accumulated from the beginning every time a new data set has to be predicted. We found that neither of these two performs as well as the accumulated rule set method. Furthermore, the study shows experiments with different prediction models. The first builds a prediction model only with the more important rule sets, and the second uses all the rule sets, assigning weights to the rules based on their performance. The second shows better performance than the first. The experiments also show that the suggested method can be an efficient approach for mining information and patterns from stream data. One limitation is that the method's application is confined to stock market data; a more dynamic real-time stream data set is desirable for applying it. Another problem is that, as the number of rules increases over time, special rules such as redundant or conflicting rules have to be managed efficiently.
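
To make the accumulation idea concrete, the following is a minimal sketch of a master rule set that absorbs rules mined from successive batches and predicts by weighted vote. Everything here is an assumption for illustration: the abstract does not describe the rule format, how weights are derived from performance, or how redundant rules are merged, so the `Rule` and `MasterRuleSet` names, the additive weight merge, and the example rules are hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional

Record = Dict[str, float]


@dataclass
class Rule:
    name: str                            # identifier used to detect redundant rules
    condition: Callable[[Record], bool]  # antecedent of the rule
    label: str                           # predicted class when the rule fires
    weight: float = 1.0                  # assumed performance-based weight


class MasterRuleSet:
    """Accumulates rule sets over time instead of storing the raw stream."""

    def __init__(self) -> None:
        self.rules: Dict[str, Rule] = {}

    def accumulate(self, mined: List[Rule]) -> None:
        for rule in mined:
            if rule.name in self.rules:
                # Assumed policy: a rule mined again reinforces the stored copy.
                self.rules[rule.name].weight += rule.weight
            else:
                self.rules[rule.name] = Rule(rule.name, rule.condition, rule.label, rule.weight)

    def predict(self, record: Record) -> Optional[str]:
        votes: Dict[str, float] = defaultdict(float)
        for rule in self.rules.values():
            if rule.condition(record):
                votes[rule.label] += rule.weight
        if not votes:
            return None  # no rule covers this record
        return max(votes, key=lambda label: votes[label])


# Two hypothetical batches of mined rules.
up = Rule("momentum_up", lambda r: r["change"] > 0, "up", 0.6)
down = Rule("volume_spike_down", lambda r: r["volume"] > 2.0, "down", 0.7)

master = MasterRuleSet()
master.accumulate([up])        # rules mined from the first batch
master.accumulate([up, down])  # rules mined from the second batch
```

Because only the rules are kept, storage grows with the number of distinct rules rather than with the volume of the stream, and `predict` is available at any time without re-mining.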

A Study on Xu Bing's artworks Contributed to expansion of printmaking in Contemporary Chinese Art (중국 현대미술에서의 판화 매체 확장을 일으킨 쉬빙(徐冰) 작품 연구)

  • Song, Dae-Sup;Cho, Ye-In
    • Cartoon and Animation Studies / s.45 / pp.321-343 / 2016
  • The purpose of this thesis is to examine the political and social background of China as it prepared for a new era after the Mao Zedong period of the Communist Party, the rapid inflow of Western modernism, and the avant-garde art that arose in China, with a focus on the artworks of Xu Bing, which contributed to the expansion of printmaking in China. In particular, the 85 New Wave Movement, started by young artists in 1985, and the China/Avant-Garde Exhibition held in Beijing in 1989 are two important events that reflect a new departure from traditional Chinese art. The artists of the 85 New Wave Movement, who pursued historical revolution and novelty, worked very actively by leading private exhibitions. After the Cultural Revolution, the government-owned National Museum of Fine Art in Beijing held large-scale exhibitions displaying various visual arts such as performance art, installation, painting, and sculpture, but the Chinese government interrupted exhibitions twice because of bold performance art and explicit installations. Some artists were even taken to the police while performing. Under these circumstances, Xu Bing, who majored in printmaking, produced one of his major works, Books from the Sky (1988), while carrying out various experiments focusing on the production process of printmaking and its repetitiveness. Xu Bing devised letters, carved them in wood, and finally created approximately 2,000 characters. Going further, he displayed them to audiences as an installation work, meaning that the developed characters went beyond a printed form. This earned him favorable reviews, since it was a form of Western art coupled with the Chinese content of Chinese characters. After he received unfavorable reviews, however, he went to America, leaving behind his last work in China, Ghost Pounding the Wall (1990), which could not be exhibited. In those days, Chinese society was going through a chaotic era owing to the end of the Cultural Revolution and Deng Xiaoping's (1904-1997) reforms after the debacle of the Tiananmen Massacre. This study looks into Xu Bing's artworks from his initial print works until he went to the US in 1991 and examines how he performed experiments using the reproducibility and plurality of prints tinged with traditional Chinese elements, and ultimately became one of the avant-garde artists representing the period.

A Methodology for Automatic Multi-Categorization of Single-Categorized Documents (단일 카테고리 문서의 다중 카테고리 자동확장 방법론)

  • Hong, Jin-Sung;Kim, Namgyu;Lee, Sangwon
    • Journal of Intelligence and Information Systems / v.20 no.3 / pp.77-92 / 2014
  • Recently, numerous documents including unstructured data and text have been created due to the rapid increase in the use of social media and the Internet. Each document is usually assigned a specific category for the convenience of users. In the past, categorization was performed manually. However, manual categorization cannot guarantee accuracy and requires a large amount of time and cost. Many studies have addressed the automatic creation of categories to overcome the limitations of manual categorization. Unfortunately, most of these methods cannot be applied to complex documents with multiple topics because they assume that one document can be assigned to only one category. To overcome this limitation, some studies have attempted to categorize each document into multiple categories. However, these are also limited in that their learning process requires training on a multi-categorized document set, so they cannot be applied to most documents unless multi-categorized training sets are provided. To overcome this requirement of traditional multi-categorization algorithms, we propose a new methodology that extends the category of a single-categorized document to multiple categories by analyzing the relationships among categories, topics, and documents. First, we find the relationship between documents and topics by using the result of topic analysis for single-categorized documents. Second, we construct a correspondence table between topics and categories by investigating the relationship between them. Finally, we calculate the matching scores of each document for multiple categories. A document is classified into a certain category if and only if its matching score is higher than a predefined threshold. For example, a document can be classified into three categories whose matching scores exceed the threshold. The main contribution of our study is that our methodology can improve the applicability of traditional multi-category classifiers by generating multi-categorized documents from single-categorized documents. Additionally, we propose a module for verifying the accuracy of the proposed methodology. For performance evaluation, we performed intensive experiments with news articles. News articles are clearly categorized by theme, and they contain less vulgar language and slang than other typical text documents. We collected news articles from July 2012 to June 2013. The number of articles varies widely across categories, both because readers have different levels of interest in each category and because events occur with different frequencies in each category. To minimize the distortion caused by the differing numbers of articles, we extracted 3,000 articles from each of the eight categories, for a total of 24,000 articles. The eight categories were "IT Science," "Economy," "Society," "Life and Culture," "World," "Sports," "Entertainment," and "Politics." Using the collected news articles, we calculated the document/category correspondence scores from the topic/category and document/topic correspondence scores. The document/category correspondence score indicates the degree to which each document corresponds to a certain category. As a result, we could present two additional categories for each of 23,089 documents. Precision, recall, and F-score were 0.605, 0.629, and 0.617, respectively, when only the top 1 predicted category was evaluated, whereas they were 0.838, 0.290, and 0.431 when the top 1 to 3 predicted categories were considered. Interestingly, precision, recall, and F-score varied widely among the eight categories.
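
The three steps in this abstract can be sketched as follows. The abstract does not give the formula that combines the document/topic and topic/category scores, so the sum of products used in `category_scores` is an assumption, as are the function names, the toy weights, and the thresholds. The `f_score` helper uses the standard harmonic mean of precision and recall, which reproduces the F-scores reported above.

```python
from typing import Dict, List


def category_scores(
    doc_topics: Dict[str, float],
    topic_categories: Dict[str, Dict[str, float]],
) -> Dict[str, float]:
    """Combine document/topic and topic/category scores (assumed: sum of products)."""
    scores: Dict[str, float] = {}
    for topic, doc_weight in doc_topics.items():
        for category, cat_weight in topic_categories.get(topic, {}).items():
            scores[category] = scores.get(category, 0.0) + doc_weight * cat_weight
    return scores


def assign_categories(scores: Dict[str, float], threshold: float) -> List[str]:
    """Keep every category whose matching score exceeds the threshold, best first."""
    kept = [c for c, s in scores.items() if s > threshold]
    return sorted(kept, key=lambda c: -scores[c])


def f_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (F1)."""
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0


# Hypothetical topic mixture for one document and a toy correspondence table.
doc_topics = {"t1": 0.5, "t2": 0.5}
topic_categories = {
    "t1": {"Economy": 1.0},
    "t2": {"Economy": 0.5, "Politics": 0.5},
}
scores = category_scores(doc_topics, topic_categories)
multi = assign_categories(scores, threshold=0.2)
single = assign_categories(scores, threshold=0.5)
```

Raising the threshold trades recall for precision, which is consistent with the direction of the top 1 versus top 1 to 3 results reported in the abstract.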

Design and Implementation of IoT based Low cost, Effective Learning Mechanism for Empowering STEM Education in India

  • Simmi Chawla;Parul Tomar;Sapna Gambhir
    • International Journal of Computer Science & Network Security / v.24 no.4 / pp.163-169 / 2024
  • India is a developing nation that has come a long way in modernizing its economy, reducing poverty, and raising living standards for a large segment of its residents. STEM (Science, Technology, Engineering, and Mathematics) education plays an important role in this. STEM is an educational curriculum that emphasizes the subjects of science, technology, engineering, and mathematics. In the traditional education scenario these subjects are taught independently, whereas the educational philosophy of STEM teaches them together in project-based lessons. STEM helps students in their holistic development. Youth unemployment is the biggest concern, owing to a lack of adequate skills. There is a huge skill gap behind jobless engineers, and the question arises of how we can prepare engineers for a better tomorrow. Industry 4.0, the fourth industrial revolution, is the intelligent networking of machines and processes for industry through ICT. It is based on the use of cyber-physical systems and the Internet of Things (IoT). The industrial revolution influences not only production but also the educational system. IoT in academics is a new revolution in Internet technology that has introduced "smartness" into the entire IT infrastructure. To improve the socio-economic status of India, students must be equipped with 21st-century digital skills, and universities and colleges must provide their students with individual learning kits that can help them enhance their productivity and learning outcomes. The major goal of this paper is to present a low-cost, effective learning mechanism for STEM implementation using the Raspberry Pi 3+ model (a single-board computer) and Node-RED, an open source visual programming tool developed by IBM for wiring hardware devices together. These tools are broadly used to provide hands-on experience with IoT fundamentals during teaching and learning. This paper elaborates the appropriateness and practicality of these concepts through an example that implements a user interface (UI) and dashboard in Node-RED, where the dashboard palette is used for a demonstration with a switch, slider, and gauge, and the Raspberry Pi palette is used to connect to the GPIO pins on the Raspberry Pi board. An LED light is connected to a GPIO pin configured as an output pin. The experiment shows that the Node-RED dashboard can be accessed on the Raspberry Pi and via a smartphone as well. In the final step, the results are shown in detail. However, inadequate programming skills among students are the biggest challenge, because without good programming skills there would be no pioneers in engineering, robotics, and other areas. Coding plays an important role in increasing knowledge on a wide scale, and students' interest in coding should be encouraged. Today Python, which is open source, is one of the most in-demand languages in industry for learning data science and algorithms, and understanding computer science would not be possible without science, technology, engineering, and math. In this paper a small experiment is also done with an LED light by writing source code in Python. These small experiments are really helpful in encouraging students and give them a playful way to learn these advanced technologies. A cost estimate per learning kit provided to students for hands-on experiments is presented in tabular form. In addition, some popular open source tools for experimenting with IoT technology are described. Students can enrich their knowledge by doing many experiments with this freely available software and this low-cost hardware in labs or with the learning kits provided to them.
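
The Python LED experiment mentioned in this abstract is not reproduced in the listing, so the following is only a sketch of what such an exercise might look like. The `blink` function and the `FakeLED` stand-in are hypothetical names; the stand-in allows the logic to run without hardware. On a Raspberry Pi, an object with the same `on()` and `off()` methods is provided by the `gpiozero` library's `LED` class, and the GPIO pin number shown in the comment is an assumption.

```python
import time
from typing import Callable, List


class FakeLED:
    """Hardware-free stand-in that records state changes instead of driving a pin."""

    def __init__(self) -> None:
        self.history: List[str] = []

    def on(self) -> None:
        self.history.append("on")

    def off(self) -> None:
        self.history.append("off")


def blink(
    led,
    times: int,
    interval: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> None:
    """Switch the LED on and off `times` times, pausing `interval` seconds each step."""
    for _ in range(times):
        led.on()
        sleep(interval)
        led.off()
        sleep(interval)


if __name__ == "__main__":
    # On a Raspberry Pi with gpiozero installed, a real LED could be used instead:
    #     from gpiozero import LED
    #     blink(LED(17), times=5)   # GPIO17 is an assumed pin choice
    demo = FakeLED()
    blink(demo, times=2, interval=0.0)
    print(demo.history)
```

Passing the LED object in as a parameter lets students test the same function on a laptop before wiring the circuit.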

The Strategy for the Development of Bio-Resources Utilizing Sericultural Products and Insects

  • Lee, Won-Chu;Kim, Iksoo
    • International Journal of Industrial Entomology and Biomaterials / v.1 no.2 / pp.95-102 / 2000
  • Experiments related to sericulture started in Korea in the 1900s. The sericultural experimental station was the first to be organized among the agricultural fields in Korea, indicating that sericulture was regarded as an important field of agriculture. Sericulture has contributed a great deal to the improvement of the Korean economy during the past 100 years, even under the harsh social circumstances caused particularly by the Korean War. However, traditional Korean sericulture, aimed at producing silk yarn, weakened for several reasons, such as diminished silk consumption and increased labor costs in Korea. After this difficult period, Korean sericulture was revolutionized by shifting to functional sericulture from 1995, and it now plays an important role in the improvement of human health. The mulberry tree, the silkworm, and silk have boundless potential to be developed as resources. We expect that the know-how obtained through silkworm research will extend to other insect research as well, so that an entomological industry may prosper through insect research as well as sericulture. The mulberry tree is known to possess many bioactive substances, so it can be used as a resource for alternative medicine and as a raw material for functional food. In addition, the development of genetically engineered mulberry varieties that produce more bioactive substances is expected. The silkworm is one of the insects whose genome has been most extensively studied so far. It is thus expected to become an "insect bio-factory," enabling mass production of useful proteins by transformation, in which useful foreign genes are introduced into the silkworm. Silk can be transformed into several phases because it possesses useful functional groups that are sensitive to chemical reaction. Also, because silk fibroin is itself a protein, it has superior applicability as a tissue membrane. Owing to this usefulness, many researchers are now working on silk as a food, cosmetic, medical, and bioengineering resource, and even wider applications of silk are expected in the future. Until now, research on insects has largely focused on preventing the damage caused by pests rather than on beneficial aspects. However, insects are thought to be the fourth natural resource in the world, possessing unlimited potential as world resources in the near future. Therefore, our entomological research effort should be focused on subjects with potential for industrialization. Such subjects include selecting insect species useful for environmental evaluation, the construction of environment-friendly agricultural ecosystems, pollination, pets, and advanced bio-resources.

A Remote Trace Debugger for Multi-Task Programs in Qplus-T Embedded Internet System (Qplus-T내장형 인터넷 시스템에서 멀티 태스크 프로그램을 위한 원격 트레이스 디버거)

  • 이광용;김흥남
    • Journal of KIISE:Computing Practices and Letters / v.9 no.2 / pp.166-181 / 2003
  • With the rapid growth of the Internet, many devices such as Web TVs, PDAs, and Web phones are beginning to be connected directly to the Internet. These devices need real-time operating systems (RTOS) to support the complex real-time applications running on them. Developing such real-time applications, called embedded Internet applications, is difficult due to the lack of adequate tools, especially debuggers. In this paper we present a new trace-point debugging tool for the Qplus-T RTOS embedded system, which facilitates the instrumentation of real-time software applications with timing trace-points. Compared with a traditional breakpoint debugger, this trace-point debugger can dynamically collect and record application data for on-line examination and for further off-line analysis. The trace-points also provide a means of assigning new values to the running application's variables without halting its execution or interfering with its natural execution flow. Our trace-point debugger provides a highly efficient method for adding numerous monitoring trace-points within a real-time target application such as a Qplus-T Internet application, and for using these trace-points to monitor and analyze the application's behavior while it is running. Our trace debugger also differs from previous ones in that timing violations can be specified and detected using its RTL (Real-Time Logic) trace experiments.

Caching and Concurrency Control in a Mobile Client/Server Computing Environment (이동 클라이언트/서버 컴퓨팅환경에서의 캐싱 및 동시성 제어)

  • Lee, Sang-Geun;Hwang, Jong-Seon;Lee, Won-Gyu;Yu, Heon-Chang
    • Journal of KIISE:Software and Applications / v.26 no.8 / pp.974-987 / 1999
  • In a mobile computing environment, caching of frequently accessed data has been shown to be a useful technique for reducing contention on the narrow bandwidth of wireless channels. However, the traditional client/server strategies for supporting transactional cache consistency require extensive communication between a client and a server and are therefore not appropriate in a mobile client/server computing environment. In this paper, we propose a new protocol, called OCC-UTS (Optimistic Concurrency Control with Update TimeStamp), to support transactional cache consistency in a mobile client/server computing environment by using broadcast-based solutions to the problem of invalidating caches. The consistency check on accessed data and the commitment protocol are implemented in a truly distributed fashion as an integral part of the cache invalidation process, with most of the burden of the consistency check offloaded to mobile clients. Our experiments, based on an analytical model, substantiate the basic idea and study the performance characteristics. Experimental results show that the OCC-UTS protocol without local cache outperforms other competing protocols, and that the more frequently a mobile client accesses data items, the more efficient the OCC-UTS protocol with local cache is. With respect to disconnection, tolerance is improved if the invalidation broadcast window size is extended.
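
The abstract does not spell out the OCC-UTS protocol, so the following is not that protocol but a generic, simplified sketch of the idea it builds on: a mobile client validates its own reads against a broadcast invalidation report that carries update timestamps. The `MobileClient` class, its fields, and the report format are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Set


@dataclass
class MobileClient:
    cache: Dict[str, str] = field(default_factory=dict)  # local copies of data items
    read_set: Set[str] = field(default_factory=set)      # items read by the active transaction
    last_report_ts: int = 0                               # timestamp of the last report processed
    aborted: bool = False

    def read(self, item: str, server_fetch: Callable[[str], str]) -> str:
        """Read from the local cache, contacting the server only on a miss."""
        if item not in self.cache:
            self.cache[item] = server_fetch(item)
        self.read_set.add(item)
        return self.cache[item]

    def on_invalidation_report(self, report: Dict[str, int], report_ts: int) -> None:
        """Process a broadcast report mapping item -> update timestamp.

        Updates newer than the last processed report invalidate the cached
        copy; if the active transaction has read such an item, it is marked
        for abort locally, without a round trip to the server.
        """
        for item, update_ts in report.items():
            if update_ts > self.last_report_ts:
                self.cache.pop(item, None)
                if item in self.read_set:
                    self.aborted = True
        self.last_report_ts = report_ts


# Hypothetical server state and two clients.
server = {"x": "x0", "y": "y0"}

reader_xy = MobileClient()
reader_xy.read("x", server.__getitem__)
reader_xy.read("y", server.__getitem__)

reader_y = MobileClient()
reader_y.read("y", server.__getitem__)

# The server broadcasts that item x was updated at timestamp 5.
for client in (reader_xy, reader_y):
    client.on_invalidation_report({"x": 5}, report_ts=10)
```

In this sketch a longer broadcast window would simply mean that each report carries older update timestamps, so a client that missed earlier reports while disconnected can still validate its cache.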

Two-Way Donation Locking for Transaction Management in Database Systems (데이터베이스 시스템에서 거래관리를 위한 두단계기부잠금규약)

  • Rhee, Hae-Kyung;Kim, Ung-Mo
    • Journal of the Institute of Electronics Engineers of Korea CI / v.37 no.2 / pp.1-10 / 2000
  • Traditional syntax-oriented serializability notions are considered insufficient to handle transactions of various types, particularly in terms of execution duration. To deal with this situation, altruistic locking has attempted to reduce the delay associated with the moment of lock release by using the idea of donation. An improved form of altruism has also been deployed in extended altruistic locking, in which the scope of data to be released early is enlarged to include even data not initially intended to be donated. In this paper, we first investigate the limitations inherent in both altruistic schemes from the perspective of alleviating starvation for transactions, particularly short-lived ones. The idea of two-way donation locking (2DL) is then tested to see the effect of more than a single donation in client-server database systems. Simulation experiments show that 2DL outperforms conventional two-phase locking in terms of the degree of concurrency and average transaction waiting time when the size of the long transaction is between 5 and 9.

The opening efficiency of the miniaturized large-scale net for anchovy boat seine to reduce the fleet size (선단 축소를 위한 기선권현망 축소형 대형 어구의 전개 성능)

  • AN, Young-Su;BACK, Young-Su;JIN, Song-Han;JANG, Choong-Sik;KANG, Myoung-Hee;CHA, Bong-Jin;CHO, Youn-Hyoung;KIM, Bo-Yeon;CHA, Ju-Hyeng
    • Journal of the Korean Society of Fisheries and Ocean Technology / v.54 no.1 / pp.12-24 / 2018
  • This study was conducted to improve the opening efficiency of the miniaturized large-scale net for anchovy boat seine gear in order to reduce the fleet size. Field experiments were performed to observe the geometry of the nets towed by catcher boats when the distances between the two ships were 150, 300, and 450 m and the towing speeds were 0.6, 0.9, and 1.2 k't. The vertical opening and actual opening of each part of the miniaturized large-scale net were as follows: the front part of the wing net, 8.7-13.3 m, 51-78%; the middle part of the wing net, 28.1-34.2 m, 55-67%; the entrance of the inside wing net, 31.3-38.5 m, 60-73%; the square and bosom, 22.7-29.6 m, 47-62%; the entrance of the body net, 20.9-26.4 m, 42-52%; the entrance of the bag net, 17.2-21 m, 72-89%; the flapper, 13.2-15.3 m, 78-83%; and the end of the bag net, 13.2-15.7 m, 72-75%. Connecting the net pendants to the front part of the wing net significantly improved the opening of that part compared with the traditional gear, which gave both the wing net and the inside wing net a normal net height. This, in turn, increased the efficiency of herding. The height of the body and bag nets was also greater than that of the traditional gear. In particular, the body net attached to the gear significantly improved the pocket shape of the gear and reduced the number of fish that escaped from the bag net after being caught, which increased the catch rate. The tension of the towing nets was measured at approximately 2,958 to 7,110 kg, which indicates that the fleet can tow the nets with 350 ps, the standard engine horsepower. The fishing operation time was shorter than with the existing net, and the attachment of large-scale buoys also made it possible to operate without a fish-detecting boat.