• Title/Summary/Keyword: chunks


A Z-Index based MOLAP Cube Storage Scheme (Z-인덱스 기반 MOLAP 큐브 저장 구조)

  • Kim, Myung;Lim, Yoon-Sun
    • Journal of KIISE:Databases / v.29 no.4 / pp.262-273 / 2002
  • MOLAP is a technology that accelerates multidimensional data analysis by storing data in a multidimensional array and accessing them using their position information. Depending on how the multidimensional array is mapped onto disk, the speed of MOLAP operations such as slice and dice varies significantly. [1] proposed a MOLAP cube storage scheme that divides a cube into small chunks of equal side length, compresses sparse chunks, and stores the chunks in row-major order of their chunk indexes. This type of cube storage scheme gives a fair chance to all dimensions of the input data. Here, we developed a variant of their cube storage scheme that places chunks in a different order. Our scheme accelerates slice and dice operations by aligning chunks to physical disk block boundaries and clustering neighboring chunks. Z-indexing is used for chunk clustering. The efficiency of the proposed scheme is evaluated through experiments. We showed that the proposed scheme is efficient for 3- to 5-dimensional cubes, which are frequently used to analyze business data.
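
As a rough illustration of the chunk ordering described in this abstract, the following Python sketch computes a Z-index (Morton key) for a chunk's coordinates and, for comparison, the row-major index used by the scheme attributed to [1]. The function names, the `bits` parameter, and the 4x4 chunk grid are assumptions made for illustration; disk block alignment and chunk compression are not modeled, and this is not the paper's implementation.

```python
def z_index(coords, bits):
    """Interleave the bits of each chunk coordinate into one Z-order key.

    coords: tuple of non-negative chunk coordinates, one per dimension.
    bits:   number of bits needed to represent the largest coordinate.
    """
    key = 0
    ndim = len(coords)
    for bit in range(bits):
        for dim, c in enumerate(coords):
            # Bit `bit` of dimension `dim` lands at position bit * ndim + dim.
            key |= ((c >> bit) & 1) << (bit * ndim + dim)
    return key


def row_major_index(coords, chunks_per_side):
    """Row-major index of a chunk in a cube with equal side length."""
    idx = 0
    for c in coords:
        idx = idx * chunks_per_side + c
    return idx


# Order the chunks of a hypothetical 4x4 chunk grid by Z-index.
grid = [(x, y) for x in range(4) for y in range(4)]
z_order = sorted(grid, key=lambda c: z_index(c, bits=2))
```

In this ordering the first four chunks form a 2x2 neighborhood, which is the clustering effect that Z-indexing is used for; in row-major order the first four chunks would all lie along one dimension.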

Detection of Needle in trimmings or meat offals using DCGAN (DCGAN을 이용한 잡육에서의 바늘 검출)

  • Jang, Won-Jae;Cha, Yun-Seok;Keum, Ye-Eun;Lee, Ye-Jin;Kim, Jeong-Do
    • Journal of Sensor Science and Technology / v.30 no.5 / pp.300-308 / 2021
  • Usually, during slaughter, the meat is divided into large chunks by part after deboning. The meat chunks are inspected for the presence of needles with an X-ray scanner. Although needles in the meat chunks are easily detectable, they can also be found in trimmings and meat offals, where meat skins, fat chunks, and pieces of meat from different parts are agglomerated. Detecting needles in trimmings and meat offals is challenging because the X-ray scanner picks up many needle-like patterns. This problem can be solved by learning the appearance of trimmings or meat offals with deep learning. However, it is not easy to collect a large number of training patterns for trimmings or meat offals. In this study, we demonstrate the use of a deep convolutional generative adversarial network (DCGAN) to create fake images of trimmings or meat offals and train a convolutional neural network (CNN) on them.

A Bitmap Index for Chunk-Based MOLAP Cubes (청크 기반 MOLAP 큐브를 위한 비트맵 인덱스)

  • Lim, Yoon-Sun;Kim, Myung
    • Journal of KIISE:Databases / v.30 no.3 / pp.225-236 / 2003
  • MOLAP systems store data in a multidimensional array called a 'cube' and access them using array indexes. When a cube is placed on disk, it can be partitioned into a set of chunks of the same side length. Such a cube storage scheme is called the chunk-based MOLAP cube storage scheme. It gives a data clustering effect so that all the dimensions are guaranteed a fair chance in terms of query processing speed. In order to achieve high space utilization, sparse chunks are further compressed. Due to data compression, the relative position of chunks cannot be obtained in constant time without using indexes. In this paper, we propose a bitmap index for chunk-based MOLAP cubes. The index can be constructed along with the corresponding cube generation. The relative position of chunks is retained in the index so that chunk retrieval can be done in constant time. We placed as many chunks as possible in an index block so that the number of index searches is minimized for OLAP operations such as range queries. We showed that the proposed index is efficient by comparing it with multidimensional indexes such as the UB-tree and the grid file in terms of time and space.
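
The idea of recovering a chunk's relative position from a bitmap can be sketched in Python as follows. This is a generic rank-over-bitmap illustration, not the index layout proposed in the paper: the class name `ChunkBitmap`, the `block_bits` parameter, and the per-word prefix counts are all assumptions. One bit per chunk records whether the chunk is stored, and the position of a stored chunk is the number of set bits before it.

```python
class ChunkBitmap:
    """One bit per chunk; stored (non-empty) chunks are packed in bit order."""

    def __init__(self, present_flags, block_bits=64):
        self.block_bits = block_bits
        self.words = []    # bitmap words, block_bits chunks per word
        self.prefix = []   # number of stored chunks before each word
        count = 0
        for start in range(0, len(present_flags), block_bits):
            word = 0
            for offset, flag in enumerate(present_flags[start:start + block_bits]):
                if flag:
                    word |= 1 << offset
            self.words.append(word)
            self.prefix.append(count)
            count += bin(word).count("1")
        self.stored = count

    def position(self, chunk_no):
        """Slot of chunk_no among the stored chunks, or None if it is empty."""
        block, offset = divmod(chunk_no, self.block_bits)
        word = self.words[block]
        if not (word >> offset) & 1:
            return None
        below = word & ((1 << offset) - 1)   # bits of earlier chunks in this word
        return self.prefix[block] + bin(below).count("1")


# Seven chunks in storage order; chunks 1, 2 and 5 are empty and not stored.
index = ChunkBitmap([1, 0, 0, 1, 1, 0, 1], block_bits=4)
```

Each lookup touches one bitmap word and one prefix count, so its cost does not grow with the number of chunks.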

Research on Keyword-Overlap Similarity Algorithm Optimization in Short English Text Based on Lexical Chunk Theory

  • Na Li;Cheng Li;Honglie Zhang
    • Journal of Information Processing Systems / v.19 no.5 / pp.631-640 / 2023
  • Short-text similarity calculation is one of the hot issues in natural language processing research. Conventional keyword-overlap similarity algorithms consider only lexical item information and neglect the effect of word order. Some optimized variants incorporate word order, but their weights are hard to determine. In this paper, building on the keyword-overlap similarity algorithm, a short English text similarity algorithm based on lexical chunk theory (LC-SETSA) is proposed, which introduces lexical chunk theory from cognitive psychology into short English text similarity calculation for the first time. Lexical chunks are used to segment short English texts, the segmentation results reflect the semantic connotation and the fixed word order of the lexical chunks, and the overlap similarity of the lexical chunks is then calculated accordingly. Finally, comparative experiments are carried out, and the experimental results show that the proposed algorithm is feasible, stable, and effective to a large extent.
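
A minimal Python sketch of overlap similarity computed over lexical chunks rather than single words is given below. It is not LC-SETSA itself: the greedy longest-match segmentation, the tiny chunk lexicon, and the Dice-style overlap formula are all assumptions chosen for illustration, since the abstract does not specify them.

```python
def segment(text, chunk_lexicon, max_len=4):
    """Greedily split text into known lexical chunks, falling back to single words."""
    words = text.lower().split()
    units, i = [], 0
    while i < len(words):
        # Try the longest candidate first; a single word always matches.
        for n in range(min(max_len, len(words) - i), 0, -1):
            candidate = " ".join(words[i:i + n])
            if n == 1 or candidate in chunk_lexicon:
                units.append(candidate)
                i += n
                break
    return units


def overlap_similarity(units_a, units_b):
    """Dice-style overlap of two unit sets (an assumed formula)."""
    a, b = set(units_a), set(units_b)
    if not a or not b:
        return 0.0
    return 2 * len(a & b) / (len(a) + len(b))


lexicon = {"take care of", "look after"}          # hypothetical chunk lexicon
s1 = segment("She will take care of him", lexicon)
s2 = segment("He will take care of her", lexicon)
score = overlap_similarity(s1, s2)
```

Because "take care of" is kept as one unit, its internal word order is preserved in the comparison without any separate word-order weight.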

Dependency Parsing by Chunks (단위(Chunks) 분석과 의존문법에 기반한 한국어 구문분석)

  • 김미영;강신재;이종혁
    • Proceedings of the Korean Information Science Society Conference / 2000.04b / pp.327-329 / 2000
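  • Most existing parsing methods are based on phrase structure grammar or dependency grammar. Such parsing takes a long time because many candidate analyses are produced, and it is also difficult to find and prune incorrect analyses. This paper proposes applying chunking before the dependency grammar required for parsing is applied. Doing so reduces the number of charts to which the dependency grammar is applied, restricts the scope within which dependency relations are established, and speeds up parsing.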


A Corpus-based Lexical Analysis of the Speech Texts: A Collocational Approach

  • Kim, Nahk-Bohk
    • English Language & Literature Teaching / v.15 no.3 / pp.151-170 / 2009
  • Recently, speech texts have been increasingly used for English education because of their various advantages as language teaching and learning materials. The purpose of this paper is to analyze speech texts with a corpus-based lexical approach and to suggest some productive methods that use English speaking or writing as the main resource for the course, along with introducing actual classroom adaptations. First, this study shows that a speech corpus has some unique features, such as different selections of pronouns, nouns, and lexical chunks, in comparison to a general corpus. Next, from a collocational perspective, the study demonstrates that the speech corpus consists of a wide variety of the collocations and lexical chunks that a number of linguists describe (Lewis, 1997; McCarthy, 1990; Willis, 1990). In other words, the speech corpus suggests not only that speech texts have considerable lexical potential that could be exploited to facilitate chunk-learning, but also that learners are not very likely to unlock this potential autonomously. Based on this result, teachers can develop a learners' corpus and use it by chunking the speech text. This new approach of adapting speech samples as important materials for college students' speaking or writing ability should be implemented as shown in samplers. Finally, to foster learners' productive skills more communicatively, a few practical suggestions are made, such as chunking and windowing chunks of speech and presentation, and the pedagogical implications are discussed.


Efficient Content Delivery Method in Wireless Content-Centric Network (무선 Content-Centric Network에서 효과적인 콘텐츠 전달 방식)

  • Park, Chan-Min;Kim, Byung-Seo
    • Journal of Internet Computing and Services / v.18 no.2 / pp.13-20 / 2017
  • Recently, research on adopting the content-centric network (CCN), one of the promising technologies for replacing TCP/IP-based networks, in wireless networks has been actively pursued. However, because of erroneous and unreliable channel characteristics, many problems must be resolved before CCN can be adopted in wireless networks. This paper proposes a method to reduce the content download time that arises when nodes possess only some of a content's chunks. The proposed method enables a node holding part of the content chunks to request the remaining chunks from a provider before a consumer requests the content. As a consequence, the content download time is reduced.
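
The core step of the proposed method, a node working out which chunks it still lacks and requesting them ahead of any consumer, can be sketched in Python as below. The function name, the `/chunk/<n>` naming scheme, and the example content name are hypothetical; real CCN Interest and Data packet handling is not modeled.

```python
def prefetch_requests(content_name, total_chunks, cached):
    """Names of the chunks a node should request to complete its partial copy.

    content_name: name prefix of the content (hypothetical naming scheme).
    total_chunks: number of chunks the content is split into.
    cached:       set of chunk numbers the node already holds.
    """
    return [
        f"{content_name}/chunk/{i}"
        for i in range(total_chunks)
        if i not in cached
    ]


# A node holding chunks 0 and 2 of a five-chunk content item.
pending = prefetch_requests("/video/a", 5, {0, 2})
```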

Influence of Milk Co-precipitates on the Quality of Restructured Buffalo Meat Blocks

  • Kumar, Sunil;Sharma, B.D.;Biswas, A.K.
    • Asian-Australasian Journal of Animal Sciences / v.17 no.4 / pp.564-568 / 2004
  • Restructuring has made it possible to utilize lower value cuts and meat trimmings from spent animals by providing convenience in product preparation besides enhancing tenderness, palatability and value. Milk co-precipitates (MCP) have been reported to improve the nutritional and functional properties of certain meat products. This study was undertaken to evaluate the influence of incorporating milk co-precipitates at four different levels, viz. 0, 10, 15 and 20%, on the quality of restructured buffalo meat blocks. Low-calcium milk co-precipitates were prepared from skim milk by heat and salt coagulation of milk proteins. Meat chunks were mixed with the curing ingredients and chilled water in a Hobart mixer for 5 minutes, followed by addition of milk co-precipitates along with condiments and spice mix, and mixed again for 5 minutes. Treated chunks were stuffed in aluminium moulds and cooked in steam without pressure for 1.5 h. After cooking, treated meat blocks were compared for different physico-chemical and sensory attributes. Meat blocks incorporated with 10% MCP were significantly better (p<0.05) than those incorporated with 0, 15 and 20% MCP in cooking yield, percent shrinkage and moisture retention. Sensory scores were also marginally higher for meat blocks incorporated with 10% MCP than for products incorporated with 15 and 20% MCP, besides being significantly higher than the control. On the basis of the above results, 10% MCP was considered optimum for the preparation of restructured buffalo meat blocks. Instrumental texture profile analysis revealed that meat blocks incorporated with 10% MCP were significantly better (p<0.05) in hardness/firmness than the control, although no significant (p>0.05) differences were observed in cohesiveness, springiness, gumminess and chewiness between the two types of samples.

A Study of Method to Restore Deduplicated Files in Windows Server 2012 (윈도우 서버 2012에서 데이터 중복 제거 기능이 적용된 파일의 복원 방법에 관한 연구)

  • Son, Gwancheol;Han, Jaehyeok;Lee, Sangjin
    • Journal of the Korea Institute of Information Security & Cryptology / v.27 no.6 / pp.1373-1383 / 2017
  • Deduplication is a function to effectively manage data and improve the efficiency of storage space. When deduplication is applied to a system, storage space is used efficiently by dividing each stored file into chunks and storing only unique chunks. However, commercial digital forensic tools do not support the file system analysis, and the original file extracted by such a tool cannot be executed or opened. Therefore, in this paper, we analyze the process of generating chunks of data on a Windows Server 2012 system to which deduplication can be applied, and the structure of the resulting file (Chunk Storage). We also analyze the case, not covered in the previous study, where chunks are compressed. Based on these results, we propose a method to collect deduplicated data and reconstruct the original file for digital forensic investigation.
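
The general principle of chunk-level deduplication and file reconstruction can be sketched in Python as follows. This is a simplified illustration only: fixed-size chunks, SHA-256 digests, and an in-memory dictionary are assumptions made for brevity, and the sketch does not reproduce the Windows Server 2012 Chunk Storage format or its chunk compression.

```python
import hashlib


def deduplicate(data, chunk_size=4):
    """Split data into chunks, keep one copy of each, and record the order.

    Returns (store, recipe): store maps digest -> chunk bytes, and recipe is
    the list of digests needed to rebuild the original data.
    """
    store, recipe = {}, []
    for i in range(0, len(data), chunk_size):
        chunk = data[i:i + chunk_size]
        digest = hashlib.sha256(chunk).hexdigest()
        store.setdefault(digest, chunk)   # only the first copy is stored
        recipe.append(digest)
    return store, recipe


def restore(store, recipe):
    """Rebuild the original data by concatenating chunks in recipe order."""
    return b"".join(store[digest] for digest in recipe)


original = b"ABCDABCDEFGH"
store, recipe = deduplicate(original)
rebuilt = restore(store, recipe)
```

Restoring a deduplicated file therefore requires both the chunk store and the per-file list of chunk references; having only one of them is not enough to recover the original content.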

Identification of Maximal-Length Noun Phrases Based on Expanded Chunks and Classified Punctuations in Chinese (확장청크와 세분화된 문장부호에 기반한 중국어 최장명사구 식별)

  • Bai, Xue-Mei;Li, Jin-Ji;Kim, Dong-Il;Lee, Jong-Hyeok
    • Journal of KIISE:Software and Applications / v.36 no.4 / pp.320-328 / 2009
  • In general, there are two types of noun phrases (NP): Base Noun Phrase (BNP) and Maximal-Length Noun Phrase (MNP). MNP identification can greatly reduce the complexity of full parsing, help analyze the general structure of complex sentences, and provide important clues for detecting main predicates in Chinese sentences. In this paper, we propose a 2-phase hybrid approach for MNP identification which adopts salient features such as expanded chunks and classified punctuations to improve performance. Experimental results show a high performance of 89.66% in $F_1$-measure.
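
For reference, the $F_1$-measure is conventionally the harmonic mean of precision $P$ and recall $R$. The definitions below assume the standard usage for phrase identification tasks; the abstract does not report $P$ and $R$ separately, so no values are filled in.

```latex
$$
F_1 = \frac{2 \cdot P \cdot R}{P + R}, \qquad
P = \frac{\text{correctly identified MNPs}}{\text{all identified MNPs}}, \qquad
R = \frac{\text{correctly identified MNPs}}{\text{all MNPs in the reference annotation}}
$$
```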