• Title/Summary/Keyword: Maximal matching algorithm

Search results: 8

Bi-directional Maximal Matching Algorithm to Segment Khmer Words in Sentence

  • Mao, Makara;Peng, Sony;Yang, Yixuan;Park, Doo-Soon
    • Journal of Information Processing Systems
    • /
    • v.18 no.4
    • /
    • pp.549-561
    • /
    • 2022
  • Khmer script, the official script of Cambodia, is written from left to right without a space separator, which makes it difficult to analyze. Because there are no clear standard guidelines, spaces are used inconsistently and informally to separate words in Khmer sentences, so a segmentation method needs to be defined as a foundation for future Khmer natural language processing (NLP). A key step in Khmer language processing is splitting a sentence into a sequence of words and counting the words it contains; Microsoft Word, for example, currently cannot count Khmer words correctly. This study therefore presents a library that segments Khmer phrases using the bi-directional maximal matching (BiMM) method, which combines forward maximal matching (FMM) and backward maximal matching (BMM) to improve word segmentation accuracy. A digital or prefix tree (trie) supports the segmentation procedure by following the children of each parent node in the dictionary. BiMM is more accurate than FMM or BMM used independently; moreover, the proposed approach improves the dictionary structure and reduces the number of segmentation errors, cutting errors by 8.57% compared to the FMM and BMM algorithms on 94,807 Khmer words.
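For intuition, the following is a minimal sketch of bi-directional maximal matching over a trie. The toy Latin-script vocabulary and the tie-breaking rule (prefer the segmentation with fewer tokens, then fewer single-character tokens) are illustrative assumptions and do not come from the paper.

```python
# Minimal sketch of bi-directional maximal matching (BiMM) with a trie.
# The sample vocabulary and the tie-breaking rule are illustrative assumptions.

class TrieNode:
    def __init__(self):
        self.children = {}
        self.is_word = False

def build_trie(words):
    root = TrieNode()
    for w in words:
        node = root
        for ch in w:
            node = node.children.setdefault(ch, TrieNode())
        node.is_word = True
    return root

def forward_mm(text, root):
    """Greedily take the longest dictionary word starting at each position."""
    tokens, i = [], 0
    while i < len(text):
        node, longest, j = root, i + 1, i      # fall back to one character
        while j < len(text) and text[j] in node.children:
            node = node.children[text[j]]
            j += 1
            if node.is_word:
                longest = j
        tokens.append(text[i:longest])
        i = longest
    return tokens

def backward_mm(text, words, max_len):
    """Greedily take the longest dictionary word ending at each position."""
    tokens, j = [], len(text)
    while j > 0:
        i = max(0, j - max_len)
        while i < j - 1 and text[i:j] not in words:
            i += 1
        tokens.append(text[i:j])               # falls back to one character
        j = i
    tokens.reverse()
    return tokens

def bimm(text, words):
    """Run FMM and BMM, then keep the segmentation with fewer tokens
    (ties broken by fewer single-character tokens)."""
    fwd = forward_mm(text, build_trie(words))
    bwd = backward_mm(text, words, max(map(len, words)))
    score = lambda seg: (len(seg), sum(1 for t in seg if len(t) == 1))
    return min((fwd, bwd), key=score)

if __name__ == "__main__":
    vocab = {"the", "them", "theme", "me", "men", "mention"}
    print(bimm("themention", vocab))   # ['the', 'mention'] (from BMM)
```

In this toy case FMM over-segments ("theme" plus leftover characters) while BMM recovers the intended words, which is exactly the kind of disagreement a bi-directional scheme arbitrates.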

A Scheduling of Switch Ports for IP Forwarding (IP 포워딩을 위한 스위치 포트 스케쥴링)

  • Lee, Chae-Y.;Lee, Wang-Hwan;Cho, Hee-K.
    • Journal of Korean Institute of Industrial Engineers
    • /
    • v.25 no.2
    • /
    • pp.233-239
    • /
    • 1999
  • With the increase in Internet protocol (IP) packet traffic, router performance has become an important issue in internetworking. In this paper we examine the matching algorithm in a gigabit router that has input queues with virtual output queueing. A port partitioning concept is employed to reduce the computational burden of the scheduler within a switch: the input and output ports are divided into two groups, and the matching algorithm runs within each input-output group pair in parallel. The input and output port groups are exchanged at every time slot so that all incoming traffic is handled. Two algorithms, maximal weight matching by port partitioning (MPP) and modified maximal weight matching by port partitioning (MMPP), are presented. MMPP has the lowest delay for every packet arrival rate. The required buffer size on a port is approximately 20-60 packets, depending on the packet arrival rate. The throughput is shown to be linear in the packet arrival rate, which can only be achieved by a highly efficient matching algorithm.
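As an illustration of the matching step such schedulers perform, here is a greedy maximal-weight matching sketch over a VOQ occupancy matrix. Weighting edges by queue length and selecting them greedily are assumptions made for illustration; the paper's MPP/MMPP port partitioning and time-slot group exchange are not reproduced.

```python
# Illustrative sketch of a greedy maximal-weight matching step for a VOQ switch.
# Queue-length weights and the greedy selection order are assumptions; the
# paper's MPP/MMPP port-partitioning details are not reproduced here.

def maximal_weight_matching(voq):
    """voq[i][j] = number of packets queued at input i for output j.
    Returns (input, output) pairs forming a maximal matching chosen
    greedily by descending queue length."""
    n = len(voq)
    edges = sorted(
        ((voq[i][j], i, j) for i in range(n) for j in range(n) if voq[i][j] > 0),
        reverse=True,
    )
    used_in, used_out, match = set(), set(), []
    for w, i, j in edges:
        if i not in used_in and j not in used_out:
            match.append((i, j))
            used_in.add(i)
            used_out.add(j)
    return match

if __name__ == "__main__":
    voq = [
        [3, 0, 5],
        [2, 4, 0],
        [0, 1, 6],
    ]
    print(maximal_weight_matching(voq))   # [(2, 2), (1, 1), (0, 0)]
```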

Optimization Driven MapReduce Framework for Indexing and Retrieval of Big Data

  • Abdalla, Hemn Barzan;Ahmed, Awder Mohammed;Al Sibahee, Mustafa A.
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.14 no.5
    • /
    • pp.1886-1908
    • /
    • 2020
  • With recent technical advances, the amount of big data is increasing day by day, to the point that traditional software tools struggle to handle it; in addition, the presence of imbalanced data in big data is a major concern for the research community. To ensure effective management of big data and to deal with imbalanced data, this paper proposes a new indexing algorithm for retrieving big data in the MapReduce framework. In the mappers, data clustering is performed with the Sparse Fuzzy-c-means (Sparse FCM) algorithm. The reducer combines the clusters generated by the mappers and again clusters the data with Sparse FCM. Two-level query matching is performed to locate the requested data: the first level identifies the cluster, and the second level retrieves the requested data within it. The ranking of data is performed using the proposed Monarch chaotic whale optimization algorithm (M-CWOA), designed by combining Monarch butterfly optimization (MBO) [22] and the chaotic whale optimization algorithm (CWOA) [21]. The Parametric Enabled-Similarity Measure (PESM) is adapted for matching the similarity between two datasets. The proposed M-CWOA outperformed other methods, with a maximal precision of 0.9237, recall of 0.9371, and F1-score of 0.9223.
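A rough sketch of the two-level query matching idea (match the query to a cluster first, then to items inside that cluster) is shown below. Cosine similarity is used as a stand-in for PESM, whose definition is not given here, and the toy vectors are purely illustrative.

```python
# Sketch of two-level query matching: level 1 picks the closest cluster
# centroid, level 2 ranks that cluster's items. Cosine similarity is an
# assumed stand-in for the PESM measure used in the paper.

import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def two_level_match(query, clusters):
    """clusters: list of (centroid, items); query and items are feature vectors."""
    # Level 1: pick the cluster whose centroid is most similar to the query.
    best_cluster = max(clusters, key=lambda c: cosine(query, c[0]))
    # Level 2: rank that cluster's items by similarity to the query.
    return sorted(best_cluster[1], key=lambda item: cosine(query, item), reverse=True)

if __name__ == "__main__":
    clusters = [
        ([1.0, 0.0], [[0.9, 0.1], [1.0, 0.2]]),
        ([0.0, 1.0], [[0.1, 0.9], [0.2, 1.0]]),
    ]
    print(two_level_match([0.2, 0.95], clusters))
```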

Constant Time RMESH Algorithm for Computing Longest Common Substring and Maximal Repeat of String (문자열의 최장 공통 부분문자열과 최대 반복자를 구하기 위한 상수시간 RMESH 알고리즘)

  • Han, Seon-Mi;Woo, Jin-Woon
    • The KIPS Transactions:PartA
    • /
    • v.16A no.5
    • /
    • pp.319-326
    • /
    • 2009
  • Since string operations were introduced to computational biology, various data structures and algorithms for efficient string operations have been studied. The longest common substring problem is to find the longest substring shared by two or more strings, and the maximal repeat problem is to find the substrings that occur more than once in a given string. These operations are widely used in string processing areas such as pattern matching and likelihood measurement. In this paper, we present algorithms that compute the longest common substring of two strings and find the maximal repeats of a string using $n{\times}n{\times}n$ processors on a three-dimensional RMESH (Reconfigurable MESH). Our algorithms have O(1) time complexity.
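For reference, a conventional sequential dynamic-programming solution to the longest common substring problem looks like the sketch below; it is shown only to clarify the problem itself and is not the paper's constant-time RMESH algorithm.

```python
# Sequential dynamic-programming sketch of the longest common substring,
# for illustration only; the paper's algorithm runs in O(1) time on an
# n x n x n reconfigurable mesh (RMESH), which is not reproduced here.

def longest_common_substring(s, t):
    """dp[i][j] = length of the longest common suffix of s[:i] and t[:j];
    only the previous row is kept in memory."""
    best, best_end = 0, 0
    prev = [0] * (len(t) + 1)
    for i in range(1, len(s) + 1):
        cur = [0] * (len(t) + 1)
        for j in range(1, len(t) + 1):
            if s[i - 1] == t[j - 1]:
                cur[j] = prev[j - 1] + 1
                if cur[j] > best:
                    best, best_end = cur[j], i
        prev = cur
    return s[best_end - best:best_end]

if __name__ == "__main__":
    print(longest_common_substring("ACGTACGT", "GTACGA"))   # "GTACG"
```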

Grant-Aware Scheduling Algorithm for VOQ-Based Input-Buffered Packet Switches

  • Han, Kyeong-Eun;Song, Jongtae;Kim, Dae-Ub;Youn, JiWook;Park, Chansung;Kim, Kwangjoon
    • ETRI Journal
    • /
    • v.40 no.3
    • /
    • pp.337-346
    • /
    • 2018
  • In this paper, we propose a grant-aware (GA) scheduling algorithm that can provide higher throughput and lower latency than a conventional dual round-robin matching (DRRM) method. In our proposed GA algorithm, when an output receives requests from different inputs, the output not only sends a grant to the selected input, but also sends a grant indicator to all the other inputs to share the grant information. This allows the inputs to skip the granted outputs in their input arbiters in the next iteration. Simulation results using OPNET show that the proposed algorithm provides up to 3% higher throughput and approximately 31% less queuing delay than DRRM.
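The sketch below illustrates one scheduling iteration with the grant-aware idea: each output grants one requesting input, and the set of granted outputs is shared so that ungranted inputs can skip them in the next iteration. The round-robin pointer handling and the accept policy are simplified assumptions, not the paper's exact DRRM-based design.

```python
# Simplified sketch of a single grant-aware scheduling iteration.
# Pointer handling and the accept policy are illustrative assumptions.

def grant_aware_iteration(requests, grant_ptr):
    """requests[i] = set of outputs input i has packets for.
    grant_ptr[j] = round-robin pointer of output j.
    Returns accepted (input, output) pairs and the set of granted outputs
    that ungranted inputs should skip in the next iteration."""
    n = len(requests)
    # Each output grants the first requesting input at or after its pointer.
    grants = {}
    for j in range(n):
        requesting = [i for i in range(n) if j in requests[i]]
        if requesting:
            chosen = min(requesting, key=lambda i: (i - grant_ptr[j]) % n)
            grants[j] = chosen
            grant_ptr[j] = (chosen + 1) % n
    # Each input accepts at most one grant (lowest-numbered output here).
    accepted, used_inputs = [], set()
    for j in sorted(grants):
        i = grants[j]
        if i not in used_inputs:
            accepted.append((i, j))
            used_inputs.add(i)
    granted_outputs = set(grants)          # shared via grant indicators
    return accepted, granted_outputs

if __name__ == "__main__":
    requests = [{0, 1}, {0}, {1, 2}]
    ptr = [0, 0, 0]
    print(grant_aware_iteration(requests, ptr))
```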

An Improvement of the Deadlock Avoidance Algorithm (Deadlock 회피책에 대한 개선방안 연구)

  • Kim, Tae-Yeong;Park, Dong-Won
    • The Journal of Engineering Research
    • /
    • v.1 no.1
    • /
    • pp.49-57
    • /
    • 1997
  • In this paper, follow-up work on Habermann's deadlock avoidance algorithm is investigated from the viewpoints of correctness, efficiency, and concurrency. Habermann's deadlock avoidance algorithm is briefly surveyed, and an in-depth discussion of the modified and improved follow-up algorithms is presented. A further improvement of Kameda's algorithm is then discussed. Kameda's algorithm for testing deadlock-freedom in a computer system converts Habermann's model into a labeled bipartite graph, so that deadlock detection becomes equivalent to finding a complete matching, as in the marriage problem. The algorithm has a running time of $O(mn^{1.5})$ because Dinic's algorithm is used. Its speed can be enhanced by employing a faster algorithm for finding a maximal matching; the wave method by Kazanov is used for this purpose.
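To make the matching sub-problem concrete, here is a standard augmenting-path sketch of bipartite maximum matching (Kuhn's algorithm). It is a generic illustration of the problem being solved, not the wave-method speedup or Kameda's labeled-graph construction discussed in the paper.

```python
# Simple augmenting-path sketch of bipartite maximum matching (Kuhn's
# algorithm), shown to illustrate the matching sub-problem; not the paper's
# wave-method speedup.

def max_bipartite_matching(adj, n_right):
    """adj[u] = list of right-side vertices adjacent to left vertex u.
    Returns match_right, where match_right[v] is the left vertex matched
    to right vertex v, or -1 if v is unmatched."""
    match_right = [-1] * n_right

    def try_augment(u, visited):
        for v in adj[u]:
            if v in visited:
                continue
            visited.add(v)
            # v is free, or its current partner can be re-matched elsewhere.
            if match_right[v] == -1 or try_augment(match_right[v], visited):
                match_right[v] = u
                return True
        return False

    for u in range(len(adj)):
        try_augment(u, set())
    return match_right

if __name__ == "__main__":
    # Left vertices 0..2, right vertices 0..2.
    adj = [[0, 1], [0], [1, 2]]
    print(max_bipartite_matching(adj, 3))   # [1, 0, 2]: a complete matching
```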

Algorithm for Minimum Degree Inter-vertex Edge Selection of Maximum Matching Problem (최대 매칭 문제의 최소차수 정점 간 간선 선택 알고리즘)

  • Lee, Sang-Un
    • The Journal of the Institute of Internet, Broadcasting and Communication
    • /
    • v.22 no.5
    • /
    • pp.1-6
    • /
    • 2022
  • This paper deals with the maximum cardinality matching (MCM) problem. The augmenting path technique is well known for MCM: an MCM is obtained by an $O({\sqrt{n}}m)$ augmenting-path algorithm for general graphs and by an $O(m \log n)$ algorithm for bipartite graphs. This paper instead suggests an $O(n)$ linear-time algorithm. The proposed algorithm is based on the principle of selecting as many inter-vertex edges as possible in order to obtain the MCM: it repeatedly (𝜈(G)=k times) selects an edge {u,𝜐} joining a minimum-degree vertex u and a minimum-degree vertex 𝜐 in $N_G(u)$. For various general and bipartite experimental graphs, this algorithm obtains 𝜈(G) exactly.
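The greedy rule described in the abstract can be sketched as follows: repeatedly pick the edge joining a minimum-degree vertex u and a minimum-degree neighbour of u, delete both endpoints, and continue. This is an illustrative sketch of that rule only; it does not establish the paper's claim that the result always equals 𝜈(G).

```python
# Sketch of the minimum-degree edge-selection rule from the abstract:
# pick {u, v} with u a minimum-degree vertex and v a minimum-degree
# neighbour of u, remove both, repeat. A heuristic sketch, not a proof
# that it always attains a maximum matching.

def min_degree_matching(edges, n):
    """edges: list of (u, v) pairs over vertices 0..n-1."""
    adj = {v: set() for v in range(n)}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    matching = []
    alive = {v for v in adj if adj[v]}
    while alive:
        # u: minimum-degree vertex among vertices that still have neighbours.
        u = min(alive, key=lambda x: len(adj[x]))
        # v: minimum-degree neighbour of u.
        v = min(adj[u], key=lambda x: len(adj[x]))
        matching.append((u, v))
        # Remove u and v (and their incident edges) from the graph.
        for w in (u, v):
            for x in list(adj[w]):
                adj[x].discard(w)
            adj[w].clear()
        alive = {x for x in alive if adj[x]}
    return matching

if __name__ == "__main__":
    # Path 0-1-2-3: the rule picks {0,1} then {2,3}, a maximum matching.
    print(min_degree_matching([(0, 1), (1, 2), (2, 3)], 4))
```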

Efficient Randomized Parallel Algorithms for the Matching Problem (매칭 문제를 위한 효율적인 랜덤 병렬 알고리즘)

  • U, Seong-Ho;Yang, Seong-Bong
    • Journal of KIISE:Computer Systems and Theory
    • /
    • v.26 no.10
    • /
    • pp.1258-1263
    • /
    • 1999
  • This paper presents simple randomized parallel algorithms for finding a maximal matching in an undirected graph G=(V, E) on the CRCW (Concurrent Read Concurrent Write) and CREW (Concurrent Read Exclusive Write) PRAM (Parallel Random Access Machine) models. The algorithm for the CRCW model has $O(\log m)$ expected running time using m processors, where m is the number of edges in G. We also show that the CRCW algorithm can be implemented on a CREW PRAM; the CREW algorithm runs in $O(\log^2 m)$ expected time but requires only $O(m/\log m)$ processors.
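To convey the flavour of round-based randomized matching, the sketch below simulates, on a single processor, a scheme in which each free vertex proposes to a random free neighbour every round and mutual proposals become matched edges. It is an assumption-laden stand-in, not the paper's CRCW/CREW PRAM construction.

```python
# Sequential simulation of a round-based randomized maximal matching scheme
# (random proposals, mutual proposals matched). Illustrative only; not the
# paper's PRAM algorithms.

import random

def randomized_maximal_matching(adj, seed=0):
    """adj: dict vertex -> set of neighbours. Returns a set of matched edges."""
    rng = random.Random(seed)
    adj = {v: set(nbrs) for v, nbrs in adj.items()}
    matching = set()
    free = {v for v in adj if adj[v]}
    while free:
        # Each free vertex proposes to a random free neighbour.
        proposal = {}
        for v in free:
            candidates = adj[v] & free
            if candidates:
                proposal[v] = rng.choice(sorted(candidates))
        # Mutual proposals become matched edges.
        newly_matched = set()
        for v, w in proposal.items():
            if proposal.get(w) == v and v < w:
                matching.add((v, w))
                newly_matched.update((v, w))
        # Drop matched vertices and vertices with no remaining free neighbours.
        free -= newly_matched
        free = {v for v in free if adj[v] & free}
    return matching

if __name__ == "__main__":
    square = {0: {1, 3}, 1: {0, 2}, 2: {1, 3}, 3: {0, 2}}
    print(randomized_maximal_matching(square))
```

When the loop exits, no two unmatched vertices are adjacent, so the returned edge set is maximal, which is the property the paper's parallel algorithms also guarantee.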