http://dx.doi.org/10.13089/JKIISC.2013.23.3.395

Design of Memory-Efficient Deterministic Finite Automata by Merging States With The Same Input Character  

Choi, Yoon-Ho (Kyonggi University)
Abstract
Pattern matching algorithms play an important role in identifying and classifying traffic against predefined patterns for intrusion detection and prevention. As attacks become more prevalent and complex, patterns are now written as regular expressions (regexes), which are converted into deterministic finite automata (DFA) because a DFA guarantees worst-case performance during matching. Because regex patterns have grown in both complexity and number, memory-efficient DFA obtained through state reduction have become the mainstay of pattern matching. However, most previous work has focused on reducing the number of states within a single automaton, so the state blowup problem remains when the number of patterns is large. To solve this problem, we propose a new state compression algorithm that merges states across multiple automata. We show that, by merging states with the same input character across multiple automata, the proposed algorithm reduces the number of states in the original DFA by as much as 40.0% on average.
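To make the worst-case guarantee and the memory cost concrete, the following is a minimal sketch of table-driven DFA matching, not the paper's algorithm. The automaton `AB_PLUS_C` (a hand-built DFA for the regex `ab+c`), the function names, and the assumption of a dense table with one column per byte value are all illustrative choices that do not come from the paper.

```python
from typing import Dict, Set, Tuple

# (state, input character) -> next state. A missing entry means "reject".
Transitions = Dict[Tuple[int, str], int]

# Hypothetical hand-built DFA for the regex "ab+c" (whole-string match).
# State 0: start, 1: seen "a", 2: seen "ab+", 3: accepting.
AB_PLUS_C: Transitions = {
    (0, "a"): 1,
    (1, "b"): 2,
    (2, "b"): 2,
    (2, "c"): 3,
}


def dfa_accepts(transitions: Transitions, start: int,
                accepting: Set[int], text: str) -> bool:
    """Run the DFA over text, doing exactly one table lookup per character."""
    state = start
    for ch in text:
        nxt = transitions.get((state, ch))
        if nxt is None:  # no transition defined: the input cannot match
            return False
        state = nxt
    return state in accepting


def dense_table_entries(num_states: int, alphabet_size: int = 256) -> int:
    """Entries in a dense transition table (assumed layout: states x alphabet)."""
    return num_states * alphabet_size


print(dfa_accepts(AB_PLUS_C, 0, {3}, "abbc"))
print(dfa_accepts(AB_PLUS_C, 0, {3}, "ac"))
print(dense_table_entries(1000), dense_table_entries(600))
```

Under the dense-table assumption, memory grows linearly with the number of states, so removing 40% of the states (1000 down to 600 in the last line) removes 40% of the table entries. Matching time stays at one lookup per input character regardless of the pattern.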
Keywords
pattern matching; regular expression; DFA; state blowup; state merging