Aho-Corasick String Searching

.NET Reference Guide | Aho-Corasick String Searching | InformIT

Aho-Corasick String Searching

Last updated Nov 6, 2009.

If you want to search for multiple strings in a large amount of text, the algorithm of choice is the Aho-Corasick algorithm, published by Alfred V. Aho and Margaret J. Corasick in 1975. It’s a kind of dictionary matching algorithm that makes a single pass through the input text, matching all occurrences of all strings in that single pass. There is some small up-front cost for preprocessing the strings being searched for, but that’s easily offset by the algorithm’s reduced time complexity. Whereas the brute force algorithm has N * M complexity, Aho-Corasick is linear in the length of the patterns plus the length of the text to be searched, plus the number of output matches.

發表迴響

你的電子郵件位址並不會被公開。 必要欄位標記為 *

What is 14 + 6 ?
Please leave these two fields as-is:
IMPORTANT! To be able to proceed, you need to solve the following simple math (so we know that you are a human) :-)