196Accesses
Abstract
Given an alphabet Σ={1,2,…,|Σ|} text stringT∈Σn and a pattern stringP∈Σm, for eachi=1,2,…,n−m+1 defineLp(i) as thep-norm distance when the pattern is aligned below the text and starts at positioni of the text. The problem of pattern matching withLp distance is to computeLp(i) for everyi=1,2,…,n−m+1. We discuss the problem ford=1,2,∞. First, in the case ofL1 matching (pattern matching with anL1 distance) we show a reduction of thestring matching with mismatches problem to theL1 matching problem and we present an algorithm that approximates theL1 matching up to a factor of 1+ε, which has an\(O(\frac{1}{\varepsilon^{2}}n\log m\log|\Sigma|)\) run time. Then, theL2 matching problem (pattern matching with anL2 distance) is solved with a simpleO(nlog m) time algorithm. Finally, we provide an algorithm that approximates theL∞ matching up to a factor of 1+ε with a run time of\(O(\frac{1}{\varepsilon}n\log m\log|\Sigma|)\). We also generalize the problem of String Matching with mismatches to have weighted mismatches and present anO(nlog 4m) algorithm that approximates the results of this problem up to a factor ofO(log m) in the case that the weight function is a metric.
This is a preview of subscription content,log in via an institution to check access.
Access this article
Subscribe and save
- Get 10 units per month
- Download Article/Chapter or eBook
- 1 Unit = 1 Article or 1 Chapter
- Cancel anytime
Buy Now
Price includes VAT (Japan)
Instant access to the full article PDF.
Similar content being viewed by others
References
Abrahamson, K.: Generalized string matching. SIAM J. Comput.16(6), 1039–1051 (1987)
Amir, A., Farach, M.: Efficient 2-dimensional approximate matching of half-rectangular figures. Inf. Comput.118(1), 1–11 (1995)
Amir, A., Farach, M., Muthukrishnan, S.: Alphabet dependence in parameterized matching. Inf. Process. Lett.49, 111–115 (1994)
Amir, A., Lewenstein, M., Porat, E.: Faster algorithms for string matching with k mismatches. J. Algorithms50(2), 257–275 (2004). Special SODA 2000 issue
Atallah, M.J., Chyzak, F., Dumas, P.: A randomized algorithm for approximate string matching. Algorithmica29(3), 468–486 (2001)
Baker, B.S.: A theory of parameterized pattern matching: algorithms and applications. In: Proc. 25th Annual ACM Symposium on the Theory of Computing, pp. 71–80 (1993)
Chu, K.K.W., Wong, M.H.: Fast time-series searching with scaling and shifting. In: Proceedings of the Eighteenth ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems, May 31–June 2, 1999, Philadelphia, Pennsylvania, pp. 237–248. ACM, New York (1999)
Cole, R., Hariharan, R.: Verifying candidate matches in sparse and wildcard matching. In: Proc. of the 34th Annual ACM Symposium on Theory of Computing, pp. 592–601. ACM, New York (2002)
Cormen, T.H., Leiserson, C.E., Rivest, R.L.: Introduction to Algorithms. MIT Press, Cambridge (1992)
Cormode, G., Muthukrishnan, S.: The string edit distance matching problem with moves. In: Proceedings of the Thirteenth Annual ACM-SIAM Symposium on Discrete Algorithms, pp. 667–676. ACM, New York (2002)
Faloutsos, C., Ranganathan, M., Manolopoulos, Y.: Fast subsequence matching in time-series databases. In: Proceedings 1994 ACM SIGMOD Conference, Mineapolis, MN, pp. 419–429 (1994)
Fischer, M.J., Paterson, M.S.: String matching and other products. In: Karp, R.M. (ed.) Complexity of Computation. SIAM-AMS Proceedings, vol.7, pp. 113–125 (1974)
Hirschberg, D.S.: Serial computations of Levenshtein distances. In: Apostolico, A., Galil, Z. (eds.) Pattern Matching Algorithms, pp. 123–141. Oxford University Press, London (1997)
Indyk, P.: Faster algorithms for string matching problems: matching the convolution bound. In: Proceedings of the 39th IEEE Annual Symposium on Foundations of Computer Science, pp. 166–173. IEEE Comput. Soc., Los Alamitos (1998)
Indyk, P.: Stable distributions, pseudorandom generators, embeddings and data stream computation. In: Proc. 41st Symp. Foundations of Computer Science, pp. 189–197. IEEE Press, New York (2000)
Indyk, P.: Algorithmic applications of low-distortion geometric embeddings. In: Proceedings of the 42th IEEE Annual Symposium on Foundations of Computer Science. IEEE Comput. Soc., Los Alamitos (2001)
Indyk, P., Koudas, N., Muthukrishnan, S.: Identifying representative trends in massive time series data sets using sketches. In: El Abbadi, A., Brodie, M.L., Chakravarthy, S., Dayal, U., Kamel, N., Schlageter, G., Whang, K.-Y. (eds.) VLDB 2000, Proceedings of 26th International Conference on Very Large Data Bases, September 10–14, 2000, Cairo, Egypt, pp. 363–372. Morgan Kaufmann, San Mateo (2000)
Indyk, P., Lewenstein, M., Lipsky, O., Porat, E.: Closest pair problems in very high dimensions. In: Proceedings of the 31st International Colloquium on Automata, Languages and Programming (ICALP). Lecture Notes in Computer Science, vol. 3142, pp. 782–792. Springer, Berlin (2004)
Karloff, H.: Fast algorithms for approximately counting mismatches. Inf. Process. Lett.48(2), 53–60 (1993)
Knuth, D.E., Morris, J.H., Pratt, V.R.: Fast pattern matching in strings. SIAM J. Comput.6, 323–350 (1977)
Levenshtein, V.I.: Binary codes capable of correcting, deletions, insertions and reversals. Sov. Phys. Dokl.10, 707–710 (1966)
Lipsky, O., Porat, E.: L1 pattern matching lower bound. In: Proceedings of the 12th International Conference on String Processing and Information Retrieval (SPIRE), Bs. As., Argentina. Lecture Notes in Computer Science, vol. 3772, pp. 327–330. Springer, Berlin (2005)
Masek, W.J., Paterson, M.S.: A faster algorithm for computing string-edit distances. J. Comput. Syst. Sci.20, 18–31 (1980)
Muthukrishnan, S.: New results and open problems related to non-standard stringology. In: Proc. 6th Combinatorial Pattern Matching Conference. Lecture Notes in Computer Science, vol. 937, pp. 298–317. Springer, Berlin (1995)
Muthukrishnan, S., Ramesh, H.: String matching under a general matching relation. Inf. Comput.122(1), 140–148 (1995)
Perttu, S.: Combinatorial pattern matching in musical sequences. Master’s thesis, Department of Computer Science, University of Helsinki (2000)
Author information
Authors and Affiliations
Department of Computer Science, Bar-Ilan University, 52900, Ramat-Gan, Israel
Ohad Lipsky & Ely Porat
- Ohad Lipsky
You can also search for this author inPubMed Google Scholar
- Ely Porat
You can also search for this author inPubMed Google Scholar
Corresponding author
Correspondence toEly Porat.
Additional information
Research supported in part by US-Israel Binational Science Foundation.
Rights and permissions
About this article
Cite this article
Lipsky, O., Porat, E. Approximate Pattern Matching with theL1,L2 andL∞ Metrics.Algorithmica60, 335–348 (2011). https://doi.org/10.1007/s00453-009-9345-9
Received:
Accepted:
Published:
Issue Date:
Share this article
Anyone you share the following link with will be able to read this content:
Sorry, a shareable link is not currently available for this article.
Provided by the Springer Nature SharedIt content-sharing initiative