
An algorithmic theory of learning: Robust concepts and random projection


Abstract

We study the phenomenon of cognitive learning from an algorithmic standpoint. How does the brain effectively learn concepts from a small number of examples despite the fact that each example contains a huge amount of information? We provide a novel algorithmic analysis via a model of robust concept learning (closely related to "margin classifiers"), and show that a relatively small number of examples is sufficient to learn rich concept classes. The new algorithms have several advantages: they are faster, conceptually simpler, and resistant to low levels of noise. For example, a robust half-space can be learned in linear time using only a constant number of training examples, regardless of the number of attributes. A general (algorithmic) consequence of the model, that "more robust concepts are easier to learn", is supported by a multitude of psychological studies.
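
For intuition, the half-space result in the abstract can be read as a two-step recipe: randomly project the examples down to a dimension that depends only on the robustness (margin) of the concept and the number of samples, then run a classical learner such as the perceptron in the projected space. The Python sketch below is a minimal illustration under stated assumptions, not the paper's exact algorithm; the synthetic data generator, the constant in the choice of the target dimension k, and the helper names random_project and perceptron are all hypothetical.

    # Minimal sketch (assumptions noted above): random projection followed
    # by the perceptron, for a robust (large-margin) half-space.
    import numpy as np

    rng = np.random.default_rng(0)

    def random_project(X, k):
        """Map rows of X from R^n to R^k with a Gaussian matrix scaled by
        1/sqrt(k); with high probability this nearly preserves inner
        products, and hence margins (Johnson-Lindenstrauss)."""
        n = X.shape[1]
        return X @ (rng.standard_normal((n, k)) / np.sqrt(k))

    def perceptron(X, y, max_epochs=200):
        """Classic perceptron: on data separable with margin gamma it makes
        at most O(1/gamma^2) mistakes, independent of the dimension."""
        w = np.zeros(X.shape[1])
        for _ in range(max_epochs):
            mistakes = 0
            for xi, yi in zip(X, y):
                if yi * (w @ xi) <= 0:
                    w += yi * xi
                    mistakes += 1
            if mistakes == 0:
                break
        return w

    # Illustrative robust half-space in R^n: unit normal u, every example
    # pushed to distance about gamma from the separating hyperplane.
    n, m, gamma = 5000, 200, 0.5
    u = rng.standard_normal(n)
    u /= np.linalg.norm(u)
    y = rng.choice([-1.0, 1.0], size=m)
    X = rng.standard_normal((m, n)) / np.sqrt(n)  # points of roughly unit norm
    X += gamma * y[:, None] * u                   # guarantee the margin

    # Target dimension k = O((1/gamma^2) log m): it depends on the margin
    # and the sample size, not on the ambient dimension n.
    k = int(np.ceil((4.0 / gamma**2) * np.log(m)))
    Xp = random_project(X, k)
    w = perceptron(Xp, y)
    print(f"k = {k}, training accuracy = {np.mean(np.sign(Xp @ w) == y):.3f}")

With these illustrative numbers, the 5000-dimensional examples are compressed to under a hundred dimensions before any learning happens, which is where the speed and the dimension-independent sample bounds claimed above come from.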


References

  • Achlioptas, D. (2001). Database-friendly random projections. In Proc. Principles of Database Systems (PODS) (pp. 274–281).

  • Agmon, S. (1954). The relaxation method for linear inequalities. Canadian Journal of Mathematics, 6(3), 382–392.

  • Arriaga, R. I., & Vempala, S. (1999). An algorithmic theory of learning: Robust concepts and random projection. In Proc. of the 40th IEEE Foundations of Computer Science.

  • Balcan, N., Blum, A., & Vempala, S. (2004). On kernels, margins and low-dimensional mappings. In Proc. of Algorithmic Learning Theory.

  • Bartlett, P., & Shawe-Taylor, J. (1998). Generalization performance of support vector machines and other pattern classifiers. In B. Schölkopf, C. Burges, & A. J. Smola (eds.), Advances in kernel methods: Support vector learning. MIT Press.

  • Baum, E. B. (1990). On learning a union of half spaces. Journal of Complexity, 6(1), 67–101.

  • Ben-David, S., Eiron, N., & Simon, H. (2004). Limitations of learning via embeddings in Euclidean half spaces. Journal of Machine Learning Research, 3, 441–461.

  • Blum, A., Frieze, A., Kannan, R., & Vempala, S. (1996). A polynomial-time algorithm for learning noisy linear threshold functions. In Proc. of the 37th IEEE Foundations of Computer Science.

  • Blum, A., & Kannan, R. (1993). Learning an intersection of k halfspaces over a uniform distribution. In Proc. of the 34th IEEE Symposium on the Foundations of Computer Science.

  • Blumer, A., Ehrenfeucht, A., Haussler, D., & Warmuth, M. K. (1989). Learnability and the Vapnik-Chervonenkis dimension. Journal of the ACM, 36(4), 929–965.

  • Bylander, T. (1994). Learning linear threshold functions in the presence of classification noise. In Proc. 7th Workshop on Computational Learning Theory.

  • Cohen, E. (1997). Learning noisy perceptrons by a perceptron in polynomial time. In Proc. of the 38th IEEE Foundations of Computer Science.

  • Cortes, C., & Vapnik, V. (1995). Support-vector networks. Machine Learning, 20, 273–297.

  • Dasgupta, S., & Gupta, A. (1999). An elementary proof of the Johnson-Lindenstrauss lemma. Tech. Rep., U.C. Berkeley.

  • Feller, W. (1957). An introduction to probability theory and its applications. John Wiley and Sons, Inc.

  • Frankl, P., & Maehara, H. (1988). The Johnson-Lindenstrauss lemma and the sphericity of some graphs. J. Comb. Theory, Ser. B, 44, 355–362.

  • Freund, Y., & Schapire, R. E. (1999). Large margin classification using the perceptron algorithm. Machine Learning, 37(3), 277–296.

  • Garg, A., Har-Peled, S., & Roth, D. (2002). On generalization bounds, projection profile, and margin distribution. In ICML (pp. 171–178).

  • Garg, A., & Roth, D. (2003). Margin distribution and learning. In ICML (pp. 210–217).

  • Glass, A. L., Holyoak, K. J., & Santa, J. L. (1979). The structure of categories. In Cognition. Addison-Wesley.

  • Grötschel, M., Lovász, L., & Schrijver, A. (1988). Geometric algorithms and combinatorial optimization. Springer.

  • Hoeffding, W. (1963). Probability inequalities for sums of bounded random variables. Journal of the American Statistical Association, 58, 13–30.

  • Indyk, P., & Motwani, R. (1998). Approximate nearest neighbors: Towards removing the curse of dimensionality. In Proceedings of ACM STOC.

  • Johnson, W. B., & Lindenstrauss, J. (1984). Extensions of Lipschitz mappings into a Hilbert space. Contemporary Mathematics, 26, 189–206.

  • Kearns, M. J., & Schapire, R. E. (1994). Efficient distribution-free learning of probabilistic concepts. Journal of Computer and System Sciences, 48(3), 464–497.

  • Kearns, M. J., & Vazirani, U. (1994). An introduction to computational learning theory. MIT Press.

  • Kleinberg, J. (1997). Two algorithms for nearest-neighbor search in high dimensions. In Proceedings of the 29th ACM Symposium on Theory of Computing.

  • Klivans, A., & Servedio, R. (2004). Learning intersections of halfspaces with a margin. In Proceedings of the 17th Workshop on Computational Learning Theory.

  • Knowlton, B. (1999). What can neuropsychology tell us about category learning? Trends in Cognitive Science, 3, 123–124.

  • Komatsu, L. K. (1992). Recent views on conceptual structure. Psychological Bulletin, 112(3).

  • Linial, N., London, E., & Rabinovich, Y. (1994). The geometry of graphs and some of its algorithmic applications. In Proc. of the 35th IEEE Foundations of Computer Science.

  • Littlestone, N. (1987). Learning quickly when irrelevant attributes abound: A new linear-threshold algorithm. Machine Learning, 2, 285–318.

  • Littlestone, N. (1991). Redundant noisy attributes, attribute errors, and linear threshold learning using Winnow. In Proc. 4th Workshop on Computational Learning Theory.

  • Mandler, J. M. (2003). Conceptual categorization, Chapter 5. In D. H. Rakison & L. M. Oakes (eds.), Early category and concept development: Making sense of the blooming, buzzing confusion. Oxford University Press.

  • Minsky, M., & Papert, S. (1969). Perceptrons: An introduction to computational geometry. MIT Press.

  • Nosofsky, R., & Zaki, S. (1999). Math modeling, neuropsychology, and category learning: Response to B. Knowlton (1999). Trends in Cognitive Science, 3, 125–126.

  • Rakison, D. H., & Oakes, L. M. (eds.) (2003). Early category and concept development: Making sense of the blooming, buzzing confusion. Oxford University Press.

  • Reed, S. K. (1982). Categorization. In Cognition: Theory and applications. Brooks/Cole.

  • Reed, S. K., & Friedman, M. P. (1973). Perceptual vs. conceptual categorization. Memory and Cognition, 1.

  • Rosch, E. (1978). Principles of categorization. In E. Rosch & B. B. Lloyd (eds.), Cognition and categorization. Hillsdale, NJ: Lawrence Erlbaum.

  • Rosch, E. H., Mervis, C. B., Gray, W. D., Johnson, D. M., & Boyes-Braem, P. (1976). Basic objects in natural categories. Cognitive Psychology, 8.

  • Rosenblatt, F. (1962). Principles of neurodynamics. Spartan Books.

  • Schapire, R. E., Freund, Y., Bartlett, P. L., & Lee, W. S. (1998). Boosting the margin: A new explanation for the effectiveness of voting methods. Annals of Statistics, 26(5), 1651–1686.

  • Valiant, L. G. (1984). A theory of the learnable. Communications of the ACM, 27(11), 1134–1142.

  • Valiant, L. G. (1998). A neuroidal architecture for cognitive computation. In Proc. of ICALP.

  • Vapnik, V. N. (1995). The nature of statistical learning theory. Springer.

  • Vapnik, V. N., & Chervonenkis, A. Ya. (1971). On the uniform convergence of relative frequencies of events to their probabilities. Theory of Probability and Its Applications, 16(2), 264–280.

  • Vempala, S. (1997). A random sampling based algorithm for learning the intersection of half-spaces. In Proc. of the 38th IEEE Foundations of Computer Science.

  • Vempala, S. (2004). The random projection method. DIMACS Series, Vol. 65. AMS.


Author information

Authors and Affiliations

  1. Department of Psychology, Southern New Hampshire University, 2500 N. River Road, Manchester, NH 03106

    Rosa I. Arriaga

  2. Department of Mathematics, M.I.T., 77 Massachusetts Avenue, Cambridge, MA 02139-4307

    Santosh Vempala


Corresponding author

Correspondence to Santosh Vempala.

Additional information

Editor: Shai Ben-David

A preliminary version of this paper appeared in the Proc. of the Symposium on the Foundations of Computer Science, 1999.


About this article

Cite this article

Arriaga, R. I., Vempala, S. An algorithmic theory of learning: Robust concepts and random projection. Mach Learn 63, 161–182 (2006). https://doi.org/10.1007/s10994-006-6265-7


