zbMATH Open — the first resource for mathematics

Products of many large random matrices and gradients in deep neural networks. (English) Zbl 1446.60007

Summary: We study products of random matrices in the regime where the number of terms and the size of the matrices simultaneously tend to infinity. Our main theorem is that the logarithm of the \(\ell_2\) norm of such a product applied to any fixed vector is asymptotically Gaussian. The fluctuations we find can be thought of as a finite temperature correction to the limit in which first the size and then the number of matrices tend to infinity. Depending on the scaling limit considered, the mean and variance of the limiting Gaussian depend only on either the first two or the first four moments of the measure from which matrix entries are drawn. We also obtain explicit error bounds on the moments of the norm and the Kolmogorov-Smirnov distance to a Gaussian. Finally, we apply our result to obtain precise information about the stability of gradients in randomly initialized deep neural networks with ReLU activations. This provides a quantitative measure of the extent to which the exploding and vanishing gradient problem occurs in a fully connected neural network with ReLU activations and a given architecture.
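The Gaussian behaviour described in the summary can be explored numerically. The following is a minimal Monte Carlo sketch, not the paper's method: it assumes square matrices with i.i.d. Gaussian entries of variance 1/n (one special case of the entry measures the summary allows), and the function name `log_norm_samples` and its parameters are hypothetical. In this Gaussian case each factor scales the squared norm by an independent chi-square(n)/n variable, so to leading order in 1/n the log-norm should have mean about -depth/(2n) and variance about depth/(2n), which gives a simple sanity check.

```python
import numpy as np


def log_norm_samples(n, depth, num_samples, seed=0):
    """Sample log ||W_depth ... W_1 x|| for a fixed unit vector x.

    Assumption (illustrative only): each W_i is n x n with i.i.d.
    N(0, 1/n) entries, drawn independently for every sample.
    """
    rng = np.random.default_rng(seed)
    # Start every sample from the same fixed unit vector e_1.
    x = np.zeros((num_samples, n))
    x[:, 0] = 1.0
    log_norm = np.zeros(num_samples)
    for _ in range(depth):
        w = rng.normal(0.0, 1.0 / np.sqrt(n), size=(num_samples, n, n))
        # Apply one independent matrix to each sample's current vector.
        x = np.einsum("sij,sj->si", w, x)
        # Renormalize at each step and accumulate the log of the scale
        # factor, which avoids overflow or underflow for deep products.
        norms = np.linalg.norm(x, axis=1)
        log_norm += np.log(norms)
        x /= norms[:, None]
    return log_norm


if __name__ == "__main__":
    n, depth = 20, 20
    samples = log_norm_samples(n, depth, num_samples=2000)
    print("empirical mean:", samples.mean(), "heuristic:", -depth / (2 * n))
    print("empirical var: ", samples.var(), "heuristic:", depth / (2 * n))
```

A histogram of `samples` can be compared with a normal density of the same mean and variance. Varying `depth` and `n` together probes the joint regime the summary refers to, where the number of factors and the matrix size both grow.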

MSC:

60B20 Random matrices (probabilistic aspects)
60F05 Central limit and other weak theorems
68T05 Learning and adaptive systems in artificial intelligence
92B20 Neural networks for/in biological studies, artificial life and related topics


References:

[1] Akemann, G.; Burda, Z.; Kieburg, M., Universal distribution of Lyapunov exponents for products of Ginibre matrices, J. Phys. A Math. Gen., 47, 395202 (2014) · Zbl 1327.60021 · doi:10.1088/1751-8113/47/39/395202
[2] Akemann, G., Burda, Z., Kieburg, M.: From integrable to chaotic systems: universal local statistics of Lyapunov exponents. arXiv e-prints arXiv:1809.05905 (2018) · Zbl 1327.60021
[3] Akemann, G.; Ipsen, JR, Recent exact and asymptotic results for products of independent random matrices, Acta Phys. Polonica B, 46, 1747 (2015) · Zbl 1371.60008 · doi:10.5506/APhysPolB.46.1747
[4] Allen-Zhu, Z., Li, Y., Song, Z.: A convergence theory for deep learning via over-parameterization. arXiv preprint arXiv:1811.03962 (2018)
[5] Anderson, GW; Guionnet, A.; Zeitouni, O., An Introduction to Random Matrices (2009), Cambridge: Cambridge University Press · Zbl 1170.91002
[6] Comets, F., Moreno Flores, G. R., Ramirez, A.: Random polymers on the complete graph. arXiv e-prints arXiv:1707.01588 (2017) · Zbl 1442.60108
[7] Cotler, J.; Gur-Ari, G.; Hanada, M.; Polchinski, J.; Saad, P.; Shenker, SH; Stanford, D.; Streicher, A.; Tezuka, M., Black holes and random matrices, J. High Energy Phys., 2017, 5, 118 (2017) · Zbl 1380.81307 · doi:10.1007/JHEP05(2017)118
[8] Crisanti, A.; Paladin, G.; Vulpiani, A., Products of Random Matrices: In Statistical Physics (2012), Berlin: Springer · Zbl 0784.58003
[9] Deift, P., Some open problems in random matrix theory and the theory of integrable systems. II, SIGMA, 13, 016 (2017) · Zbl 1375.37160
[10] Forrester, P., Asymptotics of finite system Lyapunov exponents for some random matrix ensembles, J. Phys. A Math. Theor., 48, 21, 215205 (2015) · Zbl 1323.15021 · doi:10.1088/1751-8113/48/21/215205
[11] Forrester, PJ, Lyapunov exponents for products of complex Gaussian random matrices, J. Stat. Phys., 151, 796-808 (2013) · Zbl 1272.82020 · doi:10.1007/s10955-013-0735-7
[12] Furstenberg, H.; Kesten, H., Products of random matrices, Ann. Math. Stat., 31, 2, 457-469 (1960) · Zbl 0137.35501 · doi:10.1214/aoms/1177705909
[13] Goetze, F.; Kosters, H.; Tikhomirov, A., Asymptotic spectra of matrix-valued functions of independent random matrices and free probability, Random Matrices Theory Appl., 04, 08 (2014)
[14] Götze, F., Tikhomirov, A.: On the Asymptotic Spectrum of Products of Independent Random Matrices. arXiv e-prints arXiv:1012.2710 (2010) · Zbl 1203.60010
[15] Haeusler, E., On the rate of convergence in the central limit theorem for martingales with discrete and continuous time, Ann. Probab., 16, 275-299 (1988) · Zbl 0639.60030 · doi:10.1214/aop/1176991901
[16] Hanin, B.: Which neural net architectures give rise to exploding and vanishing gradients? In: Advances in Neural Information Processing Systems (2018)
[17] Ipsen, JR, Lyapunov exponents for products of rectangular real, complex and quaternionic Ginibre matrices, J. Phys. A Math. Theor., 48, 15, 155204 (2015) · Zbl 1316.15041 · doi:10.1088/1751-8113/48/15/155204
[18] Isopi, M.; Newman, CM, The triangle law for Lyapunov exponents of large random matrices, Commun. Math. Phys., 143, 591-598 (1992) · Zbl 0759.15019 · doi:10.1007/BF02099267
[19] Jiang, T.; Qi, Y., Spectral radii of large non-Hermitian random matrices, J. Theor. Probab., 30, 1, 326-364 (2017) · Zbl 1362.15024 · doi:10.1007/s10959-015-0634-8
[20] Pennington, J., Schoenholz, S., Ganguli, S.: The emergence of spectral universality in deep networks. In: International Conference on Artificial Intelligence and Statistics, AISTATS: 9-11 April 2018, Playa Blanca, Lanzarote, Canary Islands, Spain, pp. 1924-1932 (2018)
[21] Kargin, V., On the largest Lyapunov exponent for products of Gaussian matrices, J. Stat. Phys., 157, 70-83 (2014) · Zbl 1307.15056 · doi:10.1007/s10955-014-1077-9
[22] Kargin, V., Lyapunov exponents of free operators, J. Funct. Anal., 255, 8, 1874-1888 (2008) · Zbl 1163.46042 · doi:10.1016/j.jfa.2008.08.011
[23] Liu, D.-Z., Wang, D., Wang, Y.: Lyapunov exponent, universality and phase transition for products of random matrices. arXiv e-prints arXiv:1810.00433 (2018)
[24] Mingo, J.; Speicher, R., Free Probability and Random Matrices (2017), New York: Springer · Zbl 1387.60005
[25] Newman, CM, The distribution of Lyapunov exponents: exact results for random matrices, Commun. Math. Phys., 103, 1, 121-126 (1986) · Zbl 0593.58051 · doi:10.1007/BF01464284
[26] O’Rourke, S., Soshnikov, A.: Products of independent non-Hermitian random matrices. arXiv e-prints arXiv:1012.4497 (2010) · Zbl 1244.60011
[27] Oseledets, VI, A multiplicative ergodic theorem. Characteristic Ljapunov, exponents of dynamical systems, Trudy Moskovskogo Matematicheskogo Obshchestva, 19, 179-210 (1968) · Zbl 0236.93034
[28] Pennington, J., Schoenholz, S., Ganguli, S.: Resurrecting the sigmoid in deep learning through dynamical isometry: theory and practice. In: Advances in Neural Information Processing Systems, pp. 4788-4798 (2017)
[29] Pennington, J., Worah, P.: Nonlinear random matrix theory for deep learning. In: Advances in Neural Information Processing Systems, pp. 2634-2643 (2017) · Zbl 1459.60012
[30] Pollicott, M., Maximal Lyapunov exponents for random matrix products, Invent. Math., 181, 1, 209-226 (2010) · Zbl 1196.37032 · doi:10.1007/s00222-010-0246-y
[31] Tucci, G., Asymptotic products of independent Gaussian random matrices with correlated entries, Electron. Commun. Probab., 16, 353-364 (2011) · Zbl 1225.15037 · doi:10.1214/ECP.v16-1635
[32] Tulino, A.; Verdú, S., Random matrix theory and wireless communications, Found. Trends Commun. Inf. Theory, 1, 1, 1-82 (2004) · Zbl 1133.94014 · doi:10.1561/0100000001
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases these data have been complemented or enhanced by data from zbMATH Open. The list attempts to reflect the references in the original paper as accurately as possible, without claiming completeness or a perfect matching.
© 2025 FIZ Karlsruhe GmbH