NotificationsYou must be signed in to change notification settings
Fork11.1k
Star29.9k

Commitf9c6867

committed

Merge branch 'pull-357'

* pull-357: BUG: change default threshold for matrix_rank

2 parents6a1ab03 +78b7693 commitf9c6867Copy full SHA for f9c6867

File tree

2 files changed

+60

-20

lines changed

numpy/linalg
- linalg.py
- tests
  - test_linalg.py

2 files changed

+60

-20

lines changed

`‎numpy/linalg/linalg.py`

Lines changed: 44 additions & 19 deletions

Original file line number	Diff line number	Diff line change
`@@ -1420,8 +1420,8 @@ def matrix_rank(M, tol=None):`
`1420`	`1420`	`"""`
`1421`	`1421`	`Return matrix rank of array using SVD method`
`1422`	`1422`
`1423`		`- Rank of the array is the number of SVD singular values of the`
`1424`		-array that aregreater than `tol`.
	`1423`	`+ Rank of the array is the number of SVD singular values of the array that are`
	`1424`	+ greater than `tol`.
`1425`	`1425`
`1426`	`1426`	`Parameters`
`1427`	`1427`	`----------`
`@@ -1431,27 +1431,53 @@ def matrix_rank(M, tol=None):`
`1431`	`1431`	threshold below which SVD values are considered zero. If `tol` is
`1432`	`1432`	None, and ``S`` is an array with singular values for `M`, and
`1433`	`1433`	``eps`` is the epsilon value for datatype of ``S``, then `tol` is
`1434`		- set to ``S.max() * eps``.
	`1434`	+ set to ``S.max() max(M.shape) eps``.
`1435`	`1435`
`1436`	`1436`	`Notes`
`1437`	`1437`	`-----`
`1438`		`- Golub and van Loan [1]_ define "numerical rank deficiency" as using`
`1439`		`- tol=eps*S[0] (where S[0] is the maximum singular value and thus the`
`1440`		`- 2-norm of the matrix). This is one definition of rank deficiency,`
`1441`		`- and the one we use here. When floating point roundoff is the main`
`1442`		`- concern, then "numerical rank deficiency" is a reasonable choice. In`
`1443`		`- some cases you may prefer other definitions. The most useful measure`
`1444`		`- of the tolerance depends on the operations you intend to use on your`
`1445`		`- matrix. For example, if your data come from uncertain measurements`
`1446`		`- with uncertainties greater than floating point epsilon, choosing a`
`1447`		`- tolerance near that uncertainty may be preferable. The tolerance`
`1448`		`- may be absolute if the uncertainties are absolute rather than`
`1449`		`- relative.`
	`1438`	`+ The default threshold to detect rank deficiency is a test on the magnitude`
	`1439`	+ of the singular values of `M`. By default, we identify singular values less
	`1440`	+ than ``S.max() * max(M.shape) * eps`` as indicating rank deficiency (with
	`1441`	`+ the symbols defined above). This is the algorithm MATLAB uses [1]. It also`
	`1442`	`+ appears in Numerical recipes in the discussion of SVD solutions for linear`
	`1443`	`+ least squares [2].`
	`1444`	`+`
	`1445`	`+ This default threshold is designed to detect rank deficiency accounting for`
	`1446`	`+ the numerical errors of the SVD computation. Imagine that there is a column`
	`1447`	+ in `M` that is an exact (in floating point) linear combination of other
	`1448`	+ columns in `M`. Computing the SVD on `M` will not produce a singular value
	`1449`	`+ exactly equal to 0 in general: any difference of the smallest SVD value from`
	`1450`	`+ 0 will be caused by numerical imprecision in the calculation of the SVD.`
	`1451`	`+ Our threshold for small SVD values takes this numerical imprecision into`
	`1452`	`+ account, and the default threshold will detect such numerical rank`
	`1453`	+ deficiency. The threshold may declare a matrix `M` rank deficient even if
	`1454`	+ the linear combination of some columns of `M` is not exactly equal to
	`1455`	+ another column of `M` but only numerically very close to another column of
	`1456`	+ `M`.
	`1457`	`+`
	`1458`	`+ We chose our default threshold because it is in wide use. Other thresholds`
	`1459`	`+ are possible. For example, elsewhere in the 2007 edition of *Numerical`
	`1460`	+ recipes* there is an alternative threshold of ``S.max() *
	`1461`	+ np.finfo(M.dtype).eps / 2. * np.sqrt(m + n + 1.)``. The authors describe
	`1462`	`+ this threshold as being based on "expected roundoff error" (p 71).`
	`1463`	`+`
	`1464`	`+ The thresholds above deal with floating point roundoff error in the`
	`1465`	`+ calculation of the SVD. However, you may have more information about the`
	`1466`	+ sources of error in `M` that would make you consider other tolerance values
	`1467`	`+ to detect effective rank deficiency. The most useful measure of the`
	`1468`	`+ tolerance depends on the operations you intend to use on your matrix. For`
	`1469`	`+ example, if your data come from uncertain measurements with uncertainties`
	`1470`	`+ greater than floating point epsilon, choosing a tolerance near that`
	`1471`	`+ uncertainty may be preferable. The tolerance may be absolute if the`
	`1472`	`+ uncertainties are absolute rather than relative.`
`1450`	`1473`
`1451`	`1474`	`References`
`1452`	`1475`	`----------`
`1453`		`- .. [1] G. H. Golub and C. F. Van Loan, Matrix Computations.`
`1454`		`- Baltimore: Johns Hopkins University Press, 1996.`
	`1476`	`+ .. [1] MATLAB reference documention, "Rank"`
	`1477`	`+ http://www.mathworks.com/help/techdoc/ref/rank.html`
	`1478`	`+ .. [2] W. H. Press, S. A. Teukolsky, W. T. Vetterling and B. P. Flannery,`
	`1479`	`+ "Numerical Recipes (3rd edition)", Cambridge University Press, 2007,`
	`1480`	`+ page 795.`
`1455`	`1481`
`1456`	`1482`	`Examples`
`1457`	`1483`	`--------`
`@@ -1465,7 +1491,6 @@ def matrix_rank(M, tol=None):`
`1465`	`1491`	`1`
`1466`	`1492`	`>>> matrix_rank(np.zeros((4,)))`
`1467`	`1493`	`0`
`1468`		`-`
`1469`	`1494`	`"""`
`1470`	`1495`	`M=asarray(M)`
`1471`	`1496`	`ifM.ndim>2:`
`@@ -1474,7 +1499,7 @@ def matrix_rank(M, tol=None):`
`1474`	`1499`	`returnint(notall(M==0))`
`1475`	`1500`	`S=svd(M,compute_uv=False)`
`1476`	`1501`	`iftolisNone:`
`1477`		`-tol=S.max()*finfo(S.dtype).eps`
	`1502`	`+tol=S.max()max(M.shape)finfo(S.dtype).eps`
`1478`	`1503`	`returnsum(S>tol)`
`1479`	`1504`
`1480`	`1505`

`‎numpy/linalg/tests/test_linalg.py`

Lines changed: 16 additions & 1 deletion

Original file line number	Diff line number	Diff line change
`@@ -3,7 +3,9 @@`
`3`	`3`	`importsys`
`4`	`4`
`5`	`5`	`importnumpyasnp`
`6`		`-fromnumpy.testingimport*`
	`6`	`+fromnumpy.testingimport (TestCase,assert_,assert_equal,assert_raises,`
	`7`	`+assert_array_equal,assert_almost_equal,`
	`8`	`+run_module_suite)`
`7`	`9`	`fromnumpyimportarray,single,double,csingle,cdouble,dot,identity`
`8`	`10`	`fromnumpyimportmultiply,atleast_2d,inf,asarray,matrix`
`9`	`11`	`fromnumpyimportlinalg`
`@@ -451,6 +453,19 @@ def test_matrix_rank(self):`
`451`	`453`	`yieldassert_equal,matrix_rank(1),1`
`452`	`454`
`453`	`455`
	`456`	`+deftest_reduced_rank():`
	`457`	`+# Test matrices with reduced rank`
	`458`	`+rng=np.random.RandomState(20120714)`
	`459`	`+foriinrange(100):`
	`460`	`+# Make a rank deficient matrix`
	`461`	`+X=rng.normal(size=(40,10))`
	`462`	`+X[:,0]=X[:,1]+X[:,2]`
	`463`	`+# Assert that matrix_rank detected deficiency`
	`464`	`+assert_equal(matrix_rank(X),9)`
	`465`	`+X[:,3]=X[:,4]+X[:,5]`
	`466`	`+assert_equal(matrix_rank(X),8)`
	`467`	`+`
	`468`	`+`
`454`	`469`	`classTestQR(TestCase):`
`455`	`470`	`deftest_qr_empty(self):`
`456`	`471`	`a=np.zeros((0,2))`

0 commit comments

Comments

(0)

Movatterモバイル変換

Navigation Menu

Search code, repositories, users, issues, pull requests...

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Uh oh!

Commitf9c6867

File tree

2 files changed

2 files changed

`‎numpy/linalg/linalg.py`

`‎numpy/linalg/tests/test_linalg.py`

0 commit comments