Movatterモバイル変換


[0]ホーム

URL:


Skip to content

Navigation Menu

Sign in
Appearance settings

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly

Sign up
Appearance settings

Commitf9c6867

Browse files
committed
Merge branch 'pull-357'
* pull-357: BUG: change default threshold for matrix_rank
2 parents6a1ab03 +78b7693 commitf9c6867

File tree

2 files changed

+60
-20
lines changed

2 files changed

+60
-20
lines changed

‎numpy/linalg/linalg.py

Lines changed: 44 additions & 19 deletions
Original file line numberDiff line numberDiff line change
@@ -1420,8 +1420,8 @@ def matrix_rank(M, tol=None):
14201420
"""
14211421
Return matrix rank of array using SVD method
14221422
1423-
Rank of the array is the number of SVD singular values of the
1424-
array that aregreater than `tol`.
1423+
Rank of the array is the number of SVD singular values of the array that are
1424+
greater than `tol`.
14251425
14261426
Parameters
14271427
----------
@@ -1431,27 +1431,53 @@ def matrix_rank(M, tol=None):
14311431
threshold below which SVD values are considered zero. If `tol` is
14321432
None, and ``S`` is an array with singular values for `M`, and
14331433
``eps`` is the epsilon value for datatype of ``S``, then `tol` is
1434-
set to ``S.max() * eps``.
1434+
set to ``S.max() *max(M.shape) *eps``.
14351435
14361436
Notes
14371437
-----
1438-
Golub and van Loan [1]_ define "numerical rank deficiency" as using
1439-
tol=eps*S[0] (where S[0] is the maximum singular value and thus the
1440-
2-norm of the matrix). This is one definition of rank deficiency,
1441-
and the one we use here. When floating point roundoff is the main
1442-
concern, then "numerical rank deficiency" is a reasonable choice. In
1443-
some cases you may prefer other definitions. The most useful measure
1444-
of the tolerance depends on the operations you intend to use on your
1445-
matrix. For example, if your data come from uncertain measurements
1446-
with uncertainties greater than floating point epsilon, choosing a
1447-
tolerance near that uncertainty may be preferable. The tolerance
1448-
may be absolute if the uncertainties are absolute rather than
1449-
relative.
1438+
The default threshold to detect rank deficiency is a test on the magnitude
1439+
of the singular values of `M`. By default, we identify singular values less
1440+
than ``S.max() * max(M.shape) * eps`` as indicating rank deficiency (with
1441+
the symbols defined above). This is the algorithm MATLAB uses [1]. It also
1442+
appears in *Numerical recipes* in the discussion of SVD solutions for linear
1443+
least squares [2].
1444+
1445+
This default threshold is designed to detect rank deficiency accounting for
1446+
the numerical errors of the SVD computation. Imagine that there is a column
1447+
in `M` that is an exact (in floating point) linear combination of other
1448+
columns in `M`. Computing the SVD on `M` will not produce a singular value
1449+
exactly equal to 0 in general: any difference of the smallest SVD value from
1450+
0 will be caused by numerical imprecision in the calculation of the SVD.
1451+
Our threshold for small SVD values takes this numerical imprecision into
1452+
account, and the default threshold will detect such numerical rank
1453+
deficiency. The threshold may declare a matrix `M` rank deficient even if
1454+
the linear combination of some columns of `M` is not exactly equal to
1455+
another column of `M` but only numerically very close to another column of
1456+
`M`.
1457+
1458+
We chose our default threshold because it is in wide use. Other thresholds
1459+
are possible. For example, elsewhere in the 2007 edition of *Numerical
1460+
recipes* there is an alternative threshold of ``S.max() *
1461+
np.finfo(M.dtype).eps / 2. * np.sqrt(m + n + 1.)``. The authors describe
1462+
this threshold as being based on "expected roundoff error" (p 71).
1463+
1464+
The thresholds above deal with floating point roundoff error in the
1465+
calculation of the SVD. However, you may have more information about the
1466+
sources of error in `M` that would make you consider other tolerance values
1467+
to detect *effective* rank deficiency. The most useful measure of the
1468+
tolerance depends on the operations you intend to use on your matrix. For
1469+
example, if your data come from uncertain measurements with uncertainties
1470+
greater than floating point epsilon, choosing a tolerance near that
1471+
uncertainty may be preferable. The tolerance may be absolute if the
1472+
uncertainties are absolute rather than relative.
14501473
14511474
References
14521475
----------
1453-
.. [1] G. H. Golub and C. F. Van Loan, *Matrix Computations*.
1454-
Baltimore: Johns Hopkins University Press, 1996.
1476+
.. [1] MATLAB reference documention, "Rank"
1477+
http://www.mathworks.com/help/techdoc/ref/rank.html
1478+
.. [2] W. H. Press, S. A. Teukolsky, W. T. Vetterling and B. P. Flannery,
1479+
"Numerical Recipes (3rd edition)", Cambridge University Press, 2007,
1480+
page 795.
14551481
14561482
Examples
14571483
--------
@@ -1465,7 +1491,6 @@ def matrix_rank(M, tol=None):
14651491
1
14661492
>>> matrix_rank(np.zeros((4,)))
14671493
0
1468-
14691494
"""
14701495
M=asarray(M)
14711496
ifM.ndim>2:
@@ -1474,7 +1499,7 @@ def matrix_rank(M, tol=None):
14741499
returnint(notall(M==0))
14751500
S=svd(M,compute_uv=False)
14761501
iftolisNone:
1477-
tol=S.max()*finfo(S.dtype).eps
1502+
tol=S.max()*max(M.shape)*finfo(S.dtype).eps
14781503
returnsum(S>tol)
14791504

14801505

‎numpy/linalg/tests/test_linalg.py

Lines changed: 16 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -3,7 +3,9 @@
33
importsys
44

55
importnumpyasnp
6-
fromnumpy.testingimport*
6+
fromnumpy.testingimport (TestCase,assert_,assert_equal,assert_raises,
7+
assert_array_equal,assert_almost_equal,
8+
run_module_suite)
79
fromnumpyimportarray,single,double,csingle,cdouble,dot,identity
810
fromnumpyimportmultiply,atleast_2d,inf,asarray,matrix
911
fromnumpyimportlinalg
@@ -451,6 +453,19 @@ def test_matrix_rank(self):
451453
yieldassert_equal,matrix_rank(1),1
452454

453455

456+
deftest_reduced_rank():
457+
# Test matrices with reduced rank
458+
rng=np.random.RandomState(20120714)
459+
foriinrange(100):
460+
# Make a rank deficient matrix
461+
X=rng.normal(size=(40,10))
462+
X[:,0]=X[:,1]+X[:,2]
463+
# Assert that matrix_rank detected deficiency
464+
assert_equal(matrix_rank(X),9)
465+
X[:,3]=X[:,4]+X[:,5]
466+
assert_equal(matrix_rank(X),8)
467+
468+
454469
classTestQR(TestCase):
455470
deftest_qr_empty(self):
456471
a=np.zeros((0,2))

0 commit comments

Comments
 (0)

[8]ページ先頭

©2009-2025 Movatter.jp