
Mach Learn Sci Technol. 2022 Mar;3(1):015017. doi: 10.1088/2632-2153/ac44a9. Epub 2022 Jan 20.

Mutual Information Scaling for Tensor Network Machine Learning

Ian Convy et al. Mach Learn Sci Technol. 2022 Mar.

Abstract

Tensor networks have emerged as promising tools for machine learning, inspired by their widespread use as variational ansätze in quantum many-body physics. It is well known that the success of a given tensor network ansatz depends in part on how well it can reproduce the underlying entanglement structure of the target state, with different network designs favoring different scaling patterns. We demonstrate here how a related correlation analysis can be applied to tensor network machine learning, and explore whether classical data possess correlation scaling patterns similar to those found in quantum states, which might indicate the best network to use for a given dataset. We utilize mutual information as a measure of correlations in classical data, and show that it can serve as a lower bound on the entanglement needed for a probabilistic tensor network classifier. We then develop a logistic regression algorithm to estimate the mutual information between bipartitions of data features, and verify its accuracy on a set of Gaussian distributions designed to mimic different correlation patterns. Using this algorithm, we characterize the scaling patterns in the MNIST and Tiny Images datasets, and find clear evidence of boundary-law scaling in the latter. This quantum-inspired classical analysis offers insight into the design of tensor networks which are best suited for specific learning tasks.
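A generic sketch of the classifier-based estimation idea described above: train a logistic regression model to distinguish jointly drawn feature pairs from pairs in which one block has been shuffled (an approximate sample from the product of marginals), then average the resulting log-odds over joint samples to estimate the mutual information in nats. The code below is a minimal illustration of that density-ratio construction, not the paper's implementation; the function name, shuffling scheme, and hold-out split are choices made for the example.

    # Sketch: classifier-based estimate of I(A;B) between two feature blocks.
    # A generic density-ratio illustration, not the paper's implementation.
    import numpy as np
    from sklearn.linear_model import LogisticRegression

    def estimate_mi(x_a, x_b, test_frac=0.5, seed=0):
        """Estimate I(A;B) in nats from paired samples x_a, x_b of shape (n, features)."""
        rng = np.random.default_rng(seed)
        n = x_a.shape[0]

        # Positive class: samples from the joint distribution p(a, b).
        joint = np.hstack([x_a, x_b])
        # Negative class: shuffle the B block to approximate p(a)p(b).
        product = np.hstack([x_a, x_b[rng.permutation(n)]])

        X = np.vstack([joint, product])
        y = np.concatenate([np.ones(n), np.zeros(n)])

        # Hold out part of the data so the log-odds are evaluated out of sample.
        idx = rng.permutation(2 * n)
        split = int(len(idx) * (1 - test_frac))
        train, test = idx[:split], idx[split:]

        clf = LogisticRegression(max_iter=1000).fit(X[train], y[train])

        # For joint samples, log p(y=1|x)/p(y=0|x) approximates log p(a,b)/(p(a)p(b)).
        log_odds = clf.decision_function(X[test])
        return float(np.mean(log_odds[y[test] == 1]))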

Figures

Figure 1:
Tensor network diagram for an open MPS containing five tensors, with the corresponding equation given above. The first and last tensors are matrices labeled as M, with the rest being third-order tensors labeled generically as A. The contractions between neighboring tensors are made explicit in the graphical notation, without needing to keep track of specific indices. The order of the MPS is also clear from the number of uncontracted legs.
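To make the contraction pattern of Figure 1 concrete, the snippet below contracts a five-tensor open MPS with NumPy's einsum. The bond and physical dimensions, tensor names, and random entries are assumptions chosen only for illustration; the index pattern mirrors the diagram, with boundary matrices M, interior third-order tensors A, neighboring bond indices contracted, and five physical legs left open.

    # Sketch: contracting the five-tensor open MPS of Figure 1.
    # Dimensions and entries are arbitrary choices for illustration.
    import numpy as np

    d, chi = 2, 4                       # assumed physical and bond dimensions
    M1 = np.random.rand(d, chi)         # left boundary matrix: (physical, bond)
    A2 = np.random.rand(chi, d, chi)    # interior tensors: (bond, physical, bond)
    A3 = np.random.rand(chi, d, chi)
    A4 = np.random.rand(chi, d, chi)
    M5 = np.random.rand(chi, d)         # right boundary matrix: (bond, physical)

    # Contract the shared bond indices; the five uncontracted physical legs
    # remain, so the result is an order-5 tensor, matching the diagram.
    psi = np.einsum('ia,ajb,bkc,cld,dm->ijklm', M1, A2, A3, A4, M5)
    print(psi.shape)  # (2, 2, 2, 2, 2)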
Figure 2:
MI curves for a GMRF with nearest-neighbor correlations at different sample sizes, plotted relative to the side length L of the inner partition. The plot on the left had a weaker correlation strength with q = −0.12, while the plot on the right had a stronger correlation strength with q = −0.227. The solid lines represent the averages over the trials, while the shaded regions show one standard deviation. The linear boundary-law scaling pattern of the GMRF is evident from the exact curve (red). With the exception of the weakly-correlated, 70,000-sample trial, this linear scaling is successfully reconstructed by the algorithm. The magnitude of the MI is underestimated in all trials, with the fractional error being similar for the strong and weak correlations.
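The exact (red) curves for the GMRF examples can be computed in closed form, since for jointly Gaussian variables the mutual information between a bipartition (A, B) with marginal covariance blocks Σ_A, Σ_B and joint covariance Σ is I(A;B) = ½(log det Σ_A + log det Σ_B − log det Σ). The sketch below evaluates this formula on a small, assumed covariance matrix (not one of the paper's grids) purely to illustrate the calculation.

    # Sketch: exact mutual information between two blocks of a multivariate Gaussian,
    # the quantity behind the "exact" curves for the GMRF examples.
    import numpy as np

    def gaussian_mi(cov, idx_a, idx_b):
        """I(A;B) in nats for a zero-mean Gaussian with covariance `cov`."""
        cov = np.asarray(cov)
        cov_a = cov[np.ix_(idx_a, idx_a)]
        cov_b = cov[np.ix_(idx_b, idx_b)]
        joint = cov[np.ix_(idx_a + idx_b, idx_a + idx_b)]
        # I(A;B) = 1/2 (log det Sigma_A + log det Sigma_B - log det Sigma_AB)
        return 0.5 * (np.linalg.slogdet(cov_a)[1]
                      + np.linalg.slogdet(cov_b)[1]
                      - np.linalg.slogdet(joint)[1])

    # Toy example with an assumed 4-variable uniform precision matrix (not from the paper):
    q = -0.2
    prec = np.eye(4) + q * (np.ones((4, 4)) - np.eye(4))
    cov = np.linalg.inv(prec)
    print(gaussian_mi(cov, [0, 1], [2, 3]))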
Figure 3:
MI curves for a GMRF with uniform correlations at different sample sizes, plotted relative to the side length L of the inner partition. The plot on the left had a weaker correlation strength with q = −1.2 × 10⁻³, while the plot on the right had a stronger correlation strength with q = −1.27712 × 10⁻³. The solid lines represent the averages over the trials, while the shaded regions show one standard deviation. The algorithm successfully reproduced the shape of the exact MI curve, with the larger sample sizes almost matching the analytic MI values. The finite size of the grid causes the curve to gradually bend over as the partition length increases.
Figure 4:
MI curves for a GMRF with spatially-randomized correlations at different sample sizes, plotted relative to the side length L of the inner partition. The plot on the left had a weaker correlation strength with q = −0.045, while the plot on the right had a stronger correlation strength with q = −0.11. The solid lines represent the averages over the trials, while the shaded regions show one standard deviation. The quadratic, volume-law scaling is clear from the analytic MI curve at small L, which for the strong correlations was reproduced across all sample sizes (though the 70,000-sample curve is greatly diminished). For the weak correlations the model was unable to find any correlations using the smallest sample size of 70,000, as was the case for the nearest-neighbor correlations (Figure 2, left panel).
Figure 5:
MI estimates for the MNIST and Tiny Images datasets using logistic regression, plotted relative to the side length L of the inner partition. The solid lines are averages from twenty separate trials, while the shaded regions show one standard deviation. The MNIST curve most closely resembles the strongly-correlated uniform GMRF from Sec. 5.4, and exhibits minimal variance. The Tiny Images curve is most similar to the nearest-neighbor, boundary-law GMRF from Sec. 5.3 (Figure 2, right panel), but the shape is harder to pin down due to its high variance.
Figure 6:
Analytic MI curves from the GMRFs fitted to MNIST and the Tiny Images, plotted relative to the side length L of the inner partition. The Tiny Images curve shows a clear boundary law, while the MNIST curve also starts linear but gradually bends over. Since the GMRFs only model simple pairwise correlations, these MI values are very likely underestimates.
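The fitted-GMRF curves of Figure 6 amount to applying that closed-form Gaussian MI to a covariance matrix estimated from the image data, sweeping the side length L of a centered square partition. The sketch below shows such a sweep, assuming images supplied as an (n_samples, height, width) array and reusing the gaussian_mi helper defined after Figure 2; the ridge term and the centered-square partitioning are choices made for the example.

    # Sketch: MI versus inner-partition side length L for a Gaussian model
    # fit to image data (cf. Figure 6). Reuses gaussian_mi defined above.
    import numpy as np

    def mi_vs_partition_size(images, gaussian_mi):
        """images: (n_samples, H, W) array; returns a list of (L, MI) pairs."""
        n, H, W = images.shape
        flat = images.reshape(n, H * W)
        cov = np.cov(flat, rowvar=False) + 1e-6 * np.eye(H * W)  # ridge for stability

        curve = []
        for L in range(1, min(H, W)):
            # Indices of a centered L-by-L square (inner partition) and its complement.
            r0, c0 = (H - L) // 2, (W - L) // 2
            inner = [(r0 + i) * W + (c0 + j) for i in range(L) for j in range(L)]
            inner_set = set(inner)
            outer = [k for k in range(H * W) if k not in inner_set]
            curve.append((L, gaussian_mi(cov, inner, outer)))
        return curve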
Figure 7:
Left: Low-resolution image taken from the Tiny Images dataset after having been cropped and converted to grayscale. Right: The handwritten digit “5” taken from MNIST. The axes give the height and width of each image.
Figure 8:
Covariance plots for the a) nearest-neighbor, b) uniform, and c) randomized GMRF distributions used in Sec. 5. The covariance values are taken with respect to the center variable highlighted in red, with brighter colors indicating stronger correlations and darker pixels indicating weaker correlations.
Figure 9:
Sample “images” taken from the GMRF distributions of Sec. 5 at both strong and weak correlation strengths.
Figure 10:
The covariances (top row) and sample images (bottom row) from GMRFs fit to the Tiny Images and MNIST datasets. The covariance values are calculated with respect to the central pixel highlighted in red, with brighter colors indicating larger values. The Tiny Images covariance plot shows a strong nearest-neighbor pattern, while the MNIST plot has a more complicated and long-range structure. The sample images show some structure, but are not identifiable as a digit or object.