- Open Access
Reinforcement Learning for Many-Body Ground-State Preparation Inspired by Counterdiabatic Driving
Jiahao Yao1,*, Lin Lin1,2,3, and Marin Bukov4,5,†
- 1Department of Mathematics, University of California, Berkeley, California 94720, USA
- 2Computational Research Division, Lawrence Berkeley National Laboratory, Berkeley, California 94720, USA
- 3Challenge Institute for Quantum Computation, University of California, Berkeley, California 94720, USA
- 4Department of Physics, University of California, Berkeley, California 94720, USA
- 5Department of Physics, St. Kliment Ohridski University of Sofia, 5 James Bourchier Boulevard, 1164 Sofia, Bulgaria
- *jiahaoyao@berkeley.edu
- †mgbukov@phys.uni-sofia.bg
Phys. Rev. X 11, 031070 – Published 30 September 2021
DOI: https://doi.org/10.1103/PhysRevX.11.031070
Abstract
The quantum alternating operator ansatz (QAOA) is a prominent example of variational quantum algorithms. We propose a generalized QAOA called CD-QAOA, which is inspired by the counterdiabatic driving procedure, designed for quantum many-body systems and optimized using a reinforcement learning (RL) approach. The resulting hybrid control algorithm proves versatile in preparing the ground state of quantum-chaotic many-body spin chains by minimizing the energy. We show that using terms occurring in the adiabatic gauge potential as generators of additional control unitaries, it is possible to achieve fast high-fidelity many-body control away from the adiabatic regime. While each unitary retains the conventional QAOA-intrinsic continuous control degree of freedom such as the time duration, we consider the order of the multiple available unitaries appearing in the control sequence as an additional discrete optimization problem. Endowing the policy gradient algorithm with an autoregressive deep learning architecture to capture causality, we train the RL agent to construct optimal sequences of unitaries. The algorithm has no access to the quantum state, and we find that the protocol learned on small systems may generalize to larger systems. By scanning a range of protocol durations, we present numerical evidence for a finite quantum speed limit in the nonintegrable mixed-field spin-1/2 Ising and Lipkin-Meshkov-Glick models, and for the suitability to prepare ground states of the spin-1 Heisenberg chain in the long-range and topologically ordered parameter regimes. This work paves the way to incorporate recent success from deep learning for the purpose of quantum many-body control.
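The structure of the ansatz described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the three-spin open chain, the couplings `J`, `hz`, `hx`, the three-element generator pool (two QAOA-like terms plus a single-site `Y` term standing in for a gauge-potential-inspired generator), the initial state, and the helper names `site_op`, `build_pool`, `evolve`, and `protocol_energy` are all assumptions chosen for illustration. The sketch applies a chosen sequence of unitaries, rescales the durations so they sum to a fixed protocol duration `T`, and returns the energy that an optimizer would minimize.

```python
import numpy as np

# Single-site Pauli operators.
I2 = np.eye(2, dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)

def site_op(op, site, n):
    """Embed a single-site operator at position `site` of an n-spin chain."""
    out = np.eye(1, dtype=complex)
    for k in range(n):
        out = np.kron(out, op if k == site else I2)
    return out

def build_pool(n, J=1.0, hz=0.5, hx=0.7):
    """Illustrative generator pool and target Hamiltonian (assumed values)."""
    zz = sum(site_op(Z, k, n) @ site_op(Z, k + 1, n) for k in range(n - 1))
    sx = sum(site_op(X, k, n) for k in range(n))
    sy = sum(site_op(Y, k, n) for k in range(n))
    sz = sum(site_op(Z, k, n) for k in range(n))
    pool = {"ZZ+Z": J * zz + hz * sz, "X": hx * sx, "Y": sy}
    H = J * zz + hz * sz + hx * sx  # mixed-field Ising chain
    return pool, H

def evolve(psi, gen, t):
    """Apply exp(-i t gen) to psi via the eigendecomposition of gen."""
    w, v = np.linalg.eigh(gen)
    return v @ (np.exp(-1j * w * t) * (v.conj().T @ psi))

def protocol_energy(sequence, durations, pool, H, psi0, T):
    """Energy after applying the sequence; durations are rescaled to sum to T."""
    if any(a == b for a, b in zip(sequence, sequence[1:])):
        raise ValueError("the same unitary may not appear twice in a row")
    d = np.abs(np.asarray(durations, dtype=float))
    alphas = T * d / d.sum()
    psi = psi0
    for name, alpha in zip(sequence, alphas):
        psi = evolve(psi, pool[name], alpha)
    return float(np.real(np.vdot(psi, H @ psi)))

n = 3
pool, H = build_pool(n)
psi0 = np.zeros(2 ** n, dtype=complex)
psi0[0] = 1.0  # assumed initial product state
E = protocol_energy(["Y", "ZZ+Z", "X"], [1.0, 2.0, 1.0], pool, H, psi0, T=2.0)
```

In this sketch the discrete choice is the list of generator names and the continuous choice is the vector of durations. Note that `protocol_energy` receives only the scalar energy as feedback, which mirrors the statement that the algorithm has no access to the quantum state.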
Popular Summary
Strongly correlated quantum many-body systems describe condensed-matter materials which, once understood, are expected to lead to new technologies based on quantum mechanics. It has recently been realized that ground-state properties that give rise to exotic quantum phenomena (such as superconductivity or topological states of matter) can be simulated on highly controllable AMO-based quantum simulators. A major obstacle to accessing this exciting new physics is the preparation of many-body ground states on the quantum simulator. Our work presents a new physics-inspired, machine-learning-based approach to fast and efficient many-body quantum control.
We develop a generalization of the quantum alternating operator ansatz—a well-established quantum control algorithm—specifically tailored to prepare quantum many-body states. To do this, we integrate counterdiabatic driving—a concept from quantum dynamics for transitionless driving—into a physics-informed ansatz. This allows us to construct a suitable control protocol space, designed for the specific many-body system of interest.
To find the optimal protocol in this large control space, we apply a novel reinforcement learning algorithm based on deep autoregressive neural networks. The resulting hybrid variational algorithm combines the best of the quantum alternating operator ansatz and counterdiabatic driving to enable the preparation of high-fidelity quantum many-body ground states in systems lacking closed-form analytical solutions.
Our study opens the door to identifying relevant control degrees of freedom in quantum many-body systems. The generalized quantum alternating operator ansatz that we develop is suitable for both digital and analog quantum simulators. In addition, this work displays a beneficial symbiosis between reinforcement learning and optimal quantum control.
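The discrete part of the search summarized above, choosing which unitaries to apply and in which order, can be sketched as sequential sampling from an autoregressive policy. The sketch below is a deliberate simplification and not the architecture used in the work: in place of a deep autoregressive network it uses a hypothetical lookup table `logits[prev, a]` that conditions each choice only on the previous one, with an extra row acting as a start token. It masks out an immediate repetition of the previous unitary and accumulates the log-probability that a policy-gradient update would need.

```python
import numpy as np

def sample_sequence(logits, length, rng):
    """Sample a sequence of unitary indices one step at a time.

    logits has shape (n_actions + 1, n_actions); the last row is used for
    the first step, when there is no previous choice (assumed convention).
    Returns the sampled sequence and its total log-probability.
    """
    n_actions = logits.shape[1]
    prev = n_actions  # start-token row
    seq, logp = [], 0.0
    for _ in range(length):
        scores = logits[prev].astype(float).copy()
        if prev < n_actions:
            scores[prev] = -np.inf  # forbid repeating the previous unitary
        scores -= scores.max()      # stabilize the softmax
        probs = np.exp(scores)
        probs /= probs.sum()
        a = int(rng.choice(n_actions, p=probs))
        logp += float(np.log(probs[a]))
        seq.append(a)
        prev = a
    return seq, logp

rng = np.random.default_rng(0)
logits = np.zeros((6, 5))  # untrained policy over a pool of five unitaries
seq, logp = sample_sequence(logits, length=3, rng=rng)
```

In a policy-gradient loop, each sampled sequence would be scored by the negative energy obtained after optimizing its durations, and the table (or network) parameters would be nudged along the gradient of `logp` weighted by that reward.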
References (94)
- M. Lewenstein, A. Sanpera, V. Ahufinger, B. Damski, A. Sen, and U. Sen, Ultracold Atomic Gases in Optical Lattices: Mimicking Condensed Matter Physics and Beyond, Adv. Phys. 56, 243 (2007).
- I. Bloch, J. Dalibard, and W. Zwerger, Many-Body Physics with Ultracold Gases, Rev. Mod. Phys. 80, 885 (2008).
- H. Häffner, C. F. Roos, and R. Blatt, Quantum Computing with Trapped Ions, Phys. Rep. 469, 155 (2008).
- R. Blatt and C. F. Roos, Quantum Simulations with Trapped Ions, Nat. Phys. 8, 277 (2012).
- C. Monroe and J. Kim, Scaling the Ion Trap Quantum Processor, Science 339, 1164 (2013).
- M. H. Devoret and R. J. Schoelkopf, Superconducting Circuits for Quantum Information: An Outlook, Science 339, 1169 (2013).
- M. W. Doherty, N. B. Manson, P. Delaney, F. Jelezko, J. Wrachtrup, and L. C. Hollenberg, The Nitrogen-Vacancy Colour Centre in Diamond, Phys. Rep. 528, 1 (2013).
- R. Schirhagl, K. Chang, M. Loretz, and C. L. Degen, Nitrogen-Vacancy Centers in Diamond: Nanoscale Sensors for Physics and Biology, Annu. Rev. Phys. Chem. 65, 83 (2014).
- F. Casola, T. van der Sar, and A. Yacoby, Probing Condensed Matter Physics with Magnetometry Based on Nitrogen-Vacancy Centres in Diamond, Nat. Rev. Mater. 3, 17088 (2018).
- Z.-L. Xiang, S. Ashhab, J. Q. You, and F. Nori, Hybrid Quantum Circuits: Superconducting Circuits Interacting with Other Quantum Systems, Rev. Mod. Phys. 85, 623 (2013).
- F. Arute, K. Arya, R. Babbush, D. Bacon, J. C. Bardin, R. Barends, R. Biswas, S. Boixo, F. G. Brandao, D. A. Buell et al., Quantum Supremacy Using a Programmable Superconducting Processor, Nature (London) 574, 505 (2019).
- N. Khaneja, T. Reiss, C. Kehlet, T. Schulte-Herbrüggen, and S. J. Glaser, Optimal Control of Coupled Spin Dynamics: Design of NMR Pulse Sequences by Gradient Ascent Algorithms, J. Magn. Reson. 172, 296 (2005).
- T. Caneva, T. Calarco, and S. Montangero, Chopped Random-Basis Quantum Optimization, Phys. Rev. A 84, 022326 (2011).
- A. Peruzzo, J. McClean, P. Shadbolt, M.-H. Yung, X.-Q. Zhou, P. J. Love, A. Aspuru-Guzik, and J. L. O’Brien, A Variational Eigenvalue Solver on a Photonic Quantum Processor, Nat. Commun. 5, 4213 (2014).
- L. Zhou, S.-T. Wang, S. Choi, H. Pichler, and M. D. Lukin, Quantum Approximate Optimization Algorithm: Performance, Mechanism, and Implementation on Near-Term Devices, Phys. Rev. X 10, 021067 (2020).
- S. Hadfield, Z. Wang, B. O’Gorman, E. G. Rieffel, D. Venturelli, and R. Biswas, From the Quantum Approximate Optimization Algorithm to a Quantum Alternating Operator Ansatz, Algorithms 12, 34 (2019).
- M. Demirplak and S. A. Rice, Adiabatic Population Transfer with Control Fields, J. Phys. Chem. A 107, 9937 (2003).
- M. Berry, Transitionless Quantum Driving, J. Phys. A 42, 365303 (2009).
- M. Kolodrubetz, D. Sels, P. Mehta, and A. Polkovnikov, Geometry and Non-adiabatic Response in Quantum and Classical Systems, Phys. Rep. 697, 1 (2017).
- M. Bukov, D. Sels, and A. Polkovnikov, Geometric Speed Limit of Accessible Many-Body State Preparation, Phys. Rev. X 9, 011034 (2019).
We focus on pure states, although the cost function can trivially be generalized to mixed states.
- V. Jurdjevic and H. J. Sussmann, Control Systems on Lie Groups, J. Diff. Eqs. 12, 313 (1972).
- L. Zhu, H. L. Tang, G. S. Barron, F. A. Calderon-Vargas, N. J. Mayhall, E. Barnes, and S. E. Economou, An Adaptive Quantum Approximate Optimization Algorithm for Solving Combinatorial Problems on a Quantum Computer, arXiv:2005.10258.
- J. R. McClean, S. Boixo, V. N. Smelyanskiy, R. Babbush, and H. Neven, Barren Plateaus in Quantum Neural Network Training Landscapes, Nat. Commun. 9, 4812 (2018).
- M. Cerezo, A. Sone, T. Volkoff, L. Cincio, and P. J. Coles, Cost Function Dependent Barren Plateaus in Shallow Parametrized Quantum Circuits, Nat. Commun. 12, 1791 (2021).
- E. Grant, L. Wossnig, M. Ostaszewski, and M. Benedetti, An Initialization Strategy for Addressing Barren Plateaus in Parametrized Quantum Circuits, Quantum 3, 214 (2019).
- P. Huembeli and A. Dauphin, Characterizing the Loss Landscape of Variational Quantum Circuits, Quantum Sci. Technol. 6, 025011 (2021).
Considering as a choice of unitaries, we impose the extra constraint that, even though unitaries can be repeated in the sequence, the same unitary cannot appear consecutively (or one can combine the two corresponding choices into a single variable).
- A. G. R. Day, M. Bukov, P. Weinberg, P. Mehta, and D. Sels, Glassy Phase of Optimal Quantum Control, Phys. Rev. Lett. 122, 020601 (2019).
- M. Bukov, A. G. R. Day, P. Weinberg, A. Polkovnikov, P. Mehta, and D. Sels, Broken Symmetry in a Two-Qubit Quantum Control Landscape, Phys. Rev. A 97, 052114 (2018).
A similar procedure appeared recently in Ref. [32], although they considered a different problem setup with greedy or beam search algorithms.
- L. Li, M. Fan, M. Coram, P. Riley, and S. Leichenauer, Quantum Optimization with a Novel Gibbs Objective Function and Ansatz Architecture Search, Phys. Rev. Research 2, 023074 (2020).
In principle, one can use any optimizer that allows for constraining the sum.
- N. Lacroix, C. Hellings, C. K. Andersen, A. D. Paolo, A. Remm, S. Lazar, S. Krinner, G. J. Norris, M. Gabureac, J. Heinsoo, A. Blais, C. Eichler, and A. Wallraff, Improving the Performance of Deep Quantum Optimization Algorithms with Continuous Gate Sets, PRX Quantum 1, 110304 (2020).
- Y. Ding, Y. Ban, J. D. Martín-Guerrero, E. Solano, J. Casanova, and X. Chen, Breaking Adiabatic Quantum Control with Deep Learning, Phys. Rev. A 103, L040401 (2021).
- D. Sels and A. Polkovnikov, Minimizing Irreversible Losses in Quantum Systems by Local Counterdiabatic Driving, Proc. Natl. Acad. Sci. U.S.A. 114, E3909 (2017).
- A. Hartmann and W. Lechner, Rapid Counter-Diabatic Sweeps in Lattice Gauge Adiabatic Quantum Computing, New J. Phys. 21, 043025 (2019).
- J. Wurtz, P. W. Claeys, and A. Polkovnikov, Variational Schrieffer-Wolff Transformations for Quantum Many-Body Dynamics, Phys. Rev. B 101, 014302 (2020).
- N. N. Hegade, K. Paul, Y. Ding, M. Sanz, F. Albarrán-Arriagada, E. Solano, and X. Chen, Shortcuts to Adiabaticity in Digitized Adiabatic Quantum Computing, Phys. Rev. Applied 15 (2021).
- M. Pandey, P. W. Claeys, D. K. Campbell, A. Polkovnikov, and D. Sels, Adiabatic Eigenstate Deformations as a Sensitive Probe for Quantum Chaos, Phys. Rev. X 10, 041017 (2020).
Below, we sometimes abuse notation and set, denoting the set of unitaries by their generators.
- J. Wurtz and P. J. Love, Counterdiabaticity and the Quantum Approximate Optimization Algorithm, arXiv:2106.15645.
- G. Matos, S. Johri, and Z. Papić, Quantifying the Efficiency of State Preparation via Quantum Variational Eigensolvers, PRX Quantum 2, 010309 (2021).
- W. W. Ho and T. H. Hsieh, Efficient Variational Simulation of Non-trivial Quantum States, SciPost Phys. 6, 29 (2019).
The role of the RL algorithm is to decide which three out of the five unitaries to apply and in which order.
We define “order” in the context of phase transitions in condensed matter physics.
- W. Chen, K. Hida, and B. C. Sanctuary, Ground-State Phase Diagram of XXZ Chains with Uniaxial Single-Ion-Type Anisotropy, Phys. Rev. B 67, 104401 (2003).
- F. Pollmann, A. M. Turner, E. Berg, and M. Oshikawa, Entanglement Spectrum of a Topological Phase in One Dimension, Phys. Rev. B 81, 064439 (2010).
- A. Langari, F. Pollmann, and M. Siahatgar, Ground-State Fidelity of the Spin-1 Heisenberg Chain with Single Ion Anisotropy: Quantum Renormalization Group and Exact Diagonalization Approaches, J. Phys. Condens. Matter 25, 406002 (2013).
- H. Lipkin, N. Meshkov, and A. Glick, Validity of Many-Body Approximation Methods for a Solvable Model: (I). Exact Solutions and Perturbation Theory, Nucl. Phys. 62, 188 (1965).
- R. Botet and R. Jullien, Large-Size Critical Behavior of Infinitely Coordinated Systems, Phys. Rev. B 28, 3955 (1983).
- H. Strobel, W. Muessel, D. Linnemann, T. Zibold, D. B. Hume, L. Pezzè, A. Smerzi, and M. K. Oberthaler, Fisher Information and Entanglement of Non-Gaussian Spin States, Science 345, 424 (2014).
- E. J. Davis, A. Periwal, E. S. Cooper, G. Bentsen, S. J. Evered, K. Van Kirk, and M. H. Schleier-Smith, Protecting Spin Coherence in a Tunable Heisenberg Model, Phys. Rev. Lett. 125, 060402 (2020).
We deliberately use a different form in Eq. (6) as compared to Eq. (3); the former may appear more natural in quantum many-body physics, where the transverse-field Ising model can be mapped to free fermions.
- M. J. S. Beach, R. G. Melko, T. Grover, and T. H. Hsieh, Making Trotters Sprint: A Variational Imaginary Time Ansatz for Quantum Many-Body Systems, Phys. Rev. B 100, 094434 (2019).
- P. Weinberg and M. Bukov, QuSpin: A Python Package for Dynamics and Exact Diagonalisation of Quantum Many Body Systems Part I: Spin Chains, SciPost Phys. 2, 003 (2017).
- P. Weinberg and M. Bukov, QuSpin: A Python Package for Dynamics and Exact Diagonalisation of Quantum Many Body Systems. Part II: Bosons, Fermions and Higher Spins, SciPost Phys. 7, 20 (2019).
- V. Dunjko and H. J. Briegel, Machine Learning & Artificial Intelligence in the Quantum Domain: A Review of Recent Progress, Rep. Prog. Phys. 81, 074001 (2018).
- P. Mehta, M. Bukov, C.-H. Wang, A. G. Day, C. Richardson, C. K. Fisher, and D. J. Schwab, A High-Bias, Low-Variance Introduction to Machine Learning for Physicists, Phys. Rep. 810, 1 (2019).
- G. Carleo, I. Cirac, K. Cranmer, L. Daudet, M. Schuld, N. Tishby, L. Vogt-Maranto, and L. Zdeborová, Machine Learning and the Physical Sciences, Rev. Mod. Phys. 91, 045002 (2019).
- J. Carrasquilla, Machine Learning for Quantum Matter, Adv. Phys. X 5, 1797528 (2020).
- K. J. Sung, J. Yao, M. P. Harrigan, N. C. Rubin, Z. Jiang, L. Lin, R. Babbush, and J. R. McClean, Using Models to Improve Optimizers for Variational Quantum Algorithms, Quantum Sci. Technol. 5, 044008 (2020).
- F. Schäfer, M. Kloc, C. Bruder, and N. Lörch, A Differentiable Programming Method for Quantum Control, Mach. Learn. 1, 035009 (2020).
- F. Sauvage and F. Mintert, Optimal Quantum Control with Poor Statistics, PRX Quantum 1, 020322 (2020).
- T. Fösel, S. Krastanov, F. Marquardt, and L. Jiang, Efficient Cavity Control with Snap Gates, arXiv:2004.14256.
- R.-B. Wu, X. Cao, P. Xie, and Y.-X. Liu, End-to-End Quantum Machine Learning Implemented with Controlled Quantum Dynamics, Phys. Rev. Applied 14, 064020 (2020).
- F. Albarrán-Arriagada, J. C. Retamal, E. Solano, and L. Lamata, Measurement-Based Adaptation Protocol with Quantum Reinforcement Learning, Phys. Rev. A 98, 042315 (2018).
- T. Fösel, P. Tighineanu, T. Weiss, and F. Marquardt, Reinforcement Learning with Neural Networks for Quantum Feedback, Phys. Rev. X 8, 031084 (2018).
- H. P. Nautrup, N. Delfosse, V. Dunjko, H. J. Briegel, and N. Friis, Optimizing Quantum Error Correction Codes with Reinforcement Learning, Quantum 3, 215 (2019).
- R. S. Sutton and A. G. Barto, Reinforcement Learning: An Introduction (MIT Press, Cambridge, MA, 2018).
- D. C. Rose, J. F. Mair, and J. P. Garrahan, A Reinforcement Learning Approach to Rare Trajectory Sampling, New J. Phys. 23, 013013 (2021).
- M. Y. Niu, S. Boixo, V. N. Smelyanskiy, and H. Neven, Universal Quantum Control through Deep Reinforcement Learning, npj Quantum Inf. 5, 33 (2019).
- M. August and J. M. Hernández-Lobato, Taking Gradients through Experiments: LSTMs and Memory Proximal Policy Optimization for Black-Box Quantum Control, in High Performance Computing (Springer International Publishing, Cham, 2018), pp. 591–613, 10.1007/978-3-030-02465-9_43.
- R. Porotti, D. Tamascelli, M. Restelli, and E. Prati, Coherent Transport of Quantum States by Deep Reinforcement Learning, Commun. Phys. 2, 61 (2019).
- M. Bukov, Reinforcement Learning for Autonomous Preparation of Floquet-Engineered States: Inverting the Quantum Kapitza Oscillator, Phys. Rev. B 98, 224305 (2018).
- M. Bukov, A. G. R. Day, D. Sels, P. Weinberg, A. Polkovnikov, and P. Mehta, Reinforcement Learning in Different Phases of Quantum Control, Phys. Rev. X 8, 031086 (2018).
- M. Dalgaard, F. Motzoi, J. J. Sørensen, and J. Sherson, Global Optimization of Quantum Dynamics with AlphaZero Deep Exploration, npj Quantum Inf. 6, 6 (2020).
- J. Yao, M. Bukov, and L. Lin, Policy Gradient Based Quantum Approximate Optimization Algorithm, in Mathematical and Scientific Machine Learning Conference (MSML), 2020 (PMLR, Princeton, NJ, USA, 2020), Vol. 107, pp. 605–634, http://proceedings.mlr.press/v107/yao20a.html.
- M. M. Wauters, E. Panizon, G. B. Mbeng, and G. E. Santoro, Reinforcement Learning Assisted Quantum Optimization, Phys. Rev. Research 2, 033446 (2020).
- S. Khairy, R. Shaydulin, L. Cincio, Y. Alexeev, and P. Balaprakash, Reinforcement-Learning-Based Variational Quantum Circuits Optimization for Combinatorial Problems, arXiv:1911.04574.
- A. Garcia-Saez and J. Riu, Quantum Observables for Continuous Control of the Quantum Approximate Optimization Algorithm via Reinforcement Learning, arXiv:1911.09682.
- A. Bolens and M. Heyl, Reinforcement Learning for Digital Quantum Simulation, Phys. Rev. Lett. 127, 110502 (2021).
It is also possible to define an RL framework for hybrid continuous-discrete control where optimization is entirely based on RL, cf. Ref. [84].
- J. Yao, P. Köttering, H. Gundlach, L. Lin, and M. Bukov, Noise-Robust End-to-End Quantum Control Using Deep Autoregressive Policy Networks, Mathematical and Scientific Machine Learning Conference, 2021, arXiv:2012.06701.
- M. Bukov, Reinforcement Learning for Autonomous Preparation of Floquet-Engineered States: Inverting the Quantum Kapitza Oscillator, Phys. Rev. B 98, 224305 (2018).
- D. Wu, L. Wang, and P. Zhang, Solving Statistical Mechanics Using Variational Autoregressive Networks, Phys. Rev. Lett. 122, 080602 (2019).
- O. Sharir, Y. Levine, N. Wies, G. Carleo, and A. Shashua, Deep Autoregressive Models for the Efficient Variational Simulation of Many-Body Quantum Systems, Phys. Rev. Lett. 124, 020503 (2020).
- D. P. Kingma and J. Ba, Adam: A Method for Stochastic Optimization, arXiv:1412.6980.
- J. Schulman, F. Wolski, P. Dhariwal, A. Radford, and O. Klimov, Proximal Policy Optimization Algorithms, arXiv:1707.06347.
- B. D. Ziebart, Modeling Purposeful Adaptive Behavior with the Principle of Maximum Causal Entropy (Carnegie, Pittsburgh, 2010).
- T. Haarnoja, A. Zhou, P. Abbeel, and S. Levine, Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor, arXiv:1801.01290.
- J. Nocedal and S. J. Wright, Numerical Optimization (Springer Science & Business Media, New York, 2006).
- T. Hatomura, Shortcuts to Adiabaticity in the Infinite-Range Ising Model by Mean-Field Counter-Diabatic Driving, J. Phys. Soc. Jpn. 86, 094002 (2017).
- Z. Mzaouali, R. Puebla, J. Goold, M. E. Baz, and S. Campbell, Work Statistics and Symmetry Breaking in an Excited-State Quantum Phase Transition, Phys. Rev. E 103, 032145 (2021).
- Received 24 November 2020
- Revised 28 May 2021
- Accepted 15 July 2021

Published by the American Physical Society under the terms of the Creative Commons Attribution 4.0 International license. Further distribution of this work must maintain attribution to the author(s) and the published article’s title, journal citation, and DOI.