Kunal Marwaha: journal club notes from whaley research group

I write them primarily for cementing my own understanding. I sometimes oversimplify to help me understand the main point.

If there are errors, please let me know by email.

(solo reading)

Main idea: new, optimization-based method to compute phase factors for QSP

Theme: Block encoding is a powerful technique!

- Goal is to implement matrix polynomials with few ancilla qubits
- Useful for Hamiltonian simulation *e*^{ − iAt}, eigenvalue filtering, QLSP (*A*^{ − 1}), thermal states *e*^{ − βA}
- Approximate with a polynomial function *f*(*x*), then encode *f*(*A*) exactly via block-encoding
- QSP uses adjustable phase factors, but it’s hard to find the phase factors given *f*(*x*)
- It involves an “iterate” matrix. Taking an initial *A* as an LCU and block encoding it, the eigenvalues of the block encoding are related to those of *A*.
- QSP uses D iterates and D+1 rotations (with a phase factor for each of the D+1 angles). (Similar to QAOA!)

- This paper presents a loss function that, when optimized, finds the best angles to choose. (You need to guess well!)
- It uses the “Remez exchange algorithm” to reduce the approximations required.
- This work makes an insight relating the decay of the phase factors to the decay of the Chebyshev coefficients.
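A minimal numerical sketch of the QSP structure (my own toy code, not the paper's optimization method): with all D+1 phase factors set to zero, the product of D iterates reproduces the Chebyshev polynomial *T*_{D}(*x*) in the top-left entry, which is the baseline that the adjustable phases deform into other polynomials.

```python
import numpy as np

def qsp_unitary(phis, x):
    """Product of D iterates W(x) interleaved with D+1 Z-rotations e^{i*phi*Z}."""
    W = np.array([[x, 1j * np.sqrt(1 - x**2)],
                  [1j * np.sqrt(1 - x**2), x]])
    def rz(phi):
        return np.array([[np.exp(1j * phi), 0], [0, np.exp(-1j * phi)]])
    U = rz(phis[0])
    for phi in phis[1:]:
        U = U @ W @ rz(phi)
    return U

# With all D+1 phases zero, the top-left entry is the Chebyshev polynomial T_D(x).
D, x = 4, 0.3
U = qsp_unitary(np.zeros(D + 1), x)
# T_4(x) = 8x^4 - 8x^2 + 1
print(U[0, 0].real, 8 * x**4 - 8 * x**2 + 1)  # both ≈ 0.3448
```

Choosing nonzero phases (the hard part the paper addresses) replaces *T*_{D} with a general degree-D polynomial.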

What’s the relationship between the below approaches?

- QSVT
- HHL algorithm for matrix function f(A)
- LCU method
- QSP
- QAOA
- qubitization

Main idea: There is a problem that 1D noiseless, 3D noisy constant depth quantum circuits solve but classical circuits need non-constant depth to solve.

The 1D problem uses the magic square game on 2 sets of *n* qubit pairs, where qubit pair *j* from set 1 and pair *k* from set 2 are the inputs. A classical constant-depth circuit cannot succeed with probability much above 8/9.
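The contradiction behind the game can be checked directly with numpy. This is one standard arrangement of the Mermin–Peres magic square (which may differ from the paper's): every row of observables multiplies to +*I*, the first two columns multiply to +*I*, but the last column multiplies to −*I*, so no classical ±1 assignment to the nine entries can satisfy all six constraints.

```python
import numpy as np

I2 = np.eye(2)
X = np.array([[0, 1], [1, 0]])
Y = np.array([[0, -1j], [1j, 0]])
Z = np.array([[1, 0], [0, -1]])

# One standard arrangement of the Mermin-Peres magic square.
square = [[np.kron(X, I2), np.kron(I2, X), np.kron(X, X)],
          [np.kron(I2, Z), np.kron(Z, I2), np.kron(Z, Z)],
          [np.kron(X, Z),  np.kron(Z, X),  np.kron(Y, Y)]]

for row in square:
    P = row[0] @ row[1] @ row[2]
    assert np.allclose(P, np.eye(4))          # each row multiplies to +I

for c in range(3):
    P = square[0][c] @ square[1][c] @ square[2][c]
    sign = -1 if c == 2 else 1
    assert np.allclose(P, sign * np.eye(4))   # columns: +I, +I, -I
```

Multiplying all nine entries row-by-row gives +1, but column-by-column gives −1, which is the obstruction a classical strategy cannot escape.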

- More information on Clifford gates
- This is the CNOT symbol

- What is the X-type part of P(s)?

A sub-theory is a subset of operations, closed under joint measurements.

Main idea: If a sub-theory is contextual, it needs *O*(*n*^{2}) space to be classically simulated (vs *O*(*n*) for some noncontextual sub-theories, such as qubit stabilizer codes.) This quadratic scaling matches the Gottesman-Knill algorithm.

- Where else does contextuality imply simulation hardness?
- What about operations that are not closed?

Main idea: A set of operators is noncontextual if commuting is transitive on the subset of operators that do not commute with all the others.

Noncontextuality is an extension of classicality (where all operators pairwise commute).

It is easy to test for noncontextuality by checking if commuting is transitive.

This also means that noncontextuality implies commutativity is an equivalence relation (i.e. if operators commute, they are in a clique)
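A sketch of this test for Pauli strings (my own implementation, not the paper's code; `commutes` uses the rule that two Pauli strings commute iff they differ on an even number of non-identity sites):

```python
from itertools import combinations

def commutes(p, q):
    """Pauli strings commute iff they anticommute on an even number of sites."""
    anti = sum(1 for a, b in zip(p, q) if a != 'I' and b != 'I' and a != b)
    return anti % 2 == 0

def is_noncontextual(paulis):
    """Remove operators that commute with everything, then check that
    commutation is transitive on what remains."""
    rest = [p for p in paulis if not all(commutes(p, q) for q in paulis)]
    for a, b, c in combinations(rest, 3):
        pairs = [commutes(a, b), commutes(b, c), commutes(a, c)]
        if sum(pairs) == 2:   # exactly one non-commuting pair -> not transitive
            return False
    return True

print(is_noncontextual(['XI', 'IZ', 'ZI']))   # True
# The nine magic-square observables form a contextual set:
print(is_noncontextual(['XI', 'IX', 'XX', 'IZ', 'ZI', 'ZZ',
                        'XZ', 'ZX', 'YY']))   # False
```

The check is polynomial time: all triples of operators, with a pairwise commutation test per triple.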

“Strongly contextual” means you can’t assign a consistent outcome to combinations of operators! The measurement set is not closed under multiplication.

For example: (*XI* ⋅ *IX*) ⋅ (*ZI* ⋅ *IZ*) = −*YY*, but (*XI* ⋅ *IZ*) ⋅ (*ZI* ⋅ *IX*) = +*YY*
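Checking the signs with numpy (standard Pauli conventions): the two groupings of the same four operators land on opposite signs of *YY*, so no single consistent outcome can be assigned.

```python
import numpy as np

X = np.array([[0, 1], [1, 0]])
Y = np.array([[0, -1j], [1j, 0]])
Z = np.array([[1, 0], [0, -1]])
I2 = np.eye(2)

XI, IX = np.kron(X, I2), np.kron(I2, X)
ZI, IZ = np.kron(Z, I2), np.kron(I2, Z)
YY = np.kron(Y, Y)

A = (XI @ IX) @ (ZI @ IZ)   # group as XX times ZZ
B = (XI @ IZ) @ (ZI @ IX)   # group as XZ times ZX
assert np.allclose(A, -YY)
assert np.allclose(B, YY)
assert np.allclose(A, -B)   # the two groupings disagree by a sign
```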

“Weakly contextual” means joint distributions don’t match a hidden variable theory: *v*(*A*)*v*(*B*) ≠ *v*(*AB*)

- How does the PBR proof work, exactly?
- When can you tell if something is weakly contextual?
- What do these things mean quantitatively, outside of Pauli operators?
- What about computability (things can be classically hard but still noncontextual)

Noncontextuality is a generalization of classical-ness.

Noncontextuality implies a hidden variable theory can explain the results of measurements

If everything has to commute or anticommute (as with Pauli measurements), then it is easy to simulate “noncontextual” Pauli Hamiltonians in a quasi-quantized model! So one can simulate more than just the diagonal terms.

I found this talk really helpful. It started simple and ramped up very quickly.

- Unifying perspective on quantum walks, amplitude amplification, phase estimation, Grover diffusion
- Quantum walks and repeated operators
- Take unitary U, but just look at the top part A
- But also you can make a singular value decomposition!
- You can implement a function of the singular values of A!!
- This is more general than a function on the eigenvalues (can be non-square)
- A is arbitrary. It depends on the polynomial function you want to implement on the singular values! (just has to be an odd polynomial function)

- Low-degree approximating polynomials are the key.
- Lots of other possibilities (QMA amplification)
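Classically, "a function of the singular values" is just *f*(*A*) = *U* *f*(Σ) *V*^{†} in the SVD. A toy numpy sketch (names and example are mine, not from the talk) for the odd polynomial *f*(*x*) = *x*³ on a non-square matrix:

```python
import numpy as np

def poly_of_singular_values(A, coeffs):
    """Apply a polynomial f to the singular values of A:
    if A = U @ diag(s) @ Vh, return U @ diag(f(s)) @ Vh."""
    U, s, Vh = np.linalg.svd(A, full_matrices=False)
    return U @ np.diag(np.polyval(coeffs, s)) @ Vh

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 3))   # non-square: no eigenvalues, but singular values exist
# f(x) = x^3 (odd), so the result should equal A @ A.T @ A
B = poly_of_singular_values(A, [1, 0, 0, 0])
assert np.allclose(B, A @ A.T @ A)
```

For odd polynomials the result has a matrix-product form like `A @ A.T @ A`, which is why QSVT can build it from repeated applications of the block encoding and its adjoint.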

This is from the Google collaboration. They introduced a new optimizing algorithm and compared performance by experimental demonstration with existing optimizers on common physics and CS models.

This is a cool protocol, “qDRIFT”:

- Split up H into normalized *H*_{j} with constants *c*_{j}
- Randomly use unitaries *e*^{itH_j} with probability *c*_{j}/∑*c*_{j}
- With enough gates, it will approximate *e*^{itH} (wow!)

It’s good compared to Trotter/Suzuki since the dependence on the number of terms is much better (the gate count doesn’t depend directly on L, just on ∑*c*_{j})

Stochastically drifts towards target unitary.

There is an interesting quantum channel argument, showing that the Taylor expansion of *e*^{itH}*ρ**e*^{ − itH} is close to the probabilistic mixing of gates in qDRIFT (choosing *H*_{j} with probability *c*_{j}/∑*c*_{j}, many times)
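A single-qubit toy check of this mixing argument (my own superoperator implementation, using the *e*^{ − iHt} sign convention): build the exact qDRIFT channel — the probabilistic mixture of the per-term gates — apply it N times to ρ, and compare with the true evolution. The error shrinks like 1/N, independent of the number of terms.

```python
import numpy as np

X = np.array([[0, 1], [1, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)

def exp_iht(H, t):
    """e^{-iHt} for Hermitian H via eigendecomposition."""
    w, V = np.linalg.eigh(H)
    return V @ np.diag(np.exp(-1j * w * t)) @ V.conj().T

# H = sum_j c_j H_j with normalized terms H_j
terms = [Z, X]
cs = np.array([0.8, 0.2])
lam, t, N = cs.sum(), 0.5, 200

# qDRIFT channel: apply e^{-i (lam*t/N) H_j} with probability c_j/lam.
# Build it as a superoperator on vec(rho) (column stacking).
E = np.zeros((4, 4), dtype=complex)
for c, Hj in zip(cs, terms):
    U = exp_iht(Hj, lam * t / N)
    E += (c / lam) * np.kron(U.conj(), U)   # vec(U rho U†) = (conj(U) ⊗ U) vec(rho)

rho0 = np.array([[1, 0], [0, 0]], dtype=complex)   # |0><0|
vec = np.linalg.matrix_power(E, N) @ rho0.reshape(-1, order='F')
rho_qdrift = vec.reshape(2, 2, order='F')

Uex = exp_iht(0.8 * Z + 0.2 * X, t)
rho_exact = Uex @ rho0 @ Uex.conj().T
print(np.abs(rho_qdrift - rho_exact).max())   # small; shrinks like 1/N
```

Averaging the channel exactly (rather than sampling circuits) isolates the qDRIFT bias itself, which is bounded by roughly 2(∑*c*_{j})²t²/N.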

- Is there a higher-order qDRIFT?
- What about H = H_a + H_b, using qDRIFT on each one? (Perhaps we can partition the terms so that [*H*_{a}, *H*_{b}] ≈ 0?)
- What if you have one big *λ* and a lot of little corrections? Can you run qDRIFT on the little corrections and make *H*_{A} + *H*_{qdrift}?

- What is “compilation”?
- Why does qDRIFT perform so much better than Trotter, even numerically?
- How does this compare to time dependent perturbation theory?

This is a more sophisticated theory of trotter error bounds for product formulas.

It makes these contributions:

- defining types of trotter error (additive vs multiplicative vs exponential)
- defining order conditions (rates of convergence of the errors)
- using “error representations” (to prove bounds on errors)

Order conditions imply that many derivatives vanish: *f*(*t*) = *O*(*t*^{p}) means the first *p* Taylor coefficients vanish (since *f* is smaller than *c**t*^{p} near 0, *f* = *a*_{p}*t*^{p} + *a*_{p+1}*t*^{p+1} + ...). This causes many terms to drop out.
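An order condition in action (a toy example of mine, not from the paper): the first-order Lie–Trotter error *f*(*t*) = *e*^{ − iAt}*e*^{ − iBt} − *e*^{ − i(A+B)t} is *O*(*t*²), so halving *t* should shrink the error by about 4×.

```python
import numpy as np

def exp_iht(H, t):
    """e^{-iHt} for Hermitian H via eigendecomposition."""
    w, V = np.linalg.eigh(H)
    return V @ np.diag(np.exp(-1j * w * t)) @ V.conj().T

A = np.array([[1, 0], [0, -1]], dtype=complex)   # Z
B = np.array([[0, 1], [1, 0]], dtype=complex)    # X

def trotter_error(t):
    """Spectral norm of the first-order product-formula error."""
    return np.linalg.norm(exp_iht(A, t) @ exp_iht(B, t) - exp_iht(A + B, t), 2)

e1, e2 = trotter_error(0.01), trotter_error(0.005)
print(e1 / e2)   # ≈ 4, confirming the error is O(t^2): the O(t) term vanishes
```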

- What improvements did Su & Childs make before this?
- Eq 10 was tricky – how do you get the R(t) part?
- Lemma 2: proof isn’t clear to me: *A* + *e*^{ − At/2}*B**e*^{At/2} = *H*? Maybe the exponentials are constants?
- How is corollary 5 proved?
- pth order condition seems pretty crucial!
- How do you get eq 30?

With an ensemble of unitaries, you only need a few measurements to generate a “classical shadow” (a best guess, like least squares). This can approximate a lot of linear observables without knowing what you’re looking for.

But you have to bound the shadow norm of operators.
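A toy single-qubit version of the protocol (my own sketch with random X/Y/Z-basis measurements, not the paper's general construction): each measurement yields a snapshot 3*U*^{†}|b⟩⟨b|*U* − *I*, which is an unbiased estimator of ρ, and averaging snapshots estimates linear observables like ⟨Z⟩.

```python
import numpy as np

rng = np.random.default_rng(1)
Z = np.array([[1, 0], [0, -1]], dtype=complex)

# Basis-change unitaries mapping the X/Y/Z eigenbasis to the computational basis
Us = {'X': np.array([[1, 1], [1, -1]], dtype=complex) / np.sqrt(2),
      'Y': np.array([[1, -1j], [1, 1j]]) / np.sqrt(2),
      'Z': np.eye(2, dtype=complex)}

rho = np.array([[1, 0], [0, 0]], dtype=complex)   # |0><0|, so Tr(Z rho) = 1

est = []
for _ in range(3000):
    U = Us[rng.choice(list(Us))]                    # pick a random measurement basis
    probs = np.real(np.diag(U @ rho @ U.conj().T))  # Born rule in that basis
    b = rng.choice(2, p=probs / probs.sum())
    ket = np.zeros((2, 1), dtype=complex); ket[b] = 1
    # Classical shadow snapshot: inverse of the measurement channel
    snapshot = 3 * (U.conj().T @ (ket @ ket.conj().T) @ U) - np.eye(2)
    est.append(np.real(np.trace(Z @ snapshot)))

print(np.mean(est))   # ≈ 1 = Tr(Z rho)
```

The variance of this estimator is controlled by the shadow norm of Z, which is why bounding shadow norms matters.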

Computing the partition function is related to approximating the ground state.

This is a polynomial approximation algorithm for quantum spin systems.

It converts the Hamiltonian into a sum of edge contributions of a graph. There are several methods to find *Z*(*β*) = Tr(*e*^{ − βH}):

- correlation decay analysis
- MCMC
- interpolation

This uses “abstract cluster expansion”. A polymer in the quantum instance (QCE) is a multiset of hyperedges (with a multiplicity for each hyperedge) whose hyperedges form a connected subgraph.

- This is possible because at high temperature, *β* is bounded
- The truncation of the subgraph measures the locality
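A toy illustration of why small *β* helps (a direct Taylor truncation of mine, not the paper's cluster expansion): at high temperature the series for *Z*(*β*) = Tr(*e*^{ − βH}) converges after only a few terms.

```python
import numpy as np
from math import factorial

# Toy two-qubit Hamiltonian: one ZZ edge plus transverse fields
X = np.array([[0, 1], [1, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)
I2 = np.eye(2)
H = np.kron(Z, Z) + 0.5 * (np.kron(X, I2) + np.kron(I2, X))

def Z_exact(beta):
    w = np.linalg.eigvalsh(H)
    return np.sum(np.exp(-beta * w))

def Z_series(beta, order):
    """Truncated expansion Tr(e^{-beta H}) = sum_k (-beta)^k Tr(H^k) / k!"""
    Hk = np.eye(4, dtype=complex)
    total = 0.0
    for k in range(order + 1):
        total += (-beta) ** k * np.trace(Hk).real / factorial(k)
        Hk = Hk @ H
    return total

beta = 0.2   # high temperature: small beta
print(abs(Z_series(beta, 6) - Z_exact(beta)))   # tiny truncation error
```

The cluster expansion organizes essentially this series by connected subgraphs, so the truncation level also controls how local the computation stays.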

I read the paper and was impressed by its presentation:

- Main results up front, with clear introduction, explanation of definitions (like “fully polynomial approximation scheme”)
- Neat sections, with explanations of the sections at the end of the introduction
- Appendices for the equation manipulation

- Why is the degree O(log d Delta)? Why is it not just Delta?
- ∼ is a symmetric compatibility relation such that each polymer is incompatible with itself … what is this? Any symmetric subset of P(X) where (a, a) is not included?
- Why is Z(C, w) the form it is?
- Why is the Ursell function involved?
- Is V(H) the number of vertices in the graph? Is this just |Gamma|?
- The summation is the sum of connected, spanning subsets of edges?
- This will only have a value for clusters!

- I didn’t walk through Lemma 2, 4, 6, 7
- Lemma 5:
  - All connected subgraphs of size at most m – why can it be done in |*V*|*e*^{m}?
  - Why are there (m-1 choose n-1) polymers of size at most m?
- Lemma 8: how many clusters are there? |*V*|^{O(1)}

Do you have a paper suggestion? Was this helpful for you? Please send me an email.