Create a new method to return the final state vector array instead of wrapping it #623
qsimcirq/qsim_simulator.py
def simulate_sweep_iter(
    self,
    program: cirq.Circuit,
    params: cirq.Sweepable,
    qubit_order: cirq.QubitOrderOrList = cirq.QubitOrder.DEFAULT,
    initial_state: Optional[Union[int, np.ndarray]] = None,
    as_1d_state_vector: bool = False,
This violates the API defined by cirq.SimulatesFinalState.simulate_sweep_iter. If there are plans to modify that function as well, please link the relevant Cirq PR. (The required cirq version for qsim will also need to be updated if this is the case.)
It doesn't violate the API, but it changes the internal data representation if and only if as_1d_state_vector = True, which is False by default, so nothing changes unless the caller explicitly wants the 1D representation. This is done only for the qsim simulator.
The 1D representation is used for only one reason: to report the result. The normal representation is a tensor whose number of dimensions equals num_qubits, which breaks when the number of qubits exceeds the limit on numpy array dimensions (see the issue in the docstring and PR description).
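To make the limit concrete, here is a minimal NumPy-only sketch (not qsimcirq code; the helper name `max_supported_ndim` is made up for this illustration). It probes the dimension cap of the installed NumPy using size-1 axes, so nothing large is allocated, and then shows on a 3-qubit toy state that the tensor layout and the 1D layout hold the same amplitudes.

```python
import numpy as np


def max_supported_ndim(upper_bound: int = 128) -> int:
    """Probe how many dimensions this NumPy build allows (hypothetical helper)."""
    for ndim in range(1, upper_bound + 1):
        try:
            # Size-1 axes keep the array tiny no matter how many dimensions it has.
            np.empty((1,) * ndim)
        except ValueError:
            return ndim - 1
    return upper_bound


num_qubits = 3
flat = np.zeros(2**num_qubits, dtype=np.complex64)
flat[0b110] = 1.0  # basis state |110> in the 1D layout

# Tensor layout: one axis of size 2 per qubit, so ndim == num_qubits.
tensor = flat.reshape((2,) * num_qubits)

print("NumPy dimension limit:", max_supported_ndim())
print("tensor ndim:", tensor.ndim, "flat ndim:", flat.ndim)
print("flat index of |110>:", np.ravel_multi_index((1, 1, 0), tensor.shape))
```

With the tensor layout, `ndim` grows with the qubit count and eventually hits the cap reported by the probe; the 1D layout always has `ndim == 1`, whatever the number of qubits.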
QSimSimulator inherits from cirq.SimulatesFinalState, whose simulate_sweep_iter method does not have this new argument. Even though the implementation here can accept any valid input to the function of the parent class, it's still a violation of the API.
I changed the approach to adding a new method rather than extending the existing API.
@95-martin-orion This PR is just a workaround to solve quantumlib/Cirq#6031 until numpy starts to support more than 32 dimensions (numpy/numpy#5744).
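The design choice discussed in this review thread (add a new method, leave the inherited signature alone) can be sketched roughly as follows. This is a toy illustration only: `BaseSimulator` and `MySimulator` are hypothetical stand-ins, not the cirq or qsimcirq classes, and the placeholder state is invented for the example.

```python
import inspect
from abc import ABC, abstractmethod
from typing import Iterator, List, Tuple


class BaseSimulator(ABC):
    """Hypothetical stand-in for an interface such as cirq.SimulatesFinalState."""

    @abstractmethod
    def simulate_sweep_iter(self, program: str, params: List[dict]) -> Iterator[dict]:
        ...


class MySimulator(BaseSimulator):
    def _simulate_impl(self, program, params) -> Iterator[Tuple[dict, List[complex]]]:
        # Shared implementation: yields the raw (resolver, 1D state) pairs.
        for resolver in params:
            yield resolver, [1 + 0j, 0j]  # placeholder state for the sketch

    def simulate_sweep_iter(self, program, params):
        # Same parameters as the base class, so the inherited contract is untouched.
        for resolver, state in self._simulate_impl(program, params):
            yield {"params": resolver, "final_state": state}

    def simulate_into_1d_array(self, program, resolver=None):
        # New, additive method: returns the raw result without wrapping it.
        return next(self._simulate_impl(program, [resolver or {}]))


base_params = list(inspect.signature(BaseSimulator.simulate_sweep_iter).parameters)
impl_params = list(inspect.signature(MySimulator.simulate_sweep_iter).parameters)
print(base_params == impl_params)
```

Callers that only know the base interface keep working unchanged, while callers that need the unwrapped array opt in through the extra method.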
Some docstring requests, otherwise this LGTM. A new qsimcirq version is necessary to make this generally available - would you like me to cut a new release?
Yes, please 😄
Thank you for the myriad fixes, @NoureldinYosri! Logs for the Kokoro error can be found here. I unfortunately don't have much context on this, though I do know that the Kokoro tests are not affected by the
@95-martin-orion from the logs:
It tries to download an old version of the TF runtime that no longer exists: https://storage.googleapis.com/mirror.tensorflow.org/github.com/tensorflow/runtime/archive/4ce3e4da2e21ae4dfcee9366415e55f408c884ec.tar.gz. The versions that are still hosted on storage.googleapis.com/mirror.tensorflow.org are listed at http://mirror.tensorflow.org/. Where does it decide to go for that specific version of the runtime? Looking deeper in the logs, it looks like it bypasses that error and then gets a cuda11 environment, but then decides to look at cuda12.
This looks to be the real problem.
The files for this are stored in Google-internal repositories - I'll email you the links.
@NoureldinYosri thank you once again for this feature. May I know the timeline for the next release for qsim?
I see, thank you, just in time to do huge statevector for Halloween!
@NoureldinYosri there was a delay in using this feature in our production instances. We were waiting for the cuQuantum Appliance to have qsimcirq>=0.17.x (NVIDIA/cuQuantum#98), but it hasn't happened. However, I was able to test this PR by straight-up patching qsimcirq 0.15.0 on cuQuantum Appliance 23.10. I am running a 2xA100 instance with the following code:

    import time
    from memory_profiler import memory_usage
    import cirq
    import qsimcirq

    def f():
        num_qubits = 33
        qc_cirq = cirq.Circuit()
        qubits = cirq.LineQubit.range(num_qubits)
        # Put every qubit into an equal superposition.
        for i in range(num_qubits):
            qc_cirq.append(cirq.H(qubits[i]))
        sim = qsimcirq.QSimSimulator()
        tic = time.time()
        # sim = cirq.Simulator()
        print("?", sim.simulate_into_1d_array)
        sim.simulate_into_1d_array(qc_cirq)
        print("Elapsed", time.time() - tic)

    # print("Max memory", max(memory_usage(f)))
    f()

but still got this OOM error
Here is the benchmark result for 32 qubits (haven't measured GPU memory usage from
Here is the manual patch I applied:
535c535
< def simulate_sweep_iter(
---
> def _simulate_impl(
541,570c541
< ) -> Iterator[cirq.StateVectorTrialResult]:
< """Simulates the supplied Circuit.
<
< This method returns a result which allows access to the entire
< wave function. In contrast to simulate, this allows for sweeping
< over different parameter values.
<
< Avoid using this method with `use_gpu=True` in the simulator options;
< when used with GPU this method must copy state from device to host memory
< multiple times, which can be very slow. This issue is not present in
< `simulate_expectation_values_sweep`.
<
< Args:
< program: The circuit to simulate.
< params: Parameters to run with the program.
< qubit_order: Determines the canonical ordering of the qubits. This is
< often used in specifying the initial state, i.e. the ordering of the
< computational basis states.
< initial_state: The initial state for the simulation. This can either
< be an integer representing a pure state (e.g. 11010) or a numpy
< array containing the full state vector. If none is provided, this
< is assumed to be the all-zeros state.
<
< Returns:
< List of SimulationTrialResults for this run, one for each
< possible parameter resolver.
<
< Raises:
< TypeError: if an invalid initial_state is provided.
< """
---
> ) -> Iterator[Tuple[cirq.ParamResolver, np.ndarray, Sequence[int]]]:
625a597,649
> yield prs, qsim_state.view(np.complex64), cirq_order
>
> def simulate_into_1d_array(
> self,
> program: cirq.AbstractCircuit,
> param_resolver: cirq.ParamResolverOrSimilarType = None,
> qubit_order: cirq.QubitOrderOrList = cirq.ops.QubitOrder.DEFAULT,
> initial_state: Any = None,
> ) -> Tuple[cirq.ParamResolver, np.ndarray, Sequence[int]]:
> """Same as simulate() but returns raw simulation result without wrapping it.
> The returned result is not wrapped in a StateVectorTrialResult but can be used
> to create a StateVectorTrialResult.
> Returns:
> Tuple of (param resolver, final state, qubit order)
> """
> params = cirq.study.ParamResolver(param_resolver)
> return next(self._simulate_impl(program, params, qubit_order, initial_state))
>
> def simulate_sweep_iter(
> self,
> program: cirq.Circuit,
> params: cirq.Sweepable,
> qubit_order: cirq.QubitOrderOrList = cirq.QubitOrder.DEFAULT,
> initial_state: Optional[Union[int, np.ndarray]] = None,
> ) -> Iterator[cirq.StateVectorTrialResult]:
> """Simulates the supplied Circuit.
> This method returns a result which allows access to the entire
> wave function. In contrast to simulate, this allows for sweeping
> over different parameter values.
> Avoid using this method with `use_gpu=True` in the simulator options;
> when used with GPU this method must copy state from device to host memory
> multiple times, which can be very slow. This issue is not present in
> `simulate_expectation_values_sweep`.
> Args:
> program: The circuit to simulate.
> params: Parameters to run with the program.
> qubit_order: Determines the canonical ordering of the qubits. This is
> often used in specifying the initial state, i.e. the ordering of the
> computational basis states.
> initial_state: The initial state for the simulation. This can either
> be an integer representing a pure state (e.g. 11010) or a numpy
> array containing the full state vector. If none is provided, this
> is assumed to be the all-zeros state.
> Returns:
> Iterator over SimulationTrialResults for this run, one for each
> possible parameter resolver.
> Raises:
> TypeError: if an invalid initial_state is provided.
> """
>
> for prs, state_vector, cirq_order in self._simulate_impl(
> program, params, qubit_order, initial_state
> ):
627c651
< initial_state=qsim_state.view(np.complex64), qubits=cirq_order
---
> initial_state=np.complex64, qubits=cirq_order
Something is still consuming much more GPU memory than in the past. I used to be able to do 33 qubits on a 2xA100 instance.
For 32 qubits we have a state vector of Are you sure you could do 33 qubits on this machine? The same calculation gives
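The figures in this comment did not survive in the thread as captured here. As a rough independent estimate, assuming a dense state vector of single-precision complex amplitudes (complex64, 8 bytes each, consistent with the `np.complex64` view in the patch quoted earlier), the size works out as below. The function name is made up for this sketch, and the estimate covers only the state vector itself, not any workspace or staging buffers the simulator may also allocate.

```python
BYTES_PER_AMPLITUDE = 8  # complex64: two 32-bit floats (assumed precision)


def state_vector_gib(num_qubits: int) -> float:
    """Size of a dense state vector in GiB for the given qubit count."""
    return (2**num_qubits) * BYTES_PER_AMPLITUDE / 2**30


for n in (32, 33):
    print(f"{n} qubits: {state_vector_gib(n):.0f} GiB")
```

Each extra qubit doubles the amplitude count, so going from 32 to 33 qubits doubles the footprint from 32 GiB to 64 GiB under these assumptions.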
Yes, we are able to do so on I am in the process of measuring the max GPU memory consumption by polling
Update: all is good! I am able to run 33 qubits on the 2xA100 instance. I confirm this PR works. The bug in the code in #623 (comment) was that I forgot to specify
My measurements (I'm not sure why the GPU memory is that low, but anyway, it works):
The GPU memory is measured by reading the output of
My guess is that the time spent on the GPU is somewhat lower than the interval the
This is to avoid the numpy limit on the number of dimensions quantumlib/Cirq#6031
The 1D representation should only be used when the number of qubits is greater than the numpy limit on the number of dimensions (currently set to 32) numpy/numpy#5744.
fixes quantumlib/Cirq#6031