\chapter{Quantum Mechanics} \label{ch:qm}
\section{State Vectors and Dirac Notation}
In quantum mechanics everything knowable about the state of some system is described by a vector, known as the state vector. The vector belongs to a vector space defined over the field of complex numbers, so it is important to use the correct definition of the inner product (§\ref{sec:vectors-complex}), in which we take the conjugate of one of the vectors, to ensure that the inner product of a vector with itself is a positive real number.
The inner product in this context is written like this:
$$\langle \vec{a}|\vec{b} \rangle$$
Note that this is not the same as the notations we've used for a covector acting on a vector (§\ref{covector}), nor for the inner product (§\ref{inner-product}). If the vector space had real scalars, the distinction between those would be somewhat irrelevant, as in the state vector space we will always be able to choose an orthonormal basis, and so we'll never need to worry about how the metric tensor is defined (it will always be $\delta_{ij}$). But as we're dealing with a complex vector space, even though coordinates are always on an orthonormal basis, the dual covector of a vector has coordinates that are the complex conjugates of the vector's own coordinates, so we need to be careful about the distinction.
If the vectors $\vec{a}$ and $\vec{b}$ are represented by column matrices $a$ and $b$ (to spare ourselves, for the moment, from things we can't imagine, let's assume we're discussing a finite-dimensional vector space), the above is equivalent to:
$$a^{\dagger} \, b$$
We can split the new inner product notation into separate pieces, so we can write $\langle \vec{a}|$ to mean the covector whose matrix representation in some orthonormal basis is a single row containing the complex conjugates of the elements in the single column of the matrix representing $|\vec{a} \rangle$.
It is valid to say that $\langle \vec{a}|$ is the covector of $|\vec{a} \rangle$, which is the same as saying that $\langle \vec{a}|$ is a function that acts on some vector $|\vec{b} \rangle$ to extract the coordinate along $|\vec{a} \rangle$ (when $|\vec{a} \rangle$ is a member of an orthonormal basis), as in the expression $\langle \vec{a}|\vec{b} \rangle$. And in concrete matrix terms we can picture $\langle \vec{a}|$ as a 1-row matrix (a row vector) that is the dual of the 1-column matrix (column vector) $|\vec{a} \rangle$, their corresponding coordinates being mutually complex conjugates.
And with this matrix representation in mind, it follows that we can write them the other way round from the inner product:
$$|\vec{b} \rangle \langle \vec{a} |$$
which must therefore produce a square matrix: the product of a column vector on the left and a row vector on the right. This is the \textit{outer} product. A matrix can act as a vector-valued function of vectors: apply it to a vector to transform that vector to another vector. So:
$$|\vec{b} \rangle \langle \vec{a} | \vec{c} \rangle$$
See how the notation nicely suggests we bracket the $\langle \vec{a} | \vec{c} \rangle$ first as an inner product and thus a mere number. So we immediately know that the result will be the vector $|\vec{b} \rangle$ scaled by a number, i.e. it will be co-linear with $|\vec{b} \rangle$. We've measured $|\vec{c} \rangle$ against $|\vec{a} \rangle$ and used that to scale $|\vec{b} \rangle$.
Given an orthonormal basis $|\vec{b}_n \rangle$, we can picture it as a set of column vectors indexed by $n$; expressed in their own basis they would be the standard basis. The outer product:
$$|\vec{b}_n \rangle \langle \vec{b}_n |$$
will produce a matrix with a single $1$ in one place of the diagonal. So if we sum over all $n$, we get the identity matrix, a matrix that makes no difference to whatever vector it applies to. Spelling this out, if our vector space has just two basis vectors:
$$|\vec{0} \rangle = \begin{bmatrix} 1 \\ 0 \end{bmatrix}$$
and
$$|\vec{1} \rangle = \begin{bmatrix} 0 \\ 1 \end{bmatrix}$$
Then the outer product of $|\vec{0} \rangle$ with itself is just:
$$
|\vec{0} \rangle \langle \vec{0}| =
\begin{bmatrix} 1 \\ 0 \end{bmatrix}
\begin{bmatrix} 1 & 0 \end{bmatrix} =
\begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix}
$$
and likewise of $|\vec{1} \rangle$ with itself:
$$
|\vec{1} \rangle \langle \vec{1}| =
\begin{bmatrix} 0 \\ 1 \end{bmatrix}
\begin{bmatrix} 0 & 1 \end{bmatrix} =
\begin{bmatrix} 0 & 0 \\ 0 & 1 \end{bmatrix}
$$
And as predicted, summing those matrices gives the identity matrix:
$$
\begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix} +
\begin{bmatrix} 0 & 0 \\ 0 & 1 \end{bmatrix} =
\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}
$$
Even if the $|\vec{b}_n \rangle$ were expressed in some other basis, the above summation would still be the identity matrix. Now for any vector $|\vec{c} \rangle$ we can construct for each $n$:
$$|\vec{b}_n \rangle \langle \vec{b}_n | \vec{c} \rangle$$
The $|\vec{b}_n \rangle \langle \vec{b}_n|$ operator is called a projection operator, because it projects its argument onto the subspace spanned by $|\vec{b}_n \rangle$, resulting in a component vector of the argument in the direction of $|\vec{b}_n \rangle$. Clearly if we act with the same projection operator again on that result, nothing will change, because it's already projected. This is a way of defining a projection operator: it's idempotent.
And the sum of all those resulting vectors for all $n$ will just be $|\vec{c} \rangle$, of course, because we've done the equivalent of acting with the identity operator.
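As a concrete check, here is a minimal numerical sketch (our own illustration, assuming Python with \texttt{numpy}): it forms the outer products of the two basis vectors above, confirms each is an idempotent projector, that together they sum to the identity, and that the projections of an arbitrary vector sum back to that vector.
\begin{verbatim}
# Minimal sketch (assumes numpy): outer products of an orthonormal
# basis are projectors that sum to the identity.
import numpy as np

b0 = np.array([1, 0], dtype=complex)   # |0>
b1 = np.array([0, 1], dtype=complex)   # |1>

P0 = np.outer(b0, b0.conj())           # |0><0|
P1 = np.outer(b1, b1.conj())           # |1><1|

assert np.allclose(P0 + P1, np.eye(2))  # completeness: sum is the identity
assert np.allclose(P0 @ P0, P0)         # idempotent: projecting twice changes nothing

c = np.array([3 + 1j, 2 - 2j])
assert np.allclose(P0 @ c + P1 @ c, c)  # the projections sum back to |c>
\end{verbatim}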
\section{Hilbert Spaces}
The vector spaces used to represent physical states are examples of Hilbert spaces, a class that includes most of the familiar examples (e.g. a simple Euclidean vector space is also a Hilbert space). A Hilbert space is an inner product space with certain requirements, but they are loose enough that the class also includes more exotic situations than we encounter elsewhere in physics:
\begin{itemize}
\item scalars may be complex,
\item despite which, there is an inner product that we can use to get a non-negative real number for the modulus of a vector: $\sqrt{\langle a|a \rangle}$, and
\item the space may be infinite dimensional.
\end{itemize}
The latter possibility includes infinities that are continuous (uncountable). Such vectors cannot be represented by a column of discrete values, not even an infinitely long column. Instead we have to specify a complex-valued function over a continuous (real) variable. Such functions can be added and scaled, as is required of a vector, and so they qualify as elements of a vector space (§\ref{sec:vectors-space}) and we therefore have no choice but to admit that they are vectors.
The real parameter of such a function is analogous to the integer index that labels the rows in a column vector; instead of fetching the $i$th component by its position in the column, we evaluate the function with some real value $x$ to get its component "at" $x$.
Similarly, whereas the inner product over discrete components is:
$$
\langle \vec{a} | \vec{b} \rangle
=
\sum_i
a_i^* b_i
$$
the inner product over functions $f$ and $g$ of a real variable $x$ is:
$$
\langle \vec{f} | \vec{g} \rangle
=
\int_{-\infty}^{+\infty}
f(x)^* g(x)
dx
$$
This is also called the overlap integral, because it measures the extent to which the two functions overlap, but it is most definitely also the inner product between two vectors. Thus we can in some sense find the square of the "length" of a function: $\langle \vec{f} | \vec{f} \rangle$. This sounds like gibberish, but it is an unavoidable consequence of the definition of a vector space, which is abstract enough to admit a space of possible functions.
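The overlap integral is straightforward to approximate numerically. A sketch of ours (assuming \texttt{numpy}; the two functions are arbitrary examples chosen for illustration) that discretises the integral on a grid:
\begin{verbatim}
# Sketch: the overlap integral <f|g>, approximated as a sum on a grid.
import numpy as np

x = np.linspace(-10, 10, 2001)
dx = x[1] - x[0]

f = np.exp(-x**2) * np.exp(1j * x)  # an arbitrary complex-valued function
g = np.exp(-(x - 1)**2)             # another, shifted to partially overlap

overlap = np.sum(np.conj(f) * g) * dx            # approximates <f|g>
length_sq = (np.sum(np.conj(f) * f) * dx).real   # approximates <f|f>
\end{verbatim}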
\section{Physical Interpretation}
To interpret the state vector physically, we choose a basis so we can resolve it into components. Our choice of basis has to do with the observable quantity we are presently interested in, such as position, momentum, orientation or energy. If it may take on any real value, the state vector will have to be a function of that value; if it may only take on certain discrete values, it can be a column vector (albeit sometimes one with infinitely many rows) in which each row corresponds to one of those possible discrete values that the observable may exhibit when measured.
The information available from the state vector is, in general, probabilistic. Each component, being a complex number, is related to the probability of the observable quantity taking on the value represented by that component. The squared modulus of the component (its value multiplied by the complex conjugate of its value) is the probability of obtaining that value, or if the state vector is a function $f(x)$, then:
$$\int_{a}^{b} f(x)^* f(x) \, dx$$
is the probability that $x$ will have a value somewhere between $a$ and $b$.
As a probability is a number between $0$ and $1$, the sum of the squared moduli of all the components (or the above integral from $-\infty$ to $+\infty$) must be $1$. This is the same as saying that $\langle S | S \rangle = 1$ for any physically realistic state vector $|S\rangle$. Or to put it another way, the magnitude of a state vector is not significant, only the direction (i.e. the relative values of the components in some basis). We will always fix the magnitude to be $1$.
Unsurprisingly, if one of the components is $1$ and all the others are zero, the vector represents certainty that the observable has the value represented by that component. But this also means that the state vector is equal to one of the basis vectors. Thus the basis vectors for an observable represent exact values that the observable may exhibit when measured.
Further, a measurement of the observable (or more precisely, any interaction producing subsequent behaviour that could be used to infer the value of the observable) causes the state vector to change to the basis vector of that observable corresponding to the measured value. This change is (at least in this theory) assumed to be instantaneous and to have no mechanism that we can deduce anything further about.
Thus after measuring an observable, subsequent measurements of the same observable will with certainty produce the same result.
(This is not quite true in the continuous cases when the state vector is actually a function of a real variable. We don't expect to ever find such a system precisely aligned with a single base state, but instead to have at least some small spread of probabilities.)
\section{Switching Basis}
Having constructed a column representation of a state vector in one basis, relating to one observable, we can switch to another. The operation for doing this will depend on both the "before" and "after" bases (§\ref{sec:vectors-change-basis}). A state vector contains everything knowable about a system, including all we can know about any of its observable quantities. By re-expressing the same state vector as a different set of components in terms of the basis associated with a different observable, we recover the probability distribution for that observable.
The operation that switches basis will be a matrix if we're dealing with a finite-dimensional space, or something analogous to that if the space is infinite. As always when using a matrix to transform a vector's components we need to be clear on whether we want to get a different vector in the same basis or the same vector in a different basis. In this case we're talking about the latter; a state vector represents something physically real, and we're just changing how we describe it. On the other hand the choice of basis is not entirely arbitrary because a basis relates to an observable quantity.
Any basis we change to must still be orthonormal. Therefore the transformation must preserve the inner product between any pairs of vectors, i.e. it must be unitary, and if it's represented by a matrix then the Hermitian conjugate serves as the inverse:
$$\hat{U} \hat{U}^\dagger = I$$
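To make this concrete, a small sketch of ours (the basis is an arbitrary example): a matrix whose rows are the conjugated new basis vectors is unitary and preserves inner products.
\begin{verbatim}
# Sketch: a change of basis between orthonormal bases is unitary.
import numpy as np

u0 = np.array([1, 1j]) / np.sqrt(2)    # an example orthonormal basis
u1 = np.array([1, -1j]) / np.sqrt(2)
U = np.vstack([u0.conj(), u1.conj()])  # rows: conjugated basis vectors

assert np.allclose(U @ U.conj().T, np.eye(2))   # U U-dagger = I

# Inner products between any pair of vectors are preserved:
a, b = np.array([1 + 0j, 2j]), np.array([0.5, 1 + 1j])
assert np.isclose(np.vdot(U @ a, U @ b), np.vdot(a, b))
\end{verbatim}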
\section{Operators Representing Observables} \label{sec:qm-operators1}
Observables have an associated operator, which maps vectors to different vectors. This has nothing to do with a change of basis. In QM when we talk about the observable's associated operator, we are talking about something that is not directly of any use for converting between bases (it is not unitary, for one thing), though it will indicate how we could perform such an operation.
An observable operator can be applied to a state vector as a kind of test, but it is much more powerful when we picture it applying to every possible state vector (that is, all unit vectors in the space) to find out how it affects them.
In QM operators associated with observables are Hermitian or self-adjoint, meaning that for an operator $\hat{O}$:
$$\langle \vec{a}|\hat{O} \vec{b} \rangle = \langle \hat{O} \vec{a}| \vec{b} \rangle$$
This has a few useful implications:
\begin{itemize}
\item in the discrete finite vector case, operators can be represented as a matrix $O$ with $O^{\dagger} = O$, i.e. $O_{ij} = O_{ji}^*$, so the main diagonal elements are real,
\item regardless of representation, eigenvectors (§\ref{sec:vectors-eigen}) with distinct eigenvalues are orthogonal, and the eigenvectors are complete (they span the space, so you can take a unit vector in each of these orthogonal directions and you have an orthonormal basis), and
\item regardless of representation, their eigenvalues are real.
\end{itemize}
Think of the analogy of a Euclidean real plane vector space, and a symmetric $2 \times 2$ matrix $M$ operating on it. The eigenvectors are lines in the plane along which vectors do not change direction, only magnitude, when the operator is applied. Because the matrix is symmetric ($M_{ij} = M_{ji}$) these lines are orthogonal. So it is with a Hermitian operator in a complex space, with only the added complication of needing to be careful about taking the complex conjugate when comparing diagonally opposite elements.
The basis vectors of the observable are just unit vectors that are eigenvectors of the operator. That is, if you apply the observable's operator to every possible state vector, a subset of them will be scaled (by a factor which, for a Hermitian operator, turns out to be real) without any change to their alignment. There will be a set of orthogonal unit vectors that pass this "alignment preserving" test, and these form the basis of the observable.
In other words, quantum mechanics is substantially about:
\begin{itemize}
\item defining the operator for an observable,
\item solving the eigenvalue equation for that operator (that is, finding its eigenvectors and their associated eigenvalues)
\item using the eigenvectors as a basis for representing state vectors,
\item assuming that when the observable is measured, the state will snap into alignment with one of those eigenvectors,
\item interpreting a coordinate in that basis as a complex amplitude whose mod-square is the probability that the state will align itself with that basis vector,
\item interpreting the eigenvalue associated with the basis vector as the measured value of the observable (the eigenvalues of Hermitian operators are real numbers, fortunately.)
\end{itemize}
If a system's state vector matches one of these eigenvectors, then the system is already in an eigenstate and if the observable is measured, the result will with certainty be the eigenvalue associated with that eigenstate.
Otherwise, the state vector will be a linear combination of the eigenstates, and if the observable is measured and found to have a particular value, then the state vector will have instantaneously realigned itself with an eigenvector having that eigenvalue.
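The whole recipe can be played out numerically. In this sketch of ours, an arbitrary Hermitian matrix stands in for an observable's operator; we solve its eigenvalue equation, resolve a state against the eigenbasis, and check that the resulting probabilities sum to one and reproduce the expectation value.
\begin{verbatim}
# Sketch: eigenbasis of a Hermitian operator, probabilities, expectation.
import numpy as np

O = np.array([[2, 1 - 1j],
              [1 + 1j, 3]])           # Hermitian: equals its conjugate transpose

vals, vecs = np.linalg.eigh(O)        # real eigenvalues, orthonormal eigenvectors
psi = np.array([1, 1j]) / np.sqrt(2)  # some unit state vector

amps = vecs.conj().T @ psi            # coordinates of |psi> in the eigenbasis
probs = np.abs(amps)**2               # mod-squared amplitudes

assert np.allclose(probs.sum(), 1)    # probabilities sum to one
expectation = np.vdot(psi, O @ psi).real        # <psi|O|psi>
assert np.isclose(expectation, np.sum(vals * probs))
\end{verbatim}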
For the avoidance of confusion:
\begin{itemize}
\item Applying the operator to the state is not the same as "making a measurement"; that is something you do in the real world, where you may not even have any knowledge of the state prior to the measurement.
\item Applying the operator to the state does not reveal the probability of anything (but see below about finding the expectation value.)
\item Applying the operator to the state is not the same as changing its coordinate representation to be in the basis associated with the observable for that operator.
\end{itemize}
If you apply the operator to a state vector (which is, as always, of unit magnitude) and it is already aligned with an eigenstate, then its alignment will not be affected and its magnitude will change to the experimental value you would get if you measured the observable in a system in that state. If it is not aligned with an eigenstate, the vector's alignment will change, but it will not snap into alignment with an eigenstate, which is what happens when a real world measurement is performed.
But if you solve the eigenvalue equation for the operator, you will know the complete basis of the observable, along with the eigenvalue associated with each basis vector, and you can then resolve your state vector against those basis vectors. Then you will have a set of coordinates that serve as probability amplitudes for the associated (measurable) eigenvalues.
In addition, there is a meaningful interpretation for the result of applying the operator for an observable to a given state vector: the inner product of the resulting vector with the original state vector gives the expectation value (§\ref{sec:expectation-value}) of the observable.
While we've discussed all this in terms of more easily pictured finite-dimensional vectors with discrete complex components, all the same concepts translate to complex-valued functions of an integer or real parameter.
\section{Euclidean Visualisation}
If we simplify radically, keeping to real numbers and two-dimensional Hilbert space, which is to say, the Euclidean plane, it becomes possible to draw pictures and invoke familiar geometric concepts. This is just a real-valued analogy for the complex-valued situations we will encounter in QM, so it has certain limitations, but it is accurate in many important details.
A state $|\psi\rangle$ is a vector. \textit{It is not intrinsically in any basis.} But it is certainly of unit length.
\begin{figure}[h]
\centering
\begin{tikzpicture}
\draw (0,0) circle (1);
\draw[thick,->] (0,0) -- (0.383,0.924) node[anchor=south] {$|\psi\rangle$};
\end{tikzpicture}
\centering
\caption{Anonymous state vector} \label{fig:state-vector}
\end{figure}
Suppose the system can be found to be in one of two moods, $|a\rangle$ (affable) and $|b\rangle$ (bored). These will correspond to two orthonormal state vectors:
\begin{figure}[h]
\centering
\begin{tikzpicture}
\draw (0,0) circle (1);
\draw[thick,->] (0,0) -- (0.383,0.924) node[anchor=south west] {$|\psi\rangle$};
\draw[dashed,->] (0,0) -- (1,0) node[anchor=west] {$|b\rangle$};
\draw[dashed,->] (0,0) -- (0,1) node[anchor=south] {$|a\rangle$};
\end{tikzpicture}
\centering
\caption{Mood basis} \label{fig:mood-basis}
\end{figure}
Naturally $|\psi\rangle$ can be resolved into a pair of coordinates by using the orthonormal vectors of mood as a basis. We are tempted to say that the mood is presently more closely aligned with affable rather than bored, but must always remember that the mood can only ever be measured to be exactly affable or exactly bored (upon which it will snap into alignment with $|a\rangle$ or $|b\rangle$ accordingly). It is just more likely to be found affable, with probability given by the square of the coordinate obtained from the inner product $\langle a|\psi \rangle$.
Also the system can be found to be listening to music in one of two genres, $|c\rangle$ (country) or $|d\rangle$ (disco):
\begin{figure}[h]
\centering
\begin{tikzpicture}
\draw (0,0) circle (1);
\draw[thick,->] (0,0) -- (0.383,0.924) node[anchor=south] {$|\psi\rangle$};
\draw[dashed,->] (0,0) -- (-0.707,0.707) node[anchor=south east] {$|c\rangle$};
\draw[dashed,->] (0,0) -- (0.707,0.707) node[anchor=south west] {$|d\rangle$};
\end{tikzpicture}
\centering
\caption{Genre basis} \label{fig:genre-basis}
\end{figure}
Against this genre basis the coordinates for the same $|\psi\rangle$ are clearly going to be different from what they were in the mood basis, and our $|\psi\rangle$ is leaning more toward disco than country. If you tilt your head to the left\footnote{That is, apply a unitary operator} so as to align $|d\rangle$ with the horizontal, pointing right, and $|c\rangle$ with the vertical, pointing up, then $|\psi\rangle$ will appear to be closer to horizontal than vertical.
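In coordinates this head-tilting is just a change of basis, and in this real-valued analogy the probabilities are the squared coordinates. A sketch of ours, with numbers read off the figures (so only approximate):
\begin{verbatim}
# Sketch: one state resolved against two different orthonormal bases.
import numpy as np

psi = np.array([0.383, 0.924])        # |psi> in some fixed coordinates
a, b = np.array([0.0, 1.0]), np.array([1.0, 0.0])            # mood basis
c, d = np.array([-0.707, 0.707]), np.array([0.707, 0.707])   # genre basis

p_affable = (a @ psi)**2   # ~0.85: more likely affable than bored
p_bored   = (b @ psi)**2   # ~0.15
p_disco   = (d @ psi)**2   # ~0.85: leaning disco over country
p_country = (c @ psi)**2   # ~0.15
\end{verbatim}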
Returning to the mood observable, there is an operator $\hat{M}$ associated with that observable. We can think of the operator as acting on all possible state vectors, represented by an evenly-spaced selection of them.
\begin{figure}[h]
\caption{Effect of operator $\hat{M}$}
\begin{subfigure}{0.5\textwidth}
\centering
\begin{tikzpicture}
\draw (0,0) circle (1);
\draw[->] (0,0) -- (1.000,0.000);
\draw[->] (0,0) -- (0.924,0.383);
\draw[->] (0,0) -- (0.707,0.707);
\draw[->] (0,0) -- (0.383,0.924);
\draw[->] (0,0) -- (0.000,1.000);
\draw[->] (0,0) -- (-0.383,0.924);
\draw[->] (0,0) -- (-0.707,0.707);
\draw[->] (0,0) -- (-0.924,0.383);
\draw[->] (0,0) -- (-1.000,0.000);
\draw[->] (0,0) -- (-0.924,-0.383);
\draw[->] (0,0) -- (-0.707,-0.707);
\draw[->] (0,0) -- (-0.383,-0.924);
\draw[->] (0,0) -- (-0.000,-1.000);
\draw[->] (0,0) -- (0.383,-0.924);
\draw[->] (0,0) -- (0.707,-0.707);
\draw[->] (0,0) -- (0.924,-0.383);
\end{tikzpicture}
\caption{Before} \label{fig:mood-before}
\end{subfigure}
\begin{subfigure}{0.5\textwidth}
\centering
\begin{tikzpicture}
\draw (0,0) circle (1);
\draw[dotted] (0,0) ellipse (0.8 and 1.2);
\draw[thick,->] (0,0) -- (0.800,0.000);
\draw[dotted,->] (0,0) -- (0.739,0.459);
\draw[dotted,->] (0,0) -- (0.566,0.849);
\draw[dotted,->] (0,0) -- (0.306,1.109);
\draw[thick,->] (0,0) -- (0.000,1.200);
\draw[dotted,->] (0,0) -- (-0.306,1.109);
\draw[dotted,->] (0,0) -- (-0.566,0.849);
\draw[dotted,->] (0,0) -- (-0.739,0.459);
\draw[dotted,->] (0,0) -- (-0.800,0.000);
\draw[dotted,->] (0,0) -- (-0.739,-0.459);
\draw[dotted,->] (0,0) -- (-0.566,-0.849);
\draw[dotted,->] (0,0) -- (-0.306,-1.109);
\draw[dotted,->] (0,0) -- (-0.000,-1.200);
\draw[dotted,->] (0,0) -- (0.306,-1.109);
\draw[dotted,->] (0,0) -- (0.566,-0.849);
\draw[dotted,->] (0,0) -- (0.739,-0.459);
\end{tikzpicture}
\caption{After} \label{fig:mood-after}
\end{subfigure}
\end{figure}
After the operator has done its work, the adjusted vectors fit into an ellipse rather than a circle: they are no longer all unit vectors. There are just two directions along which the vectors preserved their alignment: these are the eigenvectors of $\hat{M}$. They are orthogonal. This is how we discovered the basis vectors $|a\rangle$ and $|b\rangle$, by finding the states $|\psi\rangle$ for which:
$$
\hat{M}|\psi\rangle = m|\psi\rangle
$$
where $m$ is just a number, i.e. the directions along which $\hat{M}$ does not change the alignment of the vector, only the length. Also the scaling factor along (say) the $|b\rangle$ direction is the numerical value we would measure for a system in the bored state.
Operators associated with observables, such as $\hat{M}$, are Hermitian, which (in this real vector space with only two orthogonal directions) means they can be represented by a symmetric $2 \times 2$ matrix, and will always have the effect of stretching or squashing along two orthogonal directions, thus picking out the two basis vectors for the observable.
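This is easy to verify numerically; in the following sketch of ours, an arbitrary symmetric matrix only rescales vectors along two orthogonal eigen-directions.
\begin{verbatim}
# Sketch: a symmetric 2x2 matrix stretches along orthogonal eigenvectors.
import numpy as np

M = np.array([[1.0, 0.2],
              [0.2, 0.8]])    # symmetric: the real analogue of Hermitian

vals, vecs = np.linalg.eigh(M)
assert np.isclose(vecs[:, 0] @ vecs[:, 1], 0)   # eigenvectors are orthogonal

v = vecs[:, 0]
assert np.allclose(M @ v, vals[0] * v)          # alignment kept, only rescaled
\end{verbatim}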
And to be complete we should visualise the effect of the genre operator $\hat{G}$.
\begin{figure}[h]
\caption{Effect of operator $\hat{G}$}
\begin{subfigure}{0.5\textwidth}
\centering
\begin{tikzpicture}
\draw (0,0) circle (1);
\draw[->] (0,0) -- (1.000,0.000);
\draw[->] (0,0) -- (0.924,0.383);
\draw[->] (0,0) -- (0.707,0.707);
\draw[->] (0,0) -- (0.383,0.924);
\draw[->] (0,0) -- (0.000,1.000);
\draw[->] (0,0) -- (-0.383,0.924);
\draw[->] (0,0) -- (-0.707,0.707);
\draw[->] (0,0) -- (-0.924,0.383);
\draw[->] (0,0) -- (-1.000,0.000);
\draw[->] (0,0) -- (-0.924,-0.383);
\draw[->] (0,0) -- (-0.707,-0.707);
\draw[->] (0,0) -- (-0.383,-0.924);
\draw[->] (0,0) -- (-0.000,-1.000);
\draw[->] (0,0) -- (0.383,-0.924);
\draw[->] (0,0) -- (0.707,-0.707);
\draw[->] (0,0) -- (0.924,-0.383);
\end{tikzpicture}
\caption{Before} \label{fig:genre-before}
\end{subfigure}
\begin{subfigure}{0.5\textwidth}
\centering
\begin{tikzpicture}
\draw (0,0) circle (1);
\draw[rotate=45,dotted] (0,0) ellipse (0.8 and 1.2);
\draw[rotate=45,thick,->] (0,0) -- (0.800,0.000);
\draw[rotate=45,dotted,->] (0,0) -- (0.739,0.459);
\draw[rotate=45,dotted,->] (0,0) -- (0.566,0.849);
\draw[rotate=45,dotted,->] (0,0) -- (0.306,1.109);
\draw[rotate=45,thick,->] (0,0) -- (0.000,1.200);
\draw[rotate=45,dotted,->] (0,0) -- (-0.306,1.109);
\draw[rotate=45,dotted,->] (0,0) -- (-0.566,0.849);
\draw[rotate=45,dotted,->] (0,0) -- (-0.739,0.459);
\draw[rotate=45,dotted,->] (0,0) -- (-0.800,0.000);
\draw[rotate=45,dotted,->] (0,0) -- (-0.739,-0.459);
\draw[rotate=45,dotted,->] (0,0) -- (-0.566,-0.849);
\draw[rotate=45,dotted,->] (0,0) -- (-0.306,-1.109);
\draw[rotate=45,dotted,->] (0,0) -- (-0.000,-1.200);
\draw[rotate=45,dotted,->] (0,0) -- (0.306,-1.109);
\draw[rotate=45,dotted,->] (0,0) -- (0.566,-0.849);
\draw[rotate=45,dotted,->] (0,0) -- (0.739,-0.459);
\end{tikzpicture}
\caption{After} \label{fig:genre-after}
\end{subfigure}
\end{figure}
It's another Hermitian operator, so it has again picked out two orthogonal directions along which it only applies a scaling.
There is no "true" basis against which a state vector is actually supposed to be measured. Basis vectors are just states that have a particular significance for certain operators.
\begin{figure}[h]
\centering
\begin{tikzpicture}
\draw (0,0) circle (1);
\draw[thick,->] (0,0) -- (0.383,0.924);
\node at (0.500,1.190) {$|\psi\rangle$};
\draw[dashed,->] (0,0) -- (1,0) node[anchor=west] {$|b\rangle$};
\draw[dashed,->] (0,0) -- (0,1) node[anchor=south] {$|a\rangle$};
\draw[dashed,->] (0,0) -- (-0.707,0.707) node[anchor=south east] {$|c\rangle$};
\draw[dashed,->] (0,0) -- (0.707,0.707) node[anchor=south west] {$|d\rangle$};
\end{tikzpicture}
\centering
\caption{No special basis} \label{fig:state-no-special-basis}
\end{figure}
The state could be aligned with $|a\rangle$, so the genre would be uncertain, and then it could become aligned with $|c\rangle$ and then the mood would be uncertain.
To relate all this back to more realistic QM scenarios:
\begin{itemize}
\item instead of restricting to real scalars, we allow complex scalars,
\item as well as just two orthogonal directions in state space, we allow infinitely many, even a continuum (as with position and momentum),
\item we use the Hermitian inner product, taking the complex conjugate on the left, which means that the inner product of a vector with itself will always be real and positive, but the inner product between two different vectors may be complex,
\item operators cannot generally be represented by matrices due to the continuous nature of some state spaces, but where they can, the matrix is Hermitian or self-adjoint, meaning that it is equal to its own conjugate transpose, which is the complex equivalent of a symmetric matrix,
\item orthogonal eigenvectors of a Hermitian operator may in some cases have the same eigenvalue, and thus represent states that cannot be distinguished between by means of a measurement of the observable (picture our circle of state vectors growing or shrinking uniformly in all directions and thus remaining a circle, instead of being distorted into an ellipse),
\item to get a probability from a complex coordinate, we take the modulus squared, to ensure it's a real number.
\end{itemize}
\section{The Wave Function}
One way to approach QM initially is to consider the position and momentum of an electron. These are continuous variables, so we will be working entirely with state vectors that are represented by functions of real variables, and operators that transform functions.
We model this situation as a continuous complex-valued function of position and time, $\Psi(x, y, z, t)$, very often abbreviated to $\Psi$. We will sometimes also consider functions only of space, $\psi$. (This upper/lowercase distinction is quite widespread but not universally observed.)
By considering only one spatial dimension we can picture the wave function at one instant as a line, somewhere along which the electron could be found. At each point $x$ on the line there is an associated complex plane (visualised as normal to the line), with an arrow lying in it, pointing out from the line. This is the complex value of $\Psi$ at that position $x$ and time $t$.
The complex plane should not be confused with vectors. Any given snapshot of $\Psi(x, t)$ at some instant $t$, given by a function $\psi(x)$, is itself an entire vector. The position $x$ labels a single infinitesimal component of the vector, and every such component is a complex number, which we can therefore visualise as a complex plane with an arrow on it.
So for example we could picture the arrows as making a corkscrew shape, rotating around the line such that the angle depends linearly on $x$, but the modulus of the complex value (the length of the arrow) happens to be constant in this example. This is the notional wave function for a free electron (no forces acting on it) with a precisely defined momentum and therefore no defined position, something never observed in reality.
More generally, the arrow length will also vary with $x, t$. The arrow length at $x$ determines the likelihood that the electron will be found at $x$. More precisely, the modulus-squared of $\Psi$, which can be calculated with $\Psi^*\Psi$, is proportional to the probability density:
\begin{equation}
\rho(x) = \Psi^*\Psi
\label{eqn:pdf}
\end{equation}
If $A$ is some region between $x_1$ and $x_2$, the integral:
$$
\alpha =
\int_{x_1}^{x_2}
\Psi^*\Psi
\,dx
$$
is \textit{proportional} to the probability of finding the electron in $A$.
Recall that the product of a complex number and its own complex conjugate is a real number, and here we are doing $\Psi(x)^*\Psi(x)$, using the single complex value at position $x$, so the result will be real. But the complex conjugate is not a general purpose magic way to get a real number from a product of any two complex numbers; $\Psi(x_1)^*\Psi(x_2)$ need not be real.
If we compute the same integral $\beta$ for some larger surrounding region $B$, we can compute the conditional probability:
$$
P(A|B) = \frac{\alpha}{\beta}
$$
That is: the probability of finding the electron in $A$ \textit{given that} it is somewhere in $B$ is given by the fraction $\alpha / \beta$.
If $\Psi$ is suitably behaved (square-integrable; roughly, it goes to zero at some distance and does not become infinite anywhere) then we can compute the integral over the whole of our one dimension of space:
$$
\alpha =
\int_{-\infty}^{+\infty}
\Psi^*\Psi
\,dx
$$
We can then include a factor of $1/\sqrt{\alpha}$ within $\Psi$ to "normalise" it, such that integrating the normalised $\Psi^*\Psi$ over some region will directly give us the absolute (unconditional) probability of finding the electron in that region.
Some interesting things to note at this early stage:
\begin{itemize}
\item For the simple first example of the free electron with definite momentum, normalisation is not possible because the integral over all of space does not converge on a finite value.
\item A global change in the amplitude of the function (scaling the entire function by some complex constant) is not a physically significant change; there is a set of wave functions $a\Psi$ for any complex constant $a$, which all mean the same thing. What matters is how the amplitude varies from place to place (the same will turn out to be true for the complex phase).
\item To normalise, we have to find the sum over all space of the mod-squared wave function. Interpreting the wave function as a vector, we're taking the inner product of the vector with itself, so we are in a sense finding the "length"-squared of the wave function as a vector. We can then use this factor to scale it to be a unit vector, while preserving the relative shape of the wave (that is, preserving the "alignment" of the vector); see the sketch after this list.
\end{itemize}
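Here is a numerical sketch of the normalisation procedure (our own illustration; the wave function is an arbitrary square-integrable example):
\begin{verbatim}
# Sketch: normalise a sample wave function, then read off a probability.
import numpy as np

x = np.linspace(-20, 20, 4001)
dx = x[1] - x[0]

psi = np.exp(-(x - 1)**2) * np.exp(2j * x)  # arbitrary square-integrable example
alpha = np.sum(np.abs(psi)**2) * dx         # integral of psi* psi over all space
psi = psi / np.sqrt(alpha)                  # normalise

region = (x >= 0) & (x <= 2)
p = np.sum(np.abs(psi[region])**2) * dx     # absolute probability of 0 <= x <= 2
assert np.isclose(np.sum(np.abs(psi)**2) * dx, 1)
\end{verbatim}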
\section{Schrödinger Equation}
Any wave can be described as a sum of many simple component waves. (It is interesting that we use the word "component" for these; they are really basis vectors, so in vector terminology the word component should refer to the complex constant factor applied to each simple wave included in the sum.)
Each individual component wave has \textit{two} parameters:
\begin{itemize}
\item if we nominate a fixed point in space, there is a frequency of oscillation, $\nu$
\item if we freeze time, we can measure the wavelength, $\lambda$, the distance between adjacent peaks in space
\end{itemize}
These can be independently adjusted (do not be confused by the familiar example of EM waves, where wavelength and frequency are coupled due to the constant speed of light!)
So the component wave can be described by the complex exponential:
$$
\Psi(x, t) = \exp \left[ 2\pi i(\frac{x}{\lambda} - \nu t) \right]
$$
Pick any fixed point in space, so $x$ is constant, and $\nu$ determines the rate of oscillation. Pick a fixed instant in time, so $t$ is constant, and $\lambda$ determines the distance between peaks. With both in play, we have a corkscrew complex wave pattern that is moving.
Anything we figure out for this model wave can be taken to be true for any linear combination of many such waves, in the sense that we can imagine decomposing some messy wave into a set of components, each component characterised only by two numbers.
Planck inferred the relationship between frequency and energy:
$$\nu = \frac{E}{h}$$
And de Broglie likewise for momentum and wavelength:
$$\lambda = \frac{h}{p}$$
So we can write the wave function very neatly in terms of energy and momentum instead:
$$
\Psi(x, t) = \exp \left[ {\frac{i(px - Et)}{\hbar}} \right]
$$
Nothing much has changed: as before, we have two parameters shaping a complex corkscrew wave. (We use $\hbar = h/2\pi$ for brevity because that combination isn't going away.) All that has changed is that we've got two parameters with a physical interpretation for something we've previously thought of as a "particle".
We can take the partial derivative of the above w.r.t. $t$ or $x$, and the way that works with exponentials is strangely illuminating.
Doing $t$ first:
$$
\frac{\partial \Psi}{\partial t}
=
-\frac{iE}{\hbar}
\exp \left[ {\frac{i(px - Et)}{\hbar}} \right]
$$
The constant factor is copied outside the exponential, which otherwise remains the same. So in fact:
$$
\frac{\partial \Psi}{\partial t}
=
-\frac{iE}{\hbar}
\Psi
$$
We can tidy up by multiplying both sides by $i\hbar$:
$$
i\hbar \frac{\partial \Psi}{\partial t}
= E \Psi
$$
The exact same procedure with $x$ yields:
$$
- i\hbar \frac{\partial \Psi}{\partial x}
= p \Psi
$$
But we can also take the second derivative and get:
$$
- \hbar^2 \frac{\partial^2 \Psi}{\partial x^2}
= p^2 \Psi
$$
Returning to our physical interpretation, a free particle has energy that is purely kinetic, related to its momentum by:
$$
p^2 = 2m E
$$
(This is just $\frac{1}{2}mv^2$ smushed into the definition of momentum, $mv$.)
Substituting the Planck and de Broglie relations:
$$
\frac{h}{2m} = \lambda^2\nu
$$
In general a corkscrew wave is governed by two independent parameters:
\begin{itemize}
\item momentum, which goes with wavelength (and the $x$ coordinate)
\item energy, which goes with frequency (and the $t$ coordinate)
\end{itemize}
We've now coupled them, making them no longer independent. But we've also added a new parameter: the particle's mass. For a free particle of a given mass, if you know the momentum you know the energy, and vice versa. Equivalently, if you know the wavelength you know the frequency, and vice versa.
Returning to the classical relationship between momentum, energy and mass, we can use it to rewrite our expression for $p^2 \Psi$, substituting into the R.H.S. to easily obtain:
$$
- \hbar^2 \frac{\partial^2 \Psi}{\partial x^2}
= 2mE\Psi
$$
And as we also have an expression for $E\Psi$, let's isolate that:
$$
E\Psi =
- \frac{\hbar^2}{2m} \frac{\partial^2 \Psi}{\partial x^2}
$$
and insert our $E\Psi$ expression:
$$
i\hbar \frac{\partial \Psi}{\partial t}
=
- \frac{\hbar^2}{2m} \frac{\partial^2 \Psi}{\partial x^2}
$$
So, recalling that $\Psi$ is an abbreviation for $\Psi(x, t)$, a complex valued function of space and time, now we have a differential equation that relates only these things:
\begin{itemize}
\item $\hbar$, the reduced Planck constant, a universal fixed real number with units of joule-seconds, very accurately determined by experiment, not something we can adjust to fit this equation to different scenarios
\item $i$, which just provides a 90\textdegree phase shift
\item the first partial derivative of $\Psi$ w.r.t. time, which is another function of space and time that tells you how $\Psi$ is changing
\item $m$, the mass of the particle
\item the second partial derivative of $\Psi$ w.r.t. space.
\end{itemize}
This means that from a snapshot $\psi$ (at a specific instant of time) of the wave function of a particle with a known mass, so you have its shape in space, you can find the second derivative of that shape w.r.t. space, then multiply that by $i\hbar/2m$ and you have the first partial derivative of $\Psi$ w.r.t. time. That is, a snapshot contains complete information about the past and future of the wave; it tells you how to compute every past and future state.
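This is easy to verify symbolically for the free-particle component wave. A sketch of ours using \texttt{sympy} (an assumption of this check, not part of the main development): with $E = p^2/2m$, the corkscrew wave satisfies the equation just derived.
\begin{verbatim}
# Sketch (assumes sympy): the plane wave with E = p^2/2m satisfies
# the free-particle equation derived above.
import sympy as sp

x, t, p, m, hbar = sp.symbols('x t p m hbar', real=True)
E = p**2 / (2 * m)
Psi = sp.exp(sp.I * (p * x - E * t) / hbar)

lhs = sp.I * hbar * sp.diff(Psi, t)             # i hbar dPsi/dt
rhs = -hbar**2 / (2 * m) * sp.diff(Psi, x, 2)   # -(hbar^2/2m) d2Psi/dx2
assert sp.simplify(lhs - rhs) == 0
\end{verbatim}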
So far, so kind-of rigorous. The situation becomes vaguer when we introduce a force field acting on the particle.
Schrödinger himself seems to have mostly taken a guess and found that the resulting equation agreed with several previously unexplained experimental results. Many widely used textbooks don't even give any background for it but merely state it. More advanced theory can be used to derive it, e.g. it is a low-energy approximation of QED.
The full classical account of the energy of a particle is:
$$
E = \frac{p^2}{2m} + V
$$
where the potential is a function $V(x)$. Realistically it will also be a function of $t$, but later we're going to pretend it isn't.
Some authors note that by multiplying the above throughout by $\Psi$:
$$
E\Psi = \frac{p^2{\Psi}}{2m} + V{\Psi}
$$
we obtain some scaffolding into which we can plug in our expressions for $E \Psi$ and $p^2 \Psi$:
\begin{equation}
i\hbar \frac{\partial \Psi}{\partial t}
=
- \frac{\hbar^2}{2m} \frac{\partial^2 \Psi}{\partial x^2}
+ V{\Psi}
\label{eqn:se}
\end{equation}
And this is the same as the free particle equation with the added $V\Psi$ term, and is the complete Schrödinger equation which governs the time evolution of $\Psi$.
The extra term doesn't change the important property that if you have a snapshot $\psi(x)$ taken of $\Psi(x, t)$ at a specific initial instant of time, then you know all future states (glossing over what happens when there is any kind of interaction, including measurements).
This is sometimes contrasted with Newton's 2nd law relating acceleration to force, acceleration being the second order derivative of the position w.r.t. time. Each time we integrate we need to conjure up a constant of integration, and we have to integrate acceleration twice to get the position. The two constants we need to supply are the initial position and velocity. Thus a snapshot of the position of a particle is not generally enough to know what is happening to it.
But a snapshot $\psi(x)$ taken of $\Psi(x, t)$ at some time is not just one number, but a continuous function giving a (complex) number at each point $x$ along the line, so it is generously endowed with information. If we decompose the snapshot into component waves, each one has its own wavelength.
And if we multiply $\Psi$ by some constant (possibly complex) factor, the result is still a solution to the equation. Such arbitrary constant scale factors make no difference to the physical meaning; what matters is how the function varies from location to location (and from time to time). This is what allows us to normalise the function (where possible) to ensure that it integrates to 1 over all of space.
\section{Time Evolution}
We can say little here about wave functions unless they can be normalised, i.e. unless they tend to zero at infinity. Assuming this is the case, if we integrate the PDF over all of space:
$$
\int_{-\infty}^{+\infty}
\Psi^*\Psi
\,dx
$$
we expect the result to be constant (if normalised, it should always remain 1 as time passes), i.e.
$$
\frac{d}{d t}
\int_{-\infty}^{+\infty}
\Psi^*\Psi
\,dx
= 0
$$
Note that as we are integrating over $x$, outside the integral $x$ is not a variable. We can move the differentiation w.r.t. $t$ inside the integral, but only if we change it to a partial derivative, because inside the integral $x$ is a variable:
$$
\int_{-\infty}^{+\infty}
\frac{\partial}{\partial t}
\Psi^*\Psi
\,dx
= 0
$$
Focusing on the inside of the integral, by the product rule:
$$
\frac{\partial}{\partial t} \, \Psi^*\Psi
=
\frac{\partial \Psi^*}{\partial t} \Psi
+
\frac{\partial \Psi}{\partial t} \Psi^*
$$
Now, the Schrödinger equation gives us an expression for the partial time derivative of the wave function by slightly rearranging \eqref{eqn:se}:
$$
\frac{\partial \Psi}{\partial t}
=
\frac{i \hbar}{2m} \frac{\partial^2 \Psi}{\partial x^2}
- \frac{i V}{\hbar}{\Psi}
$$
From this we can get the same for the complex conjugate:
$$
\frac{\partial \Psi^*}{\partial t}
=
- \frac{i \hbar}{2m} \frac{\partial^2 \Psi^*}{\partial x^2}
+ \frac{i V}{\hbar}{\Psi^*}
$$
Plugging those into our expression:
$$
\frac{\partial}{\partial t} \, \Psi^*\Psi
=
\left[
- \frac{i \hbar}{2m} \frac{\partial^2 \Psi^*}{\partial x^2}
+ \frac{i V}{\hbar}\Psi^*
\right] \Psi
+
\left[
\frac{i \hbar}{2m} \frac{\partial^2 \Psi}{\partial x^2}
- \frac{i V}{\hbar}\Psi
\right] \Psi^*
$$
Multiplying out:
$$
\frac{\partial}{\partial t} \, \Psi^*\Psi
=
- \frac{i \hbar}{2m} \frac{\partial^2 \Psi^*}{\partial x^2}
\Psi
+ \frac{i V}{\hbar}\Psi^*\Psi
+
\frac{i \hbar}{2m} \frac{\partial^2 \Psi}{\partial x^2}
\Psi^*
- \frac{i V}{\hbar}\Psi\Psi^*
$$
The second and fourth terms cancel each other:
$$
\frac{\partial}{\partial t} \, \Psi^*\Psi
=
- \frac{i \hbar}{2m} \frac{\partial^2 \Psi^*}{\partial x^2}
\Psi
+
\frac{i \hbar}{2m} \frac{\partial^2 \Psi}{\partial x^2}
\Psi^*
$$
Also there's a common factor we can pull out:
$$
\frac{\partial}{\partial t} \, \Psi^*\Psi
=
\frac{i \hbar}{2m}
\left[
\frac{\partial^2 \Psi}{\partial x^2}\Psi^*
- \frac{\partial^2 \Psi^*}{\partial x^2}\Psi
\right]
$$
Recall that we are working out an expression for this because it appears inside an integral over all space:
$$
\int_{-\infty}^{+\infty}
\frac{i \hbar}{2m}
\left[
\frac{\partial^2 \Psi}{\partial x^2}\Psi^*
- \frac{\partial^2 \Psi^*}{\partial x^2}\Psi
\right]
dx
$$
Now the fundamental theorem of calculus tells us that integration is the inverse of differentiation, so there is clearly some redundancy here in that we are taking the second partial derivative w.r.t. $x$ only to then integrate over all $x$. Indeed, the bracketed expression is itself the $x$-derivative of a simpler expression: differentiate the inner bracket below with the product rule, and the mixed first-derivative terms cancel, leaving exactly the bracketed expression above. To make this explicit:
\begin{equation}
\frac{\partial}{\partial t} \, \Psi^*\Psi
=
\frac{i \hbar}{2m} \
\left[
\frac{\partial}{\partial x}
\left(
\frac{\partial \Psi}{\partial x}\Psi^*
- \frac{\partial \Psi^*}{\partial x}\Psi
\right)
\right]
\label{eqn:qm-byparts}
\end{equation}
The integral and the partial differentiation w.r.t. $x$ cancel out to give us an expression that we can evaluate at the two limits and take the difference:
$$
\frac{d}{d t}
\int_{-\infty}^{+\infty}
\Psi^*\Psi
\,dx
=
\frac{i \hbar}{2m}
\left[
\frac{\partial \Psi}{\partial x}\Psi^*
- \frac{\partial \Psi^*}{\partial x}\Psi
\right]
\bigg\rvert_{-\infty}^{+\infty}
$$
If we do that, we will have an expression for the rate of change, w.r.t. time, of the integral of $\Psi^*\Psi$ over all space.
But at these limits, we've said $\Psi$ goes to zero, so as to be normalisable, making the whole expression zero at those limits. So in fact we've shown that, as we wanted:
$$
\frac{d}{d t}
\int_{-\infty}^{+\infty}
\Psi^*\Psi
\,dx
= 0
$$
So if it is possible to normalise a wave function at all, and it satisfies \eqref{eqn:se}, then the constant of normalisation lives up to its name: it is the same for all time.
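A numerical check of this (our own sketch; the Crank-Nicolson scheme is a standard discretisation choice, not something derived here): evolving a sample wave packet under \eqref{eqn:se} with a unitary finite-difference step leaves the total probability fixed.
\begin{verbatim}
# Sketch: Crank-Nicolson evolution of a sample packet conserves the norm.
import numpy as np

hbar, m = 1.0, 1.0                  # natural units, purely for illustration
x = np.linspace(-40, 40, 800)
dx, dt = x[1] - x[0], 0.01
n = len(x)

V = np.zeros(n)                     # free particle
psi = np.exp(-(x + 10)**2) * np.exp(2j * x)
psi /= np.sqrt(np.sum(np.abs(psi)**2) * dx)

# Hamiltonian: -(hbar^2/2m) second difference, plus V on the diagonal
lap = (np.diag(np.ones(n - 1), -1) - 2 * np.eye(n)
       + np.diag(np.ones(n - 1), 1)) / dx**2
H = -(hbar**2 / (2 * m)) * lap + np.diag(V)

# Crank-Nicolson: (1 + i dt H/2hbar) psi_new = (1 - i dt H/2hbar) psi_old
A = np.eye(n) + 0.5j * dt / hbar * H
B = np.eye(n) - 0.5j * dt / hbar * H
for _ in range(50):
    psi = np.linalg.solve(A, B @ psi)

assert np.isclose(np.sum(np.abs(psi)**2) * dx, 1.0)   # norm unchanged
\end{verbatim}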
\section{Motion}
Given this abstract notion of an electron being entirely represented by a complex-valued function of position, how can we make sense of an electron moving?
Supposing the wave function is more concentrated in some region, it makes sense to compute the expectation value of the position variable:
$$
\langle x \rangle =
\int_{-\infty}^{+\infty}
x \, \rho(x)
\,dx
$$
Substituting our definition of $\rho$ from \eqref{eqn:pdf}:
$$
\langle x \rangle =
\int_{-\infty}^{+\infty}
x \, \Psi^*\Psi
\,dx
$$
remembering always that $\Psi$ is short for $\Psi(x, t)$, so $\langle x \rangle$ is also a function of $t$, and so this gives us a way of thinking about motion: the way the expectation value of the position changes with time.
$$
\frac{d}{dt} \langle x \rangle =
\frac{d}{dt}
\int_{-\infty}^{+\infty}
x \, \Psi^*\Psi
\,dx
$$
We can rearrange to move the derivative inside the integral, giving:
$$
\frac{d}{dt} \langle x \rangle =
\int_{-\infty}^{+\infty}
x \frac{\partial}{\partial t}
\, \Psi^*\Psi
\,dx
$$
Like before, because it's the $t$-derivative of something that depends on $x$, once inside the integral over $x$ we write it as a partial derivative, and $x$ is treated as a constant for that derivative.
And borrowing from \eqref{eqn:qm-byparts} we can rewrite this as:
$$
\frac{d}{dt} \langle x \rangle =
\frac{i \hbar}{2m}
\int_{-\infty}^{+\infty}
x
\frac{\partial}{\partial x} \
\left(
\frac{\partial \Psi}{\partial x}\Psi^*
- \frac{\partial \Psi^*}{\partial x}\Psi
\right)
\,dx
$$
This isn't as simple as before where we cancelled out the integration and the differentiation, because of the pesky $x$. But the good news is this is the easiest ever opportunity for integration by parts. Recall:
$$
\int
u
\frac{dv}{dx}
dx = uv -
\int
v
\frac{du}{dx}
dx
$$
So $u$ is just $x$, and $dv/dx$ is the derivative of the bracketed expression, making $v$ the bracketed expression itself:
$$
v =
\frac{\partial \Psi}{\partial x}\Psi^*
- \frac{\partial \Psi^*}{\partial x}\Psi
$$
Plugging them in:
$$
x
\left(
\frac{\partial \Psi}{\partial x}\Psi^*
- \frac{\partial \Psi^*}{\partial x}\Psi
\right)
\bigg\rvert_{-\infty}^{+\infty}
-
\int_{-\infty}^{+\infty}
\left(
\frac{\partial \Psi}{\partial x}\Psi^*
- \frac{\partial \Psi^*}{\partial x}\Psi
\right)
\frac{dx}{dx}
dx
$$
As before, with $\Psi$ vanishing at infinity the first term can be removed, and of course $dx/dx$ is $1$. Finally the above is just the integral from our $\langle x \rangle$ expression, so:
$$
\frac{d}{dt} \langle x \rangle = -
\frac{i \hbar}{2m}
\int_{-\infty}^{+\infty}
\left(
\frac{\partial \Psi}{\partial x}\Psi^*
- \frac{\partial \Psi^*}{\partial x}\Psi
\right)
dx
$$
Having unwrapped one layer with integration by parts we can pull the same trick with $\frac{\partial \Psi^*}{\partial x}\Psi$, with $u = \Psi$ and $v = \Psi^*$, which once again means the $uv$ term is zero, leaving:
$$
-
\int_{-\infty}^{+\infty}
\frac{\partial \Psi}{\partial x}
\Psi^*
\,dx
$$
So putting this back into our expression for $\frac{d}{dt} \langle x \rangle$:
$$
\frac{d}{dt} \langle x \rangle = -
\frac{i \hbar}{2m}
\int_{-\infty}^{+\infty}
\left(
\frac{\partial \Psi}{\partial x}\Psi^*
+ \frac{\partial \Psi}{\partial x}\Psi^*
\right)
dx
$$
The two identical terms combine into a factor of $2$ that cancels the $2$ in the denominator of the fraction, so:
$$
\frac{d}{dt} \langle x \rangle = -
\frac{i \hbar}{m}
\int_{-\infty}^{+\infty}
\frac{\partial \Psi}{\partial x}\Psi^*
dx
$$
If we think of the rate of change of $\langle x \rangle$ as the expectation value of the velocity, or $\langle v \rangle$, we can multiply by $m$ to get $\langle p \rangle$, which cancels the $m$ in the denominator.
$$
\langle p \rangle = -
i \hbar
\int_{-\infty}^{+\infty}
\frac{\partial \Psi}{\partial x}\Psi^*
dx
$$
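To ground this, a numerical sketch of ours: for a sample Gaussian packet modulated by $e^{ik_0x}$ (an arbitrary choice), the integrals above give $\langle x \rangle$ near the packet's centre and $\langle p \rangle$ near $\hbar k_0$.
\begin{verbatim}
# Sketch: <x> and <p> computed directly from the integrals above.
import numpy as np

x = np.linspace(-20, 20, 4001)
dx = x[1] - x[0]
hbar, k0 = 1.0, 2.0                 # natural units; k0 sets the momentum

psi = np.exp(-(x - 1)**2) * np.exp(1j * k0 * x)
psi /= np.sqrt(np.sum(np.abs(psi)**2) * dx)

x_exp = np.sum(x * np.abs(psi)**2) * dx               # <x>, close to 1
dpsi_dx = np.gradient(psi, dx)
p_exp = (-1j * hbar
         * np.sum(np.conj(psi) * dpsi_dx) * dx).real  # <p>, close to hbar*k0
\end{verbatim}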
\section{Operators Again} \label{sec:qm-operators2}
Another way to describe what we're doing here is rediscovering operators. To apply an operator $\hat{O}$ and get its expectation value $\langle O \rangle$, the recipe is:
$$
\langle O \rangle =
\int_{-\infty}^{+\infty}
\Psi^*
\hat{O}
\Psi
\,dx
$$
How does this relate to our previous discussion about observable operators (§\ref{sec:qm-operators1})? We said that the operator for an observable is Hermitian, so it has orthogonal eigenvectors, and if the state vector is equal to an eigenvector then the observable, when measured, will be certain to equal the eigenvalue of that eigenvector. Our wave function at an instant in time $\psi(x)$ is a vector. To get a coordinate from that vector, we evaluate the function for some position $x$, and so the vector has a "coordinate" for every point in space. Therefore it is a vector expressed in the "position basis".
If the particle is very precisely localised, the function's value (the coordinates) will be zero everywhere except at that precise location. At the theoretical extreme, it will be zero everywhere except at an infinitesimal single position (§\ref{sec:fourier-spike}). That is, it will be a basis vector in the position basis.
An observable operator has to scale its eigenvectors by the value that would be measured for a state equal to that eigenvector. That is exactly what happens if we multiply $\psi(x)$ by $x$: if it is a pure spike (a complex value of modulus $1$) at some position $x_1$, and zero everywhere else, the spike (and thus the whole vector) will be scaled by the value $x_1$. Whereas if it isn't a pure spike (not a position eigenvector), each non-zero value will be multiplied by a different value (its own position value), which will distort the shape of the function (or equivalently, change the "direction" and magnitude of the vector).
We also mentioned in passing that if we apply an observable's operator to a specific state vector, we get an adjusted vector, and if we take the inner product between the original state vector and the adjusted vector, the resulting scalar value will be the expectation value of the observable. So this is just another way of writing down the above integral:
$$
\langle O \rangle =
\langle \Psi| \hat{O} | \Psi \rangle
$$
Because $\Psi$ is a function of $x$ and $t$, by integrating over all $x$ we get a function of time, telling us the evolving expectation value of whatever observable the operator represents. To remove a little complexity we'll switch to considering an instant of time so $\psi(x)$ is all we need.
This "operator sandwich" pattern is intuitively sensible when we apply the position operator to a wave function of position, because this fits precisely with how we understand the expectation value to be computed: it is the sum of every possible value multiplied by its probability of occurring. $\hat{x}|\psi\rangle$ is just $x \psi(x)$. If we multiply that by $\psi(x)^*$ then it will be $x$ multiplied by the probability of measuring the position to be $x$; clearly then the integral over all space will be the expectation value $\langle x \rangle$.
So in the position basis, the position operator $\hat{x}$ is just $x$ itself:
$$
\langle x \rangle =
\int_{-\infty}^{+\infty}
\psi(x)^*
\hat{x}
\psi(x)
\,dx
=
\int_{-\infty}^{+\infty}
\psi(x)^*
x
\psi(x)
\,dx
$$
The momentum operator $\hat{p}$, which we discovered above by looking for the expectation value of momentum, is $-i\hbar\frac{\partial}{\partial x}$: