Graduate Texts in Physics
Thanu Padmanabhan
Quantum
Field Theory
The Why, What and How
Graduate Texts in Physics
Series Editors
Kurt H. Becker, PhD
New York, USA
Jean-Marc Di Meglio
Paris, France
Sadri Hassani
Urbana, Illinois, USA
Bill Munro
Kanagawa, Japan
Richard Needs
Cambridge, UK
William T. Rhodes
Boca Raton, Florida, USA
Professor Susan Scott
Canberra, Australia
Professor H. Eugene Stanley
Boston, Massachusetts, USA
Martin Stutzmann
Garching, Germany
Andreas Wipf
Jena, Germany
Graduate Texts in Physics publishes core learning/teaching material for graduate and advanced-level undergraduate
courses on topics of current and emerging fields within physics, both pure and applied. These textbooks serve students
at the MS- or PhD-level and their instructors as comprehensive sources of principles, definitions, derivations,
experiments and applications (as relevant) for their mastery and teaching, respectively. International in scope and
relevance, the textbooks correspond to course syllabi sufficiently to serve as required reading. Their didactic style,
comprehensiveness and coverage of fundamental material also make them suitable as introductions or references for
scientists entering, or requiring timely knowledge of, a research field.
More information about this series at http://www.springer.com/series/8431
Thanu Padmanabhan
Quantum Field Theory
The Why, What and How
Thanu Padmanabhan
Inter-University Centre for Astronomy and Astrophysics
Pune, India
ISSN 1868-4513
ISSN 1868-4521 (electronic)
Graduate Texts in Physics
ISBN 978-3-319-28171-1
ISBN 978-3-319-28173-5 (eBook)
DOI 10.1007/978-3-319-28173-5
Library of Congress Control Number: 2016930164
Springer Cham Heidelberg New York Dordrecht London
© Springer International Publishing Switzerland 2016
This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of the material is concerned, specifically the
rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and
transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or
hereafter developed.
The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a
specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.
The publisher, the authors and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the
date of publication. Neither the publisher nor the authors or the editors give a warranty, express or implied, with respect to the material contained
herein or for any errors or omissions that may have been made.
Printed on acid-free paper
Springer International Publishing AG Switzerland is part of Springer Science+Business Media (www.springer.com)
Preface
Paraphrasing the author of a quantum mechanics textbook, one might say
that there is as much need for yet another book on QFT as there is for a
New Revised Table of Integers! So let me explain why I am writing this
book and how it will enrich the existing literature and fill a niche.
Standard textbooks in QFT go along a well-trodden path: Classical
fields, their quantization, interactions and perturbation theory, tree-level
computation of cross-sections using Feynman diagrams, divergences and
renormalization as a way to handle them, etc. etc. Depending on the level
of the textbook, the above topics will be discussed at different levels of
sophistication and could be extended further towards e.g., gauge theories
or the standard model. The calculational techniques you learn will be
good enough to compute all the usual stuff in high energy physics, which
is broadly the aim of most of these textbooks.
My interactions with bright graduate students, post-docs and theoretical
physicists over the last three decades have made me realize some of the
shortcomings of learning QFT exclusively along these lines. Let me list a
few of them.
(1) Many smart students have asked me the following question: “If
I know the classical action for a non-relativistic particle, I can construct
the corresponding quantum theory using a path integral. I do know the
classical action for a relativistic particle. Why can’t I use it and construct
the quantum theory of the relativistic particle and be done with it?”
This question goes to the core of QFT but — as far as I know — it is
not answered directly in any of the existing textbooks. In fact, many high
energy physicists (and at least one author of a QFT book) did not know
that the exact, nonquadratic, path integral for the relativistic particle can
be calculated in closed form! Sure, every book pays lip service to why the
single particle description is inadequate when you bring quantum theory
and special relativity together, why the existence of antiparticles is a key
new feature and all that. But this is done in a bit of a hurry, after which the
author proceeds quickly to the classical field theory and its quantization! In
other words, what these books do is to start from fields and obtain particles
as their quanta rather than demonstrate that if you start from a relativistic
particle you will be led to the concept of fields.
This is the first question I will address in Chapter 1. I will show, using
the exact evaluation of the relativistic path integral, that if you start from
the path integral quantization of the relativistic particle you will arrive at
the notion of a field in a satisfactory manner. This route, from particles to
fields (instead of from fields to particles) should be a welcome addition to
the existing literature.
(2) The way renormalization is introduced in most of the textbooks
is conceptually somewhat unsatisfactory. Many graduate students come
away with the impression that renormalization is a trick to get meaningful
answers out of divergent expressions, and do not understand clearly the
distinction between regularization and renormalization. (The notable exceptions are students and textbooks with a condensed matter perspective, which do a better job in this regard.) Many authors feel that the Wilsonian perspective of quantum field theory is a bit too "advanced" for an introductory level course. In fact it is not, and students grasp it with a lot more ease (and with a lot less misunderstanding) than the more conventional approach. I will take the Wilsonian point of view as a backdrop right from the beginning, so that the student needs to re-learn very little as she progresses and will have no fear of the so-called "advanced" concepts.
(3) A closely related issue is the discussion of non-perturbative phenomena, which is conspicuous by its absence in almost all textbooks. This,
in turn, makes students identify some concepts like renormalization too
strongly with perturbation theory. Further, it creates difficulties for many
students who are learning QFT as a prelude to specializing in areas where
non-perturbative techniques are important. For example, students who
want to work in gravitational physics, like QFT in external gravitational
fields or some aspects of quantum gravity, find that the conventional perturbation theory approach — which is all that is emphasized in most QFT
textbooks — leaves them rather inadequately prepared.
To remedy this, I will introduce some of the non-perturbative aspects
of QFT (like, for example, particle production by external sources) before
the students lose their innocence through learning perturbation theory and
Feynman diagrams! Concepts like electromagnetic charge renormalization
and running coupling constants, for example, will be introduced through
the study of pair production in external electromagnetic fields, thereby
divorcing them conceptually from the perturbation theory. Again, some of
these topics (like e.g., the effective Lagrangian in QED) which are usually
considered “advanced”, are actually quite easy to grasp if introduced in an
appropriate manner and early on. This will make the book useful for a
wider class of readers whose interest may not be limited to just computing
Feynman diagrams in the standard model.
(4) Most textbooks fail to do justice to several interesting and curious
phenomena in field theory because of the rather rigid framework in which
they operate. For example, the Davies-Unruh effect, which teaches us that
concepts like “vacuum state” and “particle” can change in a non-inertial
frame, is a beautiful result with far-reaching implications. Similarly, the
Casimir effect is an excellent illustration of non-trivial consequences of free-field theory in appropriate circumstances. Most standard textbooks do not
spend an adequate amount of time discussing such fascinating topics which
are yet to enter the mainstream of high energy physics. I will dip into
such applications of QFT whenever possible, which should help students
to develop a broader perspective of the subject.
Of course, a textbook like this is useless if, at the end of the day, the
student cannot calculate anything! Rest assured that this book develops
at adequate depth all the standard techniques of QFT as well. A student
who reaches the last chapter would have computed the anomalous magnetic
moment of the electron in QED, and would have worked out the two-loop
renormalization of the λφ4 theory. (You will find a description of the
individual chapters of the book in the Chapter Highlights, just after
this Preface.)
In addition to nearly 80 Exercises sprinkled throughout the text in
the marginal notes, I have also included 18 Problems (with solutions) at
the end of the book. These vary significantly in their difficulty levels;
some will require the student to fill in the details of the discussion in the
text, some will illustrate additional concepts extending the ideas in the
text, and some others will be applications of the results in the text in
different contexts. This should make the book useful for self-study. So,
by mastering the material in this book, the student will learn both the
conceptual foundations as well as the computational techniques. The latter
aspect is the theme of many of the excellent textbooks which are available
and the student can supplement her education from any of them.
The readership for this book is very wide. Senior undergraduates, graduate students and researchers interested in high energy physics and quantum field theory will find this book very useful. (I expect the student to
have some background in advanced quantum mechanics and special relativity in four-vector notation. A previous exposure to the nonrelativistic
path integral will help, but is not essential.) The approach I have taken
will also attract readership from people working in the interface of gravity
and quantum theory, as well as in condensed matter theory. Further, the
book can be easily adapted for courses in QFT of different durations.
The approach presented here has been tested out in the classroom. I have been teaching selected topics in QFT to graduate students for about three decades in order to train them, by and large, to work in the area of the interface between QFT and General Relativity. My lectures have covered many of the issues I have described above. The approach was well-appreciated, and the students found it useful and enlightening. The feedback I got very often was that my course, taught at the graduate student level, came a year too late!
Prompted by all this, in 2012, I gave a 50-hour course on QFT to
Masters-level students in physics at the University of Pune. (In the Indian
educational system, these are the students who will proceed to Ph.D. graduate school in the next year.) The lectures were videographed and made
available from my institute website to a wider audience. (These are now
available on YouTube; you can access a higher resolution video by sending a
request to
[email protected].) From the classroom feedback as well as
from students who used the videos, I learnt that the course was a success.
This book is an expanded version of the course.
Many people have contributed to this venture and I express my gratitude to them. To begin with, I thank the Physics Department of Pune
University for giving me the opportunity to teach this course to their students in 2012, and many of these students for their valuable feedback. I
thank Suprit Singh who did an excellent job of videographing the lectures.
Sumanta Chakraborty and Hamsa Padmanabhan read through the entire
first draft of the book and gave detailed comments. In addition, Swastik
Bhattacharya, Sunu Engineer, Dawood Kothawala, Kinjalk Lochan, Aseem
Paranjape, Sudipta Sarkar, Sandipan Sengupta, S. Shankaranarayanan and
Tejinder Pal Singh read through several of the chapters and offered comments. I thank all of them. Angela Lahee of Springer initiated this project
and helped me through its completion, displaying considerable initiative
and accommodating my special formatting requirements involving marginal
notes.
This book would not have been possible without the dedicated support
from Vasanthi Padmanabhan, who not only did the entire LaTeXing and
formatting but also produced most of the figures. I thank her for her help.
It is a pleasure to acknowledge the library facilities available at IUCAA,
which were useful in this task.
T. Padmanabhan
Pune, September 2015.
Chapter Highlights
1. From Particles to Fields
The purpose of this chapter is to compute the path integral amplitude for the propagation of a free relativistic particle from the event
x1 to the event x2 and demonstrate how the concept of a field emerges
from this description. After introducing (i) the path integral amplitude and (ii) the standard Hamiltonian evolution in the case of a
non-relativistic particle, we proceed to evaluate the propagator for a
relativistic particle. An investigation of the structure of this propagator will lead to the concept of a field in a rather natural fashion. You
will see how the standard unitary evolution, propagating forward in
time, requires an infinite number of degrees of freedom for the proper
description of (what you thought was) a single relativistic particle. In
the process, you will also learn a host of useful techniques related
to propagators, path integrals, analytic extension to imaginary time,
etc. I will also clarify how the approach leads to the notion of the
antiparticle, and why causality requires us to deal with the particle
and antiparticle together.
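Stated compactly (a schematic summary in standard notation, with ℏ = c = 1; the detailed derivation and normalization conventions are those of the text), the object computed in this chapter is

```latex
G(x_2;x_1) \;=\; \sum_{\text{paths}} e^{-m\,\ell(\text{path})}
\;=\; \int \frac{d^4p}{(2\pi)^4}\, e^{-ip\cdot(x_2-x_1)}\,
\frac{i}{p^2-m^2+i\epsilon}\,,
```

which is precisely the Feynman propagator of a free scalar field; recognizing this structure in the particle path integral is what forces the field concept on us.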
2. Disturbing the Vacuum
The purpose of this — relatively short — chapter is to introduce you
to the key aspect of QFT, viz., that particles can be created and destroyed. Using an external, classical scalar source J(x), we obtain the
propagator for a relativistic particle from general arguments related
to the nature of creation and destruction events. The discussion then
introduces functional techniques and shows how the notion of the
field again arises, quite naturally, from the notion of particles which
can be created or destroyed by external sources.
By the end of the first two chapters, you would have firmly grasped
how and why combining the principles of relativity and quantum theory demands a concept like the field (with an infinite number of degrees of freedom), and would have also mastered several mathematical
techniques needed in QFT. These include path integrals, functional
calculus, evaluation of operator determinants, analytic properties of
propagators and the use of complex time methods.
3. From Fields to Particles
Having shown in the first two chapters how the quantum theory of a
relativistic particle naturally leads to the concept of fields, we next
address the complementary issue of how fields lead to particles. After
rapidly reviewing the action principle in classical mechanics, we make
a seamless transition from mechanics to field theory. This is followed
by a description of the (i) real and (ii) complex scalar fields and (iii)
the electromagnetic field. Two key concepts in modern physics —
spontaneous symmetry breaking and the notion of gauge fields — are
introduced early on and in fact, the electromagnetic field will come
in as a classical U(1) gauge field.
I then describe the quantization of real and complex scalar fields —
which is fairly straightforward — and connect up with the ideas introduced in chapters 1 and 2. The discussion will compare the transition from particles to fields vis-a-vis from fields to particles, thereby
strengthening conceptual understanding of both perspectives. The
idea of particles arising as excitations of the fields naturally brings
in the notion of Bogoliubov transformations. Using this, it is easy to
understand the Unruh-Davies effect, viz., that the vacuum state in an
inertial frame appears as a thermal state in a uniformly accelerated
frame.
We next take up the detailed description of the quantization of the
electromagnetic field. I do this first in the radiation gauge in order
to get the physical results quickly and to explain the interaction of
matter and radiation. This is followed by the covariant quantization
of the gauge field which provides an opportunity to introduce the
Faddeev-Popov technique in the simplest possible context, and to familiarize you with the issues that arise while quantizing a gauge field.
Finally, I provide a detailed description of the Casimir effect which is
used to introduce — among other things — the notion of dimensional
regularization.
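To give the flavor of the result obtained there (quoted here in its standard form; the derivation is in the text): for parallel conducting plates with separation a, the regularized vacuum energy per unit area and the resulting attractive force per unit area are

```latex
\frac{E}{A} = -\frac{\pi^2 \hbar c}{720\, a^3}\,, \qquad
\frac{F}{A} = -\frac{\partial}{\partial a}\!\left(\frac{E}{A}\right)
= -\frac{\pi^2 \hbar c}{240\, a^4}\,,
```

a finite answer extracted from a formally divergent mode sum, e.g. through the analytic continuation ζ(−3) = 1/120.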
4. Real Life I: Interactions
Having described the free quantum fields, we now turn to the description of interacting fields. The standard procedure in textbooks
is to introduce perturbation theory, obtain the Feynman rules, calculate physical processes, and then introduce renormalization as a
procedure to tackle the divergences in the perturbation theory, etc.
For the reasons I described in the Preface, I think it is better to start
from a non-perturbative approach, through the concept of effective
action.
I will do this both for λφ4 theory and for electromagnetic field coupled
to a complex scalar field. In both the cases, one is led to the concept of
renormalization group and that of running coupling constants. These,
in turn, allow us to introduce the Wilsonian approach to QFT, which
is probably the best language available to us today to understand
QFT. The notion of effective action also leads to the Schwinger effect,
viz., the production of charged particles by a strong electric field. This
effect is non-analytic in the electromagnetic coupling constant, and
hence cannot be obtained by perturbation theory.
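The non-analyticity shows up in the exponential of the Schwinger pair-production probability; for a constant electric field E and a particle of charge q, the leading behaviour (quoted in its standard form) is

```latex
P \;\propto\; \exp\!\left(-\frac{\pi m^2 c^3}{qE\hbar}\right),
```

which has an essential singularity at q = 0: every coefficient of its Taylor series in the coupling vanishes, so no finite order of perturbation theory can capture it.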
After having discussed the non-perturbative effects, I turn to the standard perturbation theory for the λφ4 case and obtain the usual Feynman diagrams (using functional integral techniques) and describe how
various processes are calculated. This allows us to connect up themes
like the effective Lagrangian and the running coupling constant from
both perturbative and non-perturbative perspectives.
5. Real Life II: Fermions and QED
Up to this point, I have avoided fermions in order to describe the
issues of QFT in a simplified setting. This last chapter is devoted
to the description of fermions and, in particular, QED. The Dirac
equation is introduced in a slightly novel way through the relativistic
square root of pa pa (leading to the Dirac matrices γa), after discussing
the corresponding nonrelativistic square root of p² (leading to σ · p) and
the Pauli equation. Having
obtained the Dirac equation, I describe the standard lore related to
Dirac matrices and obtain the magnetic moment of the electron. I
then proceed to discuss the quantization of the Dirac field, paying
careful attention to the role of causality in fermionic field quantization. The path integral approach to fermionic fields is introduced
through Grassmann variables (which are developed to the extent required)
and once again, we will begin with non-perturbative features like the
Schwinger effect for electrons, before discussing perturbation theory
and the Feynman rules in QED.
Finally, I provide a detailed discussion of QED at one loop and its renormalization. This will allow, as an example, the computation of the
anomalous magnetic moment of the electron, which many consider to
be the greatest triumph of QED. The discussion of one loop diagrams
in QED also allows the study of renormalization in the perturbative
context, and connects up the "running" of the electromagnetic coupling
constant computed by the perturbative and non-perturbative techniques.
Contents

1 From Particles to Fields
  1.1 Motivating the Quantum Field Theory
  1.2 Quantum Propagation Amplitude for the Non-relativistic Particle
    1.2.1 Path Integral for the Non-relativistic Particle
    1.2.2 Hamiltonian Evolution: Non-relativistic Particle
    1.2.3 A Digression into Imaginary Time
    1.2.4 Path Integral for the Jacobi Action
  1.3 Quantum Propagation Amplitude for the Relativistic Particle
    1.3.1 Path Integral for the Relativistic Particle
  1.4 Mathematical Structure of G(x2; x1)
    1.4.1 Lack of Transitivity
    1.4.2 Propagation Outside the Light Cone
    1.4.3 Three Dimensional Fourier Transform of G(x2; x1)
    1.4.4 Four Dimensional Fourier Transform of G(x2; x1)
    1.4.5 The First Non-triviality: Closed Loops and G(x; x)
    1.4.6 Hamiltonian Evolution: Relativistic Particle
  1.5 Interpreting G(x2; x1) in Terms of a Field
    1.5.1 Propagation Amplitude and Antiparticles
    1.5.2 Why do we Really Need Antiparticles?
    1.5.3 Aside: Occupation Number Basis in Quantum Mechanics
  1.6 Mathematical Supplement
    1.6.1 Path Integral From Time Slicing
    1.6.2 Evaluation of the Relativistic Path Integral

2 Disturbing the Vacuum
  2.1 Sources that Disturb the Vacuum
    2.1.1 Vacuum Persistence Amplitude and G(x2; x1)
    2.1.2 Vacuum Instability and the Interaction Energy of the Sources
  2.2 From the Source to the Field
    2.2.1 Source to Field: Via Functional Fourier Transform
    2.2.2 Functional Integral Determinant: A First Look at Infinity
    2.2.3 Source to the Field: via Harmonic Oscillators
  2.3 Mathematical Supplement
    2.3.1 Aspects of Functional Calculus

3 From Fields to Particles
  3.1 Classical Field Theory
    3.1.1 Action Principle in Classical Mechanics
    3.1.2 From Classical Mechanics to Classical Field Theory
    3.1.3 Real Scalar Field
    3.1.4 Complex Scalar Field
    3.1.5 Vector Potential as a Gauge Field
    3.1.6 Electromagnetic Field
  3.2 Aside: Spontaneous Symmetry Breaking
  3.3 Quantizing the Real Scalar Field
  3.4 Davies-Unruh Effect: What is a Particle?
  3.5 Quantizing the Complex Scalar Field
  3.6 Quantizing the Electromagnetic Field
    3.6.1 Quantization in the Radiation Gauge
    3.6.2 Gauge Fixing and Covariant Quantization
    3.6.3 Casimir Effect
    3.6.4 Interaction of Matter and Radiation
  3.7 Aside: Analytical Structure of the Propagator
  3.8 Mathematical Supplement
    3.8.1 Summation of Series
    3.8.2 Analytic Continuation of the Zeta Function

4 Real Life I: Interactions
  4.1 Interacting Fields
    4.1.1 The Paradigm of the Effective Field Theory
    4.1.2 A First Look at the Effective Action
  4.2 Effective Action for Electrodynamics
    4.2.1 Schwinger Effect for the Charged Scalar Field
    4.2.2 The Running of the Electromagnetic Coupling
  4.3 Effective Action for the λφ4 Theory
  4.4 Perturbation Theory
    4.4.1 Setting up the Perturbation Series
    4.4.2 Feynman Rules for the λφ4 Theory
    4.4.3 Feynman Rules in the Momentum Space
  4.5 Effective Action and the Perturbation Expansion
  4.6 Aside: LSZ Reduction Formulas
  4.7 Handling the Divergences in the Perturbation Theory
    4.7.1 One Loop Divergences in the λφ4 Theory
    4.7.2 Running Coupling Constant in the Perturbative Approach
  4.8 Renormalized Perturbation Theory for the λφ4 Model
  4.9 Mathematical Supplement
    4.9.1 Leff from the Vacuum Energy Density
    4.9.2 Analytical Structure of the Scattering Amplitude

5 Real Life II: Fermions and QED
  5.1 Understanding the Electron
  5.2 Non-relativistic Square Roots
    5.2.1 Square Root of pα pα and the Pauli Matrices
    5.2.2 Spin Magnetic Moment from the Pauli Equation
    5.2.3 How does ψ Transform?
  5.3 Relativistic Square Roots
    5.3.1 Square Root of pa pa and the Dirac Matrices
    5.3.2 How does ψ Transform?
    5.3.3 Spin Magnetic Moment from the Dirac Equation
  5.4 Lorentz Group and Fields
    5.4.1 Matrix Representation of a Transformation Group
    5.4.2 Generators and their Algebra
    5.4.3 Generators of the Lorentz Group
    5.4.4 Representations of the Lorentz Group
    5.4.5 Why do we use the Dirac Spinor?
    5.4.6 Spin of the Field
    5.4.7 The Poincare Group
  5.5 Dirac Equation and the Dirac Spinor
    5.5.1 The Adjoint Spinor and the Dirac Lagrangian
    5.5.2 Charge Conjugation
    5.5.3 Plane Wave Solutions to the Dirac Equation
  5.6 Quantizing the Dirac Field
    5.6.1 Quantization with Anticommutation Rules
    5.6.2 Electron Propagator
    5.6.3 Propagator from a Fermionic Path Integral
    5.6.4 Ward Identities
    5.6.5 Schwinger Effect for the Fermions
    5.6.6 Feynman Rules for QED
  5.7 One Loop Structure of QED
    5.7.1 Photon Propagator at One Loop
    5.7.2 Electron Propagator at One Loop
    5.7.3 Vertex Correction at One Loop
  5.8 QED Renormalization at One Loop
  5.9 Mathematical Supplement
    5.9.1 Calculation of the One Loop Electron Propagator
    5.9.2 Calculation of the One Loop Vertex Function

A Potpourri of Problems
Annotated Reading List
Index

A Potpourri of Problems (with Solutions)
  Evaluation of ⟨0|φ(x)φ(y)|0⟩ for Spacelike Separation
  Number Density and the Non-relativistic Limit
  From [φ, π] to [a, a†]
  Counting the Modes between the Casimir Plates
  Effective Potential for m = 0
  Anomalous Dimension of the Coupling Constant
  Running of the Cosmological Constant
  Feynman Rules for Scalar Electrodynamics
  Two-loop Contribution to the Propagator in the λφ4 Theory
  Strong Field Limit of the Effective Lagrangian in QED
  Structure of the Little Group
  Path Integral for the Dirac Propagator
  Dirac Propagator as a Green Function
  An Alternative Approach to the Ward Identities
  Chiral Anomaly
  Compton Scattering: A Case Study
  Photon Mass from Radiative Corrections in d = 2
  Electron Self-energy with a Massive Photon
Chapter 1
From Particles to Fields
1.1 Motivating the Quantum Field Theory
Quantum field theory is a set of rules using which high energy physicists
can “shut up and calculate” the behaviour of particles at high energies.
The resulting predictions match incredibly well with experiments, showing
that these rules must have (at least) a grain of truth in them. While a
practitioner of the art might be satisfied in inventing and using these rules
(and picking up the resultant Nobel prizes), some students might be curious
to know the conceptual motivation behind these computational rules and
their relationship to the non-relativistic quantum mechanics (NRQM) they
are familiar with. One natural question such a student asks is the following:
Given the classical description of a non-relativistic particle in an external potential, one can obtain the corresponding quantum description in a
fairly straightforward way1. Since we do have a well-defined classical theory of relativistic particles as well, why can’t we just quantise it and get
a relativistically invariant quantum theory? Why do we need a quantum
field theory?
Most textbooks of quantum field theory raise this question in their
early part and tell you that you need the quantum field theory because any
theory which incorporates quantum mechanics and relativity has to be a
theory in which the number of particles (and even the identity of particles) is
not conserved. If you collide an electron with a positron at high enough
energies, you may end up getting a plethora of other particle-antiparticle
pairs. Even at low energies, you can annihilate an electron with a positron
and get a couple of photons. Clearly, if you try to write a (relativistically
invariant version of the) Schrodinger equation for an interacting electron-positron pair, you wouldn’t know what to do once they have disappeared
and some other particles have appeared on the scene.
This fact, while quite important, is only one part of the story. Condensed matter physics is full of examples in which phonons, magnons and
other ‘-ons’ keep appearing and disappearing and — in fact — field theory is
the most efficient language to describe these situations. But, in principle,
one could have also written down the non-relativistic Schrodinger equation for, say, all the electrons and lattice ions in a solid and described the
same physics. This situation is quite different from relativistic quantum
theory, wherein the basic principles demand the existence of antiparticles,
for which there is no simple non-relativistic analogue. Obviously, this has
1 Issues like spin or the Pauli exclusion principle cannot be understood
within non-relativistic quantum mechanics; so the students who stop at
NRQM must learn to live with familiarity rather than understanding!
² Historically, we started with the wave nature of radiation and the particle nature of the electron. The complementary description of particles as waves arises even in the non-relativistic theory, while the description of waves as particles requires relativity. This is why, even though quantum theory had its historical origins in the realization that light waves behave as though they are made of particles ('photons'), the initial efforts were more successful in describing the wave nature of particles like electrons, because you could deal with a non-relativistic system, rather than in describing the particle nature of waves.
Chapter 1. From Particles to Fields
extra implications that go beyond the fact that we need to deal with systems having a variable number of particles, which, by itself, can be handled comparatively easily.
The second complication is the following: Combining relativity with quantum theory also forbids the localization of a particle in an arbitrarily small region of space. In non-relativistic mechanics, one can work either with momentum eigenstates |p⟩ or position eigenstates |x⟩ for a particle with equal ease. You can localize a particle in space with arbitrary accuracy, if you are willing to sacrifice the knowledge about its momentum. In relativistic quantum mechanics, all sorts of bad things happen if you try to localize a particle of mass m to a region smaller than ℏ/mc. This will correspond to an uncertainty in the momentum of the order of mc, leading to an uncertainty in the energy of the order of mc². This, in turn, can lead to the production of particle-antiparticle pairs; so the single particle description breaks down if you try to localize the particles in a small enough region. Mathematically, the momentum eigenstates |p⟩ continue to be useful in relativistic quantum mechanics, but not the position eigenstates |x⟩. The concept of a precise position for a particle, within a single particle description, ceases to exist in relativistic quantum mechanics. (We will say more about this in Sect. 1.4.6.)
The third issue is the existence of spin. In the case of an electron, one might try to imagine its spin as arising out of some kind of intrinsic rotation, though this leads to a hopelessly incorrect picture. In the case of a massless spin-1 particle like the photon, there is no rest frame and it is not clear how to even define a concept like 'intrinsic rotation'; but we do know that the photon also has two spin states (in spite of being a spin-1 particle). Classically, the electromagnetic wave has two polarization states and one would like to think of them as translating into the two spin states of photons. But since the photon is massless and is never at rest, it is difficult to understand how an individual photon can have an intrinsic rotation attributed to it. Spin, as we shall see, is a purely relativistic concept.
A feature closely related to spin is the Pauli exclusion principle. Particles with half-integral spin (called fermions) obey this principle, which states that no two identical fermions can be put in the same state. If we have a Schrödinger wave function ψ(x₁, x₂, ⋯, x_N) in NRQM describing the state of a system of N fermions, say, electrons, then this wave function must be antisymmetric under the exchange of any pair of coordinates x_i ↔ x_j. Why particles with half-integral spin should behave in this manner cannot be explained within the context of NRQM. It is possible to understand this result in quantum field theory, albeit after introducing a fairly formidable amount of theoretical machinery, as arising due to the structure of the Lorentz group, causality, locality and a few other natural assumptions. Once again, adding relativity to quantum theory leads to phenomena which are quite counter-intuitive and affects the structure of NRQM itself.
There are at least two more good reasons to think of fields, rather than particles, as the fundamental physical entities. The first reason is the existence of the electromagnetic field, which has classical solutions that describe propagating waves.² Experiments, however, show that electromagnetic waves also behave like a bunch of particles (photons), each having a momentum (p) and energy (ω_p = |p|). We need to come up with 'rules' that will get us the photons starting from a card-carrying field, if there ever
was one, which requires one to somehow “quantise” the field.
Second, there are certain phenomena, like spontaneous symmetry breaking, which are most easily understood in terms of fields and the ground states of fields, which are described as condensates. (We will say more about this in Sect. 3.2.) This is very difficult to interpret completely in terms of the particle picture, though we know that there is a formal equivalence between the two. In general, interactions between the particles are best described using the language of fields, more so when there is some funny business like spontaneous symmetry breaking going on. So one needs to be able to proceed from particles to fields and vice versa to describe the real world, which is what quantum field theory attempts to do.
Having said something along these lines, we can, at this stage, jump to the description of classical fields and their quantization, which is what most textbooks do. I will do it differently.
I will actually show you how a straightforward attempt to describe relativistic free particles in a quantum mechanical language leads to the concept of fields. We will do this from different perspectives, demonstrating how these perspectives merge together in a consistent manner. In Chapter 1, we will start with a single, relativistic, spinless particle and, by evaluating the probability amplitude for this particle to propagate from an event P to an event Q in spacetime, we will be led to the concept of quantum fields as well as to the notion of antiparticles. These ideas will be reinforced in Chapter 2, where we will take a closer look at the creation and subsequent detection of spinless particles by external agencies. Once again, we will be led to the notion of a quantum field and the action functional for the same, starting from the amplitude for propagation.
Having obtained the fields from the particles and established the raison d'être for the quantum fields, we will reverse the logic and obtain the (relativistic) particles as excitations of the quantum fields in Chapter 3. This is more in tune with the conventional textbook discussion and has the power to deal with particles which have a nonzero spin, like, for example, the photons. In Chapter 4 we will introduce the interaction between the fields in the simplest possible context, still dealing with scalar and electromagnetic fields. Finally, Chapter 5 will introduce you to the strange features of fermions and QED.
The conceptual consistency of this formalism — viewed from different
perspectives — as well as the computational success, leading to results
which are in excellent agreement with the experiments, are what make one
believe that the overall description cannot be far from the truth, in spite
of several counterintuitive features we will come across (and tackle) as we
go along.
1.2 Quantum Propagation Amplitude for the Non-relativistic Particle
Given the classical dynamics of a free, spin-zero, non-relativistic particle of mass m, we can proceed to its quantum description using either the action functional A or the Hamiltonian H. In the case of a free particle, nothing much happens, and the only interesting quantity to compute is the amplitude A for detecting a particle at an event x₂ ≡ x^a_2 ≡ (t₂, x₂) if it was created at an event x₁ ≡ x^a_1 ≡ (t₁, x₁). This is given by the product: A =
NOTATION
— We will work in a (1+D) spacetime (though we are usually interested in the case of D = 3).
— We use the mostly negative signature, hopefully consistently.
— We take ℏ = 1, c = 1 most of the time.
— x₁, x₂, ... etc. with no indices, as well as x^a, p^i, ..., will denote spacetime vectors, with Latin indices taking the range of values a, i, ... = 0, 1, ..., D.
— p, x, ... etc. denote D-dimensional spatial vectors.
— Greek indices, like in x^α, p^μ etc., vary over space: α, μ, ... = 1, 2, ..., D.
— While taking Fourier transforms, the integrals are taken over ∫ f(x) exp(−ip·x) d^D x and ∫ f(p) exp(ip·x) [d^D p/(2π)^D], and similarly for spacetime Fourier transforms.
— Both p·x and px stand for p_i x^i. We use px when p⁰ is a specific function of p given by the 'on-shell' value. We use the notation p·x for p_i x^i = p⁰x⁰ − p·x when p⁰ is unspecified.
— The symbol ≡ in an equation indicates that it defines a new variable or a notation.
D(x₂)G(x₂; x₁)C(x₁), where D(x₂) is the amplitude for detection, C(x₁) is the amplitude for creation and G(x₂; x₁) is the amplitude for the particle to propagate from one event x₁ to another event x₂. (For example, the beaten-to-death electron two-slit experiment involves an electron gun to create electrons and a detector on the screen to detect them.) The resulting probability |A|² is expected to have an invariant, absolute, meaning.
The creation and detection processes, described by C(x1 ), D(x2 ), turn
out to be far more non-trivial than one would have first imagined and we
will have a lot to say about them in Sect. 2.1. In this section we will
take these as provided by our experimentalist friend and concentrate on
G(x2 ; x1 ).
1.2.1 Path Integral for the Non-relativistic Particle
We will start with the non-relativistic particle and obtain the amplitude G(x₂; x₁), first using the path integral (in this section) and then using the Hamiltonian (in the next section). The path integral prescription says that G(x₂; x₁) can be expressed as a sum over paths x(t) connecting the two events, in the form

$$ G(x_2; x_1) = \sum_{x(t)} \exp\, iA[x(t)] = \sum_{x(t)} \exp\left[\, i \int_{t_1}^{t_2} \frac{1}{2}\, m |\dot{\mathbf{x}}|^2\, dt \right] \tag{1.1} $$

[Figure 1.1: Examples of paths included in the sum in Eq. (1.1).]

[Figure 1.2: A path that goes forward and backward, which is not included in the sum in Eq. (1.1).]
The paths summed over are restricted to those which satisfy the following condition: any given path x(t) cuts the spatial hypersurface t = y⁰, at any intermediate time t₂ > y⁰ > t₁, at only one point. In other words, while doing the sum over paths, we are restricting ourselves to paths of the kind shown in Fig. 1.1 (which always go 'forward in time') and will not include, for example, paths like the one shown in Fig. 1.2 (which go both forward and backward in time). The path in Fig. 1.2 cuts the constant time surface t = y⁰ at three events, suggesting that at t = y⁰ there were three particles simultaneously present even though we started out with one particle. It is this feature which we avoid (and stick to single particle propagation) by imposing this condition on the class of paths that is included in the sum. By the same token, we will assume that the amplitude G(x₂; x₁) vanishes for x⁰₂ < x⁰₁; the propagation is forward in time.
This choice of paths, in turn, implies the following 'transitivity constraint' for the amplitude:

$$ G(x_2; x_1) = \int d^D y \; G(x_2; y)\, G(y; x_1) \tag{1.2} $$
The integration at an intermediate event y^i = (y⁰, y) (with t₂ > y⁰ > t₁) is limited to integration over the spatial coordinates, because each of the paths summed over cuts the intermediate spatial surface at only one point. Therefore, every path which connects the events x₂ and x₁ can be uniquely specified by the spatial location y at which it crosses the surface t = y⁰. So the sum over all the paths can be divided into the sum over all the paths from x^i_1 to some location y at t = y⁰, followed by the sum over all the paths from y^i to x^i_2, with an integration over all the locations y at the intermediate time t = y⁰. This leads to Eq. (1.2).
The transitivity condition in Eq. (1.2) is vital for the standard probabilistic interpretation of the wave function in non-relativistic quantum
mechanics. If ψ(t₁, x₁) is the wave function giving the amplitude to find a particle at x₁ at time t₁, then the wave function at a later time t = y⁰ is given by the integral:

$$ \psi(y^0, \mathbf{y}) = \int d^D x_1 \; G(y; x_1)\, \psi(t_1, \mathbf{x}_1) \tag{1.3} $$
which interprets G(y; x₁) as a propagator kernel, allowing us to determine the solution to the Schrödinger equation at a later time t = y⁰ from its solution at t = t₁. Writing the expression for ψ(t₂, x₂) in terms of ψ(y⁰, y) and G(x₂; y), and using Eq. (1.3) to express ψ(y⁰, y) in terms of ψ(t₁, x₁), it is easy to see that Eq. (1.2) is needed for consistency. Equation (1.2) or Eq. (1.3) also implies the condition:

$$ G(t, \mathbf{x};\, t, \mathbf{y}) = \langle t, \mathbf{x} | t, \mathbf{y} \rangle = \delta(\mathbf{x} - \mathbf{y}) \tag{1.4} $$
where |t, x⟩ is a position eigenstate at time t. Three crucial factors have gone into these seemingly innocuous results:³
(i) The wave function at time t can be obtained from knowing only the wave function at an earlier time (without, e.g., knowing its time derivative), which means that the differential equation governing ψ must be first order in time.
(ii) One can introduce eigenstates |t, x⟩ of the position operator x̂(t) at time t by x̂(t)|t, x⟩ = x|t, x⟩, so that ψ(t, x) = ⟨t, x|ψ⟩, with Eq. (1.4) allowing the possibility of localizing a particle in space with arbitrary accuracy.
(iii) One can interpret G(x₂; x₁) in terms of the position eigenstates as ⟨t₂, x₂|t₁, x₁⟩.
The result of the integral in Eq. (1.2) has to be independent of y⁰, because the left hand side is independent of y⁰. This is a strong restriction on the form of G(x₂; x₁). The transitivity condition, plus the fact that the free particle amplitude G(x₂; x₁) can only depend on |x₂ − x₁| and (t₂ − t₁) because of translational and rotational invariance, fixes the form of G(x₂; x₁) to a great extent. To see this, express G(x₂; x₁) in terms of its spatial Fourier transform in the form:

$$ G(y; x) = \theta(y^0 - x^0) \int \frac{d^D p}{(2\pi)^D}\; F(|\mathbf{p}|;\, y^0 - x^0)\; e^{i\mathbf{p}\cdot(\mathbf{y}-\mathbf{x})} \tag{1.5} $$
³ As you will see, all of them will run into trouble in the context of a relativistic particle.
and substitute into Eq. (1.2). This will lead to the condition

$$ F(|\mathbf{p}|;\, x_2^0 - y^0)\, F(|\mathbf{p}|;\, y^0 - x_1^0) = F(|\mathbf{p}|;\, x_2^0 - x_1^0) \qquad (x_2^0 > y^0 > x_1^0) \tag{1.6} $$
Exercise 1.1: Prove all these claims.
which has the unique solution F(|p|; t) = exp[α(|p|)t], where α(|p|) is an arbitrary function of |p|. Further, we note that F(|p|; y⁰ − x⁰) propagates the momentum space wave function φ(x⁰, p) (which is the spatial Fourier transform of ψ(x⁰, x)) from t = x⁰ to t = y⁰. Since φ is the Fourier transform of ψ, this "propagation" is just multiplication by F. The probability calculated from the momentum space wave function will be well behaved for |t| → ∞ only if α is pure imaginary, thereby only contributing a phase. So α = −if(|p|), where f(|p|) is an arbitrary function⁴ of |p|. Thus, the spatial Fourier transform of G(x₂; x₁) must have the form

$$ \int d^D x \; G(x_2; x_1)\, e^{-i\mathbf{p}\cdot\mathbf{x}} = \theta(t)\, e^{-i f(|\mathbf{p}|)\, t} \tag{1.7} $$
⁴ You can also obtain the same result from the fact that exp(iA) goes to exp(−iA) under the time reversal t₂ ⇔ t₁; the path integral sum must be defined such that F → F* under t → −t, which requires α to be pure imaginary.
where x^a ≡ x^a_2 − x^a_1 ≡ (t, x). That is, it must be a pure phase⁵ determined
⁵ If we interpret this phase as due to the energy ω_p = p²/2m and set f(p) = ω_p, then an inverse Fourier transform of Eq. (1.7) will immediately determine G(x₂; x₁), leading to our result obtained below in Eq. (1.9). But then we would be getting ahead of the story and missing all the fun!
by a single function f(|p|).
The sum over paths in Eq. (1.1) itself is trivial to evaluate, even without us defining precisely what the sum means. (The more sophisticated definitions for the sum work, or rather are designed to work, only because we know the answer for G(x₂; x₁) from other well-founded methods!) We first note that the sum over all x(t) is the same as the sum over all q(t) ≡ x(t) − x_c(t), where x_c(t) is the classical path for which the action is an extremum. Because of the extremum condition, A[x_c + q] = A[x_c] + A[q]. Substituting into Eq. (1.1) and noting that q(t) vanishes at the end points, we see that the sum over q(t) must be just a function of (t₂ − t₁). (It can only depend on the time difference, rather than on t₂ and t₁ individually, whenever the action has no explicit time dependence; i.e., for any closed system.) Thus we get
$$ G(x_2; x_1) = e^{iA[x_c]} \sum_{q} e^{iA[q(t)]} = N(t)\, \exp\, iA[x_c] \equiv N(t)\, \exp\left[ \frac{i}{2} \frac{m|\mathbf{x}|^2}{t} \right] \tag{1.8} $$

Exercise 1.2: Prove this result for N(t).

⁶ Obviously, a similar argument works for any action which is at most quadratic in ẋ and x, and the sum over paths will have the form N(t₂, t₁) exp(iA[x_c(t)]) even when the action has an explicit time dependence. If the action has no explicit time dependence, we further know that N(t₂, t₁) = N(t), which is the only nontrivial quantity left to compute. If the action is not quadratic in ẋ and x, nobody knows how to do the sum over paths, except in a few rare cases, and we need to use less fancy (but more powerful) methods to quantise the system.

⁷ Note that we write px ≡ (ω_p t − p·x) even when p^i is not a four-vector. The combination (ω_p t − p·x), as well as the relations in Hamilton-Jacobi theory, E = −∂_t A, p = ∇A (which give you 'relativity without relativity', p_i = −∂_i A), have their origins in wave phenomena and not in relativity.
where t ≡ t₂ − t₁ and x ≡ x₂ − x₁. The form of N(t) is strongly constrained by the transitivity condition, Eq. (1.2), or, equivalently, by Eq. (1.7), which requires N(t) to have the form (m/2πit)^{D/2} e^{at} with a = iϕ, say. Thus, except for an ignorable, constant, phase factor ϕ (which is equivalent to adding a constant to the original Lagrangian), N(t) is given by (m/2πit)^{D/2}, and we can write the full propagation amplitude for a non-relativistic particle as:⁶
$$ G(x_2; x_1) = \theta(t) \left( \frac{m}{2\pi i t} \right)^{D/2} \exp\left[ \frac{i}{2} \frac{m|\mathbf{x}|^2}{t} \right] \tag{1.9} $$
The θ(t) tells you that we are considering a particle which is created, say,
at t1 and detected at t2 with t2 > t1 . In non-relativistic mechanics all
inertial observers will give an invariant meaning to the statement t2 > t1 .
It is also easy to see that the G(x2 ; x1 ) in Eq. (1.9) satisfies the condition
in Eq. (1.4).
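It is instructive to spell out this last verification (this short check is ours, not part of the original text): the Fourier representation implied by Eq. (1.7) shows that the t → 0⁺ limit of Eq. (1.9) is the completeness integral for plane waves,

$$ \lim_{t \to 0^+} G(x_2; x_1) = \lim_{t \to 0^+} \int \frac{d^D p}{(2\pi)^D}\, e^{i\mathbf{p}\cdot\mathbf{x} - i f(|\mathbf{p}|)\, t} = \int \frac{d^D p}{(2\pi)^D}\, e^{i\mathbf{p}\cdot\mathbf{x}} = \delta^D(\mathbf{x}) $$

which is precisely the condition in Eq. (1.4), with x = x₂ − x₁; the sharply peaked complex Gaussian in Eq. (1.9) tends to a delta function as t₂ → t₁.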
1.2.2 Hamiltonian Evolution: Non-relativistic Particle
Let us next review briefly how these ideas connect up with the more familiar
Hamiltonian approach to NRQM. We know that the behaviour of a free
non-relativistic particle is governed by a Hamiltonian H(p) = p2 /2m and
that the free particle wave functions are made out of exp(ip · x). To connect
up the path integral result with such a Hamiltonian description, we only
have to Fourier transform the amplitude G(x2 ; x1 ) in Eq. (1.9) with respect
to x, getting

$$ G(\mathbf{p}, t) = \int G(x_2; x_1)\, e^{-i\mathbf{p}\cdot\mathbf{x}}\; d^D x = \theta(t)\, \exp(-i\omega_p t) \tag{1.10} $$
where ω_p ≡ p²/2m is the energy of the particle of momentum p. [This form, of course, matches with the result in Eq. (1.7).] Thus G(x₂; x₁) can be expressed, by an inverse Fourier transform, in the form⁷

$$ G(x_2; x_1) = \theta(t) \int \frac{d^D p}{(2\pi)^D}\; e^{i(\mathbf{p}\cdot\mathbf{x} - \omega_p t)} = \theta(t) \int \frac{d^D p}{(2\pi)^D}\; e^{-ipx} \tag{1.11} $$
We will now obtain the same result from first principles in order to set
the ground for the relativistic particle later on. Let us start by rapidly
reviewing some basic concepts in quantum theory.
If Â(t) is an operator at time t in the Heisenberg picture, then its eigenvalue equation can be written in the form: Â(t)|t, a⟩ = a|t, a⟩. The eigenkets |t, a⟩ are associated with the operator at time t and carry the time label t as well as the eigenvalue a. By convention, the eigenket at t = 0 will be denoted without the time label, simply as |a⟩ ≡ |0, a⟩. We will work with closed systems for which the Hamiltonian does not have explicit time dependence. In that case, the evolution equation for the operators, i∂_t Â = [Â, Ĥ], has the solution

$$ \hat{A}(t) = e^{i\hat{H}t}\, \hat{A}(0)\, e^{-i\hat{H}t} \tag{1.12} $$
The eigenvalue equation at time t can now be written in the form

$$ \hat{A}(t)|t, a\rangle = a|t, a\rangle = e^{i\hat{H}t}\, \hat{A}(0)\, e^{-i\hat{H}t}\, |t, a\rangle \tag{1.13} $$

or equivalently,

$$ \hat{A}(0)\, e^{-i\hat{H}t}|t, a\rangle = a\, e^{-i\hat{H}t}|t, a\rangle \tag{1.14} $$

This is clearly an eigenvalue equation for Â(0) with eigenvalue a, and hence we can identify e^{−iĤt}|t, a⟩ with |0, a⟩ and write

$$ |t, a\rangle = e^{i\hat{H}t}|0, a\rangle; \qquad \langle t, a| = \langle 0, a|\, e^{-i\hat{H}t} \tag{1.15} $$

Therefore the amplitude ⟨t′, a′|t, a⟩ is given by

$$ \langle t', a'|t, a\rangle = \langle 0, a'|\, e^{-i\hat{H}t'}\, e^{i\hat{H}t}\, |0, a\rangle = \langle a'|\, e^{-i\hat{H}(t'-t)}\, |a\rangle \tag{1.16} $$
Of particular importance to us are the position and momentum operators x̂(t) and p̂(t), satisfying the standard commutation rules [x̂^α, p̂^β] = iδ^{αβ}. If the eigenkets of x̂(0) are denoted by |x⟩, then Eq. (1.16) tells us that we can write the propagation amplitude as

$$ \langle t_2, \mathbf{x}_2 | t_1, \mathbf{x}_1 \rangle \equiv G(x_2; x_1) = \langle \mathbf{x}_2|\, e^{-i\hat{H}(t_2 - t_1)}\, |\mathbf{x}_1\rangle \tag{1.17} $$
The eigenstates of the momentum operator form a complete set, obeying the relation:

$$ \int \frac{d^D p}{(2\pi)^D}\; |\mathbf{p}\rangle\langle\mathbf{p}| = 1 \tag{1.18} $$

Multiplying both sides by the eigenket |k⟩, it is clear that consistency requires:

$$ \langle \mathbf{p}|\mathbf{k}\rangle = (2\pi)^D\, \delta(\mathbf{p} - \mathbf{k}) \tag{1.19} $$
Further, one can easily show that the operator e^{−iε·p̂} generates translations in space. To see this, let us concentrate on just one dimension and consider the operators

$$ \hat{U} = 1 + i\epsilon\, \hat{p}, \qquad \hat{U}^{-1} = 1 - i\epsilon\, \hat{p} \tag{1.20} $$

where ε is infinitesimal. Then,

$$ \hat{U} \hat{q}\, \hat{U}^{-1} = (1 + i\epsilon\, \hat{p})\, \hat{q}\, (1 - i\epsilon\, \hat{p}) = (1 + i\epsilon\, \hat{p})(\hat{q} - i\epsilon\, \hat{q}\hat{p}) = \hat{q} + i\epsilon\, [\hat{p}, \hat{q}] = \hat{q} + \epsilon \tag{1.21} $$

Exercise 1.3: Prove this.
Consider now the operation of q̂ on the ket Û⁻¹|x⟩:

$$ \hat{q}\, \hat{U}^{-1}|x\rangle = \hat{U}^{-1} (\hat{U} \hat{q}\, \hat{U}^{-1})|x\rangle = \hat{U}^{-1} (\hat{q} + \epsilon)|x\rangle = (x + \epsilon)\, \hat{U}^{-1}|x\rangle \tag{1.22} $$

which implies that Û⁻¹|x⟩ = |x + ε⟩. Proceeding to finite displacements from the infinitesimal displacement, and taking into account all the D dimensions, we get⁸

$$ |\mathbf{x} + \mathbf{a}\rangle = e^{-i\mathbf{a}\cdot\hat{\mathbf{p}}}\, |\mathbf{x}\rangle \tag{1.23} $$

⁸ This is just a fancy way of doing the Taylor series expansion of a function to find ψ(x + a) = Σ_n (aⁿ/n!) ∂ₓⁿ ψ(x) = exp[a∂ₓ]ψ(x) ≡ exp[iaP]ψ(x), using the 'generator' P ≡ −i∂ₓ. If ψ(x + a) = ⟨x + a|ψ⟩, then we get ⟨x + a| = ⟨x| exp[iaP], which is the same as |x + a⟩ = exp[−iaP]|x⟩. The four dimensional generalization of this result is |x + a⟩ = e^{iap̂}|x⟩, where ap̂ ≡ a⁰Ĥ − a·p̂.

⁹ We are ignoring any p-dependent phase in C_p here. This corresponds to taking x̂ = i(∂/∂p), rather than x̂ = i(∂/∂p) + f(p), to implement [x̂, p̂] = i.
Using this, we can write

$$ |\mathbf{x}\rangle = e^{-i\mathbf{x}\cdot\hat{\mathbf{p}}}\, |\mathbf{0}\rangle = \int \frac{d^D p}{(2\pi)^D}\; e^{-i\mathbf{p}\cdot\mathbf{x}}\; |\mathbf{p}\rangle\langle\mathbf{p}|\mathbf{0}\rangle = \int \frac{d^D p}{(2\pi)^D}\; C_{\mathbf{p}}\, e^{-i\mathbf{p}\cdot\mathbf{x}}\, |\mathbf{p}\rangle \tag{1.24} $$

where C_p = ⟨p|0⟩ and we have introduced a complete set of momentum basis states in the second equality. It is traditional to set the phase convention such that C_p = 1, so that the wave functions in coordinate space, ⟨x|ψ⟩, and momentum space, ⟨p|ψ⟩, are related by a simple Fourier transform:⁹

$$ \langle\mathbf{x}|\psi\rangle = \int \frac{d^D p}{(2\pi)^D}\; \langle\mathbf{x}|\mathbf{p}\rangle\langle\mathbf{p}|\psi\rangle = \int \frac{d^D p}{(2\pi)^D}\; e^{i\mathbf{p}\cdot\mathbf{x}}\; \langle\mathbf{p}|\psi\rangle \tag{1.25} $$
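The translation rule in Eq. (1.23) is easy to check numerically: in momentum space, acting with exp(−ia·p̂) is just multiplication by exp(−ika). The following minimal Python sketch (ours, not part of the text; all parameter values are illustrative) shifts a Gaussian wave packet this way and compares with the analytically shifted function:

```python
import numpy as np

# Grid and a Gaussian test wavefunction psi(x), centered at x = -3.
N, L = 1024, 40.0
x = np.linspace(-L / 2, L / 2, N, endpoint=False)
psi = np.exp(-((x + 3.0) ** 2) / 2.0)

# Momentum-space translation: psi(x - a) has Fourier transform
# exp(-i k a) * psi_tilde(k), i.e. the action of exp(-i a p_hat).
a = 3.0
k = 2.0 * np.pi * np.fft.fftfreq(N, d=L / N)
psi_shifted = np.fft.ifft(np.exp(-1j * k * a) * np.fft.fft(psi))

# Compare with the analytically shifted Gaussian (now centered at 0).
exact = np.exp(-(x ** 2) / 2.0)
print(np.max(np.abs(psi_shifted - exact)))  # machine-precision agreement
```

The agreement is at machine precision because, for a band-limited function, the FFT implements the momentum-space multiplication exactly.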
After these preliminaries, we are in a position to compute G(x₂; x₁) in the standard Heisenberg picture. The propagation amplitude from x₁ at t = t₁ to x₂ at t = t₂ is given by (with x^i_2 − x^i_1 ≡ x^i = (t, x)) the expression:

$$ G(x_2; x_1) = \langle\mathbf{x}_2|\, e^{-i\hat{H}(\mathbf{p})t}\, |\mathbf{x}_1\rangle = \int \frac{d^D p}{(2\pi)^D}\; e^{-i\omega_p t}\; \langle\mathbf{x}_2|\mathbf{p}\rangle\langle\mathbf{p}|\mathbf{x}_1\rangle \tag{1.26} $$

where we have introduced a complete set of momentum eigenstates and used the result e^{−iĤ(p)t}|p⟩ = e^{−iω_p t}|p⟩. Using ⟨p|x⟩ = exp[−ip·x] in Eq. (1.26), we get

$$ G(x_2; x_1) = \int \frac{d^D p}{(2\pi)^D}\; e^{-i\omega_p t + i\mathbf{p}\cdot\mathbf{x}} \tag{1.27} $$

This matches with the momentum space structure of G(x₂; x₁) given by Eq. (1.11).
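As an aside, Eq. (1.27) can be verified against the closed form Eq. (1.9) numerically. The momentum integral is oscillatory, so we give t a small negative imaginary part, t → t(1 − iε), anticipating the iε prescription of Sect. 1.2.3; this makes the integral absolutely convergent. A minimal Python sketch (ours, not from the text; D = 1 and all numbers are illustrative):

```python
import numpy as np

# Parameters (hbar = 1); t carries a small negative imaginary part.
m, t, x, eps = 1.0, 1.0, 1.0, 0.05
tc = t * (1.0 - 1j * eps)

# Numerical momentum integral, Eq. (1.27) in D = 1.
dp = 5e-4
p = np.arange(-60.0, 60.0, dp)
integrand = np.exp(1j * (p * x - p ** 2 * tc / (2.0 * m))) / (2.0 * np.pi)
G_num = integrand.sum() * dp

# Closed form, Eq. (1.9) in D = 1, evaluated at the same complex t.
G_exact = np.sqrt(m / (2j * np.pi * tc)) * np.exp(1j * m * x ** 2 / (2.0 * tc))
print(abs(G_num - G_exact) / abs(G_exact))  # small relative error
```

The damping factor exp(−εt p²/2m) cuts off the integral smoothly, so a simple Riemann sum already reproduces the Fresnel-integral result to high accuracy.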
Taking the product of Eq. (1.18) from the left with ⟨x| and from the right with |y⟩, we get the orthonormality condition for ⟨x|y⟩:

$$ \langle\mathbf{x}|\mathbf{y}\rangle = \int \frac{d^D p}{(2\pi)^D}\; \langle\mathbf{x}|\mathbf{p}\rangle\langle\mathbf{p}|\mathbf{y}\rangle = \int \frac{d^D p}{(2\pi)^D}\; e^{i\mathbf{p}\cdot(\mathbf{x}-\mathbf{y})} = \delta(\mathbf{x} - \mathbf{y}) \tag{1.28} $$

This, in turn, requires the path integral amplitude to satisfy the condition in Eq. (1.4), which we know it does. Thus everything works out fine in this context.¹⁰

¹⁰ In non-relativistic quantum mechanics, both position eigenkets |x⟩ and momentum eigenkets |p⟩ are equally good bases to work with. We will see later that momentum eigenkets continue to be a good basis in relativistic quantum mechanics, but defining a useful position operator for a relativistic particle becomes a nontrivial and unrewarding task.
Incidentally, the expression in Eq. (1.26) can also be expanded in terms of the eigenkets |E⟩ of energy, to give an expression for G(x₂; x₁) in terms of the energy eigenfunctions φ_E(x) ≡ ⟨x|E⟩, for any system with a time independent Hamiltonian that is bounded from below. In that case, we can arrange matters so that E ≥ 0. We have

$$ G(x_2; x_1) = \langle\mathbf{x}_2|\, e^{-i\hat{H}t}\, |\mathbf{x}_1\rangle = \int_0^\infty dE\; e^{-iEt}\; \langle\mathbf{x}_2|E\rangle\langle E|\mathbf{x}_1\rangle = \int_0^\infty dE\; e^{-iEt}\; \phi_E(\mathbf{x}_2)\, \phi_E^*(\mathbf{x}_1) \tag{1.29} $$
So, in principle, you can find G(x2 ; x1 ) for any system for which you know
φE (x). This relation will be useful later on.
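For a system where φ_E(x) is known, this eigenfunction expansion can be checked explicitly. A convenient example (ours, not from the text) is the harmonic oscillator with m = ω = ℏ = 1, continued to imaginary time t → −it_E (as will be done in Sect. 1.2.3) so that the sum converges rapidly; the eigenfunction sum then reproduces Mehler's well-known closed-form kernel:

```python
import numpy as np

# Harmonic oscillator, m = omega = hbar = 1; Euclidean kernel at
# imaginary time tE from the eigenfunction sum (Eq. (1.29) with
# exp(-i E t) -> exp(-E tE)), truncated at nmax terms.
x1, x2, tE, nmax = -0.2, 0.3, 1.0, 60

def phi(n_max, x):
    """Normalized oscillator eigenfunctions phi_0 .. phi_{n_max} at x,
    built by the stable upward recurrence for Hermite functions."""
    vals = [np.pi ** -0.25 * np.exp(-x * x / 2.0)]
    if n_max >= 1:
        vals.append(np.sqrt(2.0) * x * vals[0])
    for n in range(1, n_max):
        vals.append(np.sqrt(2.0 / (n + 1)) * x * vals[n]
                    - np.sqrt(n / (n + 1.0)) * vals[n - 1])
    return np.array(vals)

E = np.arange(nmax + 1) + 0.5                     # E_n = n + 1/2
GE_sum = np.sum(phi(nmax, x2) * phi(nmax, x1) * np.exp(-E * tE))

# Mehler's closed form for the same kernel.
s, c = np.sinh(tE), np.cosh(tE)
GE_exact = np.exp(-((x1 ** 2 + x2 ** 2) * c - 2 * x1 * x2) / (2 * s)) \
           / np.sqrt(2 * np.pi * s)
print(GE_sum, GE_exact)  # the two agree
```

Since the terms fall off as e^{−n t_E}, a few dozen eigenfunctions already saturate the sum.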
Now that we have discussed the calculation of G(x₂; x₁) from the path integral as well as from the Hamiltonian, it is worth spending a moment to connect up the two. (The details of this approach are described in the Mathematical Supplement, Sect. 1.6.1.) The basic idea is to compute the matrix element ⟨y|e^{−itĤ}|x⟩ by dividing the time interval (0, t) into N equal intervals of size ε, such that t = Nε. The matrix element is evaluated by writing it as a product of matrix elements between a complete set of position eigenstates |x_k⟩ with k = 1, 2, ...(N − 1), and integrating over the x_k. At the end of the calculation one takes the limit N → ∞, ε → 0, with t = Nε remaining finite. The result will remain finite only if the integrations and the limiting process are carried out with a suitable measure, which should emerge from the calculation itself. This approach of dividing up the time into equal intervals reduces the problem to computing the matrix element ⟨x_{k+1}|e^{−iĤε}|x_k⟩. If Ĥ has the standard form H = p̂²/2m + V(x̂), we can evaluate this matrix element by expanding the exponential in a Taylor series in ε and retaining terms only up to the linear one.¹¹ By this procedure one can obtain

$$ \langle\mathbf{x}_{k+1}|\, e^{-i\hat{H}\epsilon}\, |\mathbf{x}_k\rangle = \int \frac{d^D p_k}{(2\pi)^D}\; e^{i\mathbf{p}_k\cdot(\mathbf{x}_{k+1} - \mathbf{x}_k)}\; e^{-i\epsilon H(\mathbf{p}_k, \mathbf{x}_k)} \tag{1.30} $$
Interpreting (x_{k+1} − x_k)/ε as the velocity ẋ in the continuum limit, one can see that the argument of the exponential is just [p·ẋ − H(p, x)] = L integrated over the infinitesimal time interval. If you take the product of these factors, you obtain a natural definition of the path integral in phase space in the form

$$ \langle\mathbf{y}|\, e^{-i\hat{H}t}\, |\mathbf{x}\rangle = \int \mathcal{D}p\, \mathcal{D}x\; \exp\left[ i \int dt\, \left( \mathbf{p}\cdot\dot{\mathbf{x}} - H(\mathbf{p}, \mathbf{x}) \right) \right] \tag{1.31} $$

In the case of the standard Hamiltonian with H = p²/2m + V(x), we can integrate over p_k in Eq. (1.30) and obtain

$$ \langle\mathbf{x}_{k+1}|\, e^{-i\hat{H}\epsilon}\, |\mathbf{x}_k\rangle = \left( \frac{m}{2\pi i \epsilon} \right)^{D/2} \exp\left[ \frac{im}{2\epsilon} (\mathbf{x}_{k+1} - \mathbf{x}_k)^2 - i\epsilon\, V(\mathbf{x}_k) \right] \tag{1.32} $$

In this case, the continuum limit will directly give the expression:

$$ \langle\mathbf{y}|\, e^{-i\hat{H}t}\, |\mathbf{x}\rangle = \int \mathcal{D}x\; \exp\left[ i \int dt\, \left( \frac{1}{2}\, m\dot{\mathbf{x}}^2 - V(\mathbf{x}) \right) \right] \tag{1.33} $$
In both cases, the path integral measures Dx, Dp etc. are defined as products of integrals over intermediate states, with a measure which depends on ε, as shown in detail in the Mathematical Supplement, Sect. 1.6.1.
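For completeness, the Gaussian momentum integration that takes Eq. (1.30) into Eq. (1.32) is worth displaying (this intermediate step is ours; it is just the completing-the-square trick, with Δx ≡ x_{k+1} − x_k):

$$ \int \frac{d^D p}{(2\pi)^D} \exp\left[ i\mathbf{p}\cdot\Delta\mathbf{x} - \frac{i\epsilon\, p^2}{2m} \right] = \exp\left[ \frac{i m\, |\Delta\mathbf{x}|^2}{2\epsilon} \right] \int \frac{d^D p'}{(2\pi)^D} \exp\left[ -\frac{i\epsilon\, p'^2}{2m} \right] = \left( \frac{m}{2\pi i \epsilon} \right)^{D/2} \exp\left[ \frac{i m\, |\Delta\mathbf{x}|^2}{2\epsilon} \right] $$

where p′ ≡ p − mΔx/ε, and the last step uses the standard Fresnel integral. The factor e^{−iεV(x_k)} in Eq. (1.30) is untouched by the p-integration, which then yields Eq. (1.32).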
1.2.3 A Digression into Imaginary Time
The mathematical manipulations used in the computation of path integrals become less ambiguous if we use a technique of analytically continuing the results back and forth between real and imaginary time. This is a bit of an overkill for the non-relativistic free particle, but will become a valuable mathematical tool in field theory, and even for studying the quantum mechanics of a particle in external potentials. We will now introduce this procedure.

¹¹ Since e^{A+B} ≠ e^A e^B when A and B do not commute, we resort to this trick to evaluate the matrix element. If Ĥ(p̂, x̂) has a more non-trivial structure, there could be factor ordering issues, as well as questions like whether to evaluate H at x_k or at x_{k+1} or at the midpoint, etc. None of these (very interesting) issues are relevant for our purpose.
¹² A relation like t = −it_E cannot hold with both t and t_E being real, and this fact can be confusing. It is better to think of a complex t in the form t e^{−iθ}, with θ = 0 giving the usual time which makes you age, and θ = π/2 giving the expressions in the Euclidean sector. Algebraically, however, this can be done by setting t = −it_E and thinking of t_E as real!
[Figure 1.3: The iε and Euclidean time prescription.]
We consider all expressions to be defined in a complex t-plane, with the usual time coordinate running along the real axis. The integrals etc. will be defined, though, off the real-t axis, by assuming that t actually stands for t exp(−iθ), where 0 ≤ θ ≤ π/2 and t now stands for the magnitude in the complex plane. There are two values of θ which are frequently used: If you think of θ as an infinitesimal ε, this is equivalent to doing the integrals etc. with t(1 − iε) instead of t. Alternatively, we can choose θ = π/2 and work with −it. This is done by replacing t by −it_E in the relevant expressions and treating t_E as real.¹² (The t_E is usually called the Euclidean time, because this change makes the Lorentzian line element −dt² + dx², with signature (−, +, +, ...), go into the Euclidean line element +dt_E² + dx².) If there are no poles in the complex t plane for the range −π/2 ≤ θ ≤ 0, π ≤ θ ≤ 3π/2 in the expressions we are interested in, then both prescriptions (the iε prescription and the t → −it_E prescription) will give the same result.
If we put t = −it_E, then the phase of the path integral amplitude for a particle in a potential V(x) becomes:

$$ iA = i \int dt \left[ \frac{1}{2}\, m \left( \frac{d\mathbf{x}}{dt} \right)^2 - V(\mathbf{x}) \right] = - \int dt_E \left[ \frac{1}{2}\, m \left( \frac{d\mathbf{x}}{dt_E} \right)^2 + V(\mathbf{x}) \right] \tag{1.34} $$

so that the sum over paths is now given by an expression which is properly convergent for a V(x) that is bounded from below:

$$ G_E(x_2; x_1) = \int \mathcal{D}x\; \exp\left( - \int dt_E \left[ \frac{1}{2}\, m \left( \frac{d\mathbf{x}}{dt_E} \right)^2 + V(\mathbf{x}) \right] \right) \tag{1.35} $$
Everything works out as before (even better, as far as the integrals are concerned) and the Euclidean version of the path integral sum leads to:

$$ G_E(x_2; x_1) = \theta(t_E) \left( \frac{m}{2\pi t_E} \right)^{D/2} \exp\left( - \frac{1}{2} \frac{m|\mathbf{x}|^2}{t_E} \right) \tag{1.36} $$
This analytic continuation should be thought of as a rotation in the complex t-plane, with |t|e^{−iθ} rotating from θ = 0 to θ = π/2. Incidentally, G_E(x₂; x₁) represents the kernel of the diffusion operator, because the Schrödinger equation for a free particle becomes a diffusion equation in Euclidean time.
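In the Euclidean sector the transitivity condition, Eq. (1.2), becomes an ordinary convolution of real Gaussians (the semigroup property of the heat kernel) and can be verified directly. A minimal Python sketch (ours, not from the text; D = 1 and all parameter values are illustrative):

```python
import numpy as np

def GE(x, tE, m=1.0):
    # Euclidean free-particle kernel, Eq. (1.36) in D = 1.
    return np.sqrt(m / (2.0 * np.pi * tE)) * np.exp(-m * x * x / (2.0 * tE))

# Transitivity, Eq. (1.2), continued to Euclidean time: integrate the
# product of two kernels over the intermediate spatial point y.
x1, x2, t1, t2 = -1.0, 2.0, 0.7, 1.3
y, dy = np.linspace(-30.0, 30.0, 60001, retstep=True)
lhs = np.sum(GE(x2 - y, t2) * GE(y - x1, t1)) * dy
rhs = GE(x2 - x1, t1 + t2)
print(lhs, rhs)  # the two agree
```

The convolution of two Gaussians of widths t₁/m and t₂/m is a Gaussian of width (t₁ + t₂)/m, which is exactly Eq. (1.36) at the total Euclidean time.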
We will illustrate the utility of this formalism, especially in giving meaning to certain limits, with a couple of examples in the context of non-relativistic quantum mechanics. As a first example, we will show that the ground state energy and the ground state wave function of a system can be directly determined from G_E(x₂; x₁) in the Euclidean sector. To do this, we will use Eq. (1.29), written in the form

$$ \langle t, \mathbf{x}_2 | 0, \mathbf{x}_1 \rangle = \sum_n \phi_n(\mathbf{x}_2)\, \phi_n^*(\mathbf{x}_1)\, \exp(-iE_n t) \tag{1.37} $$
where φ_n(x) = ⟨x|E_n⟩ is the n-th energy eigenfunction of the system under consideration. We have assumed that the energy levels are discrete, for the sake of simplicity; the summation over n will be replaced by an integration over the energy E if the levels form a continuum. Let us now consider the following limit:

$$ W(t;\, \mathbf{x}_2, \mathbf{x}_1) \equiv \lim_{t \to +\infty} G(x_2; x_1) \tag{1.38} $$
The t-dependence of the left hand side is supposed to indicate the leading dependence when the limit is taken. This limit cannot be directly ascertained from Eq. (1.37), because the exponent oscillates. However, we can give meaning to this limit if we first transform Eq. (1.37) to the imaginary time, τ₁ = it₁ and τ₂ = it₂, and consider the limit t_E = (τ₂ − τ₁) → ∞. We find that

$$ W_E(t_E;\, \mathbf{x}_2, \mathbf{x}_1) = \lim_{t_E \to \infty} G_E(x_2; x_1) \cong \phi_0(\mathbf{x}_2)\, \phi_0(\mathbf{x}_1)\, \exp[-E_0 t_E] \tag{1.39} $$
where the zero-subscript denotes the lowest energy state. (Note that φ₀* = φ₀ for the ground state.) Thus only the ground state contributes in this infinite time limit. Writing this as

$$ \langle \infty, \mathbf{0} | -\infty, \mathbf{0} \rangle_E = |\phi_0(\mathbf{0})|^2 \exp\left( - \int_{-\infty}^{\infty} dt_E\; E_0 \right) \tag{1.40} $$

we see that the 'phase' of ⟨∞, 0|−∞, 0⟩_E gives the ground state energy. More explicitly, from W_E(t_E; 0, 0) ≈ (constant) exp(−E₀t_E), we have:

$$ E_0 = \lim_{t_E \to \infty} \left( - \frac{1}{t_E} \ln W_E(t_E;\, \mathbf{0}, \mathbf{0}) \right) = \lim_{t_E \to \infty} \left( - \frac{1}{t_E} \ln \langle t_E, \mathbf{0} | 0, \mathbf{0} \rangle_E \right) \tag{1.41} $$
Using a variant of Eq. (1.39), we can also express the ground state wave
function itself in terms of G(x₂; x₁). From¹³
\[
\frac{W_E(t_E; x_2, x_1)}{W_E(t_E; 0, 0)} \cong \frac{\phi_0(x_2)\, \phi_0(x_1)}{\phi_0(0)\, \phi_0(0)}
\tag{1.42}
\]
we see that
\[
\frac{W_E(t_E; 0, x)}{W_E(t_E; 0, 0)} \cong \frac{\phi_0(x)}{\phi_0(0)}
\tag{1.43}
\]
Ignoring the proportionality constants, which can be fixed by normalizing
the ground state wave function, we have the result
\[
\phi_0(x) \propto \lim_{t_E \to \infty} W_E(t_E; 0, x) \propto \int \mathcal{D}q\, \exp\left(-A_E[\infty, 0; 0, x]\right)
\tag{1.44}
\]
So if you evaluate the Euclidean path integral with the boundary conditions
that at tE = 0 we have q = x and at tE = ∞, q = 0, then, except for
unimportant constants, the path integral will reproduce the ground state
wave function. This result is useful in field theoretic contexts in which
one cannot explicitly solve the analog of the Schrödinger equation to find
the ground state wave function but can evaluate the path integral in some
suitable approximation.
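The same spectral machinery also checks Eq. (1.43): the ratio of Euclidean amplitudes reproduces the ground state wave function. A sketch under the same illustrative assumptions (Python; oscillator with m = ω = ħ = 1 and a finite-difference grid, so that the exact answer φ₀(x) ∝ exp(−x²/2) is known for comparison):

```python
import numpy as np

# Illustrative setup (an assumption, not from the text): finite-difference
# oscillator, m = omega = hbar = 1; exact ground state phi_0(x) ~ exp(-x^2/2).
N, L = 400, 20.0
xg = np.linspace(-L/2, L/2, N)
dx = xg[1] - xg[0]
H = (np.diag(1.0/dx**2 + 0.5*xg**2)
     + np.diag(-0.5/dx**2*np.ones(N-1), 1)
     + np.diag(-0.5/dx**2*np.ones(N-1), -1))
E, phi = np.linalg.eigh(H)

i0 = np.argmin(np.abs(xg))          # x = 0
ix = np.argmin(np.abs(xg - 1.0))    # a point near x = 1

def W_E(tE, i, j):
    # Euclidean amplitude <tE, x_i | 0, x_j> (up to normalization), from Eq. (1.37)
    return np.sum(phi[i, :]*phi[j, :]*np.exp(-E*tE))

tE = 8.0
ratio = W_E(tE, i0, ix)/W_E(tE, i0, i0)   # Eq. (1.43): -> phi_0(x)/phi_0(0)
print(ratio, np.exp(-xg[ix]**2/2))        # the two numbers agree
```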
The propagation amplitude can also be used to study the effect of external perturbations on the system. (We will discuss this in detail in Sect. 2.1;
here, we shall just derive a simple formula which will be of use later on.)
Let us suppose that the system was in the ground state in the asymptotic
past (t₁ → −∞). At some time t = −T we switch on an external, time-dependent disturbance J(t) affecting the system. Finally at t = +T we switch
off the perturbation. Because of the time-dependence, we no longer have
stationary energy eigenstates for the system. In fact, the system is likely
to have absorbed energy from the perturbation and ended up in some excited state at t₂ → +∞; so the probability for it to be found in the ground
Footnote 13: While we introduce the analytic continuation to Euclidean time, t → t_E, as a mathematical trick, it will soon acquire a life of its own. One conceptual issue is that the paths you sum over in the Euclidean sector can be quite different compared to those in the Lorentzian sector. Later on, when we deal with the relativistic fields, this distinction will become more serious, because the Lorentzian space has a light cone structure which the Euclidean sector does not have. There are people who even believe quantum field theory should actually be formulated in the Euclidean sector and then analytically continued to the usual spacetime. It cannot be denied that the Euclidean formulation gives certain insights which are surprisingly difficult to obtain from other methods.

Exercise 1.4: Work this out explicitly for a harmonic oscillator and verify this claim.
Chapter 1. From Particles to Fields
state as t2 → +∞ will be less than one. This probability can be computed
from the propagation amplitude GJ (x2 ; x1 ) in the presence of the source
J. Consider the limit:
\[
\begin{aligned}
P &= \lim_{t_2 \to \infty}\, \lim_{t_1 \to -\infty} G^J(x_2; x_1) = \lim_{t_2 \to \infty}\, \lim_{t_1 \to -\infty} \langle t_2, x_2 | t_1, x_1 \rangle_J \\
&= \lim_{t_2 \to \infty}\, \lim_{t_1 \to -\infty} \int_{-\infty}^{+\infty} d^D x\, d^D x'\; \langle t_2, x_2 | T, x \rangle\, \langle T, x | -T, x' \rangle\, \langle -T, x' | t_1, x_1 \rangle
\end{aligned}
\tag{1.45}
\]
Since J = 0 during t2 > t > T and −T > t > t1 , the matrix elements in
these intervals can be expressed in terms of the energy eigenstates of the
original system. We can then take the limits t2 → ∞ and t1 → −∞ by
first going over to Euclidean time, taking the limits and then analytically
continuing back to normal time. Using Eq. (1.39), we have:
\[
\begin{aligned}
\lim_{t_2 \to \infty} \langle t_2, x_2 | T, x \rangle &\cong \phi_0(x_2)\, \phi_0(x)\, \exp\left[-iE_0(t_2 - T)\right] \\
\lim_{t_1 \to -\infty} \langle -T, x' | t_1, x_1 \rangle &\cong \phi_0(x')\, \phi_0(x_1)\, \exp\left[-iE_0(-T - t_1)\right]
\end{aligned}
\tag{1.46}
\]
which, when substituted into Eq. (1.45), gives:
\[
\begin{aligned}
P &\cong \phi_0(x_2)\, \phi_0(x_1)\, e^{-iE_0(t_2 - t_1)} \int_{-\infty}^{+\infty} d^D x\, d^D x'\, \left(\phi_0(x) e^{iE_0 T}\right) \langle T, x | x', -T \rangle \left(\phi_0(x') e^{iE_0 T}\right) \\
&\cong \lim_{t \to \infty} G^{J=0}(x_2; x_1) \int_{-\infty}^{+\infty} d^D x\, d^D x'\, [\phi_0(x, T)]^*\, \langle T, x | x', -T \rangle\, [\phi_0(x', -T)]
\end{aligned}
\tag{1.47}
\]
where φ0 (x, T ) represents the ground state wave function at time T , etc.
The quantity
\[
Q = \int_{-\infty}^{+\infty} d^D x\, d^D x'\; [\phi_0(x, T)]^*\, \langle x, T | x', -T \rangle\, [\phi_0(x', -T)]
\tag{1.48}
\]
Exercise 1.5: Compute Q for a simple harmonic oscillator of frequency ω coupled to an external time-dependent source J(t). Your result should match with the one we need to use later in Eq. (2.87).
represents the amplitude for the system to remain in the ground state
after the source is switched off if it started out in the ground state before
the source was switched on. (It is usually called the vacuum persistence
amplitude.) From Eq. (1.47) and Eq. (1.45) we find that this amplitude is
given by the limit:
\[
Q = \lim_{t_2 \to \infty}\, \lim_{t_1 \to -\infty} \frac{G^J(x_2; x_1)}{G^{J=0}(x_2; x_1)}
\tag{1.49}
\]
This result can be further simplified by noticing that the x2 and x1 dependences cancel out in the ratio in Eq. (1.49) so that we can set x2 = x1 = 0.
Thus the vacuum persistence amplitude can be found from the propagation
amplitude by a simple limiting procedure.
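As a cross-check of what the vacuum persistence amplitude means, one can compute |Q|² directly for a driven oscillator by evolving the ground state through a transient source and projecting back, and compare with the standard driven-oscillator result |Q|² = exp(−|J̃(ω)|²/2ω), with J̃(ω) = ∫ dt J(t)e^{iωt}. The sketch below (Python; m = ω = ħ = 1, an arbitrary Gaussian pulse, split-operator time evolution) is an illustration of the statement, not the path-integral route of the text:

```python
import numpy as np

# Illustrative setup (an assumption, not from the text): H = p^2/2 + x^2/2 - J(t) x
# with m = omega = hbar = 1 and an arbitrary Gaussian pulse J(t) = J0 exp(-t^2).
N, L = 512, 24.0
x = np.linspace(-L/2, L/2, N, endpoint=False)
dx = x[1] - x[0]
k = 2*np.pi*np.fft.fftfreq(N, dx)

J0 = 0.3
J = lambda t: J0*np.exp(-t**2)

dt, T = 0.002, 6.0
psi = np.pi**-0.25*np.exp(-x**2/2)        # ground state before the source acts
phi0 = psi.copy()
kin = np.exp(-0.5j*k**2*dt)               # kinetic factor of the split-operator step

t = -T
while t < T:
    V = 0.5*x**2 - J(t + dt/2)*x          # potential with the source switched on
    psi = np.exp(-0.5j*V*dt)*psi
    psi = np.fft.ifft(kin*np.fft.fft(psi))
    psi = np.exp(-0.5j*V*dt)*psi
    t += dt

Q2 = abs(np.sum(np.conj(phi0)*psi)*dx)**2   # |<ground, out | ground, in>|^2
Jt = J0*np.sqrt(np.pi)*np.exp(-0.25)        # |Jtilde(omega=1)| for the Gaussian pulse
print(Q2, np.exp(-Jt**2/2))                 # the two numbers agree
```

Because the pulse has effectively died off at |t| = T, the system is free before and after, exactly as in the setup above.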
Finally we mention an important application of imaginary time in statistical mechanics and condensed matter physics. The analytic continuation
to imaginary values of time has close mathematical connections with the
description of systems in a thermal bath. To see this, consider the mean
value of some observable O(q) of a quantum mechanical system. If the system is in an energy eigenstate described by the wave function φn (q), then
the expectation value of O(q) can be obtained by integrating O(q)|φn (q)|2
over q. If the system is in a thermal bath at temperature β⁻¹, described
by a canonical ensemble, then the mean value has to be computed by averaging over all the energy eigenstates as well with a weightage exp(−βEn ).
In this case, the mean value can be expressed as
\[
\langle O \rangle = \frac{1}{Z} \sum_n \int dq\, \phi_n^*(q)\, O(q)\, \phi_n(q)\, e^{-\beta E_n} \equiv \frac{1}{Z} \int dq\, \rho(q, q)\, O(q)
\tag{1.50}
\]
where Z is the partition function and we have defined a density matrix ρ(q, q′) by
\[
\rho(q, q') \equiv \sum_n \phi_n(q)\, \phi_n^*(q')\, e^{-\beta E_n}
\tag{1.51}
\]
in terms of which we can rewrite Eq. (1.50) as
\[
\langle O \rangle = \frac{\mathrm{Tr}\,(\rho O)}{\mathrm{Tr}\,(\rho)}
\tag{1.52}
\]
where the trace operation involves setting q′ = q and integrating over q.
This result shows how ρ(q, q′) contains information about both thermal and
quantum mechanical averaging. Comparing Eq. (1.51) with Eq. (1.37),
we find that the density matrix can be immediately obtained from the
Euclidean propagation amplitude:
\[
\rho(q, q') = \langle \beta, q | 0, q' \rangle_E
\tag{1.53}
\]
with the Euclidean time acting as inverse temperature. Its trace, ρ(q, q) ∝ ⟨β, q|0, q⟩_E, is obtained by computing the Euclidean path integral with a periodic boundary condition on the imaginary time. The period of the imaginary time gives the inverse temperature.
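Equations (1.50)–(1.52) are easy to exercise numerically. The sketch below (Python; a harmonic oscillator with m = ω = ħ = 1 on a finite-difference grid, both illustrative assumptions) builds ρ(q, q′) from the spectral sum of Eq. (1.51) and checks the thermal average of O = q² against the exact oscillator value ⟨q²⟩ = (1/2) coth(β/2):

```python
import numpy as np

# Illustrative setup (an assumption, not from the text): finite-difference
# oscillator, m = omega = hbar = 1; exact thermal <q^2> = (1/2) coth(beta/2).
N, L = 400, 20.0
q = np.linspace(-L/2, L/2, N)
dq = q[1] - q[0]
H = (np.diag(1.0/dq**2 + 0.5*q**2)
     + np.diag(-0.5/dq**2*np.ones(N-1), 1)
     + np.diag(-0.5/dq**2*np.ones(N-1), -1))
E, phi = np.linalg.eigh(H)
phi /= np.sqrt(dq)      # continuum normalization, so rho has its usual meaning

beta = 2.0
rho = phi @ np.diag(np.exp(-beta*E)) @ phi.T   # Eq. (1.51), real eigenfunctions

# Eq. (1.52): <O> = Tr(rho O)/Tr(rho); the trace sets q' = q and integrates
num = np.sum(np.diag(rho)*q**2)*dq
den = np.sum(np.diag(rho))*dq
x2_thermal = num/den
print(x2_thermal, 0.5/np.tanh(beta/2))   # the two numbers agree
```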
1.2.4 Path Integral for the Jacobi Action
It is possible to use the results obtained in Sect. 1.2.1 to give meaning to a sum over paths for an action which is not quadratic. Given its utility in the study of the relativistic particle in the next section, we will first develop this approach in the non-relativistic context.
In classical mechanics the usual form of the action A[x(t)] ≡ A[x^α(t)] is taken to be a time integral [over the range (t₁, t₂)] of the Lagrangian L = T(x^α, ẋ^α) − V(x^α). Demanding δA = 0 with δx^α = 0 at the end points leads to the equations of motion determining the trajectory x^α(t) satisfying the boundary conditions. This procedure not only determines the path taken by the particle in space (like, e.g., the ellipse in the Kepler problem) but also the actual coordinates of the particle as a function of time t (like, e.g., r(t) and θ(t) in the Kepler problem; eliminating t between these two functions gives the equation to the path r(θ)).
Suppose we are not interested in the latter information and only want
to know the equation to the path taken in space by the particle. It is then
possible to use a different action principle — called the Jacobi-Maupertuis
action — the extremum condition for which leads directly to the trajectory.
We will first recall this action principle from classical mechanics.
Exercise 1.6: Compute ρ(q, q′) for a harmonic oscillator at finite temperature and study the low and high temperature limits. Can you interpret these limits?
Let us consider a situation in which the kinetic energy term T of the
Lagrangian is a homogeneous quadratic function of the velocities and can
be written in the form
\[
T = \frac{1}{2} m \left(\frac{d\ell}{d\lambda}\right)^2 = \frac{1}{2} m\, g_{\alpha\beta}\, \dot{x}^\alpha \dot{x}^\beta
\tag{1.54}
\]
where dℓ² = g_αβ(x^μ) dx^α dx^β is the spatial line element in some arbitrary coordinate system and λ is the time, with ẋ^α = dx^α/dλ etc. Since we consider Lagrangians with no explicit time dependence, we can take t₁ = 0, t₂ = t in defining the action. Using the relation between the Lagrangian and the Hamiltonian, L = p_α ẋ^α − H, we can write the action functional as
\[
\begin{aligned}
A &= \int_0^t d\lambda \left[\frac{1}{2} m \left(\frac{d\ell}{d\lambda}\right)^2 - V(x^\alpha)\right]; \qquad d\ell^2 = g_{\alpha\beta}(x^\mu)\, dx^\alpha dx^\beta \\
&= \int_0^t d\lambda\, \left[p_\alpha \dot{x}^\alpha - H\right]
\end{aligned}
\tag{1.55}
\]
This form of the action motivates us to consider another action functional
given by
\[
A_J \equiv \int_{x_1}^{x_2} p_\alpha\, dx^\alpha = \int_{\lambda_1}^{\lambda_2} d\lambda\, \frac{\partial L}{\partial \dot{x}^\alpha}\, \dot{x}^\alpha
\tag{1.56}
\]
Exercise 1.7: Vary A_J and prove these claims.
Footnote 14: If we consider an abstract space with metric G_αβ ≡ 2m(E − V)g_αβ, then A_J for a given value of E is just the length of the path calculated with this metric G_αβ. So the spatial paths followed by the particle are geodesics in the space with this metric. This is amusing but does not seem to lead to any deep insight.
which differs from Eq. (1.55) by the absence of the Hdλ term. Roughly speaking, one would have expected this term not to contribute to the variation if we consider trajectories with a fixed energy E and express p_α in A_J above in terms of E. It can be easily verified that the modified action principle based on A_J in Eq. (1.56) leads to the actual paths in space as a solution to the variational principle δA_J = 0 when we vary all trajectories connecting the end points x₁^α and x₂^α. This trajectory will have energy E, but will contain no information about the time coordinate. In fact, the first form of the action in Eq. (1.56) makes it clear that there is no time dependence in the action. Geometrically, we are only interested in the various curves connecting two points in space x₁^α and x₂^α, irrespective of their parameterization. We can describe the curve with some parameter λ by giving the D functions x^α(λ) or some other set of functions obtained by changing the parameterization from λ → f(λ). The curves remain the same, and the reparameterization invariance of the action expresses this fact.
It is possible to rewrite the expression for AJ in a nicer form. Using
the fact that L is a homogeneous quadratic function of velocities, we have
the result
\[
\dot{x}^\alpha \frac{\partial L}{\partial \dot{x}^\alpha} = 2T = m \left(\frac{d\ell}{d\lambda}\right)^2 = 2\left[E - V(x^\alpha)\right]
\tag{1.57}
\]
Substituting into Eq. (1.56) we get
\[
A_J = \int_{x_1}^{x_2} m\, \frac{d\ell}{d\lambda}\, d\ell = \int_{x_1}^{x_2} \sqrt{2m\left(E - V(x^\alpha)\right)}\; d\ell
\tag{1.58}
\]
which is again manifestly reparameterization invariant with no reference to time.¹⁴
Since AJ describes a valid action principle for finding the path of a
particle with energy E classically, one might wonder what happens if we try
to quantize the system by performing a sum over amplitudes exp(iA_J). We would expect it to lead to the amplitude for the particle to propagate from x₁^α to x₂^α with energy E. This is indeed true, but since A_J is not quadratic in velocities even for a free particle (because dℓ involves a square root), it is not easy to evaluate the sum over exp(iA_J). But since we already have an alternative path integral procedure for the system, we can use it to give meaning to this sum, thereby defining the sum over paths for at least one non-quadratic action.
Our idea is to write the sum over all paths in the original action principle
(with amplitude exp(iA)) as a sum over paths with energy E followed by
a sum over all E. Using the result in Eq. (1.55), we get
\[
\sum_{0, x_1}^{t, x_2} \exp(iA) = \sum_E e^{-iEt} \sum_{x_1}^{x_2} \exp\left(iA_J[E, x(\tau)]\right) \propto \int_0^\infty dE\, e^{-iEt} \sum_{x_1}^{x_2} \exp(iA_J)
\tag{1.59}
\]
In the last step we have treated the sum over E as an integral over E > 0
(since, for any Hamiltonian which is bounded from below, we can always
achieve this by adding a suitable constant to the Hamiltonian) but there
could be an extra proportionality constant which we cannot rule out. This
constant will depend on the measure used to define the sum over exp(iAJ )
but can be fixed by using the known form of the left hand side if required.
Inverting the Fourier transform, we get:
\[
P(E; x_2, x_1) \equiv \sum_{x_1}^{x_2} \exp(iA_J) = C \int_0^\infty dt\, e^{iEt} \sum_{0, x_1}^{t, x_2} \exp(iA) = C \int_0^\infty dt\, e^{iEt}\, G(x_2; x_1)
\tag{1.60}
\]
where we have denoted the proportionality constant by C. This result
shows that the sum over the Jacobi action involving a square root of velocities can be re-expressed in terms of the standard path integral; if the
latter can be evaluated for a given system, then the sum over Jacobi action
can be defined by this procedure.
The result also has an obvious interpretation. The G(x2 ; x1 ) on the
right hand side gives the amplitude for a particle to propagate from x1 to
x2 in time t. Its Fourier transform with respect to t can be thought of as
the amplitude for the particle to propagate from x1 to x2 with energy E,
which is precisely what we expect the sum over the Jacobi action to give.
The idea actually works even for particles in a potential if we evaluate the path integral on the right hand side by some other means like, e.g., by solving the relevant Schrödinger equation.¹⁵
With future applications in mind, we will display the explicit form of this result for the case of a free particle with V = 0. Denoting the length of the path connecting x₁^α and x₂^α by ℓ(x₂, x₁), we have:
\[
\sum_{x_1}^{x_2} \exp\left[i\sqrt{2mE}\; \ell(x_2, x_1)\right] = C \int_0^\infty dt\, e^{iEt} \sum_{0, x_1}^{t, x_2} \exp\left[\frac{im}{2} \int_0^t d\tau\, g_{\alpha\beta}\, \dot{x}^\alpha \dot{x}^\beta\right]
\tag{1.61}
\]
This result shows that the sum over paths with a Jacobi action, involving a square root of velocities, can be re-expressed in terms of the standard path integral involving only quadratic terms in the velocities.

Footnote 15: In this case, the √(E − V) will lead to imaginary values for E < V, thereby describing quantum mechanical tunneling. The path integral with the Jacobi action will give exponentially suppressed amplitudes for classically forbidden processes — a result we will again allude to in the next section.

We, of course,
Exercise 1.8: Work out Eq. (1.61) explicitly for a D-dimensional space. You will find that the integral is trivial for D = 1, 3 while it involves the Bessel function for other values of D. For example, D = 4 — which we will consider in the next section — will give you a MacDonald function.
know the result of the path integral in the right hand side (for gαβ = δαβ
in Cartesian coordinates) and thus we can evaluate the sum on the left
hand side.
It must be stressed that the path integral sum over the Jacobi action,
given by P(E; x2 , x1 ), is a very different beast compared to the path integral sum over the usual action, G(x2 ; x1 ). The P(E; x2 , x1 ) does not share
some of the crucial properties of G(x2 ; x1 ). Most importantly, it does not
obey any kind of transitivity like that in Eq. (1.2). It can be directly
verified that
\[
P(E; x_2, x_1) \neq \int d^D y\, P(E; x_2, y)\, P(E; y, x_1)
\tag{1.62}
\]
This means P(E; x₂, x₁) cannot be used to “propagate” any wave function, unlike G(x₂; x₁), which acts as a propagator for the solutions of the Schrödinger equation in accordance with Eq. (1.3). Mathematically, this is obvious
from the structure of P(E; x2 , x1 ) which can be expressed formally as the
matrix element:
\[
P(E; x_2, x_1) = \int_0^\infty dt\, e^{it(E + i\epsilon)}\, \langle x_2 | e^{-it\hat{H}} | x_1 \rangle = i\, \langle x_2 | (E - \hat{H} + i\epsilon)^{-1} | x_1 \rangle
\tag{1.63}
\]
where we have introduced an iε factor, with an infinitesimal ε, to ensure
convergence. This result can be expressed more explicitly in terms of the
energy eigenfunctions using Eq. (1.29):
\[
P(E; x_2, x_1) = \int_0^\infty dt \int_0^\infty d\mu\, e^{-i\mu t}\, \phi_\mu(x_2)\, \phi_\mu^*(x_1)\, e^{iEt} = \int_0^\infty d\mu\, \frac{i\, \phi_\mu(x_2)\, \phi_\mu^*(x_1)}{(E - \mu + i\epsilon)}
\tag{1.64}
\]
where we have again changed E to E + iε to ensure the convergence of the original integral, which also avoids a pole on the real axis for the μ integration. Clearly, this expression has a very different structure compared to the one in Eq. (1.29), which accounts for the lack of transitivity, etc. From
either Eq. (1.63) or Eq. (1.64), it is easy to see that:
\[
\int d^D y\, P(E; x_2, y)\, P(E; y, x_1) = P(E; x_2, x_1) \left[\frac{\partial \ln P(E; x_2, x_1)}{i\, \partial E}\right]
\tag{1.65}
\]
where the factor in the square bracket gives the deviation from transitivity.
In the case of a free particle, in which energy eigenstates are labeled by
the momenta p, Eq. (1.63) has the form
\[
P(E; x_2, x_1) = \int \frac{d^D p}{(2\pi)^D}\; \frac{i\, e^{i\mathbf{p} \cdot \mathbf{x}}}{E - p^2/2m + i\epsilon}
\tag{1.66}
\]
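Both the momentum integral and the deviation-from-transitivity formula can be checked explicitly for a free particle in D = 1. The sketch below (Python with SciPy) keeps a small but finite ε in E + iε throughout; the closed form P(E; x, 0) = (m/p)e^{ip|x|} with p = √(2m(E + iε)) used for the comparison comes from a contour evaluation of Eq. (1.66) and is an input of this check rather than something quoted from the text:

```python
import numpy as np
from scipy.integrate import quad

# Illustrative check (an assumption, not from the text): free particle, D = 1,
# m = 1, with a finite eps in E + i*eps. Contour integration of Eq. (1.66)
# gives P(E; x, 0) = (m/p) exp(i p |x|), p = sqrt(2 m (E + i eps)), Im p > 0.
m, E, X, eps = 1.0, 1.0, 1.5, 0.1
p = np.sqrt(2*m*(E + 1j*eps))

def cquad(f, a, b):
    # quad for a complex-valued integrand
    re = quad(lambda u: f(u).real, a, b, limit=1000)[0]
    im = quad(lambda u: f(u).imag, a, b, limit=1000)[0]
    return re + 1j*im

# Eq. (1.66) in D = 1, integrated numerically, against the closed form
P_num = cquad(lambda s: 1j*np.exp(1j*s*X)/(E - s**2/(2*m) + 1j*eps)/(2*np.pi),
              -np.inf, np.inf)
P_closed = (m/p)*np.exp(1j*p*abs(X))

# Eq. (1.65): Int dy P(E;x,y) P(E;y,0) = P(E;x,0) [d ln P / d(iE)]
Pc = lambda z: (m/p)*np.exp(1j*p*abs(z))
lhs = cquad(lambda y: Pc(X - y)*Pc(y), -np.inf, np.inf)
PE = lambda e: (m/np.sqrt(2*m*(e + 1j*eps))) \
     * np.exp(1j*np.sqrt(2*m*(e + 1j*eps))*abs(X))
dE = 1e-5
rhs = P_closed*(np.log(PE(E + dE)) - np.log(PE(E - dE)))/(2j*dE)
print(abs(P_num - P_closed), abs(lhs - rhs))   # both differences are tiny
```

Note that the non-zero right hand side of Eq. (1.65), rather than P itself, is exactly what the y-integral reproduces: transitivity fails by the bracketed factor.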
In non-relativistic quantum mechanics all these are more of curiosities and
we don’t really care about the Jacobi action and a path integral defined by
it. (It is not even clear what a Schrödinger-like “propagation” could mean
when the time coordinate has been integrated out.) However these features
will hit us hard when we evaluate the path integral for a relativistic particle
because it will turn out to be identical in structure to the path integral
defined using a Jacobi action. We shall take that up next.
1.3 Quantum Propagation Amplitude for the Relativistic Particle
The above discussion would have convinced you that, as far as the nonrelativistic free particle is concerned, we can proceed from the classical
theory to the quantum theory in a fairly straightforward way. We can either
use the path integral and obtain the propagation amplitude or follow the
more conventional procedure of setting up a Hamiltonian and quantizing
the system. Encouraged by this success, we could try the same procedure
with a relativistic free particle and obtain its quantum description from the
classical counterpart using, say, the path integral. Here is where all sorts
of strange things are expected to happen. The procedure should lead us
from a single particle description to a field theoretic description. We will
now see how this comes about.
1.3.1 Path Integral for the Relativistic Particle
Let us next try our hand with a relativistic particle for which all hell is
expected to break loose. But since we know the classical action (as well as
the Hamiltonian) for this case as well, we should be able to compute the
amplitude G(x2 ; x1 ). The standard action for a relativistic particle is given
by
\[
A = -m \int_{t_1}^{t_2} dt\, \sqrt{1 - v^2} = -m \int_{x_1}^{x_2} \sqrt{\eta_{ab}\, dx^a\, dx^b} = -m \int_{\lambda_1}^{\lambda_2} d\lambda\, \sqrt{\eta_{ab}\, \dot{x}^a \dot{x}^b}
\tag{1.67}
\]
where xa (λ) gives a parameterized curve connecting the events x1 and x2
in the spacetime with the parameter λ and x˙ a ≡ dxa /dλ. In the second
and third forms of the expression, the integral is evaluated for any curve
connecting the two events with the limits of integration depending on the
nature of the parametrization. (For example, we have chosen x(λ = λ1 ) =
x1 , x(λ = λ2 ) = x2 but the numerical value of the integral is independent
of the parametrization and depends only on the curve. If we choose to use
λ = t as the parameter, then we reproduce the first expression from the
last.)
It is obvious that this action has the same structure as the Jacobi action
for a free particle discussed in the last section. To obtain the propagation
amplitude G(x2 ; x1 ), we need to do the path integral using the above action,
\[
G(x_2; x_1) = \sum_{0, x_1}^{t, x_2} \exp\left[-im \int_{t_1}^{t_2} dt\, \sqrt{1 - v^2}\right] = \sum_{0, x_1}^{t, x_2} \exp\left[-im \int_0^\tau d\lambda\, \sqrt{\dot{t}^2 - \dot{\mathbf{x}}^2}\right]
\tag{1.68}
\]
which can be accomplished using the results in the last section.¹⁶ We first
take the complex conjugate of Eq. (1.61) (in order to get the overall minus
sign in the action in Eq. (1.67)) and generalize the result from space to
Footnote 16: It is also possible, fortunately, to do this rigorously. (The details are given in the Mathematical Supplement in Sect. 1.6.2.) One can define the action in Eq. (1.67) on a Euclidean D-dimensional cubic lattice with lattice spacing ε, interpreting the amplitude as e^{−ml} where l is the length of any path connecting two given lattice points. The sum over paths can be calculated by multiplying the number of paths of a given length l by e^{−ml} and summing over all l. This is a simple problem on the lattice and the sum can be explicitly determined. The result in the continuum needs to be obtained by taking the limit ε → 0 using a suitable measure, which has to be determined in such a way that the limit is finite. Such an analysis gives the answer which is identical to what we obtain here and has the merit of being more rigorous, while the simpler procedure used here has the advantage of having a direct physical interpretation.
spacetime, leading to
\[
\sum_{x_1}^{x_2} \exp\left[-i\sqrt{2mE}\; \ell(x_2, x_1)\right] = C \int_0^\infty d\tau\, e^{-iE\tau} \sum_{0, x_1}^{t, x_2} \exp\left[-\frac{im}{2} \int_0^\tau d\lambda\, g_{ab}\, \dot{x}^a \dot{x}^b\right]
\tag{1.69}
\]
In order to get −imℓ(x₂, x₁) on the left hand side we take E = m/2; i.e., we use the above formula with the replacements
\[
E = \frac{m}{2}; \qquad g_{ab} = \mathrm{diag}\,(1, -1, -1, -1); \qquad \int_0^\tau g_{ab}\, \dot{x}^a \dot{x}^b\, d\lambda = \int_{t_1}^{t_2} dt\, (1 - v^2)
\tag{1.70}
\]
The path integral over the quadratic action can be immediately borrowed from Eq. (1.9) with D = 4, taking due care of the fact that in the quadratic action the ṫ² term enters with a negative sign while ẋ² enters with the usual positive sign. This gives an extra factor i to N and the answer is:
\[
\langle x_2, \tau | x_1, 0 \rangle = \theta(\tau)\, i \left(\frac{m}{2\pi i \tau}\right)^2 \exp\left[-\frac{i}{2}\frac{m x^2}{\tau}\right]
\tag{1.71}
\]
The θ(τ ) is introduced for the same reason as θ(t) in Eq. (1.9) but will
turn out to be irrelevant since we will integrate over it. Therefore the path
integral we need to compute is given by:
\[
\begin{aligned}
G(x_2; x_1) &= \sum_{0, x_1}^{t, x_2} \exp\left[-im \int_{t_1}^{t_2} dt\, \sqrt{1 - v^2}\right] \\
&= C \int_0^\infty d\tau\, \exp\left[-\frac{im}{2}\tau\right] i \left(\frac{m}{2\pi i \tau}\right)^2 \exp\left[-\frac{im}{2\tau} x^2\right] \\
&= (2Cm)(-i)\, \frac{m}{16\pi^2} \int_0^\infty \frac{ds}{s^2}\, e^{-ims}\, \exp\left[-\frac{im}{4s} x^2\right]
\end{aligned}
\tag{1.72}
\]
with τ = 2s in the last step.

Footnote 17: This is one physically meaningful choice for timelike curves; for spacelike and null curves, the corresponding choices are the proper length and the affine parameter.

Figure 1.4: Spacetime trajectories as a function of proper time; the paths run from τ = 0 at x^i₁ = (t₁, x₁) to τ = s at x^i₂ = (t₂, x₂).

We have thus given meaning to the sum over paths for the
relativistic particle thereby obtaining G(x2 ; x1 ). The integral expression
also gives a nice interpretation for G(x2 ; x1 ) which we will first describe
before discussing this result.
The trajectory of a classical relativistic particle in spacetime is given by
the four functions x^i(τ), where τ could be taken as the proper time shown by a clock which moves with the particle.¹⁷ Such a description treats space
and time on an equal footing with x(τ ) and t(τ ) being dependent variables
and τ being the independent variable having an observer independent, absolute, meaning. This is a natural generalization of x(t) in non-relativistic
mechanics with (x, y, z) being dependent variables and t being the independent variable having an observer independent, absolute, status. Let us
now consider an action for the relativistic particle given by
\[
A[x(\tau)] = -\frac{1}{4} m \int_0^s d\tau\, \dot{x}^a \dot{x}_a
\tag{1.73}
\]
where x˙ a ≡ (dxa /dτ ) etc. This action, of course, gives the correct equations
of motion d2 xa /dτ 2 = 0 but the overall constant in front of the integral —
which is arbitrary as far as classical equations of motion go — is chosen
based on the result obtained below in Eq. (1.75). Evaluating a path integral with this action will now lead to an amplitude of the form ⟨x₂, s|x₁, 0⟩
which describes a particle propagating from an event x1 to event x2 when
the proper time lapse is given by s. But we are interested in the amplitude G(x₂; x₁) and do not care about the amount of proper time that has elapsed. Therefore we also need to sum over (i.e., integrate) all the proper
time lapses with some suitable measure. Since the rest energy of the particle m is conjugate to the proper time (which measures the lapse of time
in the instantaneous co-moving Lorentz frame of a particle) it seems reasonable to choose this measure to be proportional to a phase factor e−ims .
Thus we have the relation
\[
G(x_2; x_1) = C_m \int_{-\infty}^{\infty} ds\, e^{-ims}\, \langle x_2, s | x_1, 0 \rangle = C_m \int_{-\infty}^{\infty} ds\, e^{-ims} \sum_{x(\tau)} e^{iA[x(\tau)]}
\tag{1.74}
\]
where C_m is a normalization constant,¹⁸ possibly dependent on m, which
we will fix later. (We have kept the integration limits on s to be the
entire real line but it will get limited to (0, ∞) because of θ(s) in the path
integral.) In the second equality we have used the standard path integral
prescription.
Exactly as before, the sum over paths is now to be evaluated limiting ourselves to paths x^i(τ) which only go forward in the proper time τ (see Fig.
1.4; just as the paths in Eq. (1.8) were limited to those which go forward in
the Newtonian absolute time t). However, we have to now allow paths like
the one shown in Fig. 1.2 which go back and forth in the coordinate time t
just as we allowed in Eq. (1.8) the paths which went back and forth in the
y coordinate, say. The time coordinate t(τ ) of a path now has the same
status as the spatial coordinate, say y(t), in the non-relativistic description.
The special role played by the absolute Newtonian time t is taken over by
the proper time τ in this description. This will, of course, have important
consequences later on.
Let us now get back to the discussion of our main result. We will rewrite
Eq. (1.72) in the form:
\[
\begin{aligned}
G(x_2; x_1) &= -(2Cm)\, i\, \frac{m}{16\pi^2} \int_0^\infty \frac{ds}{s^2} \exp\left[-ims - \frac{i}{4}\frac{m x^2}{s}\right] \\
&= -\frac{i}{16\pi^2} \int_0^\infty \frac{d\mu}{\mu^2} \exp\left[-i(m^2 - i\epsilon)\mu - \frac{i}{4}\frac{x^2}{\mu}\right]
\end{aligned}
\tag{1.75}
\]
where we have made three modifications to arrive at the second line. First,
we have rescaled the variable s to μ by s ≡ mμ. Second, we have made
the choice C = 1/2m which, as we shall see, matches with conventional
results later on and — more importantly — allows us to take the m → 0
limit, if we want to study zero mass particles. Finally, we have replaced m² by (m² − iε), where ε is an infinitesimal positive constant, in order to make the integral convergent in the upper limit.¹⁹ We set ε = 0 at the end
of the calculations. The integral can be evaluated in terms of MacDonald
functions, leading to the result:
\[
G(x_2; x_1) = \frac{m}{4\pi^2\, i\sqrt{x^2}}\; K_1\!\left(im\sqrt{x^2}\right)
\tag{1.76}
\]
where, of course, x² = t² − |x|², and hence its square root is imaginary for spacelike intervals.
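The step from Eq. (1.75) to Eq. (1.76) can be checked numerically in the Euclidean sector, where the integrand is real and damped: with μ → −iμ and x² → −r² (spacelike separation, taking the branch √(x²) = −ir), the claim becomes G_E(r) = m K₁(mr)/(4π²r). A minimal sketch in Python with SciPy, under exactly these assumptions:

```python
import numpy as np
from scipy.integrate import quad
from scipy.special import kv

# Illustrative check (an assumption, not from the text): Euclidean version of
# the mu-integral in Eq. (1.75), which should equal the MacDonald-function
# form of Eq. (1.76) at spacelike separation: G_E(r) = m K_1(m r)/(4 pi^2 r).
m = 1.3
for r in (0.5, 1.0, 2.0):
    G_int = quad(lambda mu: np.exp(-m**2*mu - r**2/(4*mu))/mu**2,
                 0, np.inf)[0]/(16*np.pi**2)
    G_K1 = m*kv(1, m*r)/(4*np.pi**2*r)
    print(r, G_int, G_K1)   # the last two columns agree
```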
Footnote 18: Dimension alert: The amplitude G(x₂; x₁) in Eq. (1.9) has the dimensions of (length)⁻ᴰ, as it should. So the G(x₂; x₁) in Eq. (1.74) will have the dimension (length)⁻³ after integrating over s if C_m is dimensionless. People like it to have dimensions of (length)⁻², which is achieved by taking C_m ∝ (1/m), as we will soon do.

Footnote 19: Alternatively, you could have worked everything out in the Euclidean sector it = t_E, is = s_E, etc. It will be equivalent to the iε prescription. (Try it out.)
That is it! We have found the propagation amplitude G(x2 ; x1 ) for
a relativistic particle just as we did it for a non-relativistic particle with
Eq. (1.75) being the relativistic analogue of Eq. (1.9). The expression, of
course, is a bit complicated algebraically but we will soon see that it has a
simple expression in Fourier space. Where are the quantum fields, negative
energy solutions and all the rest which we are supposed to see? To unravel
all these structures we will scrutinize the expression in Eq. (1.75) more
closely.
1.4 Mathematical Structure of G(x₂; x₁)

1.4.1 Lack of Transitivity

Exercise 1.9: Prove Eq. (1.77). It turns out that the transitivity does not hold even if we do the integral in Eq. (1.77) over 4-dimensions with d⁴y. The mathematical reason for this is that the amplitude we want, given by Eq. (1.74), is analogous to the propagation amplitude P(E; x₂, x₁) at fixed energy E in non-relativistic quantum mechanics, which is obvious from the fact that we evaluated the relativistic G(x₂; x₁) using the Jacobi path integral. As we saw in Sect. 1.2.4, such an energy propagator does not obey transitivity even in non-relativistic quantum mechanics.

Footnote 20: This drastic difference, of course, is related to the √(t² − |x|²) factor in Eq. (1.76). A simpler way to see this is to note that in the saddle point approximation to Eq. (1.75), the saddle point is determined by the condition μ = √(x²)/2m, which is real for x² > 0, making the exponentials in Eq. (1.75) oscillatory, while it is pure imaginary for x² < 0, leading to exponential damping.

Footnote 21: Though some textbooks fuss over this, it is no big deal. In non-relativistic quantum mechanics, a particle can tunnel through a barrier which is classically forbidden. So the fact that relativistic quantum theory allows certain processes which relativistic classical theory forbids, by itself, is no more of a surprise than the fact that tunneling amplitudes in quantum mechanics are non-zero. The real trouble arises due to issues connected to causality, as we shall see later.
The first disastrous consequence arising from the nature of paths which
were summed over in Eq. (1.74) is that the amplitude G(x2 ; x1 ) does not
satisfy the transitivity property:
\[
G(x_2; x_1) \neq \int d^D y\, G(x_2; y)\, G(y; x_1)
\tag{1.77}
\]
So, we cannot have a Schrödinger-like wave function ψ(x^i) which is propagated by G(x₂; x₁) in a consistent manner, in sharp contrast to the situation in non-relativistic mechanics. This means that we cannot think of the particle being ‘somewhere in space’ at intermediate times x₂⁰ > y⁰ > x₁⁰.
Mathematically, this is quite understandable because one of the paths we
summed over [see Fig 1.2] suggests that the particle was at three different
locations simultaneously.
1.4.2 Propagation Outside the Light Cone
Another immediate consequence is that the amplitude G(x2 ; x1 ) does not
vanish when x2 and x1 are separated by a spacelike interval, i.e., when x2
lies outside the light cone originating at x1 . From the asymptotic forms of
the MacDonald functions,²⁰ one can easily show that
\[
G(x_2; x_1) \longrightarrow
\begin{cases}
e^{\pm imt} & (\text{for } |\mathbf{x}| = 0) \\
e^{-m|\mathbf{x}|} & (\text{for } t = 0)
\end{cases}
\tag{1.78}
\]
When the two events are separated by a spacelike interval, one can always
find a coordinate system in which t = 0, in which the second form of the
result applies. It shows that the amplitude is non-zero outside the light
cone but decreases exponentially with a scale length (1/m).
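The spacelike entry of Eq. (1.78) follows from the large-argument asymptotics K₁(z) ≈ √(π/2z) e⁻ᶻ. A quick numerical confirmation (Python with SciPy, using the spacelike form G(r) = m K₁(mr)/(4π²r), which follows from Eq. (1.76) at x² = −r²): the local decay rate −d ln G/dr tends to m, up to power-law corrections.

```python
import numpy as np
from scipy.special import kv

# Illustrative check (an assumption, not from the text): spacelike falloff in
# Eq. (1.78). With G(r) = m K_1(m r)/(4 pi^2 r), the rate -d ln G/dr -> m.
m = 1.0
G = lambda r: m*kv(1, m*r)/(4*np.pi**2*r)
r, dr = 20.0, 1e-4
rate = -(np.log(G(r + dr)) - np.log(G(r - dr)))/(2*dr)
print(rate)   # close to m = 1, up to O(1/r) corrections
```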
Classically a relativistic particle cannot go outside the light cone, but
quantum mechanically it seems that the particle can.²¹ In fact, our discussion of the Jacobi action makes it clear that the sum over paths calculated
with the Jacobi action will give exponentially decaying amplitudes in classically forbidden regions due to tunneling, which is precisely what happens
here. This is also obvious from the fact that the relativistic Lagrangian L = −m√(1 − v²) becomes imaginary for |v| > 1, which arises in paths
which go outside the light cone. So this result is even to be expected.
The next trouble is that we can no longer localize the particle states.
It is clear from Eq. (1.75) and Eq. (1.78) that G(x2 ; x1 ) does not obey the
normalization condition in Eq. (1.4). The fact that ⟨x|y⟩ ≠ δ(x − y) shows that we cannot really think in terms of non-overlapping position eigenstates representing a single particle. Suppose we try to express G(x₂; x₁) as ⟨x₂|e^{−iH(p)t}|x₁⟩ for a relativistic, single particle Hamiltonian. Lorentz invariance plus translational invariance alone tells you that the result must have the form ⟨x₂|e^{−iH(p)t}|x₁⟩ = F(x²). Suppose now that F(x²) is nonzero for two events x₁ and x₂ which have a spacelike separation — as we found in our case. Then, one can choose a Lorentz frame in which t = 0 and it immediately follows that
\[
\lim_{t \to 0} \langle x_2 | e^{-iH(p)t} | x_1 \rangle = \langle x_2 | x_1 \rangle = F(-|\mathbf{x}|^2) \neq 0 \qquad (\text{for } \mathbf{x}_2 \neq \mathbf{x}_1)
\tag{1.79}
\]
The non-zero overlap ⟨x₂|x₁⟩ for x₂ ≠ x₁ signals the impossibility of physically localized single particle states — which is a direct consequence of F(x²) ≠ 0 for spacelike separations. We will say more about this in the next section, but it should be obvious from these facts that G(x₂; x₁) is not going to permit a simple single particle Schrödinger-like description.
There is also some difficulty with the simple notion of causality. We
know that when x₂ and x₁ are separated by a timelike interval, the causal relation x₂⁰ > x₁⁰ has a Lorentz invariant meaning. If you create a particle
at x1 (say, by a process in which an atom emits a photon thereby creating
a photon) and detect it at x2 (say, by a process in which another atom
absorbs it) and claim that the detection occurred after the creation, such a
statement has a Lorentz invariant meaning only if x2 and x1 are separated
by a timelike interval. All other observers will agree with you that an atom
detected the photon after it was first created. But our discussion shows
that a relativistic particle has a non-zero amplitude to reach a point x2
which is related by a spacelike interval with respect to x1 . In one frame
x02 > x01 might hold but you can always find another Lorentz frame in
which x02 < x01 . This suggests that some observers might end up detecting
a particle before it is created.22
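The frame dependence of the time ordering for a spacelike pair is easy to see numerically. A minimal sketch (the event coordinates and the boost velocity are arbitrary choices for illustration; units with $c=1$):

```python
import math

def boost(t, x, beta):
    """Lorentz boost (c = 1) of the event (t, x) along the x-axis."""
    gamma = 1.0 / math.sqrt(1.0 - beta ** 2)
    return gamma * (t - beta * x), gamma * (x - beta * t)

# A spacelike pair: creation at the origin, detection at (t, x) = (1, 2).
t, x = 1.0, 2.0
assert t ** 2 - x ** 2 < 0        # spacelike interval

# In this frame the detection happens after the creation (t > 0), but any
# boost with t/x < beta < 1 reverses the time order of the two events.
t_new, x_new = boost(t, x, beta=0.8)
assert t > 0 and t_new < 0        # detection now "precedes" creation
```

The invariant interval $t^2 - x^2$ is of course unchanged by the boost; only the sign of the time separation flips, which is exactly the ambiguity discussed above.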
1.4.3
Three Dimensional Fourier Transform of G(x2 ; x1 )
To come to grips with such issues and to see the physics behind $G(x_2;x_1)$, it is good to play the same trick which we played with the non-relativistic amplitude in Eq. (1.9), viz., Fourier transform it and look at it in momentum space. We can do this either by Fourier transforming with respect to the spatial coordinates or with respect to both space and time. This will also get rid of the Macdonald functions and express $G(x_2;x_1)$ in a human-readable form.
The spatial Fourier transform is given by the integral (where we have suppressed the $i\epsilon$ factor for simplicity):

$$\int G(x_2;x_1)\, e^{-i\mathbf{p}\cdot\mathbf{x}}\, d^3x = -\frac{i}{16\pi^2}\int_0^\infty \frac{d\mu}{\mu^2}\, e^{-im^2\mu - i(t^2/4\mu)} \times \int d^3x\, e^{i|\mathbf{x}|^2/4\mu - i\mathbf{p}\cdot\mathbf{x}} \qquad (1.80)$$
The integral over $d^3x$ has the value $(4\pi i\mu)^{3/2}\exp(-i|\mathbf{p}|^2\mu)$. Changing the variable of integration to $\rho$ with $\mu = \rho^2$, we find that the integral is given by

$$\int G(x_2;x_1)\, e^{-i\mathbf{p}\cdot\mathbf{x}}\, d^3x = \left(\frac{i}{\pi}\right)^{1/2}\int_0^\infty d\rho\, \exp\left(-i\omega_p^2\rho^2 - \frac{it^2}{4\rho^2}\right) \qquad (1.81)$$
²² This also makes a lot of people uncomfortable, and maybe rightly so. But no amount of advanced quantum field theory which you learn is ever going to make the amplitude $G(x_2;x_1)$ vanish when $x_2$ is outside the light cone of $x_1$. The probability for creating a particle at $x_1$ and detecting a particle at $x_2$ outside $x_1$'s light cone is non-zero in the real world and you just have to live with it. But what quantum field theory will provide is a nicer, causal, reinterpretation of this result in terms of processes which work within the light cones. The price you pay for that is a transition from a single particle description to a many (actually infinite) particle description.

Exercise 1.10: Prove the result for $I(a,b)$, which can be done using a clever trick, other than typing Integrate in Mathematica.
where $\omega_p = +\sqrt{\mathbf{p}^2 + m^2}$ is the energy of a classical particle with momentum $\mathbf{p}$. This integral can be evaluated using the result

$$I(a,b) = \int_0^\infty dx\, e^{-ia^2x^2 - ib^2x^{-2}} = \frac{1}{2|a|}\left(\frac{\pi}{i}\right)^{1/2} e^{-2i|a||b|} \qquad (1.82)$$

to give a remarkably simple final answer:

$$\int G(x_2;x_1)\, e^{-i\mathbf{p}\cdot\mathbf{x}}\, d^3x = \frac{1}{2\omega_p}\exp(-i\omega_p|t|) \qquad (1.83)$$
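The result in Eq. (1.82) can be checked against its damped (Euclidean) counterpart, $\int_0^\infty e^{-a^2x^2 - b^2/x^2}\,dx = (\sqrt{\pi}/2a)\,e^{-2ab}$ for $a,b>0$, from which Eq. (1.82) follows by analytic continuation $a^2 \to ia^2$, $b^2 \to ib^2$. A quick numerical sketch (the values of $a$, $b$ below are arbitrary):

```python
import math
from scipy.integrate import quad

def I_euclidean(a, b):
    # Damped counterpart of Eq. (1.82); the integrand vanishes rapidly as x -> 0,
    # so we guard against the 1/x**2 division at the endpoint.
    f = lambda x: math.exp(-a ** 2 * x ** 2 - b ** 2 / x ** 2) if x > 0 else 0.0
    val, _ = quad(f, 0.0, math.inf)
    return val

a, b = 1.3, 0.7
closed_form = (math.sqrt(math.pi) / (2 * a)) * math.exp(-2 * a * b)
assert abs(I_euclidean(a, b) - closed_form) < 1e-8
```

The clever trick asked for in Exercise 1.10 is the usual one: the combination $u = ax - b/x$ leaves the measure invariant and reduces the integral to a Gaussian.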
We can, therefore, express the propagation amplitude as

$$G(x_2;x_1) = \int \frac{d^3p}{(2\pi)^3}\, e^{i\mathbf{p}\cdot\mathbf{x}}\, \frac{1}{2\omega_p}\, e^{-i\omega_p|t|} \equiv \int d\Omega_p\left[\theta(t)\, e^{-ipx} + \theta(-t)\, e^{ipx}\right] \qquad (1.84)$$

The integration measure $d\Omega_p \equiv (d^3p/(2\pi)^3)(1/2\omega_p)$ is Lorentz invariant because it arises from the measure $d^4p\, \delta(p^2-m^2)\, \theta(p^0)$; the factor $\exp(\pm ipx)$ is, of course, Lorentz invariant.

Exercise 1.11: Prove that the integration measure $d\Omega_p \equiv (d^3p/(2\pi)^3)(1/2\omega_p)$ is indeed Lorentz invariant.

²³ Incidentally, the same mathematics occurs in a process called paraxial optics when we study light propagating along the $+z$ axis, say, in spite of the fact that you start with a wave equation which is second order in $z$ and symmetric under $z \Leftrightarrow -z$. In getting the non-relativistic quantum mechanics you are doing paraxial optics of particles in the time direction, keeping only forward propagation. For a detailed description of this curious connection, see Chapters 16, 17 of T. Padmanabhan, Sleeping Beauties in Theoretical Physics, Springer (2015).
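Exercise 1.11 can also be checked numerically in a 1+1-dimensional toy version, where $d\Omega_p = dp/(2\pi\cdot 2\omega_p)$: the integral of any function of the on-shell momentum is unchanged when its argument is boosted, because the Jacobian of the boost is compensated by the $1/\omega_p$ in the measure. (The test function and boost velocity below are arbitrary choices.)

```python
import math
from scipy.integrate import quad

m, beta = 1.0, 0.6
gamma = 1.0 / math.sqrt(1.0 - beta ** 2)
omega = lambda p: math.sqrt(p * p + m * m)

def invariant_integral(F):
    # 1+1-dimensional analogue of  ∫ dΩ_p F  with  dΩ_p = dp / (2π · 2ω_p)
    val, _ = quad(lambda p: F(p) / (2.0 * omega(p)), -math.inf, math.inf)
    return val / (2.0 * math.pi)

F = lambda p: math.exp(-omega(p))                          # F evaluated on-shell
F_boosted = lambda p: math.exp(-gamma * (omega(p) - beta * p))  # F at the boosted p^0

I1 = invariant_integral(F)
I2 = invariant_integral(F_boosted)
assert abs(I1 - I2) < 1e-7       # the measure compensates the boost exactly
```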
We see that Eq. (1.10) as well as the first expression in Eq. (1.84) can be expressed together in the form

$$\int d^3x\, G(x_2;x_1)\, e^{-i\mathbf{p}\cdot\mathbf{x}} = \begin{cases} \theta(t)\, e^{-i\omega_p t} & \text{(non-relativistic)} \\[4pt] \dfrac{1}{2\omega_p}\, e^{-i\omega_p|t|} & \text{(relativistic)} \end{cases} \qquad (1.85)$$

Expressed in this manner, the relativistic result looks surprisingly similar to the one obtained in the case of the non-relativistic particle in Eq. (1.11), except for two crucial differences.
First, the non-relativistic amplitude propagates modes with positive energy $\omega_p$ forward in time, and we could take it to be zero for $t<0$. The relativistic amplitude, on the other hand, has $|t|$, leading to propagation both forward and backward in time. It propagates modes with energy $\omega_p$ forward in time, but also propagates modes with negative energy $-\omega_p$ backwards in time, because $\omega_p|t| = (-\omega_p)t$ for $t<0$. As you can guess, this will be a major talking point for us.²³

What about the $c\to\infty$ limit, when we expect to recover the non-relativistic form of $G(x_2;x_1)$ from the relativistic expression? In this limit $(1/2\omega_p)\to(1/2m)$; so that is fine and arises from the $2mC=1$ normalization of Eq. (1.75). In the argument of the exponentials, we set $\omega_p \approx mc^2 + \mathbf{p}^2/2m$. If we now pull out a factor $\exp(-imc^2t)$, then the $t>0$ part of the propagator will have the non-relativistic form. The $t<0$ part of the propagator will pick up an $\exp(+2imc^2t)$ factor. So we get $e^{imc^2t}\,G_R(x_2;x_1) \approx G_{NR}(x_2;x_1)$ plus a term which oscillates rapidly, due to the $\exp(+2imc^2t)$ factor, when $c\to\infty$. At this stage you argue that the second term can be ignored compared to the first, thereby getting the correct NR limit with no $t<0$ contribution.
Second, the Fourier transform of $G(x_2;x_1)$ is not a pure phase in the relativistic case, due to the factor $(1/2\omega_p)$ in front. This is contrary to the result we found in Eq. (1.7) for the transitivity condition Eq. (1.2) to hold. But we know that transitivity does not hold in the relativistic case; see Eq. (1.77). Alternatively, we can use the result in Eq. (1.85) as a proof that transitivity does not hold in the relativistic case.
It is convenient at this stage to introduce two different Lorentz invariant amplitudes:

$$G_\pm(x_2;x_1) = \int d\Omega_p\, e^{i\mathbf{p}\cdot\mathbf{x}}\, e^{\mp it\omega_p} = \int \frac{d^3p}{(2\pi)^3}\, \frac{1}{2\omega_p}\, e^{i(\mathbf{p}\cdot\mathbf{x}\, \mp\, \omega_p t)} = \int d\Omega_p\, e^{\mp ipx} \qquad (1.86)$$

In arriving at the third equality, we have flipped the sign of $\mathbf{p}$ in the integrand for $G_-(x_2;x_1)$ in order to write everything in a manifestly Lorentz invariant form. Obviously,

$$G_+(x_2;x_1) = G_-(x_1;x_2) = G_-^*(x_2;x_1) \qquad (1.87)$$
so that only one of them is independent. The propagation amplitude in Eq. (1.84) can now be written as

$$G(x_2;x_1) = \theta(t)\, G_+(x_2;x_1) + \theta(-t)\, G_-(x_2;x_1) \qquad (1.88)$$
We know that the left hand side of the above expression is Lorentz invariant. Therefore the $\theta(\pm t)$ factors on the right hand side should not create any problems with Lorentz invariance. When the events $x_2$ and $x_1$ are separated by a timelike interval, the notion of $x_2^0 - x_1^0 \equiv t$ being positive or negative is a Lorentz invariant statement and causes no difficulty. Only one of the two terms on the right hand side of Eq. (1.84) contributes when $x_2$ and $x_1$ are separated by a timelike interval, in any Lorentz frame. The situation, however, is more tricky when $x_2$ and $x_1$ are separated by a spacelike interval. In that case, in a given Lorentz frame, we may have $t>0$ and $G(x_2;x_1) = G_+(x_2;x_1)$ in this frame, while a Lorentz transformation can take us to another frame in which the same two events have $t<0$, so that $G(x_2;x_1) = G_-(x_2;x_1)$ in this frame!
For consistency, it is necessary that $G_+(x_2;x_1) = G_-(x_2;x_1)$ when $x_2$ and $x_1$ are separated by a spacelike interval though, of course, they cannot be identically equal to each other. It is easy to verify that this is indeed the case. We have, in general,

$$G_+(x_2;x_1) - G_-(x_2;x_1) = G_+(x_2;x_1) - G_+(x_1;x_2) \qquad (1.89)$$

$$= \int d\Omega_p\left[e^{-ipx} - e^{+ipx}\right] = \int d\Omega_p\left[e^{-i\omega_p t + i\mathbf{p}\cdot\mathbf{x}} - e^{i\omega_p t + i\mathbf{p}\cdot\mathbf{x}}\right] = \int d\Omega_p\, e^{i\mathbf{p}\cdot\mathbf{x}}\left[e^{-i\omega_p t} - e^{i\omega_p t}\right] \qquad (1.90)$$

In arriving at the last equality, we have flipped the sign of $\mathbf{p}$ in the $e^{ipx}$ term. We can now see that this expression vanishes when $x^2 < 0$. In that case, one can always choose a coordinate system such that $t=0$, and the last equality in Eq. (1.90) shows that the integrand vanishes for $t=0$. The explicit Lorentz invariance of the expression assures us that it will vanish in all frames when $x^2 < 0$. Therefore, it does not matter which of the two terms in Eq. (1.84) is picked up when $x^2 < 0$, and everything is consistent, though in a subtle way. This result will play a crucial role later on when we introduce the notion of causality in quantum field theory.
1.4.4 Four Dimensional Fourier Transform of $G(x_2;x_1)$

²⁴ We are now using the expression $p_ix^i$ when all the components $(p^0, \mathbf{p})$ are independent ('off-shell'). We will use the notation $p\cdot x$ — with a dot in the middle — when the $p_i$ is off-shell, while $px$ stands for the same expression on-shell. Usually the context will make clear whether the expression is on-shell or off-shell, but do watch out.
Let us next consider the 4-dimensional²⁴ Fourier transform of $G(x_2;x_1)$. This expression, which will play a crucial role in our future discussions, is given by:

$$\int G(x_2;x_1)\, e^{ip\cdot x}\, d^4x = -\frac{i}{16\pi^2}\int_0^\infty \frac{d\mu}{\mu^2}\, e^{-i(m^2-i\epsilon)\mu} \times \int d^4x\, \exp\left(-\frac{i}{4}\frac{x^2}{\mu} + ip\cdot x\right) \qquad (1.91)$$

$$= \frac{i}{(p^2 - m^2 + i\epsilon)} \qquad (1.92)$$
²⁵ Though this expression looks remarkably relativistic (and indeed it is), it has an elementary connection to the non-relativistic result in Eq. (1.66). For the action in Eq. (1.73), the Hamiltonian is $H = -p^2/m$ and the energy conjugate to proper time will be $E = -m$. If you make the substitutions $E \to -m$, $p^2/2m \to -p^2/m$ in Eq. (1.66), you essentially get the result in Eq. (1.92), showing the origin of the "energy denominator".
The inverse²⁵ Fourier transform of the right hand side should, of course, now give $G(x_2;x_1)$. We will do this in two different ways to illustrate two important technical points. First, note that we can write the inverse Fourier transform using an integral representation as

$$G(x_2;x_1) = \int \frac{d^4p}{(2\pi)^4}\, \frac{i\, e^{-ip\cdot x}}{p^2 - m^2 + i\epsilon} = \int \frac{d^4p}{(2\pi)^4}\int_0^\infty d\mu\, e^{-ip\cdot x - i\mu(m^2 - p^2 - i\epsilon)} \qquad (1.93)$$
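The elementary $\mu$-integral used here, $\int_0^\infty d\mu\, e^{-i\mu(m^2-p^2-i\epsilon)} = i/(p^2-m^2+i\epsilon)$, is easy to verify numerically by separating real and imaginary parts; a sketch (the values of $m$, $\epsilon$ and the sample $p^2$ values are arbitrary):

```python
import math
from scipy.integrate import quad

m, eps = 1.0, 0.2

def schwinger(p2):
    # ∫_0^∞ dμ e^{-iμ(m² - p² - iε)}: the iε makes the integrand decay like e^{-εμ}
    a = m * m - p2                                  # oscillation frequency
    re, _ = quad(lambda u: math.cos(a * u) * math.exp(-eps * u), 0, math.inf)
    im, _ = quad(lambda u: -math.sin(a * u) * math.exp(-eps * u), 0, math.inf)
    return re + 1j * im

for p2 in (-2.0, 0.3, 2.5):                         # sample off-shell values of p²
    propagator = 1j / (p2 - m * m + 1j * eps)       # right hand side of Eq. (1.93)
    assert abs(schwinger(p2) - propagator) < 1e-6
```

Note that without the $i\epsilon$ the $\mu$-integral would not converge at all; this is the first hint of the crucial role the $i\epsilon$ plays below.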
We next note that the factor $e^{-i\mu(m^2-p^2)}$ can be thought of as a matrix element of the operator $e^{-i\mu(m^2+\Box)}$. Hence we can write

$$G(x_2;x_1) = \int_0^\infty d\mu\, \langle x_2|e^{-i\mu(\Box + m^2 - i\epsilon)}|x_1\rangle \equiv \int_0^\infty d\mu\, \langle x_2,\mu|x_1,0\rangle \qquad (1.94)$$
[Figure 1.5: The $i\epsilon$ prescription and the Euclidean prescription in the momentum space.]
So our amplitude $G(x_2;x_1)$ can be thought of as the amplitude for a particle to propagate from $x_1$ at $\tau=0$ to $x_2$ at $\tau=\mu$, followed by a sum over all $\mu$. This is identical in content to our original result Eq. (1.74).

The integrand $\langle x_2,\mu|x_1,0\rangle$ in Eq. (1.94) can be thought of as the propagation amplitude $\langle x_2,\mu|x_1,0\rangle = \langle x_2|e^{-i\mu H}|x_1\rangle$ for a (fictitious) particle with the operator $H = \Box + m^2 - i\epsilon$, sometimes called the super-Hamiltonian, which generates translations in proper time. This (super-)Hamiltonian $H = -p^2 + m^2$ corresponds to a (super-)Lagrangian $L = -(1/4)\dot{x}^2 - m^2$. This is essentially the same as the Lagrangian in Eq. (1.73), with a rescaling $\tau \to m\tau$ and the addition of a constant $(-m^2)$. The above result shows that one can think of the propagation amplitude $G(x_2;x_1)$ as arising from the propagation of a particle in a 5-dimensional space, with the fifth dimension being the proper time. The proper time should be integrated out to give the amplitude in physical spacetime.
The second technical aspect is related to the crucial role played by the $i\epsilon$ factor, which was introduced very innocuously in Eq. (1.75). In the inverse Fourier transform, separating out $d^4p$ as $d^3p\, dp^0$, we get

$$G(x_2;x_1) = \int \frac{d^4p}{(2\pi)^4}\, \frac{i\, e^{-ip\cdot x}}{(p^2 - m^2 + i\epsilon)} = \int \frac{d^3p}{(2\pi)^3}\, e^{i\mathbf{p}\cdot\mathbf{x}} \int_{-\infty}^{+\infty} \frac{dp^0}{2\pi}\, \frac{i\, e^{-ip^0t}}{(p^0)^2 - \omega_p^2 + i\epsilon} \qquad (1.95)$$

The $p^0$ integral is now well defined because the zeros of the denominator,

$$p^0 = \pm\left(\omega_p^2 - i\epsilon\right)^{1/2} \Rightarrow \pm(\omega_p - i\epsilon) \qquad (1.96)$$
are shifted off the real axis. If we did not have the $i\epsilon$ factor, the denominator would vanish at $p^0 = \pm\omega_p$ and the integral would be ill-defined until we specify a procedure for going around the poles. The $i\epsilon$ prescription tells you that the contour has to go above the pole at $+\omega_p$ and below the pole at $-\omega_p$, which is equivalent to using the contour in Fig. 1.5 with integration along $p^0(1+i\epsilon)$, keeping the poles on the real axis. This allows us to rotate the contour to the imaginary axis in the complex $p^0$ plane and define the expressions in the Euclidean sector by analytic continuation. In the complex $p^0$ plane we think of the integral as being along $p^0 e^{i\theta}$, with $\theta = \epsilon$ giving the $i\epsilon$ prescription and $\theta = \pi/2$ giving the Euclidean prescription. The latter is equivalent to replacing $p^0$ by $ip_E^0$ and treating $p_E^0$ as real.
We recall from Sect. 1.2.3 that the corresponding change for the time coordinate was $t \to -it_E$. So the complex phase of the Fourier transform remains complex when we simultaneously analytically continue in both $t$ and $p^0$. That is,

$$i(p^0t - \mathbf{p}\cdot\mathbf{x}) \to i(p_E^0 t_E - \mathbf{p}\cdot\mathbf{x}) \qquad (1.97)$$
The relative sign between the two terms in the Euclidean expression on the right hand side might appear strange, but it does not matter when we deal with integrals over functions of $p^2$. Using the trick of flipping the sign of the spatial momenta, we have the identity:

$$\int d^4p\, f(p^2)\, e^{-ip\cdot x} = i\int d^4p_E\, f(-p_E^2)\, e^{-i(p_E^0 t_E - \mathbf{p}\cdot\mathbf{x})} = i\int d^4p_E\, f(-p_E^2)\, e^{-i(p_E^0 t_E + \mathbf{p}\cdot\mathbf{x})} = i\int d^4p_E\, f(-p_E^2)\, e^{-ip_E\cdot x_E} \qquad (1.98)$$
where $p_E\cdot x_E \equiv p_E^0 t_E + \mathbf{p}\cdot\mathbf{x}$, so that everything looks sensible for functions of $p^2$. In the Euclidean sector our integral will become

$$\int \frac{d^4p}{(2\pi)^4}\, \frac{i\, e^{-ip\cdot x}}{(p^2 - m^2 + i\epsilon)} = \int \frac{d^4p_E}{(2\pi)^4}\, \frac{e^{-ip_E\cdot x_E}}{p_E^2 + m^2} \qquad (1.99)$$

which, of course, is well defined. This is yet another illustration that if we had worked everything out in the Euclidean sector and analytically continued back to real time and real $p^0$, we would have got the correct result as obtained by the $i\epsilon$ prescription. Hence, one often uses the $i\epsilon$ prescription and the Euclidean definition interchangeably in field theory.
Either by contour integration or by doing the Euclidean integral and analytically continuing, one can easily show that

$$\int_{-\infty}^{+\infty} \frac{dp^0}{2\pi}\, \frac{i\, e^{-ip^0t}}{(p^0)^2 - \omega_p^2 + i\epsilon} = \frac{1}{2\omega_p}\, e^{-i\omega_p|t|} \qquad (1.100)$$

Thus we get the final answer to be

$$G(x_2;x_1) = \int \frac{d^3p}{(2\pi)^3}\, e^{i\mathbf{p}\cdot\mathbf{x}}\, \frac{1}{2\omega_p}\, e^{-i\omega_p|t|} \equiv \int d\Omega_p\left[\theta(t)\, e^{-ipx} + \theta(-t)\, e^{ipx}\right] \qquad (1.101)$$

which, of course, matches with our earlier result in Eq. (1.84).
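Eq. (1.100) can also be checked by brute-force numerical integration along the real $p^0$ axis with a small but finite $\epsilon$; a sketch (the cutoff $L$ and the loose tolerance account for the $1/(p^0)^2$ tail and the $O(\epsilon)$ corrections of the finite-$\epsilon$ answer):

```python
import math, cmath
from scipy.integrate import quad

omega, eps, t = 1.5, 0.01, 1.0
L = 200.0                       # cutoff; the neglected tail falls like 1/(p0)^2

def f(p0):
    # integrand of Eq. (1.100): i e^{-i p0 t} / ((p0)^2 - ω_p^2 + iε)
    return 1j * cmath.exp(-1j * p0 * t) / (p0 * p0 - omega * omega + 1j * eps)

# tell the adaptive integrator where the near-poles at ±ω_p sit
kw = dict(points=[-omega, omega], limit=500)
re, _ = quad(lambda p0: f(p0).real, -L, L, **kw)
im, _ = quad(lambda p0: f(p0).imag, -L, L, **kw)
lhs = (re + 1j * im) / (2 * math.pi)
rhs = cmath.exp(-1j * omega * abs(t)) / (2 * omega)
assert abs(lhs - rhs) < 0.01    # agreement up to O(ε) and tail corrections
```

Repeating the run with $t<0$ picks out the other pole and reproduces the $e^{-i\omega_p|t|}$ dependence, i.e. the backward-in-time piece of Eq. (1.101).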
Exercise 1.12: Work out the $p^0$ integration along a contour which goes (a) above both poles and (b) below both poles. These expressions correspond to what are known as the retarded and advanced propagators in the context of, e.g., a harmonic oscillator.
1.4.5 The First Non-triviality: Closed Loops and $G(x;x)$

Since $G(x_2;x_1)$ does not vanish outside the light cone, one is tempted to look at the form of $G(x;x)$, which would correspond to the amplitude for a particle to propagate in a closed loop in spacetime! This process is remarkable because it could happen even without any relativistic particle in sight. Being a "zero particle process" (see Fig. 1.6), it contains the key essence of the new features we have stumbled upon. To analyse this, let us go back to the amplitude $\langle x_2,\mu|x_1,0\rangle$ given in Eq. (1.94) for a particle to propagate from $x_1$ to $x_2$ in proper time $\mu$. We define the Fourier transform of this expression with respect to $\mu$ to obtain

$$\int_0^\infty \langle x,\mu|x,0\rangle\, e^{iE\mu}\, d\mu \equiv \mathcal{P}(x;E) \qquad (1.102)$$

[Figure 1.6: The amplitude for such closed loops is nonzero.]
where $\mathcal{P}(x;E)$ can be thought of as the amplitude for a closed loop with energy $E$. (This integral could have been from $\mu=-\infty$ to $\mu=+\infty$ if we interpret $\langle x_2,\mu|x_1,0\rangle$ with a $\theta(\mu)$ factor.) We next integrate over all energies and define a quantity

$$E(x) = \frac{1}{2}\int_0^\infty dE\, \mathcal{P}(x;E) \qquad (1.103)$$

Taking the definition of $\langle x_2,\mu|x_1,0\rangle$ from Eq. (1.94) and integrating over $E$ first, we get²⁶

$$E = \frac{1}{2}\int_0^\infty d\mu\, \frac{-1}{i\mu}\, \langle x,\mu|x,0\rangle = \frac{i}{2}\int_0^\infty \frac{d\mu}{\mu}\, \langle x|e^{-i\mu(\Box+m^2)}|x\rangle \qquad (1.104)$$

It follows from this relation that

$$\frac{\partial E}{\partial m^2} = \frac{1}{2}\, G(x;x) \qquad (1.105)$$

²⁶ Later on we will see that $E$ can be interpreted as an effective Lagrangian of a theory when $\langle x_2,\mu|x_1,0\rangle = \langle x_2|e^{-i\mu H}|x_1\rangle$ arises from a more complicated operator $H$ in interacting theories. The factor $(1/2)$ is added in Eq. (1.103) because each loop can be traversed in two orientations — clockwise or anticlockwise.
On the other hand, from Eq. (1.84) we find that (with the $i\epsilon$ temporarily restored)

$$\frac{1}{2}\, G(x;x) = \frac{1}{2}\int \frac{d^3p}{(2\pi)^3}\, \frac{1}{2\omega_p} = \frac{\partial}{\partial m^2}\int \frac{d^3p}{(2\pi)^3}\, \frac{1}{2}\,\omega_p \qquad (1.106)$$

Comparing Eq. (1.105) with Eq. (1.106), we get the remarkable result

$$E = \int \frac{d^3p}{(2\pi)^3}\, \frac{1}{2}\,\omega_p \qquad (1.107)$$
when we set the integration constant to zero. Or, more explicitly,

$$\frac{1}{2}\int_0^\infty dE \int_0^\infty d\mu\, e^{iE\mu}\, \langle x,\mu|x,0\rangle = \int \frac{d^3p}{(2\pi)^3}\, \frac{1}{2}\,\omega_p \qquad (1.108)$$

²⁷ The divergence of the expressions makes some of these operations "illegal". One can obtain the same results after "regularizing" the integrals — a procedure we will describe later on. (Whether that makes anything "legal" is a question you are not supposed to ask!)

The left hand side can be interpreted (except for a factor $(1/2)$ which, as we said, takes care of traversing the loop in two directions) as giving the amplitude of all closed loops, obtained by summing over the energy associated with each loop. We find that this is the same as summing over the zero point energies, $(1/2)\omega_p$, of an infinite number of harmonic oscillators, each labeled by the frequency $\omega_p$. Right now there are no harmonic oscillators anywhere in sight, and it is rather intriguing that we get a result like this; a physical interpretation will emerge much later, in spite of the fact that both sides of this expression are divergent.²⁷
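The differentiation under the integral sign used in Eq. (1.106) rests on the mode-by-mode identity $\partial\omega_p/\partial m^2 = 1/(2\omega_p)$, which a quick finite-difference check confirms (the sample values of $\mathbf{p}^2$, $m^2$ below are arbitrary):

```python
import math

omega = lambda p2, m2: math.sqrt(p2 + m2)   # ω_p as a function of p² and m²

p2, m2, h = 2.0, 1.0, 1e-6
# d(ω_p/2)/dm² computed by central differences ...
lhs = (omega(p2, m2 + h) - omega(p2, m2 - h)) / (2 * h) / 2
# ... must equal the integrand (1/2)(1/2ω_p) of (1/2) G(x;x) in Eq. (1.106)
rhs = 0.5 / (2 * omega(p2, m2))
assert abs(lhs - rhs) < 1e-8
```

The divergence of Eq. (1.107) lives entirely in the $d^3p$ integration; the identity being checked here, which connects the loop amplitude to the zero point energies, is finite mode by mode.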
1.4.6 Hamiltonian Evolution: Relativistic Particle

To understand further the structural difference in $G(x_2;x_1)$ between the relativistic and non-relativistic case, we will now try to redo our analysis in Sect. 1.2.2 for the relativistic case. The lack of transitivity in $G(x_2;x_1)$ should tell you that this is a lost cause, but it is interesting to see it explicitly. The classical Hamiltonian for a relativistic particle is given by $H(p) = +\sqrt{\mathbf{p}^2+m^2}$, which has a square root and is defined to be positive definite. Squaring this is a notoriously bad idea, since it allows negative energy states to creep in, and is universally condemned in textbooks. So we don't want to do that. If we retain the square root and use the coordinate representation with $\mathbf{p} \to -i\nabla$, we have to deal with the operator $\sqrt{-\nabla^2+m^2}$, which is non-local. Its action on a function $f(\mathbf{x})$ is given by

$$\sqrt{-\nabla^2+m^2}\, f(\mathbf{x}) \equiv \int \frac{d^Dp}{(2\pi)^D}\, (p^2+m^2)^{1/2}\, f(\mathbf{p})\, e^{i\mathbf{p}\cdot\mathbf{x}} = \int d^Dy\, G(\mathbf{x}-\mathbf{y})\, f(\mathbf{y}) \qquad (1.109)$$

where $G(\mathbf{x})$ is a singular kernel given by

$$G(\mathbf{x}) \equiv \int \frac{d^Dp}{(2\pi)^D}\, (p^2+m^2)^{1/2}\, e^{i\mathbf{p}\cdot\mathbf{x}} \qquad (1.110)$$
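The non-locality of $\sqrt{-\nabla^2+m^2}$ is easy to exhibit numerically: applying the operator in momentum space, where it is diagonal, to a sharply localized function spreads the result over a region of size $\sim 1/m$. A $D=1$ sketch on a periodic grid (the grid parameters and test function are arbitrary choices):

```python
import numpy as np

m, N, L = 1.0, 512, 40.0
x = np.linspace(-L / 2, L / 2, N, endpoint=False)
k = 2 * np.pi * np.fft.fftfreq(N, d=L / N)      # grid momenta

f = np.exp(-x ** 2)                             # sharply localized test function
# sqrt(-∇² + m²) is diagonal in momentum space: multiply by sqrt(k² + m²)
g = np.fft.ifft(np.sqrt(k ** 2 + m ** 2) * np.fft.fft(f)).real

# The output has tails on the Compton scale ~ 1/m, far outside the effective
# support of f: the operator is non-local in position space.
i5 = np.argmin(np.abs(x - 5.0))                 # a point where f is negligible
assert abs(g[i5]) > 100 * abs(f[i5])
```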
The simplest way to avoid this problem²⁸ is to work in the momentum representation, in which $H(p)$ is diagonal and local. So we begin again by introducing a complete set of momentum eigenstates $|p\rangle$. The completeness condition analogous to Eq. (1.18) will now have to be Lorentz invariant and hence should have the form

$$\int \frac{d^Dp}{(2\pi)^D}\, \frac{1}{2\omega_p}\, |p\rangle\langle p| = \int d\Omega_p\, |p\rangle\langle p| = 1 \qquad (1.111)$$

Taking the product with $|k\rangle$, we now require the Lorentz invariant orthonormality condition²⁹

$$\langle p|k\rangle = (2\pi)^D\, (2\omega_k)\, \delta(\mathbf{p}-\mathbf{k}) \qquad (1.112)$$

We next introduce position eigenstates $|\mathbf{x}\rangle$ and will try to interpret the amplitude $G(x_2;x_1)$ as $\langle x_2|e^{-iH(p)t}|x_1\rangle$. This leads to:

$$\langle x_2|e^{-iH(p)t}|x_1\rangle = \int d\Omega_p\, e^{-i\omega_p t}\, \langle x_2|p\rangle\langle p|x_1\rangle \qquad (1.113)$$

Once again we can argue, based on translational invariance (or the fact that in the momentum basis the commutation rule $[x^\alpha, p_\beta] = i\delta^\alpha_\beta$ is implemented using $\hat{x}^\alpha = i\partial/\partial p_\alpha$), that $\langle p|x\rangle = C_p \exp[-i\mathbf{p}\cdot\mathbf{x}]$. Therefore,

$$\langle x_2|e^{-iH(p)t}|x_1\rangle = \int d\Omega_p\, e^{-i\omega_p t}\, |C_p|^2\, e^{i\mathbf{p}\cdot\mathbf{x}} = \int d\Omega_p\, |C_p|^2\, e^{-ipx} \qquad (1.114)$$

We will require the amplitude $G(x_2;x_1) = \langle x_2|e^{-iH(p)t}|x_1\rangle$ to be Lorentz invariant. Since $d\Omega_p$ and $e^{-ipx}$ are Lorentz invariant, it follows that $|C_p|^2$ must be a constant, which we can take to be unity. If we make this choice, then we will have

$$\langle x_2|e^{-iH(p)t}|x_1\rangle = G(x_2;x_1) \qquad (\text{for } t_2 > t_1) \qquad (1.115)$$

Exercise 1.13: Prove this and evaluate $G(\mathbf{x})$ explicitly. It is pretty much a useless object, but you learn how to work with singular integrals in the process.
²⁸ If we have to deal with a Schrödinger-like equation in coordinate space which is non-local, you might think we have already lost the battle. But note that, even in the usual non-relativistic quantum mechanics, there is nothing sacred about the coordinate representation vis-à-vis the momentum representation. If you write the Schrödinger equation for any, say, non-polynomial potential $V(x)$ in the momentum representation, you will end up getting a non-local Schrödinger equation in momentum space. Nothing should go wrong, just because of this.

²⁹ This follows from the Lorentz invariance of $1$ and $d\Omega_p$ when expressed in the form $1 = \int d^Dp\, \delta(\mathbf{p}-\mathbf{k}) = \int d\Omega_p\, (2\pi)^D\, 2\omega_p\, \delta(\mathbf{p}-\mathbf{k})$.
as can be verified by comparing with Eq. (1.84). The expression on the left hand side cannot generate the form of $G(x_2;x_1)$ for $t_2 < t_1$. This term requires $e^{+ipx}$, which in turn involves the negative energy factor with $-\omega_p$. Since we started with a Hamiltonian which is explicitly positive definite, there is no way we can get this term in this manner.

This fact already shows that our program fails; there is, however, one more difficulty which is worth pointing out. The Lorentz invariance of $G(x_2;x_1)$ and our desire to express it in the form $\langle x_2|e^{-iH(p)t}|x_1\rangle$ have forced us to choose $|C_p|^2 = 1$. But then, if we multiply Eq. (1.111) from the left by $\langle x|$ and from the right by $|y\rangle$, we get

$$\langle x|y\rangle = \int d\Omega_p\, \langle x|p\rangle\langle p|y\rangle = \int \frac{d^Dp}{(2\pi)^D}\, \frac{1}{2\omega_p}\, e^{i\mathbf{p}\cdot(\mathbf{x}-\mathbf{y})} \neq \delta(\mathbf{x}-\mathbf{y}) \qquad (1.116)$$
Exercise 1.14: Construct a single particle Lorentz invariant relativistic quantum theory with a consistent interpretation. (Hint: Of course, you cannot! But a good way to convince yourself of the need for quantum field theory is to attempt this exercise and see for yourself how hard it is.)

³⁰ This is closely related to something called the Newton-Wigner position operator in the literature. It is known that this idea has serious problems.

This shows that $\langle x|y\rangle$ is in general non-zero for $\mathbf{x} \neq \mathbf{y}$. (In fact, from our expression in Eq. (1.76) it is clear that $\langle x|y\rangle$ is non-zero and decreases exponentially with a scale length of $(1/m)$.) Physically, this indicates the impossibility of localizing particles to a region smaller than $(1/m)$ while still maintaining a single particle description. We have already seen in the last section that this is connected with the fact that $G(x_2;x_1)$ does not vanish for spacelike intervals.
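The exponential fall-off of $\langle x|y\rangle$ on the Compton scale $1/m$ can be made explicit in one dimension, where the analogue of Eq. (1.116) has the closed form $(1/2\pi)\int dp\, e^{ipr}/2\omega_p = K_0(mr)/2\pi$, with $r = |\mathbf{x}-\mathbf{y}|$ and $K_0$ the Macdonald function. A numerical sketch of this $D=1$ toy case:

```python
import math
from scipy.integrate import quad
from scipy.special import k0      # Macdonald (modified Bessel) function K_0

m = 1.0

def overlap(r):
    # 1D analogue of Eq. (1.116): (1/2π) ∫ dp e^{ipr} / (2ω_p) = K0(mr)/(2π);
    # the oscillatory tail is handled by the Fourier weight option of quad.
    val, _ = quad(lambda p: 1.0 / (2.0 * math.sqrt(p * p + m * m)),
                  0, math.inf, weight='cos', wvar=r)
    return val / math.pi

for r in (1.0, 2.0, 4.0):
    assert abs(overlap(r) - k0(m * r) / (2 * math.pi)) < 1e-5

# non-zero for x != y, but decaying on the scale 1/m
assert overlap(4.0) < overlap(1.0)
```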
There is no easy way out of these difficulties. If, for example, we had chosen $C_p = \sqrt{2\omega_p}$, then we could have canceled out the $(1/2\omega_p)$ in Eq. (1.116), thereby making $\langle x|y\rangle = \delta(\mathbf{x}-\mathbf{y})$. But with such a choice, Eq. (1.113) would have picked up an extra $2\omega_p$ factor and the resulting expression would not be Lorentz invariant.³⁰ Part of these technical difficulties arises from the following elementary fact: In non-relativistic quantum mechanics we have two equally good bases, corresponding either to the position eigenvalue $|x\rangle$ or to the momentum eigenvalue $|p\rangle$, with integration measures $d^3x$ and $d^3p$. In the case of a relativistic particle, we do have a Lorentz invariant integration measure $d^3p/(2\omega_p)$ which can be used with $|p\rangle$. But we do not have anything analogous which will make the integration measure $d^3x$ into something which is physically meaningful and Lorentz invariant. Roughly speaking, all this boils down to the fact that a free particle with momentum $\mathbf{p}$ has an associated energy $\omega_p$, while a free particle at position $\mathbf{x}$ does not have "an associated time $t$" with it. So the momentum basis $|p\rangle$ continues to be useful in quantum field theory while talking about particle interactions etc., but the position basis is harder to define and use.

The best we can do is to stick with Lorentz invariance with the choice $C_p = 1$ and use the two different Lorentz invariant amplitudes $G_\pm(x_2;x_1)$ defined earlier in Eq. (1.86). We can keep these two amplitudes separate and think of $G_+(x_2;x_1)$ as arising from evolution due to the Hamiltonian $H = +\sqrt{\mathbf{p}^2+m^2}$, while we have to deliberately introduce another Hamiltonian $H = -\sqrt{\mathbf{p}^2+m^2}$ to obtain $G_-(x_2;x_1)$. Comparing with our discussion in the case of the non-relativistic particle, we find that the existence of two energies $\pm\omega_p$ requires us to define two different amplitudes.
In the path integral approach, on the other hand, we are naturally led to an expression for $G(x_2;x_1)$ which can be expressed in the form of Eq. (1.88). If we knew nothing about the path integral result, we would have made the "natural" choice of $+\omega_p$ and declared the propagator to be $\theta(t)\,G_+(x_2;x_1)$, which is just the first term in Eq. (1.88). This is inconsistent with the path integral result; it is also not Lorentz invariant, because $\theta(t)$ is not Lorentz invariant when $x_2$ and $x_1$ are spacelike-separated. One could have restored Lorentz invariance with only positive energies by taking the amplitude to be just $G_+(x_2;x_1)$ (without the $\theta(t)$, for all times), but then it will not match the path integral result for $t<0$.
The only way to interpret $G(x_2;x_1)$ is as follows: There is a non-zero amplitude for propagating either forward in time or backward in time.³¹ Forward propagation occurs with positive energy $+\omega_p$, while backward propagation occurs with negative energy $-\omega_p$! This is the main difference between Eq. (1.84) and Eq. (1.11): in the non-relativistic case given by Eq. (1.11) there are only positive energy terms and they propagate forward in time; in the relativistic case given by Eq. (1.84) we have both positive energy terms which propagate forward in time (with $+\omega_p$) and negative energy terms which propagate backward in time (with $-\omega_p$). It looks as though the relativistic propagation amplitude simultaneously describes two kinds of particles rather than one: those with positive energy which propagate forward, and some other strange species with negative energies which propagate backwards.

We saw earlier that, because $G(x_2;x_1)$ is non-zero for spacelike separation, there can be an ambiguity when two observers try to interpret the emission and absorption of particles. The existence of propagation with negative energy adds a new twist to it. To make the idea concrete, consider a two-level system with energies $E_2$, $E_1$, with $E \equiv E_2 - E_1 > 0$. We use one of these systems (call it $A$) to emit a particle and a second system (call it $B$) to absorb the particle. To do this, we first get $A$ to the excited state and let it drop back to the ground state, emitting a particle with positive energy; this particle propagates to $B$, which is in the ground state, and gets excited on absorbing the positive energy. But now consider what happens when we allow for the propagation of modes with negative energy. You can start with system $A$ in the ground state, emit a particle with negative energy $(-E)$ and in the process get it excited to $E_2$. If we had kept $B$ in an excited state, then it can absorb this particle with energy $(-E)$ and come down to the ground state. Someone who did not know about the negative energy modes would have treated this as emission of a particle by $B$ and absorption by $A$, rather than the other way round. Clearly, the lack of causal ordering between emission and absorption can be reinterpreted in terms of the existence of negative energy modes. We will make this notion sharper in Sect. 2.1.
³¹ Note that, in the relativistic case, we cannot express the propagation amplitude $G(x_2;x_1)$ as $\langle x_2|x_1\rangle$. This is because, in the relativistic case, we no longer have a $|x_1\rangle = |t_1, \mathbf{x}_1\rangle$ which is an eigenket of some sensible position operator $\hat{\mathbf{x}}(t_1)$ with eigenvalue $\mathbf{x}_1$. We do have such an eigenket in NRQM.

1.5 Interpreting $G(x_2;x_1)$ in Terms of a Field
All this is vaguely disturbing to someone who is accustomed to sensible physical evolution which proceeds monotonically forward in time, from $t_1 < t_2 < t_3 \ldots$, and hence it would be nice if we could reinterpret $G(x_2;x_1)$ in such a nice, causal manner.³² We have already seen that $G(x_2;x_1)$ comes from summing over paths which include the likes of the ones in Figs. 1.2, 1.7. If we say that a single particle has three degrees of freedom (in $D=3$), then we start and end (at $x_1$ and $x_2$) with three degrees of freedom in Fig. 1.7. But if a path cuts a spatial slice at an intermediate time at $k$ points (the figure is drawn for $k=3$), then we need to be able to describe $3k$ degrees of freedom at this intermediate time. Since $k$ can be arbitrarily large, we conclude that if we want a description in terms of causal evolution going from $t_1$ to $t_2$, then we need to use a mathematical description involving an

³² Recall that the result for $G(x_2;x_1)$ in Eq. (1.75) or Eq. (1.84) is non-negotiable! All that we are allowed to do is to come up with some other system and a calculational rule which is completely causal and leads to the same expression for $G(x_2;x_1)$.
infinite number of degrees of freedom.³³ Obviously, such a system must be 'hidden' in $G(x_2;x_1)$, because it knows about paths which cut the $t=$ constant surface an arbitrarily large number of times. Let us see whether we can dig it out from $G(x_2;x_1)$.

³³ These are called fields.

Since (i) the Fourier transform of a function has the same information as the function, and since (ii) the real difference between the relativistic and non-relativistic cases (see Eq. (1.85)) is in the manner in which propagation in time is treated, it makes sense to try to understand the spatial Fourier transform of $G(x_2;x_1)$ from some alternate approach. This quantity is given by (see Eq. (1.85))

$$\int d^3x\, G(x_2;x_1)\, e^{-i\mathbf{p}\cdot\mathbf{x}} = \frac{1}{2\omega_p}\, e^{-i\omega_p|t|} \qquad (1.117)$$

[Figure 1.7: Path that represents 3 particles at $t = y^0$ though there was only one at $t = t_1$.]
but, incredibly enough, the same expression arises in a completely different context. Consider a quantum mechanical harmonic oscillator with frequency $\omega_p$ and unit mass. If $q_p(t)$ is the dynamical variable characterizing this oscillator, then it is trivial to verify that

$$\langle 0|T\left[q_p(t_2)\, q_p^\dagger(t_1)\right]|0\rangle = \frac{1}{2\omega_p}\, e^{-i\omega_p|t|} \qquad (1.118)$$

where $|0\rangle$ is the ground state of the oscillator and $T$ is a time-ordering operator which arranges the variables within the bracket chronologically from right to left.³⁴ For example,

$$T[A(t_2)B(t_1)] \equiv \theta(t_2-t_1)\, A(t_2)B(t_1) + \theta(t_1-t_2)\, B(t_1)A(t_2) \qquad (1.119)$$

³⁴ There is an ambiguity when $t_1 = t_2$ if $A$ and $B$ do not commute at equal times. This turns out to be irrelevant in the present case.
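Eq. (1.118) is indeed trivial to verify, e.g. in a truncated Fock space. The sketch below uses a single Hermitian oscillator $q(t) = (a\,e^{-i\omega t} + a^\dagger e^{i\omega t})/\sqrt{2\omega}$, whose vacuum T-ordered correlator is the same $e^{-i\omega|t|}/2\omega$ (the truncation level and the sample times are arbitrary choices):

```python
import numpy as np

N, w = 8, 1.3                                  # truncation level and frequency ω
a = np.diag(np.sqrt(np.arange(1, N)), 1)       # annihilation operator: a|n> = sqrt(n)|n-1>
vac = np.zeros(N); vac[0] = 1.0                # ground state |0>

def q(t):
    # Heisenberg picture: q(t) = (a e^{-iωt} + a† e^{iωt}) / sqrt(2ω)
    return (a * np.exp(-1j * w * t) + a.conj().T * np.exp(1j * w * t)) / np.sqrt(2 * w)

def G(t2, t1):
    # <0| T[q(t2) q(t1)] |0>, later time placed to the left
    A, B = (q(t2), q(t1)) if t2 >= t1 else (q(t1), q(t2))
    return vac @ (A @ B @ vac)

for t2, t1 in [(0.7, 0.2), (0.1, 0.9)]:        # one case for each time order
    expected = np.exp(-1j * w * abs(t2 - t1)) / (2 * w)
    assert abs(G(t2, t1) - expected) < 1e-12
```

The truncation is exact here, since acting on the vacuum only the first excited level is ever reached; only $\langle 0|a\,a^\dagger|0\rangle = 1$ survives in the correlator.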
Usually, the dynamical variable $q_p(t)$ of a quantum mechanical oscillator will be Hermitian, so that $q_p^\dagger = q_p$; this is because we think of the corresponding classical coordinate as real. Here we are just trying to construct a mathematical structure which will reproduce the right hand side of Eq. (1.117), so it is not necessary to impose this condition, and we will keep $q_p$ non-Hermitian. Let us briefly review how the usual results for a harmonic oscillator translate to this case before proceeding further.

For this purpose, it is useful to begin with two harmonic oscillators with the same frequency $\omega$ and unit mass, denoted by the dynamical variables $X(t)$ and $Y(t)$. Defining a new variable $q = (1/\sqrt{2})(X + iY)$, we can write the Lagrangian for the oscillators as

$$L = \frac{1}{2}\left(\dot{X}^2 + \dot{Y}^2\right) - \frac{1}{2}\,\omega^2\left(X^2 + Y^2\right) = \dot{q}\dot{q}^* - \omega^2 qq^* \qquad (1.120)$$
The usual quantum mechanical description of $X(t)$ and $Y(t)$ will proceed by introducing creation and annihilation operators for each of them, so that we can write

$$X(t) = \frac{1}{\sqrt{2\omega}}\left(\alpha\, e^{-i\omega t} + \alpha^\dagger\, e^{i\omega t}\right);\qquad Y(t) = \frac{1}{\sqrt{2\omega}}\left(\beta\, e^{-i\omega t} + \beta^\dagger\, e^{i\omega t}\right) \qquad (1.121)$$

The commutation relations are $[\alpha,\alpha^\dagger] = 1 = [\beta,\beta^\dagger]$, with all other commutators vanishing. (In particular, $\alpha$ and $\alpha^\dagger$ commute with $\beta$ and $\beta^\dagger$.) The corresponding expansion for $q$ is given by

$$q = \frac{1}{\sqrt{2}}\frac{1}{\sqrt{2\omega}}\left[(\alpha + i\beta)\, e^{-i\omega t} + (\alpha^\dagger + i\beta^\dagger)\, e^{i\omega t}\right] \equiv \frac{1}{\sqrt{2\omega}}\left(a\, e^{-i\omega t} + b^\dagger\, e^{i\omega t}\right) \qquad (1.122)$$
1.5. Interpreting G(x2 ; x1 ) in Terms of a Field
31
where we have defined
1
a = √ (α + iβ) ;
2
1
b = √ (α − iβ)
2
(1.123)
[a, a† ] = 1 = [b, b† ]
(1.124)
It is easy to verify that
[a, b] = [a† , b† ] = 0;
Thus we can equivalently treat a and b as the annihilation operators for two
independent oscillators.35 The variable q is made of two different oscillators
and is non-Hermitian unlike X and Y which were Hermitian. Classically,
this difference arises from the fact that q is a complex number with two
degrees of freedom compared to X or Y which, being real, have only one
degree of freedom each.
For our purpose we need to have one frequency ω_p associated with each momentum labeled p. Therefore we will introduce one q for each momentum label, with q_p(t) now having the expansion in terms of creation and annihilation operators given by
\[
q_p = \frac{1}{\sqrt{2\omega_p}}\left(a_p\, e^{-i\omega_p t} + b^\dagger_{-p}\, e^{i\omega_p t}\right); \qquad
q_p^\dagger = \frac{1}{\sqrt{2\omega_p}}\left(a_p^\dagger\, e^{+i\omega_p t} + b_{-p}\, e^{-i\omega_p t}\right)
\tag{1.125}
\]
(We use b−p rather than bp in the second term for future convenience.)
With the continuum labels k, p etc., the standard commutation relations become:
\[
[a_p, a_k^\dagger] = (2\pi)^3 \delta(p - k); \qquad [b_p, b_k^\dagger] = (2\pi)^3 \delta(p - k)
\tag{1.126}
\]
with all other commutators vanishing. The result in Eq. (1.118) now generalizes for any two oscillators, labeled by vectors k and p, to the form:
\[
\langle 0|T\left[q_k(t_2)\, q_p^\dagger(t_1)\right]|0\rangle = (2\pi)^3\, \frac{\delta(k - p)}{2\omega_p}\, e^{-i\omega_p |t|}
\tag{1.127}
\]
We now see that our amplitude G(x₂; x₁) can be expressed in the form
\[
G(x_2; x_1) = \int \frac{d^3 p}{(2\pi)^3} \int \frac{d^3 k}{(2\pi)^3}\, \langle 0|T\left[q_k(t_2)\, q_p^\dagger(t_1)\right]|0\rangle\, e^{i(k\cdot x_2 - p\cdot x_1)} \equiv \langle 0|T[\phi(x_2)\phi^\dagger(x_1)]|0\rangle
\tag{1.128}
\]
where we have introduced a new set of operators φ(x) at every event in spacetime, given by
\[
\phi(x) \equiv \int \frac{d^3 p}{(2\pi)^3}\, q_p(t)\, e^{i p\cdot x} \equiv \int \frac{d^3 p}{(2\pi)^3}\, \frac{1}{\sqrt{2\omega_p}} \left(a_p\, e^{-ipx} + b_p^\dagger\, e^{ipx}\right)
\tag{1.129}
\]
where we have flipped the sign of spatial momentum in the second term
to get the b†p eipx factor. (This is why we called it b†−p in Eq. (1.125) with
a minus sign.) Since G is a Lorentz invariant scalar, Eq. (1.128) suggests that φ(x) is a Lorentz invariant object. This is indeed true and can be seen by introducing two rescaled operators A_p = √(2ω_p) a_p, B_p = √(2ω_p) b_p and writing Eq. (1.129) as
\[
\phi(x) = \int d\Omega_p \left(A_p\, e^{-ipx} + B_p^\dagger\, e^{ipx}\right)
\tag{1.130}
\]
35 It would have been more appropriate to define a† as the annihilation operator, given the obvious relationship
between daggers and annihilation. Unfortunately, the opposite notation has
already been adopted!
Chapter 1. From Particles to Fields
The Lorentz invariance of φ(x) requires the commutation rules for Ap etc.
to be Lorentz invariant. Since [ap , a†k ] = (2π)3 δ(p − k), it follows that
[Ap , A†k ] = (2π)3 (2ωp )δ(p − k). This commutator is Lorentz invariant
since (2ωp )δ(p − k) is Lorentz invariant. So φ(x) behaves as a scalar under
the Lorentz transformation. We will have more to say about Eq. (1.130) in
the next section.
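A quick numerical check of the last statement: the combination (2ω_p)δ(p − k), and hence the measure dΩ_p = d³p/[(2π)³ 2ω_p], is Lorentz invariant because under a boost along the z-axis the Jacobian dp_z′/dp_z equals ω′/ω. A minimal sketch (pure Python; the mass, boost speed and sample momentum are arbitrary choices):

```python
import math

m, v = 1.0, 0.6                      # mass and boost speed (arbitrary)
gamma = 1.0 / math.sqrt(1.0 - v * v)

def omega(pz):                       # on-shell energy for momentum (0, 0, pz)
    return math.sqrt(pz * pz + m * m)

def boost(pz):                       # p_z' under a boost along z
    return gamma * (pz + v * omega(pz))

pz, h = 0.7, 1.0e-6
jacobian = (boost(pz + h) - boost(pz - h)) / (2 * h)   # dp_z'/dp_z, central difference
ratio = omega(boost(pz)) / omega(pz)                   # omega'/omega

# d^3p/(2*omega) is invariant because dp_z'/dp_z = omega'/omega
assert abs(jacobian - ratio) < 1.0e-6
# the boosted momentum is still on the mass shell: omega' = gamma*(omega + v*pz)
assert abs(omega(boost(pz)) - gamma * (omega(pz) + v * pz)) < 1.0e-12
```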
This result is astonishingly beautiful when you think about it. To make it explicit, what we have proved is the equality
\[
\sum_{\text{paths}} \exp\left[-im \int_{t_1}^{t_2} dt\, \sqrt{1 - v^2}\,\right] = G(x_2; x_1) = \langle 0|T[\phi(x_2)\phi^\dagger(x_1)]|0\rangle
\tag{1.131}
\]
where the sum on the left is over paths connecting (t₁, x₁) and (t₂, x₂).
36 Why does this equality in Eq. (1.131) hold? Nobody really knows, in the sense that there is no simple argument which will make this equality self-evident and obvious without doing any extensive calculation. This probably means we do not yet have a really fundamental understanding of the quantum physics of a relativistic free particle, but most people would disagree and will consider this equality as merely a very convenient mathematical identity.
On the left side is the propagation amplitude computed using sum over
paths for a single relativistic free particle. On the right is an expectation
value in some (fictitious) harmonic oscillator ground state for the time
ordered product of two field operators.36 No two systems could have been
a priori more different, but both can be used to compute the propagation
amplitude which is given in the middle!
This is the result we were looking for. It suggests that if we understand
the system described by these harmonic oscillators and the associated field
φ(x), we can provide an alternate interpretation for the G(x2 ; x1 ) in spacetime using fully causal evolution. The last part is not yet obvious but there
is cause for hope and we will see that it does work out.
1.5.1 Propagation Amplitude and Antiparticles
The decomposition in Eq. (1.130) allows us to separate φ(x) into two Lorentz invariant scalar fields with φ(x) = A(x) + B†(x) where
\[
A(x) \equiv \int d\Omega_p\, A_p\, e^{-ipx}; \qquad B(x) \equiv \int d\Omega_p\, B_p\, e^{-ipx}
\tag{1.132}
\]
We can then express G±(x₂; x₁), defined in Eq. (1.86), in terms of the expectation values of fields by comparing Eq. (1.128) with Eq. (1.88). This gives
\[
\langle 0|\phi(x_2)\phi^\dagger(x_1)|0\rangle = \langle 0|A(x_2)A^\dagger(x_1)|0\rangle = \int d\Omega_p\, e^{-ipx} = G_+(x_2; x_1)
\]
\[
\langle 0|\phi^\dagger(x_1)\phi(x_2)|0\rangle = \langle 0|B(x_1)B^\dagger(x_2)|0\rangle = \int d\Omega_p\, e^{+ipx} = G_-(x_2; x_1)
\tag{1.133}
\]
so that we have
\[
G(x_2; x_1) = \theta(t)\langle 0|A(x_2)A^\dagger(x_1)|0\rangle + \theta(-t)\langle 0|B(x_1)B^\dagger(x_2)|0\rangle
\tag{1.134}
\]
This expression is quite intriguing and let us try to understand its structure.
Let us begin with the case when x2 and x1 are separated by a timelike
interval so that the notion of t > 0 or t < 0 is Lorentz invariant. We see
from Eq. (1.134) that when t > 0 we have G(x₂; x₁) = ⟨0|A(x₂)A†(x₁)|0⟩. This corresponds to the creation of an “a-type particle”, say, at x₁ followed by its annihilation at x₂; so it makes sense to think of this as the propagation of an a-type particle from x₁ to x₂.
This interpretation can be made
This interpretation can be made more specific along the following lines: Since |0⟩ is the ground state of all the harmonic oscillators, it follows that
\[
\phi^\dagger(x)|0\rangle = A^\dagger(x)|0\rangle = \int d\Omega_p\, e^{ipx}\, A_p^\dagger |0\rangle \equiv \int d\Omega_p\, e^{ipx}\, |1_p\rangle
\tag{1.135}
\]
where |1_p⟩ denotes the state in which the p-th oscillator is in the first excited state with all other oscillators remaining in the ground state. If we make a spacetime translation by xⁱ → xⁱ + Lⁱ, this state picks up an extra factor exp(ipL), showing that this state has a four-momentum pⁱ = (ω_p, p).³⁷ Therefore, it makes sense to think of e^{ipx} A_p†|0⟩ as the state |p⟩⟨p|x⟩ and write
\[
\phi^\dagger(x_1)|0\rangle = A^\dagger(x_1)|0\rangle = \int d\Omega_p\, |p\rangle\langle p|x_1\rangle; \qquad
\langle 0|\phi(x_2) = \langle 0|A(x_2) = \int d\Omega_q\, \langle x_2|q\rangle\langle q|
\tag{1.136}
\]
and obtain
\[
\langle 0|\phi(x_2)\phi^\dagger(x_1)|0\rangle = \langle 0|A(x_2)A^\dagger(x_1)|0\rangle = \int d\Omega_p \int d\Omega_q\, \langle x_2|q\rangle\langle q|p\rangle\langle p|x_1\rangle
\tag{1.137}
\]
37 Treated as an eigenstate of the standard harmonic oscillator Hamiltonian, the state |1_p⟩ has the energy ω_p[1 + (1/2)] = (3/2)ω_p and we don't know how to assign any momentum to it. We also have no clue at this stage what relevance the harmonic oscillator Hamiltonian has to the relativistic free particle we are studying. For the moment, we get around all these using the spacetime translation argument.
Using the Lorentz invariant³⁸ normalization ⟨q|p⟩ = (2π)³(2ω_p)δ(q − p), we reproduce the original result
\[
\langle 0|\phi(x_2)\phi^\dagger(x_1)|0\rangle = \int d\Omega_p\, \langle x_2|p\rangle\langle p|x_1\rangle = G(x_2; x_1) \qquad (t > 0)
\tag{1.138}
\]
This shows that one can possibly identify the state A_p†|0⟩ with a one-particle state of momentum p and energy ω_p. The identity operator for the one-particle states can be expanded in terms of the |p⟩ states by the usual relation
\[
1 = \int \frac{d^3 p}{(2\pi)^3}\, \frac{1}{2\omega_p}\, |p\rangle\langle p|
\tag{1.139}
\]
You can also easily verify that ⟨p|φ†(0, x)|0⟩ = e^{−ip·x}. This relation looks very similar to the standard relation in NRQM giving ⟨p|x⟩ = exp(−ip·x). It is then rather tempting to think of φ†(0, x) acting on |0⟩ as analogous to |x⟩, corresponding to a particle “located at” x. But, as we saw earlier, such an interpretation is fraught with difficulties.³⁹ In short, our field operator
acting on the ground state of some (fictitious) harmonic oscillators can
produce a state with a particle at x as indicated by Eq. (1.137).
Something similar happens for x02 < x01 , i.e., t < 0, when we start with
φ(x)|0 in which the Bp† produces a particle. Now we get the propagation
of a “b-type particle” from x2 to x1 which is in the direction opposite to
the propagation indicated by G(x2 ; x1 ). This can be interpreted by saying
that when t < 0 the (backward in time) propagation of a particle from
x1 to x2 is governed by an amplitude for the forward propagation of an
antiparticle represented by B and B † operators. In this interpretation, we
call the b-type particle the antiparticle of the a-type particle. (Note that both a-type and b-type, i.e., the particle and the antiparticle, have ω_p² = p² + m² with the same m.) As long as x₂ and x₁ are separated
by a timelike interval, the amplitude G(x2 ; x1 ) is contributed by a particle
propagating from x1 to x2 if t > 0 and is contributed by an antiparticle
38 The normalization here is partially a question of convention, just as is whether we use a_p or A_p in the expansion of the field; see Eq. (1.129) and Eq. (1.130). The latter has nicer Lorentz transformation properties and a Lorentz invariant commutator. Similarly, the states produced by a_p†|0⟩ and A_p†|0⟩ will be normalized differently. We choose to work with a Lorentz invariant normalization but final results should not depend on this as long as we do it consistently.
39 Of course, there are many states in the Hilbert space which satisfy ⟨p|ψ⟩ = exp(−ip·x). Since ⟨p| has non-zero matrix elements only with the one-particle state, adding two or zero particle states like φ²(0, x)|0⟩ will still give the same ⟨p|ψ⟩.
Figure 1.8: Interpretation of G(x; y) in terms of propagation of particles and antiparticles. [The figure shows, in the (t, x) plane, an event X(xⁱ) inside the future light cone with amplitude ⟨0|A(X)A†(0)|0⟩, an event Y(yⁱ) inside the past light cone with amplitude ⟨0|B(0)B†(Y)|0⟩, and an event Z(zⁱ) outside the light cone, for which the amplitude is ⟨0|A(Z)A†(0)|0⟩ or ⟨0|B(0)B†(Z)|0⟩ depending on the frame.]
40 How come the path integral amplitude G(x₂; x₁) we computed is the same for charged and neutral particles? Recall that we used an action for a free particle, so the maths knows nothing about coupling to an electromagnetic field and the particle's charge is irrelevant. Neutral and charged particles will behave the same if we ignore electromagnetic interactions and treat them as free. What is really curious is that one can decipher this information from the field approach. This is because we put in by hand the fact that φ is not Hermitian.
propagating from x2 to x1 if t < 0. All these notions are Lorentz invariant
as long as x2 and x1 are separated by a timelike interval.
On the other hand, if x2 and x1 are separated by a spacelike interval,
then the notion of t > 0 or t < 0 is not Lorentz invariant. Then, if we are in
a Lorentz frame in which t > 0, the amplitude G(x2 ; x1 ) is contributed by
a particle propagating from x1 to x2 ; but if we transform to another frame
in which the same two events have t < 0, then the G(x2 ; x1 ) is contributed
by an antiparticle propagating from x2 to x1 . Thus, in the case of events
separated by a spacelike interval, our interpretation of G(x2 ; x1 ) (which, of
course, is Lorentz invariant) in terms of particle or antiparticle propagation
depends on the Lorentz frame.
This situation is summarized in Fig. 1.8 which shows the propagation of
a particle from the origin to an event P for three different cases: (a) When
P is within the future light-cone of the origin, like the event X; (b) When
P is within the past light-cone of the origin, like the event Y ; (c) When
P is outside the light-cone of the origin (like the event Z), in which case
one cannot talk of future and past in a Lorentz invariant way. The event
Z could be to the future of the origin in one Lorentz frame while it could
be at the past of the origin in another reference frame. The amplitude
G(x; 0) for the case (a) is given by 0|A(X)A† (0)|0 which represents the
creation of a particle at origin followed by its annihilation at the event X.
In case (b) this amplitude is given by 0|B(0)B † (Y )|0 which represents
the creation of an antiparticle at Y followed by its subsequent annihilation
at the origin. Note that the arrow in the figure refers to the propagation
amplitude G(y; 0) which, in this case, is a backward propagation in time.
This is given by the propagation of an antiparticle forward in time. In the
case (c), the amplitude could have been contributed either by creation of a
particle at the origin followed by its annihilation at Z (which is shown in
the figure) or by creation of an antiparticle at Z followed by its annihilation
at origin depending on the Lorentz frame in which we describe the physics.
The numerical value of the amplitude remains the same thanks to the
crucial result obtained in Eq. (1.89) which shows that these two amplitudes
are the same for spacelike separated events.
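The frame dependence described above is easy to make quantitative: a boost can flip the sign of the time separation only when the events are spacelike separated. A two-line check (pure Python; the event coordinates and boost speeds are arbitrary choices):

```python
import math

def boosted_t(t, x, v):
    """Time coordinate of the event (t, x) in a frame moving with speed v."""
    return (t - v * x) / math.sqrt(1.0 - v * v)

t, x = 0.5, 2.0          # spacelike separated from the origin (|x| > t)
assert t > 0 and boosted_t(t, x, 0.6) < 0      # time ordering flips for v = 0.6

t, x = 2.0, 0.5          # timelike separated (t > |x|)
for v in (0.1, 0.5, 0.9, 0.99):
    assert boosted_t(t, x, v) > 0              # ordering can never flip for |v| < 1
```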
Since the field φ creates an antiparticle and destroys a particle (with the field φ† doing exactly the opposite), it collectively captures the full dynamics. The description uses the same ω_p = √(p² + m²) with the same
m for both the particle and antiparticle; therefore, they must have the
same mass. But if we add a phase α to φ by changing φ → eiα φ, then φ†
changes by φ† → e−iα φ† . We will see in Sect. 3.1.5 that such phase changes
are related to the charges carried by the particle, and in particular to the
standard electric charge. The above behaviour shows that if the particle
has a positive charge then the antiparticle must have an equal and opposite
negative charge.40
The fact that the particle and antiparticle have opposite charges allows
us to make the notion of our propagation more precise. Let us consider
a physical process which increases the charge at the event x1 by Q and
decreases the charge at x2 by Q through particle propagation. We can
do this by creating a particle (which carries a charge +Q) at x₁, allowing it to propagate to x₂ and destroying it at x₂ (thereby decreasing the charge
by Q). We can also produce the same effect by creating an antiparticle
(which carries a charge −Q) at x2 , allowing it to propagate to x1 and
destroying it at x1 (thereby increasing the charge by Q). Which of these
two processes we can use depends on whether x02 > x01 or x02 < x01 since
we need to create a particle or antiparticle before we can destroy it. If
x02 > x01 then we will use A† (x1 ) to create a particle at x1 and A(x2 ) to
destroy it at x2 . This process has the amplitude θ(t)0|A(x2 )A† (x1 )|0.
We cannot, however, use this process if we are in a Lorentz frame in which
x₂⁰ < x₁⁰. But in this case we can first create an antiparticle with B†(x₂) at x₂ (thereby decreasing the charge by Q at x₂) and then destroy it with B(x₁) at x₁ (thereby increasing the charge by Q at x₁). This process has
the amplitude θ(−t)0|B(x1 )B † (x2 )|0. Thus the total amplitude for this
physical process (see Fig. 1.9) is given by
\[
\begin{aligned}
G(x_2; x_1) &= \theta(t_2 - t_1)\langle 0|A(x_2)A^\dagger(x_1)|0\rangle + \theta(t_1 - t_2)\langle 0|B(x_1)B^\dagger(x_2)|0\rangle \\
&= \theta(t_2 - t_1)\langle 0|\phi(x_2)\phi^\dagger(x_1)|0\rangle + \theta(t_1 - t_2)\langle 0|\phi^\dagger(x_1)\phi(x_2)|0\rangle \\
&= \langle 0|T[\phi(x_2)\phi^\dagger(x_1)]|0\rangle
\end{aligned}
\tag{1.140}
\]
As usual, the distinction between the two terms has a Lorentz invariant
meaning only when x2 and x1 are separated by a timelike interval. If not,
which of the terms contribute to the amplitude will depend on the Lorentz
frame. But in any Lorentz frame, we would have always created a particle
(or antiparticle) before destroying it which takes care of an elementary
notion of causality. This result is precisely what our path integral leads us
to; it is a net amplitude for the physical process taking into account the
existence of both particles and antiparticles.
This is also needed for consistency of causal description of the situation
at an intermediate time shown in Fig. 1.7, which is what we started with.
Armed with our new interpretation, we can think of a particle-antiparticle pair being created at A, with the antiparticle traveling forward in time to
B (which is equivalent to our original particle traveling backward in time
from B to A), annihilating with our original particle at B. What we detect
at x2 is the particle created at A. Obviously, each of these propagations
can be within the light cone with x1 and x2 still being outside each other’s
light cone. The mechanism we set up for localized emission and detection
events at x1 and x2 for our relativistic particle must be responsible for
the pair creation as well. We, however, do not want conserved charges to
appear and disappear at intermediate times like t = y 0 . This requires the
antiparticle traveling from A to B to have a charge equal and opposite to
that carried by the particle traveling from A to x2 .
Most of the previous discussion will also go through if we impose an
extra condition on the field φ(x) that it must be Hermitian. This is equivalent to setting Ap = Bp for all p. In this case, the particle happens to be
its own antiparticle which can be a bit confusing when we are still trying
to figure things out.41 This also shows that our “association” of the field
with the relativistic particle in order to reproduce the amplitude G(x2 ; x1 )
is by no means unique. In the simplest context we could have done it with
either a Hermitian scalar field or a non-Hermitian one.
Figure 1.9: Interpretation of G(x₂; x₁) in terms of particles and antiparticles. [The figure shows a charge −Q at x₂, acted on by A(x₂) and B†(x₂), and a charge +Q at x₁, acted on by A†(x₁) and B(x₁).]
41 Many textbooks, which begin with a real classical scalar field (for which the particle is the same as its antiparticle) and its canonical quantization, are guilty of beginning with this potentially confusing situation!

1.5.2 Why do we Really Need Antiparticles?
If you were alert, you would have noticed that the field φ is actually made of two Lorentz invariant, non-Hermitian fields (which do not talk to each other) given by the two terms in Eq. (1.130). Since we can express G(x₂; x₁) by adding up these two, why are we dealing⁴² with φ(x) rather than with A(x) and B(x)? This question is conceptually important because φ deals
42 If we were using a Hermitian scalar field, you could have given it as a reason to prefer φ(x) to the non-Hermitian bits of it, A(x) and B(x). But since φ is not Hermitian either, this does not give a strong motivation.
with both particles and antiparticles on an equal footing (that is, has both
negative frequency and positive frequency modes in it) while A(x), B(x)
are built from positive frequency modes alone. Why is it better to treat
particles and antiparticles together?
The answer is non-trivial and has to do with the nature of the commutation relations these fields satisfy amongst themselves. To begin with, it is obvious that
\[
[A(x), A(y)] = 0; \qquad [B(x), B(y)] = 0; \qquad [A(x), B(y)] = 0
\tag{1.141}
\]
These follow from the fact that the corresponding annihilation operators
commute among themselves. The only nontrivial commutator we need to
evaluate is [A(x), A† (y)] which should be the same as [B(x), B † (y)]. We
see that
\[
[A(x_2), A^\dagger(x_1)] = \int d\Omega_p \int d\Omega_q\, e^{-ipx_2}\, e^{iqx_1}\, [A_p, A_q^\dagger] = \int d\Omega_p\, e^{-ipx} = G_+(x_2; x_1)
\tag{1.142}
\]
This commutator does not vanish when (x₂ − x₁)² < 0, i.e., when the two events are separated by a spacelike interval. This means that one cannot, for example, specify independently the values of A(0, x₁) and A†(0, x₂) on a t = 0 spacelike hypersurface, since [A(0, x₁), A†(0, x₂)] ≠ 0. An identical
conclusion holds for B(x).
In contrast, consider the corresponding commutators for φ(x). Again, [φ(x₁), φ(x₂)] vanishes as long as φ is non-Hermitian, since in this case the A's commute with the B's. The only nontrivial commutator, [φ(x₂), φ†(x₁)], is given by
\[
[A(x_2) + B^\dagger(x_2),\; A^\dagger(x_1) + B(x_1)] = [A(x_2), A^\dagger(x_1)] - [B(x_1), B^\dagger(x_2)]
\tag{1.143}
\]
The two terms differ only by the interchange of coordinates, x → −x, and hence the commutator is given by
\[
\Delta(x) \equiv [\phi(x_2), \phi^\dagger(x_1)] = G_+(x_2; x_1) - G_+(x_1; x_2) = \int d\Omega_p \left[e^{-ipx} - e^{+ipx}\right]
\tag{1.144}
\]
43 The Δ(x) is, of course, non-zero inside the light cone where x² > 0 and can be expressed in terms of Bessel functions. Mercifully, we will almost never need its exact form. The commutator of the fields vanishes for spacelike separations but is non-zero otherwise. This would be rather strange if the commutators were honest-to-god analytic functions of the coordinates! This feat becomes possible only because the commutators are, mathematically speaking, distributions and not functions.
Exercise 1.15: Prove this claim.
This is precisely the expression we evaluated in Eq. (1.89) and proved to vanish⁴³ when x² < 0. So we see that φ(x₁) commutes with φ†(x₂) when the events are spacelike separated. Therefore, unlike in the case of A(x) and B(x), we can measure and specify the values of φ(0, x₁) and φ†(0, x₂) on a spacelike hypersurface (without one measurement influencing the other) at t = 0,
which is nice. From Eq. (1.143) you see clearly that the commutators
involving either A or B do not vanish outside the light cone but when we
combine them to form A(x) + B † (x) = φ(x), the two commutators nicely
cancel each other outside the light cone. This is why we use φ rather than
just A or B. Roughly speaking, the antiparticle contribution cancels the
particle contribution outside the light cone.
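This cancellation can be seen numerically. As an illustrative simplification (ours, not the text's), work in 1+1 dimensions, where dropping the part of the integrand that is odd in p reduces the commutator Δ(t, x) of Eq. (1.144) to an integral proportional to ∫₀^∞ dp sin(ω_p t) cos(px)/ω_p. A crude quadrature (pure Python; the Gaussian cutoff and step sizes are numerical conveniences) shows that this vanishes for a spacelike point but not for a timelike one:

```python
import math

def delta_integral(t, x, m=1.0, pmax=160.0, dp=1e-3, pcut=40.0):
    """Integral of sin(omega*t)*cos(p*x)/omega over p > 0, softly cut off near pcut."""
    total = 0.0
    p = dp / 2                        # midpoint rule
    while p < pmax:
        w = math.sqrt(p * p + m * m)
        total += math.sin(w * t) * math.cos(p * x) / w * math.exp(-(p / pcut) ** 2)
        p += dp
    return total * dp

spacelike = delta_integral(t=0.5, x=2.0)   # x^2 > t^2: commutator should vanish
timelike  = delta_integral(t=2.0, x=0.5)   # t^2 > x^2: commutator should not vanish

assert abs(spacelike) < 1e-3
assert abs(timelike) > 0.1
```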
If we build observables from φ by using bilinear, Hermitian combinations of the form O(x) = φ(x)D(x)φ†(x), where D(x) is either a c-number or a local differential operator, then a straightforward computation shows that [O(x₂), O(x₁)] also vanishes when x₁ and x₂ are spacelike separated. When the fields are injected with lives of their own later on, this will turn out to be a very desirable property, viz., that the measurements of bilinear observables do not influence each other outside the light cone. It is this version of causality which we will adopt in field theory, rather than the vanishing of the propagation amplitude G(x₂; x₁) when the two events are separated by a spacelike interval.
Finally, once we have mapped the propagation of the relativistic particle to the physics of an infinite number of oscillators, it provides a strong motivation to study the dynamics of the latter system. Since each (complex) oscillator is described by a Lagrangian L_k = |q̇_k|² − ω_k²|q_k|², the full system is described by the action
\[
A = \sum_k \int dt \left[|\dot q_k|^2 - \omega_k^2 |q_k|^2\right] \;\Rightarrow\; \int \frac{d^3 k}{(2\pi)^3} \int dt \left[|\dot q_k|^2 - \omega_k^2 |q_k|^2\right]
\tag{1.145}
\]
Converting back to spacetime by expressing q_k(t) in terms of the field φ(x), this action becomes⁴⁴
\[
A = \int d^4 x \left(\partial_a \phi\, \partial^a \phi^{*} - m^2 |\phi|^2\right)
\tag{1.146}
\]
It is clear that this action describes a field with dynamics of its own if
we decide to treat this as a physical system. By ‘quantizing’ this field
(which is most easily done by decomposing it again into harmonic oscillators
and quantizing each oscillator), we will end up getting a description of
relativistic particles and antiparticles directly as the ‘quanta’ of the field.
This, of course, is the reverse route of getting the particles from the field
by what is usually called canonical quantization — a subject we will take
up45 in Chap. 3.
1.5.3 Aside: Occupation Number Basis in Quantum Mechanics
Yet another way to understand the need for using the combination φ =
A + B † rather than just φ = A is to compare it with a situation in which
just using φ = A is indeed sufficient. This example is provided by nonrelativistic quantum mechanics when we study many particle systems using
the occupation number basis. Let us briefly recall this approach in some
simple contexts.46
Think of a system made of a large number N of indistinguishable particles all located in some common external potential with energy levels
E1 , E2 , E3 ..... If you know the wave function describing the quantum state
of each particle, then the full wave function of the system can be obtained
by a suitable product of individual wave functions, ensuring that the symmetry (or antisymmetry) with respect to interchange of particles is taken
care of depending on whether the particles are bosons or fermions; let us
consider bosons for simplicity.
The quantum indistinguishability of the particles tells you that you
cannot really ask which particle is in which state but only talk in terms of
the number of particles in a given state. It is, therefore, much more natural
to specify the state of the system by giving the occupation number for each
energy level; that is, you use as the basis a set of states |n1 , n2 , ... which
corresponds to a situation in which there are n1 particles in the energy
level E1 etc. Interactions will now change the occupation number of the
44 We now think of q_k(t) and φ(x) as c-number functions, not as operators.
45 Most textbooks follow the route of
introducing and studying a classical
field first and then quantizing it and,
lo and behold, particles appear. We
chose to postpone this to Chap. 3 because in that procedure it is never clear
how we would have arrived at the concept of a field if we had started with
a single relativistic particle and quantized it.
46 Unfortunately, many textbooks use
the phrase “second quantization” to
describe the use of the occupation
number basis in describing quantum
mechanics. This is, of course, misleading. You don’t quantize a many body
system for the second time; you are
only rephrasing what you have after
the (first) quantization, in a different
basis.
38
Chapter 1. From Particles to Fields
different energy levels and all operators can be specified by giving their
matrix elements in the occupation number basis. In fact it is fairly easy to
construct a “dictionary” between the matrix elements of operators in the
usual, say, coordinate basis and their matrix elements in the occupation
number basis. To describe the change in the occupation numbers we only
need to introduce suitable annihilation and creation operators for each
energy level; their action will essentially make the occupation numbers of
the levels go down or up by unity.
For example, consider two non-relativistic, spin zero particles which interact via a short range two-body potential V(x₁ − x₂). Suppose the two
particles are initially well separated, move towards each other and scatter
off to a final, well separated state. We specify the initial state of the particles by the momenta (p1 , p2 ) and the final state by the final momenta
(k1 , k2 ). One can study this process by writing down the Schrodinger
equation for the two particle system and solving it (possibly in some approximation) with specified boundary conditions. There is, however, a nicer
way to phrase this problem by using the occupation number basis. To do
this, we start with a fictitious state |0 of no particles and introduce a set
of creation (a†p ) [and annihilation (ap )] operators which can act on |0 and
generate one-particle states with a given momentum. The corresponding
operator (not wave function) which describes a particle at x at the time t = 0 can be taken to be
\[
\psi(0, x) = \int \frac{d^3 p}{(2\pi)^3}\, a_p\, e^{i p\cdot x}
\tag{1.147}
\]
47 This operator is completely analogous to our A(x); both involve only positive energy solutions. In fact, if you do a Taylor expansion of (p² + m²)^{1/2} in |p²/m²|, ignore the phase due to the rest energy and approximate dΩ_p = [d³p/(2π)³][1/2ω_p] by dΩ_p ≈ [d³p/(2π)³][1/2m], you will find that ψ(t, x) ≈ √(2m) A(x).
At a later time, the standard time evolution will introduce the phase factor exp(−iω_p t) with ω_p = (p²/2m), leading to⁴⁷
\[
\psi(t, x) = \int \frac{d^3 p}{(2\pi)^3}\, a_p\, e^{-ipx}; \qquad px \equiv \omega_p t - p\cdot x
\tag{1.148}
\]
Our scattering problem can be restated in terms of the creation and annihilation operators as follows: The scattering annihilates two particles with momenta (p₁, p₂) through the action of a_{p₁} a_{p₂} and then creates two particles with momenta (k₁, k₂) through the action of a†_{k₁} a†_{k₂}, with some amplitude V_l which is just the Fourier transform of V(x) evaluated at the momentum transfer. The whole process is described by an operator H = a†_{k₁} a†_{k₂} V_l a_{p₁} a_{p₂} which acts on the occupation number basis states. In fact, any other interaction can be re-expressed in this language, often simplifying the calculations significantly. As we said before, there is a one-to-one correspondence between the matrix elements of the operators in the usual basis and the matrix elements evaluated in the occupation number basis.
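The operator bookkeeping sketched above is easy to mimic in code. In the sketch below (pure Python; the momentum labels and the value V = 0.5 of the matrix element are made-up placeholders), states in the occupation number basis are dictionaries mapping occupation patterns to amplitudes, and the operator a†_{k₁} a†_{k₂} V a_{p₁} a_{p₂} turns |p₁, p₂⟩ into |k₁, k₂⟩:

```python
import math

def ann(p, psi):
    """Bosonic annihilation operator for mode p, with the sqrt(n) factor."""
    out = {}
    for occ, amp in psi.items():
        occ = dict(occ)
        n = occ.get(p, 0)
        if n:
            occ[p] = n - 1
            if occ[p] == 0:
                del occ[p]
            key = tuple(sorted(occ.items()))
            out[key] = out.get(key, 0) + amp * math.sqrt(n)
    return out

def cre(p, psi):
    """Bosonic creation operator for mode p, with the sqrt(n+1) factor."""
    out = {}
    for occ, amp in psi.items():
        occ = dict(occ)
        n = occ.get(p, 0) + 1
        occ[p] = n
        key = tuple(sorted(occ.items()))
        out[key] = out.get(key, 0) + amp * math.sqrt(n)
    return out

p1, p2, k1, k2, V = 'p1', 'p2', 'k1', 'k2', 0.5   # placeholder labels and amplitude
initial = {tuple(sorted({p1: 1, p2: 1}.items())): 1.0}   # the state |p1, p2>

# apply a+_{k1} a+_{k2} V a_{p1} a_{p2}
scattered = {k: V * a for k, a in ann(p1, ann(p2, initial)).items()}
final = cre(k1, cre(k2, scattered))

assert final == {tuple(sorted({k1: 1, k2: 1}.items())): 0.5}   # the state V |k1, k2>
```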
So you can see that the usual, non-relativistic, quantum mechanics can
be translated into a language involving occupation numbers and changes in
occupation numbers corresponding to creation and annihilation of particles.
What makes quantum field theory different is therefore not the fact that we
need to deal with situations involving a variable number of particles. The
crucial difference is that just a single ψ(x) ∝ A(x), propagating forward
in time with positive energy, is inadequate in the relativistic case. We
need another B(x) to ensure causality which — in turn — leads to the
propagation of negative energy modes backward in time and the existence
of antiparticles. This is what combining relativity and quantum theory
leads to — which has no analogue in non-relativistic quantum mechanics
even when we use a language suited for a variable number of particles.
Finally let us consider the NRQM limit of the relativistic field theory.
You might have seen textbook discussions of the Klein-Gordon equation
(and the Dirac equation, which we will discuss in a later chapter) solved in different contexts, e.g., in the hydrogen atom problem. Obviously,
the scalar field φ(x) which we have introduced is an operator and cannot
be treated as a c-number solution to some differential equation. (We are
considering the Hermitian case for simplicity.) To obtain the NRQM from
quantum field theory, one needs to proceed in a rather subtle way. One
procedure is to define an NRQM “wave function” ψ(x) by the relation
\[
\psi(x) \equiv \langle 0|\phi(t, x)|\psi\rangle
\tag{1.149}
\]
It then follows that
\[
\begin{aligned}
i\partial_t \psi &= i\langle 0|\partial_t \phi(t, x)|\psi\rangle = \langle 0| \int \frac{d^3 p}{(2\pi)^3}\, \frac{\sqrt{p^2 + m^2}}{2\omega_p} \left(A_p\, e^{-ipx} - A_p^\dagger\, e^{ipx}\right)|\psi\rangle \\
&= \langle 0| \int \frac{d^3 p}{(2\pi)^3}\, \frac{\sqrt{p^2 + m^2}}{2\omega_p}\, A_p\, e^{-ipx}|\psi\rangle \\
&= \langle 0| \sqrt{m^2 - \nabla^2}\; \phi(x)|\psi\rangle
\end{aligned}
\tag{1.150}
\]
where we have used the fact that ⟨0|A_p† = 0. So:
\[
i\partial_t \psi(x) = \sqrt{m^2 - \nabla^2}\; \psi(x) = \left[m - \frac{\nabla^2}{2m} + \mathcal{O}\!\left(\frac{1}{m^2}\right)\right] \psi(x)
\tag{1.151}
\]
The first term on the right hand side is the rest energy mc², which has to be removed by changing the phase of the wave function ψ. That is, we define ψ(x) = Ψ(x) exp(−imt). Then it is clear that Ψ satisfies, to the lowest order, the free particle Schrodinger equation given by
\[
i\partial_t \Psi(x) = -\frac{\nabla^2}{2m}\, \Psi(x)
\tag{1.152}
\]
The non-triviality of the operations involved in this process is yet another
reminder that the single particle description is not easy to incorporate in
relativistic quantum theory.
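In momentum space, the operator expansion in Eq. (1.151) is just √(m² + p²) ≈ m + p²/(2m), with the first neglected term being p⁴/(8m³). A quick numerical sanity check (pure Python; the sample momenta are arbitrary choices):

```python
import math

m = 1.0
for p in (0.01, 0.1, 0.3):
    exact = math.sqrt(m * m + p * p)       # relativistic energy
    approx = m + p * p / (2 * m)           # rest energy + non-relativistic kinetic term
    # the error should be bounded by the next term, p**4/(8 m**3), up to higher orders
    assert abs(exact - approx) <= 1.01 * p**4 / (8 * m**3)
```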
1.6 Mathematical Supplement

1.6.1 Path Integral From Time Slicing
In the Heisenberg picture, the amplitude for a particle to propagate from x = x_a at t = 0 to x = x_b at t = T is given by
\[
\langle T, x_b|0, x_a\rangle = \langle x_b|e^{-iHT}|x_a\rangle
\tag{1.153}
\]
We now break up the time interval into N slices, each of duration ε, and write
\[
e^{-iHT} = e^{-iH\epsilon}\, e^{-iH\epsilon}\, e^{-iH\epsilon} \cdots e^{-iH\epsilon} \qquad (N \text{ factors})
\tag{1.154}
\]
We will next insert a complete set of intermediate states between each of these factors and sum over the relevant variables. More formally, this is done by introducing a factor of unity expressed in the form
\[
1 = \int dx_k\, |x_k\rangle\langle x_k|
\tag{1.155}
\]
between each of the exponentials, so that we are left with (N − 1) factors corresponding to k = 1, 2, ..., (N − 1). To simplify the notation and include x_a and x_b, we will define x₀ ≡ x_a and x_N ≡ x_b, so that we have a product of factors involving the matrix element of e^{−iHε} taken between |x_k⟩ and ⟨x_{k+1}| for k = 0, 1, ..., N − 1.
We are ultimately interested in the limit of ε → 0 and N → ∞, and hence we can approximate the exponentials retaining only terms up to first order in ε. Thus our problem reduces to evaluating matrix elements of the form
\[
\langle x_{k+1}|e^{-iH\epsilon}|x_k\rangle \xrightarrow{\ \epsilon \to 0\ } \langle x_{k+1}|\left(1 - i\epsilon H + \cdots\right)|x_k\rangle
\tag{1.156}
\]
48 To be rigorous we should evaluate f at (1/2)[x_k + x_{k+1}], but we will get the same final result.
taking their product, and then taking the appropriate limit. This task becomes significantly simpler if H has the form A(p) + B(x) like, for example, when H = p²/2m + V(x). For any function of x, we have the result⁴⁸
\[
\langle x_{k+1}|f(x)|x_k\rangle = f(x_k)\, \delta(x_k - x_{k+1})
\tag{1.157}
\]
which we can write as
\[
\langle x_{k+1}|f(x)|x_k\rangle = f(x_k) \int \frac{dp_k}{(2\pi)^D}\, \exp\left[i\, p_k \cdot (x_{k+1} - x_k)\right]
\tag{1.158}
\]
Similarly, for any term which is only a function of the momenta, we can write
\[
\langle x_{k+1}|f(p)|x_k\rangle = \int \frac{dp_k}{(2\pi)^D}\, f(p_k)\, \exp\left[i\, p_k \cdot (x_{k+1} - x_k)\right]
\tag{1.159}
\]
Therefore, when H contains only terms which can be expressed in the form A(p) + B(x), the relevant matrix element becomes
\[
\langle x_{k+1}|H(p, x)|x_k\rangle = \int \frac{dp_k}{(2\pi)^D}\, H(x_k, p_k)\, \exp\left[i\, p_k \cdot (x_{k+1} - x_k)\right]
\tag{1.160}
\]
so that
\[
\langle x_{k+1}|e^{-iH\epsilon}|x_k\rangle = \int \frac{dp_k}{(2\pi)^D}\, \exp\left[-i\epsilon\, H(x_k, p_k)\right] \exp\left[i\, p_k \cdot (x_{k+1} - x_k)\right]
\tag{1.161}
\]
where we have switched back to the exponential form. Now multiplying all the factors together, we get the result
\[
\langle T, x_N|0, x_0\rangle = \int \prod_k dx_k \int \prod_k \frac{dp_k}{(2\pi)^D}\, \exp\left[i \sum_k \left(p_k \cdot (x_{k+1} - x_k) - \epsilon\, H(x_k, p_k)\right)\right]
\tag{1.162}
\]
There is one momentum integral for each value of k = 0, 1, 2, ..., N − 1 and one coordinate integral for each k = 1, 2, ..., (N − 1). Switching back to a continuum notation we have
\[
\langle T, x_b|0, x_a\rangle = \int \mathcal{D}x(t)\, \mathcal{D}p(t)\, \exp\left[i \int_0^T dt\, \left(p \cdot \dot x - H(x, p)\right)\right]
\tag{1.163}
\]
where the function x(t) takes fixed values at the end points while the functions p(t) remain unconstrained. The measure in the functional integral is now defined as the product of integrations in phase space
\[
\mathcal{D}x\, \mathcal{D}p \to \prod_k \frac{dx_k\, dp_k}{(2\pi)^D}
\tag{1.164}
\]
with suitable limits taken in the end. When H has the specific form H = p²/2m + V(x), we can do the integrations over p using the standard result

\[ \int \frac{d^D p_k}{(2\pi)^D}\, \exp\left[ i p_k\cdot(x_{k+1}-x_k) - i\epsilon \frac{p_k^2}{2m} \right] = \frac{1}{C(\epsilon)}\, \exp\left[ \frac{im}{2\epsilon}\,(x_{k+1}-x_k)^2 \right] \tag{1.165} \]

where C(ε) = (2πiε/m)^{1/2}. Using this, we can write the final expression involving only the x integrations:

\[ \langle T, x_b|0, x_a\rangle = \lim_{\epsilon\to 0} \frac{1}{C(\epsilon)} \int \prod_k \frac{dx_k}{C(\epsilon)}\, \exp\left[ i\sum_k \epsilon\left( \frac{m}{2}\,\frac{(x_{k+1}-x_k)^2}{\epsilon^2} - V(x_k) \right)\right] \tag{1.166} \]
which goes over to the original expression involving the sum over paths, with each path contributing the amplitude exp(iA):

\[ \langle T, x_b|0, x_a\rangle = \int \mathcal{D}x\, \exp\left[ i\int_0^T dt \left( \frac{1}{2} m\dot{x}^2 - V(x) \right)\right] \tag{1.167} \]
The time slicing gives meaning to this formal functional integral in this
case. As you can see, these ideas generalize naturally to more complicated
systems.
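As a concrete illustration (this example is mine, not the book's): for a free particle in one dimension, the sliced kernels of Eq. (1.166) can be composed numerically. To keep the integrals real, the sketch below uses the Euclidean (imaginary-time) version of the kernel; the mass, time and grid values are arbitrary choices.

```python
import numpy as np

# Free particle in one dimension, Euclidean (imaginary-time) kernel, so that
# every slice is a real Gaussian.  All parameter values are arbitrary choices.
m, T, N = 1.0, 1.0, 16
eps = T / N                      # the slice width called epsilon in the text

L, n = 6.0, 401                  # spatial grid replacing the x_k integrals
x = np.linspace(-L, L, n)
dx = x[1] - x[0]

# One-slice kernel (m / 2 pi eps)^{1/2} exp(-m (x - x')^2 / 2 eps)
K = np.sqrt(m / (2 * np.pi * eps)) * \
    np.exp(-m * (x[:, None] - x[None, :])**2 / (2 * eps))

# Composing N slices: each integral over an intermediate x_k becomes a
# matrix multiplication weighted by dx
P = K.copy()
for _ in range(N - 1):
    P = P @ K * dx

# Exact Euclidean kernel for the full time T, for comparison
exact = np.sqrt(m / (2 * np.pi * T)) * \
        np.exp(-m * (x[:, None] - x[None, :])**2 / (2 * T))

err = np.max(np.abs(P[n // 2] - exact[n // 2]))   # compare along the row x = 0
print(err)
```

The composed product of sliced kernels reproduces the exact finite-time kernel to high accuracy, which is the content of the time-slicing definition.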
1.6.2 Evaluation of the Relativistic Path Integral
In Sect. 1.3.1 we evaluated G(x2; x1) for a relativistic particle using the Jacobi action trick. The conventional action functional for a relativistic particle is given by Eq. (1.67), and we will now show how the path integral with this action can be directly evaluated by a lattice regularization procedure, even though it is not quadratic in the dynamical variables.
We will work in the Euclidean space of D dimensions, evaluate the path integral, and analytically continue to the Lorentzian space. We need to give meaning to the sum over paths

\[ G_E(x_2, x_1; m) = \sum_{\text{all } x(s)} \exp\big(-m\,\ell[x(s)]\big) \tag{1.168} \]

in the Euclidean sector, where

\[ \ell(x_2, x_1) = \int_{s_1}^{s_2} ds\, \sqrt{\frac{dx}{ds}\cdot\frac{dx}{ds}} \tag{1.169} \]

is just the length of the curve x(s), parametrized by s, connecting x(s1) = x1 and x(s2) = x2. Of course, the length is independent of the parametrization, and the integral defining ℓ is manifestly invariant under the reparametrization s → f(s).
Chapter 1. From Particles to Fields

[Footnote 49: Purely from dimensional analysis, we would expect the mass parameter μ(ε) to scale inversely with the lattice spacing; in fact, we will see that this is what happens.]

The G_E can be defined through the following limiting procedure: Consider a lattice of points in a D-dimensional cubic lattice with a uniform lattice spacing of ε. We will work out G_E on the lattice and will then take the limit of ε → 0 with a suitable measure. To obtain a finite answer, we have to use an overall normalization factor M(ε) in Eq. (1.168), as well as treat m (which is the only parameter in the problem) as varying with ε in a specific manner; i.e., we will use a function μ(ε) in place of m on the lattice and reserve the symbol m for the parameter in the continuum limit.⁴⁹ Thus the sum over paths in the continuum limit is defined by the limiting procedure

\[ G_E(x_2, x_1; m) = \lim_{\epsilon\to 0}\, \big[ M(\epsilon)\, G_E(x_2, x_1; \mu(\epsilon)) \big] \tag{1.170} \]
On a lattice with spacing ε, Eq. (1.168) can be evaluated in a straightforward manner. Because of the translation invariance of the problem, G_E can only depend on x2 − x1; so we can set x1 = 0 and call x2 = εR, where R is a D-dimensional vector with integral components: R = (n1, n2, n3, ..., nD). Let C(N, R) be the number of paths of length Nε connecting the origin to the lattice point εR. Since all such paths contribute a term exp[−μ(ε)(Nε)] to Eq. (1.168), we get:

\[ G_E(R; \epsilon) = \sum_{N=0}^{\infty} C(N; R)\, \exp\big(-\mu(\epsilon) N \epsilon\big) \tag{1.171} \]
[Exercise 1.16: Prove this result. One possible way is as follows: Consider a path of length N (all lengths in units of ε) with r_i steps to the right and l_i steps to the left along the i-th axis (i = 1, 2, ..., D). If Q[N, r_i, l_i] is the number of such paths connecting the origin to R = (n1, n2, ..., nD), show that (Σ x_i + Σ y_i)^N = Σ Q(N, r_i, l_i) x_i^{r_i} y_i^{l_i}; i.e., the left hand side is the generating function for Q. Next show that we are interested in the case where r_i − l_i = n_i is a given quantity for each i, and that the number of paths with this condition can be obtained by setting y_i = 1/x_i. Taking x_i = exp(ik_i) gives the result. Another (simpler) way to prove the result is to note that a path of length N which reaches R must have come from the paths of length N − 1 which have reached one of the 2D neighbouring sites in the previous step. This allows you to fix the Fourier transform of C(N; R) through a recursion relation.]
It can be shown from elementary combinatorics that the C(N; R) satisfy the condition

\[ F^N \equiv \left( e^{ik_1} + e^{ik_2} + \cdots + e^{ik_D} + e^{-ik_1} + \cdots + e^{-ik_D} \right)^N = \sum_R C(N; R)\, e^{i k\cdot R} \tag{1.172} \]
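The neighbour recursion mentioned in Exercise 1.16 is easy to check on a computer. The following sketch (mine, for the D = 1 case, with arbitrary parameters) counts C(N; R) by propagating from the 2D = 2 neighbours and compares against both the closed binomial form and the Fourier-coefficient reading of Eq. (1.172).

```python
import numpy as np
from math import comb

# Count C(N; R) in D = 1 by the neighbour recursion of Exercise 1.16.
def lattice_counts(N):
    """Return {site: C(N; site)} for an N-step nearest-neighbour walk from 0."""
    counts = {0: 1}
    for _ in range(N):
        new = {}
        for site, c in counts.items():
            for nb in (site - 1, site + 1):
                new[nb] = new.get(nb, 0) + c
        counts = new
    return counts

N = 6
C = lattice_counts(N)
assert C[0] == comb(6, 3) == 20        # closed form: binom(N, (N + n)/2)

# Eq. (1.172) in D = 1: F = 2 cos k, and C(N; n) is the Fourier coefficient
# (1/2pi) \int dk (2 cos k)^N e^{-ikn}; uniform samples evaluate it exactly
# because the integrand is a trigonometric polynomial.
M = 4096
k = 2 * np.pi * np.arange(M) / M
for n_site in (0, 2, 4):
    coeff = np.mean((2 * np.cos(k))**N * np.cos(k * n_site))
    assert abs(coeff - C.get(n_site, 0)) < 1e-9
print(C)
```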
Therefore,

\[ \sum_R e^{ik\cdot R}\, G_E(R; \epsilon) = \sum_{N=0}^{\infty} \sum_R C(N; R)\, e^{ik\cdot R}\, \exp\big(-\mu(\epsilon)N\epsilon\big) = \sum_{N=0}^{\infty} \left( F e^{-\mu(\epsilon)\epsilon} \right)^N = \left[ 1 - F e^{-\mu(\epsilon)\epsilon} \right]^{-1} \tag{1.173} \]

Inverting the Fourier transform, we get

\[ G_E(R; \epsilon) = \int \frac{d^D k}{(2\pi)^D}\, \frac{e^{-ik\cdot R}}{1 - e^{-\mu(\epsilon)\epsilon} F} = \int \frac{d^D k}{(2\pi)^D}\, \frac{e^{-ik\cdot R}}{1 - 2 e^{-\mu(\epsilon)\epsilon} \sum_{j=1}^{D} \cos k_j} \tag{1.174} \]
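As a sanity check (not from the text), Eqs. (1.173)-(1.174) can be verified numerically in D = 1, where C(N; R) is a binomial coefficient; I set ε = 1 and pick μ so that the geometric series converges.

```python
import numpy as np
from math import comb

# D = 1, eps = 1, so mu(eps)*eps is just mu; convergence needs 2 e^{-mu} < 1.
mu, R = 1.0, 2
w = np.exp(-mu)

# Direct sum over path lengths, Eq. (1.171): C(N; R) = binom(N, (N + R)/2),
# non-zero only when N and R have the same parity
direct = sum(comb(N, (N + R) // 2) * w**N for N in range(R, 400, 2))

# Fourier inversion, Eq. (1.174): the integrand is periodic, so a plain
# average of uniform Brillouin-zone samples is spectrally accurate.
M = 4096
k = 2 * np.pi * np.arange(M) / M
fourier = np.mean(np.cos(k * R) / (1 - 2 * w * np.cos(k)))

print(direct, fourier)
```

The direct sum over path lengths and the momentum-space inversion agree to machine precision.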
Converting to the physical length scales x = εR and p = ε⁻¹k gives

\[ G_E(x; \epsilon) = \int \frac{\epsilon^D\, d^D p}{(2\pi)^D}\, \frac{e^{-ip\cdot x}}{1 - 2 e^{-\mu(\epsilon)\epsilon} \sum_{j=1}^{D} \cos (p_j \epsilon)} \tag{1.175} \]
This is an exact result on the lattice, and we now have to take the limit ε → 0 in a suitable manner to keep the limit finite. As ε → 0, the denominator of the integrand becomes

\[ 1 - 2 e^{-\mu(\epsilon)\epsilon} \sum_{j=1}^{D}\cos(p_j\epsilon) \simeq 1 - 2 e^{-\mu(\epsilon)\epsilon}\left( D - \frac{\epsilon^2}{2}|p|^2 \right) = \epsilon^2 e^{-\mu(\epsilon)\epsilon} \left[ |p|^2 + \frac{1 - 2D e^{-\mu(\epsilon)\epsilon}}{\epsilon^2 e^{-\mu(\epsilon)\epsilon}} \right] \tag{1.176} \]
so that we get, for small ε,

\[ G_E(x; \epsilon) \simeq \int \frac{d^D p}{(2\pi)^D}\, \frac{A(\epsilon)\, e^{-ip\cdot x}}{|p|^2 + B(\epsilon)} \tag{1.177} \]

where

\[ A(\epsilon) = \epsilon^{D-2}\, e^{\mu(\epsilon)\epsilon}, \qquad B(\epsilon) = \frac{1}{\epsilon^2}\left[ e^{\mu(\epsilon)\epsilon} - 2D \right] \tag{1.178} \]
The continuum theory has to be defined in the limit of ε → 0 with some measure M(ε); that is, we want to choose M(ε) such that the limit

\[ G_E(x; m)\big|_{\rm continuum} = \lim_{\epsilon\to 0}\, \{ M(\epsilon)\, G_E(x; \epsilon) \} \tag{1.179} \]

is finite. It is easy to see that we only need to demand

\[ \lim_{\epsilon\to 0}\, \frac{1}{\epsilon^2}\left[ e^{\mu(\epsilon)\epsilon} - 2D \right] = m^2 \tag{1.180} \]

and

\[ \lim_{\epsilon\to 0}\, M(\epsilon)\, \epsilon^{D-2}\, e^{\mu(\epsilon)\epsilon} = 1 \tag{1.181} \]
to achieve this. The first condition implies that, near ε ≈ 0,

\[ \mu(\epsilon) \approx \frac{\ln 2D}{\epsilon} + \frac{m^2 \epsilon}{2D} \approx \frac{\ln 2D}{\epsilon} \tag{1.182} \]

The second condition, Eq. (1.181), allows us to determine the measure as

\[ M(\epsilon) = \frac{1}{2D}\, \frac{1}{\epsilon^{D-2}} \tag{1.183} \]
With this choice, we get

\[ \lim_{\epsilon\to 0}\, M(\epsilon)\, G_E(x; \epsilon) = \int \frac{d^D p}{(2\pi)^D}\, \frac{e^{-ip\cdot x}}{|p|^2 + m^2} \tag{1.184} \]

which is the continuum propagator we computed earlier (see Eq. (1.99)). This analysis gives a rigorous meaning to the path integral for the relativistic free particle.
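A minimal numerical illustration (my own, in D = 1, where the continuum answer is e^{−m|x|}/2m) of how the prescriptions of Eqs. (1.180)-(1.183) produce a finite continuum limit; the values of ε, m and x are arbitrary.

```python
import numpy as np

m, x = 1.0, 1.0                            # arbitrary mass and separation
exact = np.exp(-m * abs(x)) / (2 * m)      # continuum propagator in D = 1

def lattice_propagator(eps):
    D = 1
    emu = 2 * D + (m * eps)**2             # e^{mu(eps) eps}, from Eq. (1.180)
    Meps = eps**(2 - D) / (2 * D)          # measure M(eps), Eq. (1.183)
    R = round(x / eps)                     # x must lie on a lattice site
    nk = 1 << 16                           # uniform Brillouin-zone samples
    k = 2 * np.pi * np.arange(nk) / nk
    # Eq. (1.174): periodic integrand, so averaging uniform samples is
    # spectrally accurate; cos replaces e^{-ikR} because G_E is even in k
    G = np.mean(np.cos(k * R) / (1 - (2.0 / emu) * np.cos(k)))
    return Meps * G

for eps in (0.1, 0.05, 0.025):
    print(eps, lattice_propagator(eps), exact)
```

As ε shrinks, M(ε)G_E(x; ε) converges to the continuum propagator, with corrections that vanish like ε².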
The rest is easy. We write (|p|² + m²)⁻¹ as an integral over λ of exp[−λ(|p|² + m²)] and, doing the p integration, we get

\[ G_E = \int_0^\infty d\lambda\, e^{-\lambda m^2} \int \frac{d^D p}{(2\pi)^D}\, e^{ip\cdot x - \lambda p^2} = \int_0^\infty d\lambda\, e^{-\lambda m^2} \left( \frac{1}{4\pi\lambda} \right)^{D/2} e^{-|x|^2/4\lambda} \tag{1.185} \]
When D = 4, this reduces to

\[ G_E = \frac{1}{16\pi^2} \int_0^\infty \frac{d\lambda}{\lambda^2}\, \exp\left( -m^2\lambda - \frac{|x|^2}{4\lambda} \right) \tag{1.186} \]
To analytically continue from Euclidean to Lorentzian spacetime, we have to change the sign of one of the coordinates in |x|², obtaining |x|² − t², and set λ = is. This gives

\[ G_E = -\frac{i}{16\pi^2} \int_0^\infty \frac{ds}{s^2}\, e^{-im^2 s + \frac{i}{4s}(|x|^2 - t^2)} = -\frac{i}{16\pi^2} \int_0^\infty \frac{ds}{s^2}\, e^{-im^2 s - \frac{i}{4s}x^2} \tag{1.187} \]

which matches the expression in Eq. (1.75).
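As an aside (not in the book), the proper-time integral in Eq. (1.186) has a closed form in terms of the modified Bessel function K₁, namely G_E = m K₁(m|x|)/(4π²|x|); the snippet below checks this numerically with arbitrary parameter values.

```python
import numpy as np
from scipy.integrate import quad
from scipy.special import k1

m, r = 1.0, 1.0                            # arbitrary mass and Euclidean radius

# Proper-time integral of Eq. (1.186); the integrand vanishes at both ends
integral, _ = quad(lambda lam: np.exp(-m**2 * lam - r**2 / (4 * lam)) / lam**2,
                   0, np.inf)
G_E = integral / (16 * np.pi**2)

# Closed form in terms of the modified Bessel function of the second kind
closed = m * k1(m * r) / (4 * np.pi**2 * r)

print(G_E, closed)
```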
Chapter 2
Disturbing the Vacuum

2.1 Sources that Disturb the Vacuum
We started our discussion in Sect. 1.1 with the amplitude for the process A = D(x2)G(x2; x1)C(x1), involving the creation, propagation and detection (usually accompanied by destruction) of a particle, and concentrated on the middle part G(x2; x1). This analysis, however, dragged us into a description in which φ†(x1)|0⟩ could be interpreted as creating a particle from a no-particle state |0⟩. Similarly, φ(x2) acting on this state destroys the particle, taking the state back to |0⟩. We now want to study more carefully the creation and destruction of the particle by external agencies and see what it can tell us. Surprisingly enough, we can learn a lot from such a study and even determine G(x2; x1) using some basic principles. This provides a second perspective on G(x2; x1) and, what is more, it will again lead to a description involving fields.
2.1.1 Vacuum Persistence Amplitude and G(x2; x1)

We will start by describing the notion of a source which is capable of creating or destroying a particle.¹ We introduce the amplitude ⟨1p|0−⟩^J ≡ iJ(p), which describes the creation of a one-particle state, with on-shell momentum p = (ωp, p), due to the action of a source (labeled by the superscript J) from the no-particle state |0−⟩ at very early times (i.e., as t → −∞). The i-factor is introduced on the right hand side of ⟨1p|0−⟩^J ≡ iJ(p) for future convenience. The corresponding amplitude for the destruction will be ⟨0+|1p⟩^J, where |0+⟩ denotes the no-particle state at very late times (i.e., as t → +∞), after all the sources have ceased to operate. This state |0+⟩, in general, could be different from |0−⟩.

It is convenient to start with "weak sources", the action of which only leads to single-particle, rather than multi-particle, intermediate states during the entire evolution of the system from t = −∞ to t = ∞; i.e., we think of the source as a mild perturbation on the no-particle state. We can then easily show that ⟨0+|1p⟩^J = iJ*(p). To do this, we begin with the relation ⟨0−|1p⟩^{J=0} = 0, which represents the orthogonality of the one-particle and no-particle states in the absence of sources, and expand this relation by introducing a complete set of states in the presence of the source. This will
© Springer International Publishing Switzerland 2016
T. Padmanabhan, Quantum Field Theory, Graduate Texts in Physics, DOI 10.1007/978-3-319-28173-5_2
[Footnote 1: A purist will call the agency that creates the source, and the one that destroys, the sink; we won't be that fussy.]
give

\[ 0 = \langle 0_-|1_p\rangle^{J=0} = \langle 0_-|0_+\rangle^{J}\, \langle 0_+|1_p\rangle^{J} + \sum_q \langle 0_-|1_q\rangle^{J}\, \langle 1_q|1_p\rangle^{J} + \cdots \simeq 1\times\langle 0_+|1_p\rangle^{J} + \sum_q \big(-iJ^*(q)\big)\,\delta_{qp} \tag{2.1} \]
[Footnote 2: The matrix element ⟨ψ|e^{−iHt}|φ⟩, to the lowest order in t, will have the term −it⟨ψ|H|φ⟩. On the other hand, the matrix element ⟨φ|e^{−iHt}|ψ⟩ will have −it⟨φ|H|ψ⟩. Though ⟨ψ|H|φ⟩ = ⟨φ|H|ψ⟩*, we need to take into account the extra i-factor in relating the two. The origin of the i-factor in ⟨0+|1p⟩^J = iJ*(p) is similar.]

In arriving at the second equality, we have ignored multi-particle intermediate states (denoted by · · ·) because the source is weak. For the same reason we can replace ⟨0−|0+⟩^J by its value for J = 0, viz., ⟨0−|0+⟩^{J=0} = 1, to leading order in J; replace ⟨1q|1p⟩^J by ⟨1p|1q⟩^{J=0} = δpq (which is again the value for J = 0); and use ⟨0−|1q⟩^J = (⟨1q|0−⟩^J)* = −iJ*(q). Equation (2.1) then leads to ⟨0+|1p⟩^J = iJ*(p). (This is why we kept an extra i-factor² in the definition of the amplitudes, so that J and J* describe the creation and annihilation amplitudes.)
It will be convenient to have a corresponding spacetime description of the source. This can be achieved by introducing a real scalar function J(x) and the four dimensional Fourier transform

\[ J(p) \equiv \int d^4x\, e^{ip\cdot x}\, J(x); \qquad J^*(p) = \int d^4x\, e^{-ip\cdot x}\, J(x) \tag{2.2} \]

and identifying the on-shell amplitudes −i⟨1p|0−⟩^J = J(p) = J(ωp, p) with the on-shell values of the Fourier transform defined above. Note that we are considering an arbitrary J(x) and a correspondingly arbitrary J(p) [except for the condition J*(p) = J(−p), ensuring the reality of J(x)]; but the amplitude ⟨1p|0−⟩^J only depends on the on-shell behaviour of J(p).
After these preliminaries, we turn our attention to the amplitude ⟨0+|0−⟩^J for the no-particle state to remain a no-particle state even after the action of J(x), once as a source and once as a sink. (The ⟨0+|0−⟩^J is usually called the vacuum persistence amplitude.) Because the source is assumed to be weak, this amplitude will differ from unity by a small quantity: ⟨0+|0−⟩^J = 1 + F[J]. Further, since each action of J changes the state only from a no-particle state to a one-particle state and vice versa, F[J] must be a quadratic functional of J to the lowest order. Therefore, the amplitude can be expressed in the form

\[ \langle 0_+|0_-\rangle^J = 1 + F[J] \equiv 1 - \frac{1}{2}\int d^4x_2\, d^4x_1\, J(x_2)\, S(x_2 - x_1)\, J(x_1) \tag{2.3} \]

where S(x2 − x1) is a function, which can be taken to be symmetric without loss of generality,

\[ S(x_2 - x_1) = S(x_1 - x_2) \tag{2.4} \]

and which needs to be determined. It can depend only on x2 − x1 because of translational invariance. (The −1/2 factor in Eq. (2.3) is introduced for future convenience.) To determine its form, we can use the following argument.³

[Footnote 3: Originally due to J. Schwinger; in fact, this entire description is adapted from Schwinger's Source Theory, a powerful alternative to conventional quantum field theory, almost universally ignored by particle physicists.]
Imagine that we divide the source J(x) into two parts, J1(x) and J2(x) (with J = J1 + J2), such that J1 acts in the causal future of J2 in some frame of reference. This is easily done by restricting the support of the functions J1(x), J2(x) to suitable regions of spacetime. When J2(x) acts on the no-particle state it will, most of the time, do nothing, while occasionally it will produce a particle with the amplitude ⟨1p|0−⟩^{J2}, so that we can assume

\[ \langle 0_+|0_-\rangle^{J_2} \simeq 1 + F[J_2] \tag{2.5} \]

After J2 has ceased to operate, the resulting state (either the no-particle or the one-particle state) will evolve undisturbed until J1 starts operating. If the latter destroys the single-particle state with the amplitude ⟨0+|1p⟩^{J1}, we would have reached back the no-particle state. The entire process is described by the amplitude

\[ \langle 0_+|0_-\rangle^{J_1+J_2} = \langle 0_+|0_-\rangle^{J_1}\, \langle 0_+|0_-\rangle^{J_2} + \sum_p \langle 0_+|1_p\rangle^{J_1}\, \langle 1_p|0_-\rangle^{J_2} = 1 + F[J_1] + F[J_2] + \sum_p \big(iJ_1^*(p)\big)\big(iJ_2(p)\big) \tag{2.6} \]
We now demand that the expression for ⟨0+|0−⟩^{J1+J2} should not depend on the manner in which J is separated into J1 and J2, but should be a functional of J alone, so that

\[ \langle 0_+|0_-\rangle^{J_1+J_2} \equiv \langle 0_+|0_-\rangle^{J} = 1 + F[J] = 1 + F[J_1 + J_2] \tag{2.7} \]

where we have used Eq. (2.3). Substituting into Eq. (2.6), we get

\[ F[J_1 + J_2] - F[J_1] - F[J_2] = \sum_p \big(iJ_1^*(p)\big)\big(iJ_2(p)\big) \tag{2.8} \]

Using the form of F in Eq. (2.3), the left hand side works out to be

\[ -\frac{1}{2}\int d^4x_2\, d^4x_1\, \big[ J_1(x_1)J_2(x_2) + J_2(x_1)J_1(x_2) \big]\, S(x_2 - x_1) = -\int d^4x_2\, d^4x_1\, J_2(x_2)\, J_1(x_1)\, S(x_2 - x_1) \tag{2.9} \]
where we have used the symmetry of S(x). The right hand side of Eq. (2.8), on the other hand, depends only on the on-shell value of J(p) and is given by, for real sources,

\[ \sum_p \big(iJ_1^*(p)\big)\big(iJ_2(p)\big) = -\int d^4x_2\, d^4x_1 \int d\Omega_p\, J_2(x_2)\, J_1(x_1)\, e^{-ip(x_2-x_1)} = -\int d^4x_2\, d^4x_1\, J_2(x_2) \left[ \int d\Omega_p\, e^{-ip(x_2-x_1)} \right] J_1(x_1) \tag{2.10} \]

where we have converted the sum to an integral⁴ with the invariant measure dΩp. Comparison of Eq. (2.9) and Eq. (2.10) allows us to read off the form of S(x):

\[ S(x) = \int d\Omega_p\, e^{-ipx} = G_+(x_2; x_1) \qquad (\text{for } t > 0) \tag{2.11} \]

[Footnote 4: The summation over momenta on the left hand side of the above equation is an on-shell summation over the 3-vector p, and e^{−ip(x−x′)} is evaluated on-shell; therefore, to maintain Lorentz invariance, it has to be converted to an integral over dΩp.]

To find the form of S(x) = S(t, x) for negative values of t = −|t|, we use the symmetry property S(−|t|, x) = S(+|t|, −x) and use Eq. (2.11) for S(|t|, −x). This gives:

\[ S(-|t|, x) = S(|t|, -x) = \int d\Omega_p\, e^{-i\omega_p|t| - ip\cdot x} = \int d\Omega_p\, e^{i\omega_p t - ip\cdot x} = \int d\Omega_p\, e^{ipx} = G_-(x_2; x_1) \tag{2.12} \]
In arriving at the third equality, we have used the fact that |t| = −t for t < 0. Using the usual trick of flipping the sign of p, we can write the full expression for S(x) as

\[ S(x) = \int d\Omega_p\, e^{+ip\cdot x}\, e^{-i\omega_p|t|} = G(x_2; x_1) \tag{2.13} \]

[Footnote 5: We used the causal ordering of x2 and x1 to determine the form of S(x). Such a causal ordering has a Lorentz invariant meaning only when x1 and x2 are related by a timelike interval. But we know that when these two events are separated by a spacelike interval, the numerical values of the expressions in Eq. (2.11) and Eq. (2.12) are the same; see Eq. (1.89). Therefore our analysis determines S(x) everywhere unambiguously.]

The identification of S(x) with G(x2; x1) allows us to write the amplitude ⟨0+|0−⟩^J in the form⁵

\[ \langle 0_+|0_-\rangle^J = 1 - \frac{1}{2}\int d^4x_2\, d^4x_1\, J(x_2)\, S(x_2 - x_1)\, J(x_1) = 1 - \frac{1}{2}\int d^4x_2\, d^4x_1\, J(x_2)\, G(x_2; x_1)\, J(x_1) \tag{2.14} \]

This has an obvious and nice interpretation: a particle is created at x1 by the action of the source, propagates from x1 to x2, and is destroyed by the source acting for a second time. The fact that the propagation amplitude G(x2; x1), originally found using the path integral for a relativistic particle, emerges from a completely different perspective (of sources perturbing the no-particle state, thereby creating and destroying particles) shows that we are on the right track in our interpretation. If we now write the amplitude G(x2; x1) using Eq. (1.86), separating out propagation with positive and negative energies, the integral in Eq. (2.14) will also separate into two terms, allowing an interpretation in terms of particles and antiparticles.
So far, we have been discussing a weak source, for which J(x) should be thought of as a small perturbation. In the case of a strong source, the situation becomes more complicated. But it can be handled, essentially by exponentiating the result in Eq. (2.3), if we consider the particles to be non-interacting with each other. In this case, the source can create and destroy more than one particle, but each pair of creation and destruction events is treated as independent. This assumes that: (i) the parts of the source which participate in the creation and destruction of particles do not influence each other, and (ii) the propagation amplitude for any given particle is not affected by the presence of other particles. Then the amplitude ⟨0+|0−⟩^J is given by the product of the amplitudes for the individual events and we get

\[ \langle 0_+|0_-\rangle^J = \exp\left[ -\frac{1}{2}\int d^4x_2\, d^4x_1\, J(x_2)\, G(x_2; x_1)\, J(x_1) \right] \tag{2.15} \]
To understand this result, let us again consider a situation where J = J1 + J2, with J1 acting in the future of J2. On substituting J = J1 + J2 in Eq. (2.15), it separates into a product of three terms expressible as

\[ \langle 0_+|0_-\rangle^J = \langle 0_+|0_-\rangle^{J_1}\, \exp\left[ -\int d^4x\, d^4x'\, J_1(x)\, G(x; x')\, J_2(x') \right] \langle 0_+|0_-\rangle^{J_2} \tag{2.16} \]

The first and the last factors are easy to understand as the effects of the individual components, while the middle factor describes the creation of particles by J2 followed by their destruction by J1. (This is just the exponential form of the result in Eq. (2.9).) Expressing the argument of the exponential in Fourier space as Σp (iJ1*(p))(iJ2(p)) (see Eq. (2.10)) and expanding the
exponential:

\[ \exp\left[ \sum_p \big(iJ_{1p}^*\big)\big(iJ_{2p}\big) \right] = \prod_p \exp\left[ \big(iJ_{1p}^*\big)\big(iJ_{2p}\big) \right] = \prod_p \sum_{n_p=0}^{\infty} \frac{\big(iJ_{1p}^*\big)^{n_p}}{(n_p!)^{1/2}}\, \frac{\big(iJ_{2p}\big)^{n_p}}{(n_p!)^{1/2}} \tag{2.17} \]
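The algebraic identity behind Eq. (2.17) can be checked directly for a single mode; the complex numbers below stand in for iJ1p* and iJ2p and are arbitrary choices of mine.

```python
import cmath
from math import factorial

a = 0.3 + 0.4j          # plays the role of iJ1p* (arbitrary)
b = -0.2 + 0.5j         # plays the role of iJ2p  (arbitrary)

# Eq. (2.17) for one mode, keeping the (n!)^{1/2} (n!)^{1/2} split of the
# factorial that later gets shared between the two amplitudes in Eq. (2.19)
series = sum((a**n / factorial(n)**0.5) * (b**n / factorial(n)**0.5)
             for n in range(40))
print(series, cmath.exp(a * b))
```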
we find that the entire amplitude in Eq. (2.16) can be written in the form

\[ \langle 0_+|0_-\rangle^J = \sum_{\{n\}} \langle 0_+|\{n\}\rangle^{J_1}\, \langle \{n\}|0_-\rangle^{J_2} \tag{2.18} \]

where we have defined

\[ \langle \{n\}|0_-\rangle^J = \langle 0_+|0_-\rangle^J \prod_p \frac{(iJ_p)^{n_p}}{(n_p!)^{1/2}}; \qquad \langle 0_+|\{n\}\rangle^J = \langle 0_+|0_-\rangle^J \prod_p \frac{(iJ_p^*)^{n_p}}{(n_p!)^{1/2}} \tag{2.19} \]
and the summation in Eq. (2.18) is over all sets of integers labeled by the momenta. The expression in Eq. (2.18) has a straightforward interpretation: The source J2 creates a set of particles, labeled by the occupation numbers np in the momentum states p, with amplitude ⟨{n}|0−⟩, and these are subsequently destroyed by J1 with amplitude ⟨0+|{n}⟩. Eq. (2.19) shows that each of these amplitudes is made of the source and sink acting independently to produce multiple particles, indicated by the factor Jp^{np} etc. The (np!)^{1/2} factor in the amplitude gives rise to (np!) in the probability, taking care of the indistinguishability of the created particles. The standard translational invariance argument now shows that the factor Jp^{np} gives the state |{n}⟩ the four-momentum P^a = Σp np p^a, as expected. Thus, our exponentiated result in Eq. (2.15) correctly describes a set of non-interacting particles produced and destroyed by the source J.
The result in Eq. (2.15) can also be obtained more formally along the following lines: Consider a strong source J divided up into a large number of small, uncorrelated and independent weak sources, such that J = Σα Jα, where α = 1, 2, ..., N with very large N. The independence of the weak components is characterized by the condition

\[ \int d^4x\, d^4x'\, J_\alpha(x)\, G(x; x')\, J_\beta(x') = 0 \qquad (\text{for } \alpha \neq \beta) \tag{2.20} \]

In that case, the net amplitude ⟨0+|0−⟩^J is given by the product of the individual amplitudes produced by each part of the source:

\[ \langle 0_+|0_-\rangle^J = \prod_\alpha \left[ 1 - \frac{1}{2}\int d^4x_2\, d^4x_1\, J_\alpha(x_2)\, G(x_2; x_1)\, J_\alpha(x_1) \right] = \exp\left[ -\frac{1}{2}\sum_\alpha \int d^4x_2\, d^4x_1\, J_\alpha(x_2)\, G(x_2; x_1)\, J_\alpha(x_1) \right] \]
\[ = \exp\left[ -\frac{1}{2}\sum_{\alpha\beta} \int d^4x_2\, d^4x_1\, J_\alpha(x_2)\, G(x_2; x_1)\, J_\beta(x_1) \right] = \exp\left[ -\frac{1}{2}\int d^4x_2\, d^4x_1\, J(x_2)\, G(x_2; x_1)\, J(x_1) \right] \tag{2.21} \]

The second equality follows from the fact that for small x we can write (1 + x) ≈ e^x in order to convert the products into sums in the exponential.

[Exercise 2.1: Make sure you understand the notation and the validity of the interchange of products and summations in these expressions.]
[Footnote 6: Eventually this will translate into what is called a free-field theory coupled to an external, c-number source, in contrast to an interacting field theory.]

The third equality follows from Eq. (2.20). Thus, even in the case of a strong source, we can express the amplitude ⟨0+|0−⟩ entirely in terms of G(x2; x1), as long as we ignore the interaction between the particles.⁶
2.1.2 Vacuum Instability and the Interaction Energy of the Sources
The above result allows us to extract some interesting conclusions regarding the effect of external sources on the vacuum (that is, no-particle) state. In general, the complex number ⟨0+|0−⟩ will have a modulus and a phase, and physical consistency requires that |⟨0+|0−⟩|² should contain the information about the creation of real, on-shell particles from the vacuum by the action of the external source. This, in turn, imposes two constraints on this quantity: First, we must have |⟨0+|0−⟩|² < 1 for a probabilistic interpretation. Second, it can only depend on the on-shell behaviour of J(p). We will now verify that these conditions are met and, in the process, obtain an expression for the mean number of particles produced by the action of the source. From Eq. (2.21) we find that |⟨0+|0−⟩|² is given by

\[ |\langle 0_+|0_-\rangle|^2 = \exp\left[ -\frac{1}{2}\int d^4x_2\, d^4x_1\, J(x_2)\, J(x_1)\, \big[ G(x_2; x_1) + G^*(x_2; x_1) \big] \right] \tag{2.22} \]
where

\[ G(x_2; x_1) + G^*(x_2; x_1) = \int d\Omega_p \left[ e^{ip\cdot x - i\omega_p|t|} + e^{-ip\cdot x + i\omega_p|t|} \right] = \int d\Omega_p\, e^{ip\cdot x} \left[ e^{-i\omega_p|t|} + e^{+i\omega_p|t|} \right] \]
\[ = 2\int d\Omega_p\, e^{ip\cdot x} \cos(\omega_p|t|) = 2\int d\Omega_p\, e^{ip\cdot x} \cos(\omega_p t) = \int d\Omega_p \left[ e^{-ipx} + e^{ipx} \right] \tag{2.23} \]

[Exercise 2.2: Write Eq. (2.21) in momentum space and use (z + iε)⁻¹ = P(z⁻¹) − iπδ(z) (where P denotes the principal value) to get the result in Eq. (2.24) faster.]
In arriving at the second equality, we have done the usual flipping of the vector p in the second term; in arriving at the fourth equality, we have used the fact that the cosine is an even function; and finally, we have flipped the vector p again in the second term to get the last expression. Therefore

\[ \frac{1}{2}\int d^4x_2\, d^4x_1\, J(x_2)\, J(x_1)\, \big[ G(x_2; x_1) + G^*(x_2; x_1) \big] = \frac{1}{2}\int d^4x_2\, d^4x_1\, J(x_2)\, J(x_1) \int d\Omega_p \left[ e^{-ipx} + e^{ipx} \right] \]
\[ = \int d^4x_2\, d^4x_1\, J(x_2)\, J(x_1) \int d\Omega_p\, e^{-ip(x_2-x_1)} = \int d\Omega_p\, |J(\omega_p, p)|^2 \tag{2.24} \]

In arriving at the second equality, we have used the fact that the product J(x1)J(x2) is symmetric in x1 and x2. This allows us to retain only the e^{−ipx} term and multiply the result by a factor of 2. Substituting Eq. (2.24) into Eq. (2.22), we find:

\[ |\langle 0_+|0_-\rangle|^2 = \exp\left[ -\int \frac{d^3p}{(2\pi)^3}\, \frac{1}{2\omega_p}\, |J(\omega_p, p)|^2 \right] \tag{2.25} \]
The result (which is gratifyingly less than unity) shows that one can interpret (1/2ωp)|J(ωp, p)|² as the probability density for the creation of a particle with momentum p. As expected, the result depends only on the on-shell behaviour J(ωp, p) of J(p). Since ωp ≥ m, this probability can be non-zero only if the Fourier component J(ωp, p) is non-zero, suggesting that a static source cannot produce any particles.⁷

[Footnote 7: This 'obvious' result is not always true, as we will see in later chapters! It depends on the nature of the coupling.]

Going back to the notation in which the integration over dΩp is replaced by a summation over the on-shell momenta, we can write the same result in an intuitively understandable manner as

\[ |\langle 0_+|0_-\rangle|^2 = \exp\left[ -\int d\Omega_p\, |J(\omega_p, p)|^2 \right] = \exp\left[ -\sum_p (iJ_p)(iJ_p)^* \right] = \prod_p \exp\left[ -|\langle 1_p|0_-\rangle|^2 \right] \tag{2.26} \]
Thus, |⟨0+|0−⟩|² is determined by the probability for the production of single particles from the initial vacuum state, which is the (only) input we started out with. The production of particles is essentially governed by a Poisson distribution of the form λⁿe^{−λ}/n!, with λ = |⟨1p|0−⟩|² in each mode. The total probability is given by the product over all the modes. When the argument of the exponent is small, corresponding to a weak source, this result reduces to

\[ |\langle 0_+|0_-\rangle|^2 \approx 1 - \sum_p |\langle 1_p|0_-\rangle|^2 \tag{2.27} \]

which is essentially a statement of probability conservation. Thus everything makes sense. This is the first non-trivial application of the formalism we have developed so far; it gets better.

[Exercise 2.3: Compute the probability for the source to produce n particles with momentum p and show that it is indeed given by a Poisson distribution.]
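In the spirit of Exercise 2.3 (this sketch is mine, with an arbitrary value of λ), the single-mode probabilities P(n) = λⁿe^{−λ}/n! implied by the exponentiated amplitude can be checked to form a normalized Poisson distribution with mean λ.

```python
import math

lam = 0.7                                     # |<1p|0->|^2 for one mode (arbitrary)

# Poisson probabilities for producing n quanta in this mode
P = [lam**n * math.exp(-lam) / math.factorial(n) for n in range(60)]

total = sum(P)                                # should equal 1
mean_n = sum(n * p for n, p in enumerate(P))  # mean number of produced quanta
print(total, mean_n)
```

Note that P[0] = e^{−λ} is precisely the single-mode vacuum persistence probability of Eq. (2.26).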
The above analysis concentrated on the probability |⟨0+|0−⟩|², which, of course, loses the information about the phase of ⟨0+|0−⟩. As it turns out, the phase also contains valuable information about the physical system. To unravel this, let us consider another extreme example: a source J(x) which is static. Obviously, we do not expect any real particle production to take place in this context, and we must have |⟨0+|0−⟩|² = 1. What we are interested in is the phase of the complex amplitude ⟨0+|0−⟩. For a static source,⁸ the argument of the exponent occurring in ⟨0+|0−⟩ is given by:
\[ -\frac{1}{2}\int d^3x_1\, d^3x_2\, J(x_1)\, J(x_2) \int_{-\infty}^{\infty} dt_1 \int_{-\infty}^{\infty} dt_2 \int d\Omega_p\, e^{ip\cdot x}\, e^{-i\omega_p|t_2 - t_1|} \]
\[ = -\frac{1}{2}\int d^3x_1\, d^3x_2 \int d\Omega_p\, J(x_1)\, J(x_2)\, e^{ip\cdot(x_2 - x_1)} \int_{-\infty}^{\infty} dt_1 \int_{-\infty}^{\infty} dt_2\, e^{-i\omega_p|t_2 - t_1|} \tag{2.28} \]

To do the time integrals, introduce the variables T = (1/2)(t1 + t2) and t = t2 − t1, so that dt1 dt2 = dT dt. The integral over T is of course divergent, but we will deal with it later on; the integral over t leads to a factor (−2i/ωp), making the argument of the exponent:

\[ i\int_{-\infty}^{\infty} dT \int d^3x_1\, d^3x_2\, J(x_1)\, J(x_2) \int \frac{d^3p}{(2\pi)^3}\, \frac{1}{2\omega_p^2}\, e^{ip\cdot(x_2 - x_1)} \equiv -i\int_{-\infty}^{\infty} dT\, E \tag{2.29} \]

[Footnote 8: More precisely, we are talking about a source which varies very little over a timescale L/c, where L is the spatial extent of the source; but we will be a little cavalier about this precise definition and simply take J(t, x) = J(x).]
where

\[ E \equiv \frac{1}{2}\int d^3x\, d^3y\, J(x)\, V(x - y)\, J(y) \tag{2.30} \]

with

\[ V(x) = -\int \frac{d^3p}{(2\pi)^3}\, \frac{e^{ip\cdot x}}{p^2 + m^2} = -\frac{e^{-m|x|}}{4\pi|x|} \tag{2.31} \]

So we can express Eq. (2.28) as:

\[ \langle 0_+|0_-\rangle = \exp\left[ -i\int_{-\infty}^{\infty} dT\, E \right] \tag{2.32} \]

[Footnote 9: The integral in Eq. (2.31) is essentially the Fourier transform of the propagation amplitude in the static case, obtained by setting p⁰ = 0 in the Fourier transform of the propagator. So we find that the spatial Fourier transform of the propagation amplitude gives the effective interaction potential of the theory. This turns out to be a fairly general feature.]

[Footnote 10: The fact that "like" sources attract while interacting through a scalar field (which is what our G(x2; x1) was reinterpreted as in Sect. 1.5) is related to its Lorentz transformation properties. We will see later (see Sect. 3.6.2) that if the sources were vector fields Ja(x) interacting via a corresponding vector field, then like sources would repel each other (as in the case of electromagnetism). If the sources were second rank symmetric tensor fields Tab(x), then like sources would again attract; the corresponding field in that case is called gravity, but we don't understand it.]
Thus, all that the static source has done is to increase the phase of the amplitude ⟨0+|0−⟩ at a steady rate E for an infinite duration. This suggests interpreting E as the energy due to the presence of the static source J(x). (This also agrees with the interpretation based on Eq. (1.40), generalized to the field theory context.)⁹

The expression in Eq. (2.31) shows that E is negative, indicating that the sources attract each other. Further, the explicit form of V(x) allows us to interpret the resulting force of attraction as due to a Yukawa potential of mass m. It is as though the exchange of particles of mass m between the two sources, governed by the amplitude G(x2; x1), appears in Fourier space as an effective potential of interaction between the sources which are capable of exchanging these particles. This is remarkable when you remember that the amplitude G(x2; x1) in real space shows no hint of such varied dynamical properties.¹⁰ All that remains is to bring out the fields which are hiding inside these expressions. This task, as you will see, turns out to be somewhat easier now.
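The Fourier transform in Eq. (2.31) is easy to verify numerically (my check, with arbitrary m and r): after the angular integrals, V(r) reduces to a one-dimensional oscillatory integral that SciPy's Fourier-weight quadrature handles directly.

```python
import numpy as np
from scipy.integrate import quad

m, r = 1.0, 2.0                      # arbitrary mass and separation

# After the angular integrals, Eq. (2.31) reduces to
#   V(r) = -(1/(2 pi^2 r)) \int_0^inf dp  p sin(p r)/(p^2 + m^2).
# The oscillatory, slowly decaying integrand suits quad's 'sin' weight (QAWF).
I, _ = quad(lambda p: p / (p**2 + m**2), 0, np.inf, weight='sin', wvar=r)
V = -I / (2 * np.pi**2 * r)

yukawa = -np.exp(-m * r) / (4 * np.pi * r)   # closed form of Eq. (2.31)
print(V, yukawa)
```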
2.2 From the Source to the Field
We have found earlier, in Sect. 1.5, that G(x2; x1) can be reinterpreted in terms of a field φ(x), which, in turn, is made of an infinite number of harmonic oscillators, each labeled by a momentum vector p. In that interpretation, G(x2; x1) was related to the expectation value of the operator T[φ(x2)φ†(x1)] in the ground state of all the oscillators. Somewhat intriguingly, we also found that the first excited state of the oscillator labeled by p corresponds to a one-particle state with momentum p.

In Sect. 2.1 we approached the problem from a different perspective, by explicitly introducing sources J(x) which can create or destroy a particle. Consistency of the formalism demands that we should be able to recover the field φ(x), as well as the harmonic oscillators it is made of, from the expression for G_J(x2; x1). (This is exactly in the same spirit as our 'discovering' the fields from the expression for G(x2; x1) in Sect. 1.5.) As you might have guessed, both these goals can indeed be achieved, thereby connecting the discussion in Sect. 2.1 with that in Sect. 1.5.

We will first discuss, in the next section, how one can obtain the field φ(x) and its action functional from the expression for G_J(x2; x1). In Sect. 2.2.3, we will show how to relate this expression to individual harmonic oscillators (which turns out to be a lot easier).
2.2.1 Source to Field: Via Functional Fourier Transform
To motivate the connection between ⟨0+|0−⟩^J and the fields in spacetime, we start with the expression for the vacuum persistence amplitude:

\[ \langle 0_+|0_-\rangle^J = \exp\left[ -\frac{1}{2}\int d^4x\, d^4y\, J(x)\, G(x; y)\, J(y) \right] \tag{2.33} \]

which depends both on the nature of the source J(x) and on the propagation amplitude G(x; y), and ask how we can extract the information contained in G(x; y) without having to bother about the source J(x). To do this, we will rewrite Eq. (2.33) in a suggestive form as

\[ \langle 0_+|0_-\rangle^J \to \mathcal{G}(J) \equiv \exp\left[ -\frac{1}{2}\sum_{x,y} J_x M_{xy} J_y \right] = \exp\left[ -\frac{1}{2}\big(J^T M J\big) \right] \tag{2.34} \]

where M(x − y) ≡ G(x; y) is treated as a symmetric matrix with elements denoted by Mxy, and J(y) is treated as a column vector with elements denoted by Jy. This is a discretised representation of the spacetime continuum, in which the integrals over d⁴x and d⁴y (suitably discretized in a dimensionless form) become summations over the discrete indices. It is now obvious that we can express Mxy as the second derivative of ⟨0+|0−⟩^J evaluated at J = 0:

\[ M_{xy} = G(x; y) = -\left.\frac{\partial^2 \langle 0_+|0_-\rangle^J}{\partial J_x\, \partial J_y}\right|_{J=0} \tag{2.35} \]
On the other hand, we know from the earlier analysis (see Eq. (1.128)) that G(x; y) = ⟨0|T[φ(x)φ(y)]|0⟩, where we have taken the field φ to be Hermitian for simplicity. This gives the identification

\[ -\left.\frac{\partial^2 \langle 0_+|0_-\rangle^J}{\partial J_x\, \partial J_y}\right|_{J=0} = G(x; y) = \langle 0|T[\phi(x)\phi(y)]|0\rangle \tag{2.36} \]

Such a result is reminiscent of generating functions related to probability distributions. The right hand side of Eq. (2.36) is the expectation ('mean') value of the time-ordered product φ(x)φ(y), which can be compared to the correlation function of a stochastic variable. The ⟨0+|0−⟩^J on the left hand side then becomes analogous to the generating function for the probability distribution. Recall that the generating function G(λi) (given in terms of a set of variables λ1, λ2, ..., λN) and the corresponding probability distribution P(qi) (given in terms of some variables q1, q2, ..., qN) are related¹¹ by:
\[ \mathcal{P}(q_i) \equiv \int \left( \prod_{j=1}^{N} d\lambda_j\, e^{-i\lambda_j q_j} \right) \mathcal{G}(\lambda_j); \qquad \mathcal{G}(\lambda_i) = \int \left( \prod_{j=1}^{N} dq_j\, e^{i\lambda_j q_j} \right) \mathcal{P}(q_j) \tag{2.37} \]

so that

\[ \left.\frac{\partial^2 \mathcal{G}}{\partial \lambda_i\, \partial \lambda_j}\right|_{\lambda=0} = -\int \prod_{k=1}^{N} dq_k\, \mathcal{P}(q_k)\, q_i q_j \equiv -\langle q_i q_j \rangle \tag{2.38} \]

If the generating function is a Gaussian, \(\mathcal{G}(\lambda_i) = \exp\big[-(1/2)\sum_{ij} \lambda_i M_{ij} \lambda_j\big]\), then we will have −Mij = [∂²𝒢/∂λi∂λj]|λ=0, leading to

\[ \langle q_i q_j \rangle = M_{ij} \tag{2.39} \]

[Footnote 11: Usually one uses the Laplace transform for greater convergence, but with a suitable contour in the complex plane one can also work with Fourier transforms; in any case, we are only doing some formal manipulations here.]
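A small numerical check (mine) of Eqs. (2.38)-(2.39) for a two-variable Gaussian generating function: central finite differences of 𝒢 at λ = 0 reproduce −M; the matrix M below is an arbitrary symmetric positive-definite choice.

```python
import numpy as np

M = np.array([[2.0, 0.5],
              [0.5, 1.0]])           # arbitrary symmetric positive-definite M

def G(lam):
    """Gaussian generating function exp(-1/2 lambda^T M lambda)."""
    return np.exp(-0.5 * lam @ M @ lam)

h = 1e-4
second = np.empty((2, 2))
for i in range(2):
    for j in range(2):
        ei, ej = np.eye(2)[i], np.eye(2)[j]
        # central difference for d^2 G / d lambda_i d lambda_j at lambda = 0
        second[i, j] = (G(h*ei + h*ej) - G(h*ei - h*ej)
                        - G(-h*ei + h*ej) + G(-h*ei - h*ej)) / (4 * h**2)

print(second)                        # close to -M, i.e. <q_i q_j> = M_ij
```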
54
Exercise 2.4: Prove Eq. (2.40) by diagonalizing the matrix M .
¹² The expression P(0) ∝ (det M)^{−1/2} actually contains interesting physics which we will soon come back to.
Chapter 2. Disturbing the Vacuum
This is completely analogous to what we have in Eq. (2.36), suggesting that we can think of ⟨0+|0−⟩^J as the analogue of a generating function and look for the corresponding probability distribution P(qᵢ). Since Eq. (2.36) tells us that the generating function leads to the mean value of φ(x)φ(y), it makes sense to think of φ(x) as the analogue of the qᵢ. So we should compute the Fourier transform of ⟨0+|0−⟩^J with respect to J by introducing a conjugate variable φ_x which is the discretised version of some field φ(x) in spacetime.
For evaluating this Fourier transform we need a preliminary result. In the case of a finite N-dimensional matrix we know that

$$\mathcal{P}(\phi) \equiv \int d^D J\,\exp\left[-\frac{1}{2}J^T M J - i\phi^T J\right] = (2\pi)^{D/2}(\det M)^{-1/2}\exp\left[-\frac{1}{2}\phi^T M^{-1}\phi\right] \equiv \mathcal{P}(0)\,e^{-\frac{1}{2}\phi^T M^{-1}\phi} \qquad (2.40)$$
where M^{−1} is the inverse of the matrix M. We are interested in the case of an infinite dimensional matrix where the indices are continuous variables. In that case we can think of the integration d^D J as a functional integral ∫DJ over the functions J(x). The prefactor (2π)^{D/2}(det M)^{−1/2} can become ill-defined for an infinite continuous matrix, but it is independent of φ; so, for the moment¹² we can just call it P(0). What remains is to give meaning to the expression

$$\phi^T M^{-1}\phi = \sum_{x,y}\phi_x (M^{-1})_{xy}\phi_y \Rightarrow \int d^4x\,d^4y\,\phi(x)D(x-y)\phi(y) \qquad (2.41)$$

so that we can write the final result of the functional Fourier transform as:

$$\mathcal{P}(\phi) = \mathcal{P}(0)\exp\left[-\frac{1}{2}(\phi^T M^{-1}\phi)\right] \Rightarrow \mathcal{P}(0)\exp\left[-\frac{1}{2}\int d^4x\,d^4y\,\phi(y)D(x-y)\phi(x)\right] \qquad (2.42)$$
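The finite-dimensional result in Eq. (2.40) — which Exercise 2.4 asks you to prove — can also be checked by brute-force numerical quadrature in D = 2; the matrix M and the vector φ below are arbitrary toy values.

```python
import numpy as np

# Arbitrary toy inputs: a 2x2 symmetric positive definite M and a vector phi.
M = np.array([[2.0, 0.5],
              [0.5, 1.0]])
phi = np.array([0.3, -0.7])

# Brute-force quadrature of  ∫ d^2J exp(-J^T M J/2 - i phi^T J)  on a grid.
g = np.linspace(-12.0, 12.0, 481)
dg = g[1] - g[0]
J1, J2 = np.meshgrid(g, g, indexing="ij")
integrand = np.exp(-0.5 * (M[0, 0]*J1**2 + 2*M[0, 1]*J1*J2 + M[1, 1]*J2**2)
                   - 1j * (phi[0]*J1 + phi[1]*J2))
numeric = integrand.sum() * dg * dg

# Closed form, Eq. (2.40): (2π)^{D/2} (det M)^{-1/2} exp(-phi^T M^{-1} phi / 2).
exact = (2*np.pi) / np.sqrt(np.linalg.det(M)) * \
        np.exp(-0.5 * phi @ np.linalg.inv(M) @ phi)
```

Despite the oscillatory phase, the Gaussian damping makes the grid sum converge rapidly, and the imaginary part cancels by symmetry.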
So, in the continuum limit, we need to find some operator D(x − y) which will play the role of the inverse matrix. This requires giving a meaning to the relation M^{−1}M = 1 for a matrix with continuous indices by the relation:

$$\sum_z (M^{-1})_{xz}M_{zy} = \delta_{xy} \Rightarrow \int dz\,D(x-z)M(z-y) = \delta(x-y) \qquad (2.43)$$

To find the form of D(x − y) from this integral relation, we express the left hand side in Fourier space, obtaining

$$\int \frac{d^4p}{(2\pi)^4}\,D(p)M(p)\,e^{-ip(x-y)} = \delta(x-y) \qquad (2.44)$$

which is satisfied if D(p) = 1/M(p). From Eq. (1.92), we know that M(p) = i(p² − m² + iε)^{−1} and hence D(p) = −i(p² − m² + iε). In position space, the expression for D(x − y) is given by the Fourier transform:

$$D(x-y) = -i\int\frac{d^4p}{(2\pi)^4}\,(p^2-m^2+i\epsilon)\,e^{-ip(x-y)} = i\int\frac{d^4p}{(2\pi)^4}\,(\Box_x+m^2-i\epsilon)\,e^{-ip(x-y)} = i(\Box_x+m^2-i\epsilon)\,\delta(x-y) \qquad (2.45)$$
2.2. From the Source to the Field
Obviously this is a somewhat singular operator, but it leads to meaningful results when substituted into Eq. (2.41). We get

$$\int d^4x\,d^4y\,\phi(x)D(x-y)\phi(y) = i\int d^4x\,d^4y\,\phi(x)(\Box_x+m^2-i\epsilon)\,\delta(x-y)\,\phi(y)$$
$$= i\int d^4x\,\phi(x)(\Box_x+m^2-i\epsilon)\int d^4y\,\delta(x-y)\,\phi(y) = i\int d^4x\,\phi(x)(\Box_x+m^2-i\epsilon)\,\phi(x) \qquad (2.46)$$

Substituting back into Eq. (2.42), we get the final result to be

$$\mathcal{P}(\phi) = \mathcal{P}(0)\exp\left[-\frac{i}{2}\int d^4x\,\phi(x)(\Box+m^2-i\epsilon)\phi(x)\right] \qquad (2.47)$$
This is the first example of functional integration and functional Fourier
transforms which we have come across and — as you can guess — many
more will follow. All of them require (i) some kind of discretisation followed
by (ii) using a result for a finite dimensional integral and finally (iii) taking a
limit to the continuum. Most of the time this will involve finding the inverse
of either an operator or a symmetric function in two variables acting as a
kernel to a quadratic form. The procedure we use in all these cases is the
same. The inverses of the operators are defined in momentum space as just
reciprocals (of algebraic expressions) and can be Fourier transformed to
give suitable results in real space. In this particular example, the relation
$$-i(p^2-m^2+i\epsilon)\cdot\frac{i}{(p^2-m^2+i\epsilon)} = 1 \qquad (2.48)$$

in momentum space translates into the result

$$i(\Box_x+m^2-i\epsilon)\,G(x;y) = \delta(x-y) \qquad (2.49)$$

in real space. One can, of course, directly verify in real space that Eq. (2.49) does hold.
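The logic of Eq. (2.43) and Eq. (2.44) — defining the inverse through reciprocals in momentum space — can be made very explicit on a periodic lattice. The sketch below works in the Euclidean sector with the toy kernel M(p) = (p² + m²)^{−1} (an assumption made purely for illustration, so that the iε contour structure is not needed) and verifies that the position-space kernels obtained by Fourier transforming D(p) = 1/M(p) and M(p) convolve to a Kronecker delta.

```python
import numpy as np

# Periodic 1-D Euclidean lattice; m^2 = 1 is a toy choice.
N, m2 = 64, 1.0
p = 2 * np.pi * np.fft.fftfreq(N)      # lattice momenta
Mp = 1.0 / (p**2 + m2)                 # "propagator" M(p)
Dp = p**2 + m2                         # its reciprocal D(p) = 1/M(p)

Mx = np.fft.ifft(Mp).real              # position-space kernels
Dx = np.fft.ifft(Dp).real

# Circular convolution  sum_z D(x-z) M(z-y)  via the convolution theorem:
conv = np.fft.ifft(np.fft.fft(Dx) * np.fft.fft(Mx)).real
# conv is the Kronecker delta: 1 at the origin, 0 elsewhere.
```

Because D(p)M(p) = 1 mode by mode, the convolution is exactly the discrete delta — the lattice version of Eq. (2.43).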
The term φ(x)□φ(x) = φ(x)[∂_a∂^aφ(x)] occurring in Eq. (2.47) can be expressed in a different form using the result

$$\int_V d^4x\,\phi\,\Box\phi = \int_V d^4x\,\partial_a(\phi\,\partial^a\phi) - \int_V d^4x\,\partial_a\phi\,\partial^a\phi = \int_{\partial V} d^3x\,n_a\,\phi\,\partial^a\phi - \int_V d^4x\,\partial_a\phi\,\partial^a\phi \qquad (2.50)$$
where the integration is over a 4-volume V with a boundary ∂V which has a normal n_a. If we now assume that, for all the field configurations we are interested in, we can ignore the boundary term,¹³ then one can replace (∂_aφ∂^aφ) by (−φ□φ). So, an equivalent form for P(φ) is given by

$$\mathcal{P}(\phi) = \mathcal{P}(0)\exp\left[\frac{i}{2}\int d^4x\left(\partial_a\phi\,\partial^a\phi - (m^2-i\epsilon)\phi^2\right)\right] \equiv \mathcal{P}(0)\,e^{iA[\phi]} \qquad (2.51)$$

where A is given by

$$A = \frac{1}{2}\int d^4x\left(\partial_a\phi\,\partial^a\phi - m^2\phi^2\right) \qquad (2.52)$$

Exercise 2.5: Prove this result using the expression in Eq. (1.128) for G(x; y).
¹³ The cavalier attitude towards boundary terms is a disease you might contract if not immunized early on in your career. Most of the time you can get away with it; but not always, so it is good to be cautious. In this case, the equivalence (in momentum space) arises from two ways of treating p²|φ(p)|²: either as φ*(p)p²φ(p), which gives −φ(x)□φ(x), or as [pφ(p)][pφ(p)]*, leading to [−i∂_aφ][i∂^aφ] = ∂_aφ∂^aφ.
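The integration by parts in Eq. (2.50) has an exact lattice analogue in which the boundary term is eliminated by periodic boundary conditions; the sample field below is arbitrary. The identity Σ φ□φ = −Σ (∂φ)² then holds exactly.

```python
import numpy as np

# Arbitrary periodic sample field on a 1-D lattice (spacing set to 1).
N = 128
x = np.arange(N)
phi = np.sin(2*np.pi*3*x/N) + 0.5*np.cos(2*np.pi*5*x/N)

dphi = np.roll(phi, -1) - phi                        # forward difference
box = np.roll(phi, -1) - 2*phi + np.roll(phi, 1)     # lattice Laplacian

lhs = np.sum(phi * box)        # discrete  ∫ φ □φ
rhs = -np.sum(dphi * dphi)     # discrete -∫ ∂φ ∂φ (no boundary term: periodic)
```

On a periodic lattice the "surface" contribution cancels identically, which is the discrete counterpart of dropping the ∂V term.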
which is essentially the action functional for the field φ(x) we constructed from the oscillators — see Eq. (1.146) and note that we are now dealing with a real φ(x). (It is understood that m² should be replaced by (m² − iε) when required; we will not display it explicitly all the time.) Our entire analysis can therefore be summarized by the equation:

$$\int\mathcal{D}J\,\langle 0_+|0_-\rangle^J\exp\left[-i\int J(x)\phi(x)\,d^4x\right] = \mathcal{P}(0)\,e^{iA[\phi]} \qquad (2.53)$$
¹⁴ This analogy gets better in the Euclidean space. The factor e^{−A_E} does make sense as a probability for a field configuration in many contexts.

¹⁵ The P(0) stands for P[φ = 0] while Z(0) is Z[J = 0]. But since both are just constants, this notation is unlikely to create any confusion.
This relation tells you that the (functional) Fourier transform of the vacuum persistence amplitude leads to the action functional for the field. If we think of ⟨0+|0−⟩^J as a generating function, then e^{iA[φ]} is the corresponding probability amplitude distribution.¹⁴

Clearly all the information about our propagation amplitude G(x₂; x₁) is contained in A and, in fact, we can extract it fairly easily. This is because A can be thought of as an action functional for a real scalar field which — in turn — can be decomposed, in Fourier space, into a bunch of harmonic oscillators. Then the action decomposes into an (infinite) sum of actions for individual oscillators as in Eq. (1.145). The dynamics of the system (either classical or quantum) is equivalent to that of an infinite number of harmonic oscillators, each labeled by a vector p and frequency ω_p. To study the quantum theory, one can introduce the standard creation and annihilation operators for each oscillator. It is now obvious that we will be led to exactly the same field operator which we introduced earlier in Eq. (1.129). The ground state will now be the ground state of all the oscillators and the amplitude G(x₂; x₁) can again be constructed as the ground state expectation value of the time ordered product as done in Eq. (1.128).
In this (more conventional) approach, one will be interested in obtaining ⟨0+|0−⟩^J starting from a theory specified by a given action functional. This is given by the inverse Fourier transform of Eq. (2.53):

$$[\mathcal{P}(0)]^{-1}\langle 0_+|0_-\rangle^J = \int\mathcal{D}\phi\,\exp\left[iA[\phi]+i\int J(x)\phi(x)\,d^4x\right] \equiv Z[J] \qquad (2.54)$$

where the last equality defines Z[J]. Given the fact that ⟨0+|0−⟩^{J=0} = 1, we also have the relation¹⁵

$$[\mathcal{P}(0)]^{-1} = Z(0) = \int\mathcal{D}\phi\,e^{iA[\phi]} \qquad (2.55)$$

which essentially gives (det M)^{−1/2} as a functional integral. (We will come back to this term soon.) From the explicit expression of ⟨0+|0−⟩^J given by Eq. (2.33), it is also obvious that

$$Z(J) = Z(0)\exp\left[-\frac{1}{2}\int d^4x_2\,d^4x_1\,J(x_2)G(x_2;x_1)J(x_1)\right] \qquad (2.56)$$
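In a one-dimensional caricature of this relation — a single mode φ, with a positive number d standing in for the operator and Euclidean-style convergence assumed — the ratio Z(J)/Z(0) = exp(−J²/2d) can be verified by direct quadrature; the values of d and J below are arbitrary.

```python
import numpy as np

d, J = 1.7, 0.9                        # arbitrary toy "operator" and source
phi = np.linspace(-30.0, 30.0, 6001)
w = phi[1] - phi[0]

ZJ = np.sum(np.exp(-0.5*d*phi**2 + 1j*J*phi)) * w   # Z(J), single mode
Z0 = np.sum(np.exp(-0.5*d*phi**2)) * w              # Z(0)
ratio = ZJ / Z0

exact = np.exp(-J**2 / (2*d))          # completing the square: exp(-J d^{-1} J / 2)
```

This is the one-mode version of the Gaussian structure in Eq. (2.56): the source couples linearly, and completing the square produces exp(−½ J × (inverse operator) × J).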
This result can be expressed in a slightly different form which makes clear some of the previous operations related to functional integrals. In Eq. (2.54) — which expresses the vacuum persistence amplitude in the presence of a source J in terms of a functional Fourier transform — we will ignore a surface term in A and write exp(iA) in the form

$$e^{iA} = \exp\left[-\frac{i}{2}\int d^4x\,\phi(x)(\Box+m^2-i\epsilon)\phi(x)\right] = \exp\left[-\frac{1}{2}\int d^4x\,\phi(x)D\phi(x)\right] \qquad (2.57)$$
where D is the operator D ≡ i(□ + m² − iε). We also know from Eq. (2.49) that G(x₂; x₁) can be thought of as a coordinate representation of the inverse D^{−1} of this operator. Comparing these results, we see that Eq. (2.56) can be expressed¹⁶ in the form

$$Z(J) = \int\mathcal{D}\phi\,\exp\left[-\frac{1}{2}\int d^4x\,\phi(x)D\phi(x) + i\int J(x)\phi(x)\,d^4x\right] = Z(0)\exp\left[-\frac{1}{2}\int d^4x_2\,d^4x_1\,J(x_2)D^{-1}J(x_1)\right] \qquad (2.58)$$
with D^{−1} → G(x₂; x₁) in the coordinate representation. In our approach, we first obtained the right hand side of this equation from physical considerations and then related it to the left hand side by a functional Fourier transform, thereby obtaining Eq. (2.54). Alternatively, we could have started with the expression for A, calculated the functional Fourier transform of exp iA on the left hand side and thus obtained the right hand side — which is what the usual field theory textbooks do.

In such an approach, we can obtain an explicit expression for G(x₂; x₁) in terms of the functional derivatives¹⁷ of Z(J), which generalizes the derivatives with respect to J_x etc. which we started out with in the discretised version in Eq. (2.35) and Eq. (2.36). On calculating the functional derivatives of the right hand side of Eq. (2.58) we have the result:
$$\left(i\frac{\delta}{\delta J(x_2)}\right)\left(i\frac{\delta}{\delta J(x_1)}\right)\left.\frac{1}{Z(0)}Z[J]\right|_{J=0} = G(x_2;x_1) \qquad (2.59)$$
which is in direct analogy with Eq. (2.36). On the other hand, we can also explicitly evaluate the functional derivatives on the left hand side of Eq. (2.58), bringing down two factors of φ. Further, we know that the G(x₂; x₁) arising from the right hand side is expressible as the vacuum expectation value of the time ordered product (see Eq. (1.128)) of the scalar fields. This leads to the relation:

$$\left(i\frac{\delta}{\delta J(x_2)}\right)\left(i\frac{\delta}{\delta J(x_1)}\right)\left.\frac{1}{Z(0)}Z[J]\right|_{J=0} = \frac{1}{Z(0)}\int\mathcal{D}\phi\,\phi(x_2)\phi(x_1)\exp\left[-\frac{1}{2}\int d^4x\,\phi(x)D\phi(x)\right] = \langle 0|T[\phi(x_2)\phi(x_1)]|0\rangle \qquad (2.60)$$
This relation will play a key role in our future discussions.¹⁸ The fact that you reproduce a ground state expectation value in Eq. (2.60) shows that these path integrals are similar to the ones discussed in Sect. 1.2.3. We usually assume that the time integral in d⁴x is from −∞ to ∞ with t interpreted as t(1 − iε). Therefore the functional integrals automatically reproduce ground state expectation values.
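The last statement can be checked in a discretised Euclidean caricature: if field configurations are sampled with weight exp(−½ φᵀDφ), the average of φᵢφ_j reproduces (D⁻¹)_{ij}, in the spirit of Eq. (2.60). The 3×3 matrix D below is an arbitrary positive definite toy.

```python
import numpy as np

# Arbitrary 3x3 positive definite "discretised operator" D.
D = np.array([[ 2.0, -0.8,  0.0],
              [-0.8,  2.0, -0.8],
              [ 0.0, -0.8,  2.0]])
cov = np.linalg.inv(D)                  # expected correlator (D^{-1})_ij

# Sampling with weight exp(-phi^T D phi / 2) is Gaussian with covariance D^{-1}.
rng = np.random.default_rng(0)
samples = rng.multivariate_normal(np.zeros(3), cov, size=200_000)
corr = samples.T @ samples / len(samples)   # Monte Carlo <phi_i phi_j>
```

The Monte Carlo average converges (slowly, like 1/√N) to the inverse of the operator appearing in the quadratic form — the finite-dimensional shadow of "the propagator is the inverse of D".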
2.2.2 Functional Integral Determinant: A First Look at Infinity
It is now time to look closely at the expression for Z(0) which we promised we will come back to. From the discussion around Eq. (2.40), we see that

$$\mathcal{P}(0) \propto (\det M)^{-1/2} \propto (\det D)^{1/2} = Z[0]^{-1} \Rightarrow Z(0) \propto (\det D)^{-1/2} \qquad (2.61)$$
¹⁶ Mnemonic: We have −(1/2)Dφ² + iJφ = −(1/2)D[(φ − iJ/D)² + J²/D²] on completing the square. The Gaussian integral then leads to exp[−(1/2)D^{−1}J²].
¹⁷ Mathematical supplement Sect. 2.3.1 describes the basic mathematics behind functional differentiation and other related operations. This is a good time to read it up if you are in unfamiliar territory. Fortunately, if you do the most obvious thing, you will usually be right as far as functional operations are concerned!
¹⁸ Note that we need a time ordering operator in the expectation value ⟨0|T[φ(x₂)φ(x₁)]|0⟩ expressed in the Heisenberg picture, while we don't need to include it explicitly in the path integral expression. The reason should be clear from the way we define the path integral — by time slicing, as discussed in Sect. 1.6.1. Depending on whether x₁⁰ > x₂⁰ or the other way around, the time slicing will put the two operators in the time ordered manner when we evaluate the path integral. Make sure you understand how this happens, because the time ordering — which is vital in our discussion — is hidden in the path integral.
where we have used Eq. (2.55) and the fact that D and M are inverses of each other. This result expresses a relation which will also play a crucial role in our later discussions:

$$Z(0) = \int\mathcal{D}\phi\,\exp\left[-\frac{1}{2}\int d^4x\,\phi(x)D\phi(x)\right] \propto (\det D)^{-1/2} \qquad (2.62)$$
¹⁹ If you think of exp(−A_E[φ]) as analogous to a probability, the path integral obtained by summing over all φ is analogous to the partition function in statistical mechanics. (In fact, the notation Z is a reminder of this.) We know that the physically useful quantity is the free energy F, related to the partition function by Z = exp(−βF). The effective action A_eff bears the same relation to our Z as free energy does to the partition function in statistical mechanics.
It should be obvious from our discussions leading up to Eq. (2.40) that this
result is independent of the nature of the operator D. In later chapters we
will have to evaluate this for different kinds of operators and we will now
develop a few tricks for evaluating the determinant of an infinite matrix or
operator.
Let us work in the Euclidean sector to define the path integral in Eq. (2.62). It is then physically meaningful to write Z(0) as exp(−A_eff) where A_eff is called the effective action.¹⁹ In many situations — like the present one — it is possible to write A_eff as a spacetime integral over an effective Lagrangian L_eff. In this particular case one can provide a direct interpretation of L_eff as vacuum energy density along the following lines: When the Euclidean time integrals go from −∞ to +∞ we know that (see Sect. 1.2.3) the path integral will essentially give exp(−T E₀) where T is the formally infinite time interval and E₀ is the ground state energy. If the energy density in space is H, then we would expect the exponential factor to be exp(−T V H) where V is the volume of the space. In other words we expect the result

$$Z(0) \propto \exp\left[-\int d^4x_E\,\mathcal{H}\right] = \exp\left[-\int d^4x_E\,L^E_{\rm eff}\right] \qquad (2.63)$$

where L^E_eff is the effective Lagrangian in the Euclidean sector and H should be some kind of energy density; the last equality arises from the fact that, in the Euclidean sector, H = L^E_eff in the present context. If we analytically continue to the Lorentzian spacetime by t_E = it, d⁴x_E = i d⁴x, this relation becomes:

$$Z(0) \propto \exp\left[-i\int d^4x\,\mathcal{H}\right] = \exp\left[-i\int d^4x\,L^E_{\rm eff}\right] \equiv \exp\left[i\int d^4x\,L^M_{\rm eff}\right] \qquad (2.64)$$
(2.64)
is
the
effective
Lagrangian
in
the
Lorentzian
(‘Minkowskian’)
where LM
eff
spacetime. The first proportionality clearly shows that H leads to the
standard energy dependent phase while the last equality defined the efM
fective Lagrangian in the Lorentzian spacetime. From H = LE
eff = −Leff
we see that the effective Lagrangians in Lorentzian and Euclidean sectors
differ by a sign.
Let us work this out explicitly and see what we get. We will work in the Euclidean sector and drop the superscript E from L^E_eff for simplicity of notation. We begin by taking the logarithm of Eq. (2.62) and ignoring any unimportant additive constant:

$$\ln Z[0] = -\int d^4x_E\,L_{\rm eff} = \ln(\det D)^{-1/2} = -\frac{1}{2}\,{\rm Tr}\,\ln D \qquad (2.65)$$

Exercise 2.6: Prove this.
where we have used the standard relation ln det D = Tr ln D. There are
several tricks available to evaluate the logarithm and the trace, which we
will now describe.
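The relation ln det D = Tr ln D invoked in Eq. (2.65) is easy to verify for a finite matrix — essentially Exercise 2.6 done numerically, here for a randomly generated symmetric positive definite matrix.

```python
import numpy as np

# Random symmetric positive definite "operator" D (finite toy).
rng = np.random.default_rng(1)
A = rng.standard_normal((6, 6))
D = A @ A.T + 6.0 * np.eye(6)

tr_ln = np.sum(np.log(np.linalg.eigvalsh(D)))   # Tr ln D in the eigenbasis
ln_det = np.log(np.linalg.det(D))               # ln det D
```

Both sides are computed in the eigenbasis, where ln D is diagonal with entries ln λᵢ — the same diagonalization trick asked for in Exercise 2.4.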
The simplest procedure is to introduce a (fictitious) Hilbert space with state vectors |x⟩ on which D can operate, and write:

$$-\frac{1}{2}\,{\rm Tr}\,\ln D = -\frac{1}{2}\int d^4x_E\,\langle x|\ln D|x\rangle = -\int d^4x_E\,L_{\rm eff} \qquad (2.66)$$

so that L_eff = (1/2)⟨x| ln D |x⟩. Going over to the momentum representation in which D is diagonal, and working in n spatial dimensions for future convenience, this result becomes:

$$L_{\rm eff} = \frac{1}{2}\langle x|\ln D|x\rangle = \frac{1}{2}\int\frac{d^{n+1}p}{(2\pi)^{n+1}}\,\langle x|p\rangle\ln D(p)\langle p|x\rangle = \frac{1}{2}\int\frac{d^{n+1}p}{(2\pi)^{n+1}}\,\ln D(p) \qquad (2.67)$$
In our case, the form of the operator D in the Euclidean space is

$$D = (-\partial_a\partial^a + m^2)_E = -\left[\frac{d^2}{dt_E^2} + \nabla^2\right] + m^2 = -\Box_E + m^2 \qquad (2.68)$$

so that D(p) = p² + m², giving L_eff to be:

$$L_{\rm eff} = \frac{1}{2}\int\frac{d^{n+1}p}{(2\pi)^{n+1}}\,\ln(p^2+m^2) \qquad (2.69)$$
The expression is obviously divergent²⁰ and we will spend considerable effort in later chapters to interpret such divergent expressions. In the present case, we can transform this (divergent) expression to another (divergent) expression which is more transparent and reveals the fact that L_eff is indeed an energy density.

For this purpose, we first differentiate L_eff with respect to m² and then separate d^{n+1}p into an n-dimensional integral d^n p (which will correspond to real space) and an integration over dp⁰ (which is the analytic continuation of the zeroth coordinate giving the Euclidean sector from the Lorentzian sector), and perform the dp⁰ integration. This leads to the result (with our standard notation ω_p² = p² + m²):
$$\frac{\partial L_{\rm eff}}{\partial m^2} = \frac{1}{2}\int\frac{d^{n+1}p}{(2\pi)^{n+1}}\,\frac{1}{(p^2+m^2)} = \frac{1}{2}\int\frac{d^np}{(2\pi)^n}\int_{-\infty}^{\infty}\frac{dp^0}{2\pi}\,\frac{1}{((p^0)^2+\omega_p^2)} = \frac{1}{2}\int\frac{d^np}{(2\pi)^n}\,\frac{1}{2\omega_p} \qquad (2.70)$$

²⁰ ... not to mention the fact that we are taking the logarithm of a dimensionful quantity. It is assumed that finally we need only interpret the difference between two L_eff's, say, with two different masses, which will take care of these issues.
which is gratifyingly Lorentz invariant, though divergent.²¹ If we now integrate both sides with respect to m² between the limits m₁² and m₂², we will get the difference between the L_eff for two theories with masses m₁ and m₂. We will instead perform one more illegal operation and just write the formal indefinite integral of both sides with respect to m². Then we find that L_eff has the nice, expected form

$$L_{\rm eff} = \frac{1}{2}\int\frac{d^np}{(2\pi)^n}\,\sqrt{p^2+m^2} = \int\frac{d^np}{(2\pi)^n}\,\frac{\omega_p}{2} \qquad (2.71)$$
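The dp⁰ integration that takes us from the first to the last expression in Eq. (2.70) is elementary; as a sanity check, ∫ dp⁰/2π [(p⁰)² + ω²]⁻¹ = 1/(2ω) can be confirmed numerically (the value of ω below is arbitrary).

```python
import numpy as np
from scipy.integrate import quad

omega = 1.3   # arbitrary toy frequency
val, _ = quad(lambda p0: 1.0 / (2*np.pi * (p0**2 + omega**2)),
              -np.inf, np.inf)
exact = 1.0 / (2*omega)
```

Integrating this 1/(2ω) back with respect to m² is precisely what produces the ω_p/2 zero-point energy density in Eq. (2.71).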
The Z(0) factor is now related to an energy density which is the sum of the zero point energies of an infinite number of harmonic oscillators, each having

²¹ This is the same expression as Eq. (1.105); the right hand side is just (1/2)∫dΩ_p — the Lorentz invariant, infinite, on-shell volume in momentum space.
a frequency ω_k. Once again we see the fields and their associated harmonic oscillators making their presence felt in our mathematical expressions — this time through a functional determinant.

In more complicated situations, the operator D can depend on other fields or external sources. In that case the ground state energy will be modified compared to the expression obtained above and the difference in the energy can have observable consequences. In the case discussed above, L_eff is real, and hence — when we analytically continue to the Lorentzian sector — it will contribute a pure phase to ⟨0+|0−⟩ of the form:

$$Z(0) \propto \exp\left[-i\int d^4x\,\mathcal{H}\right];\qquad \mathcal{H} = L_{\rm eff} = \int\frac{d^3k}{(2\pi)^3}\,\frac{\omega_k}{2} \qquad (2.72)$$
²² There is a singularity at s = 0 in this integral; in all practical calculations we will use this expression to evaluate ln(F/F₀) for some F₀, when the expression becomes well defined. As we said before, it is also civil to keep arguments of logarithms dimensionless.

²³ This is the same analysis as the one we did to arrive at Eq. (1.104), which provides the physical meaning of the effective Lagrangian. In the conventional descriptions, the (1/s) factor in the effective Lagrangian, Eq. (2.73), appears to be a bit of a mystery. But we saw that if we first define an amplitude P(x; E) for the closed loop to have energy E and then integrate over all E, the (1/s) factor arises quite naturally. This is the proper interpretation of L_eff in terms of closed loops of virtual particles. More importantly, now we know the oscillators behind G(x; x).

²⁴ It is amusing that the functional integral over the field configurations is related to the coincidence limit of a path integral involving particle trajectories. The coincidence limit of the latter path integral gives the propagation amplitude for the (fictitious?) particle to start from an event xᵢ and end up at the same event xᵢ after the lapse of proper time s, corresponding to a closed curve in spacetime.
which is similar to the situation we encountered in Sect. 2.1.2 (see Eq. (2.32)). In some other contexts L_eff can pick up an imaginary part, which will lead to the vacuum persistence probability becoming less than unity; this will be interpreted as the production of particles by other fields or external sources.

Most of the above discussion can be repeated for such a more general D, which we will encounter in later chapters. With these applications in mind, we will now derive some other useful formulas for the effective Lagrangian and discuss their interpretation.
One standard procedure is to use an integral representation²² for ln D and write:

$$L_{\rm eff} = \frac{1}{2}\langle x|\ln D|x\rangle = -\frac{1}{2}\int_0^\infty\frac{ds}{s}\,\langle x|\exp(-sD)|x\rangle \equiv -\frac{1}{2}\int_0^\infty\frac{ds}{s}\,K(x,x;s) \qquad (2.73)$$

where K(x, x; s) is the coincidence limit of the propagation amplitude for a (fictitious) particle from y^i to x^i under the action of a (fictitious) "Hamiltonian" D. That is,

$$K(x,y;s) = \langle x|e^{-sD}|y\rangle \qquad (2.74)$$
Thus our prescription for computing Z(0) is as follows. We first consider a quantum mechanical problem for a quantum particle under the influence of a Hamiltonian D and evaluate the path integral propagator K(x, y; s) for this particle. (The coordinates x_E^i describe a 4-dimensional Euclidean space with s denoting time. So this is quantum mechanics in four dimensional space.) We then integrate²³ the coincidence limit of this propagator K(x, x; s) over (ds/s) to find L_eff which, in principle, can have a surviving dependence on x^i. The functional determinant is then given by the expression in Eq. (2.66). We thus have the result

$$Z(0) = \int\mathcal{D}\phi\,\exp(-A_E(\phi)) = \int\mathcal{D}\phi\,\exp\left[-\frac{1}{2}\int d^4x_E\,\phi D\phi\right] = \exp\left[-\int d^4x_E\,L_{\rm eff}\right] = \exp\left[-\int d^4x_E\left(-\frac{1}{2}\int_0^\infty\frac{ds}{s}\,\langle x|e^{-sD}|x\rangle\right)\right] \qquad (2.75)$$

We will use this result extensively in the next chapter.²⁴
If we are working in the Lorentzian ('Minkowskian') sector, Eq. (2.62) still holds, but we define L^M_eff with an extra i-factor as

$$\int\mathcal{D}\phi\,\exp\left[i\int\left(-\frac{1}{2}\phi D\phi\right)d^4x\right] \propto (\det D)^{-1/2} \equiv \exp\left[i\int d^4x\,L^M_{\rm eff}\right] \qquad (2.76)$$

with

$$L^M_{\rm eff} = \frac{i}{2}\langle x|\ln D|x\rangle = -\frac{i}{2}\int_0^\infty\frac{ds}{s}\,\langle x|e^{-isD}|x\rangle \qquad (2.77)$$
In this form,²⁵ it is obvious that D acts as an effective Schrodinger Hamiltonian. (In the specific case of a massive scalar field, the operator in the Euclidean sector is given by Eq. (2.68), which is a Schrodinger Hamiltonian for a quantum mechanical particle of mass (1/2) in 4 space dimensions in a constant potential (−(1/2)m²). With future applications in mind, we will switch to (n + 1) Euclidean dimensions.) The coincidence limit of the propagation amplitude for this quantum mechanical particle can be immediately written down and substituted into Eq. (2.73), and L_eff can be evaluated.

We also often need a general approach that allows handling the divergences in the integrals defining L_eff in a more systematic manner. This is obtained by again introducing a complete set of momentum eigenstates in the d ≡ (1 + n) Euclidean space and interpreting the action of −□ in momentum space as leading to the +p² factor. Then, we get:

$$L_{\rm eff} = -\frac{1}{2}\int_0^\infty\frac{ds}{s}\,\langle x|e^{-s(-\Box+m^2)}|x\rangle = -\frac{1}{2}\int_0^\infty\frac{ds}{s}\int\frac{d^dp}{(2\pi)^d}\,e^{-s(p^2+m^2)} \qquad (2.78)$$
The p-integration can be easily performed, leading to the result

$$L_{\rm eff} = -\frac{1}{2}\int_0^\infty\frac{ds}{s\,(4\pi s)^{d/2}}\,e^{-m^2 s} = -\frac{1}{2}\int_0^\infty\frac{d\lambda}{\lambda\,(2\pi\lambda)^{d/2}}\,e^{-\frac{1}{2}m^2\lambda} \qquad (2.79)$$

where λ = 2s. (On the other hand, if we had done the s integral first, interpreting the result as the logarithm, we would have got the result we used earlier in Eq. (2.69).) This integral is divergent for d > 0 at the lower limit and needs to be 'regularized' by some prescription.²⁶
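The p-integration behind Eq. (2.79) is a Gaussian integral, ∫ dᵈp/(2π)ᵈ e^{−s(p²+m²)} = e^{−m²s}/(4πs)^{d/2}. A quick numerical check in d = 2, using the radial form of the integral (s and m² below are arbitrary toy values):

```python
import numpy as np
from scipy.integrate import quad

s, m2 = 0.7, 1.0       # arbitrary toy values
# In d = 2:  ∫ d^2p/(2π)^2 e^{-s p^2} = (1/2π) ∫_0^∞ dp p e^{-s p^2}
radial, _ = quad(lambda p: p * np.exp(-s*p*p) / (2*np.pi), 0, np.inf)
numeric = np.exp(-s*m2) * radial
exact = np.exp(-m2*s) / (4*np.pi*s)    # = e^{-m^2 s}/(4πs)^{d/2} at d = 2
```

The s → 0 end of the subsequent ds/s integration is where this well-behaved Gaussian factor turns into the UV divergence discussed next.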
The simplest procedure will be to introduce a cut-off in the integral at the lower limit and study how the integral behaves as the cut-off tends to zero, in order to isolate the nature of the divergences. For example, the integral for L_eff in n = 3, d = 4 dimensions is given by

$$L_{\rm eff} = -\frac{1}{8\pi^2}\int_0^\infty\frac{d\lambda}{\lambda^3}\,e^{-\frac{1}{2}m^2\lambda} \qquad (2.80)$$
This integral is quadratically divergent near λ = 0. If we introduce a cut-off at λ = (1/M²), where M is a large mass (energy) scale,²⁷ then the integral is rendered finite and can be evaluated by a standard integration by parts. We then get:

$$L_{\rm eff} = -\frac{1}{8\pi^2}\int_{M^{-2}}^\infty\frac{d\lambda}{\lambda^3}\,e^{-\frac{1}{2}m^2\lambda} = -\frac{1}{16\pi^2}\left[M^4 - \frac{m^2M^2}{2}\right] + \frac{1}{64\pi^2}\,m^4\ln\frac{m^2}{2M^2} + \frac{1}{64\pi^2}\,\gamma_E\,m^4 \qquad (2.81)$$
where γ_E is Euler's constant, defined by

$$\gamma_E \equiv -\int_0^\infty e^{-x}\ln x\,dx \qquad (2.82)$$
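The integral representation in Eq. (2.82) indeed gives the familiar value γ_E ≈ 0.5772, as a quick quadrature shows (the split at x = 1 is just a numerical convenience for the logarithmic endpoint):

```python
import numpy as np
from scipy.integrate import quad

f = lambda x: -np.exp(-x) * np.log(x)
val = quad(f, 0.0, 1.0)[0] + quad(f, 1.0, np.inf)[0]
# val is Euler's constant, approximately 0.5772156649
```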
To separate out the terms which diverge when M → ∞ from the finite terms, it is useful to introduce another arbitrary but finite energy scale

²⁵ Note that ⟨x, s|y, 0⟩ ≡ ⟨x|e^{−isD}|y⟩ in the Lorentzian spacetime becomes i times the corresponding Euclidean expression ⟨x, s|y, 0⟩_E. That is, ⟨x, s|y, 0⟩ = i⟨x_E, s_E|y_E, 0⟩ if we analytically continue by t = −it_E, s = −is_E. This is clear from, e.g., Eq. (1.71), which displays an explicit extra i factor. Hence L^M_eff ≡ (i/2)⟨x| ln D |x⟩ = −(1/2)⟨x_E| ln D_E |x_E⟩ = −L^E_eff, as it should.
²⁶ In the specific context of a free field with a constant m², there is not much point in worrying about this expression. But we will see later on that we will get the same integral when we study, e.g., the effective potential for a self-interacting scalar field. There, it is vital to know the m² dependence of the integral. This is why we are discussing it at some length here.

²⁷ Note that the λ → 0 limit corresponds to short distances and high energies. Such a divergence is called a UV divergence.
μ and write m/M = (m/μ)(μ/M) in the logarithmic term in Eq. (2.81). Then we can write it as:

$$L_{\rm eff} = -\frac{1}{16\pi^2}\left[M^4 - \frac{m^2M^2}{2} + \frac{1}{4}m^4\ln\frac{M^2}{\mu^2}\right] + \frac{1}{64\pi^2}\,m^4\ln\frac{m^2}{2\mu^2} + \frac{1}{64\pi^2}\,\gamma_E\,m^4 \qquad (2.83)$$
²⁸ This procedure is called dimensional regularization and is usually more useful in computation, though its physical meaning is more abstract compared to that of evaluating the expression with a cut-off. You never know what you are subtracting out when you use dimensional regularization!

²⁹ You need to use the fact that Γ(ε) ≈ ε^{−1} − γ_E + O(ε) in this limit. Expand both (m²/4π)^{d/2} and the (d/2)(d/2 − 1) factors correct to linear order in ε, using (m²/4π)^{2−ε} ≈ m⁴[1 − ε ln(m²/4π)] etc.
When we take the M → ∞ limit, the terms in the square bracket diverge while the remaining two terms — proportional to m⁴ and m⁴ ln m² — stay finite. Among the divergent terms, the first term, proportional to M⁴, can be ignored since it is just an infinite constant independent of m. So there are two divergences, one which is quadratic (scaling as M²m²) and the other which is logarithmic (scaling as m⁴ ln M²). In any given theory we need to interpret these divergences or eliminate them in a consistent manner before we can take the limit M → ∞. Even after we do so, the surviving, finite terms will depend on the arbitrary scale μ we introduced, and we need to ensure that no physical effect depends on μ. We shall see in later chapters how this can indeed be done in different contexts.
Another possibility²⁸ — which we will use extensively in the later chapters — is to evaluate the last integral in d ≡ n + 1 "dimensions", treating d as just a parameter, and analytically continue the result to d = 4. In d dimensions, we have:

$$L_{\rm eff} = -\frac{1}{2}\int_0^\infty\frac{d\lambda}{\lambda\,(2\pi\lambda)^{d/2}}\,e^{-\frac{1}{2}m^2\lambda} = -\frac{1}{2}\frac{\Gamma(-d/2)}{(4\pi)^{d/2}}\,(m^2)^{d/2} \qquad (2.84)$$

The last expression is an analytic continuation of the integral after a formal evaluation in terms of Γ-functions. As it stands, it clearly shows the divergent nature of the integral for d = 4, since Γ(−2) is divergent. In this case, one can isolate the divergences by writing 2ε ≡ 4 − d and taking the ε → 0, d → 4 limit. It is again convenient to introduce an arbitrary but finite energy scale μ and write (m²)^{d/2} = μ^d(m²/μ²)^{d/2}. Then the limit gives:²⁹
$$L_{\rm eff} = -\frac{1}{2}\frac{\Gamma(-d/2)}{(4\pi)^{d/2}}\,\mu^d\left(\frac{m^2}{\mu^2}\right)^{d/2} = -\frac{1}{2}\,\mu^d\,\frac{\Gamma\!\left(2-\frac{d}{2}\right)}{(4\pi)^{d/2}\,\frac{d}{2}\left(\frac{d}{2}-1\right)}\left(\frac{m^2}{\mu^2}\right)^{d/2} = -\frac{1}{4}\frac{m^4}{(4\pi)^2}\left[\frac{1}{\epsilon} + \frac{3}{2} - \gamma_E - \ln\frac{m^2}{4\pi\mu^2}\right] \qquad (2.85)$$
where γ_E is the Euler constant defined earlier in Eq. (2.82). We now need to take the limit ε → 0 in this expression. We see that the finite terms in Eq. (2.85) again scale as m⁴ and m⁴ ln(m²/4πμ²), which matches with the result obtained in Eq. (2.83). On the other hand, the divergence in Eq. (2.85) has an m⁴/ε structure, which is different from the quadratic and logarithmic divergences seen in Eq. (2.83). That is, the divergent terms in Eq. (2.83) had an m² as well as an m⁴ dependence, while the divergent term in Eq. (2.85) has only an m⁴ dependence. This is a general feature of dimensional regularization, about which we will comment further in a later section.
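The ε → 0 limit quoted in Eq. (2.85) can be tested numerically: at a small but finite ε, the exact Γ-function expression of Eq. (2.84) and the expansion agree up to O(ε) corrections. (We set μ = 1 and pick an arbitrary m²; the Γ(ε) expansion of footnote 29 is what is being exercised here.)

```python
import numpy as np
from scipy.special import gamma

m2, mu2 = 1.5, 1.0     # arbitrary m^2; μ set to 1 for simplicity
eps = 1e-3
d = 4 - 2*eps

# Exact analytic continuation, Eq. (2.84):
exact = -0.5 * gamma(-d/2) * m2**(d/2) / (4*np.pi)**(d/2)

# Small-ε expansion, Eq. (2.85):
approx = -(m2**2 / (4 * (4*np.pi)**2)) * \
         (1/eps + 1.5 - np.euler_gamma - np.log(m2/(4*np.pi*mu2)))
```

Shrinking ε further makes the agreement correspondingly better, since the neglected terms are O(ε).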
We thus have several equivalent ways of expressing L_eff, which can be summarized by the equation

$$L_{\rm eff} = \frac{1}{2}\int\frac{d^nk}{(2\pi)^n}\,\sqrt{k^2+m^2} = \frac{1}{2}\int\frac{d^{n+1}p}{(2\pi)^{n+1}}\,\ln(p^2+m^2) = -\frac{1}{2}\int_0^\infty\frac{d\lambda}{\lambda\,(2\pi\lambda)^{\frac{1}{2}(n+1)}}\,e^{-\frac{1}{2}m^2\lambda} = -\frac{1}{2}\frac{\Gamma(-d/2)}{(4\pi)^{d/2}}\,(m^2)^{d/2} \qquad (2.86)$$

where d ≡ n + 1. Each of these expressions is useful in different contexts and allows different interpretations, some of which we will explore in the later chapters.
2.2.3 Source to the Field: via Harmonic Oscillators

It is also obvious that we should be able to interpret the amplitude ⟨0+|0−⟩^J in Eq. (2.21) directly in terms of the harmonic oscillators introduced earlier. In particular, if these ideas have to be consistent, then we should be able to identify the no-particle states |0±⟩ with the ground state of our (so far hypothetical!) oscillators, and the one-particle state |1_p⟩ generated by the action of J(x) with the first excited state of the p-th oscillator. We will briefly describe, for the sake of completeness, how this can be done.
The key idea is to again write G(x₂; x₁) in terms of spatial Fourier components using Eq. (1.117). Doing this in Eq. (2.21) allows us to write it as

$$\langle 0_+|0_-\rangle^J = \exp\left[-\frac{1}{2}\int d^4x_2\,d^4x_1\,J(x_2)G(x_2;x_1)J(x_1)\right] = \prod_p \exp\left[-\frac{1}{2}\int_{-\infty}^{\infty}dt\int_{-\infty}^{\infty}dt'\,J_p^*(t)\,\frac{e^{-i\omega_p|t-t'|}}{2\omega_p}\,J_p(t')\right] \equiv \prod_p e^{-R_p} \qquad (2.87)$$
The amplitude splits up into a product of amplitudes — one for each Fourier mode J_p. What is more, the individual amplitude e^{−R_p} arises in a completely different context in usual quantum mechanics. To see this, let us again consider an oscillator with frequency ω_p and let a time-dependent external force J_p(t) act on it by adding an interaction term³⁰

$$A_I^{(p)} = \frac{1}{2}\int dt\left[J_p(t)q_p^*(t) + J_p^*(t)q_p(t)\right] \qquad (2.88)$$
to the individual term in the action given by Eq. (1.145). The effect of such an external, time-dependent force is primarily to cause transitions between the levels of the oscillator. In particular, if we start the oscillator in the ground state |0, −∞⟩ at very early times (t → −∞, when we assume that J_p(t) is absent), then the oscillator might end up in an excited state at late times (t → +∞, when again we assume the external force has ceased to act). The amplitude that the oscillator is found in the ground state |0, +∞⟩ at late times can be worked out using usual quantum mechanics. It is given by exactly the same expression we found above. That is,

$$\langle 0,+\infty|0,-\infty\rangle = e^{-R_p} = \exp\left[-\frac{1}{2}\int_{-\infty}^{\infty}dt\int_{-\infty}^{\infty}dt'\,J_p^*(t)\,\frac{e^{-i\omega_p|t-t'|}}{2\omega_p}\,J_p(t')\right] \qquad (2.89)$$
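For a real source one can show that the double integral defining R_p in Eq. (2.89) has the simple real part Re R_p = |J̃_p(ω_p)|²/4ω_p, with J̃_p(ω) = ∫ dt J_p(t)e^{iωt}, so the persistence probability is e^{−2Re R_p} < 1. A numerical sketch with a hypothetical Gaussian pulse, for which |J̃(ω)|² = π e^{−ω²/2} is known exactly:

```python
import numpy as np

omega = 1.1                              # arbitrary oscillator frequency
t = np.linspace(-8.0, 8.0, 801)
w = t[1] - t[0]
J = np.exp(-t**2)                        # hypothetical real Gaussian pulse

T, Tp = np.meshgrid(t, t, indexing="ij")
kernel = np.exp(-1j*omega*np.abs(T - Tp)) / (2*omega)
R = 0.5 * (J @ kernel @ J) * w * w       # discretised double integral for R_p

# For this pulse |J~(ω)|² = π e^{-ω²/2}, so Re R = π e^{-ω²/2}/(4ω):
exact_re = np.pi * np.exp(-omega**2/2) / (4*omega)
persistence_prob = np.exp(-2*R.real)     # |<0,+∞|0,-∞>|², less than unity
```

The fact that R_p is controlled by the Fourier component of the source at the resonance frequency ω_p is exactly what one expects for transitions driven by the linear coupling in Eq. (2.88).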
³⁰ Usually q_p(t) will be real/Hermitian for an oscillator, and hence we take J_p(t) to be real/Hermitian as well. But we saw in Sect. 1.5 that the oscillators which are used to interpret G(x₂; x₁) are not Hermitian; hence we explicitly include two terms in the action A_I.
Exercise 2.7: Compute this probability amplitude for an oscillator in NRQM (which is most easily done using path integral techniques).
Further, for weak perturbations, it is easy to see from the form of the
linear coupling in Eq. (2.88) that the oscillator makes the transition to the
first excited state. These facts allow us to identify the state |0− with the
ground state of all oscillators at early times and |0+ with the corresponding
ground state of the oscillators at late times. Similarly we can identify the
state |1p with the first excited state of the p-th oscillator.
It is now clear that the action of source J(x) on the no-particle state can
be represented through an interaction term in Eq. (2.88) for each oscillator.
If we add up the interaction terms for all the oscillators we get the total
action — which can be expressed entirely in terms of spacetime variables
J(x) and φ(x) by an inverse Fourier transform. An elementary calculation
gives

$$A_I = \sum_p A_I^{(p)} \Rightarrow \int\frac{d^3p}{(2\pi)^3}\int dt\,{\rm Re}\left[J_p^*(t)q_p(t)\right] = \frac{1}{2}\int d^4x\left[J(x)\phi^\dagger(x) + J^\dagger(x)\phi(x)\right] \qquad (2.90)$$
So, once again, the physics can be described in terms of a field φ(x), now interacting with an externally specified c-number source.

³¹ This also means that the dimension of (δF/δφ) is not the dimension of (F/φ) — unlike in the case of partial derivatives — because of the d⁴x factor.
2.3 Mathematical Supplement

2.3.1 Aspects of Functional Calculus
A functional F (φ) is a mapping from the space of functions to real or complex numbers. That is, F (φ) is a rule which allows you to compute a real or
complex number for every function φ(x) which itself could be defined, say,
in 4-dimensional spacetime. The functional derivative (δF [φ]/δφ(x)) tells
you how much the value of the functional (i.e., the value of the computed
number) changes if you change φ(x) by a small amount at x. It can be
defined through the relation31

δF[φ] = ∫ d^4x (δF[φ]/δφ(x)) δφ(x)   (2.91)
Most of the time we will encounter functional derivatives when we vary an
integral involving the function φ and the above definition will be the most
useful one. One can also define the functional derivative exactly in analogy
with ordinary derivatives. If we change the function by δφ(x) = ε δ(x − y),
then the above equation tells you that

δF[φ] = F[φ + ε δ(x − y)] − F[φ] = ∫ d^4x (δF[φ]/δφ(x)) ε δ(x − y) = ε (δF[φ]/δφ(y))   (2.92)

This allows us to define the functional derivative as

δF[φ]/δφ(y) = lim_{ε→0} (1/ε) { F[φ + ε δ(x − y)] − F[φ] }   (2.93)

(To avoid ambiguities, we assume that the limit ε → 0 is taken before
any other possible limiting operation.) Note that the right hand side is
independent of x in spite of the appearance.
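The definition in Eq. (2.93) can be checked numerically. The sketch below (Python, not from the text) discretizes a functional on a 1-dimensional grid and represents δ(x − y) as a spike of height 1/Δx at the grid point y; the grid size and the test functional F[φ] = ∫ dx φ² are illustrative choices.

```python
import numpy as np

# Discretize F[phi] = ∫ dx phi(x)^2 on a 1-d grid. By Eq. (2.95) with n = 2
# the functional derivative should be deltaF/deltaphi(y) = 2 phi(y).
dx = 0.01
x = np.arange(0.0, 1.0, dx)
phi = np.sin(2 * np.pi * x)

def F(f):
    return np.sum(f**2) * dx

def functional_derivative(F, f, iy, eps=1e-6):
    # Eq. (2.93): perturb f by eps * delta(x - y); on the grid the delta
    # function becomes a spike of height 1/dx at grid point iy.
    delta = np.zeros_like(f)
    delta[iy] = 1.0 / dx
    return (F(f + eps * delta) - F(f)) / eps

iy = 25                      # y = 0.25, where phi = sin(pi/2) = 1
numeric = functional_derivative(F, phi, iy)
exact = 2 * phi[iy]
print(numeric, exact)        # agree up to O(eps/dx) and grid error
```

The residual difference between `numeric` and `exact` is the ε/Δx term from the (δ-spike)² contribution, which vanishes as ε → 0 first, exactly as the text's limiting prescription demands.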
From the above definition one can immediately prove that the product
rule of differentiation works for functional derivatives. The chain rule also
works, leading to
(δ/δφ(y)) F[G[φ]] = ∫ d^4x (δF[G]/δG(x)) (δG[φ]/δφ(y))   (2.94)
From the definition in Eq. (2.93), it is easy to obtain the functional derivatives of several simple functionals. For example, we have:
F[φ] = ∫ d^4x (φ(x))^n ;    δF[φ]/δφ(y) = n (φ(y))^{n−1}   (2.95)
This result can be generalized to any function g[φ(x)] which has a power
series expansion, thereby leading to the result
(δ/δφ(y)) ∫ d^4x g(φ(x)) = g′(φ(y))   (2.96)
where prime denotes differentiation with respect to the argument. Next, if
x is one dimensional, we can prove:
F[φ] = ∫ dx (dφ(x)/dx)^n ;    δF[φ]/δφ(y) = −n [ (d/dx) (dφ/dx)^{n−1} ]_y   (2.97)

which again generalizes to any function h[∂_a φ] in the form

(δ/δφ(y)) ∫ dx h[∂_a φ] = −∂_a [ ∂h/∂(∂_a φ) ]_y   (2.98)
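Eq. (2.97) and Eq. (2.98) can be probed numerically in the same way. The sketch below (an illustration, not from the text) assumes periodic boundary conditions so that the surface term generated by the integration by parts drops out, and uses a symmetric difference in ε so that the (δ-spike)² contribution cancels exactly for this quadratic functional.

```python
import numpy as np

# Check Eq. (2.97) for n = 2: F[phi] = ∫ dx (phi')^2 should have
# deltaF/deltaphi(y) = -2 phi''(y) on a periodic domain.
N = 400
dx = 2 * np.pi / N
x = np.arange(N) * dx
phi = np.sin(x)

def dfdx(f):
    # periodic central difference
    return (np.roll(f, -1) - np.roll(f, 1)) / (2 * dx)

def F(f):
    return np.sum(dfdx(f)**2) * dx

def functional_derivative(F, f, iy, eps=1e-6):
    delta = np.zeros_like(f)
    delta[iy] = 1.0 / dx                  # discretized delta(x - y)
    # symmetric difference: exact in eps for a quadratic functional
    return (F(f + eps * delta) - F(f - eps * delta)) / (2 * eps)

iy = 100                                  # y = pi/2
numeric = functional_derivative(F, phi, iy)
exact = 2 * np.sin(x[iy])                 # -2 phi''(y) = +2 sin(y)
print(numeric, exact)                     # agree up to the grid discretization error
```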
Very often we will have expressions in which F is a functional of φ as well
as an ordinary function of another variable. Here is one example:
F[φ; y] = ∫ d^4x′ K(y, x′) φ(x′) ;    δF[φ; y]/δφ(x) = K(y, x)   (2.99)

Finally, note that sometimes F is intrinsically a function of x through its
functional dependence on φ like, e.g., when F[φ; x] = ∇φ(x), or, even more
simply, F[φ; x] = φ(x); then the functional derivatives are:

F[φ; x] = φ(x) ;    δF[φ; x]/δφ(y) = δ(x − y)   (2.100)

F[φ; x] = ∇φ(x) ;    δF_x[φ]/δφ(y) = ∇_x δ(x − y)   (2.101)
This is different from the partial derivative ∂F/∂φ which is taken usually
at constant ∇φ, leading to (∂∇φ/∂φ) = 0. The results in the text can be
easily obtained from the above basic results of functional differentiation.
Chapter 3
From Fields to Particles
In the previous chapters we obtained the propagation amplitude G(x2 ; x1 )
by two different methods. First we computed it by summing over paths
in the path integral for a free relativistic particle. Second, we obtained
G(x2 ; x1 ) by studying how an external c-number source creates and destroys particles from the vacuum state. Both approaches, gratifyingly, led
to the same expression for G(x2 ; x1 ) but it was also clear — in both approaches — that one cannot really interpret this quantity in terms of a
single, relativistic particle propagating forward in time. These approaches
strongly suggested the interpretation of G(x2 ; x1 ) in terms of a system with
an infinite number of degrees of freedom, loosely called a field.
In the first approach, we were led (see Eq. (1.131)) to the result:
Σ_{(0,x_1)}^{(t,x_2)} exp( −im ∫_{t_1}^{t_2} dt √(1 − v^2) ) = G(x_2; x_1) = ⟨0|T[φ(x_2)φ(x_1)]|0⟩   (3.1)
expressing G(x2 ; x1 ) in terms of the vacuum expectation value of the time
ordered product of two φs. (For simplicity, we have now assumed that φ is
Hermitian.) In the second approach, we were again led (see Eq. (2.54)) to
the relation
∫ DJ Z(J) exp( −i ∫ J(x)φ(x) d^4x ) = exp( i ∫ d^4x (1/2)[ ∂_a φ ∂^a φ − m^2 φ^2 ] )   (3.2)

which expresses the functional Fourier transform of the vacuum persistence
amplitude [Z(J)/Z(0)] = ⟨0_+|0_−⟩_J in the presence of a source J(x), in
terms of a functional of a scalar field given by:

A = (1/2) ∫ d^4x [ ∂_a φ ∂^a φ − m^2 φ^2 ]   (3.3)
We also saw that this functional can be thought of as the action functional
for the scalar field (see Eq. (1.146)) and could have been obtained as the
sum of the action functionals for an infinite number of harmonic oscillators
which were used to arrive at the interpretation of Eq. (3.1). Given the fact
that Z(J) ∝ 0+ |0− J in the left hand side of Eq. (3.2) contains G(x2 ; x1 ),
this relation again links it to the dynamics of a field φ. More directly, we
found that
G(x_2; x_1) = ⟨0|T[φ(x_2)φ(x_1)]|0⟩ = ∫ Dφ φ(x_2)φ(x_1) exp(iA[φ])   (3.4)
thus expressing the propagation amplitude entirely in terms of a path integral average of the fields.
In this chapter we shall reverse the process and obtain the relativistic
particle itself as an excitation of a quantum field. In the process, we will
re-derive Eq. (3.1) and other relevant results, starting from the dynamics
of a field described by the action functionals like the one in Eq. (3.3). Part
of this development will be quite straightforward since we have already
obtained all these results through one particular route and we only have
to reverse the process and make the necessary connections. So we will
omit the obvious algebra and instead concentrate on new conceptual issues
which crop up.
3.1 Classical Field Theory
Let us assume that someone has given you the functional in Eq. (3.3) and
has told you that this should be thought of as the action functional for a
real scalar field φ(t, x). You are asked to develop its quantum dynamics
and you may very well ask ‘why’. The reason is that one can eventually get
relativistic particles and their propagators out of such a study and more
importantly, field theory provides a simple procedure to model interactions
between the particles in a Lorentz invariant, quantum mechanical, manner.
This is far from obvious at this stage but you will be able to see it before
the end of this chapter.
1 Some textbooks attempt to "introduce" field theory after some mumbo-jumbo about mattresses or springs connected to balls etc. which, if anything, confuses the issue. The classical dynamics of a field based on an action functional is a trivial extension of classical mechanics based on the action principle, unless you make it a point to complicate it.
So we want to study the (classical and) quantum dynamics of fields and
the easiest procedure is to start with action principles. We will recall the
standard action principle in the case of classical mechanics and then make
obvious generalizations1 to proceed from point mechanics to relativistically
invariant field theory.
3.1.1 Action Principle in Classical Mechanics
The starting point in classical mechanics is an action functional defined as
an integral (over time) of a Lagrangian:
A = ∫_{t_1,q_1}^{t_2,q_2} dt L(q̇, q)   (3.5)

The Lagrangian depends on the function q(t) and its time derivative q̇(t)
and the action is defined for all functions q(t) which satisfy the boundary
conditions q(t1 ) = q1 , q(t2 ) = q2 . For each of these functions, the action
A will be a pure number; thus the action can be thought of as a functional of q(t). Very often, the limits of integration on the integral will not
be explicitly indicated or will be reduced to just t1 and t2 for notational
convenience.
Let us now consider the change in the action when the form of the
function q(t) is changed from q(t) to q(t) + δq(t). The variation gives
δA = ∫_{t_1}^{t_2} dt [ (∂L/∂q) δq + (∂L/∂q̇) δq̇ ]
   = ∫_{t_1}^{t_2} dt [ (∂L/∂q) − (d/dt)(∂L/∂q̇) ] δq + [ (∂L/∂q̇) δq ]_{t_1}^{t_2}
   = ∫_{t_1}^{t_2} dt [ (∂L/∂q) − (dp/dt) ] δq + [ p δq ]_{t_1}^{t_2}   (3.6)
In arriving at the second equality we have used δq̇ = (d/dt)δq and have
carried out an integration by parts. In the third equality we have defined
the canonical momentum by p ≡ (∂L/∂q̇).
Let us first consider those
variations δq which preserve the boundary conditions, so that δq = 0 at
t = t1 and t = t2 . In that case, the pδq term vanishes at the end points.
If we now demand that δA = 0 for arbitrary choices of δq in the range
t1 < t < t2 , we arrive at the equation of motion
(∂L/∂q) − (dp/dt) = 0   (3.7)
It is obvious that two Lagrangians L1 and L1 + (df (q, t)/dt), where f (q, t)
is an arbitrary function, will lead to the same equations of motion.2
The Hamiltonian for the system is defined by H ≡ pq̇ − L with the
understanding that H is treated as a function of p and q (rather than a
function of q̇ and q). By differentiating H with respect to time and using
Eq. (3.7) we see that (dH/dt) = 0.
It is also useful to introduce another type of variation which allows us
to determine the canonical momentum in terms of the action itself. To do
this, we shall treat the action as a function of the upper limits of integration
(which we denote simply as q, t rather than as q2 , t2 ) but evaluated for a
particular solution qc (t) which satisfies the equation of motion in Eq. (3.7).
This makes the action a function of the upper limits of integration; i.e.,
A(q, t) = A[q, t; qc (t)]. We can then consider the variation in the action
when the value of q at the upper limit of integration is changed by δq. In
this case, the first term in the third line of Eq. (3.6) vanishes and we get
δA = pδq, so that
p = ∂A/∂q   (3.8)
From the relations

dA/dt = L = ∂A/∂t + (∂A/∂q) q̇ = ∂A/∂t + p q̇   (3.9)

we find that

H = p q̇ − L = −∂A/∂t   (3.10)

This description forms the basis for the Hamilton-Jacobi equation in classical mechanics. In this equation we can express H(p, q) in terms of the
action by substituting for p by ∂A/∂q thereby obtaining a partial differential equation for A(q, t) called the Hamilton-Jacobi equation:

∂A/∂t + H(∂A/∂q, q) = 0   (3.11)

2 Note that f can depend on q and t but not on q̇.
all of which must be familiar to you from your classical mechanics course.
Exercise 3.1: Make sure you understand why ∂A/∂t is not just L but −H. That is, why there is an extra pq̇ term and a sign flip. Also note that these relations can be written in the "Lorentz invariant" form p_i = −∂_i A though classical mechanics knows nothing about special relativity. Figure out why.
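For a concrete check of Eq. (3.8), Eq. (3.10) and Eq. (3.11) (an illustrative sketch, not from the text): for a free particle with H = p²/2m, the action evaluated on the classical path from (q₁, t₁) to (q, t) is A = m(q − q₁)²/[2(t − t₁)], and it satisfies the Hamilton-Jacobi equation identically. With sympy:

```python
import sympy as sp

# Free particle: classical action as a function of the upper end point (q, t).
m, q, t, q1, t1 = sp.symbols('m q t q1 t1', positive=True)
A = m * (q - q1)**2 / (2 * (t - t1))

p = sp.diff(A, q)                       # Eq. (3.8): p = dA/dq
H = p**2 / (2 * m)                      # free-particle Hamiltonian
hj = sp.simplify(sp.diff(A, t) + H)     # Eq. (3.11): dA/dt + H(dA/dq, q)

print(p)     # m*(q - q1)/(t - t1): the usual momentum along the path
print(hj)    # 0: the Hamilton-Jacobi equation is satisfied
```

Notice that ∂A/∂t comes out negative, equal to −p²/2m = −H, which is exactly the sign flip Exercise 3.1 asks about.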
3.1.2 From Classical Mechanics to Classical Field Theory
Let us now proceed from classical mechanics to classical field theory. As
you will see, everything works out as a direct generalization of the ideas
from point mechanics as highlighted in Table 3.1 for ready reference. Let
us review some significant new features.
In classical mechanics, the action is expressed as an integral of the Lagrangian over a time coordinate with the measure dt. In relativity, we want
to deal with space and time on an equal footing and hence will generalize
this to an integral over the spacetime coordinates with a Lorentz invariant measure d4 x in any inertial Cartesian system. The Lagrangian L now
becomes a scalar so that its integral over d4 x leads to another scalar.
Further, in classical mechanics, the Lagrangian for a closed system depends on the dynamical variable q(t) and its first time derivative q̇(t) ≡
∂_0 q. In relativity, one cannot treat the time coordinate preferentially in a
Lorentz invariant manner; the dynamical variable describing a field, φ(xi ),
will depend on both time and space and the Lagrangian will depend on the
derivatives of the dynamical variable with respect to both time and space,
∂i φ. Hence, the action for the field has the generic form
A = ∫_V d^4x L(∂_a φ, φ)   (3.12)
3 This generalizes the notion in classical mechanics in which integration over time is in some interval t_1 ≤ t ≤ t_2.

4 This is identical in spirit to working with a Lagrangian L(q_i, q̇_i) where i = 1, 2, ...K denotes a system with K generalized coordinates.

We stress that, in this action, the dynamical variable is the field φ(t, x) and
x^i = (t, x) are just parameters. The integration is over a 4-dimensional region V in spacetime, the boundary
of which will be a 3-dimensional surface,3 denoted by ∂V.
At this stage, it is convenient to introduce several fields (or field components) into the Lagrangian and deal with all of them at one go.4 We
will denote by φN (t, x) a set of K fields where N = 1, 2, ...K. In fact, we
can even use the same notation to denote different components of the same
field when we are working with a vector field or tensorial fields. In the expressions below we will assume that we sum over all values of N whenever
the index is repeated in any given term. The action now becomes
A = ∫_V d^4x L(∂_a φ_N, φ_N)   (3.13)
To obtain the dynamics of the field, we need to vary the dynamical
variable φN . Performing the variation, we get (in a manner very similar to
the corresponding calculation in classical mechanics):
δA = ∫_V d^4x [ (∂L/∂φ_N) δφ_N + (∂L/∂(∂_a φ_N)) δ(∂_a φ_N) ]
   = ∫_V d^4x [ (∂L/∂φ_N) − ∂_a (∂L/∂(∂_a φ_N)) ] δφ_N + ∫_V d^4x ∂_a [ (∂L/∂(∂_a φ_N)) δφ_N ]   (3.14)
In obtaining the second equality, we have used the fact that δ(∂_a φ_N) =
∂_a(δφ_N) and have performed an integration by parts. The last term in the
second line is an integral over a four-divergence, ∂_a[π_N^a δφ_N], where π_N^a ≡
[∂L/∂(∂_a φ_N)]. This quantity π_N^a generalizes the expression [∂L/∂(∂_0 q)] =
[∂L/∂q̇] from classical mechanics and can be thought of as the analogue of
the canonical momentum. In fact, the 0-th component of this quantity is
Property | Mechanics | Field theory
Independent variable | t | (t, x)
Dependent variable | q_j(t) | φ_N(t, x)
Definition of Action | A = ∫ dt L | A = ∫ d^4x L
Form of Lagrangian | L = L(∂_0 q_j, q_j) | L = L(∂_i φ_N, φ_N)
Domain of integration | t ∈ (t_1, t_2), 1-dimensional interval | x^i ∈ V, 4-dimensional region
Boundary of integration | two points; t = t_1, t_2 | 3D surface ∂V
Canonical Momentum | p_j = ∂L/∂(∂_0 q_j) | π_N^j = ∂L/∂(∂_j φ_N)
General form of the variation | δA = ∫_{t_1}^{t_2} dt E[q_j] δq_j + ∫_{t_1}^{t_2} dt ∂_0 (p_j δq_j) | δA = ∫_V d^4x E[φ_N] δφ_N + ∫_V d^4x ∂_j (π_N^j δφ_N)
Form of E | E[q_j] = ∂L/∂q_j − ∂_0 p_j | E[φ_N] = ∂L/∂φ_N − ∂_j π_N^j
Boundary condition to get equations of motion | δq_j = 0 at the boundary | δφ_N = 0 at the boundary
Equations of motion | ∂L/∂q_j − ∂_0 p_j = 0 | ∂L/∂φ_N − ∂_j π_N^j = 0
Form of δA when E = 0 gives momentum | δA = (p_j δq_j)|_{t_1}^{t_2} | δA = ∫_{∂V} d^3σ_j (π_N^j δφ_N)
Energy | E = p_j ∂_0 q_j − L | T^a_b = [π_N^a ∂_b φ_N − δ^a_b L]

Table 3.1: Comparison of action principles in classical mechanics and field theory
indeed π_N^0 = [∂L/∂φ̇_N], as in classical mechanics. We now need to use the
4-dimensional divergence theorem,

∫_V d^4x ∂_i v^i = ∫_{∂V} d^3σ_i v^i   (3.15)
where V is a region of 4-dimensional space bounded by a 3-surface ∂V and
d3 σi is an element of the 3-surface. You might recall the Fig. 3.1 from
your special relativity course which explains the usual context in which
we use this result: We take the boundaries of a 4-dimensional region V to
be made of the following components: (i) Two 3-dimensional surfaces at
t = t1 and t = t2 both of which are spacelike; the coordinates on these
surfaces being the regular spatial coordinates (x, y, z) or (r, θ, φ). (ii) One
timelike surface at a large spatial distance (r = R → ∞) at all time t
in the interval t1 < t < t2 ; the coordinates on this 3-dimensional surface
could be (t, θ, φ). In the right hand side of Eq. (3.15) the integral has to
be taken over the surfaces in (i) and (ii). If the vector field v j vanishes at
large spatial distances, then the integral over the surface in (ii) vanishes
for R → ∞. For the integral over the surfaces in (i), the volume element
can be parametrized as dσ0 = d3 x. It follows that
∫_V d^4x ∂_i v^i = ∫_{t=t_2} d^3x v^0 − ∫_{t=t_1} d^3x v^0

[Figure 3.1: Divergence theorem in the spacetime. The boundary of V consists of the spacelike surfaces t = t_1 and t = t_2 and a timelike surface at r = R which goes to spatial infinity as R → ∞.]
with the minus sign arising from the fact that the normal has to be always
treated as outwardly directed. If ∂i v i = 0 then the integral of v 0 over all
space is conserved in time.
In our case, we can convert the total divergence term in the action into
a surface term
δA_sur ≡ ∫_V d^4x ∂_a (π_N^a δφ_N) = ∫_{∂V} dσ_a π_N^a δφ_N → ∫_{t=const} d^3x π_N^0 δφ_N   (3.16)

5 In classical mechanics, the corresponding analysis leads to pδq at the end points t = t_1 and t = t_2. Since the integration is over one dimension, the "boundary" in classical mechanics is just two points. In the relativistic case, the integration is over four dimensions, leading to a boundary term which is a 3-dimensional integral.
where the last expression is valid if we take the boundary to be the spacelike
surfaces defined by t = constant and assume that the surface at spatial
infinity does not contribute.5 We see that we can obtain sensible dynamical
equations for the field φN by demanding δA = 0 if we consider variations
δφN which vanish everywhere on the boundary ∂V. (This is similar to
demanding δq = 0 at t = t1 and t = t2 in classical mechanics.) For such
variations, the demand δA = 0 leads to the field equations
∂_a ( ∂L/∂(∂_a φ_N) ) = ∂_a π_N^a = ∂L/∂φ_N   (3.17)
Given the form of the Lagrangian, this equation determines the classical
dynamics of the field.
We can also consider the change in the action when the field configuration is changed on the boundary ∂V assuming that the equations of motion
are satisfied. In classical mechanics this leads to the relation p = (∂A/∂q)
where the action is treated as a function of its end points. In our case,
Eq. (3.16) can be used to determine different components of π_N^a by choosing different surfaces. In particular, if we take the boundary to be t =
constant, we get π_N^0 = (δA/δφ_N) on the boundary, where (δA/δφ_N) is
the functional derivative and is defined through the second equality in
Eq. (3.16). This provides an alternative justification for interpreting π_N^a as
the canonical momentum.
Another useful quantity which we will need is the energy momentum
tensor of the field. In classical mechanics, if the Lagrangian has no explicit
dependence on time t, then one can prove that the energy defined by E =
pq̇ − L is conserved. By analogy, when the relativistic Lagrangian has no
explicit dependence on the spacetime coordinate x^i, we expect to obtain a
suitable conservation law. In this case, we expect q̇ to be replaced by ∂_i φ_N
and p to be replaced by π_N^a. This suggests considering a generalization of
E = pq̇ − L to the second rank tensor

T^a_i ≡ [ π_N^a ∂_i φ_N − δ^a_i L ]   (3.18)
Again, we see that the component T^0_0 = π_N^0 φ̇_N − L is identical in structure
to E in classical mechanics, making T^0_0 (= T_00) the energy density. To check
its conservation, we calculate ∂_a T^a_i treating L as an implicit function of x^a
through φ_N and ∂_i φ_N. We get:
∂_a T^a_i = (∂_i φ_N)(∂_a π_N^a) + π_N^a ∂_a ∂_i φ_N − (∂L/∂φ_N) ∂_i φ_N − π_N^a ∂_i ∂_a φ_N
         = (∂_i φ_N) [ ∂_a π_N^a − ∂L/∂φ_N ] = 0   (3.19)
In arriving at the second equality, we have used ∂i ∂a φN = ∂a ∂i φN to cancel
out a couple of terms and the last equality follows from the equations of
motion, Eq. (3.17). It is obvious that the quantity T ai is conserved when the
equations of motion are satisfied. (It is also obvious that if we accept the
expression for T ai given in Eq. (3.18), then demanding ∂a T ai = 0 will lead
to the equations of motion for the scalar field.) Integrating the conservation
law ∂a T ai = 0 over a four-volume and using the Gauss theorem in Cartesian
coordinates, we find that the quantity
P^i = ∫ dσ_k T^{ki} = ∫ d^3x T^{0i}   (3.20)
is a constant which does not vary with time. We will identify P i with the
total four-momentum of the field.6 [One difficulty with our definition of
the energy momentum tensor through Eq. (3.18) is that — in general —
it will not be symmetric; there are alternative definitions which will tackle
this problem, which we will describe later.]
In addition to symmetries related to spacetime, the action can have
some symmetries related to the transformation of fields, which can also
lead to conservation laws. A simple example is provided by a set of K fields
φN with N = 1, 2, ....K with the Lagrangian being invariant under some
infinitesimal transformation φN → φN + δφN . (That is, the Lagrangian
does not change under such a transformation without our using the field
equations.) This allows us to write
0 = δL = (∂L/∂φ_N) δφ_N + π_N^j ∂_j(δφ_N) = [ (∂L/∂φ_N) − ∂_j π_N^j ] δφ_N + ∂_j (π_N^j δφ_N)   (3.21)
If we now further assume that the field equations hold, then the first term
on the right hand side vanishes and we get a conserved current
J^i ∝ π_N^i δφ_N ∝ Σ_N [ ∂L/∂(∂_i φ_N) ] (δφ_N/δα)   (3.22)
6 The absence of xi in the Lagrangian
is equivalent to the 4-dimensional
translational invariance of the Lagrangian. We see that the symmetry of 4-dimensional translational invariance leads to the conservation of
both energy and momentum at one go.
In classical mechanics, time translation invariance leads to energy conservation and spatial translation invariance leads to momentum conservation,
separately. But since Lorentz transformation mixes space and time coordinates, the conservation law in relativity is for the four-momentum.
where the second relation is applicable if the changes δφN were induced by
the change of some parameter α so that δφN = (∂φN /∂α)δα. This result
will be needed later.
3.1.3 Real Scalar Field
All this works for any action functional. If we specialize to the action in
Eq. (3.3), then the Lagrangian, in explicit (1 + 3) form, becomes
L = (1/2)[ ∂_a φ ∂^a φ − m^2 φ^2 ] = (1/2)φ̇^2 − (1/2)(∇φ)^2 − (1/2)m^2 φ^2   (3.23)

7 This equation is called the Klein-Gordon equation though it was first written down by Schrodinger, possibly because he anyway has another equation named after him! We will come across several such naming conventions as we go along.

8 We write B(−k) rather than the natural B(k) in Eq. (3.25) for future convenience.

Exercise 3.2: Find the solution φ(t, x) to the Klein-Gordon equation explicitly in terms of the initial conditions φ(t_0, x) and φ̇(t_0, x) at an earlier time t_0.
The field equation7 reduces to
( ∂^a ∂_a + m^2 ) φ = 0 ;    ( ∂^2/∂t^2 − ∇^2 + m^2 ) φ = 0   (3.24)
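The passage from Eq. (3.23) to Eq. (3.24) is a direct application of Eq. (3.17), and can be verified symbolically. The sketch below (Python with sympy, not from the text) works in 1+1 dimensions, which is enough to see the structure; the extension to three spatial derivatives is mechanical.

```python
import sympy as sp

t, x, m = sp.symbols('t x m')
phi = sp.Function('phi')(t, x)

# Lagrangian (3.23) reduced to 1+1 dimensions
L = sp.Rational(1, 2) * (sp.diff(phi, t)**2 - sp.diff(phi, x)**2
                         - m**2 * phi**2)

# Euler-Lagrange equation (3.17): d_a (dL/d(d_a phi)) - dL/dphi = 0
eom = (sp.diff(sp.diff(L, sp.diff(phi, t)), t)
       + sp.diff(sp.diff(L, sp.diff(phi, x)), x)
       - sp.diff(L, phi))

# Expect phi_tt - phi_xx + m**2 * phi: the Klein-Gordon operator acting on phi
print(sp.simplify(eom))
```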
The general solution to the Klein-Gordon equation can be easily determined
by Fourier transforming the equation in the spatial variables. We find that
φ(t, x) = ∫ (d^3k/(2π)^3) (1/2ω_k) e^{ik·x} [ A(k) e^{−iω_k t} + B(−k) e^{iω_k t} ]   (3.25)
where ω_k ≡ √(k^2 + m^2) and A(k) and B(k) are arbitrary scalar functions
satisfying A∗ (k) = B(k) to ensure that φ is real.8 It is clear that the
solution represents a superposition of waves with wave vector k and that
the frequency of the wave is given by the dispersion relation ωk2 = k2 + m2
corresponding to a four-vector k i with k i ki = m2 . Flipping the sign of
k in the second term of Eq. (3.25), we can write it in manifestly Lorentz
invariant form:
φ(x) = ∫ dΩ_k [ A(k) e^{−ikx} + A^*(k) e^{ikx} ]   (3.26)
(3.26)
The existence of two arbitrary real functions in the solution (in the form
of one complex function A(k)) is related to the fact that Klein-Gordon
equation is second order in time; to find φ(t, x) you need to know both
φ(t_0, x) and φ̇(t_0, x).
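The content of Eq. (3.25), and of Exercise 3.2, is that each Fourier mode is an oscillator of frequency ω_k, so initial data (φ, φ̇) at t = 0 evolve mode by mode as φ_k(t) = φ_k(0) cos ω_k t + [φ̇_k(0)/ω_k] sin ω_k t. A numerical sketch (Python, 1+1 dimensional periodic box; the grid, mass and initial data are illustrative choices, not from the text):

```python
import numpy as np

N, m = 256, 1.0
dx = 2 * np.pi / N
x = np.arange(N) * dx
k = 2 * np.pi * np.fft.fftfreq(N, d=dx)      # integer wavenumbers on this box
w = np.sqrt(k**2 + m**2)                     # dispersion relation w_k^2 = k^2 + m^2

phi0 = np.cos(3 * x)                         # initial phi
phidot0 = np.zeros_like(x)                   # initial phidot

def evolve(phi0, phidot0, t):
    # exact mode-by-mode solution of the Klein-Gordon equation
    a, b = np.fft.fft(phi0), np.fft.fft(phidot0)
    return np.real(np.fft.ifft(a * np.cos(w * t) + b * np.sin(w * t) / w))

t = 0.7
phi_t = evolve(phi0, phidot0, t)
# exact solution for this initial data: a standing wave cos(3x) cos(w_3 t)
exact = np.cos(3 * x) * np.cos(np.sqrt(9 + m**2) * t)
print(np.max(np.abs(phi_t - exact)))         # machine-precision agreement
```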
The canonical momentum corresponding to the field φ is given by π^a =
∂^a φ, and in particular

π(t, x) ≡ ∂L/∂φ̇(t, x) = φ̇(t, x)   (3.27)
The energy momentum tensor Tab and the energy density H = T00 are
given by
T^a_b = ∂^a φ ∂_b φ − δ^a_b L ;    H = (1/2)[ φ̇^2 + (∇φ)^2 + m^2 φ^2 ]   (3.28)

The Hamiltonian density H can also be found from the expression π φ̇ − L.
This leads to the Hamiltonian when we integrate H over all space:

H = ∫ d^3x H = ∫ d^3x [ (1/2)π^2 + (1/2)(∇φ)^2 + (1/2)m^2 φ^2 ]   (3.29)
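One can also verify numerically that the Hamiltonian of Eq. (3.29) stays constant under the free evolution, the field-theory analogue of dH/dt = 0 in mechanics. A sketch (Python, on a 1+1 dimensional periodic box; the initial data are illustrative, not from the text):

```python
import numpy as np

N, m = 256, 1.0
dx = 2 * np.pi / N
x = np.arange(N) * dx
k = 2 * np.pi * np.fft.fftfreq(N, d=dx)
w = np.sqrt(k**2 + m**2)

a = np.fft.fft(np.cos(3 * x) + 0.5 * np.sin(x))    # Fourier modes of phi(0)
b = np.fft.fft(0.3 * np.cos(2 * x))                # Fourier modes of phidot(0)

def energy(t):
    # exact mode evolution: each k is a harmonic oscillator of frequency w_k
    phi_k = a * np.cos(w * t) + b * np.sin(w * t) / w
    pi_k = -a * w * np.sin(w * t) + b * np.cos(w * t)
    phi = np.real(np.fft.ifft(phi_k))
    pi = np.real(np.fft.ifft(pi_k))
    gradphi = np.real(np.fft.ifft(1j * k * phi_k))
    # Eq. (3.29): H = ∫ dx (pi^2 + (grad phi)^2 + m^2 phi^2) / 2
    return 0.5 * np.sum(pi**2 + gradphi**2 + m**2 * phi**2) * dx

E0, E1 = energy(0.0), energy(1.3)
print(E0, E1)        # equal up to rounding error
```

Mode by mode, |π_k|² + ω_k²|φ_k|² is the conserved oscillator energy, and the spatial sum simply repackages k²|φ_k|² + m²|φ_k|² as ω_k²|φ_k|², which is why the total is exactly constant.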
The energy momentum tensor in Eq. (3.28) happens to be symmetric
but this is a special feature of the Lagrangian for a scalar field. The energy
momentum tensor as defined in Eq. (3.18), T^a_i, will involve π_N^a ∂_i φ_N and
hence — in general — need not be symmetric; as we said before this is
a serious difficulty with the definition in Eq. (3.18). (The problem arises
from the fact that angular momentum will not be conserved if Tab is not
symmetric; but we will not pause to prove this.) There are tricks to get
around this difficulty in an ad-hoc manner but the only proper way to
address this question is to ask for the physical meaning of Tab . Since we
expect the energy momentum tensor to be the source of gravity, the correct
definition of the energy momentum tensor must involve aspects of gravity.
We will now briefly describe how such a definition emerges; as a bonus
we will also understand how to do field theory in arbitrary curvilinear
coordinates and in curved spacetime.
Let us consider the action in Eq. (3.3) and ask how one would rewrite
it in an arbitrary curvilinear coordinate system. We know that when one
transforms from the Cartesian coordinates xα ≡ (x, y, z) in 3-dimensional
space to, say, spherical polar coordinates x̄^α ≡ (r, θ, φ), we need to use a
general metric g_αβ rather than δ_αβ because the line interval changes from
dℓ^2 = δ_αβ dx^α dx^β to dℓ^2 = g_αβ dx̄^α dx̄^β. The dot products between vectors
etc. will now involve g_αβ v^α u^β instead of δ_αβ v^α u^β in Cartesian coordinates.
Moreover, transforming from x^α to x̄^α will also change the volume element
of integration d^3x to d^3x̄ √g where g is the determinant of g_αβ.
Exactly the same sort of changes occur when we go from 3-dimensions
to 4-dimensions. If we decide to use some set of curvilinear coordinates x̄^i
instead of the standard Cartesian inertial coordinates x^i, the expression for
the spacetime interval will change from ds^2 = η_ik dx^i dx^k to ds^2 = g_ik dx̄^i dx̄^k
and the volume element9 will become d^4x √(−g). The raising and lowering
of indices will now be done with g_ab rather than η_ab, with g^ab defined as the
inverse of the matrix g_ab.
With these modifications the action in Eq. (3.3) can be expressed in
any curvilinear coordinate system in the form
A = ∫ d^4x √(−g) [ (1/2) g^ab ∂_a φ ∂_b φ − V(φ) ]   (3.30)
where we have promoted the term (1/2)m2 φ2 to an arbitrary potential
V (φ) for future convenience. Using the principle of equivalence — which
may be broadly interpreted as saying that physics in curvilinear coordinates
should be the same as physics in a gravitational field in a sufficiently small
region of spacetime — one can make a convincing case10 that the above
expression should also hold in arbitrary curved spacetime described by a
line interval ds2 = gik dxi dxk .
With these modifications, we have obtained a generally covariant version of the action principle for the scalar field, given by Eq. (3.30), which retains its form under arbitrary coordinate transformations. This action now
has two fields in it, φ(x) and gab (x). Varying φ(x) will give the generalization of the field equations in Eq. (3.24), valid in a background gravitational
field described by the metric tensor gab :
(1/√(−g)) ∂_a [ √(−g) g^ab ∂_b φ ] = −V′(φ)   (3.31)
where the operator (|g|)−1/2 ∂a [|g|1/2 g ab ∂b ] is the expression for Laplacian
in curvilinear coordinates or curved spacetime.11 On the other hand, the
9 We write √(−g) rather than √g because we know that the determinant will be negative definite in spacetime due to the flip of sign between space and time coordinates.
10 This is, of course, not a course on
GR so we will not spend too much time
on these aspects; fortunately, this is all
you need to know to get by for the moment. At the first available opportunity, do learn GR if you are not already
familiar with it!
11 It is the same maths as the one involved in writing ∇2 in spherical polar coordinates, except for some sign
changes and promotion from D = 3 to
D = 4.
variation of the action with respect to the metric tensor will have the generic
structure
δA ≡ (1/2) ∫ d^4x √(−g) T_ab δg^ab   (3.32)
12 Sometimes textbooks — even good ones — claim that 'since A is a scalar, δA = 0', which is misleading. Since δL = −ξ^a ∂_a L and δ√g = −∂_a(√g ξ^a), we actually have δ[L√g] = −∂_a[√g L ξ^a], which is a total divergence that vanishes if (and only if) ξ^a vanishes on the boundary.
where the right hand side defines some second rank symmetric tensor
Tab . When we study gravity we will start with a total action Atot =
Ag (g) + Am (φ, g) which is the sum of the action for the gravitational field
Ag depending on the metric and the action for the matter source depending on the matter variables (like the scalar field φ) and the metric, as in
Eq. (3.30). The variation of the total action will give the field equations of
gravity with the variation of the matter action with respect to gab providing
the source. This justifies using the symbol Tab and identifying the tensor
which appears in Eq. (3.32) as the energy momentum tensor of matter.
It is also easy to see that the conservation law for this tensor T_ab arises
from the fact that our action functional is now a scalar with respect to arbitrary coordinate transformations x^a → x̄^a. The infinitesimal coordinate
transformation would correspond to x̄^a = x^a + ξ^a(x) with the functions
ξ^a(x) treated as a first order infinitesimal quantity that vanishes on the
boundary ∂V of the spacetime region V of interest. The metric tensor in
the new coordinates will be

g^ab = η^ij ∂_i x̄^a ∂_j x̄^b = η^ij (δ_i^a + ∂_i ξ^a)(δ_j^b + ∂_j ξ^b) ≈ η^ab + ∂^a ξ^b + ∂^b ξ^a   (3.33)
so that δg^ab = ∂^a ξ^b + ∂^b ξ^a ≡ ∂^{(a} ξ^{b)}. For such a variation of the metric
arising from an infinitesimal coordinate transformation, δA = 0 in the
left hand side of Eq. (3.32) when ξ a vanishes on the boundary.12 Using
δg ab = ∂ (a ξ b) on the right hand side, we get
0 = (1/2) ∫_V d^4x √(−g) T_ab ∂^{(a} ξ^{b)} = ∫_V d^4x √(−g) T^ab ∂_a ξ_b
  = ∫_V d^4x √(−g) [ ∂_a (T^ab ξ_b) − ξ_b ∂_a T^ab ]   (3.34)
In arriving at the second equality we have used the symmetry of Tab and to
get the third equality we have done an integration by parts. This leads to
a surface term containing T ab ξb on the surface ∂V which vanishes because
ξb = 0 on ∂V. We then get the conservation law ∂a T ab = 0, reinforcing the
physical interpretation of Tab as energy momentum tensor of matter.
Usually, we only have to deal with matter Lagrangians in which the
dependence on g^ab is either linear or quadratic (apart from the √(−g) factor)
and in this case one can write down a general formula for Tab . If the matter
action has the form:
A = ∫ d^4x √(−g) L = ∫ d^4x √(−g) [ −V + U_ij g^ij + W_ijkl g^ij g^kl ]   (3.35)
Exercise 3.3: Prove Eq. (3.36) by
varying det M = exp(Tr ln M ).
where V, Uij , Wijkl are functionals of the matter field variables and are
independent of the metric, then we can carry out the variation of the action
A with respect to g ij in a straight forward manner using the result
13 In general the prescription gives the
Tab in a general curved spacetime. For
us, who pretend to live in flat spacetime, the introduction of g ab is just a
trick to get Tab and we can set g ab =
ηab at the end.
δ√(−g) = (1/2) √(−g) g^ab δg_ab = −(1/2) √(−g) g_ab δg^ab   (3.36)

and obtain:13

T_ab = −η_ab L + 2U_ab + 4W_abkl η^kl   (3.37)
where we have set g ab = η ab at the end of the calculation. In the case of
the scalar field, Wijkl = 0, Uab = (1/2)∂a φ ∂b φ, thus giving:
T_ab = ∂_a φ ∂_b φ − η_ab L   (3.38)
which tells you that we get back the same expression14 in Eq. (3.28).
There is another natural conserved quantity for a system described by
the equation Eq. (3.24) which will acquire some significance later on. If
φ1 and φ2 are any two solutions to Eq. (3.24) (which could, in general, be
complex even though we are dealing with a real scalar field), then it is easy
to show that the current
J_a ∝ i ( φ_1^* ∂_a φ_2 − φ_2 ∂_a φ_1^* ) ≡ i φ_1^* ∂^↔_a φ_2   (3.39)
14 But the effort is not wasted because we will need this procedure to find the
correct T_{ab} of the electromagnetic field later on. Besides, this is a physically
meaningful way to obtain T_{ab}.
is conserved in the sense that ∂a J a = 0. (The constant i in front of the
definition is included to ensure Ja is real.) If we write this conservation
law in the form (∂ρ/∂t) + ∇ · J = 0 with J a = (ρ, J), then we can obtain
a conserved charge by integrating the density ρ over all space. In this
particular context, one can use this fact to define a conserved scalar product
between any two solutions. This is denoted by the symbol (φ1 , φ2 ) and
defined by
(φ₁, φ₂) ≡ (i/2) ∫ d³x φ₁* ↔∂_t φ₂ = (i/2) ∫ d³x (φ₁* φ̇₂ − φ₂ φ̇₁*)    (3.40)
One can verify explicitly that (i) ∂(φ1 , φ2 )/∂t = 0 and that (ii) ρ need not
be positive definite for a general solution.
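Statement (i) can be made concrete (our sketch) by taking two plane-wave solutions of the Klein-Gordon equation in 1+1 dimensions and checking that the current of Eq. (3.39) is divergence-free:

```python
# Hedged sketch: verify d_a J^a = 0 for J_a = i(phi1* d_a phi2 - phi2 d_a phi1*)
# using two plane-wave solutions of the Klein-Gordon equation in 1+1D.
import sympy as sp

t, x = sp.symbols('t x', real=True)
k1, k2, m = sp.symbols('k1 k2 m', real=True, positive=True)
w1 = sp.sqrt(k1**2 + m**2)        # dispersion relation omega^2 = k^2 + m^2
w2 = sp.sqrt(k2**2 + m**2)

phi1 = sp.exp(-sp.I * (w1 * t - k1 * x))
phi2 = sp.exp(-sp.I * (w2 * t - k2 * x))
phi1c = sp.conjugate(phi1)

def J(a):                          # J_a, up to an overall normalization
    return sp.I * (phi1c * sp.diff(phi2, a) - phi2 * sp.diff(phi1c, a))

# d_a J^a with signature (+,-): J^t = J_t, J^x = -J_x
div = sp.simplify(sp.diff(J(t), t) - sp.diff(J(x), x))
print(div)
```

The divergence reduces to i(φ₁* □φ₂ − φ₂ □φ₁*), which vanishes because both waves satisfy the same dispersion relation.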
[Exercise 3.4: Prove these statements.]

3.1.4 Complex Scalar Field
The above analysis generalizes in a straightforward manner to a complex
scalar field φ, which, in fact, can be thought of as being made of two
real scalar fields with φ = (φ₁ + iφ₂)/√2, with the number of degrees of
freedom doubling up. The action is now given by
A = ∫ d⁴x [∂_a φ* ∂^a φ − m² φ*φ]    (3.41)
where we treat φ and φ* as independent variables while varying the action.15 This will lead to the Klein-Gordon equations with the same mass
m for both φ and φ*. The canonical momentum π^a_{(φ)} conjugate to φ is
∂^a φ*, and the one conjugate to φ* is ∂^a φ.
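The two-real-fields statement is an exact algebraic identity, which can be spelled out symbolically (our sketch, in 1+1 dimensions):

```python
# Hedged sketch: substitute phi = (phi1 + i phi2)/sqrt(2) into
# L = d(phi*) d(phi) - m^2 |phi|^2 and compare with the sum of two
# real scalar field Lagrangians.
import sympy as sp

t, x, m = sp.symbols('t x m', real=True)
phi1 = sp.Function('phi1')(t, x)   # the two real components
phi2 = sp.Function('phi2')(t, x)

phi = (phi1 + sp.I * phi2) / sp.sqrt(2)
phic = (phi1 - sp.I * phi2) / sp.sqrt(2)

L_complex = (sp.diff(phic, t) * sp.diff(phi, t)
             - sp.diff(phic, x) * sp.diff(phi, x) - m**2 * phic * phi)

L_two_real = sum(sp.Rational(1, 2) * (sp.diff(f, t)**2 - sp.diff(f, x)**2
                                      - m**2 * f**2) for f in (phi1, phi2))

delta = sp.simplify(sp.expand(L_complex - L_two_real))
print(delta)
```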
There is an important new feature which emerges from the action in
Eq. (3.41). The Lagrangian is now invariant under the transformation
φ(x) → e^{−iθ} φ(x);    φ*(x) → e^{iθ} φ*(x)    (3.42)
where θ is a constant. This symmetry will lead to a conserved current which
actually can be interpreted as the electromagnetic current associated with
the complex scalar field. To find the conserved current we note that the
above symmetry transformation reduces to δφ = −iθφ, δφ∗ = iθφ∗ in the
infinitesimal form, to first order in θ. Using these in Eq. (3.22), taking
φN = (φ, φ∗ ), πN = (∂ a φ∗ , ∂ a φ), we get the conserved current to be
J_m ∝ i (φ ∂_m φ* − φ* ∂_m φ) = −i φ* ↔∂_m φ    (3.43)
15 If you do not want to think of φ∗ as
being independent of φ, go back to φ1
and φ2 and vary them; it amounts to
the same thing.
78
Chapter 3. From Fields to Particles
with the corresponding charge being

Q ∝ ∫ d³x J⁰ = i ∫ d³x φ* ↔∂₀ φ    (3.44)
16 We have subtracted out the rest energy mc² = m by separating the phase
e^{−imt}. The normalization is a relic of (1/√(2ω_k)) which becomes (1/√(2m))
when c → ∞.

[Exercise 3.5: This is a bit nontrivial; convince yourself that we are retaining
terms of O(c³) and O(c) and ignoring terms of O(1/c).]
This is essentially the same conserved scalar product introduced earlier in
Eq. (3.40). We will see in the next section that this conserved charge plays
an important role.
The non-relativistic limit of this system and, in particular, of the Lagrangian, is also of some interest. It can be obtained by writing16
φ = (1/√(2m)) e^{−imt} ψ    (3.45)

[Exercise 3.7: Prove this.]
and expressing the Lagrangian in terms of ψ. We note that |∇φ|² =
(|∇ψ|²/2m) and, to the leading order,

|φ̇|² = (1/2m)(ψ̇ − imψ)(ψ̇* + imψ*) ≈ (m/2)|ψ|² + (i/2)(ψ* ∂₀ψ − ψ ∂₀ψ*)    (3.46)

Hence, in the action in Eq. (3.41), the m²|φ|² = (1/2)m|ψ|² term is cancelled out by the first term in the above expression! Further, doing an
integration by parts over the time variable, we can combine the last two
terms in Eq. (3.46). This reduces the Lagrangian in Eq. (3.41) to the form

L = iψ* ∂₀ψ − (1/2m)|∇ψ|²    (3.47)

[Exercise 3.6: (a) Convince yourself that this is not an artificiality due to
our normalization in Eq. (3.45) and will arise for arbitrary normalization.
(b) Write ψ = √ρ exp(iθ) and re-express the Lagrangian in terms of ρ
and θ. Show that when we quantise this system, we will be led to [N, θ] = i
where N is the integral of ρ over all space. In many condensed matter applications this leads to a conjugate relationship between the number of particles in a condensate and the phase of the wave function.]
The variation of this Lagrangian will lead to the Schrödinger equation
iψ̇ = −(1/2m)∇²ψ for a free particle of mass m. It is amusing to see that
the mass term m²|φ|² does not survive in the non-relativistic limit and that
the (1/2m) factor in the Schrödinger equation has a completely different
origin in the action.
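The cancellation of the mass term is an exact identity, not just a leading-order accident; it can be checked symbolically (our sketch; ψ and ψ* are represented by formally independent functions):

```python
# Hedged sketch: with phi = exp(-i m t) psi / sqrt(2m)  (Eq. 3.45), verify
# |phi_t|^2 - m^2 |phi|^2 = (i/2)(psi* psi_t - psi psi_t*) + |psi_t|^2/(2m),
# i.e. the m^2 |phi|^2 term cancels exactly; only the O(1/m) piece is dropped.
import sympy as sp

t, x, m = sp.symbols('t x m', positive=True)
psi = sp.Function('psi')(t, x)     # psi and psic treated as independent
psic = sp.Function('psic')(t, x)   # psic stands in for psi*

phi = sp.exp(-sp.I * m * t) * psi / sp.sqrt(2 * m)
phic = sp.exp(sp.I * m * t) * psic / sp.sqrt(2 * m)

lhs = sp.expand(sp.diff(phi, t) * sp.diff(phic, t) - m**2 * phi * phic)
rhs = sp.expand(sp.I / 2 * (psic * sp.diff(psi, t) - psi * sp.diff(psic, t))
                + sp.diff(psi, t) * sp.diff(psic, t) / (2 * m))
delta = sp.simplify(lhs - rhs)
print(delta)
```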
3.1.5 Vector Potential as a Gauge Field
Once you have a complex scalar field, there is a very natural way of “discovering” the existence of an electromagnetic field. This approach forms the
cornerstone of gauge field theories, one of the most successful paradigms
in particle physics.
To see how this comes about, let us recall that the Lagrangian for the
complex scalar field in Eq. (3.41) is invariant under the transformation
φ → e^{−iθ}φ ≡ e^{−iqα}φ, where we have put θ = qα with q and α being
real constants, for future convenience. On the other hand, if α = α(x)
is a function of the spacetime coordinates (with q constant), the action is not
invariant, because the derivatives ∂_iφ are not invariant and will pick up ∂_iα
terms.
We now want to modify the action for the complex scalar field such
that it is invariant even under the local transformation with α = α(x).
One way to do this is to replace the ordinary partial derivative ∂i φ by
another quantity Di φ [called the gauge covariant derivative] involving another vector field Ai [called the gauge field ] and arrange matters such that
this invariance is maintained.
Let us postulate an ansatz for the gauge covariant derivative to be
D_i = ∂_i + iqA_i(x). We now demand that, when φ → φ′ = e^{−iqα(x)}φ,
the A_i transforms to A_i′ such that D_i′φ′ = e^{−iqα(x)} D_iφ. [That is, D_iφ
transforms just like φ.] A simple calculation shows that this is achieved if
A_i transforms as

A_i′ = A_i + ∂_iα    (3.48)
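The “simple calculation” can be done mechanically (our sketch; one spatial dimension suffices since each index works independently):

```python
# Hedged sketch: check that with A' = A + d(alpha)/dx and
# phi' = exp(-i q alpha) phi, the gauge covariant derivative satisfies
# D' phi' = exp(-i q alpha) D phi.
import sympy as sp

x, q = sp.symbols('x q', real=True)
phi = sp.Function('phi')(x)
A = sp.Function('A')(x)
alpha = sp.Function('alpha')(x)

phip = sp.exp(-sp.I * q * alpha) * phi
Ap = A + sp.diff(alpha, x)

D_phi = sp.diff(phi, x) + sp.I * q * A * phi        # D = d/dx + iqA
Dp_phip = sp.diff(phip, x) + sp.I * q * Ap * phip   # primed D acting on phi'

delta = sp.simplify(sp.expand(Dp_phip - sp.exp(-sp.I * q * alpha) * D_phi))
print(delta)
```

The inhomogeneous ∂α term produced by differentiating the phase is exactly eaten by the shift in A.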
That is, if we modify the action for the complex scalar field to the form17

S = ∫ d⁴x [(D_iφ)(D^iφ)* − m²|φ|²] = ∫ d⁴x [(∂_i + iqA_i(x))φ (∂^i − iqA^i(x))φ* − m²|φ|²]    (3.49)

17 We denote the action by S (rather than A) for notational clarity vis-à-vis A_i.

then the action remains invariant under the simultaneous transformations:

φ → φ′ = e^{−iqα(x)}φ;    A_i′ = A_i + ∂_iα    (3.50)
Thus, we can construct an action for φ which has local gauge invariance if
we couple it to a vector field Ai and postulate that both fields change as
in Eq. (3.50).
The gauge covariant derivative has the structure D_j = ∂_j + iqA_j(x)
(where A^i = (A⁰, A)), so that −iD_α = −i∂_α + qA_α(x) = −i∂_α − qA^α.
So this is equivalent to replacing p by p − qA. This is precisely what
happens when we couple a particle of electric charge q to an electromagnetic
vector potential A. It makes sense to identify q with the charge of the
quanta of the scalar field (which is also consistent with the fact that, for the
transformation φ → φ′ = e^{−iqα}φ with constant α, the conserved charge will
scale with q) and A_j with the standard electromagnetic vector potential.
We have discovered electromagnetism.
A couple of points are worth remembering about this Lagrangian. First,
note that the Lagrangian has both (D_mφ)* and (D_mφ), in which the
complex conjugation changes D_m = ∂_m + iqA_m to D_m* = ∂_m − iqA_m. The
field equations resulting from, say, varying φ* in the Lagrangian, have the
form

D_mD^mφ + m²φ = −[(i∂ − qA)² − m²]φ = 0    (3.51)

involving only D_m without the complex conjugate. The variation of φ, of
course, leads to the complex conjugate version of the above equation with
D_m → D_m*. Second, one can rewrite the Lagrangian in the form φ*Mφ by
ignoring a total divergence term, just as in the case of the real scalar field.
In this case, we use the identity

(D_jφ)* D^jφ = ∂_j[φ* D^jφ] − φ* D_jD^jφ    (3.52)

and ignore the first term (which is a total divergence) to obtain

L = (D_mφ)* D^mφ − m²φ*φ ⇒ φ* [(i∂ − qA)² − m²] φ    (3.53)
which will turn out to be quite useful in future.
The action in Eq. (3.49) now depends on two fields, Ai as well as φ. The
variation of the action with respect to φ and φ∗ leads to the field equation
in Eq. (3.51) and its complex conjugate. But now we can also vary the
action with respect to Ai . This is identical in spirit to our varying the
metric tensor in the action for the scalar field in Eq. (3.32) to define the
energy momentum tensor T_{ab}. Just as we discovered the source of gravity
by this procedure, varying A_i in the action for the complex scalar field
will lead us to the source for electromagnetism, viz., the current vector J^i,
through

δS ≡ − ∫ d⁴x J_i δA^i    (3.54)

[Exercise 3.8: Show that [D_m, D_l]φ = iqF_{ml}φ where D_m = ∂_m + iqA_m and
F_{lm} = ∂_lA_m − ∂_mA_l. So gauge covariant derivatives do not commute.]
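The commutator identity of Exercise 3.8 can be verified directly (our sketch, working in the t–x plane):

```python
# Hedged sketch: verify [D_t, D_x] phi = i q F_tx phi with
# D_m = d_m + i q A_m and F_tx = d_t(A_x) - d_x(A_t).
import sympy as sp

t, x, q = sp.symbols('t x q', real=True)
phi = sp.Function('phi')(t, x)
At = sp.Function('At')(t, x)
Ax = sp.Function('Ax')(t, x)

def D(f, coord, A):
    return sp.diff(f, coord) + sp.I * q * A * f

comm = D(D(phi, x, Ax), t, At) - D(D(phi, t, At), x, Ax)   # [D_t, D_x] phi
F_tx = sp.diff(Ax, t) - sp.diff(At, x)

delta = sp.simplify(sp.expand(comm - sp.I * q * F_tx * phi))
print(delta)
```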
18 Mnemonic: The sign is chosen such that, with our signature, the Lagrangian has a −ρφ term in an electrostatic field.
This is a very general definition for the current J i to which the vector field
Ai couples.18 Further, just as we obtained the conservation law for Tab by
considering the δgab arising from coordinate transformations in Eq. (3.34),
we can obtain the conservation law ∂i J i = 0 for the current from the gauge
invariance of the action A. Under a gauge transformation, δAi = ∂i α and
δS = 0. Using these in Eq. (3.54) and doing an integration by parts, we get

0 = ∫ d⁴x J^i ∂_iα = ∫ d⁴x [∂_i(J^iα) − α ∂_iJ^i]    (3.55)
As usual, the surface contribution resulting from the first term can be made
to vanish by a suitable choice of α, thereby leading to the conservation law
∂i J i = 0.
Using the specific form of the action in Eq. (3.49), and varying Ai , we
find the current to be
J_i = −iq [φ(D_iφ)* − φ*(D_iφ)] = −iq [φ∂_iφ* − φ*∂_iφ] − 2q²|φ|²A_i    (3.56)

which reduces to the current obtained earlier in Eq. (3.43) (except for an overall
factor) when A_i = 0. In the presence of A_i, we get an extra term −2q²|φ|²A_i
which might appear a bit strange. However, it is not only needed for gauge
invariance but also plays a crucial role in superconductivity and the Higgs
phenomenon.
There is another, equivalent, way of identifying the current in Eq. (3.56).
The action in Eq. (3.49) does not change under the global transformation
φ → e−iqα φ, φ∗ → eiqα φ∗ even if we do not change Ai . For this transformation, δφ/δα = −iqφ and δφ∗ /δα = iqφ∗ . Therefore, Eq. (3.22) tells us
that there is a conserved current given by
J_m = Σ_N [∂L/∂(∂^mφ_N)] (δφ_N/δα) = −iq(φ∂_mφ* − φ*∂_mφ) − 2q²A_m φ*φ    (3.57)

19 That is, the invariance of the Lagrangian under a global transformation leads to a conserved current.
When we gauge the global transformation, making it local, the resulting
gauge field couples to this conserved current.
which matches with Eq. (3.56).
More generally, consider any Lagrangian L(φ, ∂φ) that remains invariant when you change the dynamical variables by δφ ≡ ε f(φ, ∂φ), where ε is
an infinitesimal constant parameter. Given this symmetry, let us ask what
happens if we make ε a function of coordinates, ε → ε(x), and look at the
form of δL for δφ → ε(x) f(φ, ∂φ). Since δL = 0 for constant ε, δL must now
scale linearly with ∂_aε. Therefore δL must have the form
δL = J^a ∂_aε for some J^a. Let us now consider these variations around an
extremum of the action; i.e., when the equations of motion hold. Then δS
has to be zero for any variation, including the one, δφ → ε(x) f(φ, ∂φ), we
are studying. Writing J^a ∂_aε = ∂_a(J^aε) − ε ∂_aJ^a, and noting that the first
term, being a total divergence, does not contribute to δS, we find that
δS = 0 implies ∂_aJ^a = 0. So, whenever you upgrade a global symmetry
(constant ε) to a local symmetry (ε dependent on x), you get a current J^a
which will be conserved on-shell, i.e., when the equations of motion hold.19
Finally, you need to be aware of the following point which sometimes
causes a bit of confusion. The expanded form of the Lagrangian is now
given by

L = (D_mφ)* D^mφ − m²φ*φ
  = ∂_mφ* ∂^mφ + iqA_m(φ∂^mφ* − φ*∂^mφ) + q²|φ|² A_mA^m − m²φ*φ    (3.58)
in which there is a coupling term of the form −A^mK_m, where K_m = J_m +
q²|φ|²A_m. So the current K_m we identify from the coupling term −A^mK_m
in the Lagrangian is not quite the genuine conserved current J_m which acts
as the source for the vector field. The A_m-independent parts of K_m and
J_m are identical, but the terms involving A_m differ by a factor of 2. This is
because the Lagrangian in Eq. (3.58) has a term quadratic in A_m which,
under variation [used to define J_m through Eq. (3.54)], will give an extra
factor of 2. The current that is gauge invariant and conserved is J_m and not
K_m.
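The factor-of-2 bookkeeping can be made explicit with plain symbols (our sketch; derivatives of φ are replaced by inert symbols since only the A-dependence matters here):

```python
# Hedged sketch: for one component, the A-dependent part of Eq. (3.58) is
# L_A = i q A (phi dphic - phic dphi) + q^2 |phi|^2 A^2.  Varying it gives
# J = -dL_A/dA (cf. Eq. 3.54), while the coupling reads -A*K with
# K = J + q^2 |phi|^2 A: the A-terms of J and K differ by a factor of 2.
import sympy as sp

A, q, phi, phic, dphi, dphic = sp.symbols('A q phi phic dphi dphic')

L_A = sp.I * q * A * (phi * dphic - phic * dphi) + q**2 * phi * phic * A**2

J = -sp.diff(L_A, A)                       # source current, as in Eq. (3.56)
K = J + q**2 * phi * phic * A              # current read off from -A*K

check1 = sp.simplify(J - (-sp.I * q * (phi * dphic - phic * dphi)
                          - 2 * q**2 * phi * phic * A))
check2 = sp.simplify(L_A - (-A * K))
print(check1, check2)
```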
3.1.6 Electromagnetic Field
The action resulting from Eq. (3.58) contains the scalar field, its derivatives
and the vector field but not the derivatives of the vector field. This cannot
be the whole story because an external field like Ai (x) can exchange energy
and momentum with the scalar field φ (thereby violating conservation of
energy etc.) without any dynamics for Ai itself. What we lack is a term
in the action containing the derivatives of Ai in order to complete the
dynamics. Only then will we have a closed system, with conserved energy,
momentum, etc. with two interacting fields φ and Ai . We will address this
issue next.
The term in the action leading to the dynamics of the field Ai will be expressible as an integral over the four-volume d4 x of some scalar Lagrangian
(density) L = L(Ai , ∂j Ai ), which could be a function of Ai and its first
derivative.20 We have seen earlier that the equations of motion for the
scalar field and the coupling of Ai and φ respect the gauge transformation:
A_i′ = A_i + ∂_if. Therefore, it makes sense to demand that the action for
the field should also be invariant under the gauge transformation. Clearly,
the “kinetic energy” term for A_i must have the structure M^{ijkl} ∂_iA_j ∂_kA_l,
where M^{ijkl} is a suitable fourth-rank tensor which has to be built out of the
only two covariant tensors available: η^{ij} and the completely antisymmetric21 tensor
in 4 dimensions, ε^{ijkl}. One obvious choice, of course, is M^{ijkl} = ε^{ijkl}. To
determine the terms that can be constructed from η^{ij}, we note that there
are essentially three different ways of pairing the indices in ∂_iA_j ∂_kA_l with
a product of two η's. One can contract: (1) i with j and k with l; (2) i
with k and j with l; (3) i with l and j with k. Adding up all these possible
terms with arbitrary coefficients leads to a Lagrangian of the form

L = c₁ ε^{ijkl} ∂_iA_j ∂_kA_l + c₂ (∂_iA^i)² + c₃ (∂_iA_j ∂^iA^j) + c₄ (∂_iA_j ∂^jA^i)    (3.59)
It is convenient at this stage to express ∂i Aj as (1/2)[Fij + Sij ] where Sij is
the symmetric tensor Sij ≡ ∂i Aj +∂j Ai which complements the information
contained in the antisymmetric part Fij = ∂i Aj − ∂j Ai . We will call Fij
the electromagnetic field tensor, which is obviously gauge invariant.22 The
last three terms in Eq. (3.59) can be expressed in terms of these two tensors
and, using the fact that Fij S ij = 0, one can easily work out the Lagrangian
20 In the case of electromagnetism, experiments show that electromagnetic
fields obey the principle of superposition, viz. that the field due to two
independently specified currents is the
sum of the fields produced by each of
them in the absence of the other. For
this to be true, the differential equations governing the dynamics have to
be linear in the field; alternatively, the
Lagrangian can be at most quadratic
in the field variable. We will assume
this is the case in determining the form
of the action.
21 In case you haven't seen this before, revise the relevant part of tensor analysis. We define ε^{ijkl} as being
completely antisymmetric in all indices
with ε⁰¹²³ = 1 (so that ε₀₁₂₃ = −1) in
Cartesian coordinates. You should be
able to verify that such a structure is
Lorentz invariant.
22 Notation: We take A^i = (A⁰, A) =
(φ, A) with the Cartesian components
of A denoted by a superscript as A^α.
Then A_i = (φ, −A) and F₀α = ∂₀A_α −
∂_αA₀. This corresponds to the relation
E = −Ȧ − ∇φ.
to be of the form

4L = 4c₁ ε^{ijkl}∂_iA_j∂_kA_l + c₂(S^k_k)² + c₃(F_{ik}F^{ik} + S_{ik}S^{ik}) + c₄(S_{ik}S^{ik} − F_{ik}F^{ik})
   = 4c₁ ε^{ijkl}∂_iA_j∂_kA_l + c₂(S^k_k)² + (c₃ − c₄)F_{ik}F^{ik} + (c₃ + c₄)S_{ik}S^{ik}    (3.60)
In the first term with c₁, we can replace ∂_iA_j∂_kA_l by F_{ij}F_{kl} since the ε^{ijkl}
assures that only the antisymmetric part contributes. Therefore, this term
is clearly gauge invariant. But we can also write this term as

ε^{ijkl}∂_iA_j∂_kA_l = ∂_i[ε^{ijkl}A_j∂_kA_l] − ε^{ijkl}A_j∂_i∂_kA_l = ∂_i[ε^{ijkl}A_j∂_kA_l]    (3.61)
23 As an aside, we mention that the
integrand in Eq. (3.62) is essentially
A · B; it is usually called the Chern-Simons term.
where the second equality arises from the fact that ∂i ∂k Al is symmetric
in i and k but ijkl is completely antisymmetric, making the contraction
vanish. The surviving term in Eq. (3.61) is a four-divergence which —
when integrated over all space, with the usual assumption that all fields
vanish at spatial infinity — will contribute only at the two boundaries at
t = (t1 , t2 ). So we need to deal23 with the surface terms of the kind:
∫ d⁴x ∂_i[ε^{ijkl}A_j∂_kA_l] = ∫_{t=const} d³x [ε^{0jkl}A_j∂_kA_l] = ∫_{t=const} d³x [ε^{0αβμ}A_α∂_βA_μ]    (3.62)
24 It is important to realize that only
the spatial derivatives of the field variables remain fixed at the t = constant
boundary surface and not the time
derivatives. In this particular case, we
do not have surviving time derivative
terms on the boundary and hence ignoring this term is acceptable. When
this does not happen, as in the case
of action for gravity, you need to work
much harder, or be a little dishonest to
throw away boundary terms.
25 Another system of units uses a factor 1/16π instead of 1/4, differing by a 4π
factor. We will try to be consistent.
[Exercise 3.9: The electromagnetic Lagrangian F^{ik}F_{ik} is clearly invariant
under the gauge transformation A_i → A_i + ∂_iα. Show that this symmetry
does not lead to a new conserved current, and explain why not.]
In the variational principle we use to get the equations of motion, we will
consider variations with Aμ fixed at the t = t1 , t2 surfaces. If Aμ is fixed
everywhere on this surface, its spatial derivatives are also fixed and hence
the term in Eq. (3.62) will not vary. Hence this term does not make a
contribution.24
Among the remaining three terms in Eq. (3.60), the term with (c3 − c4 )
also remains invariant under a gauge transformation but the other two
terms do not. Hence, if we want the Lagrangian to be gauge invariant we
must have c3 = −c4 = a1 , say, and c2 = 0. Thus, only the scalar Fik F ik
survives as a possible choice for the Lagrangian of the electromagnetic field
and the action will be proportional to the integral of this term over d4 x. It
is conventional to write this part of the action as
S_f = −(1/4) ∫ F_{ik}F^{ik} d⁴x = (1/2) ∫ (E² − B²) d⁴x    (3.63)
The magnitude of the constant in front is arbitrary and merely decides
the units used for measuring the electromagnetic field. We have taken this
prefactor to be a dimensionless numerical factor (1/4), thereby making the
field A_i have the dimensions of inverse length in natural units25. (From
D_i = ∂_i + iqA_i, we see that q is dimensionless in natural units. In normal
units, q stands for q/(ℏc)^{1/2}.) The sign is chosen so that the term (∂A/∂t)²
has a positive coefficient in Sf . This is needed to ensure that the energy
of the plane wave solutions should be positive. The second equality in
Eq. (3.63) allows us to identify the Lagrangian for the field as the integral
over d3 x of the quantity L ≡ (1/2)(E 2 − B 2 ). This is the action for the
electromagnetic field which should be familiar to you from a course on
electrodynamics.
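The second equality in Eq. (3.63) is a pointwise identity that can be checked mechanically (our sketch; conventions as in footnote 22, signature (+,−,−,−)):

```python
# Hedged sketch: with A_i = (phi, -A) and F_ij = d_i(A_j) - d_j(A_i), verify
# -(1/4) F_ik F^ik = (1/2)(E^2 - B^2), where E = -dA/dt - grad(phi), B = curl A.
import sympy as sp

t, x, y, z = sp.symbols('t x y z')
coords = [t, x, y, z]
eta = sp.diag(1, -1, -1, -1)

phi = sp.Function('phi')(t, x, y, z)
Ax, Ay, Az = [sp.Function(n)(t, x, y, z) for n in ('Ax', 'Ay', 'Az')]
A_cov = [phi, -Ax, -Ay, -Az]                 # covariant components A_i

F = sp.Matrix(4, 4, lambda i, j:
              sp.diff(A_cov[j], coords[i]) - sp.diff(A_cov[i], coords[j]))

F2 = sum(eta[i, k] * eta[j, l] * F[i, j] * F[k, l]
         for i in range(4) for j in range(4)
         for k in range(4) for l in range(4))

E = [-sp.diff(Ac, t) - sp.diff(phi, c)
     for Ac, c in zip((Ax, Ay, Az), (x, y, z))]
B = [sp.diff(Az, y) - sp.diff(Ay, z),
     sp.diff(Ax, z) - sp.diff(Az, x),
     sp.diff(Ay, x) - sp.diff(Ax, y)]

delta = sp.expand(-F2 / 4 - sp.Rational(1, 2) * (sum(e**2 for e in E)
                                                 - sum(b**2 for b in B)))
print(delta)
```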
The above analysis simplifies significantly, if we work with the action
expressed in the Fourier space in terms of the Fourier transform of Aj (x)
which we will denote by Aj (k). The partial differentiation with respect
to coordinates becomes multiplication by ki in Fourier space. The most
general quadratic action in Fourier space must have the form M ij (k, η)Ai Aj
where M ij is built from k m and η ab . Since the Lagrangian is quadratic in
first derivatives of Ai (x), the expression M ij (k, η)Ai Aj must be quadratic
in ki . Hence, Mij Ai Aj must have the form [αki kj + (βk 2 + γ)ηij ]Ai Aj ,
where α, β, and γ are constants. The gauge transformation has the form
Aj (k) → Aj (k) + kj f (k) in the Fourier space. Demanding that Mij Ai Aj
should be invariant under such a transformation — except for the addition
of a term that is independent of Ai (k) — we find that α = −β, γ = 0.
Therefore, the action in the Fourier space will be

S_f ∝ ∫ [d⁴k/(2π)⁴] A_i [k^ik^j − k²η^{ij}] A_j ∝ ∫ [d⁴k/(2π)⁴] [k_jA_i − k_iA_j]²    (3.64)

which reduces to Eq. (3.63) in real space.26
The field equations for the gauge field can now be obtained by varying
L = −(1/4)F² + L_m(φ, A_m) with respect to A_m. Using Eq. (3.54) and
ignoring the total divergence term in

−(1/4)δ(F_{lm}F^{lm}) = −(1/2)F^{lm}(2∂_lδA_m) = (∂^lF_{lm})δA^m − ∂^l(F_{lm}δA^m)    (3.65)
we get the field equation to be

∂^lF_{lm} = J_m    (3.66)
The antisymmetry of F lm again ensures the conservation of J m . In the
case of the source being the complex scalar field, we get

∂_lF^{lm} = J^m = −iq [φ∂^mφ* − φ*∂^mφ] − 2q²|φ|²A^m    (3.67)
Incidentally, you are just one step away from discovering a more general class of (non-Abelian) gauge fields by proceeding along similar lines.27
Consider an N-component field φ_A(x), with A = 1, 2, ..., N, which is a vector
in some N-dimensional linear vector space; a “rotation” in the internal
symmetry space is generated through a matrix transformation of the type
φ′ = Uφ;    U = exp(−i τ_A α^A)    (3.68)
26 Originally, the complex scalar field had a single parameter m, which we interpreted as its mass through the relation ω² = p² + m². Writing the phase
of the transformation as e^{iqα} and introducing a q, we have attributed a second parameter to the complex scalar
field. So, can we have different complex scalar fields with arbitrary values
for (i) mass m, (ii) charge q? The answer to (i) is “yes” while the answer to
(ii) is “no”. The electric charges of all
the particles which exist in an unbound
state (which excludes quarks) seem to
be in multiples of a quantum of charge.
Nobody really knows why; so we don't
usually talk about it.
27 We will not study non-Abelian
gauge theories in this book except for
this brief discussion.
where φ is treated as a column vector, U is a unitary matrix (i.e., U†U = 1),
α^A is a set of N parameters and the τ_A are N matrices which satisfy the
commutation rules [τ_A, τ_B] = iC^J_{AB} τ_J, where the C^J_{AB} are constants. In the
standard context, one considers a theory based on a gauge group, say,
SU(N), in which case the C^J_{AB} will be the structure constants of the group.
Consider now a field theory for φA based on a Lagrangian of the form
L = −(1/2) [∂_iφ† ∂^iφ + μ²φ†φ]    (3.69)
where φ† is the Hermitian conjugate. This Lagrangian is clearly invariant
under the transformations φ′ = Uφ with constant parameters α^A.
Consider now the “local rotations” with α^A = α^A(x) depending on the
spacetime coordinates. You can easily see that the above Lagrangian is no
longer invariant under such transformations. As before, introduce an N-component gauge field A^K_i(x), with i = 0, 1, 2, 3 being a spacetime index

[Exercise 3.10: Prove this.]
and K = 1, 2, ..., N being an internal space index. Let A_i ≡ τ_J A^J_i
denote a set of matrices corresponding to the gauge field. You can then
show that our Lagrangian can be made invariant under the local rotations
if: (i) we replace partial derivatives by gauge covariant derivatives

D_i = ∂_i + iA_i    (3.70)
where the first term is multiplied by the unit matrix and (ii) we assume
that the gauge field transforms according to the rule:
A_i′ = U A_i U⁻¹ − i U ∂_iU⁻¹    (3.71)
This is a natural generalization of the Abelian gauge field (which is a fancy
name for the electromagnetic field). Continuing as before, you can now
introduce a field tensor by

F_{ik} = ∂_iA_k − ∂_kA_i + i[A_i, A_k]    (3.72)
in which each term is interpreted as an N × N matrix, and which is gauge
covariant. That is, when A_i changes as in Eq. (3.71), the F_{ik} changes to
U F_{ik} U⁻¹. This allows us to construct a Lagrangian for the gauge field
which is gauge invariant:

L = −(1/4) Tr F_{ik}F^{ik}    (3.73)

[Exercise 3.11: Prove these claims about Eq. (3.70) – Eq. (3.73) and determine the field equations of the theory.]
which is the generalization of the electromagnetic Lagrangian.
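The claim that F_{ik} → U F_{ik} U⁻¹ can be brute-force checked (our sketch; a simple unipotent U(t, x) and generic 2×2 matrices of functions for the gauge potentials — the identity holds for any invertible U):

```python
# Hedged sketch: verify F'_tx = U F_tx U^{-1} when
# A'_i = U A_i U^{-1} - i U d_i(U^{-1}), using a unipotent U and
# generic 2x2 matrices of functions for A_t, A_x.
import sympy as sp

t, x = sp.symbols('t x')
u = sp.Function('u')(t, x)
U = sp.Matrix([[1, u], [0, 1]])
Uinv = sp.Matrix([[1, -u], [0, 1]])          # exact inverse of U

def funcmat(name):
    return sp.Matrix(2, 2, lambda i, j: sp.Function(f'{name}{i}{j}')(t, x))

At, Ax = funcmat('a'), funcmat('b')

def gauge(A, c):                             # A' = U A U^-1 - i U d_c(U^-1)
    return U * A * Uinv - sp.I * U * sp.diff(Uinv, c)

def field(At, Ax):                           # F_tx = d_t A_x - d_x A_t + i[A_t, A_x]
    return sp.diff(Ax, t) - sp.diff(At, x) + sp.I * (At * Ax - Ax * At)

F = field(At, Ax)
Fp = field(gauge(At, t), gauge(Ax, x))

delta = (Fp - U * F * Uinv).applyfunc(sp.expand)
print(delta)
```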
Let us get back to the electromagnetic field. We can compute the
canonical momenta π i(j) corresponding to Aj (where the bracket around
j denotes which component of the field we are considering) and try to
determine the energy momentum tensor by the procedure laid down in
Eq. (3.18). Since
π^{i(j)} = ∂L/∂(∂_iA_j) = −(1/4)F^{ij} × 4 = −F^{ij}    (3.74)
we get

T_{ab} = −η_{ab}L + π_{a(j)} ∂_bA^j = −η_{ab}L − F_{aj} ∂_bA^j    (3.75)
28 There is a subtlety here. Because
A^i = A_j g^{ij} etc., we cannot assume
that both A_j and A^j are independent of the
metric while varying the action with
respect to g_{ab}. Here we assume that the
A_i, and consequently F_{ij}, are independent of the metric. This is related, in
turn, to the fact that A is actually a
one-form and F is a two-form; but if
you haven't heard of such things, just
accept the result.
[Exercise 3.12: Compute T₀₀, T₀α and T_αβ in terms of the electric and
magnetic fields and identify the resulting expressions with more familiar constructs.]
which is not symmetric. This is the difficulty we mentioned earlier about
using Eq. (3.18). On the other hand, we can easily determine28 the energy
momentum tensor by introducing a g^{ab} into the electromagnetic action
and varying with respect to it. Since F 2 = g ab g cd Fac Fbd , we now find, on
using Eq. (3.35) with V = Uab = 0, Wabcd = −(1/4)Fac Fbd that:
T_{ab} = −η_{ab}L − F_{ak}F_b{}^k    (3.76)
which is clearly symmetric and gauge invariant.
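A partial, machine-checked version of the claims above (ours; it also anticipates part of Exercise 3.12): the tensor of Eq. (3.76) is traceless and its 00-component is the familiar energy density (E² + B²)/2.

```python
# Hedged sketch: build T_ab = -eta_ab L - F_ak F_b^k  (Eq. 3.76) with
# L = -(1/4) F^2 and check T_00 = (E^2 + B^2)/2 and eta^{ab} T_ab = 0.
# Conventions as in footnote 22, signature (+,-,-,-).
import sympy as sp

t, x, y, z = sp.symbols('t x y z')
coords = [t, x, y, z]
eta = sp.diag(1, -1, -1, -1)

phi = sp.Function('phi')(t, x, y, z)
Ax, Ay, Az = [sp.Function(n)(t, x, y, z) for n in ('Ax', 'Ay', 'Az')]
A_cov = [phi, -Ax, -Ay, -Az]

F = sp.Matrix(4, 4, lambda i, j:
              sp.diff(A_cov[j], coords[i]) - sp.diff(A_cov[i], coords[j]))
Fup = eta * F * eta                          # F^{ij}
L = -sp.Rational(1, 4) * sum(F[i, j] * Fup[i, j]
                             for i in range(4) for j in range(4))

T = sp.Matrix(4, 4, lambda a, b: -eta[a, b] * L
              - sum(F[a, k] * eta[k, l] * F[b, l]
                    for k in range(4) for l in range(4)))

E = [-sp.diff(Ac, t) - sp.diff(phi, c)
     for Ac, c in zip((Ax, Ay, Az), (x, y, z))]
B = [sp.diff(Az, y) - sp.diff(Ay, z),
     sp.diff(Ax, z) - sp.diff(Az, x),
     sp.diff(Ay, x) - sp.diff(Ax, y)]

energy = sp.expand(T[0, 0] - sp.Rational(1, 2) * (sum(e**2 for e in E)
                                                  + sum(b**2 for b in B)))
trace = sp.expand(sum(eta[a, b] * T[a, b]
                      for a in range(4) for b in range(4)))
print(energy, trace)
```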
The fact that electromagnetic interactions are gauge invariant has some
important consequences when we attempt to quantize it. As a preamble to
this topic which we will take up in Sect. 3.6 we will describe some of these
features at the classical level where they are easier to understand.
To begin with, gauge invariance demands that Ai should be coupled
to a current J i which is conserved. This is because we want to keep the
coupling term to be gauge invariant under the transformation Ai → Ai +
∂i α. We have already seen how the current J i defined through Eq. (3.54)
is conserved when the action is gauge invariant.
3.1. Classical Field Theory
85
Second, note that when the Lagrangian in Eq. (3.63) is expressed in
terms of the vector potential, we get

L = (1/2) [(∇φ + Ȧ)² − (∇ × A)²]    (3.77)

which shows that the momentum associated with A is (∂L/∂Ȧ) = −E
while the canonical momentum associated with φ vanishes.29 This, in
turn, implies that the equation of motion obtained by varying φ in the
electromagnetic Lagrangian will have the form (δL/δφ) = 0 which cannot
contain any second derivatives with respect to time. Such an equation
(called a constraint equation) puts a constraint on the initial data which
we give for evolving the system forward in time. The existence of such
constraint equations usually creates mathematical difficulties in quantizing
the system which we will come across when we study the quantum theory
of electromagnetic field.
Third, gauge invariance tells you that a description in terms of A_i is
actually redundant, in the sense that the A_i can be modified by gauge transformations without affecting the physical consequences. This raises the
question as to what are the true degrees of freedom contained in the set of
four functions Ai . This question can be answered differently depending on
what exactly we want to do with the electromagnetic field. For our purpose
we are interested in determining the propagating degrees of freedom contained in the set Ai . These are the degrees of freedom which propagate in
spacetime at the speed of light in the case of electromagnetic interactions.
It turns out that these degrees of freedom are contained in the transverse
part of A; that is, if we write30 A = A⊥ + A∥ where ∇ · A⊥ = 0 = ∇ × A∥,
then the propagating degrees of freedom are contained in A⊥. Hence a natural gauge choice which isolates the propagating degrees of freedom will be
the two conditions A⁰ = 0 and A∥ = 0. The second condition is, of course,
equivalent to imposing ∇ · A = 0. (This is called the radiation gauge and
we will use it to quantize the electromagnetic field in Sect. 3.6.)
The simplest way to see that such a gauge condition can indeed be
imposed is to proceed through two successive gauge transformations. Let
us first consider a gauge transformation from A_i to Ā_i ≡ A_i + ∂_iα such
that Ā₀ = 0. This is easily achieved by choosing α to be

α(t, x) = − ∫ dt A₀(t, x) + F(x)    (3.78)
Such a transformation will also change the original A to some other Ā(t, x).
This function, however, satisfies the condition

(∂/∂t)(∇ · Ā) = ∇ · (∂Ā/∂t) = −∇ · E = 0    (3.79)

since we are dealing with source-free electromagnetic fields. This result
shows that ∇ · Ā is independent of time. We now make another gauge
transformation with a gauge function f, changing Ā to A′ = Ā + ∇f such
that ∇ · A′ = 0. This leads to the condition ∇²f = −∇ · Ā, with the
solution:
f(x) = (1/4π) ∫ d³y [(∇ · Ā)(y) / |x − y|]    (3.80)
Since ∇ · Ā is independent of t, the gauge function f is also independent
of t and hence this transformation does not change the value Ā₀ = 0 which
29 One can also see this from Eq. (3.74); the momentum conjugate
to A_j is π^{0(j)} = −F^{0j}, which is clearly
zero for j = 0.
30 This separation is possible for any
vector field V^α(t, x^μ). This “theorem” is rather trivial because the
spatial Fourier transform V^α(t, k^μ)
can always be written as a sum
of a term parallel to k [given
by (k^αk^β/k²)V_β(t, k^μ)] and a term
perpendicular to k [given by the
rest, V^α(t, k^μ) − (k^αk^β/k²)V_β(t, k^μ)].
Obviously, these two terms will
Fourier transform back to curl-free and
divergence-free vectors.
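The decomposition of footnote 30 is easy to demonstrate numerically (our construction) for a periodic field on a grid:

```python
# Numeric sketch: split a random periodic vector field into transverse and
# longitudinal parts by projecting in Fourier space, then confirm
# div(V_perp) = 0 and curl(V_par) = 0 spectrally.
import numpy as np

n = 16
rng = np.random.default_rng(1)
V = rng.standard_normal((3, n, n, n))            # random real vector field

k1 = 2 * np.pi * np.fft.fftfreq(n)
KX, KY, KZ = np.meshgrid(k1, k1, k1, indexing='ij')
K = np.stack([KX, KY, KZ])
k2 = (K**2).sum(axis=0)
k2[0, 0, 0] = 1.0                                # avoid 0/0; zero mode has K = 0

Vk = np.fft.fftn(V, axes=(1, 2, 3))
KdotV = (K * Vk).sum(axis=0)
VLk = K * KdotV / k2                             # longitudinal (parallel to k)
VTk = Vk - VLk                                   # transverse (perpendicular)

div_T = abs((K * VTk).sum(axis=0)).max()         # k . V_perp, should vanish
curl_L = max(abs(K[(i + 1) % 3] * VLk[(i + 2) % 3]
                 - K[(i + 2) % 3] * VLk[(i + 1) % 3]).max() for i in range(3))
print(div_T, curl_L)
```

Both residuals are at the level of floating-point round-off, illustrating that the projection is exact mode by mode.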
we had already achieved. Thus with these two transformations, we have
successfully gone to a gauge with A0 = 0, ∇ · A = 0.
Of the two conditions which define the Coulomb gauge, viz., ∇ · A = 0
and φ = A⁰ = 0, the second condition cannot be imposed in the presence of
external sources because it will lead to trouble with the ∇ · E = 4πρ condition.
This Coulomb law, however, has no dynamical content because it has no
time derivative. Therefore one would expect that the Hamiltonian, for the
electromagnetic field interacting with the source, should decouple into one
involving the transverse components of A and the one involving the Coulomb
part. To see how this comes about, let us decompose the electric field into a
transverse (divergence-free) E⊥ and a longitudinal (curl-free) E∥ component,
given by E⊥ = −∂₀A, E∥ = −∇A₀ (with ∇ · A = 0). In the free part of
the electromagnetic Lagrangian, the E² term will pick up the squares of
these two parts; the cross term

E∥ · E⊥ = −(∇A₀) · E⊥ = −∇ · (A₀E⊥) + A₀ ∇ · E⊥ = −∇ · (A₀E⊥)    (3.81)
is a total divergence and can be discarded. The canonical momentum
corresponding to A (which, of course, is transverse since ∇ · A = 0) is given
by π⊥ = E⊥. Using this, you can compute the corresponding Hamiltonian
to be

H_em = (1/2)(E⊥² + B²) − (1/2)E∥² = H₀ + H∥    (3.82)
The part H∥ combines in a natural manner with the interaction Hamiltonian J_iA^i which couples the vector potential to the external current.
Writing

H = H₀ + H∥ + H_int = H₀ + H′_int    (3.83)

where

H′_int = H∥ + H_int = H∥ − L_int    (3.84)
we can express the Hamiltonian arising from integrating the interaction
term as

H′_int = ∫ d³x [−(1/2)∇A₀ · ∇A₀ + J₀A⁰ − J · A] = ∫ d³x [(1/2)A₀∇²A₀ + J₀A⁰ − J · A]    (3.85)
where we have done one integration by parts and discarded the surface
term. Using the Poisson equation ∇2 A0 = −J0 , the first two terms combine
to give the Coulomb interaction energy:
H_coul = (1/2) ∫ d³x J₀A⁰ = (1/2) ∫ d³x d³x′ J₀(x, t) [1/(4π|x − x′|)] J₀(x′, t)    (3.86)
31 In
general, you cannot use the equations of motion in the Hamiltonian to
change its form. Here we could do
this only because the equation ∇2 A0 =
−J0 is not a dynamical equation —
i.e., it has no time derivatives — but
a constraint equation which relates A0
at some time t to J0 at the same time
t. Since there are no propagating degrees of freedom, we can eliminate A0
from the Hamiltonian by this process.
leaving the remaining interaction term as31
Hint = − ∫ d³x J(x, t) · A(x, t)   (3.87)
which only couples A to the transverse part of J. Therefore, after removing
the non-dynamical interaction energy term Hcoul , we are essentially left
with the transverse part of (E 2 + B 2 ) and J · A which are the physical
degrees of freedom one would deal with in the quantum theory.
3.2 Aside: Spontaneous Symmetry Breaking
A real scalar field described by the action in Eq. (3.30) with V (φ) =
(1/2)m2 φ2 describes excitations with the dispersion relation ω 2 = k2 + m2
which can be thought of as relativistic free particles with mass m. We have
already seen that such a simple system has an equivalent description in
terms of G(x2 ; x1 ) obtained either from a path integral or from studying
the action of sources and sinks on the vacuum. The scalar field theory
comes alive only when we go beyond the free particles and introduce interactions, requiring a V (φ) which is more general than a simple quadratic.
In classical field theory one could have introduced any V (φ) (except
possibly for the condition that it should be bounded from below) but we
will see later that meaningful quantum field theories exist only for very
special choices of V (φ). One such example, of considerable importance,
is a V (φ) which has both quadratic and quartic terms with the structure
V (φ) = αφ2 +βφ4 . We definitely need β > 0 for the potential to be bounded
from below at large φ. But we now have the possibility of choosing α with
either sign. When α > 0 and β > 0, one can think of the theory as
describing particles with mass m = √(2α) which are interacting through the
potential βφ⁴. We will study this extensively in Chapter 4 as a prototype
of an interacting field theory.
But when α < 0 and β > 0, we are led to a completely different kind of
theory which exhibits one of the most beautiful and unifying phenomena in
physics called spontaneous symmetry breaking, albeit in the simplest form.
Obviously we cannot now think of the αφ2 term as describing the mass
of the scalar field since α has the wrong sign. In this case, by adding a
suitable constant to the potential, we can re-express it in the form
V(φ) = (λ/4) (φ² − v²)²   (3.88)
which is shown in Fig. 3.2. The potential has two degenerate32 minima at
φ = ±v and a maximum at φ = 0. This implies that the ground state with,
say, φ = +v will break the symmetry under φ → −φ present in the action.
Thus, physical phenomena arising in this system will not exhibit the full
symmetry of the underlying Hamiltonian because the ground state breaks
the symmetry; this situation is called spontaneous symmetry breaking.
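The claims about the double-well potential can be checked symbolically. The following is a minimal sketch in Python with sympy, using the λ/4 normalization of Eq. (3.88); the symbol names are ours:

```python
import sympy as sp

# Sketch: verify that V = (lam/4)*(phi^2 - v^2)^2 of Eq. (3.88) has degenerate
# minima at phi = ±v and a maximum at phi = 0.
phi = sp.symbols('phi', real=True)
v, lam = sp.symbols('v lam', positive=True)
V = sp.Rational(1, 4) * lam * (phi**2 - v**2)**2

crit = sp.solve(sp.diff(V, phi), phi)     # stationary points
print(sorted(crit, key=str))              # [-v, 0, v]

Vpp = sp.diff(V, phi, 2)                  # curvature at the stationary points
print(Vpp.subs(phi, 0))                   # -lam*v**2  (negative: maximum)
print(sp.expand(Vpp.subs(phi, v)))        # 2*lam*v**2 (positive: minimum)
```

The curvature 2λv² at the minimum is the squared mass of small oscillations about φ = v, consistent with mψ = v√(2λ) found below.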
Condensed matter physics is full of beautiful examples in which this
occurs. Consider, for example, a ferromagnet which exhibits non-zero magnetization M below a critical temperature Tc, thereby breaking rotational invariance. Observations show that the magnetization varies with
temperature in the form M ∝ (Tc − T)μ where 0 < μ < 1. Such a non-analytic dependence on (Tc − T) puzzled physicists until Ginzburg
and Landau came up with an elegant way of describing such phenomena.
They suggested that the effective potential energy V (M, T ) (more precisely
the free energy of the system) describing the magnetization has the form
V (M, T ) = a(T )M 2 + b(T )M 4 + .... with the conditions: (i) a(Tc ) = 0
while (ii) b(T ) is smooth and non-zero around Tc . Expanding a(T ) in a
Taylor series around Tc and writing a(T ) ≈ α(T − Tc ) with α > 0, one sees
that the nature of V (M, T ) changes drastically when we go from T > Tc
to T < Tc . In the high temperature phase, the minimum of the potential
is at M = 0 and the system exhibits no spontaneous magnetization. But
when T < Tc, the potential takes the shape in Fig. 3.2 and the minimum

Figure 3.2: The shape of the potential V(φ) which produces spontaneous symmetry breaking.
32 The behaviour of such a system is
quite different in quantum field theory compared to quantum mechanics.
In quantum mechanics the tunneling
probability between the two minima is
nonzero which breaks the degeneracy
and leads to a sensible ground state.
In quantum field theory, the tunneling
amplitude between the minima vanishes because an infinite number of
modes have to tunnel; mathematically,
the tunneling amplitude will be of the
form exp(−Γ) where Γ will involve an
integration over all space of a constant
factor, making the amplitude vanish.
Thus, in field theory, we have genuine
degeneracy of the two vacua.
33 As you can guess,
the interrelationship between field theory and
condensed matter physics is very
strong and the concepts in each clarify and enrich the understanding of the
other. The third term in Eq. (3.89)
is analogous to adding the external
source term J(x)φ(x) in the case of the
scalar field theory; you should be able
to see the parallels in what follows.
Chapter 3. From Fields to Particles
of the potential is now at M² = (α/2b)(Tc − T), thereby exhibiting the
non-analytic dependence M ∝ (Tc − T)μ with μ = 0.5.
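The minimization just described is a two-line symbolic computation; here is a sketch with sympy, using the α, b, T, Tc of the text:

```python
import sympy as sp

# Sketch: minimize the Landau free energy V = a(T) M^2 + b M^4 with
# a(T) = alpha*(T - Tc), recovering M^2 = (alpha/2b)(Tc - T) below Tc.
M = sp.symbols('M')
alpha, b, T, Tc = sp.symbols('alpha b T Tc', positive=True)

a = alpha * (T - Tc)
V = a * M**2 + b * M**4

# Nontrivial stationary points (drop M = 0)
sols = [s for s in sp.solve(sp.diff(V, M), M) if s != 0]
Msq = sp.simplify(sols[0]**2)
print(Msq)   # equals alpha*(Tc - T)/(2*b)
```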
In fact, this phenomenological model can also be used to describe the
response of the system to externally imposed magnetic field H, thereby
explaining the behaviour of the correlation length and magnetic susceptibility of the system. In this context, one could use a phenomenological
Hamiltonian of the form33
H = (1/2)(∂α M) · (∂α M) + V(M) − H · M   (3.89)
Taking V ≈ a(T )M 2 ≈ α(T − Tc )M 2 as the leading behaviour for T > Tc
and minimizing this Hamiltonian, we get the equation
−∇2 M + aM = H
(3.90)
which leads to the Yukawa potential solution for the induced magnetization
M(x) in terms of the applied magnetic field H(y) in the form

M(x) = ∫ d³y H(y) e^{−√a |x−y|} / (4π|x − y|)   (3.91)
Exercise 3.13: (a) Show that, in the
(1+1) spacetime, there exist static, finite energy field configurations φ(x)
which are solutions to the field equation with the double-well potential.
These solutions, with energy density
concentrated in a finite region of space,
are examples of solitons. (b) Show that
such solutions cannot exist in (1 + D)
spacetime with D ≥ 2. This is called
Derrick’s theorem. [Hint: Study the
scaling behaviour of the total kinetic
energy and the potential energy for a
static solution under x → λx and use
the fact that the total energy should be
a minimum with respect to λ at λ = 1
for any valid solution.]
The effect of the applied magnetic field is felt essentially over a distance
(called the correlation length) of the order of ξ = (1/√a) ∝ (T − Tc)^{−1/2}.
The correlation length is finite at T > Tc but diverges as T → Tc indicating
the development of long range order in the system. Further, for a constant
applied field H we can integrate over y in Eq. (3.91) and get the resulting
magnetization to be M = χH, where the magnetic susceptibility χ ∝ ξ 2 ∝
(T −Tc )−1 . Thus, the simple model of spontaneous symmetry breaking also
predicts that the magnetic susceptibility diverges as T → Tc , all thanks
to the behaviour a(T ) ∝ (T − Tc ) which, in turn, is the root cause of
spontaneous symmetry breaking.
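The χ ∝ ξ² statement follows from a single radial integral of the Yukawa kernel in Eq. (3.91); a quick symbolic check (sympy, with a > 0, i.e. T > Tc):

```python
import sympy as sp

# Sketch: for a constant applied field, integrating the Yukawa kernel
# e^{-sqrt(a) r}/(4 pi r) of Eq. (3.91) over all space gives chi = 1/a = xi^2.
r = sp.symbols('r', positive=True)
a = sp.symbols('a', positive=True)

kernel = sp.exp(-sp.sqrt(a) * r) / (4 * sp.pi * r)
# d^3y in spherical shells: 4 pi r^2 dr
chi = sp.integrate(kernel * 4 * sp.pi * r**2, (r, 0, sp.oo))
print(chi)   # 1/a
```

With a ∝ (T − Tc) this is exactly the divergence χ ∝ (T − Tc)^{−1} quoted above.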
Back to field theory. Since the ground state is at non-zero φ, it makes
sense to shift the field φ in order to study excitations around the true ground
state. If we take the ground state to be at φ = v and define ψ ≡ φ − v,
then the potential takes the form
V(φ) = (λ/4) (φ² − v²)² = λv²ψ² + (λ/4)ψ⁴ + λvψ³   (3.92)
We now see that the theory describes excitations with a well defined mass
mψ = v√(2λ) for the ψ field with a self-interaction described by the terms
(λ/4)ψ⁴ + λvψ³. This will provide a description of a sensible quantum field
theory in terms of the shifted field ψ. In this particular case, the situation
is fairly simple since there is only one degree of freedom to deal with in the
φ or ψ field.
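The shift φ = v + ψ in Eq. (3.92) is easy to verify with sympy; this sketch also reads off the mass term coefficient:

```python
import sympy as sp

# Sketch: shift phi = v + psi in V = (lam/4)(phi^2 - v^2)^2 and compare
# with Eq. (3.92): lam v^2 psi^2 + lam v psi^3 + (lam/4) psi^4.
psi = sp.symbols('psi', real=True)
v, lam = sp.symbols('v lam', positive=True)

V = sp.expand(sp.Rational(1, 4) * lam * ((v + psi)**2 - v**2)**2)
target = lam * v**2 * psi**2 + lam * v * psi**3 + sp.Rational(1, 4) * lam * psi**4
print(sp.simplify(V - target))   # 0

# Quadratic coefficient lam*v**2 = m_psi^2/2, i.e. m_psi = v*sqrt(2*lam)
print(V.coeff(psi, 2))           # lam*v**2
```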
Let us next consider a complex scalar field, having a potential with
quadratic and quartic terms, which exhibits spontaneous symmetry breaking. This happens if we replace the m2 |φ|2 term in Eq. (3.41) by
V(|φ|) = (λ/4) (|φ|² − v²)²   (3.93)

Figure 3.3: The shape of the potential V(φ) that leads to a zero mass particle.
The situation now is, however, qualitatively different from the case of the
real scalar field because of the existence of two degrees of freedom φ1 and φ2
in the complex scalar field φ ≡ φ1 + iφ2 . The potential now has a minimum
on the circle φ1² + φ2² = v² in the φ1 − φ2 plane (see Fig. 3.3). The φ field
can roll along the circle in the angular direction without changing the
potential energy while any radial oscillations about the minimum will cost
potential energy. Since the mass of the field gives the natural frequency
of oscillations, it follows that we will have one massless degree of freedom
(corresponding to the mode that rolls along the valley of the potential) and
one massive degree of freedom (corresponding to the radial oscillations).
This is easily verified by writing φ(x) = ρ(x) exp[iθ(x)/v] and working
with the two fields ρ(x) and θ(x). The kinetic term in the action now
becomes
∂a φ ∂ a φ∗ = ∂a ρ ∂ a ρ + (ρ²/v²) ∂a θ ∂ a θ   (3.94)
while the potential energy term becomes, in terms of the field ψ ≡ (ρ − v)
defined with respect to the minimum,
V = (λ/4) ψ²(2v + ψ)² = λv²ψ² + (λ/4)ψ⁴ + λvψ³ ≡ λv²ψ² + U(ψ)   (3.95)
where the first term is the mass term for ψ and the rest represent the
self-interaction of the field. Putting them together, we can express the
Lagrangian in the form
L = ∂a ψ ∂ a ψ − mψ² ψ² − U(ψ) + ∂a θ ∂ a θ + (2ψ/v + ψ²/v²) ∂a θ ∂ a θ   (3.96)
with mψ = v√λ. This form of the Lagrangian shows that our system
decomposes into one ψ field with mass mψ (and self-interaction described
by U(ψ)), one field θ with zero mass, and the remaining terms describing
the interaction between these two fields. The emergence of a zero mass
the interaction between these two fields. The emergence of a zero mass
field in such systems with spontaneous symmetry breaking is a universal
feature and we will provide a general proof of this result when we discuss
the quantum structure of this theory, at the end of Sect. 3.5.
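The split in Eq. (3.94) can be checked by brute force. In the sketch below, the derivatives of the (real) fields along any one direction are represented by the hypothetical symbols drho and dtheta, the chain rule is carried out by hand, and sympy confirms the algebra:

```python
import sympy as sp

# Sketch: with phi = rho * exp(i theta / v), check that
# |d phi|^2 = (d rho)^2 + (rho^2/v^2)(d theta)^2 as in Eq. (3.94).
rho, theta, drho, dtheta = sp.symbols('rho theta drho dtheta', real=True)
v = sp.symbols('v', positive=True)

# Chain rule applied to phi = rho * exp(i theta / v)
dphi = (drho + sp.I * rho * dtheta / v) * sp.exp(sp.I * theta / v)
kin = sp.expand(dphi * sp.conjugate(dphi))

expected = drho**2 + (rho**2 / v**2) * dtheta**2
print(sp.simplify(kin - expected))   # 0
```

The phase factor drops out of |∂φ|², which is the algebraic reason why θ enters the Lagrangian only through its derivatives, i.e. why it is massless.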
Finally, let us consider what happens when we couple the charged scalar
field to the gauge field. In the absence of the gauge field, when we wrote
φ(x) as ρ(x) exp[iθ(x)/v] we found that θ(x) became a massless scalar
field. But when we have a gauge field present, we can eliminate the θ(x) by
a gauge transformation! This allows us to go from the 3 real fields (ρ, θ, Aj )
to two new (real) fields (ρ, 0, Bj ) with Bj ≡ Aj + (1/qv)∂j θ. Since Dj φ is
gauge invariant, it now becomes ∂j ρ + iqBj ρ; similarly, since Fij is gauge
invariant it can be expressed in terms of Bj . With these modifications, the
relevant Lagrangian will now become:
L = −(1/4)F² + |Dφ|² − (λ/4)[|φ|² − v²]²
  = −(1/4)F² + |∂m ρ + iqBm ρ|² − (λ/4)(ρ² − v²)²
  = −(1/4)F² + (∂ρ)² + q²ρ²B² − (λ/4)(ρ² − v²)²   (3.97)
As usual we now introduce a shifted field ψ by ψ = ρ − v to get:
L = [−(1/4)F² + q²v²Bj B j] + [(∂ψ)² − λv²ψ² − U(ψ)] + 2vq²ψB² + q²ψ²B²   (3.98)
34 The phenomenon, which is of considerable importance in condensed
matter physics and particle physics,
was discovered, in one form or another, by (in alphabetical order) Anderson, Brout, Englert, Guralnik, Hagen, Higgs and Kibble; hence it is usually called the Higgs mechanism.
The terms in the first square bracket describe a massive vector field Bj with
mass mB = √2 qv; the second square bracket represents a massive scalar
field ψ with mass mψ = v√λ and a self-interaction described by U(ψ) as
before. The remaining terms represent interactions between the two fields.
The massless mode has disappeared! Rather, the one degree of freedom in
θ and two degrees of freedom in the massless gauge field Aj have combined
into a massive vector field Bj with three degrees of freedom.34
An important example in condensed matter physics in which the above
result plays a role is superconductivity. Under certain circumstances, one
can describe the electron pairs (called Cooper pairs) in terms of a bosonic
complex scalar field φ (carrying a charge q = 2e = −2|e|) with a nonzero
vacuum expectation value. In that case, assuming φ does not vary significantly over space, the current in the right hand side of Eq. (3.67) will have
spatial components J = −2q 2 v 2 A. The Maxwell equation ∇ × B = J =
−2q 2 v 2 A reduces, on taking a curl and using ∇ · B = 0, to the form
∇²B = 2q²v²B ≡ (1/ℓ²) B   (3.99)

with solutions which decay exponentially with a characteristic length scale
ℓ = 1/(√2 qv), which is essentially the inverse of the mass scale of the vector boson. This
fact underlies the so called Meissner effect in superconductivity, which
leads to a superconductor expelling a magnetic field from its interior.
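The exponential screening can also be seen numerically. A minimal finite-difference sketch of B″ = B/ℓ² in one dimension follows; the values of q and v are illustrative, not taken from the text:

```python
import numpy as np

# Sketch: solve B'' = B/ell^2 on 0 < x < L with B = 1 at the surface
# and B ≈ 0 deep inside, then read off the decay length from the log-slope.
q, v = 0.8, 1.5                             # illustrative values
ell = 1.0 / (np.sqrt(2.0) * q * v)          # penetration depth of Eq. (3.99)

N, L = 1000, 8.0 * ell
x = np.linspace(0.0, L, N)
h = x[1] - x[0]

A = np.zeros((N, N))
rhs = np.zeros(N)
A[0, 0] = A[-1, -1] = 1.0                   # boundary rows: B(0)=1, B(L)=0
rhs[0] = 1.0
for i in range(1, N - 1):
    A[i, i - 1] = A[i, i + 1] = 1.0 / h**2
    A[i, i] = -2.0 / h**2 - 1.0 / ell**2
B = np.linalg.solve(A, rhs)

i1, i2 = N // 5, N // 2                     # fit away from both boundaries
decay = -(x[i2] - x[i1]) / np.log(B[i2] / B[i1])
print(round(decay / ell, 2))                # 1.0
```

The fitted decay length reproduces ℓ, i.e. the field penetrates only to a depth set by the inverse mass of the vector boson.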
3.3 Quantizing the Real Scalar Field

35 Recall that, in the Heisenberg picture in which we are working, these commutation relations hold only when the operators are evaluated at the same time; [qα(t), qα(t′)] need not vanish when t ≠ t′.

Exercise 3.14: How do you know that the commutation rules are consistent with the Hamiltonian evolution? That is, if you impose the commutation rules at t = t0 and evolve the system using a Hamiltonian, you better make sure the commutation rules hold at t > t0. Is this a worry?
Given the action functional in classical mechanics, one can obtain the quantum theory by elevating the coordinates and the canonical momenta to the
operator status and imposing the equal time commutation rules (ETCR)
given by [qα (t), pβ (t)] = iδαβ with all other commutators vanishing.35 The
time evolution of the dynamical variables is determined through the evolution equations iq̇α = −[H, qα], iṗα = −[H, pα] where H(p, q) is the Hamiltonian operator. Alternatively, one can choose to work in the Schrodinger
picture in the coordinate representation and take pα = −i∂α, and express
H(p, q) → H(−i∂, q) as a differential operator acting on the wave functions ψ(t, q), with the time evolution determined by iψ̇ = Hψ. Finally one
can also obtain a quantum theory from a classical theory (in principle) by
computing the path integral amplitude by summing over paths connecting
fixed end points, with the amplitude for a given path given by exp(iA).
Move on from mechanics to field theory. In the case of a real scalar
field, the dynamical variable which is varied in the action is φ(t, x) and the
canonically conjugate momentum is π(t, x) = φ̇, with the Hamiltonian given
by Eq. (3.29). If we try to quantize this system using the Schrodinger
picture — which is what you are likely to have spent most time on in a
QM course — we immediately run into a mathematical complexity. The
coordinate q, which was just a real number in Schrodinger picture quantum
mechanics, will now have to be replaced by a function giving the field
configuration φ(x) at a given time t. The wavefunction ψ(q, t) giving the
probability amplitude for the dynamical variable to have value q at time t
will now have to be replaced by a functional Ψ[φ(x), t] giving the probability
amplitude for the field to be specified in space by the function φ(x) at time
t. The Hamiltonian, which was a partial differential operator in Schrodinger
quantum mechanics, will now become a functional differential operator
and the equivalent of the Schrodinger equation will become a functional
differential equation. The technology for dealing with these is not yet at a
sufficiently high state of development — which prevents us from handling
realistic quantum field theory problems using the Schrodinger picture.36
Similar difficulties arise in the case of using path integrals to quantize
the fields. What we now need to compute is an amplitude for transition
from a given field configuration φ1 (x) at time t = t1 to a field configuration
φ2 (x) at time t = t2 which is given by the path integral
⟨t2, φ2(x)|t1, φ1(x)⟩ = ∫ Dφ exp( i ∫ d⁴x L(∂a φ, φ) )   (3.100)

where the functional integral runs over configurations with φ = φ1(x) at t = t1 and φ = φ2(x) at t = t2.

36 We will give some simple examples of Schrodinger picture quantum field theory later on because of its intuitive appeal but one cannot do much with it.
The sum over paths (or rather the functional integral) has to be
computed with the specific boundary conditions, as indicated. Once we
know ⟨t2, φ2(x)|t1, φ1(x)⟩, we can evaluate the wave functional at a given
time from its form at an earlier time by the standard rule

Ψ[φ2(x), t2] = ∫ Dφ1 ⟨t2, φ2(x)|t1, φ1(x)⟩ Ψ[φ1(x), t1]   (3.101)
It is obvious that this path integral quantization procedure is at least as
complicated technically as using the Schrodinger picture. To begin with,
we cannot compute it except for a quadratic action functional (which essentially corresponds to non-interacting fields). Second, even after such a
computation, we have to work continuously with functional integrals and
wave functionals to make any sense of the theory. Finally this approach
(as well as the one based on the Schrodinger picture) makes it difficult to
maintain manifest Lorentz invariance because it treats the time coordinate
in a preferential manner.37
Fortunately, the Heisenberg picture works fine in quantum field theory
without any of these difficulties, and without your having to learn new mathematical
physics. In the case of a real scalar field, the dynamical variable which is
varied in the action is φ(t, x) and the canonically conjugate momentum is
π(t, x) = φ˙ which suggests that we should try to quantize this system by
postulating the commutation rules
[φ(t, x), π(t, y)] = iδ(x − y);
[φ(t, x), φ(t, y)] = [π(t, x), π(t, y)] = 0
(3.102)
This will indeed lead to the quantum theory of a real scalar field but there
is a much simpler procedure based on what we have already learnt in the
last chapter. We will follow that route.
We had seen that the action functional for the scalar field in Eq. (3.3)
is the same as that for an infinite number of harmonic oscillators and can
be expressed in the form
A = (1/2) ∫ d⁴x [∂a φ ∂ a φ − m²φ²] = ∫ (d³k/(2π)³) (1/2) [|q̇k|² − ωk²|qk|²]   (3.103)

where

φ(x) = ∫ (d³p/(2π)³) qp(t) e^{ip·x}   (3.104)
So, if we quantize each of the oscillators by standard quantum mechanical
rules, we would have quantized the field itself. This reduces the quantum
37 We will, of course, use functional integrals extensively in developing quantum field theory in later chapters. But
these are functional integrals evaluated
from t1 = −∞ to t2 = +∞ with either
the iε or the Euclidean prescription.
As we saw in Sect. 1.2.3, such path integrals do not require specific boundary configurations φ1 (x) and φ2 (x) —
or rather, they can be set to zero —
and lead to the ground state expectation values of various operators. So we
will use path integrals as a convenient
tool for calculating vacuum expectation values etc. and not as a procedure
for quantization — which is what we
are discussing here. These two roles of
path integrals are somewhat different.
38 More formally, this can be done by
separating qk into real and imaginary
parts with qk = Xk + iYk with the
conditions X−k = Xk , Y−k = −Yk
and noting that the dynamical variables Xk , Yk are independent only for
half the k values. In going from qk
to Xk , Yk you are doubling the number of degrees of freedom, but the fact
that only half of them are independent
restores the balance. Therefore we can
introduce one single real variable Qk
in place of Xk , Yk but now retaining
all possible values of k. This means
the action in Eq. (3.103) can be expressed entirely in terms of one real
harmonic oscillator coordinate Qk . We
won’t bother to spell out the details of
this approach since the final result is
the same.
field theory problem to the quantum mechanics of harmonic oscillators
which you are familiar with.
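The statement that the free field is just a set of independent oscillators with ωk² = k² + m² can be checked concretely on a lattice; the following sketch discretizes a one-dimensional field (lattice size, spacing and mass are illustrative) and compares the normal mode frequencies of the coupling matrix with the dispersion relation:

```python
import numpy as np

# Sketch of Eq. (3.103): discretize (grad phi)^2 + m^2 phi^2 on a periodic
# 1D lattice and diagonalize the resulting quadratic form.
N, h, m = 64, 1.0, 0.5                     # illustrative lattice parameters

K = np.zeros((N, N))                       # matrix of -(d^2/dx^2) + m^2
for i in range(N):
    K[i, i] = 2.0 / h**2 + m**2
    K[i, (i + 1) % N] = K[i, (i - 1) % N] = -1.0 / h**2

omega2 = np.sort(np.linalg.eigvalsh(K))    # squared mode frequencies

# Lattice dispersion: omega_k^2 = khat^2 + m^2, khat = (2/h) sin(kh/2),
# which reduces to k^2 + m^2 in the continuum limit kh -> 0.
k = 2.0 * np.pi * np.fft.fftfreq(N, d=h)
khat2 = (2.0 / h * np.sin(k * h / 2.0))**2
expected = np.sort(khat2 + m**2)

print(np.allclose(omega2, expected))       # True
```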
In this approach, the dynamical variables for each oscillator should
be the qk, but there is a minor issue: the qk's are not real even though φ is real,
and they satisfy the constraint q−k = qk∗. This constraint is easily taken care38 of
by writing qk (t) = ak (t) + a∗−k (t) without any constraint on the variables
ak (t). The standard harmonic oscillator time evolution, arising from the
Heisenberg equations of motion now gives
qp = (1/√(2ωp)) [ ap e^{−iωp t} + a†−p e^{iωp t} ]   (3.105)
[This has the same form as the classical evolution in Eq. (3.25).] Quantizing
each harmonic oscillator is now trivial and one can think of a†k and ak as
the standard creation and annihilation operators. In the continuum case,
the creation and annihilation operators satisfy the commutation rule
[ap , a†k ] = (2π)3 δ(p − k)
(3.106)
with all other commutators vanishing. The evolution of the field φ itself
follows from Fourier transforming Eq. (3.105) with an exp[ip · x] factor.
Using the usual trick of flipping the sign of p in the second term, we get back
essentially the result we found in the previous chapter (see Eq. (1.129)):
So our final answer is:

φ(x) = ∫ (d³p/(2π)³) (1/√(2ωp)) [ ap e^{−ipx} + a†p e^{ipx} ]   (3.107)
We will now explicitly verify that this does lead to the correct commutation rule for the fields. Since we need to verify the commutation rules
only at a given time, we can always choose it to be t = 0. To work out the
commutators, we start with the expressions for φ and π at t = 0, given by
φ(0, x) = ∫ (d³p/(2π)³) (1/√(2ωp)) [ ap e^{ip·x} + a†p e^{−ip·x} ]   (3.108)
π(0, x) = ∫ (d³p/(2π)³) (−i) √(ωp/2) [ ap e^{ip·x} − a†p e^{−ip·x} ]   (3.109)

39 We could have done this starting from Eq. (3.102) and then found that the a and a† satisfy the usual commutation rules for a creation and an annihilation operator. We took the opposite route since we have actually done all the hard work in the last chapter. We will see later that the identification and quantization of oscillators work even when the approach based on commutators requires special treatment.
It is convenient to rearrange these two expressions by the usual trick of
flipping the sign of momentum in the second term and writing
φ(0, x) = ∫ (d³p/(2π)³) (1/√(2ωp)) [ ap + a†−p ] e^{ip·x}   (3.110)

π(0, x) = ∫ (d³p/(2π)³) (−i) √(ωp/2) [ ap − a†−p ] e^{ip·x}   (3.111)
Evaluating the commutator and using Eq. (3.106), we find that39

[φ(x), π(x′)] = ∫ (d³p d³p′/(2π)⁶) (−i/2) √(ωp′/ωp) ( [a†−p, ap′] − [ap, a†−p′] ) e^{i(p·x + p′·x′)} = iδ(3)(x − x′)   (3.112)
Similarly, you can work out the other commutators and you will find that
they all vanish. Thus we have successfully quantized the real scalar field
and obtained the time evolution of the dynamical variable in Eq. (3.107).
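The oscillator algebra underlying this verification can itself be exercised numerically with truncated matrices; here is a single-mode sketch (the truncation Nmax and the frequency w are illustrative, and the commutator can only hold away from the cut-off state):

```python
import numpy as np

# Sketch: truncated creation/annihilation operators satisfy [a, a_dag] = 1
# except in the last (cut-off) state of the basis.
Nmax = 40
a = np.diag(np.sqrt(np.arange(1, Nmax)), k=1)   # annihilation operator
adag = a.conj().T

comm = a @ adag - adag @ a
print(np.allclose(comm[:-1, :-1], np.eye(Nmax - 1)))    # True

# For one mode, phi = (a + a_dag)/sqrt(2w) and pi = -i sqrt(w/2)(a - a_dag)
# reproduce the canonical rule [phi, pi] = i, mirroring Eq. (3.112).
w = 2.0
phi = (a + adag) / np.sqrt(2 * w)
pi = -1j * np.sqrt(w / 2.0) * (a - adag)
c = phi @ pi - pi @ phi
print(np.allclose(c[:-1, :-1], 1j * np.eye(Nmax - 1)))  # True
```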
For the sake of completeness, we will briefly mention how the same
result comes about if we had taken the more formal approach based on
Eq. (3.102). Here one would associate with the system the Hamiltonian
H = (1/2) ∫ d³x [ π² + (∇φ)² + m²φ² ]   (3.113)
and use the standard result in quantum mechanics −i∂t Ô = [H, Ô], which
describes the time evolution of any operator Ô, to determine φ̇ and π̇. This
calculation gives (on using the commutation rules in Eq. (3.102)) the results
−iφ̇(t, y) = (1/2) ∫ d³x [ π²(t, x), φ(t, y) ] = ∫ d³x [π(t, x), φ(t, y)] π(t, x)
          = −i ∫ d³x δ(x − y) π(t, x) = −iπ(t, y)   (3.114)
and

−iπ̇(t, y) = (1/2) ∫ d³x { m² [φ²(x), π(y)] + [∂α φx ∂α φx, πy] }
          = im² φ(y) + ∫ d³x [∂α φx, πy] ∂α φx
          = im² φ(y) + i ∫ d³x ∂α δ(x − y) (∂α φx)
          = im² φ(y) − i∇²φy = i(−∇² + m²) φ(y, t)   (3.115)
Combining these two we get the operator equation for φ which is just the
Klein-Gordon equation:
φ̈ − ∇²φ + m²φ = 0
(3.116)
Obviously, normalized solutions to this equation can be expressed in the
standard form as
φ(x) = ∫ dΩp [ Ap e^{−ipx} + A†p e^{ipx} ]
(3.117)
which is exactly our result in Eq. (3.107) with ap = Ap/√(2ωp) etc. In fact,
if we solve equations Eq. (3.110) and Eq. (3.111) for ap and a†−p in terms
of φ(0, x) and π(0, x) and use Eq. (3.102), we can obtain Eq. (3.106). [See
Problem 3.] This is a more formal procedure for quantizing the system,
treating φ as a dynamical variable and π as the conjugate momentum,
leading to the same results.
This is a good time to take stock of what we have achieved. We are
interested in defining a quantum theory for the field φ(t, x) such that it
obeys the commutation rules in Eq. (3.102). We find that this is most easily
done in terms of a set of creation and annihilation operators ap and a†p , for
each mode labeled by a 3-vector p, which satisfy the standard commutation
rule. This allows us to introduce states in the Hilbert space |{np } labeled
by a set of integers {np } with the creation and annihilation operators for
each mode increasing or decreasing the corresponding integer by one unit40
40 By and large, if you understand harmonic oscillators, you can understand
noninteracting fields; they are essentially the same except that there are
an infinite number of oscillators.
etc. The state |0 with all integers set to zero will be the ground state of
all harmonic oscillators which we will expect to be the vacuum state. (We
will have quite a bit to say about this state soon.) Acting on |0 with the
creation operator, we can generate the excited states of the system.
To investigate further the nature of these states |{np } we can work
out the Hamiltonian for the system in terms of ak and a†k . A direct way
of obtaining the Hamiltonian operator is to substitute Eq. (3.110) and
Eq. (3.111) into the expression for the Hamiltonian in Eq. (3.29). Again,
since H is independent of time, we can evaluate it at t = 0. This gives
H = ∫ d³x ∫ (d³p d³p′/(2π)⁶) e^{i(p+p′)·x} [ −(√(ωp ωp′)/4) (ap − a†−p)(ap′ − a†−p′)
    + ((−p·p′ + m²)/(4√(ωp ωp′))) (ap + a†−p)(ap′ + a†−p′) ]
  = ∫ (d³p/(2π)³) ωp [ a†p ap + (1/2) [ap, a†p] ]   (3.118)
In the first term, each mode contributes np ωp to the energy density which
is consistent with interpreting the state |{np } as having np particles with
energy ωp . The second term is more nontrivial: It can be thought of as the
limit of Eq. (3.106) as p → k which will lead to a δ(0) type of divergence
in 3-dimensions. As usual, we interpret this as arising out of a volume
integration with
lim_{p→k} (2π)³ δ(p − k) = lim_{p→k} ∫ d³x exp[ix·(p − k)] = lim_{V→∞} V   (3.119)
so that the second term can be thought of as giving an energy density which
exists even in the vacuum state:
Evac/V = (1/2) ∫ (d³p/(2π)³) ωp   (3.120)
41 If you are uncomfortable with this
entire discussion, then you are doing
fine. Nobody really knows how to
handle this divergence and it is rather
surprising that this procedure actually
‘works’ even to the extent it does. We
will see later that an experimentally
observed result, called the Casimir effect, can be interpreted in terms of the
zero point energies. This alone would
have been enough to argue that normal
ordering is a dubious procedure but for
the fact that there exist more complicated ways of obtaining Casimir effect
without using the zero point energy.
which is quartically divergent in the upper limit. Physically, this term
arises due to the infinite number of harmonic oscillators each having a
ground state energy (1/2)ωk . We have encountered the same expression
earlier when we computed Z(0) in Sect. 2.2.1 (see Eq. (2.71)) and also when
we computed the energy due to the closed loop of particles in Sect. 1.4 [see
Eq. (1.107)].
The usual procedure to deal with this term is to simply throw this
away and hope for the best. A formal way of implementing this procedure
is to postulate that physical quantities (like the Hamiltonian) should be
defined by a procedure called normal ordering. The normal ordering of
an expression containing several a's and a†'s is to simply move all the
a†'s to the left of the a's so that the vacuum expectation value of any normal
ordered operator is automatically zero. One possible way of “justifying”
this procedure is to note that the ordering of physical variables in the
classical theory is arbitrary, and when one constructs the quantum theory,
one has to make a choice regarding the way these variables are ordered,
since they may not commute. The normal ordering is supposed to give
a physically meaningful ordering of operators so that divergent vacuum
expectation values are avoided.41 We use the symbol : O : to denote a
normal ordered operator O. The expectation value of this (normal ordered)
Hamiltonian, denoted by : H :, in a state |{np}⟩ is given by

⟨{np}| : H : |{np}⟩ = ∫ (d³p/(2π)³) ωp np   (3.121)
By a similar procedure we can also express the total momentum operator
in Eq. (3.20) in terms of the creation and annihilation operators. We now
find
P = − ∫ d³x π(t, x) ∇φ(t, x) = ∫ (d³p/(2π)³) p a†p ap   (3.122)
There is no zero point contribution in this case because the contributions
from the vectors p and −p cancel.42 This shows that the expectation value
of P in the state labeled by {np} is given by

⟨{np}|P|{np}⟩ = ∫ (d³p/(2π)³) p np   (3.123)
It is obvious that (except for the zero point energy which we have dropped)
we can think of this state |{np } as containing np particles with momentum
p and energy ωp . This agrees with the conclusion we reached in the last
chapter that the first excited state of any given oscillator labeled by p can
be thought of as a one-particle state describing a particle with momentum
p and energy ωp .
An alternative way of obtaining the same result is to simply note that
the action for our system is made of a sum of actions for harmonic oscillators which do not interact with each other. Then, it is obvious that the
Hamiltonian can be expressed in the form
H = ∫ (d³p/(2π)³) ωp a†p ap   (3.124)
where we have dropped the zero point energy term.
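The difference between H and : H : is just the zero-point term, which is easy to exhibit for a single mode with the truncated matrices used earlier (the frequency w and occupation n below are illustrative):

```python
import numpy as np

# Sketch: in a Fock state |n>, <H> = w(n + 1/2) with the zero-point term,
# while the normal-ordered <:H:> = w n, as in Eq. (3.121).
Nmax, w, n = 30, 1.5, 4
a = np.diag(np.sqrt(np.arange(1, Nmax)), k=1)
adag = a.conj().T

H = w * (adag @ a + 0.5 * np.eye(Nmax))   # includes zero-point energy
Hnorm = w * (adag @ a)                    # normal ordered :H:

state = np.zeros(Nmax); state[n] = 1.0    # Fock state |n>
print(state @ H @ state)       # 6.75  (= w*(n + 1/2))
print(state @ Hnorm @ state)   # 6.0   (= w*n)
```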
We mentioned earlier that it is not easy to do quantum field theory in
the Schrodinger picture and this is definitely true in general. But in the case
of a free scalar field we are only dealing with a bunch of harmonic oscillators
and these can, of course, be handled in the Schrodinger picture. For each
harmonic oscillator we can write down the time independent Schrodinger
equation with the eigenfunction φnk (qk ) and the eigenvalue determined by
nk . The full wave functional of the system is just the product of all the
φnk (qk ) over k.
Just to see what is involved, let us consider the ground state wave functional of the system which is given by the product of all the ground state
wave functions of the oscillators. We have
Ψ[{qk}, 0] ∝ ∏k φ0(qk) ∝ exp[ −(1/2) ∫ (d³k/(2π)³) ωk |qk|² ]   (3.125)
The argument of the exponential can be expressed in terms of the field φ
itself using the inverse Fourier transform in the form
(1/2) ∫ (d³k/(2π)³) ωk |qk|² = (1/2) ∫ d³x d³y φ(x) G(x − y) φ(y);   (3.126)

G(x) = ∫ (d³k/(2π)³) (k² + m²)^{1/2} e^{ik·x}   (3.127)
42 This is fortunate because any nonzero momentum vector for the vacuum would have even violated the rotational symmetry of the vacuum and
given a preferential direction in space!
so that the ground state wave functional of the system is given by
Ψ[φ(x), 0] ∝ exp[ −(1/2) ∫ d³x d³y φ(x) G(x − y) φ(y) ]   (3.128)
Exercise 3.15: Evaluate G(x), paying particular attention to its singular
structure. Just a sec.: isn't it the same one we saw in Eq. (1.110)? Is this a
coincidence?
43 We have ignored the overall normalization factor and the time dependence
exp(−iEt) in writing the wave functional (for very good reasons; both are
divergent).
44 The way we have described the
entire quantization procedure, this
should not come as a surprise to you.
The field is a collection of oscillators
and the vacuum state is the simultaneous ground state of all the oscillators.
The probability amplitude for a nonzero displacement of the dynamical
variable qk is finite in the ground state
of the harmonic oscillator in quantum
mechanics; it follows that the amplitude for a non-zero field configuration
in the vacuum state must necessarily
be non-zero in field theory. There is
no way you can avoid it.
The form of G(x) is rather complicated for m ≠ 0, but you can write the vacuum functional in a simpler fashion when m = 0 by making use of the fact that, for a massless scalar field, ω_k = |k|. The trick now is to multiply the numerator and denominator of the integrand in Eq. (3.126) by |k| and note that kq_k is essentially the Fourier transform of ∇φ. This gives

\frac{1}{2} \int \frac{d^3k}{(2\pi)^3}\, |k|\, |q_k|^2 = \int \frac{d^3k}{(2\pi)^3}\, \frac{k^2 |q_k|^2}{2|k|} = \frac{1}{4\pi^2} \int d^3x\, d^3y\; \frac{\nabla_x \phi(x) \cdot \nabla_y \phi(y)}{|x-y|^2} \qquad (3.129)
so that the ground state wave functional (also called the vacuum functional) for a massless scalar field has the simple closed form expression given by

\Psi[\phi(x), 0] = N \exp\left[ -\frac{1}{4\pi^2} \int d^3x\, d^3y\; \frac{\nabla_x \phi(x) \cdot \nabla_y \phi(y)}{|x-y|^2} \right] \qquad (3.130)
This wave functional is proportional to the amplitude that you will find
a non-trivial field configuration φ(x) in the vacuum state; that is, for any
given functional form φ(x) you can evaluate the integral in Ψ[φ(x), 0] and
obtain a number which gives the probability amplitude.43 The ratio of
Ψ[φ1 (x), 0]/Ψ[φ2 (x), 0] for any two functions φ1 (x), φ2 (x) correctly gives
the relative probability for you to find one field configuration compared
to another in the vacuum state. This is a concrete demonstration of the
non-triviality of the vacuum state in quantum field theory. If you look for
non-zero field configurations in the vacuum state, you will find them.44
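The Fourier identity underlying Eq. (3.129), ∫ d³k/(2π)³ e^{ik·r}/|k| = 1/(2π²r²), can also be checked numerically after doing the angular integration. The following is a scipy sketch (the regulator ε and the value of r are illustrative choices, not from the text):

```python
import numpy as np
from scipy.integrate import quad

# Radial reduction of the m = 0 kernel with a convergence factor e^{-eps*k}:
#   ∫ d^3k/(2π)^3 e^{ik·r}/|k| = (1/(2π² r)) ∫_0^∞ dk e^{-eps k} sin(k r),
# which should approach 1/(2π² r²) as eps -> 0.
def kernel(r, eps):
    val, _ = quad(lambda k: np.exp(-eps * k), 0, np.inf, weight='sin', wvar=r)
    return val / (2 * np.pi**2 * r)

r = 0.5
approx = kernel(r, eps=1e-4)
exact = 1 / (2 * np.pi**2 * r**2)
print(approx, exact)   # should agree to ~1e-7 relative accuracy
```

The `weight='sin'` option makes quad use a Fourier-integral method, which handles the oscillatory, slowly damped integrand reliably.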
3.4 Davies-Unruh Effect: What is a Particle?
In the earlier sections we started with the notion of a relativistic free particle, used the path integral formalism to obtain the quantum propagator
and reinterpreted it in terms of a field, say, a scalar field. In the last section
we took the opposite point of view: We started with a scalar field, treated
it as a dynamical system governed by a Hamiltonian, quantized it using
the commutation rules in Eq. (3.102) and discovered that the states in the
Hilbert space can be described in terms of occupation numbers {nk } allowing a particle interpretation. In this approach, we start with a vacuum
state containing no particles and we can create a one-particle state, say, by
acting on the vacuum with a creation operator. The propagator for the relativistic particle, for example, can now be obtained in terms of the vacuum
expectation value of the time-ordered product of two field operators. At
the level of propagators, we seem to have established the connection with
the original approach starting from a relativistic particle.
Conceptually, however, the procedure of describing relativistic particles
as excitations of a field has certain subtleties which need to be recognized.
The first one is the existence of vacuum fluctuations in the no-particle state
which, obviously, cannot be interpreted in terms of real relativistic particles.
We had already mentioned it while describing the wave functional for the
vacuum state given by Eq. (3.128) and will discuss it further later on when
we study the Casimir effect in Sect. 3.6.3.
There is a second subtlety which goes right to the foundation of defining
the concept of the particle, which we will discuss in this section. It turns
out that while the notion of a vacuum state and that of a particle are
Lorentz invariant, these are not generally covariant notions. That is, while
two observers who are moving with respect to each other with a uniform
velocity will agree on the definition of a vacuum state and the notion of
a particle, two observers who are moving with respect to each other noninertially may not agree45 on their definitions of vacuum state or particles!
For example, the vacuum state — as we have defined in the previous
section — for an inertial observer will not appear to be a vacuum state for
another observer who is moving with a uniform acceleration. The accelerated observer, it turns out, will see the vacuum state of the inertial observer
to be populated by a Planckian spectrum of particles with a temperature T = (ℏ/ck_B)(g/2π), where g is the acceleration. This peculiar result has far-reaching implications when one tries to combine the principles of quantum
theory with gravity. We will now describe how this comes about.
Let us begin with the expansion of the scalar field in terms of creation
and annihilation operators as given in Eq. (3.117). Given the commutation
rules in Eq. (3.102), we found that Ak and A†k can be interpreted as creation
and annihilation operators. However, this description in terms of creation
and annihilation operators is not unique. Consider, for example, a new set
of creation and annihilation operators B_k and B_k^\dagger, related to A_k and A_k^\dagger by the linear transformations

A_k = \alpha B_k + \beta B_k^\dagger, \qquad A_k^\dagger = \alpha^* B_k^\dagger + \beta^* B_k \qquad (3.131)

It is easy to verify that B_k and B_k^\dagger obey the standard commutation rules if we choose α and β with the condition |\alpha|^2 - |\beta|^2 = 1. Substituting Eq. (3.131) into Eq. (3.117), we find that φ(x) can be expressed in the form

\phi(x) = \int d\Omega_k \left[ B_k f_k(x) + B_k^\dagger f_k^*(x) \right], \qquad (3.132)

where

f_k(x) = \alpha e^{-ikx} + \beta^* e^{ikx} \qquad (3.133)
One can now define a new vacuum state (the “B-vacuum”) by the condition B_k|0\rangle_B = 0, and the corresponding one-particle state, etc., by acting on |0\rangle_B with B_k^\dagger. It is easy to see that this vacuum state is not equivalent to the original vacuum state (let us call it the “A-vacuum”) defined by A_k|0\rangle_A = 0. In fact, the number of A-particles in the B-vacuum is given by

{}_B\langle 0| A_k^\dagger A_k |0\rangle_B = |\beta|^2 \qquad (3.134)
At first sight this result seems to throw a spanner in the works! How can we decide on our vacuum state and concept of particle if the linear transformations in Eq. (3.131) (called the Bogoliubov transformations) change the notion of vacua? One could have worked either with the modes e^{-ikx}, e^{+ikx} or with f_k(x), f_k^*(x), and these two quantization procedures will lead to different vacuum states and particle concepts. We need some additional criterion to choose the vacuum state.
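The algebra in Eqs. (3.131)-(3.134) can be verified numerically for a single mode by truncating the Fock space. This is a numpy sketch; the truncation size and the value β = 0.6 are illustrative choices, and the inversion B = α*A − βA† follows from Eq. (3.131) together with |α|² − |β|² = 1.

```python
import numpy as np

# Truncated single-mode Fock-space check of the Bogoliubov transformation.
N = 80                                      # Fock-space truncation
a = np.diag(np.sqrt(np.arange(1.0, N)), 1)  # annihilation operator A
ad = a.T                                    # A† (real matrix here)

beta = 0.6
alpha = np.sqrt(1.0 + beta**2)              # enforces |alpha|^2 - |beta|^2 = 1
B = alpha * a - beta * ad                   # B = alpha* A - beta A† (alpha real)
Bd = B.T

comm = B @ Bd - Bd @ B                      # should be the identity away from the cutoff
print(np.allclose(comm[:N//2, :N//2], np.eye(N//2)))

# The B-vacuum is the (approximate) null vector of the truncated B; the mean
# number of A-particles in it should be |beta|^2 = 0.36, as in Eq. (3.134).
_, _, Vt = np.linalg.svd(B)
vac_B = Vt[-1]                              # right-singular vector of smallest sigma
nA = vac_B @ (ad @ a) @ vac_B               # <0_B| A†A |0_B>
print(round(nA, 3))                         # ≈ beta**2 = 0.36
```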
In standard quantum field theory, one tacitly assumes such an additional criterion which is the following: We demand that the mode functions
we work with must have “positive frequency” with respect to the Hamiltonian that we use. That is, the mode functions must be eigenfunctions
45 It is rather surprising that most of
the standard QFT textbooks do not
even alert you about this fact!
46 Actually one can be a little bit more
generous and allow for mode functions which are arbitrary superpositions of positive frequency mode functions. But for our illustrative purposes, this criterion is good enough.
Also note that we do not put any condition on the spatial dependence of
the modes. One could use the standard plane waves exp(ik · x) or one
could expand the field in terms of, say,
the spherical harmonics. The notion
of a vacuum state is only sensitive to
the time dependence. Of course, a
one-particle state describing a particle
with momentum k = (0, 0, kz ) will be
a superposition of different Ylm (θ, φ)s when expressed in spherical coordinates. This need not scare anyone
who has studied scattering theory in
NRQM.
Exercise 3.16: Prove these statements.
of the operator H = i∂/∂t with a positive eigenvalue ωk so that the time
dependence of the modes is of the form exp(−iωk t). This excludes using
modes of the form fk (x) in Eq. (3.133) which is a linear combination of
positive and negative frequency modes.46 It is easy to verify that the notion of a positive frequency mode is Lorentz invariant; a positive frequency
mode will appear to be a positive frequency mode to all Lorentz observers.
If we introduce this extra restriction and define our vacuum state, it
will appear to be a vacuum state to all other inertial observers. In fact
most textbooks will not even bother to tell you that this criterion has been
tacitly assumed because one usually deals with only Lorentz invariant field
theory.
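The Lorentz invariance of the positive frequency criterion can be illustrated with a quick numeric scan (a Python sketch; the mass, the sampling ranges and the number of trials are arbitrary choices): boosting an on-shell momentum with ω = +√(k² + m²) can never flip the sign of the frequency, because |v·k| < ω whenever |v| < 1.

```python
import numpy as np

# Scan random on-shell momenta and random subluminal boosts and check
# that the boosted frequency gamma*(w - v.k) stays positive.
m = 0.5
rng = np.random.default_rng(1)
ok = True
for _ in range(1000):
    k = rng.normal(size=3)
    w = np.sqrt(k @ k + m**2)              # positive-frequency branch
    n = rng.normal(size=3)
    n /= np.linalg.norm(n)
    v = n * rng.uniform(0.0, 0.99)         # boost velocity, |v| < 1
    gamma = 1.0 / np.sqrt(1.0 - v @ v)
    ok = ok and (gamma * (w - v @ k) > 0)
print(bool(ok))  # True
```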
Things get more interesting when we allow for transformations of coordinates from inertial to non-inertial frames of reference. For example,
consider a transformation from the standard inertial coordinates (t, x) to
a non-inertial coordinate system (τ, ξ) by the equations
x = \xi \cosh g\tau; \qquad t = \xi \sinh g\tau \qquad (3.135)

(see Fig. 3.4). This transformation introduces a coordinate system which is appropriate for an observer who is moving with a uniform acceleration g along the x-axis and is called the Rindler transformation. World lines of observers who are at rest in the new coordinate system, ξ = constant, correspond to hyperbolas x² − t² = ξ² in the inertial frame. You can easily convince yourself that: (a) These are trajectories of uniformly accelerated observers. (b) The τ represents the proper time of the clock carried by the uniformly accelerated observer. The line element in terms of the new coordinates is easily seen to be

ds^2 = dt^2 - dx^2 - dy^2 - dz^2 = g^2 \xi^2 d\tau^2 - d\xi^2 - dy^2 - dz^2 \qquad (3.136)

Figure 3.4: The Rindler coordinates appropriate for a uniformly accelerated observer. The ξ = constant curves are hyperbolas while τ = constant curves are straight lines through the origin. The x = t surface of the Minkowski space (which maps to ξ = 0) acts as a horizon for the uniformly accelerated observers in the right wedge.
(This form of the metric is called the Rindler metric.) Note that the
transformations in Eq. (3.135) only cover the region with |x| > |t| which is
the region in which uniformly accelerated trajectories exist. The trajectory
corresponding to ξ = 0 maps to x = ±t which are the null rays propagating
through the origin. These null surfaces act as a horizon for the uniformly
accelerated observer. That is, an observer with ξ > 0 will not have access
to the region beyond x = t. This horizon divides the x-axis into two parts:
x > 0 and x < 0 which are causally separated as far as the accelerated
observer is concerned.
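Statements (a) and (b) above are easy to confirm with finite differences (a numerical sketch; the values of g, ξ and τ are arbitrary, with ξ = 1/g chosen so that the proper acceleration equals g and τ coincides with the proper time):

```python
import numpy as np

# Worldline x = xi*cosh(g*tau), t = xi*sinh(g*tau): check that the proper
# acceleration is constant and equal to 1/xi (= g for xi = 1/g), and that
# ds/dtau = g*xi (= 1 here), so tau is the proper time.
g, xi = 2.0, 0.5          # xi = 1/g
h = 1e-4

def X(tau):               # event (t, x) as a function of Rindler time tau
    return np.array([xi * np.sinh(g * tau), xi * np.cosh(g * tau)])

tau = 0.7
dX = (X(tau + h) - X(tau - h)) / (2 * h)
ds_dtau = np.sqrt(dX[0]**2 - dX[1]**2)          # Minkowski norm of dX/dtau

d2X = (X(tau + h) - 2 * X(tau) + X(tau - h)) / h**2
acc = d2X / ds_dtau**2    # a = d^2X/ds^2, valid since ds/dtau is constant
proper_acc = np.sqrt(abs(acc[1]**2 - acc[0]**2))

print(round(ds_dtau, 5), round(proper_acc, 5))  # 1.0 2.0 (= g*xi and g)
```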
The key feature of the Rindler metric in Eq. (3.136) which is of relevance
to us is its static nature; that is, the metric does not depend on the new
time coordinate τ . This implies that when we study quantum field theory
in these coordinates, one can again obtain mode functions of the form
e−iωτ fω (ξ, y, z) which are positive frequency modes with respect to the
proper time τ of the accelerated observer. But given the rather complicated
nature of the transformation in Eq. (3.135), it is obvious that these modes
will be a superposition of positive and negative frequency modes of the
inertial frame. This is indeed true.
So even after imposing our condition that one should work with positive
frequency mode functions, we run into an ambiguity — positive frequency
modes of the inertial frame are not positive frequency modes of the accelerated (Rindler) frame! This, in turn, means that the vacuum state defined
using the modes e−iωτ fω (ξ, y, z) will be different from the inertial vacuum
state. The accelerated observer will not consider the inertial vacuum to be
a no-particle state. In short, the notion of a vacuum state and the concept
of particles (as excitations of the field) are Lorentz invariant concepts but
they are not invariant notions if we allow transformations to non-inertial
coordinates.
We will now show that the inertial vacuum functional leads to a thermal
density matrix when viewed in an accelerated frame. To obtain this result,
let us begin with the result we have obtained earlier in Sect. 1.2.3 which
expresses the quantum mechanical path integral in terms of the stationary
states of the system:
\int \mathcal{D}q\, \exp\left( iA[t_2, q_2; t_1, q_1] \right) = \langle q_2 | q_1 \rangle = \sum_n \phi_n(q_2)\, \phi_n^*(q_1)\, e^{-iE_n(t_2 - t_1)} \qquad (3.137)
We now analytically continue to Euclidean time and set t_1^E = 0, q_2 = 0, q_1 = q and take the limit t_2^E → ∞. In the infinite time limit, the right hand side will be dominated by the lowest energy eigenvalue which will correspond to the ground state φ₀. We can always add a constant to the Hamiltonian to make the ground state energy zero. In that case, the only term which survives on the right hand side will be φ₀(q₂)φ₀*(q₁) = φ₀(0)φ₀(q) ∝ φ₀(q). Thus the ground state wave function can be expressed as a Euclidean path integral with very specific boundary conditions (see Eq. (1.44)):

\phi_0(q) \propto \int \mathcal{D}q\, \exp\left( -A_E[\infty, 0;\, 0, q] \right) \qquad (3.138)
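For a single oscillator, this statement can be tested against the known closed form of the Euclidean oscillator kernel (the Mehler formula). The following sketch (frequency, grid and Euclidean time are illustrative choices) checks that ⟨q₂ = 0| e^{−TH} |q⟩ becomes proportional to the ground state wave function exp(−ωq²/2) as T grows:

```python
import numpy as np

# Exact Euclidean kernel <q2| e^{-T H} |q1> for H = (p^2 + w^2 q^2)/2
# (Mehler formula); for large T it should reduce to phi_0(q2) phi_0(q1)
# up to an overall factor.
def kernel(q2, q1, T, w=1.0):
    s, c = np.sinh(w * T), np.cosh(w * T)
    return np.sqrt(w / (2 * np.pi * s)) * np.exp(
        -w * ((q1**2 + q2**2) * c - 2 * q1 * q2) / (2 * s))

q = np.linspace(-2, 2, 9)
T = 12.0
ratio = kernel(0.0, q, T) / np.exp(-0.5 * q**2)   # should be q-independent
print(np.allclose(ratio, ratio[0], rtol=1e-6))    # True
```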
This result will directly generalize to field theory if we think of q1 and
q2 as field configurations q1 (x), q2 (x) and φ0 (q1 ) as the ground state wave
functional Ψ_gs[φ(x)]. Then we get

\Psi_{\rm gs}[\phi(x)] \propto \int \mathcal{D}\phi\; e^{-A_E(\infty, 0;\, 0, \phi(x))} \qquad (3.139)
The path integral on the right hand side can be evaluated in the Euclidean sector obtained either (i) by analytically continuing the inertial time coordinate t to t_E, or (ii) by analytically continuing the Rindler time coordinate τ
to τ_E. The transformations in Eq. (3.135) go over to

x_E = \xi \cos g\tau_E; \qquad t_E = \xi \sin g\tau_E \qquad (3.140)
when we analytically continue to the Euclidean sector. The corresponding line interval

-ds_E^2 = dt_E^2 + dx^2 + dy^2 + dz^2 = \xi^2 d(g\tau_E)^2 + d\xi^2 + dy^2 + dz^2 \qquad (3.141)

clearly shows that we are just going over from planar coordinates to polar coordinates in the Euclidean (gτ_E, ξ) plane, with θ ≡ gτ_E having a period of 2π.
This situation is shown in Fig. 3.5. The field configuration φ(x) on the t_E = 0 surface can be thought of as given by two functions φ_L(x), φ_R(x) in the left and right halves x < 0 and x > 0 respectively. Therefore, the ground state wave functional Ψ_gs[φ(x)] now becomes a functional of these two functions: Ψ_gs[φ(x)] = Ψ_gs[φ_L(x), φ_R(x)]. The path integral expression in Eq. (3.139) now reads

\Psi_{\rm gs}[\phi_L(x), \phi_R(x)] \propto \int_{t_E = 0;\, \phi = (\phi_L, \phi_R)}^{t_E = \infty;\, \phi = (0, 0)} \mathcal{D}\phi\; e^{-A} \qquad (3.142)
Figure 3.5: Analytic extension to the
imaginary time in two different time
coordinates in the presence of a horizon. When one uses the path integral
to determine the ground state wave
functional on the tE = 0 surface, one
needs to integrate over the field configurations in the upper half (tE > 0)
with a boundary condition on the field
configuration on tE = 0. This can be
done either by using a series of hypersurfaces parallel to the horizontal axis
(shown by broken lines) or by using
a series of hypersurfaces corresponding
to the radial lines. Comparing the two
results, one can show that the ground
state in one coordinate system appears
as a thermal state in the other.
But from Fig. 3.5 it is obvious that this path integral could also be evaluated
in the polar coordinates by varying the angle θ = gτE from 0 to π. While
the evolution in tE will take the field configuration from tE = 0 to tE → ∞,
the same time evolution gets mapped in terms of τE into evolving the
“angular” coordinate τE from 0 to π/g. (This is clear from Fig. 3.5.) It is
obvious that the entire upper half-plane t > 0 is covered in two completely
different ways in terms of the evolution in tE compared to the evolution
in τE . In (tE , x) coordinates, we vary x in the range (−∞, ∞) for each tE
and vary tE in the range (0, ∞). In (τE , ξ) coordinates, we vary ξ in the
range (0, ∞) for each τE and vary τE in the range (0, π/g). When θ = 0,
the field configuration corresponds to φ = φR and when θ = π the field
configuration corresponds to φ = φL . Therefore Eq. (3.142) can also be
expressed as
\Psi_{\rm gs}[\phi_L(x), \phi_R(x)] \propto \int_{g\tau_E = 0;\, \phi = \phi_R}^{g\tau_E = \pi;\, \phi = \phi_L} \mathcal{D}\phi\; e^{-A} \qquad (3.143)

Let H_R be the Rindler Hamiltonian, that is, the (Euclidean version of the) Hamiltonian that describes the evolution in terms of the proper time coordinate of the accelerated observer. Then, in the Heisenberg picture, ‘rotating’ from gτ_E = 0 to gτ_E = π is a time evolution governed by H_R. So the path integral in Eq. (3.143) can also be represented as a matrix element of the Rindler Hamiltonian H_R, giving us the result:

\Psi_{\rm gs}[\phi_L(x), \phi_R(x)] \propto \int_{g\tau_E = 0;\, \phi = \phi_R}^{g\tau_E = \pi;\, \phi = \phi_L} \mathcal{D}\phi\; e^{-A} = \langle \phi_L | e^{-(\pi/g) H_R} | \phi_R \rangle \qquad (3.144)
If we denote the proportionality constant by C, then the normalization condition

1 = \int \mathcal{D}\phi_L\, \mathcal{D}\phi_R\; \big| \Psi_{\rm gs}[\phi_L(x), \phi_R(x)] \big|^2
  = C^2 \int \mathcal{D}\phi_L\, \mathcal{D}\phi_R\; \langle \phi_L | e^{-\pi H_R/g} | \phi_R \rangle \langle \phi_R | e^{-\pi H_R/g} | \phi_L \rangle
  = C^2\, {\rm Tr}\left( e^{-2\pi H_R/g} \right) \qquad (3.145)

fixes the proportionality constant C, allowing us to write the normalized vacuum functional in the form:

\Psi_{\rm gs}[\phi_L(x), \phi_R(x)] = \frac{\langle \phi_L | e^{-\pi H_R/g} | \phi_R \rangle}{\left[ {\rm Tr}\left( e^{-2\pi H_R/g} \right) \right]^{1/2}} \qquad (3.146)
From this result, we can show that for operators O made out of variables having support in x > 0, the vacuum expectation values ⟨vac| O(φ_R) |vac⟩ become thermal expectation values. This arises from straightforward algebra of inserting a complete set of states appropriately:

\langle {\rm vac}|\, O(\phi_R)\, |{\rm vac}\rangle = \sum_{\phi_L} \sum_{\phi_R^{(1)}, \phi_R^{(2)}} \Psi_{\rm gs}[\phi_L, \phi_R^{(1)}]\, \langle \phi_R^{(1)} | O(\phi_R) | \phi_R^{(2)} \rangle\, \Psi_{\rm gs}[\phi_R^{(2)}, \phi_L]
 = \sum_{\phi_L} \sum_{\phi_R^{(1)}, \phi_R^{(2)}} \frac{\langle \phi_L | e^{-(\pi/g)H_R} | \phi_R^{(1)} \rangle \langle \phi_R^{(1)} | O | \phi_R^{(2)} \rangle \langle \phi_R^{(2)} | e^{-(\pi/g)H_R} | \phi_L \rangle}{{\rm Tr}\,( e^{-2\pi H_R/g} )}
 = \frac{{\rm Tr}\,( e^{-2\pi H_R/g}\, O )}{{\rm Tr}\,( e^{-2\pi H_R/g} )} \qquad (3.147)
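The last line of Eq. (3.147) is the standard thermal-trace formula at inverse temperature β = 2π/g. As a sanity check, for a single mode of H_R with frequency ω (an illustrative toy choice, truncating the trace at N levels) it must reproduce the Planck occupation number:

```python
import numpy as np

# Tr(e^{-beta H} N)/Tr(e^{-beta H}) for one oscillator mode, E_n = n*w,
# compared against the Planck occupation 1/(e^{beta w} - 1).
g, w, N = 4.0, 1.5, 400
beta = 2 * np.pi / g
n = np.arange(N)
boltz = np.exp(-beta * w * n)            # Boltzmann weights (trace truncated at N)
n_avg = np.sum(n * boltz) / np.sum(boltz)
planck = 1 / (np.exp(beta * w) - 1)
print(np.isclose(n_avg, planck))         # True
```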
Thus, tracing over the field configuration φL in the region x < 0 (behind
the horizon) leads to a thermal density matrix ρ ∝ exp[−(2π/g)H] for the
observables in the region x > 0. In particular, the expectation value of the
number operator will be a thermal spectrum at the temperature T = g/2π
in natural units. So, an accelerated observer will consider the vacuum state
of the inertial frame to be a thermal state with a temperature proportional
to his acceleration. This result (called the Davies-Unruh effect ) contains
the essence of more complicated results in quantum field theory in curved
spacetimes, like for example the association of a temperature with black
holes.47
This result shows the limitations of concepts developed within the context of Lorentz invariant quantum field theory; they can run into conceptual
non-trivialities in the presence of gravity described by a curved spacetime.
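Restoring the constants gives T = (ℏ/ck_B)(g/2π). A two-line evaluation (the constants are the CODATA values; the acceleration 9.81 m/s² is just an example) shows why the effect is hopelessly small for everyday accelerations:

```python
import math

hbar = 1.054571817e-34   # J s
c = 2.99792458e8         # m/s
kB = 1.380649e-23        # J/K

def unruh_T(g):
    """Davies-Unruh temperature for proper acceleration g (SI units)."""
    return hbar * g / (2 * math.pi * c * kB)

print(f"{unruh_T(9.81):.2e} K")   # 3.98e-20 K
```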
3.5 Quantizing the Complex Scalar Field
Let us get back to QFT in the inertial frame and consider a complex scalar
field. This is completely straightforward and follows exactly the same route
as that of the real scalar field except for the doubling of the degrees of freedom because φ ≠ φ†. This has the consequence that the expansion of the field in terms of the creation and annihilation operators now takes the form
\phi(x) = \int \frac{d^3p}{(2\pi)^3} \frac{1}{\sqrt{2\omega_p}} \left[ a_p\, e^{-ipx} + b_p^\dagger\, e^{ipx} \right] \qquad (3.148)

and

\phi^\dagger(x) = \int \frac{d^3p}{(2\pi)^3} \frac{1}{\sqrt{2\omega_p}} \left[ a_p^\dagger\, e^{ipx} + b_p\, e^{-ipx} \right] \qquad (3.149)
which is precisely what we found earlier in Chapter 1 in Eq. (1.129). The creation and annihilation operators now satisfy the commutation rules

\left[ a_p, a_q^\dagger \right] = \left[ b_p, b_q^\dagger \right] = (2\pi)^3 \delta(p - q) \qquad (3.150)

with all other commutators vanishing. Obviously we now have double the number of creation and annihilation operators compared to a real scalar field, because when φ ≠ φ† the antiparticle is distinct from the particle, while for a real scalar field with φ = φ†, the antiparticle is the same as the particle.
One can also obtain the same results by treating φ(x) and φ†(x) as two independent dynamical variables and imposing the canonical commutation rules between them and their conjugate momenta. It should be noted, however, that for the Lagrangian given by L = ∂_aφ†∂^aφ − m²φ†φ, the canonical momentum for φ is π^a_{(φ)} = (∂L/∂(∂_aφ)) = ∂^aφ†, and vice versa. (This is in contrast to the real scalar field, for which the canonical momentum corresponding to φ is just φ̇.) The algebra proceeds exactly as in the case of the real scalar field and one finds that both φ and φ† satisfy the Klein-Gordon equation, allowing the expansions in Eq. (3.148) and Eq. (3.149).
Having quantized the scalar field (real and complex), we can work out
all sorts of vacuum expectation values involving the fields by using its
decomposition in terms of the creation and annihilation operators. But
we have already done all these in the last chapter and hence will only
recall a couple of key results for future reference. First, we know that the
47 Mathematically, this result arises
because analytic continuation to Euclidean time is a Lorentz invariant
procedure but not a generally covariant procedure. This is obvious from
the fact that the Euclidean metric expressed in terms of τE exhibits periodicity in the imaginary time with a
period β = (2π/g). We have seen in
Sect. 1.2.3 that the period of the imaginary time can be interpreted as the
inverse temperature which is precisely
what happens in this case.
vacuum expectation value of the time ordered product is given by (see e.g., Eq. (1.128)):

\langle 0| T[\phi(x_2)\phi^\dagger(x_1)] |0\rangle = G(x_2; x_1) = \int \frac{d^4p}{(2\pi)^4}\, \frac{i\, e^{-ip\cdot x}}{(p^2 - m^2 + i\epsilon)} = \int d\Omega_p \left[ \theta(t)\, e^{-ipx} + \theta(-t)\, e^{ipx} \right] \qquad (3.151)

This quantity (usually called the Feynman propagator) will play a key role in our future discussions. The two individual pieces in this expression are given by (see Eq. (1.133)):

\langle 0| \phi(x_2)\phi^\dagger(x_1) |0\rangle = \int d\Omega_p\, e^{-ipx} = G_+(x_2; x_1)
\langle 0| \phi^\dagger(x_1)\phi(x_2) |0\rangle = \int d\Omega_p\, e^{+ipx} = G_-(x_2; x_1) \qquad (3.152)
Further, the commutator of the fields at arbitrary times is given by (see Eq. (1.89)):

[\phi(x_2), \phi^\dagger(x_1)] = G_+(x_2; x_1) - G_+(x_1; x_2) = \int d\Omega_p \left[ e^{-ipx} - e^{+ipx} \right] = \int d\Omega_p\, e^{ip\cdot x} \left[ e^{-i\omega_p t} - e^{i\omega_p t} \right] \qquad (3.153)

48 In contrast, note that [φ(x), φ(y)] = [φ†(x), φ†(y)] = 0 for any pair of events for the complex scalar field.
Exercise 3.17: Verify all these.
This commutator vanishes outside the light cone i.e., when x1 and x2 are
related by a spacelike interval.48 As we explained in Sect. 1.5.1, this is
crucial for the validity of causality in field theory. The observables of the
theory are now built from field operators — usually quadratic functionals of
the field operators like the Hamiltonian etc. — and causality is interpreted
as the measurement of any such observable at a spacetime location not
affecting observables which are causally disconnected from that event. We
no longer think in terms of a single particle description or localized position
eigenstates of particles.
The main difference between the real and complex scalar field lies in the
fact that now we have an antiparticle which is distinct from the particle so
that we need two sets of creation and annihilation operators in Eq. (3.148)
and Eq. (3.149). The existence of a distinct antiparticle created by b†p is
reflected in all the physical quantities which we compute. Following exactly
the same steps as in the case of real scalar fields, we can now compute the
(normal ordered) Hamiltonian and the momentum, obtaining
:H: = \int \frac{d^3p}{(2\pi)^3}\, \omega_p \left[ a_p^\dagger a_p + b_p^\dagger b_p \right] \qquad (3.154)

P^\alpha = \int \frac{d^3p}{(2\pi)^3}\, p^\alpha \left[ a_p^\dagger a_p + b_p^\dagger b_p \right] \qquad (3.155)
which shows that both the particle and the antiparticle contribute additively to the energy and momentum.
To see the physical distinction between the particle and the antiparticle,
we will compute the operator corresponding to the conserved charge of the
complex scalar field introduced in Eq. (3.44). The invariance of the action
for complex scalar field under the transformations in Eq. (3.42) leads to
a conserved current J^a, the zeroth component of which is given by (see Eq. (3.43)):

J^0 = \sum_A \pi_{(A)}\, \delta\phi_{(A)} = -iq\, \phi\, \partial_t \phi^\dagger + iq\, \phi^\dagger\, \partial_t \phi = iq\left( \phi^\dagger \partial_t \phi - \phi\, \partial_t \phi^\dagger \right) \qquad (3.156)
The integral of J^0 over all space gives the charge Q. Substituting the expansions in Eq. (3.148) and Eq. (3.149) into Eq. (3.44), we get

Q = i \int d^3x\; \phi^\dagger \overleftrightarrow{\partial_0}\, \phi = i \int d^3x \int \frac{d^3p}{(2\pi)^3} \frac{1}{\sqrt{2\omega_p}} \int \frac{d^3q}{(2\pi)^3} \frac{1}{\sqrt{2\omega_q}}
 \times \left[ \left( a_q^\dagger e^{iqx} + b_q e^{-iqx} \right) \partial_0 \left( a_p e^{-ipx} + b_p^\dagger e^{ipx} \right) - \left( a_p e^{-ipx} + b_p^\dagger e^{ipx} \right) \partial_0 \left( a_q^\dagger e^{iqx} + b_q e^{-iqx} \right) \right]
 = \int d^3x \int \frac{d^3p}{(2\pi)^3} \frac{1}{\sqrt{2\omega_p}} \int \frac{d^3q}{(2\pi)^3} \frac{1}{\sqrt{2\omega_q}}
 \times \left[ \left( a_q^\dagger e^{iqx} + b_q e^{-iqx} \right) \omega_p \left( a_p e^{-ipx} - b_p^\dagger e^{ipx} \right) + \omega_q \left( a_q^\dagger e^{iqx} - b_q e^{-iqx} \right) \left( a_p e^{-ipx} + b_p^\dagger e^{ipx} \right) \right] \qquad (3.157)
Different terms respond differently to the integration over d³x. We get δ(p − q) from the terms having a_q^\dagger a_p and b_q b_p^\dagger, while the terms with b_q a_p and a_q^\dagger b_p^\dagger lead to δ(p + q). In either case, |p| = |q|, leading to ω_q = ω_p. This fact leads to the cancellation of the terms with b_q a_p and a_q^\dagger b_p^\dagger, leaving behind only the terms with a_q^\dagger a_p and b_q b_p^\dagger. In these terms, the time dependence goes away49 when we use ω_q = ω_p. So, we finally get:

Q = \int \frac{d^3q}{(2\pi)^3} \frac{1}{\sqrt{2\omega_q}} \int \frac{d^3p}{(2\pi)^3} \frac{1}{\sqrt{2\omega_p}}\; (2\pi)^3 \delta(p - q)\, 2\omega_p \left[ a_q^\dagger a_p - b_q b_p^\dagger \right] = \int \frac{d^3p}{(2\pi)^3} \left[ a_p^\dagger a_p - b_p b_p^\dagger \right] \qquad (3.158)
We are again in a bit of trouble because the vacuum expectation value of the second term will not vanish and will lead to an infinite amount of charge due to antiparticles. This can be avoided by the usual remedy of normal ordering Q, to obtain50

:Q: = \int \frac{d^3p}{(2\pi)^3} \left[ a_p^\dagger a_p - b_p^\dagger b_p \right] \qquad (3.159)
This expression shows that the net charge is the difference of the charges
contributed by the particles and antiparticles. So, if the particles carry
+1 unit of charge, then the antiparticles should carry −1 unit of charge.
This provides a simple and important procedure for distinguishing processes involving the particle from those involving the antiparticle.
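Equation (3.159) can be sanity-checked in a toy two-mode truncated Fock space (one particle mode a, one antiparticle mode b; the truncation size is an arbitrary illustrative choice):

```python
import numpy as np

# Single-mode version of :Q: = a†a - b†b on the tensor product of two
# truncated Fock spaces; the charge of |1_a, 0_b> is +1, of |0_a, 1_b> is -1.
N = 4
a1 = np.diag(np.sqrt(np.arange(1.0, N)), 1)   # single-mode annihilation operator
num = a1.T @ a1                               # number operator
I = np.eye(N)
Q = np.kron(num, I) - np.kron(I, num)         # a†a ⊗ 1  -  1 ⊗ b†b

vac = np.kron(np.eye(N)[0], np.eye(N)[0])     # |0_a, 0_b>
one_a = np.kron(np.eye(N)[1], np.eye(N)[0])   # |1_a, 0_b>
one_b = np.kron(np.eye(N)[0], np.eye(N)[1])   # |0_a, 1_b>

print(vac @ Q @ vac, one_a @ Q @ one_a, one_b @ Q @ one_b)  # 0.0 1.0 -1.0
```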
We conclude this section with a discussion of a couple more features related to the existence of a conserved current and charge. From the definition of the charge as an integral over J⁰ and Eq. (3.156), we can compute
49 Since we know that charge is conserved, we could have done all the calculation at t = 0, but once in a while
it is good not to be clever and see explicitly how cancellations occur.
50 The normal ordering is somewhat easier to justify in this case because the classical expression for the charge involves the product of φ* and φ̇; in quantum theory this could have been written either as φ†φ̇, or as φ̇φ†, or as some symmetric combination. Since φ† does not commute with φ̇, each of these choices will lead to a different expression for the total charge and it makes sense (maybe?) to settle the issue by demanding that the vacuum should have zero charge.
the commutator [Q, φ_{(A)}(0, x)] to be

[Q, \phi_{(A)}(x)] = \int d^3y\; \delta\phi_{(B)}(y)\, [\pi_{(B)}(y), \phi_{(A)}(x)] = i\left[ \delta\phi_{(A)}(x) \right] \qquad (3.160)
This result shows that one can think of the charge Q as the generator of
the infinitesimal transformation φ(A) → φ(A) + δφ(A) . The well known
special case of this is the translation in time and space generated by the
4-momentum.
Since a symmetry transformation generates a conserved current and an
associated charge, some of the crucial features of spontaneous symmetry
breaking can be related to the existence of the charge. For example, we
saw in Sect. 3.1.4 that the presence of degenerate vacua led to a zero mass
particle in the theory. This result happens to be very general and independent of the detailed structure of the Lagrangian in the relativistic field
theory. We will describe briefly how one can prove such a result in fairly
general terms.
Let us consider a system with a conserved charge Q so that [Q, H] = 0.
The ground state |0 of the system is taken to have zero energy so that
H|0 = 0. Since [Q, H]|0 = 0 = HQ|0 it follows that Q|0 also has zero
energy. When this state Q|0 differs from |0, we have a situation with
degenerate vacua. We want to show that under this circumstance there
will exist zero mass particles in the theory; that is, we have to show that
there exist states with some momentum k and energy E such that E → 0
when |k| → 0, which is the relativistic characterization of a zero mass state.
To do this, consider the state defined by

|s\rangle = R|0\rangle \equiv \int d^3x\; e^{ik\cdot x}\, J_0(x)\, |0\rangle \qquad (3.161)
where J_0(x) is the charge density. It is easy to see that this state is an eigenstate of the momentum operator P^α with eigenvalue k_α. We have

P^\alpha |s\rangle = \int d^3x\; e^{ik\cdot x}\, [P^\alpha, J_0]\, |0\rangle = i \int d^3x\; e^{ik\cdot x}\, (\partial_\alpha J^0)\, |0\rangle \qquad (3.162)

where we have used the facts that P^α|0⟩ = 0 and [P^α, J_0] = −i∂^α J_0 = i∂_α J^0. Doing an integration by parts and ignoring the surface term, we get

P^\alpha |s\rangle = -i \int d^3x\, (\partial_\alpha e^{ik\cdot x})\, J^0 |0\rangle = k_\alpha \int d^3x\; e^{ik\cdot x}\, J^0 |0\rangle = k_\alpha |s\rangle \qquad (3.163)
Consider now what happens to |s when we take the limit of |k| → 0.
We see that, in this limit, |s → Q|0 which we know is a state with zero
energy. In other words, the state |s has the property that when |k| → 0,
its energy vanishes. This shows that there are indeed zero mass excitations
in the theory.
3.6 Quantizing the Electromagnetic Field
We will next take up the example of the electromagnetic field which —
historically — started it all. Our aim is to quantize the classical electromagnetic field and obtain the photons as its quanta. This actually turns
out to be much more difficult than one would have first imagined because of
the gauge invariance. So we will first describe the difficulties qualitatively
and then take up the quantization.
We saw that quantizing a field involves identifying the relevant dynamical variables (which are the ones that you vary in the action to get
the classical field equations), calculating the corresponding canonical momenta and then imposing the commutation rules between the two. The
Lagrangian for the electromagnetic field is proportional to F ij Fij where
Fij = ∂i Aj − ∂j Ai and we vary Ai to obtain the classical field equations.
This identifies Ai as the dynamical variables. To determine the canonical
momenta corresponding to Ai , you need to compute ∂L/∂(∂0 Ai ) which
turns out to be proportional to F 0i . The trouble begins here.
The canonical momentum conjugate to A0 is F 00 , which, of course,
vanishes identically.51 So we cannot upgrade all Ai s to operator status and
impose the commutation rules. Of course, this is related to the fact that
the physical observables of the theory are F ik and not Ai , and, because of
gauge invariance, there is redundancy in the latter variables.
To get around this difficulty, we need to impose some gauge condition
and work with true dynamical variables. In principle, one can impose
any one of the many possible gauge conditions, but most of them will
lead to fairly complicated description of the quantum theory. Two natural
conditions are the following: (i) φ = 0, ∇ · A = 0; (ii) ∂j Aj = 0. The first
condition (which is what we will work with) has the disadvantage that it is
not manifestly Lorentz invariant and so you need to check that final results
are Lorentz invariant explicitly by hand. Further, when we try to impose
the condition
\left[ A^\alpha(t, \mathbf{x}),\, \dot{A}^\beta(t, \mathbf{y}) \right] = -i\eta^{\alpha\beta}\, \delta(\mathbf{x} - \mathbf{y}) = i\delta^{\alpha\beta}\, \delta(\mathbf{x} - \mathbf{y}) \qquad (3.164)
51 If you write the Lagrangian as (E² − B²) and note that E = −∇φ − (∂A/∂t), B = ∇ × A, it is obvious that the Lagrangian contains Ȧ but no φ̇. So clearly the momentum conjugate to φ vanishes.
we run into trouble if we take the divergence of both sides with respect to x^α. We find that

\left[ \partial_\alpha A^\alpha(t, \mathbf{x}),\, \dot{A}^\beta(t, \mathbf{y}) \right] = -i\partial^\beta \delta(\mathbf{x} - \mathbf{y}) \neq 0 \qquad (3.165)
violating the ∇ · A = 0 condition. The second gauge52 is Lorentz invariant but, if we impose the commutation relation

\left[ A^i(t, \mathbf{x}),\, \dot{A}^j(t, \mathbf{y}) \right] = -i\eta^{ij}\, \delta(\mathbf{x} - \mathbf{y}) \qquad (3.166)

the time component again creates a problem because the right hand side now has a flipped sign for the time component vis-a-vis the space components. This leads to the existence of negative norm states and related difficulties. All these can be handled, but not as trivially as in the case of scalar fields. We will use the first gauge and get around the difficulty mentioned above by a different route.
3.6.1 Quantization in the Radiation Gauge
Let us consider a free electromagnetic field, with the vector potential A^i = (φ, A) satisfying the gauge conditions φ = 0, ∇ · A = 0. As usual we expand A(t, x) in Fourier space as:

\mathbf{A}(t, \mathbf{x}) = \int \frac{d^3k}{(2\pi)^3}\, \mathbf{q}_k(t)\, e^{i\mathbf{k}\cdot\mathbf{x}} = \int \frac{d^3k}{(2\pi)^3} \left[ \mathbf{a}_k(t) + \mathbf{a}_{-k}^*(t) \right] e^{i\mathbf{k}\cdot\mathbf{x}} \qquad (3.167)
52 Almost universally called the Lorentz gauge because it was discovered by Lorenz; yet another example of dubious naming conventions used in this subject. This gauge was first used by L. V. Lorenz in 1867, when the more famous H. A. Lorentz was just 14 years old and wasn't particularly concerned about gauges. We shall give the credit to Lorenz and use the correct spelling. The original reference is L. Lorenz, On the Identity of the Vibrations of Light with Electrical Currents, Philos. Mag. 34, 287-301 (1867).
which has exactly the same structure as in the case of the scalar field, given
by Eq. (3.105). The condition ∇ · A = 0, translates into
k · qk = k · ak = k · a∗k = 0;
(3.168)
i.e., for every value of k, the vector ak is perpendicular to k. This allows
the vectors qk , ak to have two components akλ (λ = 1, 2) and qkλ (λ = 1, 2)
in the plane perpendicular to k with {ak1 , ak2 , k} forming an orthogonal
system. The electric and magnetic fields corresponding to this vector potential $A^i = (0, \mathbf{A})$ are

$$\mathbf{E} = -\dot{\mathbf{A}} = -\int \frac{d^3k}{(2\pi)^3}\;\dot{\mathbf{q}}_{\mathbf{k}}\, e^{i\mathbf{k}\cdot\mathbf{x}}; \qquad \mathbf{B} = \nabla\times\mathbf{A} = i\int \frac{d^3k}{(2\pi)^3}\;\left(\mathbf{k}\times\mathbf{q}_{\mathbf{k}}\right) e^{i\mathbf{k}\cdot\mathbf{x}}. \qquad (3.169)$$

Substituting this into the action in Eq. (3.63) and using

$$\left(\mathbf{k}\times\mathbf{q}_{\mathbf{k}}\right)\cdot\left(\mathbf{k}\times\mathbf{q}^{*}_{\mathbf{k}}\right) = \mathbf{q}^{*}_{\mathbf{k}}\cdot\left[\left(\mathbf{k}\times\mathbf{q}_{\mathbf{k}}\right)\times\mathbf{k}\right] = \left(\mathbf{q}^{*}_{\mathbf{k}}\cdot\mathbf{q}_{\mathbf{k}}\right)k^2 \qquad (3.170)$$
we can again express the action as that of a sum of oscillators:

$$\mathcal{A} = \frac{1}{2}\int d^3x\,\left(E^2 - B^2\right) = \frac{1}{2}\int \frac{d^3k}{(2\pi)^3}\sum_{\lambda=1,2}\left(|\dot q_{\mathbf{k}\lambda}|^2 - k^2\,|q_{\mathbf{k}\lambda}|^2\right). \qquad (3.171)$$
So, once again, we find that the action for the electromagnetic field can
be expressed as a sum of the actions for harmonic oscillators with each
oscillator labeled by the wave vector k and a polarization. The polarization
part is the only extra complication compared to the scalar field and the
rest of the mathematics proceeds exactly as before. The dispersion relation
now corresponds to ωk = |k| showing that the quanta are massless, as we
expect for photons.
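The vector identity Eq. (3.170) used in this reduction is easy to check numerically. The sketch below is an illustrative check, not from the text; the particular random amplitudes are arbitrary choices:

```python
import numpy as np

rng = np.random.default_rng(1)
k = rng.normal(size=3)                    # an arbitrary wave vector

# A complex amplitude q transverse to k, as required by Eq. (3.168)
q = rng.normal(size=3) + 1j * rng.normal(size=3)
q -= (q @ k) * k / (k @ k)               # project out the k-component
assert abs(q @ k) < 1e-12

# Eq. (3.170): (k x q) . (k x q*) = (q* . q) k^2 for transverse q
lhs = np.cross(k, q) @ np.cross(k, q.conj())
rhs = (q.conj() @ q) * (k @ k)
assert np.isclose(lhs, rhs)
```

The identity is the standard $(\mathbf{A}\times\mathbf{B})\cdot(\mathbf{A}\times\mathbf{C}) = A^2(\mathbf{B}\cdot\mathbf{C}) - (\mathbf{A}\cdot\mathbf{B})(\mathbf{A}\cdot\mathbf{C})$ with the cross terms killed by transversality.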
The wave equation satisfied by $\mathbf{A}(t,\mathbf{x})$ implies that $\mathbf{q}_{\mathbf{k}}(t)$ satisfies the harmonic oscillator equation $\ddot{\mathbf{q}}_{\mathbf{k}} + k^2\,\mathbf{q}_{\mathbf{k}} = 0$, allowing us to take the time evolution to be $\mathbf{a}_{\mathbf{k}} \propto \exp(-ikt)$. With the usual normalization, this gives
the mode expansion for the vector potential in the standard form

$$\mathbf{A}(t,\mathbf{x}) = \sum_{\lambda=1,2}\int \frac{d^3k}{\sqrt{(2\pi)^3\,2\omega_k}}\,\left[\mathbf{a}_{\mathbf{k}\lambda}\, e^{-ikx} + \mathbf{a}^{\dagger}_{\mathbf{k}\lambda}\, e^{ikx}\right] \qquad (3.172)$$

^{53} Notice that the operator $a^{\dagger}_{\mathbf{k}\lambda}$ creates a photon with a polarization state labelled by λ and momentum labelled by k, which will define a direction (of propagation) in the 3-dimensional space. The polarization vectors $\epsilon^{\alpha}(\mathbf{k},\lambda)$ necessarily have to depend on the momentum vector k of the photon. For example, if we try to pick 3 four-vectors with components $\epsilon^{a}_{1} = (0,1,0,0)$, $\epsilon^{a}_{2} = (0,0,1,0)$ and $\epsilon^{a}_{3} = (0,0,0,1)$ as the bases, the Lorentz transformations will mix them up with the "time-like polarization" vector $\epsilon^{a}_{0} = (1,0,0,0)$.
It is convenient in manipulations to separate out the vector nature from the creation and annihilation operators and write $\mathbf{a}_{\mathbf{k}\lambda} = \boldsymbol{\epsilon}(\mathbf{k},\lambda)\,a_{\mathbf{k}\lambda}$, where $a_{\mathbf{k}\lambda}$ is the annihilation operator for the photon with momentum k and polarization λ, while $\boldsymbol{\epsilon}(\mathbf{k},\lambda)$ is just a c-number vector carrying the vector nature of A. Since $\boldsymbol{\epsilon}(\mathbf{k},1)$, $\boldsymbol{\epsilon}(\mathbf{k},2)$ and $\mathbf{k}/|\mathbf{k}|$ form an orthonormal basis, they satisfy the constraint:^{53}
$$\sum_{\lambda=1}^{2}\epsilon^{\alpha}(\mathbf{k},\lambda)\,\epsilon^{\beta}(\mathbf{k},\lambda) + \frac{k^{\alpha}k^{\beta}}{k^2} = \delta^{\alpha\beta} \qquad (3.173)$$
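The orthonormality of the triad and the completeness relation Eq. (3.173) can be checked numerically. The sketch below is an illustrative check, not from the text; the particular polarization vectors constructed are arbitrary choices, since any orthonormal pair in the plane perpendicular to k works:

```python
import numpy as np

rng = np.random.default_rng(0)
k = rng.normal(size=3)                    # an arbitrary wave vector
khat = k / np.linalg.norm(k)

# Build an orthonormal pair {eps1, eps2} in the plane perpendicular to k,
# so that {eps1, eps2, khat} is an orthonormal triad.
trial = np.array([1.0, 0.0, 0.0])
eps1 = trial - (trial @ khat) * khat      # project out the k-direction
eps1 /= np.linalg.norm(eps1)
eps2 = np.cross(khat, eps1)

# Transversality, Eq. (3.168): k . eps_lambda = 0
assert abs(k @ eps1) < 1e-12 and abs(k @ eps2) < 1e-12

# Completeness, Eq. (3.173): sum_lambda eps^a eps^b + k^a k^b / k^2 = delta^ab
lhs = np.outer(eps1, eps1) + np.outer(eps2, eps2) + np.outer(k, k) / (k @ k)
assert np.allclose(lhs, np.eye(3))
```

Moving the $k^{\alpha}k^{\beta}/k^2$ term to the right hand side gives the transverse projector $\delta^{\alpha\beta} - k^{\alpha}k^{\beta}/k^2$, which is exactly what reappears in the transverse delta function of Eq. (3.180).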
The creation and annihilation operators satisfy the standard commutation
rules, now with an extra polarization index:
$$\left[a_{\mathbf{k}'\lambda'},\, a_{\mathbf{k}\lambda}\right] = \left[a^{\dagger}_{\mathbf{k}'\lambda'},\, a^{\dagger}_{\mathbf{k}\lambda}\right] = 0; \qquad \left[a_{\mathbf{k}'\lambda'},\, a^{\dagger}_{\mathbf{k}\lambda}\right] = \delta(\mathbf{k}-\mathbf{k}')\,\delta_{\lambda\lambda'} \qquad (3.174)$$
3.6. Quantizing the Electromagnetic Field
107
Everything else proceeds as before. The Hamiltonian governing the free
electromagnetic field can be expressed as a sum of Hamiltonians for harmonic oscillators, with each oscillator labeled by a wave vector k and polarization index α. The quantum states of the system are labeled by a set of integers $|\{n_{\mathbf{k}\alpha}\}\rangle$, one for each oscillator labeled by (k, α). The expectation
value of H in such a state will be
$$E = \sum_{\alpha=1,2}\int \frac{d^3k}{(2\pi)^3}\;\omega_k\left(n_{\mathbf{k}\alpha} + \frac{1}{2}\right) \qquad (3.175)$$
where the second term is the energy of the system when all the oscillators are in the ground state — which can again be removed by normal ordering.
You might have thought all this is pretty obvious and standard. But,
if you think about it, the second term in Eq. (3.172) involving a†kλ is precisely the term about which we made a song and dance in Eq. (1.130) while
discussing the scalar field. It is related to propagating negative energy antiphotons backward in time! The quantization of the electromagnetic field
based on the separation in Eq. (3.172) describes photons and antiphotons,
carrying positive frequency and negative frequency modes and propagating
forward and backward in the language we have used earlier.54 This term
creates an antiphoton from the vacuum, but since the antiphoton is the
same as the photon, you think of it as a creation of photons — which every
light bulb does. This is the simplest example of a particle popping out of
nowhere, thereby making the standard description in terms of a Schrödinger equation completely inadequate. The more esoteric processes — like, for example, $e^+e^-$ annihilation producing some exotic new particles — are fundamentally no different from a light bulb emitting a photon through the action of the second term which has $a^{\dagger}_{\mathbf{k}\lambda}$ in it.
Let us next take a look at the commutation relation between the fields.
Once again, we have quantized a field by mapping the relevant dynamical
variables to a bunch of independent harmonic oscillators and then quantizing the oscillators. While discussing the scalar field, we said that this
procedure is equivalent to imposing the canonical commutation relations
between the dynamical variables and the canonical momenta; indeed —
in the case of the scalar field — one can derive the commutation relations between the creation and annihilation operators from the canonical
commutation relations between the dynamical variables and the canonical
momenta. But in the case of the electromagnetic field in the radiation
gauge, the situation is somewhat different. We know from Eq. (3.164) and
Eq. (3.165) that we cannot impose the standard commutation relations and
maintain consistency with the radiation gauge condition. So, one cannot
start from Eq. (3.164) and get the commutation rules for the creation and
annihilation operators in a straightforward manner.
Fortunately, our procedure — of first identifying the relevant oscillators
and then just quantizing them — bypasses this problem. Having quantized
the system by this procedure, we can now go back and compute the relevant
commutators between the field and its momentum and see what happens.
The canonical momenta^{55} corresponding to $A^{\alpha}$:

$$\pi_{\alpha} = \frac{\partial L}{\partial(\partial_0 A^{\alpha})} = -\partial_0 A^{\alpha} = E^{\alpha} \qquad (3.176)$$
^{54} While every textbook makes some noise about these "new features" when it quantizes the scalar field, these are not emphasized in the context of the familiar electromagnetic field. You talk about $\pi^+$ and $\pi^-$, electrons and positrons and their strange relationships, but the same thing happens with photons and antiphotons contained in Eq. (3.172). It is the same maths and same physics (except for m = 0).
^{55} Note the placement of α and the sign flips due to our signature; the components of A, E etc. have, by definition, a superscript index.
has the explicit expansion:

$$\mathbf{E}(t,\mathbf{x}) = \int \frac{d^3k}{\sqrt{(2\pi)^3\,2\omega_k}}\,\sum_{\lambda=1}^{2} i\omega_k\,\boldsymbol{\epsilon}(\mathbf{k},\lambda)\left[a_{\mathbf{k}\lambda}\, e^{-ikx} - a^{\dagger}_{\mathbf{k}\lambda}\, e^{ikx}\right] \qquad (3.177)$$
We need to compute $\left[A^{\alpha}(t,\mathbf{x}),\, E^{\beta}(t,\mathbf{x}')\right]$, which is given by:

$$\begin{aligned}
\left[A^{\alpha}(t,\mathbf{x}),\, E^{\beta}(t,\mathbf{x}')\right] &= \int \frac{d^3k}{\sqrt{(2\pi)^3\,2\omega_k}}\int \frac{d^3k'}{\sqrt{(2\pi)^3\,2\omega_{k'}}}\sum_{\lambda=1}^{2}\sum_{\lambda'=1}^{2}\epsilon^{\alpha}(\mathbf{k},\lambda)\,\epsilon^{\beta}(\mathbf{k}',\lambda') \\
&\quad\times\left\{\left[a_{\mathbf{k}\lambda},\, a^{\dagger}_{\mathbf{k}'\lambda'}\right]e^{-i(kx-k'x')}(-i\omega_{k'}) + \left[a^{\dagger}_{\mathbf{k}\lambda},\, a_{\mathbf{k}'\lambda'}\right]e^{i(kx-k'x')}(+i\omega_{k'})\right\} \\
&= \int \frac{d^3k}{(2\pi)^3\,2\omega_k}\sum_{\lambda=1}^{2}\epsilon^{\alpha}(\mathbf{k},\lambda)\,\epsilon^{\beta}(\mathbf{k},\lambda)\,(-i\omega_k)\left[e^{i\mathbf{k}\cdot(\mathbf{x}-\mathbf{x}')} + e^{-i\mathbf{k}\cdot(\mathbf{x}-\mathbf{x}')}\right]
\end{aligned} \qquad (3.178)$$
We now do the usual flip of k to −k in the second term and use Eq. (3.173) to obtain:

$$\left[A^{\alpha}(t,\mathbf{x}),\, E^{\beta}(t,\mathbf{x}')\right] = -i\int \frac{d^3k}{(2\pi)^3}\, e^{i\mathbf{k}\cdot(\mathbf{x}-\mathbf{x}')}\left(\delta^{\alpha\beta} - \frac{k^{\alpha}k^{\beta}}{k^2}\right) \equiv -i\,\delta^{\alpha\beta}_{\perp}(\mathbf{x}-\mathbf{x}') \qquad (3.179)$$

where the last equality defines the transverse delta function:

$$\delta^{\perp}_{\alpha\beta}(\mathbf{x}-\mathbf{x}') \equiv \int \frac{d^3k}{(2\pi)^3}\, e^{i\mathbf{k}\cdot(\mathbf{x}-\mathbf{x}')}\left(\delta_{\alpha\beta} - \frac{k_{\alpha}k_{\beta}}{k^2}\right) \qquad (3.180)$$
Its key property is that $\partial_{\alpha}\delta^{\alpha\beta}_{\perp}(\mathbf{x}) = 0$ (which is obvious in the momentum space representation), thereby making the commutation relations consistent with ∇ · A = 0. One can write down the explicit form of $\delta^{\alpha\beta}_{\perp}(\mathbf{x})$ in coordinate space by noting that it can be written, formally, as

$$\delta^{\perp}_{\alpha\beta}(\mathbf{x}) = \left(\delta_{\alpha\beta} - \partial_{\alpha}\,\frac{1}{\nabla^2}\,\partial_{\beta}\right)\delta(\mathbf{x}) \equiv (P_{\perp})_{\alpha\beta}\,\delta(\mathbf{x}) \qquad (3.181)$$
Exercise 3.18: Prove this.
By defining the inverse of the Laplacian as an integral operator, one can easily show that:

$$\delta^{\perp}_{\alpha\beta}(\mathbf{x}) = \delta_{\alpha\beta}\,\delta(\mathbf{x}) + \partial_{\alpha}\partial_{\beta}\,\frac{1}{4\pi|\mathbf{x}|} \qquad (3.182)$$

^{56} The way we approached the problem, this should not worry you; finding the oscillators and quantizing them is the safest procedure, when it can be done. If, instead, you start from the commutator for fields and canonical momenta, you need to first argue — somewhat unconvincingly — that we should introduce the transverse delta function for the commutators and then proceed with the quantization.
The analysis shows that the commutator $[A^{\alpha}(t,\mathbf{x}),\, E^{\beta}(t,\mathbf{x}')]$ does not have the value $i\delta^{\alpha\beta}\delta(\mathbf{x}-\mathbf{x}')$ which we might have naively thought it should have. In the radiation gauge, if we identify the correct harmonic oscillator variables and quantise them, everything works fine and this commutator has^{56} a value given by Eq. (3.179).
The price we have paid is the loss of manifest Lorentz invariance and,
of course, loss of gauge invariance. The first problem can be avoided by
using a Lorentz invariant gauge like the Lorenz gauge. The second problem
is going to stay with us because we have to always fix a gauge — naively
or in a more sophisticated manner — to isolate the physical variables but
we expect the meaningful results to be gauge invariant. We will say more
about this in the next section.
Let us next consider the commutator between the fields at arbitrary events. A straightforward calculation gives:

$$\begin{aligned}
\left[A^{\alpha}(x),\, A^{\beta}(y)\right] &= \int \frac{d^3k}{(2\pi)^3}\,\frac{1}{2\omega_k}\sum_{\lambda=1}^{2}\epsilon^{\alpha}(\mathbf{k},\lambda)\,\epsilon^{\beta}(\mathbf{k},\lambda)\left[e^{-ik(x-y)} - e^{+ik(x-y)}\right] \\
&= \int \frac{d^3k}{(2\pi)^3}\,\frac{1}{2\omega_k}\left(\delta^{\alpha\beta} - \frac{k^{\alpha}k^{\beta}}{k^2}\right)\left[e^{-ik(x-y)} - e^{+ik(x-y)}\right]
\end{aligned} \qquad (3.183)$$
k2
This can again be expressed in the coordinate space using the non-local
projection operator and in terms of the function Δ(x) ≡ G+ (x) − G+ (−x)
we introduced in Eq. (1.89) but now evaluated for m = 0; it is usual to
define
1
(3.184)
D(x) ≡ −iΔ(x)|m=0 = − θ(t)δ(x2 )
2π
in the massless case where the last equality is easy to obtain from any of
the integral representations for Δ. We find
$$\left[A^{\alpha}(x),\, A^{\beta}(y)\right] = \left(\delta^{\alpha\beta} - \frac{\partial^{\alpha}\partial^{\beta}}{\nabla^2}\right)\int \frac{d^3k}{(2\pi)^3}\,\frac{1}{2\omega_k}\left[e^{-ik(x-y)} - e^{+ik(x-y)}\right] = (P_{\perp})^{\alpha\beta}\left[i\,D(x-y)\right] \qquad (3.185)$$
A straightforward evaluation of the integrals reveals, at first sight, another
problem: this commutator now does not vanish outside the light cone —
something we have been claiming is rather sacred! But this is sacred only
for observable quantities and Ai is not an observable. (In fact, if we use
some other gauge we will get a different result for this commutator; in
the Lorenz gauge it does happen to vanish outside the light cone.) If
the commutator between the electric fields, for example, does not vanish
outside the light cone, you will be in real trouble. This, as you might have
guessed from the fact that we are still in business, does not happen. The
commutator of the electric fields is given by:
$$\begin{aligned}
\left[E^{\alpha}(x),\, E^{\beta}(y)\right] &= \partial_{x^0}\partial_{y^0}\left[A^{\alpha}(x),\, A^{\beta}(y)\right] = i\,\partial^{0}\partial^{0}\,(P_{\perp})^{\alpha\beta}D(x-y) \\
&= i\,\delta^{\alpha\beta}\,\partial^{0}\partial^{0}D(x-y) - \partial^{\alpha}\partial^{\beta}\int \frac{d^3k}{(2\pi)^3}\,\frac{1}{2\omega_k}\,\frac{\omega_k^2}{k^2}\left[e^{-ik(x-y)} - e^{+ik(x-y)}\right] \\
&= i\left(\delta^{\alpha\beta}\,\partial^{0}\partial^{0} - \partial^{\alpha}\partial^{\beta}\right)D(x-y)
\end{aligned} \qquad (3.186)$$
which vanishes for spacelike separation. Further, this commutator has the
same expression in all gauges, as one would have expected.
Exercise 3.19: Prove Eq. (3.184).

3.6.2 Gauge Fixing and Covariant Quantization
In the case of scalar fields we saw that the propagation amplitude G(x2 ; x1 )
(or propagator, for short) can be related to the vacuum expectation value of
the time ordered product of the fields. In a similar manner, one can obtain
the propagator for the photon from the vacuum expectation value of the
time ordered product of the vector potential. We would, however, like to
have a photon propagator which is Lorentz invariant and this cannot be
obtained using the radiation gauge discussed above because it is not Lorentz invariant.

Exercise 3.20: Show that $\left[A^{\alpha}(x),\, A^{\beta}(y)\right]$ is given by $i\delta^{\alpha\beta}D(x-y) + \partial^{\alpha}\partial^{\beta}H(x-y)$, where $H(x) = -(i/8\pi)(1/r)\left[v\,\mathrm{sgn}(v) - u\,\mathrm{sgn}(u)\right]$ with $v \equiv t + r$ and $u \equiv t - r$. Hence argue that the commutator does not vanish outside the light cone.

While we will not work out in detail the issue of quantizing
the electromagnetic field in Lorentz invariant gauges, it will be useful to
describe some important features which will be needed in later chapters.
We will now take up some of these topics.
We have seen in the last chapter (see Sect. 2.2.1; especially Eq. (2.49)
and Eq. (2.48)) that the propagator arises as the inverse of a differential
operator D which occurs in the Lagrangian in the form φDφ. In the case
of the electromagnetic field, the Lagrangian can be rewritten, by removing
a total divergence, in the form
$$-\frac{1}{4}\,F_{ik}F^{ik} = -\frac{1}{2}\left(\partial_{i}A_{k}\right)\left(\partial^{i}A^{k} - \partial^{k}A^{i}\right) = -\frac{1}{2}\,\partial_{i}\left[A_{k}\left(\partial^{i}A^{k} - \partial^{k}A^{i}\right)\right] + \frac{1}{2}\,A_{k}\left(\delta^{k}_{i}\,\Box - \partial^{k}\partial_{i}\right)A^{i} \qquad (3.187)$$
Therefore the action can be expressed in the form

$$\mathcal{A} = -\frac{1}{4}\int d^4x\, F_{ik}F^{ik} \Longrightarrow \frac{1}{2}\int d^4x\; A_k\left(\delta^{k}_{i}\,\Box - \partial^{k}\partial_{i}\right)A^{i} = -\frac{1}{2}\int \frac{d^4p}{(2\pi)^4}\; A_k(p)\left(\delta^{k}_{i}\,p^2 - p^{k}p_{i}\right)A^{i}(p) \qquad (3.188)$$
^{57} This is the party line and, of course, this procedure has been enormously successful in practice. But the fact that the cleanest description of nature seems to require gauge redundant variables — like $A_j$ in electrodynamics, and it gets worse in gauge theories — is rather strange when you think about it. Mathematically, it has something to do with the representations of the Lorentz group and elementary combinatorics; we will say something about it in a later chapter. Physically, this issue is related to whether $A_j$ is a local observable in quantum theory (and all indications are that it is not) and, if not, whether we can formulate the entire theory using only gauge invariant variables like $F_{ik}$. There have been attempts, but nobody has succeeded in coming up with a useful, simple formalism which does not use gauge dependent variables. One can indeed rewrite all the expressions involving the vector potentials $A_j$ in terms of the field tensor $F_{ab}$ after choosing a specific gauge. But this relation will be non-local, and the resulting theory will appear to be non-local though we know that it is actually local — and you need to first choose a gauge to do this inversion. This is one of the reasons we do not try to express everything in terms of $F_{ab}$, and learn to love gauge fields.
In arriving at the first step we have ignored the total divergence and the
last equality is obtained by writing everything in the momentum space. To
find the propagator in momentum space we need to find the inverse of the matrix $M^{k}_{i} = \delta^{k}_{i}\,p^2 - p^{k}p_{i}$. But since $M^{k}_{i}\,p^{i} = 0$, it is obvious that this matrix has an eigenvector $p^{i}$ with zero eigenvalue. Therefore, its inverse does not exist.
The reason is that the action is invariant under the gauge transformation Ai → Ai + ∂i Λ which translates to Ai (p) → Ai (p) − ipi Λ(p) in the
momentum space. The gauge invariance requires the extra term proportional to pi not to contribute to the action in momentum space. This is
why the matrix Mik necessarily has to satisfy the constraint Mik pi = 0; so
it must have a zero eigenvalue if the action is gauge invariant. In other
words, we cannot find the inverse if we work with a gauge invariant action.
The simplest way out is to fix the gauge in some suitable form and hope
that final results will be gauge invariant.57 There are several ways of fixing
the gauge and we will discuss a couple of them and a formal procedure to
do this at the level of the action.
The first procedure is a cheap trick. If we add a mass term for the photon, a term $m^2 A_j A^j$ in the Lagrangian, we immediately break the gauge invariance. In this case the action for the vector field interacting with, say, a current $J^m$ will be

$$\mathcal{A} = \int d^4x\, L = \int d^4x\left[\frac{1}{2}\,A_m\left\{\left(\partial^2 + m^2\right)\eta^{mn} - \partial^{m}\partial^{n}\right\}A_n + A_m J^m\right] \qquad (3.189)$$
Now the relevant propagator $D_{nl}$ will satisfy the equation

$$\left[\left(\partial^2 + m^2\right)\eta^{mn} - \partial^{m}\partial^{n}\right]D_{nl}(x) = \delta^{m}_{l}\,\delta(x) \qquad (3.190)$$
This is exactly analogous to Eq. (2.49) for the scalar field with a couple of
extra indices. The solution is trivial in Fourier space and we get

$$D_{nl}(k) = \frac{-\eta_{nl} + (k_n k_l/m^2)}{k^2 - m^2 + i\epsilon} \qquad (3.191)$$
which is a well defined propagator for a massive vector field. In the last step, we have also added the $i\epsilon$ factor with the usual rule that $m^2 \to m^2 - i\epsilon$.
The propagator for the photon needs to be obtained by taking the m →
0 limit of our expressions, which makes Eq. (3.191) blow up in your face.
The idea is to leave m non-zero, do calculations with it and take the limit
right at the end after computing the relevant amplitude. This trick usually
works as long as the vector field is coupled to a conserved current. When the
current J m is conserved, the equation ∂m J m = 0 translates to km J m (k) =
0 in momentum space. While computing amplitudes, you usually have
to contract the propagator with the current, obtaining $D_{nm}(k)J^{m}(k)$. In these expressions the $k_m k_n/m^2$ term does not contribute because $k_m J^{m}(k) = 0$, and we get the same result as with no mass term. We can take the m → 0 limit trivially.
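The inversion leading to Eq. (3.191) can be checked numerically. The sketch below is an illustrative check, not from the text, dropping the iε prescription; the particular off-shell momentum and regulator mass are arbitrary choices:

```python
import numpy as np

eta = np.diag([1.0, -1.0, -1.0, -1.0])   # metric, signature (+,-,-,-)
m2 = 0.25                                 # regulator mass squared (arbitrary)
k_up = np.array([1.3, 0.4, -0.7, 0.2])    # k^m, arbitrary and off-shell
k_dn = eta @ k_up                         # k_m
k2 = k_up @ k_dn                          # k^2 = 1.0 here, so k^2 != m^2

# Eq. (3.190) in momentum space: (partial^2 + m^2) eta^{mn} - partial^m partial^n
# becomes (m^2 - k^2) eta^{mn} + k^m k^n.
M = (m2 - k2) * eta + np.outer(k_up, k_up)

# Propagator of Eq. (3.191), without the i-epsilon:
D = (-eta + np.outer(k_dn, k_dn) / m2) / (k2 - m2)

# M^{mn} D_{nl} = delta^m_l
assert np.allclose(M @ D, np.eye(4))
```

The $k_n k_l/m^2$ piece is essential for the inverse to exist at finite m, which is exactly the piece that blows up as m → 0.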
As an application of this trick, let us consider the functional Fourier transform of the massive vector field with respect to an external source $J^m(x)$, exactly in analogy with what we did for scalar fields in Sect. 2.2.1. We will now get the result

$$\int \mathcal{D}A_j\;\exp\left[i\int d^4x\left\{\frac{1}{2}\,A_k\left(\left(\Box + m^2\right)\delta^{k}_{i} - \partial^{k}\partial_{i}\right)A^{i} + A_m J^m\right\}\right] = \exp\left[iW(J)\right] \qquad (3.192)$$
where

$$W(J) = -\frac{1}{2}\int \frac{d^4k}{(2\pi)^4}\;J^{m*}(k)\left[\frac{-\eta_{mn} + (k_m k_n/m^2)}{k^2 - m^2 + i\epsilon}\right]J^{n}(k) \qquad (3.193)$$
For a conserved current, the terms in the numerator with $(1/m^2)$ do not contribute, and we get

$$W(J) = \frac{1}{2}\int \frac{d^4k}{(2\pi)^4}\;J^{m*}(k)\,\frac{1}{k^2 - m^2 + i\epsilon}\,J_{m}(k) \qquad (3.194)$$
in which we can take the limit of m going to zero in the denominator without any difficulty and obtain:

$$W(J) = \frac{1}{2}\int \frac{d^4k}{(2\pi)^4}\;\frac{J^{*}_{m}(k)\,J^{m}(k)}{k^2 + i\epsilon} \qquad (3.195)$$
If you compare this with the scalar field case you see a crucial sign difference
in W (J); so, if you rework the energy for a static source, it will come out
to be positive.58 Like charges repel in electromagnetism.
While the above example shows that you can sometimes fix the gauge by
adding a mass term to the photon, this is no good if you want to write down
a covariant propagator which is finite. As we saw above, the propagator
itself was divergent in the m → 0 limit. One way out of this is to add
an explicit gauge fixing term into the Lagrangian in which we do not have
to take any singular limits. A simple way to impose the gauge condition
∂m Am = constant, say, is to modify the Maxwell Lagrangian by adding
an arbitrary function Q(∂A) of ∂m Am ≡ ∂A. Let P be the derivative
of Q with respect to its argument; i.e., $P(\partial A) = Q'(\partial A)$. The modified Lagrangian

$$L = -\frac{1}{4}\,F_{ij}F^{ij} - Q(\partial A) \qquad (3.196)$$
^{58} The sign difference arises from the $-\eta_{nl}$ term in Eq. (3.191), which leads to a term $D_{00} = -(k^2 + i\epsilon)^{-1}$ with an extra minus sign compared to the scalar field case.
then leads to the field equation:

$$\Box A^{j} - \partial^{j}(\partial A) + \partial^{j}P = 0 \qquad (3.197)$$

^{59} Though any non-trivial Q will fix the gauge, quadratic terms make the path integrals simpler.
On taking the divergence we get the condition $\Box P = 0$, for which one can consistently take the solution to be P = 0. This requires $\partial A$ (a zero of $Q'$) to be a constant, which is our gauge condition. The usual procedure is to take Q to be a quadratic function such as $Q = (1/2)\,\zeta\,(\partial A)^2$, where ζ is a constant.^{59} In this case, we modify the Lagrangian to the form:

$$L = -\frac{1}{4}\,F_{mn}F^{mn} - \frac{1}{2}\,\zeta\left(\partial_{s}A^{s}\right)^2 \qquad (3.198)$$
The field equations now get modified to

$$\Box A^{m} - (1-\zeta)\,\partial^{m}\left(\partial_{s}A^{s}\right) = 0 \qquad (3.199)$$
If we take the divergence of this equation with respect to $x^{m}$, we get the constraint $\zeta\,\Box\left(\partial_{s}A^{s}\right) = 0$, for which one can consistently choose the solution $\partial_{m}A^{m} = 0$, thereby imposing the Lorenz gauge condition on the vector potential. With this condition, the field equation reduces to $\Box A^{m} = 0$ which, of course, has a well defined propagator etc. One can also obtain the same result by choosing ζ = 1 in Eq. (3.198) so that the second term in Eq. (3.199) vanishes.
The differential operator $D^{-1}_{mn}$ in the Lagrangian $(A^{m}D^{-1}_{mn}A^{n})$ now takes the form, in momentum space,

$$D^{-1}_{mn}(k) = -\eta_{mn}\,k^2 + (1-\zeta)\,k_{m}k_{n} \qquad (3.200)$$
The propagator in momentum space is the inverse of this matrix. Assuming that the matrix has the form

$$D^{ns}(k) = A(k^2)\,\eta^{ns} + B(k^2)\,k^{n}k^{s} \qquad (3.201)$$

and imposing the condition $D^{-1}_{mn}(k)\,D^{ns}(k) = \delta^{s}_{m}$, we get the two constraints

$$-k^2 A(k^2) = 1; \qquad \zeta\,k^2 B(k^2) = (1-\zeta)\,A(k^2) \qquad (3.202)$$
One can immediately see that if ζ = 0, these equations are not compatible, showing again that, without the gauge fixing term, the differential operator has no inverse. When ζ ≠ 0, we can solve for A and B and obtain the propagator:

$$D^{mn}(k) = \frac{-\eta^{mn}}{k^2 + i\epsilon} + \frac{\zeta - 1}{\zeta}\,\frac{k^{m}k^{n}}{(k^2 + i\epsilon)^2} \qquad (3.203)$$
The choice ζ = 1 clearly simplifies the structure of the propagator and
reduces it to the form we encountered earlier in obtaining Eq. (3.195). The
gauge choice with ζ → ∞ is called the Landau gauge, and in this case the
propagator becomes
$$D^{mn}(k) = \frac{k^{m}k^{n} - \eta^{mn}\,k^2}{(k^2 + i\epsilon)^2} \qquad (3.204)$$
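The inversion leading to Eq. (3.203), and the Landau gauge limit Eq. (3.204), can also be checked numerically. This is an illustrative sketch, not from the text, again dropping the iε and using arbitrary values of k and ζ:

```python
import numpy as np

eta = np.diag([1.0, -1.0, -1.0, -1.0])
k_up = np.array([1.3, 0.4, -0.7, 0.2])    # arbitrary off-shell momentum
k_dn = eta @ k_up
k2 = k_up @ k_dn                          # k^2 != 0

for zeta in (0.3, 1.0, 100.0):
    # Gauge-fixed kinetic operator, Eq. (3.200)
    Dinv = -eta * k2 + (1.0 - zeta) * np.outer(k_dn, k_dn)
    # Propagator, Eq. (3.203), without the i-epsilon
    D = -eta / k2 + (zeta - 1.0) / zeta * np.outer(k_up, k_up) / k2**2
    # D^{-1}_{mn} D^{ns} = delta_m^s
    assert np.allclose(Dinv @ D, np.eye(4))

# With zeta = 0 the operator annihilates k^m, so it has no inverse:
Dinv0 = -eta * k2 + np.outer(k_dn, k_dn)
assert np.allclose(Dinv0 @ k_up, np.zeros(4))

# Landau gauge, Eq. (3.204), as the zeta -> infinity limit of Eq. (3.203):
D_landau = (np.outer(k_up, k_up) - eta * k2) / k2**2
zeta = 1e8
D_big = -eta / k2 + (zeta - 1.0) / zeta * np.outer(k_up, k_up) / k2**2
assert np.allclose(D_big, D_landau)
```

The ζ = 0 check is the numerical face of the zero eigenvalue noted below Eq. (3.188): the gauge invariant operator kills $k^m$ and therefore cannot be inverted.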
One can show, with a fair amount of effort and after surmounting some
additional complications, that these propagators can actually be expressed
as the vacuum expectation value of the time ordered product of the Aj s
when we quantize the field in the relevant gauge.
The way we have introduced the gauge fixing term in Eq. (3.198) allows taking the limit ζ → 0 easily. From the structure of Eq. (3.203), it is obvious that the parameter λ ≡ (1/ζ) expresses the propagator in an equivalent but simpler form as

$$D_{mn}(k) = -\frac{1}{k^2 + i\epsilon}\left(\eta_{mn} - \frac{k_{m}k_{n}}{k^2}\right) - \lambda\,\frac{k_{m}k_{n}}{(k^2 + i\epsilon)^2} = -\frac{1}{k^2 + i\epsilon}\left[\eta_{mn} - (1-\lambda)\,\frac{k_{m}k_{n}}{k^2}\right] \qquad (3.205)$$
The first form of this equation shows that the part of $D_{mn}(k)$ — which is independent of the gauge fixing term — is orthogonal to $k^{m}$, because of which we do not get a sensible inverse in the absence of gauge fixing terms.
The structure of this propagator will play a crucial role in our discussion
of QED.
There is a very cute way60 of introducing the gauge fixing terms we
saw above by using path integrals. While this is not essential for our
discussion, we will describe it because of its cleverness as well as the fact
that this procedure plays a crucial role in quantizing non-Abelian gauge
theories.
The starting point for this approach is the formal path integral for the vector field^{61} written in the form

$$Z = \int \mathcal{D}A_j\;\exp\left[iS(A_j)\right] \qquad (3.206)$$
The first thing you note is that this functional integral, as it stands, is ill-defined if we try to integrate over all functions $A_j(x)$. If we consider two vector potentials $A_{m}$ and $A^{(\alpha)}_{m}$ related by a gauge transformation $A^{(\alpha)}_{m} \equiv A_{m} + \partial_{m}\alpha$, then the action as well as the measure will be the same for both, because both are gauge invariant. So if you sum over both $A^{(\alpha)}_{m}$ and $A_{m}$, you will get a divergent result.^{62} This should be no more mysterious than integrating over both x and y a function which is actually independent of x. Nothing you do can make this divergence go away, but the idea is to separate it out in a sensible form as an infinite constant in Z such that the remaining part of the functional integral is well defined.
So we want to reduce Z to a form Z = N F where N is an infinite
constant independent of Aj and F is a well defined functional of Aj . Obviously, this will require restricting the gauge in some manner but the trick
is to do it in such a way that the integration over the gauge transformation
function α(x) can be separated out.
Let us say that we want to impose a gauge condition $G(A) \equiv \partial_{m}A^{m} - \omega(x) = 0$ where ω is some function. This can be done by introducing a Dirac delta functional δ(G) inside the functional integral. But that will be cheating, since it changes the value of the integral. A legitimate way of introducing the delta functional is to write the factor unity in the form

$$1 = \int \mathcal{D}\alpha(x)\;\delta\!\left(G(A^{(\alpha)}_{m})\right)\,\mathrm{Det}\!\left(\frac{\delta G(A^{(\alpha)}_{m})}{\delta\alpha}\right) \qquad (3.207)$$
This is just the integral $\int \mathcal{D}G\;\delta(G)$ [which is unity by definition] rewritten with α as the variable of integration, with the Jacobian determinant taking care of the transformation from G to α. Since $A^{(\alpha)}_{m} = A_{m} + \partial_{m}\alpha$, we have

^{60} This is called the Faddeev-Popov procedure.

^{61} Notation alert: We use S (rather than $\mathcal{A}$) for the action, for typographical clarity.
^{62} There is a subtlety here. The rigorous definition of path integrals from time slicing leads to integration in phase space over both coordinates and momenta (see Sect. 1.6.1). This generalizes to, for example, the scalar field in a natural fashion. The phase space for the electromagnetic field, however, is constrained because of gauge invariance. So it is not obvious that one can define the theory using a path integral over $A_j$ at all. We won't discuss this because there are ways of getting around it formally, leading to the same results.
$G(A^{(\alpha)}_{m}) = \partial_{m}A^{m} + \partial^2\alpha - \omega$ and $(\delta G/\delta\alpha) = \partial^2$; the determinant is just $\mathrm{Det}[\partial^2]$, which is independent of $A_j$. Introducing this factor of unity into the path integral and pulling out the determinant because it is independent of $A_j$, we get the result

$$Z = \int \mathcal{D}\alpha \int \mathcal{D}A_j\; e^{iS[A]}\;\delta\!\left(G(A^{(\alpha)}_{j})\right)\mathrm{Det}[\partial^2] = \int \mathcal{D}\alpha\;\mathrm{Det}[\partial^2]\int \mathcal{D}A_j\; e^{iS[A]}\;\delta\!\left(G(A^{(\alpha)}_{j})\right) \qquad (3.208)$$
But since both the measure and the action are gauge invariant, we know that $\mathcal{D}A_j = \mathcal{D}A^{(\alpha)}_{j}$ and $S[A] = S[A^{(\alpha)}_{j}]$, which allows us to write

$$Z = \int \mathcal{D}\alpha\;\mathrm{Det}[\partial^2]\int \mathcal{D}A^{(\alpha)}_{j}\; e^{iS[A^{(\alpha)}]}\;\delta\!\left(G(A^{(\alpha)}_{j})\right) = \int \mathcal{D}\alpha\;\mathrm{Det}[\partial^2]\int \mathcal{D}A_j\; e^{iS[A]}\;\delta(G) \qquad (3.209)$$
In arriving at the second equality we have just replaced integration over $A^{(\alpha)}_{j}$ by integration over $A_j$. We have now isolated the infinite integration measure over the gauge function α in the first factor, and all the dependence on $A_j$ is contained in the second functional integral over $A_j$. The Dirac delta function in it restricts the integration to configurations for which $G = \partial_{m}A^{m} - \omega(x) = 0$.
There is one final trick which we can use to eliminate the ω dependence of the result. Consider a functional integral over ω of some arbitrary function F(ω) leading to some finite constant C:

$$\int \mathcal{D}\omega\;F(\omega) = C; \qquad 1 = \frac{1}{C}\int \mathcal{D}\omega\;F(\omega) \qquad (3.210)$$
We now introduce this factor of unity into the path integral and write

$$\int \mathcal{D}A_j\; e^{iS[A]}\;\delta[\partial A - \omega] = \frac{1}{C}\int \mathcal{D}\omega\int \mathcal{D}A_j\; e^{iS[A]}\;\delta[\partial A - \omega]\,F(\omega) \qquad (3.211)$$
Interchanging the order of functional integrations, we find that we have the result

$$\int \mathcal{D}A_j\; e^{iS[A]}\;\delta[\partial A - \omega] = \frac{1}{C}\int \mathcal{D}A\; e^{iS[A]}\,F(\partial A) \qquad (3.212)$$
which is expressed entirely in terms of the vector potential $A_j$. If you take $F \propto \exp\left[-i\int Q(\partial A)\,d^4x\right]$, then we are essentially adding $-Q(\partial A)$ as a gauge fixing term to the Lagrangian, obtaining the result in Eq. (3.196). But since we like quadratic Lagrangians, we choose $Q \propto (\partial A)^2$ by taking

$$F(\omega) = \exp\left[-i\int d^4x\;\frac{\zeta}{2}\,\omega^2\right] \qquad (3.213)$$
in which case the final result will be

$$Z = \left\{\frac{1}{C}\,\mathrm{det}\!\left[\partial^2\right]\int \mathcal{D}\alpha\right\}\int \mathcal{D}A_j\;\exp\left[-i\int d^4x\left(\frac{1}{4}\,F^{mn}F_{mn} + \frac{\zeta}{2}\left(\partial_{m}A^{m}\right)^2\right)\right] \qquad (3.214)$$
Except for the infinite constant in the curly bracket, we are now working
with an action with an extra gauge fixing term which we introduced earlier
in Eq. (3.198) somewhat arbitrarily.
The above procedure also works in the case of the Coulomb gauge, thereby justifying, in the language of the path integral, the quantization scheme which we originally adopted. While integrating over $\mathcal{D}A_j = \mathcal{D}A^0\,\mathcal{D}\mathbf{A}$ we can impose the ∇ · A = 0 condition by introducing a factor

$$1 = \Delta\int [\mathcal{D}\omega]\;\delta[\nabla\cdot\mathbf{A}^{\omega}], \qquad A^{\omega}_{m} \equiv A_{m} + \partial_{m}\omega \qquad (3.215)$$
where Δ is the determinant, which is independent of $A_i$, into the path integral, thereby making A purely transverse. Going through the same procedure as above, this will lead to a path integral of the form

$$Z[J] = \int \mathcal{D}A^0\,\mathcal{D}\mathbf{A}\;\delta[\nabla\cdot\mathbf{A}]\; e^{iS[A] + i\int d^4x\, J^{m}A_{m}} \qquad (3.216)$$
when the field is coupled to a source. We have already seen in Sect. 3.1.5
that in the Coulomb gauge, the A0 decouples from A in the action and can
be integrated out separately. This will essentially add a phase containing
the Coulomb energy of interaction which can be ignored. The rest of the
integration is over the transverse degree of freedom which will lead to a
term like
$$Z[J] \propto \exp\left[\frac{1}{2}\int d^4x\, d^4y\; J_{\kappa}(x)\,G^{\kappa\lambda}(x,y)\,J_{\lambda}(y)\right] \qquad (3.217)$$
where Gκλ is the transverse propagator coupling the transverse degrees
of freedom of the source. This result ties up with the previous analysis
done in the canonical formalism in Sect. 3.6.1 and relates the transverse
propagator to the interaction between the currents.
3.6.3 Casimir Effect
It is obvious that the success of our program, for quantizing a field and interpreting the states in the Hilbert space in terms of particles, is closely tied
to decomposing the field into an infinite number of harmonic oscillators.
The facts: (i) the energy spectrum of the harmonic oscillator is equally
spaced and (ii) its potential is quadratic, are the features which allow us
to introduce quantum states labeled by a set of integers with the energy of
the state changing by ωk when nk → nk + 1. But if we take these oscillators seriously, we get into trouble with their zero point energy (1/2)ωk
which adds up to infinity. The conventional wisdom — which we have been
faithfully advocating — is to simply throw this away by introducing normal
ordering of operators like the Hamiltonian.
There is however one effect (called the Casimir effect, which is observed
in the lab) for which the most natural explanation is provided in terms of
the zero point energy of the oscillators. This is probably one of the most
intriguing results in field theory and is worth understanding.
Consider two large, plane, perfect conductors of area $L^2$ kept at a distance a apart in otherwise empty space. This is like two capacitor plates with $L \gg a$ (which allows us to ignore the edge effects), but we have not put any charges on them. Experiments show that these two plates exert a force of attraction on each other which varies as $a^{-4}$. You need to explain
^{63} The vanishing boundary condition is expected to mimic what happens in the case of the electromagnetic field on the surfaces of the perfect conductors.
^{64} This can be quite shocking to anyone innocent of the practices in high energy physics. One might have expected that the sum, if not infinite, should at least be decent enough to be a positive integer rather than a negative fraction! Such a conclusion, coming from physicists, would be quite suspect; but it was first obtained by a very respectable mathematician, Euler. In fact, most of our discussion will revolve around attributing meaning to divergent expressions in a systematic manner.
why two perfect conductors kept in the vacuum should exert such a force
on each other!
The simplest explanation is the following: In the absence of the plates
we are in the vacuum state of the electromagnetic field (made of an infinite
number of oscillators) with corresponding mode functions varying in space
as exp(ik·x) with all possible values for k. When you introduce two plates,
you have to impose the perfect conductor boundary condition for the electromagnetic field at the location of the plate. If the plates are located at
z = 0 and z = a, this necessarily limits the z-component of the wave vector
k to be discrete with the allowed values being kz = nπ/a with integral
n. The form of the total zero point energy for the oscillators will now be
different (though, of course, still divergent) from the original form of the
total zero point energy. The difference between the zero point energies
calculated with and without the plates will involve subtracting one infinity
from other. There are several ways of giving meaning to such an operation, thereby extracting a finite result for the difference. This difference
turns out to be negative and scales as (−a−3 ); this leads to an attractive
force because reducing a decreases this energy. Thus, whether zero point
energies exist or not, their differences seem to exist and seem to be finite!
Given the conceptual importance of this result, we will provide a fairly
detailed discussion of its derivation. To understand the key issues which
are involved, we will first consider a toy model of a scalar field in (1+1) dimensions and then study electromagnetic fields in (3+1) dimensions.
Consider the quantum theory of a massless scalar field in (1+1) dimensions when the field is constrained to vanish^{63} at x = 0 and x = L. The allowed wave vectors for the modes are now given by $k_n = (n\pi/L)$. The total zero-point energy of the vacuum in the presence of the plates is therefore given by

$$E = \sum_{n=0}^{\infty}\frac{1}{2}\,\frac{n\pi}{L} = \frac{\pi}{2L}\sum_{n=0}^{\infty} n \qquad (3.218)$$
To give meaning to this expression, we need to evaluate the sum over all positive integers. It turns out that this sum [64] is equal to −(1/12), so that E = −(π/24L). If one imagines two point particles at x = 0 and x = L (which are the analogs of two perfect conductors in the case of electromagnetism), this energy will lead to an attractive force (∂E/∂L) = (π/24L²) between the particles. Let us now ask how we can obtain this
result.
There are rigorous ways of defining sums of certain divergent series
which could be used for this purpose. But before discussing this approach,
let us consider a more physical way of approaching the question. We first
note that the zero-point energy in the absence of plates is given by the
expression
$$E_0 = \int_{-\infty}^{\infty} \frac{L\,dk}{2\pi}\,\frac{1}{2}|k| = \frac{L}{4\pi}\cdot\frac{\pi^2}{L^2}\cdot 2\int_0^{\infty} dn\, n = \frac{\pi}{2L}\int_0^{\infty} dn\, n \qquad (3.219)$$
where we have used the substitution k = (πn/L) with n being a continuous
variable. What we are really interested in is the energy difference given by
$$\Delta E = \frac{\pi}{2L}\left[\sum_{n=0}^{\infty} n - \int_0^{\infty} dn\, n\right] \qquad (3.220)$$
3.6. Quantizing the Electromagnetic Field
To evaluate this difference, let us introduce a high-n cut-off by a function
e−λn into these expressions and take the limit of λ → 0 at the end of the
calculation. That is, we think of the energy difference as given by [65]
$$\Delta E = \frac{\pi}{2L}\,\lim_{\lambda\to 0}\left[\sum_{n=0}^{\infty} n\, e^{-\lambda n} - \int_0^{\infty} dn\; n\, e^{-\lambda n}\right] \qquad (3.221)$$
For finite λ, the energy difference can be evaluated as

$$\Delta E = -\frac{\pi}{2L}\,\frac{d}{d\lambda}\left[\sum_{n=0}^{\infty} e^{-\lambda n} - \frac{1}{\lambda}\right] = -\frac{\pi}{2L}\,\frac{d}{d\lambda}\left[\frac{1}{1-e^{-\lambda}} - \frac{1}{\lambda}\right] \qquad (3.222)$$
We are interested in this expression near λ = 0, which can be computed by a Taylor series expansion. We see that:

$$\frac{1}{1-e^{-\lambda}} - \frac{1}{\lambda} \;\approx\; \frac{1}{\lambda\left(1 - \frac{1}{2}\lambda + \frac{1}{6}\lambda^2 + O(\lambda^3)\right)} - \frac{1}{\lambda} = \frac{1}{\lambda}\left[\left(\frac{1}{2}\lambda - \frac{1}{6}\lambda^2\right) + \left(\frac{1}{2}\lambda - \frac{1}{6}\lambda^2\right)^2 + O(\lambda^3)\right] = \frac{1}{2} + \frac{1}{12}\lambda + O(\lambda^2) \qquad (3.223)$$
It follows, quite remarkably, that:

$$\Delta E = -\frac{\pi}{2L}\left[\frac{1}{12} + O(\lambda)\right] \;\to\; -\frac{\pi}{24L} \qquad (3.224)$$
which is the same result we would have obtained by treating the sum over
all positive integers as (−1/12) as we mentioned before!
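Before examining the dependence on the cut-off, it is instructive to watch Eq. (3.224) emerge numerically. The short Python sketch below (the helper name is ours, not the text's) evaluates the regularized sum minus the exactly known integral, ∫₀^∞ dn n e^−λn = 1/λ², for decreasing λ:

```python
import math

def regularized_difference(lam):
    """Sum_{n>=0} n e^(-lam n)  minus  the exact integral 1/lam^2."""
    total, n = 0.0, 1
    while n * lam < 40.0:            # remaining terms are negligibly small
        total += n * math.exp(-lam * n)
        n += 1
    return total - 1.0 / lam ** 2

for lam in (0.2, 0.1, 0.01):
    print(lam, regularized_difference(lam))   # tends to -1/12 = -0.0833...
```

The deviation from −1/12 is O(λ²), so the limit in Eq. (3.221) is already visible at λ of order 0.01.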
The above procedure of introducing a cut-off function e−λn and taking
the limit of λ → 0 at the end of the calculation has one serious drawback. The (−1/12) we obtained seems to have come from the Taylor series
expansion of e−λ and hence it is not clear whether the same result will hold
if we use some other cut-off function to regularize the sum and the integral.
To settle this question, we need to evaluate expressions like
$$\Delta f \equiv \sum_{n=0}^{\infty} f(n) - \int_0^{\infty} dn\, f(n) \qquad (3.225)$$
where f (x) is a function which dies down rapidly for large values of the
argument. It can be shown that (see Mathematical Supplement Sect. 3.8):
$$\Delta f = \sum_{n=0}^{\infty} f(n) - \int_0^{\infty} dn\, f(n) = -\sum_{k=1}^{\infty} \frac{B_k}{k!}\left[D^{k-1} f\right]_0 \qquad (3.226)$$
where D = d/dx and the B_k (called Bernoulli numbers) are defined through the series expansion

$$\frac{t}{e^t - 1} = \sum_{k=0}^{\infty} B_k\, \frac{t^k}{k!} \qquad (3.227)$$
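Multiplying both sides of Eq. (3.227) by (e^t − 1) and comparing powers of t gives the standard recurrence Σ_{k=0}^{n} C(n+1, k) B_k = 0 for n ≥ 1, with B₀ = 1. A minimal sketch using exact rational arithmetic (helper name ours):

```python
from fractions import Fraction
from math import comb

def bernoulli(kmax):
    """B_0..B_kmax in the convention of Eq. (3.227), where B_1 = -1/2."""
    B = [Fraction(1)]
    for n in range(1, kmax + 1):
        # recurrence: sum_{k=0}^{n} C(n+1, k) B_k = 0, solved for B_n
        s = sum(comb(n + 1, k) * B[k] for k in range(n))
        B.append(Fraction(-1, n + 1) * s)
    return B

B = bernoulli(8)
print(B[1], B[2], B[4], B[6])   # -1/2 1/6 -1/30 1/42
```

These are the values B₂ = 1/6 and B₄ = −1/30 used repeatedly in this section; the odd B_k beyond B₁ come out exactly zero.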
Consider now an arbitrary cut-off function F (λx) instead of exp(−λx) and
let the modified function be f (x) = xF (λx) with the condition that F (0) =
1. Using Eq. (3.226) we can compute Δf with this cut-off function and take
[65] From a physical point of view, we could think of a factor like e^−λn as arising due to the finite conductivity of the metallic plates in the case of (1+3) electrodynamics, which makes conductors less than perfect at sufficiently high frequencies.
Chapter 3. From Fields to Particles
the limit λ → 0. Equation (3.226) requires us to compute the derivatives of f(x) at the origin and take the limit λ → 0. One can easily see that the first derivative evaluated at the origin is given by

$$\left[x F'(\lambda x)\,\lambda + F(\lambda x)\right]_{x=0} = 1 \qquad (3.228)$$
and the second derivative vanishes

$$\left[F'(\lambda x)\,\lambda + x\lambda^2 F''(\lambda x) + \lambda F'(\lambda x)\right]_{x=0} \to 0 \qquad (3.229)$$
when λ → 0. All further derivatives will introduce extra factors of λ and
hence will vanish. We therefore find that

$$\Delta f = -\frac{B_2}{2!}\, f'(0) = -\frac{1}{12} \qquad (3.230)$$
where we have used the fact that B2 = (1/6). This shows that, for a wide
class of cut-off functions, we recover the same result. Knowing this fact,
we could have even taken f (n) = n in Eq. (3.226) and just retained the
first derivative term (which is the only term that contributes) on the right
hand side of Eq. (3.226).
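To see this cut-off independence in action, the sketch below repeats the numerical experiment with the exponential cut-off and with a Gaussian one, F(u) = e^{−u²}; the corresponding integrals ∫₀^∞ dn n F(λn) are elementary (1/λ² and 1/2λ²), and both differences approach the same −1/12 (helper names ours):

```python
import math

def delta_f(cutoff, integral_value, lam):
    """Sum_{n>=0} n F(lam n)  minus  Int_0^inf dn n F(lam n)."""
    total, n = 0.0, 1
    while n * lam < 40.0:            # tail of the sum is negligible
        total += n * cutoff(n * lam)
        n += 1
    return total - integral_value

lam = 0.01
d_exp   = delta_f(lambda u: math.exp(-u),     1.0 / lam ** 2, lam)
d_gauss = delta_f(lambda u: math.exp(-u * u), 0.5 / lam ** 2, lam)
print(d_exp, d_gauss)   # both close to -1/12
```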
Having done this warm-up exercise, let us turn our attention to the
electromagnetic field in (1+3). We consider a region between two parallel
conducting plates, each of area L × L, separated by a distance a. We will assume that L ≫ a and will be interested in computing the force per unit area of the conducting plates by differentiating the corresponding expression for zero-point energy with respect to a. As in the previous case,
we want to compute the zero-point energy in the presence of the plates
and in their absence in the 3-dimensional volume L2 a and compute their
difference.
The energy contained in this region in the absence of plates is given by the integral

$$E_0 = 2\int \frac{L^2\, d^2k}{(2\pi)^2} \int \frac{a\, dk_3}{2\pi}\; \frac{1}{2}\sqrt{k_1^2 + k_2^2 + k_3^2} = \frac{L^2}{(2\pi)^2}\int d^2k \int_0^{\infty} dn\, \sqrt{k_1^2 + k_2^2 + \left(\frac{n\pi}{a}\right)^2} \qquad (3.231)$$
where the overall factor 2 in front in the first line takes into account the two polarizations, and we have set k₃ = (nπ/a) with a continuum variable n to obtain the second line. Writing the transverse component of the wave vector as k⊥² ≡ k₁² + k₂² ≡ (π/a)²μ, we can re-write this expression as an integral over μ and n in the form

$$E_0 = \frac{\pi^2 L^2}{4a^3}\int_0^{\infty} d\mu \int_0^{\infty} dn\, \left(\mu + n^2\right)^{1/2} \qquad (3.232)$$
Both the integrals, of course, are divergent, as is to be expected.
Let us next consider the situation in the presence of the conducting plates. This will require replacing the integral over n by a summation over n when n ≠ 0. When n = 0, the corresponding result has to be multiplied by a factor (1/2) because only one polarization state contributes. This
comes about from the nature of the mode functions for the vector potential
A in the presence of the plates. (See also Problem 4.) We will work in
3.6. Quantizing the Electromagnetic Field
119
the gauge with A⁰ = 0, ∇·A = 0 and decompose the vector potential into parallel and normal components A = A_∥ + A_⊥, where A_⊥ is along the z-axis while A_∥ is parallel to the plates in the x−y plane. If you impose the boundary condition that the parallel component of the electric field and the normal component of the magnetic field should vanish on the plates, we get the conditions [66]

$$\frac{\partial A_\parallel}{\partial t}\bigg|_{z=0} = \frac{\partial A_\parallel}{\partial t}\bigg|_{z=a} = 0\,, \qquad B_z\big|_{z=0} = B_z\big|_{z=a} = 0 \qquad (3.233)$$
[66] Notation: The symbols ∥ and ⊥ do not mean curl-free and div-free parts in this context.
It is straightforward to determine the allowed modes in this case and you will find that there are two polarizations, each contributing the energy

$$\omega_{k,n} = \sqrt{k_1^2 + k_2^2 + \left(\frac{n\pi}{a}\right)^2} \qquad (3.234)$$

when n ≠ 0, and one polarization contributing when n = 0. Therefore, the corresponding expression in the presence of the plates is given by
$$\frac{E_1}{L^2} = \frac{\pi^2}{4a^3}\left[\sum_{n=1}^{\infty}\int_0^{\infty} d\mu\, (\mu + n^2)^{1/2} + \frac{1}{2}\int_0^{\infty} d\mu\, \mu^{1/2}\right] \qquad (3.235)$$
The difference between the energies per unit area is essentially determined by the combination

$$\frac{4a^3}{\pi^2 L^2}\left[E_1 - E_0\right] = \sum_{n=1}^{\infty} f(n) + \frac{1}{2} f(0) - \int_0^{\infty} dn\, f(n) \qquad (3.236)$$

where

$$f(n) = \int_0^{\infty} d\mu\, \sqrt{\mu + n^2} = \int_{n^2}^{\infty} dv\, \sqrt{v} \qquad (3.237)$$
We will now use the result

$$\sum_{n=1}^{\infty} f(n) - \int_0^{\infty} dn\, f(n) + \frac{1}{2} f(0) = -\sum_{m=1}^{\infty} \frac{B_{2m}}{(2m)!}\left[D^{2m-1} f\right]_0 \qquad (3.238)$$
which can be easily obtained from Eq. (3.226) (see Mathematical Supplement Sect. 3.8). We now have to regularize the expression in Eq. (3.236) by
multiplying f (n) by a cut-off function F (λn) and work out the derivatives
at the origin and take the limit λ → 0. We will, however, cheat at this
stage and will work with f (n) in Eq. (3.237) itself, because it leads to the
same result. In this case, it is easy to see that
$$f'(0) = -2n^2\Big|_0 = 0\,; \qquad f''(0) = -4n\Big|_0 = 0\,; \qquad f'''(0) = -4 \qquad (3.239)$$
with all further derivatives vanishing. Hence we get the final result to be

$$\frac{\Delta E}{L^2} = \frac{\pi^2}{4a^3}\,\frac{B_4}{6} = \frac{\pi^2}{24 a^3}\left(\frac{-1}{30}\right) = -\frac{\pi^2}{720\, a^3} \qquad (3.240)$$
where we have used the fact that B₄ = −(1/30). The corresponding force of attraction (per unit area of the plates) is given by

$$f = -\frac{\partial E}{\partial a} = -\frac{\pi^2}{240\, a^4} \qquad (3.241)$$
Exercise 3.21: Do this rigorously by introducing a cut-off function F(λx) — which will ensure convergence — and prove that you get the same result, independent of the choice of F.
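A quick numerical counterpart of this exercise: in the electromagnetic case the regularized combination is controlled by the f‴(0) = −4 term, which is equivalent to assigning Σn³ the value 1/120 (this is ζ(−3); see below). The sketch checks that an exponential cut-off does produce exactly that, using the elementary integral ∫₀^∞ dn n³ e^{−λn} = 6/λ⁴ (helper name ours):

```python
import math

def regularized_cube_sum(lam):
    """Sum_{n>=1} n^3 e^(-lam n)  minus  the exact integral 6/lam^4."""
    total, n = 0.0, 1
    while n * lam < 60.0:            # tail of the sum is negligible
        total += n ** 3 * math.exp(-lam * n)
        n += 1
    return total - 6.0 / lam ** 4

print(regularized_cube_sum(0.05))   # approaches 1/120 = 0.00833... as lam -> 0
```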
[67] So does it mean that zero point energies are non-zero and normal ordering is a nonsensical procedure? While the above derivation provides the most natural explanation for the Casimir effect, it is possible to obtain it by computing the direct Van der Waals-like forces between the conductors. The fact that conductivity has a frequency dependence provides a natural high frequency cut-off. If you accept this alternative derivation (originally due to Lifshitz), then maybe what you need to really understand is why zero point energies with dubious subtractions of infinities also give the same result. Nobody knows for sure.
It is this force (which works out to 10⁻⁸ N for a = 1 μm, L = 1 cm) that has been observed in the lab. [67]
A more formal mathematical procedure for obtaining the Casimir effect
in the case of the (1+1) scalar field or (1+3) electromagnetic field involves
giving meaning to the zeta function
$$\zeta(s) = \sum_{n=1}^{\infty} \frac{1}{n^s} \qquad (3.242)$$
for negative values of s. We have already seen that, in the case of the scalar field in (1+1) dimensions, we needed to evaluate the sum of all positive integers, which is essentially ζ(−1). To see the corresponding result in the case of
the electromagnetic field, consider the integral
$$\int_0^{\infty} d\mu\, (\mu + n^2)^{-\alpha} = \frac{1}{(\alpha - 1)}\, n^{-2(\alpha-1)} \qquad (3.243)$$
which is well defined for sufficiently large α. We can therefore write

$$\int_0^{\infty} d\mu\, \mu^{-\alpha} = \lim_{\Lambda\to 0}\int_0^{\infty} d\mu\, (\mu + \Lambda^2)^{-\alpha} = \lim_{\Lambda\to 0} \frac{1}{(\alpha-1)}\,\Lambda^{-2(\alpha-1)} \qquad (3.244)$$
Putting α = −1/2 in this relation, it can be shown that:

$$\int_0^{\infty} d\mu\, \mu^{1/2} = \lim_{\Lambda\to 0} \frac{1}{(-3/2)}\,\Lambda^{3} = 0\,. \qquad (3.245)$$
[68] In the language of dimensional regularization, a pet trick in high energy physics, this means that several power law divergences can be “regularized” to vanish.
This means we need not worry [68] about the second integral within the square bracket in Eq. (3.235).
It follows that the quantity we needed to evaluate in Eq. (3.235) is given by

$$\sum_{n=1}^{\infty}\int_0^{\infty} d\mu\, \left(\mu + n^2\right)^{-\alpha} = \frac{1}{(\alpha-1)}\sum_{n=1}^{\infty} \frac{1}{n^{2(\alpha-1)}} = \frac{\zeta(2\alpha - 2)}{(\alpha - 1)} \qquad (3.246)$$

in the limit of α → −(1/2). That is, we need to give meaning to ζ(−3). It is actually possible to define ζ(s) for negative integral values of s by analytically continuing a suitable integral representation of ζ(s) in the complex plane to these values. [69] Such an analysis actually shows that ζ(1 − 2k) = −(B_2k/2k), using which we can obtain the same results as above.

[69] In the previous discussions, we interpreted the difference between two divergent expressions as finite. Using dimensional regularization, we are giving meaning to the divergent E₀/L² in Eq. (3.235) directly without any subtraction.

Exercise 3.22: For a massless scalar field in (1+1) dimensions, find the vacuum functionals with and without the vanishing boundary conditions at x = 0, L.
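The continuation ζ(1 − 2k) = −(B_2k/2k) can be checked concretely. One convenient tool (quoted here without proof; it is Hasse's globally convergent representation, not derived in the text) is ζ(s) = (s − 1)⁻¹ Σ_{n≥0} (n + 1)⁻¹ Σ_{k=0}^{n} (−1)^k C(n, k)(k + 1)^{1−s}, whose inner finite differences vanish for large n when s is a negative integer, so the sum terminates and gives exact rationals:

```python
from fractions import Fraction
from math import comb

def zeta_negative(s, nterms=40):
    """Hasse's series for zeta(s); exact for integer s <= 0."""
    acc = Fraction(0)
    for n in range(nterms):
        # n-th finite difference of (k+1)^(1-s); zero once n exceeds 1 - s
        inner = sum((-1) ** k * comb(n, k) * Fraction(k + 1) ** (1 - s)
                    for k in range(n + 1))
        acc += Fraction(1, n + 1) * inner
    return acc / (s - 1)

print(zeta_negative(-1), zeta_negative(-3))   # -1/12 and 1/120
```

Both values match −B₂/2 and −B₄/4, with B₂ = 1/6 and B₄ = −1/30.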
Finally we mention that the Casimir effect has a clearer intuitive explanation in the Schrodinger picture. The ground state wave functional which
we computed for a scalar field in Eq. (3.128) explicitly depended on using
the running modes exp(ik · x) to go back and forth between the field configuration φ(x) and the oscillator variables qk . If you introduce metal plates
into the vacuum, thereby modifying the boundary conditions, the vacuum
functional will change. In other words, the ground state wave functional in
the presence of the metal plates is actually quite different from that in the
absence of the plates. (This should be obvious from the fact that Ψ should
now vanish for any field configuration which does not obey the boundary
conditions.) Since the ground states are different, it should be no surprise
that the energies are different too.
3.6.4 Interaction of Matter and Radiation
The electromagnetic field, as we have mentioned before, is the first example
in which one noticed the existence of quanta which, in turn, arose from
the study of the interaction between matter and radiation. While this is
probably not strictly a quantum field theoretic interaction — we will deal
with a non-relativistic quantum mechanical system coupled to a quantized
electromagnetic field — we will describe this briefly because of its historical
and practical importance.
To do this, we have to study the coupling between (i) quantized electromagnetic fields and (ii) a charged particle (say, an electron in an atom)
described by standard quantum mechanics. A charged particle in quantum
mechanics will have its own dynamical variable x and momentum p obeying standard commutation rules. Depending on the nature of the system,
such a charged particle can exist in different (basis) quantum states, each of which will be labeled by the eigenvalues of a complete set of commuting variables. (For example, the quantum state |nlm⟩ of an electron in a hydrogen atom is usually labeled by three quantum numbers n, l and m.) For the sake of simplicity, let us assume that the quantum states of the charged particle are labeled by the energy eigenvalues as |E⟩; the formalism can be easily generalized when more labels are needed to specify the quantum state. The coupling between the electromagnetic field and the atomic system is described by the Hamiltonian
system is described by the Hamiltonian
$$H_{\rm int} = -\int d^3x\; \mathbf{J}\cdot\mathbf{A} \qquad (3.247)$$
where, to the lowest order [70] in the non-relativistic limit, J = qv ≅ (q/m)p = (q/m)(−i∇). We are interested in the transitions caused between the quantum states of the electromagnetic field and matter due to this interaction.
Let the initial state of the system be |E_i, {n_kα}⟩, where E_i represents the initial energy of the matter state and the set of integers {n_kα} denotes the quantum state of the electromagnetic field. We now turn on the interaction Hamiltonian H_int. Because of the coupling, the system can make a transition to a final state |E_f, {n′_kα}⟩, where E_f represents the final energy of the matter state and a new set of integers {n′_kα} denotes the final quantum state of the electromagnetic field. To the lowest order in perturbation theory, the probability amplitude for this process is governed by the matrix element

$$Q \equiv \langle E_f, \{n'_{k\alpha}\}|\,H_{\rm int}\,|E_i, \{n_{k\alpha}\}\rangle. \qquad (3.248)$$
To see the nature of this matrix element, let us substitute the expansion of the vector potential written in the form

$$\mathbf{A}(t,\mathbf{x}) = \sum_{\alpha=1,2}\int \frac{d^3k}{(2\pi)^3}\left[a_{\mathbf{k}\alpha}\,\mathbf{A}_{\mathbf{k}\alpha} + a^{\dagger}_{\mathbf{k}\alpha}\,\mathbf{A}^{*}_{\mathbf{k}\alpha}\right]\,; \qquad \mathbf{A}_{\mathbf{k}\alpha} = \left(\frac{1}{2\omega_k}\right)^{1/2}\boldsymbol{\epsilon}_{\mathbf{k}\alpha}\, e^{-ikx}. \qquad (3.249)$$
[70] The canonical momentum in the presence of A will have an extra piece (qA/c), which will lead to a term proportional to q²A² in J·A. This quadratic term in qA is ignored in the lowest order.
into the interaction Hamiltonian; then H_int becomes the sum of two terms

$$H_{\rm int} = -\int d^3x\, \mathbf{J}\cdot\mathbf{A} = -\int d^3x\; \mathbf{J}\cdot\sum_{\alpha=1,2}\int \frac{d^3k}{(2\pi)^3}\left[a_{\mathbf{k}\alpha}\mathbf{A}_{\mathbf{k}\alpha} + a^{\dagger}_{\mathbf{k}\alpha}\mathbf{A}^{*}_{\mathbf{k}\alpha}\right] \equiv H_{\rm ab} + H_{\rm em}. \qquad (3.250)$$
Since the creation and annihilation operators can only change the energy eigenstate of an oscillator by one step, it is clear that the probability amplitude Q in Eq. (3.248) will be non-zero only if the sets of integers characterizing the initial and final states differ by unity for some oscillator labeled by kα. In other words, the lowest order transition amplitudes
describe either the emission or the absorption of a single photon with a definite momentum and polarization. Since the creation operator a†kα changes
the integer nkα to (nkα + 1), the term proportional to a†kα governs the
emission; similarly, the term proportional to akα governs the absorption of
the photon. Let us work out the amplitude for emission in some detail.
The emission process, in which the quantum system makes the transition from |E_i⟩ to |E_f⟩ and the electromagnetic field goes from a state |n_kα⟩ to |n_kα + 1⟩ during the time interval (0, T), is governed by the amplitude

$$\mathcal{A} = \int_0^{T} dt\; \langle E_f|\langle n_{k\alpha}+1|\,H_{\rm em}\,|n_{k\alpha}\rangle|E_i\rangle = -\int_0^{T} dt \int d^3x\; \langle E_f|\mathbf{J}\cdot\mathbf{A}^{*}_{\mathbf{k}\alpha}|E_i\rangle\, (n_{k\alpha}+1)^{1/2} \qquad (3.251)$$
where we have used the fact that ⟨n_kα + 1|A|n_kα⟩ = A*_kα (n_kα + 1)^{1/2}.
Using the expansion for A_kα in Eq. (3.249) and the fact that the energy eigenstates have the time dependence exp(−iEt), the amplitude A can be written as

$$\mathcal{A} = -\left(\frac{1}{2\omega_k}\right)^{1/2} (n_{k\alpha}+1)^{1/2} \int_0^{T} dt \int d^3x\; \phi^{*}_f(\mathbf{x})\, \mathbf{J}\cdot\boldsymbol{\epsilon}_{\mathbf{k}\alpha}\, e^{-i\mathbf{k}\cdot\mathbf{x}}\, \phi_i(\mathbf{x})\, e^{-i(E_i - E_f - \omega)t} \qquad (3.252)$$
where φ_i(x), φ_f(x) denote the wave functions of the two states. Denoting the matrix element

$$\int d^3x\; \phi^{*}_f(\mathbf{x})\, \mathbf{J}\, e^{-i\mathbf{k}\cdot\mathbf{x}}\, \phi_i(\mathbf{x}) = \frac{q}{m}\int d^3x\; \phi^{*}_f(\mathbf{x})\, \mathbf{p}\, e^{-i\mathbf{k}\cdot\mathbf{x}}\, \phi_i(\mathbf{x}) \qquad (3.253)$$

(which is determined by the system emitting the photon) by the symbol M_fi, the probability of transition |A|² becomes

$$|\mathcal{A}|^2 = P(T) = \frac{1}{2\omega_k}\, \big|\boldsymbol{\epsilon}_{\mathbf{k}\alpha}\cdot \mathbf{M}_{fi}\big|^2\, (n_{k\alpha}+1)\, |F(T)|^2 \qquad (3.254)$$
where

$$|F(T)|^2 = \left|\int_0^{T} dt\; e^{-i(E_i - E_f - \omega)t}\right|^2 = \left[\frac{\sin(RT/2)}{R/2}\right]^2 \qquad (3.255)$$
with R ≡ (E_i − E_f − ω). In the limit of T → ∞, for any smooth function S(ω), we have the result

$$\int_0^{\infty} d\omega\; S(\omega)\, \frac{\sin^2[(\omega-\nu)T/2]}{[(\omega-\nu)/2]^2} \;\approx\; 2T\, S(\nu)\int_{-\infty}^{\infty} \frac{\sin^2\eta}{\eta^2}\, d\eta = 2\pi T\, S(\nu) \qquad (3.256)$$
which shows that

$$\lim_{T\to\infty} \frac{\sin^2[(\omega-\nu)T/2]}{[(\omega-\nu)/2]^2} \;\to\; 2\pi T\, \delta_D(\omega - \nu) \qquad (3.257)$$

in a formal sense. Hence Eq. (3.254) becomes, as T → ∞,

$$P(T) \;\cong\; T\,\frac{\pi}{\omega_k}\, \big|\boldsymbol{\epsilon}_{\mathbf{k}\alpha}\cdot\mathbf{M}_{fi}\big|^2\, (n_{k\alpha}+1)\, \delta_D(E_i - E_f - \omega). \qquad (3.258)$$
The corresponding rate of transition is P = P(T)/T, which gives a finite rate for the emission of photons: [71]

$$\mathcal{P} \equiv \frac{dP}{dt} = \frac{1}{2\omega_k}\,(n_{k\alpha}+1)\, \big|\boldsymbol{\epsilon}_{\mathbf{k}\alpha}\cdot\mathbf{M}_{fi}\big|^2\; 2\pi\,\delta_D(E_i - E_f - \omega)\,. \qquad (3.259)$$
This expression gives the rate for emission of a photon with a specific wave
vector k and polarization α. The delta function, δD (Ei − Ef − ω) expresses
conservation of energy and shows that the probability is non zero only if
the energy difference between the states Ei − Ef is equal to the energy of
the emitted photon ω.
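The formal limit in Eq. (3.257) can be tested directly: for a moderately large T, integrating the kernel sin²[(ω−ν)T/2]/[(ω−ν)/2]² against a smooth S(ω) already reproduces 2πT S(ν) to about a percent. A midpoint-rule sketch (the profile S and all parameters are ours):

```python
import math

def kernel_integral(S, nu, T, half_width=6.0, steps=200_000):
    """Int dw S(w) sin^2[(w - nu) T / 2] / [(w - nu)/2]^2, midpoint rule."""
    a = nu - half_width
    h = 2.0 * half_width / steps
    total = 0.0
    for i in range(steps):
        w = a + (i + 0.5) * h
        x = (w - nu) / 2.0
        if abs(x) < 1e-12:
            total += S(w) * T * T * h      # limiting value of the kernel
        else:
            total += S(w) * (math.sin(x * T) / x) ** 2 * h
    return total

S = lambda w: math.exp(-(w - 2.0) ** 2)
T = 200.0
val = kernel_integral(S, 2.0, T)
print(val / (2.0 * math.pi * T * S(2.0)))   # close to 1
```

The ratio tends to 1 as T grows, which is the content of the replacement by 2πT δ_D(ω − ν).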
Usually, we will be interested in the probability for emission of a photon
in a frequency range ω, ω + dω and in a direction defined by the solid
angle element dΩ. To obtain this quantity, we have to multiply the rate of
transition by the density of states available for the photon in this range.
The density of states (for unit volume) is given by
$$\frac{dN}{d\omega\, d\Omega} = \frac{d^3k}{(2\pi)^3\, d\omega\, d\Omega} = \frac{1}{(2\pi)^3}\,\frac{k^2\, dk\, d\Omega}{d\omega\, d\Omega} = \frac{k^2}{(2\pi)^3} = \frac{\omega_k^2}{(2\pi)^3}\,. \qquad (3.260)$$

Hence

$$\left(\frac{dP}{dt\, d\omega\, d\Omega}\right)_{\rm emi} = \frac{dP}{dt}\,\frac{dN}{d\omega\, d\Omega} = \frac{dP}{dt}\,\frac{\omega_k^2}{(2\pi)^3} = \frac{\omega_k^2}{(2\pi)^3}\,\frac{1}{2\omega_k}\,(n_{k\alpha}+1)\,\big|\boldsymbol{\epsilon}_{\mathbf{k}\alpha}\cdot\mathbf{M}_{fi}\big|^2\, 2\pi\,\delta_D(\omega_k - \omega_{fi}) \;\propto\; \omega_k\,(n_{k\alpha}+1)\,\big|\boldsymbol{\epsilon}_{\mathbf{k}\alpha}\cdot\mathbf{M}_{fi}\big|^2\,\delta_D(\omega_k - \omega_{fi}) \qquad (3.261)$$
with ω_fi ≡ E_i − E_f.
The analysis for the absorption rate of photons is identical except that only the annihilation operator a_kα contributes. Since ⟨n_kα − 1|a_kα|n_kα⟩ = n_kα^{1/2}, we get n_kα rather than (n_kα + 1) in the final result:

$$\left(\frac{dP}{d\Omega\, dt\, d\omega}\right)_{\rm abs} \;\propto\; \omega_k\, n_{k\alpha}\, \big|\boldsymbol{\epsilon}_{\mathbf{k}\alpha}\cdot\mathbf{M}_{fi}\big|^2\, \delta_D(\omega_k - \omega_{fi})\,. \qquad (3.262)$$
The probabilities for absorption and emission differ only in their dependence on nkα . The probability for absorption scales in proportion to
[71] The integration over the infinite range of t implies, in practice, an integration over a range (0, T) with ωT ≫ 1. If the energy levels have a characteristic width Δω ≪ ω, then the above analysis is valid for ω⁻¹ ≪ T ≪ (Δω)⁻¹.
[72] An intuitive way of understanding stimulated emission is as follows. Consider an atom making a transition from the ground state |G⟩ to an excited state |E⟩, absorbing a single photon out of the n photons present in the initial state, leaving behind an (n − 1) photon state. The fact that this absorption probability P{|G; n⟩ → |E; n − 1⟩} ∝ n ≡ Qn is proportional to n seems intuitively acceptable. Consider now the probability P′ for the time reversed process |E; n − 1⟩ → |G; n⟩. By the principle of microscopic reversibility, we expect P′ = P, giving P′ ∝ n ≡ Qn. Calling n − 1 = m, we get P′{|E; m⟩ → |G; m + 1⟩} = Qn = Q(m + 1). Clearly P′ is non-zero even for m = 0, with P′{|E; 0⟩ → |G; 1⟩} = Q, which gives the probability for the spontaneous emission, while Qm gives the probability for the stimulated emission. Thus, the fact that absorption probabilities are proportional to n while emission probabilities are proportional to (n + 1) originates from the principle of microscopic reversibility.
[73] Make sure you understand this point. The structure of the Klein-Gordon equation for a massless particle, □φ = 0, is identical to Maxwell's equations □A_i = 0 in a particular gauge. The solutions for φ involve positive and negative energy modes and so do the solutions for A_i. Sometimes a lot of fuss is made over negative energy solutions, backward propagation in time, etc. for the □φ = 0 equation, while you have always been dealing quite comfortably with the □A_i = 0 equation, all your life. When you quantize the systems, they have the same conceptual structure.
nkα . Clearly, if nkα = 0, this probability vanishes; this is obvious since
no photons can be absorbed if there were none in the initial state to begin
with. But the probability for emission is proportional to (nkα +1) and does
not vanish even when n_kα = 0. Hence there is a non-zero probability for a system in an excited state to emit a photon and come down to a lower state
spontaneously. If the initial state of the electromagnetic field has a certain
number of photons already present, then the probability for emission is
further enhanced. The emission of a photon by an excited system when
no photons were originally present is called spontaneous emission and the
emission of a photon in the presence of initial photons is called stimulated
emission. Both these processes exist and contribute in electromagnetic
transitions. [72]
Your familiarity with this result should not prevent you from appreciating it. Spontaneous emission is a conceptually non-trivial quantum
field theoretic process and arises directly through the action of the “antiphoton” creating term in Eq. (3.172). The electron in the excited state in
the atom, say, can be described perfectly well by the Schrodinger equation
— until, of course, it pops down to the ground state emitting a photon.
We originally had just one particle, the electron (in an external Coulomb
field) to deal with, and the Schrodinger equation is adequate. But once it
creates a photon, we have to deal with at least two particles of which one
is massless and fully relativistic. So, this elementary process of emission
of the (anti)photon by an excited atom has the key conceptual ingredient
which we started out with, while explaining the need for quantum field
theory. Without a quantized description of the electromagnetic field, we
cannot account for an elementary process like the emission of a photon by
an excited system in a consistent manner. This can also be seen from the
fact that the initial state (of an electron in the excited atomic level) has no
photons, and yet a photon appears in the final state. People loosely talk of
this as “vacuum fluctuations of the electromagnetic field interacting with
the electron”; but the way we have developed the arguments, it should be
clear that this is a field theoretic effect [73] arising from the fact that negative energy solutions of the field equations for A_i are included in Eq. (3.172).
Finally, let us see how these results are related to the Planck spectrum
of photons in a radiation cavity. If we consider quantized electromagnetic
radiation in equilibrium with matter at temperature T , then in steady
state we will expect the condition Nup Rem = Ndown Rab to hold where
Nup and Ndown represent the two levels which we designate as up and
down and Rem and Rab are the rate of emission and absorption which we
have computed above. In thermal equilibrium, the population of atoms in
the two energy levels will satisfy the condition Nup = Ndown exp(−βE) =
Ndown exp(−βωk ) where E > 0 is the difference in the energies of the two
states and ωk is the frequency of the photon corresponding to this energy
difference. On the other hand, we see from our expressions for Rem and Rab
that their ratio is given by Rem /Rab = [(nkα + 1)/nkα ]. The equilibrium
condition now requires
$$e^{\beta E} = \frac{N_{\rm down}}{N_{\rm up}} = \frac{R_{\rm em}}{R_{\rm ab}} = \frac{n_{k\alpha}+1}{n_{k\alpha}} \qquad (3.263)$$
which tells you the number density of photons in a cavity, say, if the radiation is in equilibrium with matter atoms in the cavity held at temperature
T . We get the historically important Planck law for radiation in the cavity:
$$n_{k\alpha} = \frac{1}{e^{\beta\omega_k} - 1} \qquad (3.264)$$
The role played by the factors (n + 1) and n in leading to this result is obvious. That, in turn, is due to the fact that the electromagnetic field can be decomposed into a bunch of harmonic oscillators with the usual creation and annihilation operators.
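The same conclusion can be reached dynamically: hold the atomic populations at their Boltzmann values and let the photon occupation evolve under emission ∝ (n + 1) and absorption ∝ n; the steady state is the Planck value of Eq. (3.264). A simple Euler-integration sketch (rates, step sizes and names are ours):

```python
import math

def relax_to_equilibrium(beta_omega, steps=20_000, dt=1e-3):
    """n(t) driven by dn/dt = N_up (n + 1) - N_down n, Boltzmann populations."""
    n_down = 1.0
    n_up = n_down * math.exp(-beta_omega)   # thermal two-level populations
    n = 0.0                                 # start from the photon vacuum
    for _ in range(steps):
        n += dt * (n_up * (n + 1.0) - n_down * n)
    return n

bw = 1.0
print(relax_to_equilibrium(bw), 1.0 / (math.exp(bw) - 1.0))   # both ~ 0.582
```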
3.7 Aside: Analytical Structure of the Propagator
This discussion essentially brings us to the end of Chapter 3, the main
purpose of which was to interpret particles as excitations of an underlying
quantum field. We illustrated these ideas using real and complex scalar
fields and the electromagnetic field. But in all the cases we worked with
an action which is quadratic in the field variables thereby describing free,
noninteracting fields. We also found that in each of these cases one can
obtain an expression for the propagator in terms of the vacuum expectation
value of the time-ordered fields.
When we switch on interactions between the fields, the structure of the propagators will change. As we shall see later, most of the physical processes we are interested in (e.g., the scattering of particles) can be interpreted in terms of the propagators of the interacting theory. Unfortunately,
there is no simple way of computing these propagators when the interactions are switched on, and much of the effort in quantum theory is directed
towards developing techniques for their computation. We shall take up
many of these issues in the next two chapters. The purpose of this last,
brief section is to introduce some exact results related to propagators which
remain valid even in an interacting field theory.
It is possible to obtain some general results regarding the analytic structure of the propagators even in the exact theory, which will be useful later
on while discussing QED. For the sake of simplicity, we shall discuss this
aspect in the context of scalar field theory. The generalization to other
fields is straightforward.
Let us write down the propagator G(x, y) = ⟨0|T[φ(x)φ(y)]|0⟩ for x⁰ > y⁰ by inserting a complete set of states in between:

$$\langle 0|T[\phi(x)\phi(y)]|0\rangle = \langle 0|\phi(x)\phi(y)|0\rangle = \sum_n \langle 0|\phi(x)|n\rangle\langle n|\phi(y)|0\rangle \qquad (3.265)$$
where |n⟩ is an eigenstate of the full Hamiltonian. Since the total momentum commutes with the Hamiltonian, we can label any such state with a momentum p and an energy E, thereby defining [74] a quantity m_λ by the relation E² − p² ≡ m_λ². The λ here merely labels such states since there could be several of them. We can therefore expand the complete set of states in terms of the momentum eigenstates by:

$$\sum_n |n\rangle\langle n| = |0\rangle\langle 0| + \sum_{\lambda}\int \frac{d^3p}{(2\pi)^3}\,\frac{1}{2E_p(\lambda)}\, |p\rangle\langle p| \qquad (3.266)$$

where E_p²(λ) = p² + m_λ². Plugging this into Eq. (3.265) and using the translation invariance and the scalar nature of the field, we can obtain

[74] Note that m_λ is not supposed to be the mass of any “fundamental” particle and could represent a bound state or a composite unbound state of many particles.
Exercise 3.23: Prove this result.
$$\langle 0|\phi(x)|p\rangle_{\lambda} = \langle 0|\phi(0)|p=0\rangle_{\lambda}\; e^{-ip\cdot x} \qquad (3.267)$$
This, in turn, allows us to write the propagator for x⁰ > y⁰ in the form

$$\langle 0|\phi(x)\phi(y)|0\rangle = \sum_{\lambda}\int \frac{d^3p}{(2\pi)^3}\,\frac{1}{2E_p(\lambda)}\; e^{-ip(x-y)}\, \big|\langle 0|\phi(0)|p=0\rangle_{\lambda}\big|^2 \qquad (3.268)$$
Or, equivalently, as

$$\langle 0|\phi(x)\phi(y)|0\rangle = \sum_{\lambda}\int \frac{d^4p}{(2\pi)^4}\, \frac{i\, e^{-ip\cdot(x-y)}}{p^2 - m_\lambda^2 + i\epsilon}\; \big|\langle 0|\phi(0)|p=0\rangle_{\lambda}\big|^2 \qquad (3.269)$$
We have introduced a p⁰ integral as usual, which ensures that the correct pole will be picked up through the iε prescription. Here, this is merely a
mathematical trick and the propagator did not arise from any fundamental
field theory. The advantage of this form is that it trivially works for the
x0 < y 0 case as well, allowing us to replace φ(x)φ(y) by the time-ordered
product T [φ(x)φ(y)] in the left hand side. If we define the Fourier transform
of the propagator in the usual manner, we find that the propagator in the
Fourier space is given by
$$G(p) = \sum_{\lambda} \frac{i}{p^2 - m_\lambda^2 + i\epsilon}\, \big|\langle 0|\phi(0)|p=0\rangle_{\lambda}\big|^2 = \int_0^{\infty} \frac{ds}{2\pi}\; \frac{\rho(s)\, i}{p^2 - s + i\epsilon} \qquad (3.270)$$
where we have defined a function (called the spectral density of the theory)
by
$$\rho(s) \equiv \sum_{\lambda} (2\pi)\,\delta(s - m_\lambda^2)\, \big|\langle 0|\phi(0)|p=0\rangle_{\lambda}\big|^2 \qquad (3.271)$$
[75] Typically, they arise either through a square root or a logarithmic factor and invariably characterize the contribution of multi-particle states to the propagator in the theory.
In Eq. (3.270), the exact propagator is re-expressed in an integral representation in terms of the spectral density of the theory ρ(s). We see from
the definition of ρ(s) that it picks up contributions from the various states
of the theory created by the action of the field on the vacuum. These states,
depending on the mass threshold, will lead to non-analytic behaviour or
branch-cuts in the propagators. [75] Conversely, if we can write the propagator G(p) of the theory — at any order of approximation — in the form of Eq. (3.270), then we can read off the spectral density ρ(s). The branch-cuts will now tell us about the properties of the multi-particle states in the
theory. This is a useful strategy in some perturbative computations and
we will have occasion to use this later on.
One curious application of this result is in determining the behaviour
of two-point functions for large values of the momentum. This behaviour
is important as regards divergences in perturbation theory which we will
discuss in the next two chapters. These divergences arise because, while
computing certain amplitudes using Feynman diagrams, we will integrate
over the two-point functions in the momentum space. These integrals use
the measure d4 p ∝ p3 dp, while the propagators decrease only as p−2 for
large p, thereby leading to divergences. You might wonder whether these
divergences arise because our description is incorrect at sufficiently high energies. Can we modify the theory in some simple manner to make the
integrals convergent? This could very well be possible, but not in any simple-minded way. Let us illustrate this fact by a specific example.
The simplest hope is that we can make the divergences go away by
modifying the theory at large scales to make the propagators die down
faster than p⁻². As an example, consider a Lagrangian for a scalar field in which the kinetic energy term is modified to the form

$$L = -\frac{1}{2}\,\phi\left[\Box + c\,\frac{\Box^2}{\Lambda^2} + m^2\right]\phi + L_{\rm int}(\phi) \qquad (3.272)$$
which would lead to a propagator in the momentum space which behaves as:

$$\Pi(p^2) = \frac{1}{p^2 - m^2 - c\,(p^4/\Lambda^2)} \qquad (3.273)$$
In this case, the propagator decays as p−4 for large p and one would think
that the loop integrals would be better behaved. More generally, one could
think of modifying Π(p2 ) such that it decreases sufficiently fast as p2 → ∞.
Under Euclidean rotation, p2 → −p2E and we would like Π(−p2E ) to go to
zero faster than (1/p2E ).
We can actually show that this is not possible in a sensible [76] quantum theory. To do this, we go back to the spectral decomposition of the two-point function given by Eq. (3.270), which — as we emphasized — holds for an interacting field theory as well:

$$\langle 0|T[\phi(x)\phi(y)]|0\rangle = \int \frac{d^4p}{(2\pi)^4}\; e^{ip\cdot(x-y)}\; i\,\Pi(p^2) \qquad (3.274)$$

[76] What constitutes a ‘sensible’ theory in this context is, unfortunately, a matter of opinion, given our ignorance about physics at sufficiently high energies. Roughly speaking, one would like to have a unitary, local, Lorentz invariant field theory.
where

$$\Pi(p^2) \equiv \int_0^{\infty} dq^2\; \frac{\rho(q^2)}{p^2 - q^2 + i\epsilon} \;\to\; \int_0^{\infty} dq^2\; \frac{\rho(q^2)}{p_E^2 + q^2} = \Pi(-p_E^2) \qquad (3.275)$$
with the second result being valid in the Euclidean sector. From this, it
follows that
$$\Pi(-p_E^2) = \int_0^{\infty} dq^2\; \frac{\rho(q^2)}{p_E^2 + q^2} \;\geq\; \int_0^{q_0^2} dq^2\; \frac{\rho(q^2)}{p_E^2 + q_0^2} \qquad (3.276)$$
for any q₀². If we now take the limit p_E² → ∞, we will eventually have p_E² > q₀², leading to the result

$$\lim_{p_E^2\to\infty} p_E^2\, \Pi(-p_E^2) \;\geq\; \lim_{p_E^2\to\infty} p_E^2 \int_0^{q_0^2} dq^2\; \frac{\rho(q^2)}{2p_E^2} = \frac{A}{2} \qquad (3.277)$$

where

$$A = \int_0^{q_0^2} \rho(q^2)\, dq^2 \qquad (3.278)$$
defines some finite positive number. A propagator in Eq. (3.273) will violate
this bound for any c and A at large enough p_E². Note that the positivity of ρ(q²) — which follows from its definition — is crucial for this result. We therefore conclude that propagators cannot decrease [77] faster than (1/p²) for large p². Thus, simple modifications of the interactions, maintaining
unitarity, Lorentz invariance and a local Lagrangian description will not
help you to avoid the loop divergences.
[77] This is a rather powerful result obtained from very little investment!
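The obstruction can be made explicit for the propagator of Eq. (3.273). Decomposing it into partial fractions in s = p² gives two poles whose residues must sum to zero to produce the faster-than-1/p² falloff, so one pole (a ghost) necessarily carries negative spectral weight, violating ρ(q²) ≥ 0. A numerical sketch with illustrative parameters:

```python
import math

def pole_residues(m2, c, lam2):
    """Poles and residues of 1 / (s - m2 - c s^2 / lam2), with s = p^2."""
    disc = math.sqrt(1.0 - 4.0 * c * m2 / lam2)
    s1 = lam2 * (1.0 - disc) / (2.0 * c)    # light pole, near m2
    s2 = lam2 * (1.0 + disc) / (2.0 * c)    # heavy pole, near lam2/c
    r1 = -lam2 / c / (s1 - s2)              # residue at the light pole
    r2 = -lam2 / c / (s2 - s1)              # residue at the heavy pole
    return s1, r1, s2, r2

s1, r1, s2, r2 = pole_residues(m2=1.0, c=1.0, lam2=100.0)
print(r1, r2)    # r1 > 0 but r2 < 0: a ghost with negative spectral weight
```

The residues cancel in pairs, which is exactly what makes this propagator fall off as p⁻⁴ and what Eq. (3.277) forbids for a positive ρ.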
3.8 Mathematical Supplement

3.8.1 Summation of Series
We will provide a proof of some of the relations used in Sect. 3.6.3 here.
Let D = d/dx be the derivative operator, using which we can write the Taylor series expansion for a function f(x) as f(x + n) = e^{nD} f(x), where n is an integer. Summing both sides over n = 0 to n = N − 1, we get
$$\sum_{n=0}^{N-1} f(x+n) = \sum_{n=0}^{N-1} e^{nD} f(x) = \frac{e^{ND}-1}{e^D-1}\, f(x) = \frac{1}{(e^D-1)}\,\big[f(x+N)-f(x)\big] \tag{3.279}$$
On the right hand side, we introduce a factor 1 = D⁻¹D and use the Taylor series expansion of the function
$$\frac{t}{e^t-1} = \sum_{k=0}^{\infty} B_k\, \frac{t^k}{k!} \tag{3.280}$$
to obtain the result
$$\frac{D}{(e^D - 1)}\, D^{-1}\,\big[f(x+N)-f(x)\big] = \sum_{k=0}^{\infty} \frac{B_k}{k!}\, D^{k-1}\,\big[f(x+N)-f(x)\big] \tag{3.281}$$
In the sum on the right hand side, we will separate out the k = 0 term
from the others and simplify it as follows:
$$D^{-1}\,\big(f(x+N)-f(x)\big) = D^{-1} D \int_0^N f(x+t)\, dt = \int_0^N f(x+t)\, dt \tag{3.282}$$
where, in obtaining the last result, we have used the fact that D⁻¹D f = f.
Putting all these together, we get:
$$\sum_{n=0}^{N-1} f(x+n) - \int_0^{N} f(x+t)\, dt = \sum_{k=1}^{\infty} \frac{B_k}{k!}\, D^{k-1}\,\big[f(x+N)-f(x)\big] \tag{3.283}$$
We will now take the limit of N → ∞ assuming that the function and all
its derivatives vanish for large arguments, thereby obtaining
$$\sum_{n=0}^{\infty} f(x+n) - \int_0^{\infty} f(x+t)\, dt = -\sum_{k=1}^{\infty} \frac{B_k}{k!}\, D^{k-1} f(x) \tag{3.284}$$
If we now choose x = 0, we obtain Eq. (3.226) quoted in the text:
$$\Delta f = \sum_{n=0}^{\infty} f(n) - \int_0^{\infty} dn\, f(n) = -\sum_{k=1}^{\infty} \frac{B_k}{k!}\, D^{k-1} f \tag{3.285}$$
This result can be rewritten in a slightly different form which is often
useful. To obtain this, consider the Taylor series expansion of the left hand
side of Eq. (3.280) separating out the first two terms:
$$\frac{t}{e^t-1} = B_0 + B_1 t + \sum_{k=2}^{\infty} B_k\, \frac{t^k}{k!} = 1 - \frac{t}{2} + \sum_{k=2}^{\infty} B_k\, \frac{t^k}{k!} \tag{3.286}$$
Since the function

$$\frac{t}{e^t-1} + \frac{t}{2} = \frac{t}{2}\, \frac{e^t+1}{e^t-1} \tag{3.287}$$

is even in t, the summation on the right hand side of Eq. (3.286) from k = 2 to ∞ has only even powers of t. In other words, all odd Bernoulli numbers other than B₁ vanish.⁷⁸ Therefore, one could equally well write the series expansion for the Bernoulli numbers as
$$\frac{t}{e^t-1} = 1 - \frac{t}{2} + \sum_{m=1}^{\infty} \frac{B_{2m}}{(2m)!}\, t^{2m} \tag{3.288}$$

⁷⁸ This itself is an interesting result. In fact, when you have some time to spare, you should look up the properties of Bernoulli numbers. It is fun.
which allows us to write Eq. (3.285) in the form
$$\sum_{n=0}^{\infty} f(n) - \int_0^{\infty} f(n)\, dn = -\sum_{k=1}^{\infty} \frac{B_k}{k!}\, D^{k-1} f\,\Big|_0 = \frac{1}{2}\, f(0) - \sum_{m=1}^{\infty} \frac{B_{2m}}{(2m)!}\, D^{2m-1} f\,\Big|_0 \tag{3.289}$$
Separating out the n = 0 term in the summation in the left hand side, this
result can be expressed in another useful form:
$$\sum_{n=1}^{\infty} f(n) - \int_0^{\infty} dn\, f(n) + \frac{1}{2}\, f(0) = -\sum_{m=1}^{\infty} \frac{B_{2m}}{(2m)!}\, D^{2m-1} f\,\Big|_0 \tag{3.290}$$
This was used in the study of the Casimir effect in (1+3) dimensions for
electrodynamics.
3.8.2 Analytic Continuation of the Zeta Function
We will next derive an integral representation for the zeta function ζ(s)
which allows analytic continuation of this function for negative values of s.
To do this, we begin with the integral representation for Γ(s) and express
it in the form
$$\Gamma(s) = \int_0^\infty d\bar{t}\;\; \bar{t}^{\,s-1}\, e^{-\bar{t}} = n^s \int_0^\infty dt\;\; t^{s-1}\, e^{-nt} \tag{3.291}$$

where we have set t̄ = nt to arrive at the last expression. Summing the expression for Γ(s)/n^s over all n, we get the result:
$$\Gamma(s) \sum_{n=1}^{\infty} \frac{1}{n^s} = \int_0^\infty dt\;\; \frac{t^{s-1}}{e^t - 1} = \Gamma(s)\, \zeta(s) \tag{3.292}$$
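Eq. (3.292) is easy to verify numerically. The stdlib-only sketch below checks the case s = 2, where Γ(2)ζ(2) = π²/6, using a plain composite Simpson rule (the quadrature scheme is our choice, not the book's).

```python
import math

# Numerical check of Eq. (3.292) for s = 2: the integral of t^{s-1}/(e^t - 1)
# from 0 to infinity should equal Gamma(2) * zeta(2) = pi^2/6.

def integrand(t, s=2):
    return t ** (s - 1) / math.expm1(t)      # expm1 keeps accuracy near t = 0

def simpson(f, a, b, n=2000):
    """Composite Simpson rule with n (even) subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0

# For s = 2 the integrand tends to 1 as t -> 0, so a tiny positive lower limit
# is safe, and the e^{-t} tail makes t = 50 an effectively infinite upper limit.
value = simpson(integrand, 1e-9, 50.0)
print(value, math.pi ** 2 / 6)               # both ~ 1.6449340...
```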
Let us now consider the integral

$$I(s) \equiv \int_C dz\;\; \frac{z^{s-1}}{e^z - 1} \tag{3.293}$$
in the complex plane, over a contour (see Fig. 3.6) consisting of the following paths: (i) along the real line from x = ∞ to x = ε, where ε is an infinitesimal quantity; (ii) on a circle of radius ε around the origin, going from θ = 0 to θ = 2π; (iii) along the real line from ε to ∞.

[Figure 3.6: Contour to evaluate Eq. (3.293).]

Along the contour in (i) we get the contribution −Γ(s)ζ(s); it can be easily shown that the contribution along the circle will only pick up the residue at the origin. Finally, along (iii) one obtains the contribution e^{2πis} Γ(s)ζ(s). We therefore find that
$$I(s) = \int_C dz\;\; \frac{z^{s-1}}{e^z - 1} = \Gamma(s)\,\zeta(s)\,\big[e^{2\pi i s} - 1\big] = \Gamma(s)\,\zeta(s)\, e^{i\pi s}\, \big(2i \sin(\pi s)\big) = \zeta(s)\, e^{i\pi s}\, \frac{2\pi i}{\Gamma(1-s)} \tag{3.294}$$
In arriving at the last expression, we have used the standard identity:
$$\Gamma(s)\, \Gamma(1-s) = \frac{\pi}{\sin \pi s} \tag{3.295}$$
This allows us to express ζ(s) as a contour integral in the complex plane
given by
$$\zeta(s) = e^{-i\pi s}\; \frac{\Gamma(1-s)}{2\pi i} \int_C dz\;\; \frac{z^{s-1}}{e^z - 1} \tag{3.296}$$
Exercise 3.24: Prove Eq. (3.296).
By studying the analytical properties of the right hand side, it is easy to show that this expression remains well defined for negative integral values of s. For example, when s = −1, the integrand in the contour integral can be expressed as (1/z³)[z/(e^z − 1)]. The residue at the origin is therefore governed by the z² term in the power series expansion of [z/(e^z − 1)], giving a contribution proportional to B₂. Similarly, for s = −3 the integrand in the contour integral can be expressed as (1/z⁵)[z/(e^z − 1)]. The residue at the origin is therefore governed by the z⁴ term in the power series expansion of [z/(e^z − 1)], giving a contribution proportional to B₄. In fact, it is easy to show, putting all the factors together, that:
$$\zeta(1-2k) = -\frac{B_{2k}}{2k} \qquad \text{for } k = 1, 2, \ldots \tag{3.297}$$

This gives ζ(−1) = −B₂/2 = −1/12, ζ(−3) = −B₄/4 = 1/120, etc. These results provide an alternate way of giving meaning to the divergent series which occur in the computation of the Casimir effect.
Chapter 4
Real Life I: Interactions
4.1 Interacting Fields
We have seen in the previous sections that a useful way of combining the
principles of relativity and quantum theory is to introduce the concept of
fields and describe particles as their quanta. So far, we have discussed the
quantization of each of the fields separately and have also assumed that,
when treated as individual fields, the Lagrangians are at most quadratic in
the field variables. The study of such a real scalar field based on the action
in Eq. (3.3) led to non-interacting, spinless, massive, relativistic particles
obeying the dispersion relation ω_p² = p² + m². Similarly, the study of the
action in Eq. (3.41) led to spinless particles with mass m and charge q. This
field will also be non-interacting unless we couple it to an electromagnetic
vector potential A_j. Finally, we studied the free electromagnetic field based on Eq. (3.63), which led to massless spin-1 photons¹ which were also non-interacting unless coupled to charged matter.
The real world, however, consists of relativistic particles which are interacting with each other. If these particles arise as the quanta of free
fields, their interactions have to be modeled by coupling the fields to one
another. One prototype example could be a complex scalar field coupled
to the electromagnetic field interacting through the Lagrangian:
$$L = (D_m\phi)^*\, D^m\phi - m^2\, \phi^*\phi - \frac{1}{4}\, F^{ik} F_{ik} \tag{4.1}$$

¹ We haven't actually discussed the spin of any of these fields rigorously and, for the moment, you can take it on faith that scalar fields are spin zero and the photon has spin 1. To do this properly will require the study of the Lorentz group which we will take up in Chapter 5.
which is essentially Eq. (3.58) with the electromagnetic Lagrangian in
Eq. (3.63) added to it. When expanded, this will read as:

$$L = \partial_m\phi\,\partial^m\phi^* - m^2\,\phi^*\phi - \frac{1}{4}\,F^{ik}F_{ik} + iqA_m\,(\phi\,\partial^m\phi^* - \phi^*\,\partial^m\phi) + q^2\,|\phi|^2\, A_m A^m \tag{4.2}$$
The first line of the above equation gives the sum of the two free Lagrangians for the complex scalar field and the electromagnetic field. The second line gives the coupling between the two fields, with the strength of the coupling determined by the constant q, which is dimensionless in four dimensions. This term makes the charged particles, which are the quanta of the complex scalar field, interact through the electromagnetic field.
It is also possible to have a field which interacts with itself. For example,
a Lagrangian of the form
L=
2 This Lagrangian has the form L =
(1/2)(∂φ)2 − V (φ) with a potential
V (φ) = (1/2)m2 φ2 + (λ/4!)φ4 . If you
are studying classical or quantum mechanics, you would have treated the
entire V as an external potential. But
in quantum field theory we think of the
(1/2)m2 φ2 part (that is, the quadratic
term) as a part of the “free” field (as
long as m2 > 0) and the φ4 as selfinteraction. This is because the m2 φ2
leads to a linear term in the equations
of motion and in the Fourier space,
each mode evolves independently. The
m2 φ2 term also contributes to the correct description of the quanta of the
free field, with the m2 part of the dis2 = p2 + m2 coming
persion relation ωp
from this term.
λ
1
∂m φ∂ m φ − m2 φ2 − φ4
2
4!
(4.3)
describes a scalar field with a self-interaction2 indicated by the last term
proportional to φ4 . This term is again characterized by a dimensionless
coupling constant λ (in four dimensions) and will allow, for example, the
quanta of the scalar field to scatter off each other through this interaction.
Real quantum field theory involves studying such interactions and making useful predictions. Unfortunately, there are very few realistic interactions for which one can solve the Heisenberg operator equations exactly. So the art of making predictions in quantum field theory reduces to the art of finding sensible, controlled approximation schemes to deal with such interactions.
These schemes can be broadly divided into those which are perturbative
and those which are non-perturbative. The perturbative approach involves
treating the interactions (characterized by the coupling constants q or λ
in the above cases) as "small" and studying the physical processes order-by-order in the coupling constants. In some situations, one could use a
somewhat different perturbation scheme but all of them, by definition,
involve having a series expansion in a suitable small parameter. We will
have a lot to say about this approach in the second half of this chapter and
in the next chapter.
The non-perturbative approach, on the other hand, involves using some
other approximation to capture certain aspects of the physical situation
without relying on the smallness of the coupling constant. Obviously, this
will be a context dependent scheme and is somewhat less systematic than
the perturbative one. But — when it can be done — it provides important
insights that could complement the perturbative approach. One possible
way of obtaining non-perturbative results is by using a technique called the
effective action. The first half of this chapter will be devoted to describing
and illustrating this non-perturbative technique.
4.1.1 The Paradigm of the Effective Field Theory
To understand what is involved, let us introduce the Fourier transform φ(k)
of φ(x) with each mode labeled by a given value of k. Roughly speaking,
these wave vectors k^j describe the energy and momentum associated with
the modes, with the larger values of k corresponding to higher energies. In
the case of a quadratic interaction (corresponding to V(φ) ∝ φ²) the field equation will be linear in φ; then different modes do not talk to each other and we say that it is a free field. If V(φ) has terms higher than quadratic, then the modes for different k get coupled. For example, if V(φ) ∝ φⁿ, then you can easily convince yourself that the interaction term in the Lagrangian will involve a product of n factors φ(k₁)φ(k₂)⋯φ(kₙ) in Fourier space.
So, in an interacting field theory, physics at high energies is coupled to
physics at low energies.
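The mode-coupling statement above can be made concrete on a periodic lattice. The stdlib-only sketch below checks the n = 3 case: summed over the lattice, φ(x)³ picks out exactly the mode triples with k₁ + k₂ + k₃ = 0 (mod N). The particular field configuration (two modes with arbitrary complex amplitudes) is a toy choice made only for illustration.

```python
import cmath

# A minimal lattice illustration (n = 3) of how a phi^n term couples Fourier
# modes: on a periodic lattice of N sites, sum_x phi(x)^3 equals
# N * sum over mode triples with k1 + k2 + k3 = 0 (mod N).
N = 8
phi_k = {1: 0.7 + 0.2j, 2: -0.4 + 0.5j}     # toy amplitudes, for illustration
modes = dict(phi_k)
for k, a in phi_k.items():                   # add conjugate modes so phi(x) is real
    modes[(N - k) % N] = a.conjugate()

def phi(j):
    return sum(a * cmath.exp(2j * cmath.pi * k * j / N) for k, a in modes.items()).real

# Position space: the interaction term summed over the lattice.
direct = sum(phi(j) ** 3 for j in range(N))

# Fourier space: only triples with k1 + k2 + k3 = 0 (mod N) survive.
conv = 0.0
for k1, a1 in modes.items():
    for k2, a2 in modes.items():
        for k3, a3 in modes.items():
            if (k1 + k2 + k3) % N == 0:
                conv += (a1 * a2 * a3).real

print(direct, N * conv)                      # equal: modes couple in triples
```

The two numbers agree, which is the lattice version of the statement that an interaction of degree n couples n Fourier modes at a time.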
The fact that high energy interactions (e.g. collisions) between particles
can lead to new particles now raises a fundamental issue of principle: Since,
at any given time, we will have direct knowledge about interactions only
below some energy scale (say, Emax ), accessible in the lab, how can we say
anything about the low energy physics if the low energy modes are coupled
to high energy modes? Don’t we lose all predictability?
Actually we don’t. The resolution of the above problem is closely related
to the current paradigm in high energy physics called the renormalization
group and the concept of effective field theory, which is in fact the proper
language to describe quantum field theory.
To understand how this comes about, consider a more familiar situation.
The hydrodynamics of a fluid can be described by a set of variables like density, velocity etc. which satisfy a certain set of equations.3 These equations
are non-linear, which again tells you that high wavenumber (momentum)
modes of a fluid are coupled to low momentum modes of the fluid. Further,
these high wavenumber modes probe small spatial scales and we know for
a fact that, at sufficiently small scales, the hydrodynamical description is
incorrect — because matter is made of discrete particles. This fact, however, has never prevented us from using the equations of hydrodynamics
within its domain of validity, say, at scales sufficiently large compared to
the intermolecular separation. The effect of all the small scale phenomena
can be incorporated into a small set of parameters like the specific heat,
the coefficients of viscosity etc. which, of course, need to be determined by
experiments.
The situation is similar when we use fields to describe the interaction of
relativistic particles, but with some interesting new complications. By and
large, the low energy sector of the theory decouples from the high energy
sector — in a well-defined manner — and can be described in terms of
certain parameters in the low energy Lagrangian like the mass, charge etc.,
the values of which have to be determined by observations. More formally,
let us assume that the description in a given field theory is valid only at
energy scales below a particular value Λ. (This is analogous to saying that
the hydrodynamic description breaks down at length scales comparable to
the atomic size.) Let us divide the modes φ(k) into those below a particular energy E < Λ (which we will call low energy modes, φ_low) and those with energies between E and Λ (which we will call high energy modes, φ_high), and ask what is the effect of φ_high on φ_low.
There is a formal method of "integrating out" φ_high from the path integral defining the theory and incorporating its effects by a modified effective interaction between the φ_low modes. Among other things, this will have the effect of changing the values of the parameters in the original Lagrangian which were describing φ_low. The parameters like mass, charge etc. will now get modified to new values which will be functions of both E and Λ. In addition, this process will also introduce new terms into the Lagrangian which were not originally present.
Suppose we now lower the value of E. This will integrate out more and
more high energy modes thereby causing the low energy parameters of the
theory to “flow” or “run” as a function of E. The effect of high energy
interactions in the low energy sector of the theory is essentially determined
by the flow of these parameters. This flow has a natural description in
terms of the physical meaning attributed to the parameters (like m, q, λ
etc.) which occur in the Lagrangian.
As an example, let us consider the Lagrangian in Eq. (4.2) which has
the parameter q occurring in it. In the earlier discussion, we identified this
parameter as the charge of the quanta of the complex scalar field. But operationally, to determine this charge, we need to perform some experiments
³ In fact, the density and velocity in fluid mechanics are also fields in the sense that they are functions of space and time, governed by certain partial differential equations.
4 In nature both examples are seen
in model theories: quantum electrodynamics is a theory in which the
coupling constant becomes stronger at
higher energies while quantum chromodynamics (describing strong interaction of the quarks) is asymptotically
free.
5 No condensed matter physicist will
set the spatial cut-off scale to zero
and ask what happens. Some particle
physicists can be enormously naive —
or extremely optimistic depending on
your point of view!
6 The standard textbook approach
will be to first introduce perturbation
theory to study interacting fields. The
amplitudes computed in the perturbation theory will be divergent and the
concept of renormalization will be introduced to “remedy” this. We will do
some non-perturbative examples first
and discuss the renormalization and
‘running’ of coupling constants in the
non-perturbative context to stress the
conceptual separation between these
effects and the existence of divergences
in the perturbation theory.
involving these quanta. One could, say, scatter these particles off each other
electromagnetically, and by comparing the theoretically predicted scattering cross-section with the observationally determined one, we will be able
to determine the strength of the coupling q. But what we measure in such
a process is the value of q relevant for the scattering experiment, performed
at some energy scale E, say, which includes all the effects of the electromagnetic interaction. There is no a priori reason to believe that this will
be governed by the same parameter as introduced in the Lagrangian. One
describes this process — viz., that the physical parameters can be different
from the parameters introduced in the Lagrangian — by the term “renormalization” of the parameters in the Lagrangian. This is one of the features
of an interacting theory and is not difficult to understand conceptually by
itself.
The nature of the flow of each parameter decides its role in the theory.
Consider, for example, a coupling constant in the theory which increases
with E; then the theory will become more and more strongly coupled at
high energies. This also means that if we fix the value of the coupling constant at some very large energy scale, then it will keep decreasing as the energy is lowered and will become irrelevant at sufficiently low energy scales. Such
coupling constants, even if they are generated during the process of integrating out the high energy modes, are not going to affect the low energy
sector of the theory. We will have a description in terms of an effective field
theory where the corrections due to such terms are suppressed (or vanish)
at low energies compared to E.
On the other hand, if the coupling constant decreases with increasing
energy, then the theory will be strongly coupled at low energies but will
become “asymptotically free” at high energies.4 It is also possible for the
coupling constants to behave in a more complicated manner, like for example, first increase with energy and then decrease, or vice-versa.
The above description assumes that there is a cut-off scale Λ beyond
which we will not trust the field theory description at all — just as one
would not trust the hydrodynamic description at scales smaller than the
atomic size. It is a philosophical issue whether we should adhere to such a
point of view or not. If field theory is the ultimate truth5 then one is justified in considering the limit Λ → ∞. When you do that, something weird
happens in quantum field theory: Many of the correction terms to the low
energy Lagrangian will acquire divergent terms as their coefficients. When
we compute the amplitude for a physically relevant process, we obtain a
divergent result when there is no cut-off (Λ → ∞). In a class of theories
(called perturbatively renormalizable theories) these divergences can be accommodated by absorbing them in the parameters of the Lagrangian —
that is, by renormalizing λ, m, q etc; no new term needs to be dealt with.
It should be stressed that the conceptual basis for renormalization has
nothing to do with the divergences which arise either in perturbative or
nonperturbative description.6 For example, the effective mass of an electron in a crystal lattice is different from the mass of the electron outside the
lattice, with both being finite; this could be thought of as a renormalization of the mass due to the effect of interactions with the lattice, which has
nothing to do with any divergences. In this case, we can measure both the
masses separately, but in quantum field theory, since the fields always have
their interaction — which cannot be switched off, unlike the interaction of
the electron with the lattice which is absent when we take the electron out
— what we observe are the physical parameters that include all the effects
of interaction.
To summarize (and emphasize!), the real reason for renormalization
has to do with the following facts. We describe the theory by using a
Lagrangian containing a certain number (say, N) of parameters λ_A^0 with A = 1, 2, ..., N (like, e.g., the mass, charge, etc.) and a set of fields, neither of which, a priori, has any operational significance. The nature of the fields as well as the numerical values of the constants need to be determined in the laboratory through the measurement of physical processes. For example, in the study of the scattering of a φ-particle by another, one could relate the scattering cross-section to the parameters of the theory (like, e.g., the coupling constant λ). Measuring the scattering cross-section will then allow us to determine λ. But what we observe in the lab is the exact result of the full theory and not the result of the perturbative expansion truncated to some finite order! It is quite possible that part of the effect of the interaction is equivalent to modifying the original parameters λ_A^0 we introduced into the Lagrangian to the renormalized values λ_A^ren. (The interactions, of course, will have other effects as well, but one feature of the interaction could be the renormalization of the parameters.) When this is the case, no experiment will ever allow us to determine the numerical values of λ_A^0 we put into the Lagrangian, and the physical theory is determined, operationally, in terms of the renormalized parameters λ_A^ren. This happens in all the interacting theories and is the basic reason behind renormalization.⁷
It is useful to consider a situation different from λφ⁴ theory or QED to highlight another fact. Suppose you start with a theory (with a set of N parameters λ_A^0 in its Lagrangian) and, when you compute an amplitude perturbatively, there are divergences in the theory which cannot be reabsorbed in any of these parameters. Sometimes the situation can be remedied by introducing another k parameters, say μ_B (B = 1, ..., k), and when you study the perturbative structure of the new Lagrangian involving (N + k) parameters (λ_A, μ_B), all the divergences can be reabsorbed into these parameters.
We then might have a healthy theory using which we can make predictions.
But for most Lagrangians you can write down, including some very sensible Lagrangians (like the one used in Einstein’s theory of gravity), this
process does not converge. You start with a certain number of parameters
in the theory and the divergences which arise force you to add some more
parameters; these lead to new interactions and when you rework the structure of the theory you find that fresh divergences are generated which will
require still more parameters to be introduced, etc., ad infinitum. Such a
theory will require an infinite number of parameters to be divergence-free,
which is probably inappropriate for a fundamental theory.8 Fortunately, a
very wide class of theories in physics are renormalizable in the above sense.
In particular, the λφ4 theory and QED are classic examples of theories in
which this program works.
After this preamble, we will study (i) the self-interacting scalar field
and (ii) the coupled complex scalar-electromagnetic field system, using a
nonperturbative approximation called the effective action method.
⁷ Even if both λ_A^0 and λ_A^ren are finite quantities, we still need to introduce the concept of renormalization into the theory, acknowledge the fact that λ_A^0 are unobservable mathematical parameters, and re-express physical phenomena in terms of λ_A^ren. It so happens that in λφ⁴ theory, QED (and in many other theories) the parameters λ_A^0 and λ_A^ren differ by infinite values for reasons we do not quite understand. But the physical theory with finite λ_A^ren is quite well defined and agrees wonderfully with experiment. Thus, while renormalization has indeed helped us to take care of the infinities in these perturbative theories, the raison d'être for renormalization has nothing to do with divergences in the perturbative expansion.
⁸ It is important to appreciate that such theories could still be very useful and provide a perfectly sensible description of nature as effective theories valid below, say, a certain energy scale. This is done by arranging the terms in the Lagrangian, arising from the new parameters, in some sensible order so that their contributions are subdominant at low energies in a controlled manner. It is a moot point whether real nature might actually require an infinite number of parameters for its description if you want to study processes at arbitrarily high energies. Physicists hope it does not.
4.1.2 A First Look at the Effective Action

⁹ This notation is purely formal; the symbol Q, for example, could eventually describe a set of variables, like the components of a vector field. The detailed nature of these variables is not of importance at this stage; we will describe the concepts as though Q and q are quantum mechanical in nature and then generalize to the fields.

¹⁰ There is a formal way of introducing the idea of effective action, valid in more general contexts. We will describe it later on, at the end of Sect. 4.5. Here we take a more intuitive approach.
We will first introduce this concept in somewhat general terms and then
illustrate it later on with specific examples. Consider a theory which describes the interaction between two systems having the dynamical variables
Q and q. The action functional for the system, A[Q, q] will then depend9
on both the variables, possibly in a fairly complicated way. The full theory
can be constructed from the exact propagator
$$K(Q_2, q_2; Q_1, q_1; t_2, t_1) = \int \mathcal{D}Q\, \mathcal{D}q\; \exp\left(\frac{i}{\hbar}\, A[Q, q]\right) \tag{4.4}$$
which is quite often impossible to evaluate. The effective action method
is an approximation scheme available for handling Eq. (4.4). This method
is particularly appropriate if we are interested in a physical situation in
which one of the variables, say, Q, behaves nearly classically while the
other variable q is fully quantum mechanical. We then expect q to fluctuate
rapidly compared to Q which is considered to vary more slowly in space
and time. The idea is to investigate the effect of the quantum fluctuations
of q on Q. In that case, we can approach10 the problem in the following
manner:
Let us suppose that we can do the path integral over q in Eq. (4.4)
exactly for some given Q(t). That is, we could evaluate the quantity
$$F[Q(t); q_2, q_1; t_2, t_1] \equiv \exp\left(\frac{i}{\hbar}\, W\right) = \int \mathcal{D}q\; \exp\left(\frac{i}{\hbar}\, A[Q(t), q]\right) \tag{4.5}$$

treating Q(t) as some specified function of time. If we could now do

$$K = \int \mathcal{D}Q\; \exp\left(\frac{i}{\hbar}\, W[Q]\right) \tag{4.6}$$
exactly, we could have completely solved the problem. Since this is not
possible, we will evaluate Eq. (4.6) by invoking the fact that Q is almost
classical. This means that most of the contribution to Eq. (4.6) comes from
the extremal paths satisfying the condition
$$\frac{\delta W}{\delta Q} = 0 \tag{4.7}$$

Equation (4.7), of course, will contain some of the effects of the quantum fluctuations of q on Q, and is often called the 'semi-classical equation'.¹¹ The quantity W is called the 'effective action' in this context.

¹¹ It is often quite easy to evaluate Eq. (4.6) in this approximation and thereby obtain an approximate solution to our problem. Usually we will be content with obtaining the solutions to Eq. (4.7), and will not even bother to calculate Eq. (4.6) in this approximation.
The original action for the field was expressible as a time integral over a Lagrangian in quantum mechanics (and as an integral of a Lagrangian density over the four-volume d⁴x in quantum field theory) and, in this sense, is local. There is no guarantee that the effective action W[Q] can, in general, be expressed as an integral over a local Lagrangian. In fact, W[Q] will usually be nonlocal. But in several contexts in which we can compute W in some suitable approximation, it will indeed turn out to be local, or can be approximated by a local quantity. In such cases it is also convenient to define an effective Lagrangian through the relation

$$W = \int L_{\rm eff}\; dt \tag{4.8}$$
There is, however, one minor complication in the above formalism as
it stands. The way we have defined our expressions, the quantities K and
W depend on the boundary conditions (t2 , q2 , t1 , q1 ). We would prefer to
have an effective action which is completely independent of the q-degree of
freedom. The most natural way of achieving this is to integrate out the
effect of q for all times by considering the limit t2 → +∞, t1 → −∞ in
our definition of the effective action. We will also assume, as is usual, that
Q(t) vanishes asymptotically. From our discussion in Sect. 1.2.3, we know
that, in this limit, the path integral essentially represents the amplitude for
the system to go from the ground state (of the q degree of freedom) in the
infinite past to the ground state in the infinite future. Using the results of
Sect. 1.2.3, we can write
$$F(q_2, q_1; +\infty, -\infty) \equiv \exp\left(\frac{i}{\hbar}\, W[Q(t)]\right) = N(q_2, q_1)\; \langle E_0, +\infty\,|\,{-\infty}, E_0\rangle_{Q(t)} \tag{4.9}$$

where ⟨E₀, +∞|−∞, E₀⟩_{Q(t)} stands for the vacuum-to-vacuum amplitude in the presence of the external source Q(t), and N(q₂, q₁) is a normalization factor, independent of Q(t). Taking logarithms, we get:

$$W[Q(t)] = -i\hbar\; \ln\, \langle E_0, +\infty\,|\,{-\infty}, E_0\rangle_{Q(t)} + \text{(constant)} \tag{4.10}$$
Since the constant term is independent of Q, it will not contribute in
Eq. (4.7). Therefore, for the purposes of our calculation, we may take
the effective action to be defined by the relation
$$W[Q(t)] \equiv -i\hbar\; \ln\, \langle E_0, +\infty\,|\,{-\infty}, E_0\rangle_{Q(t)} \tag{4.11}$$

in which all reference to the quantum mode is eliminated. Note that |E₀⟩ is the ground state of the original system; that is, the ground state with Q = 0.
This discussion also highlights another important feature of the effective
action. We have seen in Sect. 2.1.2 that an external perturbation can cause
transitions in a system from the ground state to the excited state. In other
words, the probability for the system to be in the ground state in the
infinite future (even though it started in the ground state in the infinite
past) could be less than unity. This implies that our effective action W need not be a real quantity! If we use this W directly in Eq. (4.7), we have no assurance that our solution Q will be real. The imaginary part of W contains information about the rate of transitions induced in the q-system by the presence of Q(t); or, in the context of field theory, the rate of production of particles from the vacuum. The semi-classical equation is of very doubtful validity if these excitations drain away too much energy from the Q-mode. Thus we must confine ourselves to the situations in which

$$\text{Im}\, W \ll \text{Re}\, W \tag{4.12}$$

In that case, we can approximate the semi-classical equations to read¹²

$$\frac{\delta\, \text{Re}\, W}{\delta Q} = 0 \tag{4.13}$$
The above discussion provides an alternative interpretation of the effective action which is very useful. Let us suppose that Q(t) varies slowly
enough for the adiabatic approximation to be valid. We then know — from
¹² In most practical situations, Eq. (4.12) will automatically arise because of another reason. Notice that the success of the entire scheme depends on our ability to evaluate the first path integral in Eq. (4.5). This task is far from easy, especially because we need this expression for an arbitrary Q(t). Quite often, one evaluates this expression by assuming that the time variation of Q(t) is slow compared to the time scale over which the quantum variable q fluctuates. In such a case, the characteristic frequencies of the q-mode will be much higher than the frequency at which the Q-mode is evolving, and hence there will be very little transfer of energy from Q to q. The real part of W will dominate.
138
Chapter 4. Real Life I: Interactions
our discussion in Sect. 2.1.2 — that the ‘vacuum to vacuum’ amplitude is
given by [see Eq. (2.32)]
$$\lim_{t_2\to\infty}\ \lim_{t_1\to-\infty} F(q_2, q_1; t_2, t_1) \propto \exp\left(-\frac{i}{\hbar}\int_{-\infty}^{+\infty} E_0(Q)\, dt\right) \qquad (4.14)$$
This expression allows us to identify the effective Lagrangian as the ground
state energy of the q-mode in the presence of Q:
$$L_{\rm eff} = -E_0(Q) \qquad (4.15)$$
This result, which is valid when the time dependence of Q is ignored in
the calculation of E0 , provides an alternative means of computation of the
effective Lagrangian if the Q dependence of the ground state energy can be
ascertained.
To summarize, the strategy in using this approach is as follows. We first
integrate out the quantum degree of freedom q with asymptotic boundary
conditions and obtain the W [Q(t)] or Leff [Q(t)]. The imaginary part of Leff
will contain information about particle production which could take place
due to the interactions; the real part will give corrections to the original
classical field equations for the Q degree of freedom.
Let us now proceed from quantum mechanics to field theory. In the field
theoretic context, Q, q will become fields depending on space and time and
in many contexts we are interested in, the relevant action can be expressed
in the form, either exactly or as an approximation:
$$A[Q(x), q(x)] = \int d^4x\ L_0[Q(x)] - \frac{1}{2}\int d^4x\ q\,\hat{D}[Q]\,q \qquad (4.16)$$
13 We have now set $\hbar = 1$.
If we integrate out q(x), we will end up getting an effective action13 given
by
$$Z[Q(x)] = \int \mathcal{D}q\ e^{iA} = \exp\left(i\int d^4x\,(L_0 + L_{\rm eff})\right) \Rightarrow \exp\left(-\int d^4x\,(L_0 + L_{\rm eff})_E\right) \qquad (4.17)$$
where the last expression is valid in the Euclidean sector and
$$L_{\rm eff} = -\frac{i}{2}\int_0^\infty \frac{ds}{s}\ \langle x|e^{-isD}|x\rangle \ \Rightarrow\ -\frac{1}{2}\int_0^\infty \frac{ds}{s}\ \langle x|e^{-sD_E}|x\rangle \qquad (4.18)$$
The first form of Leff is in the Lorentzian space while the second form is
valid in the Euclidean sector (see Eq. (2.66) and Eq. (2.77)). We have
already seen that Leff can be interpreted as (i) the vacuum energy of the field
oscillators (see Eq. (2.71)) or (ii) the energy associated with particles in
closed loops (see Eq. (1.107)). This connects well with Eq. (4.14).
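Proper-time integrals of the kind appearing in Eq. (4.18) diverge logarithmically at s = 0, but the difference of two of them is finite: ∫₀^∞ (ds/s)(e^{−as} − e^{−bs}) = ln(b/a). This is the mechanism that later makes subtractions of the form ln(D/D₀) well defined. A minimal numerical sketch (the function name and grid parameters are illustrative choices, not from the text):

```python
import math

def frullani(a, b, eps=1e-10, n=100_000):
    # midpoint rule on a log-spaced grid for the integral of (e^{-a s} - e^{-b s})/s;
    # each term alone diverges like ln(1/eps), but the difference converges
    hi = 50.0                      # integrand is negligible beyond this
    ratio = (hi / eps) ** (1.0 / n)
    total, s = 0.0, eps
    for _ in range(n):
        s_next = s * ratio
        mid = 0.5 * (s + s_next)
        total += (math.exp(-a * mid) - math.exp(-b * mid)) / mid * (s_next - s)
        s = s_next
    return total

print(frullani(1.0, 7.0), math.log(7.0))
```

The two printed numbers agree to high accuracy, even though each exponential alone would give a cutoff-dependent, divergent answer.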
We can make some general comments about these expressions connecting up with our discussion in the last section. As we said before, Leff
can, in general, have a real and imaginary part. The imaginary part will
indicate the probability for creation of particles while the real part will
give corrections to L0 (Q). Suppose that the real part of Leff contained a
term which has exactly the same form as a term in the original Lagrangian
L0 (Q), except for a different proportionality constant. For example, in the
study of scalar field theory with Q = φ, the L0 (φ) will contain a mass term
−(1/2)m2 φ2 . Suppose when we consider the interaction of this field with
itself or other fields and integrate out the other degrees of freedom, we
end up getting an Leff which has a term −(μ/2)m2 φ2 . Then the quantum
corrected effective Lagrangian L0 (φ) + Leff (φ) will have a term which goes
as (1 + μ)(1/2)m2 φ2 .
But since the quantum fluctuations can never be switched off, it doesn't
make sense to think of the (1/2)m²φ² term and the (μ/2)m²φ² term as physically distinct entities, because we will only be able to observe their total
effect. In other words, the interaction modifies ("renormalizes") the parameter m² to the value m²(1 + μ). It is only this quantity which will be
physically observable and the original m2 we introduced in the “free” scalar
field Lagrangian has no operational significance because “free” Lagrangians
do not exist in nature. It follows that whenever Leff contains terms which
have the same structure as the terms in the original Lagrangian L0 , then
these corrections will not be separately observable; instead they renormalize the original parameters in the Lagrangian L0 .
All these arguments make sense even if μ is a divergent, infinite quantity,
as it will turn out to be. This divergence is disturbing and is possibly telling
us that the theory needs a cut-off, say, Λ, but the fact that such a term will
automatically get absorbed in the original parameters of the Lagrangian is
a result which is independent of the divergence. In this case, we need to
define the theory with a cut-off Λ beyond which we do not trust it. Then
the Leff can be interpreted as describing a low energy effective Lagrangian
obtained after integrating out certain degrees of freedom, usually those with
high energy or belonging to another field which we are not interested in. If
we choose to integrate out modes with energies in the range M < E < Λ,
then the correction μ will depend on both M and Λ. When we decrease M ,
integrating out more modes, the parameter μ and hence the physical mass
will “run” with M . This is one way to approach the ideas of renormalization
and running coupling constants through the effective action. We will see
specific examples for such a behaviour in the coming sections.
4.2 Effective Action for Electrodynamics
As an illustration of the above ideas, let us consider a charged scalar field
interacting with the electromagnetic field. In this case, we can take advantage of the fact that the electromagnetic field has a sensible classical
limit in which it obeys the standard Maxwell equations. It makes sense
to consider a nearly constant (very slowly varying) electromagnetic field
background and ask how the coupling of this electromagnetic field with a
charged scalar field affects the former. To answer this question, we need
to compute the effective action for the electromagnetic field by integrating
out the charged scalar field in the interaction term of the Lagrangian given
by Eq. (3.53). That is, we need to compute
$$e^{iA_{\rm eff}[F_{ik}]} = \int \mathcal{D}\phi\,\mathcal{D}\phi^*\ \exp\left(i\int d^4x\ \phi^*\left[(i\partial - qA)^2 - m^2\right]\phi\right) \qquad (4.19)$$
From Eq. (4.18), we see that we need to compute the matrix element
⟨x|e^{−isH}|x⟩ for the quantum mechanical Hamiltonian H = −(i∂ − qA)² + m². Once we have this matrix element, we can compute the effective Lagrangian Leff using Eq. (4.18). This computation actually turns out to be
140
Chapter 4. Real Life I: Interactions
easier than one would have imagined because one can use a series of tricks
to evaluate it. We will now explain the procedure.
Since we are interested in the case in which the vector field Aj varies
slowly in spacetime, we could ignore its second and higher derivatives and
treat it as arising due to a constant Fik ; i.e., we study a background electromagnetic field with constant E, B. The gauge invariance implies that
the Leff which we compute can only depend on the quantities (E 2 − B 2 )
and E · B. We will define two constants a and b by the relation
$$a^2 - b^2 = E^2 - B^2; \qquad ab = \mathbf{E}\cdot\mathbf{B} \qquad (4.20)$$

Exercise 4.1: Prove this claim.
so that Leff = Leff (a, b). In the general case with arbitrary E and B for
which a and b are not simultaneously zero, it is well known that, by choosing
our Lorentz frame suitably, we can make E and B parallel, say along the
y-axis. Such a field will have components [E = (0, E, 0); B = (0, B, 0)] and
can arise from the vector potential of the form Ai = [−Ey, 0, 0, −Bx] in a
specific gauge. In this particular case, a2 − b2 = E 2 − B 2 and ab = E · B =
EB. Therefore E = a and B = b. Once we find the effective Lagrangian
Leff (E, B) as a function of E and B in this particular case, the result for
the most general case can be obtained by simply substituting E = a and
B = b.
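Exercise 4.1 can at least be checked numerically: given arbitrary E and B, Eq. (4.20) determines a² and −b² as the roots of a quadratic in the two invariants, so a and b always come out real. A sketch (the helper name and sample vectors are arbitrary; the sign of E·B is absorbed so that ab ≥ 0, matching the convention that qa, qb are treated as positive):

```python
import math

def invariants_ab(E, B):
    # Solve a^2 - b^2 = E.E - B.B and ab = |E.B| (Eq. 4.20) for a, b >= 0
    P = sum(e * e for e in E) - sum(b_ * b_ for b_ in B)   # E^2 - B^2
    Q = sum(e * b_ for e, b_ in zip(E, B))                 # E . B
    # a^2 is the positive root of x^2 - P x - Q^2 = 0, and b^2 = a^2 - P >= 0
    a2 = 0.5 * (P + math.sqrt(P * P + 4.0 * Q * Q))
    return math.sqrt(a2), math.sqrt(a2 - P)

E = (0.3, 1.2, -0.5)
B = (1.0, 0.4, 0.2)
a, b = invariants_ab(E, B)
print(a, b, a * a - b * b, a * b)
```

For parallel fields along an axis, the function simply returns E = a and B = b, as the text asserts.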
For this choice of fields and vector potential, the H we need to deal
with is given by
$$H = -\left[(i\partial - qA)^2 - m^2\right] = -\left[(i\partial_t + qEy)^2 + \partial_x^2 + \partial_y^2 - (i\partial_z + qBx)^2 - m^2\right] \qquad (4.21)$$
While evaluating the matrix element ⟨x|e^{−isH}|x⟩, we can introduce a complete set of momentum eigenstates as far as the t and z coordinates are
concerned, since the Hamiltonian is independent of these two coordinates.
Taking
$$\langle p^0|t\rangle = e^{+ip^0 t}; \qquad \langle p_z|z\rangle = e^{-ip_z z} \qquad (4.22)$$
we can replace i∂t by −p0 and i∂z by pz in the expression for H in Eq. (4.21)
and integrate over p0 and pz . The coordinates x and y — which we will
indicate together as x⊥ — cannot be handled by going to the momentum
space since the Hamiltonian explicitly depends on these coordinates. So
we will leave them as they are. The problem now reduces to evaluating
$$\langle x^i|e^{+is[(i\partial - qA)^2 - m^2]}|x^i\rangle = \int \frac{dp^0\, dp_z}{(2\pi)^2}\ \langle x_\perp|e^{-isH}|x_\perp\rangle \qquad (4.23)$$
where
$$H = -\left[(-p^0 + qEy)^2 + \partial_y^2 + \partial_x^2 - (p_z + qBx)^2 - m^2\right] \qquad (4.24)$$
Introducing two new coordinates ℓ₁, ℓ₂ in place of x and y, by the definitions:
$$\ell_1 \equiv x + \frac{p_z}{qB}; \qquad \ell_2 \equiv y - \frac{p^0}{qE}, \qquad (4.25)$$
H is reduced to the form
$$H = \left[-\partial_1^2 + q^2 B^2 \ell_1^2\right] + \left[-\partial_2^2 - q^2 E^2 \ell_2^2\right] + m^2 \qquad (4.26)$$
This form is remarkably simple and should be familiar to you. It consists of
two separate operators in square brackets (the m2 term is just a constant
added to the Hamiltonian) of which the first one is the Hamiltonian for a
harmonic oscillator with mass (1/2) and frequency ω = 2|qB|. The second
square bracket corresponds to an "inverted" harmonic oscillator (i.e., one
with the sign of ω² reversed, so that ω → iω) with the same mass
and frequency. The matrix element in Eq. (4.23) is now trivial and can be
computed using the standard path integral kernel for a harmonic oscillator
in the coincidence limit. For the ℓ₁ degree of freedom, this is given by the
expression
$$K = \left(\frac{m\omega}{2\pi i \sin\omega t}\right)^{1/2} \exp\left[i\, m\omega x^2 \tan\frac{\omega t}{2}\right] \qquad (4.27)$$
Exercise 4.2: Revise your knowledge of the path integral for the harmonic oscillator and obtain this result.
evaluated with m = (1/2), ω = 2|qB|, t = s, x = ℓ₁. The result for ℓ₂ is
obtained by changing ω to iω in this expression and is given by
$$K = \left(\frac{m\omega}{2\pi i \sinh\omega t}\right)^{1/2} \exp\left[-i\, m\omega x^2 \tanh\frac{\omega t}{2}\right] \qquad (4.28)$$
evaluated with m = (1/2), ω = 2|qE|, t = s, x = ℓ₂. The integration over
p⁰ and p_z in Eq. (4.23) is equivalent to an integration over ℓ₁ and ℓ₂ with
dp⁰ = −qE dℓ₂, dp_z = qB dℓ₁. Doing these integrals and simplifying the
expressions, we find that:
$$\langle x^i|e^{-isH}|x^i\rangle = \frac{1}{16\pi^2 i}\ \frac{qB}{\sin qBs}\ \frac{qE}{\sinh qEs}\ e^{-im^2 s} \qquad (4.29)$$
The effective Lagrangian is obtained from this kernel using the result
$$L_{\rm eff} = 2\times\left(-\frac{i}{2}\right)\int_0^\infty \frac{ds}{s}\ \langle x|e^{-iHs}|x\rangle \qquad (4.30)$$
where the extra factor of 2 takes care of the fact that the complex scalar
field has twice the degrees of freedom as compared to a single field. This
gives the final result to be
$$L_{\rm eff} = -\int_0^\infty \frac{ds}{(4\pi)^2}\ \frac{e^{-i(m^2 - i\epsilon)s}}{s^3}\ \frac{|qB|s}{\sin|qB|s}\ \frac{|qE|s}{\sinh|qE|s} = -\int_0^\infty \frac{ds}{(4\pi)^2}\ \frac{e^{-i(m^2 - i\epsilon)s}}{s^3}\ \frac{|qb|s}{\sin|qb|s}\ \frac{|qa|s}{\sinh|qa|s} \qquad (4.31)$$
In arriving at the second expression, we have substituted E = a and B = b
as explained earlier. It is understood that the combinations qa and qb
(which arose from taking the positive square root of ω²) should be treated
as positive quantities irrespective of the individual signs of q, a, b. The
integration over s is slightly off the real axis, along s(1 − iε), as described
in Fig. 1.3. This requires the contour to be closed as shown in Fig. 4.1 on
the quarter circle at infinity in the lower right quadrant. Equivalently, we
can rotate the contour to go along the negative imaginary axis. This shows
that the poles of the integrand along the real axis do not contribute to the
integral because we are going below all these poles. On the other hand, any
pole on the negative imaginary axis needs to be handled separately in this
integral. We need to go around each of these poles in a small semi-circle in
Figure 4.1: Integration contour for Eq. (4.31).
14 The Leff can also be thought of as the vacuum energy of the field oscillators (as in Eq. (2.71)). It is possible to use this interpretation directly to obtain the effective Lagrangian for the electromagnetic field. We will describe this in the Mathematical Supplement, Sect. 4.9.1.
the complex plane which, as we will see, will lead to an imaginary part
in Leff.
From the integrand in Eq. (4.31), it is clear that the magnetic field
leads to poles along the real axis due to the sin(qbs) factor, which are
irrelevant; on the other hand, the electric field leads to poles along the
negative imaginary axis due to the sinh(qas) factor, which, as we will see,
will lead14 to an imaginary contribution to Leff . Our next task is to unravel
the physics contained within Leff which we will do in a step-by-step manner.
4.2.1 Schwinger Effect for the Charged Scalar Field
We will first extract the imaginary part of the effective action which represents the creation of particles from the vacuum. The simplest way to
compute this is to rotate the contour from along the positive real axis to
along the negative imaginary axis. Formally, this will lead to the expression:
$$L_{\rm eff}(a, b) = \int_0^\infty \frac{ds}{(4\pi)^2}\ \frac{e^{-m^2 s}}{s^3}\ \frac{qas}{\sin qas}\ \frac{qbs}{\sinh qbs} \qquad (4.32)$$
In this expression, the poles from the electric field lie along the path of
integration, but, as described earlier, these poles are to be avoided by going
in a semi-circle above them as illustrated in Fig. 4.2 for one particular pole.
Each of the poles at s = sₙ = (nπ/qa) with n = 1, 2, ... is avoided by going
around a small semicircle of radius ε in the upper half plane. The n-th pole
contributes, from this semicircle, the quantity
$$\int_{\theta=\pi}^{\theta=0} \frac{(\epsilon e^{i\theta}\, i\, d\theta)}{(4\pi)^2\, s_n^2}\ \frac{e^{-m^2 s_n}}{\cos(n\pi)\,\epsilon e^{i\theta}}\ \frac{qb\, s_n}{\sinh qb\, s_n} = i(-1)^{n+1}\,\frac{(qa)^2}{16\pi^3 n^2}\ \frac{qb\, s_n}{\sinh qb\, s_n}\ \exp\left(-\frac{m^2\pi n}{qa}\right) \qquad (4.33)$$
which is pure imaginary. So the total contribution to Im Leff is:
$$\mathrm{Im}\, L_{\rm eff} = \frac{1}{2}\,\frac{(qa)^2}{(2\pi)^3}\sum_{n=1}^{\infty} \frac{(-1)^{n+1}}{n^2}\ \frac{qb\, s_n}{\sinh qb\, s_n}\ \exp\left(-\frac{m^2\pi n}{qa}\right) \qquad (4.34)$$

Figure 4.2: Integration contours for Eq. (4.33).

It is now clear that (Im Leff) arises because of non-zero a, i.e. (i) whenever
there is an electric field in the direction of the magnetic field or (ii) if E is
perpendicular to B, but E² > B². (In this case, we can go to a frame in
which the field is purely electric.) If we set B = 0, we reduce the problem
to that of a constant electric field, and in this case, the imaginary part is
$$\mathrm{Im}\, L_{\rm eff} = \frac{1}{2}\,\frac{(qE)^2}{(2\pi)^3}\sum_{n=1}^{\infty} \frac{(-1)^{n+1}}{n^2}\ \exp\left(-\frac{\pi m^2 n}{|qE|}\right) \qquad (4.35)$$
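Because of the exponential factor, the sum in Eq. (4.35) converges extremely fast; for qE ≲ m² the n = 1 term already dominates. A numerical sketch in units with ℏ = c = 1 (the function name and parameter values are illustrative choices):

```python
import math

def schwinger_im_L(qE, m, nmax=200):
    # Im L_eff for a pure electric field, Eq. (4.35), in units with hbar = c = 1
    pref = 0.5 * qE * qE / (2.0 * math.pi) ** 3
    return pref * sum((-1) ** (n + 1) / n**2
                      * math.exp(-math.pi * m * m * n / abs(qE))
                      for n in range(1, nmax + 1))

rate = schwinger_im_L(qE=1.0, m=1.0)
first_term = 0.5 / (2.0 * math.pi) ** 3 * math.exp(-math.pi)
print(rate, first_term)
```

Since the series alternates, the full sum is slightly smaller than, and within a few percent of, the n = 1 term at these parameter values.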
The interpretation of this result is similar to the one we provided in
Sect. 2.1.2 in the case of an external source J(x) creating particles from
the vacuum (see Eq. (2.25)), but with some crucial differences. In that case,
we considered a real scalar field φ(x) coupled to an externally specified c-number source J(x) and found that |⟨0₊|0₋⟩_J|² can be related to the number
of quanta of the φ field produced per unit volume per unit time. As you can
see, we are computing exactly the same physical quantity now. We have
an externally specified electromagnetic field with a particular Ai which can
act as a source for the charged quanta of a complex scalar field. If we think
of the electric field, say, as being switched on adiabatically and switched off
adiabatically, then the initial vacuum state will not remain a vacuum state
at late times. The probability for the persistence of the vacuum is again given
by |⟨0₊|0₋⟩_A|², where the matrix element is now evaluated in an externally
specified Aᵢ. Writing Leff as (R + iI) and noting that
$$Z[A_i] = \langle 0_+|0_-\rangle_A = \exp\left(i\int d^4x\,(R + iI)\right), \qquad (4.36)$$
we get
$$|Z|^2 = |\langle 0_+|0_-\rangle_A|^2 = \exp\left(-2\int d^4x\ I\right) \qquad (4.37)$$
Thus, Im Leff gives the number of pairs of charged scalar particles created
by the electromagnetic field per unit time per unit volume.15
There are, however, two surprising features about this result which
are worth noting.16 First, we find that a static electric field can produce
particles from the vacuum, unlike the case of a time dependent J(x) which
produces particles in the example we saw in Sect. 2.1.2. Second, and
more important, is the fact that the expression obtained in Eq. (4.35)
is non-analytic in q. Conventional quantum electrodynamics (examples
of which we will discuss in the next chapter) is based on a perturbative
series expansion in the coupling constant q. The non-analyticity of
the Schwinger effect shows that perturbation theory in q will never reproduce
this result, no matter how many terms of the series you sum. Thus the
Schwinger effect is a genuinely non-perturbative result in quantum field theory.17
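The non-analyticity is easy to see concretely: a function like exp(−πm²/|qE|) vanishes faster than any power of q as q → 0, so every coefficient of its Taylor series about q = 0 is zero. A small sketch (with m = E = 1 chosen purely for illustration):

```python
import math

def f(q):
    # stand-in for the Schwinger factor exp(-pi m^2 / |qE|) with m = E = 1
    return math.exp(-math.pi / q)

# f(q)/q^n -> 0 as q -> 0 for every n, so all Taylor coefficients at q = 0 vanish
for n in (1, 5, 20):
    print(n, f(1e-2) / 1e-2 ** n)
```

Every printed ratio is vanishingly small: no polynomial in q, however long, can approximate this function near q = 0.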
4.2.2 The Running of the Electromagnetic Coupling
Having disposed of the imaginary part of Leff, let us consider its real part.
The full action for the electromagnetic field — when we take into account
its coupling to the charged scalar field — is given by the sum L0 + Re Leff
where L0 = (1/2)(E 2 − B 2 ) which will, for example, lead to corrections
to the classical Maxwell equations. We certainly expect Re Leff to be
analytic18 in the coupling constant q so that one could think of it as a
small perturbative correction to L0 . The real part of Eq. (4.31) is given by
$$\mathrm{Re}\, L_{\rm eff} = -\int_0^\infty \frac{ds}{(4\pi)^2}\ \frac{\cos m^2 s}{s^3}\ \frac{qas}{\sinh qas}\ \frac{qbs}{\sin qbs} \qquad (4.38)$$
While one can work with this expression, the oscillation of the cosine function at the upper limit of the integral makes it a bit tricky; therefore, it
is preferable to go back to the expression in Eq. (4.32) which we used
previously:
$$L_{\rm eff}(a, b) = \int_0^\infty \frac{ds}{(4\pi)^2}\ \frac{e^{-m^2 s}}{s^3}\ \frac{qas}{\sin qas}\ \frac{qbs}{\sinh qbs} \qquad (4.39)$$
We know how to handle the poles along the real axis arising from the sine
function and leading to an imaginary part when the integration contour
is deformed to a semi-circle. What remains is just a principal value of
15 This result, viz. that a constant electric field can produce charged pairs from the vacuum, is known as the Schwinger effect and, for once, it is named correctly; Schwinger did work out the clearest version of it.
16 There are several ways of "understanding" this qualitatively, none of which is totally satisfactory; so it is best if you take it as a fact of life. You may legitimately wonder where the energy for the creation of particles comes from. It is provided by the agency which is maintaining the constant electric field. Suppose you keep two charged capacitor plates maintaining a constant electric field in between. The pairs produced from the vacuum will move towards the oppositely charged plates, shorting the field. The agency has to do work to prevent this from happening, and this work goes into the energy of the particles.
17 That is why we began with this discussion, before you lose your innocence in the perturbation expansions!
18 If Re Leff turns out to be non-analytic in q, we are in deep trouble and one cannot obtain the free electromagnetic field as the q = 0 limit. It is reassuring that this does not happen.
the integral which is real. With this understanding — which effectively
ignores the poles along the real axis — we can continue to work with this
expression. This is what we will do.
The first point to note about Re Leff is that it is divergent near s = 0.
There are two sources of this divergence, one trivial and the other conceptually important. The trivial source of divergence can be spotted by
noticing that Re Leff is divergent even when E = B = 0. This divergence is that of Leff for a free complex scalar field, and it essentially arises
from the sum of the zero point energies of the field, which we studied in
Sect. 2.2.1 (see Eq. (2.71)). We mentioned while discussing the expression
in Eq. (2.73) that the integral representation for ln D should be interpreted
as ln(D/D₀); this corresponds to subtracting out the Leff in the absence
of the electromagnetic field. So, this divergence can be legitimately removed
by simply subtracting out the value for E = B = 0. Thus, we modify
Eq. (4.38) to
$$\mathrm{Re}\, L_{\rm eff} \equiv R = \int_0^\infty \frac{ds}{(4\pi)^2}\ \frac{e^{-m^2 s}}{s^3}\left[\frac{q^2 ab\, s^2}{\sinh qbs\ \sin qas} - 1\right] \qquad (4.40)$$
Since the subtracted term is a constant independent of E and B, the equations of motion are unaffected.
The expression R is still logarithmically divergent near s = 0, since the
quantity in the square brackets behaves as +(1/6)q²s²(a² − b²) near s = 0.
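This small-s behaviour of the square bracket in Eq. (4.40) can be confirmed numerically; the ratio of the bracket to (1/6)q²s²(a² − b²) tends to 1 as s → 0. A sketch with arbitrary sample values:

```python
import math

def bracket(s, q, a, b):
    # the square bracket of Eq. (4.40)
    return (q * a * s / math.sin(q * a * s)) * (q * b * s / math.sinh(q * b * s)) - 1.0

q, a, b = 0.3, 1.1, 0.7
for s in (1e-2, 1e-3, 1e-4):
    approx = (1.0 / 6.0) * q**2 * s**2 * (a**2 - b**2)
    print(s, bracket(s, q, a, b) / approx)   # ratio tends to 1 as s -> 0
```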
But notice that this divergent term is proportional to (a2 − b2 ) = E2 − B2 ,
which is the original — uncorrected — Lagrangian. As we described right
at the beginning, any term in Leff which has the same structure as a term
in the original Lagrangian cannot be observed separately in any physical
process and merely renormalizes the parameters of the theory. To handle
this, let us follow the same procedure we described earlier. We will write
$$L_{\rm total} = L_0 + \mathrm{Re}\, L_{\rm eff} = (L_0 + L_c) + (\mathrm{Re}\, L_{\rm eff} - L_c) \equiv L_{\rm div} + L_{\rm fin} \qquad (4.41)$$
where Ldiv ≡ L0 + Lc is divergent and Lfin ≡ Re Leff − Lc = R − Lc is
finite with Lc being the divergent part of R given by
$$L_c = \frac{1}{(4\pi)^2}\int_0^\infty \frac{ds}{s^3}\, e^{-m^2 s}\ \frac{1}{6}(qs)^2(a^2 - b^2) = \frac{q^2}{6(4\pi)^2}\int_0^\infty \frac{ds}{s}\, e^{-m^2 s}\,(a^2 - b^2) \equiv \frac{Z}{2}(a^2 - b^2) = \frac{Z}{2}(\mathbf{E}^2 - \mathbf{B}^2) \qquad (4.42)$$
19 The notation Z is rather conventional; in fact, when we study QED you will encounter Z₁, Z₂ and Z₃, all being divergent!
where Z is given by19
$$Z = \frac{q^2}{48\pi^2}\int_0^\infty \frac{ds}{s}\ e^{-m^2 s} \qquad (4.43)$$
This integral, and hence Z, is divergent but that is not news to us. We
can, as usual, combine Lc with our original Lagrangian L0 and write the
final result as Ltot = Ldiv + Lfin , where
$$L_{\rm div} \equiv (L_0 + L_c) = \frac{1}{2}(1 + Z)(\mathbf{E}^2 - \mathbf{B}^2) \qquad (4.44)$$
and Lfin ≡ (Re Leff − Lc) is given by
$$L_{\rm fin} = \frac{1}{(4\pi)^2}\int_0^\infty \frac{ds}{s^3}\, e^{-m^2 s}\left[\frac{q^2 s^2 ab}{\sin(qsa)\,\sinh(qsb)} - 1 - \frac{1}{6}\,q^2 s^2 (a^2 - b^2)\right] \qquad (4.45)$$
4.2. Effective Action for Electrodynamics
145
The quantity Lfin is perfectly well-defined and finite. [The leading term
coming from the square bracket near s = 0 is proportional to s4 and hence
Lfin is finite near s = 0.] Further, it can be expanded in a Taylor series in q
perturbatively, giving finite corrections to the classical electromagnetic Lagrangian. The leading order term has the form (with ℏ and c re-introduced)
$$L_1 = \frac{q^4 (\hbar/mc)^3}{90(4\pi)^2\, mc^2}\left[\frac{7}{4}\left(\mathbf{E}^2 - \mathbf{B}^2\right)^2 + \left(\mathbf{E}\cdot\mathbf{B}\right)^2\right] \qquad (4.46)$$
Exercise 4.3: Just for fun, do the Taylor series expansion and verify this expression. For a more challenging task, compute the corrections to Maxwell's equations due to this term.
As regards Ldiv, we know that only the combination with the (1 + Z)
factor will be physically observable, and not the individual terms. To make
this formal, we shall now redefine all our field strengths and charges by the
rule
$$\mathbf{E}_{\rm phy} = (1 + Z)^{1/2}\mathbf{E}; \qquad \mathbf{B}_{\rm phy} = (1 + Z)^{1/2}\mathbf{B}; \qquad q_{\rm phy} = (1 + Z)^{-1/2}\, q \qquad (4.47)$$
This is, of course, the same as scaling a and b by (1+Z)1/2 leaving (qphy Ephy ) =
qE invariant. Since only the products qa, qb appear in Lfin , it can also
be expressed in terms of (qphy Ephy ). Once we have introduced physical
variables, Ldiv reduces to the standard electromagnetic Lagrangian in terms
of physical variables, with corrections arising from Lfin also being expressed
entirely in terms of physical variables.
All that remains is to study how the coupling constant runs in this case
by evaluating the integral in Eq. (4.43) after regularizing it.20 Here we
shall use the simpler technique of calculating it with a cut-off. Since it is
well defined for s → ∞ and the divergence arises from the lower limit, we
will first regularize it by introducing a cut-off Λ ≡ (1/M 2 ) at the lower
end.21 An integration by parts then gives the result:
$$\int_{M^{-2}}^{\infty} \frac{ds}{s}\ e^{-m^2 s} = e^{-m^2/M^2}\,\ln M^2 + m^2\int_{M^{-2}}^{\infty} ds\ e^{-m^2 s}\,\ln s \qquad (4.48)$$
We can now set M⁻² = 0 in the second term and in e^{−m²/M²}. A simple
calculation now gives:
$$\int_{M^{-2}}^{\infty} \frac{ds}{s}\ e^{-m^2 s} = \ln M^2 - \ln m^2 + \int_0^{\infty} du\ e^{-u}\,\ln u = \ln\frac{M^2}{m^2} - \gamma_E \qquad (4.49)$$
where γE is Euler’s constant defined in Eq. (2.82).
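For M ≫ m, the result in Eq. (4.49) can be checked by direct numerical integration (a sketch; the integration scheme and parameter values are ad hoc choices):

```python
import math

def regulated_integral(m, M, n=100_000):
    # the cut-off integral of Eq. (4.49): \int_{1/M^2}^infty (ds/s) e^{-m^2 s},
    # midpoint rule on a log-spaced grid
    lo, hi = 1.0 / M**2, 60.0 / m**2     # e^{-m^2 s} is negligible beyond hi
    ratio = (hi / lo) ** (1.0 / n)
    total, s = 0.0, lo
    for _ in range(n):
        s_next = s * ratio
        mid = 0.5 * (s + s_next)
        total += math.exp(-m * m * mid) / mid * (s_next - s)
        s = s_next
    return total

gamma_E = 0.5772156649015329
m, M = 1.0, 1.0e4
print(regulated_integral(m, M), math.log(M**2 / m**2) - gamma_E)
```

The two printed numbers agree up to corrections of order m²/M², confirming the logarithmic dependence on the cut-off.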
Since we are going to let M → ∞, the finite term given by γE is
not relevant and can be dropped. Also note that the original integral is
invariant under s → m²s when there is no cut-off, since any constant rescaling
of s leaves the measure ds/s unchanged. This is also indicated by the fact that
the integral reduces to logarithms, where any change of scale only adds a
finite constant, which, according to our philosophy, can be dropped. That
is, if we write
$$\ln(m^2/M^2) = \ln(\mu^2/M^2) + \ln(m^2/\mu^2) \qquad (4.50)$$
with some finite energy scale μ introduced by hand then — since we are
dropping all finite terms compared to ln M — the factor ln(m2 /μ2 ) can be
ignored. Thus the value of our integral is essentially ln(M 2 /μ2 ) where μ is
a finite energy scale which we have introduced into the problem and M 2
is a high energy cut-off, and we are supposed to let M → ∞. Putting all
these in, we find that
$$Z = \frac{q^2}{48\pi^2}\int_0^{\infty}\frac{ds}{s}\ e^{-m^2 s} = \frac{q^2}{48\pi^2}\,\ln\frac{M^2}{\mu^2} \qquad (4.51)$$
20 There are several ways to do this
and later on, in the next chapter, we
will do it using dimensional regularization. Right now, we will take a simple
minded approach, of introducing a cutoff, which is adequate.
21 Note that s and hence Λ have dimensions of (mass)−2 . So M used for
the cut-off represents a (formally infinite) energy scale.
Therefore, from Eq. (4.47) we find that the physical coupling constant is
given by
$$q^2_{\rm phys}(\mu) = q^2\left[1 + \frac{q^2}{48\pi^2}\,\ln\frac{M^2}{\mu^2}\right]^{-1} \qquad (4.52)$$
22 Clearly, q²phys can be finite when M → ∞ only if this unobservable parameter in the Lagrangian is divergent. So it is nice that it is unobservable.
where, again, μ is a finite energy scale which we have introduced into the
problem and M 2 is a high energy cut-off which is formally infinite. Further,
the quantity q 2 in this expression is a parameter in the Lagrangian which
has no operational significance.22 To interpret this expression, we first take
the reciprocal:
$$\frac{1}{q^2_{\rm phys}(\mu)} = \frac{1}{q^2} + \frac{1}{48\pi^2}\,\ln\frac{M^2}{\mu^2}. \qquad (4.53)$$
and think of the coupling constant q²phys as the one observed using an
experiment at energy scale μ. If we change this energy scale from μ to μ′,
the coupling constant will change by the relation
$$\frac{1}{q^2_{\rm phys}(\mu')} = \frac{1}{q^2_{\rm phys}(\mu)} + \frac{1}{48\pi^2}\,\ln\frac{\mu^2}{\mu'^2} \qquad (4.54)$$
or, equivalently,
$$q^2_{\rm phys}(\mu') = q^2_{\rm phys}(\mu)\left[1 + \frac{q^2_{\rm phys}(\mu)}{48\pi^2}\,\ln\frac{\mu^2}{\mu'^2}\right]^{-1}. \qquad (4.55)$$
23 You could not have done any of these if the divergent term in the effective Lagrangian were not proportional to the original electromagnetic Lagrangian. This is what allowed us to make the transformations in Eq. (4.47) in the first place. In a way, the rest of it is just a matter of detail.
This expression, which is completely independent of the cut-off scale, relates
the electromagnetic coupling constant at two different energy scales.23 We
can write this as:
$$\frac{1}{q^2_{\rm phys}(\mu)} + \frac{1}{48\pi^2}\,\ln\mu^2 = \text{constant} \qquad (4.56)$$
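Equations (4.52)-(4.55) can be verified numerically: compute q²phys at two scales from the same bare q² and cut-off M, and check that the relation between the two values no longer involves M. A sketch with illustrative numbers:

```python
import math

def q2_phys(q2_bare, M, mu):
    # Eq. (4.52): physical coupling at scale mu with cut-off M and bare coupling q^2
    return q2_bare / (1.0 + q2_bare / (48.0 * math.pi**2) * math.log(M**2 / mu**2))

q2_bare, M = 0.3, 1.0e15
mu1, mu2 = 1.0, 100.0
lhs = 1.0 / q2_phys(q2_bare, M, mu2)
rhs = 1.0 / q2_phys(q2_bare, M, mu1) + math.log(mu1**2 / mu2**2) / (48.0 * math.pi**2)
print(lhs, rhs)   # Eq. (4.54): the cut-off M has dropped out of the relation
```

The same numbers also show the coupling increasing with the energy scale, as stated below.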
The variation of the coupling constant with the energy scale is usually characterized by something called the beta function of the theory, defined as
$$\beta(q) \equiv \mu\,\frac{\partial q}{\partial\mu} = \frac{q^3}{48\pi^2} \qquad (4.57)$$
24 The β function in QED is a celebrated result and is usually given for spin-1/2 particles, for which you get q³/12π². (The spin-0 case differs by a factor 4.) As we shall see in Sect. 5.6.5, this result for the spin-1/2 particle can also be obtained non-perturbatively by a similar analysis.
where we have omitted the subscript ‘phys’. Any of these expressions
will tell you that the electromagnetic coupling constant increases with the
energy scale. That is, when you probe the charge distribution of a charged
particle at higher and higher energies, its effective strength will increase.24
This is your first glimpse of handling the divergences in a theory by
trading off unobservable parameters in a Lagrangian for operationally defined parameters. We have necessarily kept the discussion a bit cursory,
because we will take up the renormalization of QED in much more detail,
later on in the next chapter. Let us now move on to another example of
the effective action.
4.3 Effective Action for the λφ⁴ Theory
So far, we have described the concept of effective action in terms of two sets of
degrees of freedom as though they were distinct. There is, however, no real
need for this and one can use a similar technique to study the effect of high
energy modes in the low energy sector of the same field. As an example,
consider a field Φ which has a non-trivial self interaction; for example, the
Lagrangian could have the form L = (1/2)(∂Φ)2 − (1/2)m2 Φ2 − V (Φ). In
the absence of V this represents a free field with quanta of mass m. Let us
now consider separating out the field Φ into two parts Φ = φc + φq where
we think of φc as containing modes with low wavenumber and energy (in
Fourier space) and φq as containing modes with high wave number and
energy. We can now define an effective action Aeff [φc ] for φc by integrating
out φq . That is, the effective action for the low energy sector of the theory
with the modes φc is now defined by the path integral over φq :
$$e^{iA_{\rm eff}[\phi_c]} = \int' \mathcal{D}\phi_q\ \exp\left(iA[\phi_c + \phi_q]\right) \qquad (4.58)$$
Of course, an unconstrained path integral over φq on the right hand side of
Eq. (4.58) is the same as an unconstrained path integral over φ = φc + φq
(and hence will lead to a left hand side which is independent of φc), which
would be a rather silly thing to do! This is not what we want, and an equation
like Eq. (4.58) makes sense only if certain implicit restrictions are imposed
on the path integral over φq. (This is indicated by a prime on the integral
in Eq. (4.58).) The nature of the restrictions will depend on the context, but
here we assume that there is some sensible way of distinguishing between
high energy and low energy modes. (The most common procedure is to
assume φc is constant and expand A[φc + φq] up to quadratic order in φq.)
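The split Φ = φc + φq into low and high wavenumber modes can be visualized with a toy one-dimensional periodic field: project onto Fourier modes and keep |k| ≤ k_cut in φc and the rest in φq. This is only an illustration of the decomposition, not the book's procedure; all names and values below are made up:

```python
import cmath, math

def split_modes(field, k_cut):
    # discrete Fourier transform by direct summation (O(N^2), fine for a toy)
    N = len(field)
    modes = [sum(field[x] * cmath.exp(-2j * math.pi * k * x / N) for x in range(N)) / N
             for k in range(N)]
    def rebuild(keep):
        # inverse transform restricted to a subset of modes; the field is real,
        # so we keep only the real part of the reconstruction
        return [sum(modes[k] * cmath.exp(2j * math.pi * k * x / N)
                    for k in range(N) if keep(k)).real
                for x in range(N)]
    def kdist(k):
        return min(k, N - k)   # wavenumber measured from zero (periodic grid)
    phi_c = rebuild(lambda k: kdist(k) <= k_cut)   # slow modes
    phi_q = rebuild(lambda k: kdist(k) > k_cut)    # fast modes
    return phi_c, phi_q

field = [math.sin(2 * math.pi * x / 32) + 0.2 * math.sin(2 * math.pi * 9 * x / 32)
         for x in range(32)]
phi_c, phi_q = split_modes(field, k_cut=4)
print(max(abs(c + q - f) for c, q, f in zip(phi_c, phi_q, field)))
```

Here φc picks up the slow k = 1 oscillation and φq the fast k = 9 one; their sum reproduces the original field to machine precision.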
In general, the path integral in Eq. (4.58) will lead to an Aeff (φc ) with
terms (a) which have the same structure as the ones we originally put
in A(Φ) as well as (b) those which were not originally present in A(Φ).
As an example of (a), suppose we get a term proportional to φ² [say,
(μ/2)m²φ²]. Then, the net effect will be to change the value of what we
originally thought was the mass of the quanta, by m² → m²(1 + μ). You
will invariably find that the terms which are generated in Aeff (φc ) come
with divergent coefficients; for example, in the above case μ will depend
on the cut-off scale and could diverge if the cut-off goes to infinity. As
regards such divergent terms generated by the path integral in Eq. (4.58),
which are of the same form as those that are already present in A(Φ), we
are in good shape. All the divergences can now be absorbed by redefining
the parameters in the original theory making them ‘run’. Since only such
modified parameters are physically relevant, these divergences need not
bother us.
In addition to the divergent terms, the integration will also generate
finite corrections to the original potential V , which, of course, are what we
would like to study. These potentials will also have parameters which, as
we described above, will run with the scale. This is broadly the situation
in theories which are called renormalizable.
The path integral in Eq. (4.58) can also generate divergent terms which
were not originally present in A(Φ). One can try adding these terms and
redoing the computation to see whether self-consistency can be achieved at
some stage. If that happens, we are back to the previous situation after a
few iterations; it is just that we originally started with a potential which
was missing some terms generated by quantum effects. Usually, however,
this process does not converge. The path integral in Eq. (4.58) will produce
divergent terms which are not in the action you had started with, irrespective of how hard you try. Such a theory is called non-renormalizable.
25 If it has a (1/2)m²φ² part with real m, then that part should conventionally be clubbed with the free field, and only the rest of V(φ) will usually be called the "interaction". You will see soon that in this example it is simpler to work with the full V(Φ) rather than separate out the quadratic term.
26 A more formal procedure to define the effective action in this context is to think of φc as the expectation value ⟨ψ|Φ|ψ⟩ and minimize the energy subject to this constraint. If you implement it with a Lagrange multiplier J added to the Hamiltonian H, then the minimum energy E(J) will be related to φc by (dE/dJ) = φc. Inverting this relation, we get J(φc) and, doing a Legendre transform, we get E(φc) = E − J(dE/dJ). Then E(φc) is essentially your effective potential. In this case we are integrating over quantum fluctuations keeping ⟨ψ|Φ|ψ⟩ fixed. We will explore a related approach in Sect. 4.5.
27 Since we are assuming φc is a constant, Leff = Veff(φc), the effective potential in the Euclidean sector. The expression for Veff(φc) will be the same in the Lorentzian sector as well, since analytic continuation in time does not change it.
28 Note that Λ has the dimensions of (mass)⁻², so that Λ = (1/M²) where M corresponds to the high energy cut-off scale, as in the case of electrodynamics.
29 Note that Veff is manifestly independent of μ because it cancels out between the two logarithms. This is precisely what we did to get Eq. (2.83). Of course, with proper interpretation, no observable effect can depend on μ².
We will now illustrate the above concepts in the simple context of a self-interacting scalar field which is described classically by a Lagrangian of the form

L_0 = \frac{1}{2}\,\partial_a\Phi\,\partial^a\Phi - V(\Phi) \qquad (4.59)
where V (Φ) is a potential describing the interactions.25 Writing Φ = φc +
φq , where φc is a constant, and expanding A[φc + φq ] up to quadratic order
in φq , we get an expression like in Eq. (4.16), viz.26
A = \int d^4x\, L(\phi_c) - \frac{1}{2}\int d^4x\, \phi_q\,\hat{D}(\phi_c)\,\phi_q\ ; \qquad \hat{D} = \Box + V''(\phi_c) \qquad (4.60)
The computation of the effective action now requires evaluating the expression:
Z[\phi_c] = e^{iA_{\rm eff}[\phi_c]} = \exp\left[i\int d^4x\, L_{\rm eff}(\phi_c)\right] = \int \mathcal{D}\phi_q\, \exp\left[-\frac{i}{2}\int d^4x\, \phi_q\left(\Box + \omega^2\right)\phi_q\right] \qquad (4.61)
where ω² ≡ V''(φ_c). But we have computed this in several forms before in Sect. 2.2.1. In Eq. (2.86) we have provided three different expressions for computing this L_eff. We will use the expression given by27

L_{\rm eff} = -\frac{1}{8\pi^2}\int_0^\infty \frac{d\lambda}{\lambda^3}\, e^{-\frac{1}{2}\omega^2\lambda} \equiv V_{\rm eff}(\phi_c) \qquad (4.62)
with ω² = V''(φ_c). As we discussed in Sect. 2.2.1, this integral can be evaluated either with a cut-off or by dimensional regularization. Here we shall regularize the integral using a cut-off. That is, we will evaluate this integral with the lower limit set to some small value Λ = 1/M² and then consider the limit28 of M → ∞, Λ → 0. The result of the integration is given by Eq. (2.83) which is reproduced here:
V_{\rm eff} = -\frac{1}{8\pi^2}\int_{M^{-2}}^\infty \frac{d\lambda}{\lambda^3}\, e^{-\frac{1}{2}\omega^2\lambda} \qquad (4.63)
            = -\frac{1}{16\pi^2}\left[M^4 - \frac{\omega^2 M^2}{2} + \frac{\omega^4}{4}\ln\frac{M^2}{\mu^2}\right] + \frac{1}{64\pi^2}\,\omega^4\ln\frac{\omega^2}{2\mu^2} + \frac{1}{64\pi^2}\,\gamma_E\,\omega^4
where γE is Euler’s constant, defined in Eq. (2.82), and μ is an arbitrary
energy scale we have introduced to separate out the divergent and finite
terms.29
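The structure of this expansion can be checked numerically. The sketch below is our own illustration (the function name and the parameter choices are ours, not the book's): it evaluates the cut-off proper-time integral I(M) = ∫_{1/M²}^∞ dλ λ⁻³ e^{−ω²λ/2} by Simpson's rule and confirms the leading quartic and quadratic growth with M that produces the divergent terms.

```python
import math

def cutoff_integral(omega, M, n=200001):
    """I = \\int_{1/M^2}^infty dlam lam^-3 exp(-omega^2 lam/2),
    by Simpson's rule on a logarithmic grid u = ln(lam)."""
    a, b = 1.0 / M**2, 0.5 * omega**2
    u0, u1 = math.log(a), math.log(400.0 / omega**2)   # integrand negligible beyond u1
    h = (u1 - u0) / (n - 1)

    def f(u):
        lam = math.exp(u)
        return math.exp(-b * lam) / lam**2   # lam^-3 integrand times dlam = lam du

    s = f(u0) + f(u1)
    for k in range(1, n - 1):
        s += f(u0 + k * h) * (4 if k % 2 else 2)
    return s * h / 3.0

# leading divergences for omega = 1: I ~ M^4/2 - M^2/2 + O(ln M)
for M in (10.0, 30.0):
    print(M, cutoff_integral(1.0, M), M**4 / 2 - M**2 / 2)
```

For ω = 1 the numerical value tracks M⁴/2 − ω²M²/2 up to finite, logarithmically growing corrections, which is the quartic-plus-quadratic divergence structure discussed above.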
We are supposed to take the limit of M → ∞ in this expression. We see that the last two terms involving ω⁴ ln ω² and ω⁴ are finite but depend on the arbitrary scale μ we have introduced. The terms in the square bracket diverge as M → ∞; there are quartic, quadratic and logarithmically divergent terms. Of these, the quartic term can be dropped because it is just an infinite constant. We can also drop the finite term proportional to γ_E ω⁴, since it only changes the value of M which is anyway arbitrary at this stage. But the next two terms, involving M²ω² and −ω⁴ ln(M²/μ²), cannot be dropped because they depend on ω² as well. The remaining expression is
V_{\rm eff} = \frac{1}{32\pi^2}\,V'' M^2 + \frac{1}{64\pi^2}\,(V'')^2\ln\frac{\mu^2}{M^2} + \frac{1}{64\pi^2}\,(V'')^2\ln\frac{V''}{2\mu^2} \qquad (4.64)
4.3. Effective Action for the λφ⁴ Theory
We see that when we integrate out the high energy modes from the theory, the resulting potential governing the dynamics of φ_c gets modified by the addition of the V_eff given by the above expression. We must think up some way of interpreting this. The last term, (1/64π²)(V'')² ln(V''/2μ²), in V_eff is finite and presumably can be interpreted if we can interpret the μ dependence. From the fact that μ² occurs in the combination ln(μ²/M²), we can think of this as arising from integrating out modes with energies in the range μ < E < M. One striking feature of this expression is that even the finite term now describes a potential quite different from what we started out with. That is, the high energy modes which were integrated out have led to new interaction terms in the low energy sector. The real trouble, of course, arises from the divergent terms.
Let us examine the nature of the divergent part when the original potential was an n-th degree polynomial in φ with n coefficients λ₁, λ₂, ⋯, λₙ:

V(\phi_c) = \lambda_1\phi_c + \lambda_2\phi_c^2 + \cdots + \lambda_n\phi_c^n = \sum_{k=1}^{n}\lambda_k\phi_c^k \qquad (4.65)
In the last equation, we have dropped the subscript c from φ_c for notational simplicity with the understanding that we will hereafter be dealing only with the low energy modes of the theory. Then V'' will be a polynomial of (n−2)-th degree and (V'')² will be a polynomial of 2(n−2)-th degree. Two completely different situations arise depending on whether this is greater than n or not. When

2(n-2) \le n \qquad {\rm i.e.,} \qquad n \le 4, \qquad (4.66)
all the terms in Veff have the same degree or less compared to the terms in
V . In that case, the divergent part of Veff will again be a polynomial with
degree n or less, but with coefficients which are divergent. This is precisely
the idea of renormalization which we described earlier. When n ≤ 4, the
effect of integrating out high energy modes is to merely change the values
of the coupling constants λj which exist in the original theory. But, as we
explained before, the corrections due to high energy quantum fluctuations
are always present in such a self-interacting field. Thus, what are actually
observed experimentally are the renormalized coupling constants. So, when
n ≤ 4, we have a simple way of re-interpreting the theory and working out
its physical consequences.
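The counting in Eq. (4.66) is easy to automate. A minimal sketch (the helper names are ours, not the text's): the divergent part is a polynomial of degree 2(n−2), and renormalizability in the above sense requires this not to exceed n.

```python
def induced_degree(n):
    """Degree of (V'')^2 when V(phi) is a polynomial of degree n."""
    return 2 * (n - 2)

def closes_on_itself(n):
    """True if integrating out high energy modes only renormalizes the
    couplings already present: 2(n - 2) <= n, i.e. n <= 4."""
    return induced_degree(n) <= n

print([n for n in range(2, 9) if closes_on_itself(n)])  # → [2, 3, 4]
```

Note that induced_degree(6) = 8, which is exactly the φ⁶ → φ⁸ cascade described below for n > 4.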
We shall do this in detail in a moment, but before that, let us consider the situation for n > 4. Then the effect of integrating out high energy modes is to introduce new terms in the low energy sector of the theory. For example, if the original potential had a φ⁶ term, then (V'')² will have a φ⁸ term. If we add a φ⁸ term to the potential to tackle this, then that will generate a φ¹² term, etc. By and large, the effect of high energy modes is to introduce into the low energy sector of the theory all possible terms which are not forbidden by any symmetry considerations.
The old fashioned view in quantum field theory was that such theories
are just bad and should not be considered at all. This is, however, a bit
drastic. Even though integrating out high energy modes leads to such
terms, their effects are usually suppressed by a high energy scale below
which these terms will not be important. In other words, one can still
interpret the theory as a low energy effective theory with a domain of
validity determined by a high energy scale. As an explicit example, consider
Exercise 4.4: Repeat the entire analysis for an arbitrary dimension D and determine the value of n for which the above idea works, as a function of D.
30 As you can see, this is plain dimensional analysis, but very useful and powerful! Whenever the Lagrangian contains a coupling constant with inverse dimensions of mass, the same conclusions apply.
a term proportional to φ⁶; since the Lagrangian should have the dimensions of (mass)⁴, this term will occur in the Lagrangian through a combination V ∝ λ(φ⁶/μ⁴) where μ² is an energy scale and λ is dimensionless. The (V'')² ∝ λ²(φ⁸/μ⁸) will occur in the two terms of V_eff. It is now natural to use the same μ which occurs in the Lagrangian to make the arguments of the logarithms dimensionless. Then we see that these terms are suppressed by μ⁴ factors. So it makes sense to treat this model as an effective theory, i.e., as an approximation to a better theory, well below30 the scale μ.
Let us now see how the details work out when we take n = 4, which is the maximal interaction that will not generate new terms. We will take (with an explicit subscript 0 added to the parameters of the original Lagrangian)

V(\phi) = \frac{1}{2}m_0^2\phi^2 + \frac{\lambda_0}{4!}\phi^4\ ; \qquad V'' = m_0^2 + \frac{\lambda_0}{2}\phi^2 \qquad (4.67)
Then, Eq. (4.64) becomes:

L_{\rm eff} = \frac{M^2}{32\pi^2}\left(m_0^2 + \frac{\lambda_0}{2}\phi^2\right) + \frac{1}{64\pi^2}\left(m_0^2 + \frac{\lambda_0}{2}\phi^2\right)^2\ln\frac{\mu^2}{M^2} + \frac{1}{64\pi^2}\left(m_0^2 + \frac{\lambda_0}{2}\phi^2\right)^2\ln\left[\frac{1}{2\mu^2}\left(m_0^2 + \frac{\lambda_0}{2}\phi^2\right)\right] \qquad (4.68)
We will rearrange this expression, dropping infinite constants and absorbing divergent coefficients into ln M², M², etc. This will allow us to write

8\pi^2 L_{\rm eff} = \frac{M^2\lambda_0}{8}\,\phi^2 + \left[\frac{\lambda_0^2}{32}\,\phi^4 + \frac{\lambda_0 m_0^2}{8}\,\phi^2\right]\ln\frac{\mu^2}{M^2} + \frac{1}{8}\left(m_0^2 + \frac{\lambda_0}{2}\phi^2\right)^2\ln\left[\frac{1}{2\mu^2}\left(m_0^2 + \frac{\lambda_0}{2}\phi^2\right)\right] \qquad (4.69)
                  = \phi^2\left[\frac{M^2\lambda_0}{8} + \frac{\lambda_0 m_0^2}{8}\ln\frac{\mu^2}{M^2}\right] + \phi^4\,\frac{\lambda_0^2}{32}\ln\frac{\mu^2}{M^2} + 8\pi^2\, V_{\rm finite}(\phi;\mu^2)
where the first two terms are divergent and the last one is finite when we send the cut-off to infinity (M → ∞), with

V_{\rm finite} = \frac{1}{64\pi^2}\left(m_0^2 + \frac{\lambda_0}{2}\phi^2\right)^2 \ln\left[\frac{1}{2\mu^2}\left(m_0^2 + \frac{\lambda_0}{2}\phi^2\right)\right]. \qquad (4.70)

31 The magic of renormalizability, again. The divergent coefficients multiply the φ² and φ⁴ terms which were originally present in the Lagrangian, allowing us to absorb them into the parameters of the theory. This is precisely what we saw in the case of electrodynamics as well.
Adding this correction term to the original V(φ), we get31 the effective potential to be:

V = \frac{1}{2}m_0^2\phi^2 + \frac{\lambda_0}{4!}\phi^4 + V_{\rm eff} \equiv \frac{1}{2}m^2\phi^2 + \frac{\lambda}{4!}\phi^4 + V_{\rm finite}(\phi,\mu^2) \qquad (4.71)

where

m^2 = m_0^2 + \frac{\lambda_0}{32\pi^2}\left[M^2 + m_0^2\ln\frac{\mu^2}{M^2}\right]\ ; \qquad \lambda = \lambda_0 + \frac{3\lambda_0^2}{32\pi^2}\ln\frac{\mu^2}{M^2} \qquad (4.72)
and

V_{\rm finite} = \frac{1}{64\pi^2}\left(m^2 + \frac{\lambda}{2}\phi^2\right)^2 \ln\left[\frac{1}{2\mu^2}\left(m^2 + \frac{\lambda}{2}\phi^2\right)\right]. \qquad (4.73)
in which we have replaced λ0 , m0 by λ, m to the same order of accuracy as
we are working with.32
Let us take stock of the situation. The net effect of integrating out high
energy modes is to change the original potential to the form in Eq. (4.71)
which differs in two crucial aspects from the original potential. First, there
is a correction term in the form of Vfinite which is non-polynomial (since it
contains a log) that has been generated. This term is finite but depends
on a parameter μ which is completely arbitrary and hence needs to be
interpreted properly. We will come back to this issue. Second, all the
divergent terms have been absorbed into the two parameters of the original
Lagrangian, thereby changing m0 and λ0 to m and λ. In the expression for
Vfinite , we should actually be using m0 and λ0 but we will assume that, to
the lowest order, we can replace them by m and λ.
The constants m2 and λ are thought of as renormalized parameters
and, as we said before, in this particular case (n = 4) we could reabsorb all
the divergences by renormalizing the original parameters. The idea is that
the observed mass and coupling constant determined by some scattering
experiment, say, should be used for m and λ and the original parameters
m0 and λ0 are unobservable.33
Let us now get back to the apparent μ dependence of Vfinite . We know
that this is an arbitrary scale which we introduced into the theory and
Vfinite really should not depend on μ; that is, if we change μ, then Vfinite
should not change in spite of appearances. The clue to this paradox lies
in the fact that m2 and λ in Eq. (4.72) also depend on μ. So, when we
change μ, these parameters will also change, keeping Vfinite invariant. In
fact, this is apparent from the fact that μ was introduced into Veff in two
logarithmic terms such that it does not change the value of Veff . All we
have done is to rewrite this expression separating out the divergent and
finite pieces, thereby making m² and λ functions of μ for a fixed M.
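This compensation can be seen numerically. In the massless limit (m = 0, so that V_finite = (λ²φ⁴/256π²) ln(λφ²/4μ²)), the sketch below (an illustrative check with made-up parameter values; the function names are ours) shifts μ, adjusts λ by the one-loop amount (3λ²/32π²) ln(μ₂²/μ₁²), and verifies that the total potential is unchanged to the order we are working at, whereas keeping λ fixed is not.

```python
import math

PI2 = math.pi**2

def V_total(phi, lam, mu):
    """Tree term plus V_finite for m = 0:
    V = (lam/4!) phi^4 + (lam^2 phi^4 / 256 pi^2) ln(lam phi^2 / 4 mu^2)."""
    tree = lam / 24.0 * phi**4
    vfin = lam**2 * phi**4 / (256.0 * PI2) * math.log(lam * phi**2 / (4.0 * mu**2))
    return tree + vfin

def run_coupling(lam, mu_from, mu_to):
    """One-loop running of the coupling between two scales."""
    return lam + 3.0 * lam**2 / (32.0 * PI2) * math.log(mu_to**2 / mu_from**2)

lam1, mu1, mu2, phi = 0.1, 1.0, 2.0, 1.0
V1 = V_total(phi, lam1, mu1)
V2 = V_total(phi, run_coupling(lam1, mu1, mu2), mu2)   # mu shifted, lam run
V_bad = V_total(phi, lam1, mu2)                        # mu shifted, lam fixed
print(V1 - V2, V1 - V_bad)
```

The residual V1 − V2 is of higher order in λ, while V1 − V_bad exhibits the raw μ-dependence that the running of λ is designed to cancel.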
To see this more clearly, let us consider the expressions for λ for two different values of μ² given by μ² = μ₁² and μ² = μ₂². From Eq. (4.72), we find that

\lambda(\mu_1) - \lambda(\mu_2) = \frac{3\lambda^2(\mu_1)}{32\pi^2}\,\ln\frac{\mu_1^2}{\mu_2^2} \qquad (4.74)
This expression is completely independent of the cut-off M and tells you
how the strength of the coupling constant changes with the energy scale.
If we use a particular value of μ in Vfinite and the corresponding value of
λ(μ) as the coupling constant, then the numerical value of Vfinite will be
independent of μ. Of course, we need to fix the value of λ at some energy
scale through the experiment; once this is done, we know the value of the
coupling constant at all other energy scales. The above result can also be
stated in the form
\lambda(\mu) - \frac{3\lambda^2(\mu)}{32\pi^2}\,\ln\mu^2 = {\rm constant} \qquad (4.75)
or, more simply, as a differential equation

\mu\,\frac{\partial\lambda}{\partial\mu} = \frac{3\lambda^2}{16\pi^2} \equiv \beta(\lambda) \qquad (4.76)
32 Notice that the (divergent) correction to the bare mass m₀² is proportional to λ₀M² where M is the high energy cut-off mass. The fact that one picks up a correction which is quadratic in M when one uses cut-off regularization is related to what is known as the hierarchy problem. If one considers a theory having a scalar field of mass m and a large energy scale M (which can act as a physical cut-off), then Eq. (4.72) tells us that the physical mass of the theory will be driven to M, and we need to fine-tune the parameters of the theory to prevent this from happening. We will see later that such quadratic divergences do not occur if we use dimensional regularization.
33 The renormalization scheme we described works even if you change the
sign of the m2 φ2 term. Then, φ = 0 is
not the minimum and we need to worry
about spontaneous symmetry breaking
— a key ingredient in the standard
model of particle physics. What is nice
is that the renormalizability continues
to hold even in this context.
The logarithmic derivative of the coupling constant on the left hand side is the β function, and the above result shows that β > 0, indicating that the coupling becomes stronger at higher energies.
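Eq. (4.76) integrates in closed form: 1/λ(μ) = 1/λ(μ₀) − (3/16π²) ln(μ/μ₀). A small sketch (the function names are ours) integrates the flow numerically and compares with this solution:

```python
import math

def beta(lam):
    """One-loop beta function of Eq. (4.76)."""
    return 3.0 * lam**2 / (16.0 * math.pi**2)

def run_lambda(lam, mu0, mu, steps=20000):
    """RK4 integration of d(lam)/dt = beta(lam) with t = ln(mu)."""
    dt = math.log(mu / mu0) / steps
    for _ in range(steps):
        k1 = beta(lam)
        k2 = beta(lam + 0.5 * dt * k1)
        k3 = beta(lam + 0.5 * dt * k2)
        k4 = beta(lam + dt * k3)
        lam += dt * (k1 + 2 * k2 + 2 * k3 + k4) / 6.0
    return lam

def lam_closed(lam0, mu0, mu):
    # 1/lam(mu) = 1/lam0 - (3/16 pi^2) ln(mu/mu0)
    return lam0 / (1.0 - 3.0 * lam0 / (16.0 * math.pi**2) * math.log(mu / mu0))

print(run_lambda(0.1, 1.0, 1.0e3), lam_closed(0.1, 1.0, 1.0e3))
```

Both values agree, and both exceed the starting value λ(μ₀) = 0.1, illustrating the growth of the coupling with the scale (the closed form also shows the famous Landau-pole blow-up when the denominator vanishes).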
Incidentally, we can play the same game with the mass parameter as
well. From the first relation in Eq. (4.72), we find that
\mu\,\frac{\partial m^2}{\partial\mu} = \frac{\lambda m^2}{16\pi^2} \equiv \gamma(m) \qquad (4.77)
which is sometimes called the gamma function of the flow. This tells you how m² runs with the scale μ so as to keep observable results independent of it. We will say more about these issues after developing the perturbation approach.
4.4 Perturbation Theory
In the last few sections, we discussed the behaviour of interacting fields non-perturbatively by using the concept of an effective action. This required us to make some approximations in order to make the problem tractable, but
we did not have to assume that the coupling constant in the theory (λ in
the λφ4 interaction or q in electromagnetic coupling) is a small parameter.
In fact, one of the results we obtained in the Schwinger effect is explicitly
non-perturbative in the electromagnetic coupling constant.
A completely different way of tackling interacting fields is to deal with
them perturbatively and calculate all physical observables in a series expansion in powers of the coupling constant. In this approach we imagine that
there exists some kind of a free-field theory (containing, say, non-interacting
photons described by a vector field Al (x) and charged, spin-zero, massive
particles described by a complex scalar field φ(x)) when the relevant coupling constant (in this case, the charge q of the scalar field) tends to zero.
We then think of “switching on” the interaction by introducing a coupling
between Al and φ, the strength of which is controlled by q. Further assuming that the interaction can be treated as a perturbation of the free-field
theory, one can compute the effect of the interaction order-by-order as a
power series in the coupling constant. This is the essence of the perturbative approach to studying interacting field theory.
The key advantage of this approach is that the entire procedure can
be made completely systematic in the form of a set of rules involving the
Feynman diagrams of the theory. Once the rules have been written down
from the structure of the theory, you can — in principle — compute any observable quantity in the perturbation series. When such computations are
done, certain amplitudes become divergent and one again has to renormalize the parameters of the theory to obtain finite results. This, by itself, is
not a serious issue because — as we have already emphasized — the parameters in the Lagrangian need to be interpreted in a scale dependent manner
in any realistic theory. We have also seen that divergences arise even in
the context of non-perturbative computations and hence perturbation theory cannot be blamed for these divergences. Moreover, the perturbative
approach has proved to be enormously successful both in the case of quantum electrodynamics and in the study of electroweak interactions — which
alone makes it a worthwhile object to study from the practical point of
view.
One downside of the perturbative approach is that it is very difficult
to rigorously demonstrate some of the constructs which form its basis. For
example, it is not easy to define rigorously (though it can be done with
a fair amount of mathematical formalism) the notion of a free asymptotic
field and its properties or the notion of “switching on” the interaction. It
is also not clear whether the perturbative series is convergent, asymptotic
or divergent for a general interacting field theory. In addition, we obviously cannot handle any non-perturbative features of the theory (like the
Schwinger effect or quark confinement) by such an approach.
On the whole, high energy physicists will claim — quite correctly —
that the advantages of the perturbative approach outweigh the disadvantages. This is the point of view we will adopt and will now introduce the
perturbative approach in the very simple context of a self-interacting λφ4
theory.
4.4.1 Setting up the Perturbation Series
To understand how one goes about studying the λφ4 theory in the perturbative approach and obtain the relevant Feynman diagrams etc., let us
begin by briefly reviewing some of the results we obtained in Sect. 2.2.1
for the free-field theory and generalizing them suitably. Given an action
functional A(φ) for the scalar field, we can obtain the functional Fourier
transform Z(J) of e^{iA} by the relation (see Eq. (2.54))

Z[J] = \int \mathcal{D}\phi\, \exp\left[iA[\phi] + i\int J(x)\phi(x)\,d^4x\right] \qquad (4.78)
If we compute the functional derivative of Z(J) with respect to J(x), we
bring down one factor of φ(x) for each differentiation. For example, if we
compute the functional derivative twice, we get:
\frac{1}{Z(0)}\,\frac{\delta}{i\delta J(x_2)}\,\frac{\delta}{i\delta J(x_1)}\,Z[J]\Big|_{J=0} = \frac{1}{Z(0)}\int \mathcal{D}\phi\; \phi(x_2)\phi(x_1)\, e^{iA[\phi]} = \langle 0|T[\phi(x_2)\phi(x_1)]|0\rangle \equiv G(x_2, x_1) \equiv \langle x_2|x_1\rangle \qquad (4.79)
where we have introduced the notation G(x₁, x₂) ≡ ⟨x₁|x₂⟩ for typographical simplicity.34 The above equation was obtained in Sect. 2.2.1 for the
case of free-fields, but it is obvious that the first equality is trivially valid
for arbitrary A[φ]. Further, since the time integral in the action goes from
t = −∞ to t = +∞, we can easily demonstrate (by, for example, analytically continuing to the Euclidean sector) that the path integral expression reduces to the vacuum expectation value (VEV) of the time-ordered
products even for an arbitrary action A[φ], so that the second equality in
Eq. (4.79) also holds for any A[φ].
In a similar manner, one can obtain the functional average or the VEV of
a product of n scalar fields φ(x1 )φ(x2 ).....φ(xn ) ≡ φ1 φ2 ...φn [where we have
simplified the notation by writing φ(xj ) = φj etc.] by taking n functional
34 For reasons described at the end of Sect. 1.4, do not attribute meanings to |x₁⟩ and |x₂⟩. This is just a notation.
derivatives with respect to J(x1 ), J(x2 ), ...J(xn ). That is,
\frac{1}{Z(0)}\,\frac{\delta}{i\delta J(x_1)}\,\frac{\delta}{i\delta J(x_2)}\cdots\frac{\delta}{i\delta J(x_n)}\,Z[J]\Big|_{J=0} = \frac{1}{Z(0)}\int \mathcal{D}\phi\; \phi_1\phi_2...\phi_n\, \exp\left(iA[\phi]\right) = \langle 0|T[\phi_1\phi_2...\phi_n]|0\rangle \equiv G(x_1, x_2, ...x_n) \qquad (4.80)
35 In the process we will also provide
an intuitive picture of how the n-point
functions are related to simple physical processes like the scattering of the
quanta of the field, because of the interaction. Later on, in Sect. 4.6, we
will provide a more rigorous proof of
how the amplitude for scattering, etc.
can be related to the n-point functions.
Recalling our interpretation in Sect. 2.2.1 of exp(iA) as being analogous to
a probability distribution and Z(J) as being analogous to the generating
function for the probability distribution, we see that the above result gives
the n-th moment of the random variable φ(x) in terms of the probability
distribution as well as in terms of the derivatives of the generating function.
It is also clear that if all the moments are given, one can reconstruct Z(J)
as a functional Taylor series by:
Z(J) = Z(0)\sum_{n=0}^{\infty}\frac{1}{n!}\int dx_1\, dx_2\, ...dx_n\; G(x_1, x_2, ...x_n)\, J(x_1)J(x_2)...J(x_n) \qquad (4.81)
Obviously, the set of all n-point functions G(x1 , x2 , ...xn ) contains the complete information about the theory and if we can calculate them, we can
compute all other physical processes. Our first task is to set up a formalism by which these n-point functions can be computed in a systematic
manner.35
Since we are interested in setting up a systematic procedure for computing various quantities, it will be useful to start from the free-field theory
as a warm up. In this case, we know that Z(J) can be explicitly computed
and is given by:
Z(J) = Z(0)\,\exp\left[-\frac{1}{2}\int d^4x_2\, d^4x_1\; J(x_2)\langle x_2|x_1\rangle J(x_1)\right] \qquad (4.82)
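A finite-dimensional caricature makes Eq. (4.79) and Eq. (4.82) concrete. In the toy sketch below (the three-site "propagator" matrix is made up, and we work in Euclidean conventions so the factors of i reduce to an overall sign), Z(J) = exp(−½ JᵀGJ) plays the role of the generating function, and a numerical second derivative at J = 0 recovers the two-point function G_ij:

```python
import math

# toy "free field" on three lattice sites with a made-up symmetric propagator G
G = [[1.0, 0.4, 0.1],
     [0.4, 1.2, 0.3],
     [0.1, 0.3, 0.9]]

def Z(J):
    """Z(J) = exp(-(1/2) J^T G J): a finite-dimensional analogue of Eq. (4.82)."""
    q = sum(J[i] * G[i][j] * J[j] for i in range(3) for j in range(3))
    return math.exp(-0.5 * q)

def two_point(i, j, h=1e-4):
    """(delta / i delta J_i)(delta / i delta J_j) Z at J = 0 by central
    differences; the two factors of 1/i supply an overall minus sign."""
    def Zs(di, dj):
        J = [0.0, 0.0, 0.0]
        J[i] += di
        J[j] += dj
        return Z(J)
    d2 = (Zs(h, h) - Zs(h, -h) - Zs(-h, h) + Zs(-h, -h)) / (4.0 * h * h)
    return -d2   # recovers G[i][j], as in Eq. (4.79)

print(two_point(0, 1), two_point(2, 2))
```

Two derivatives bring down one factor of the "field" each, and at J = 0 only the quadratic piece of the exponent survives, so the result is exactly the matrix entry G_ij.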
Figure 4.3: Diagrammatic representation of G(x₁, x₂)
The n-point functions can be determined by expanding the exponential
in a power series and identifying the coefficients of J(x1 )J(x2 )...J(xn ),
remembering the fact that this product is completely symmetric. For example, the 4-point function, computed using Eq. (4.80), is given by:
G(x_1, x_2, x_3, x_4) = \langle x_1|x_2\rangle\langle x_3|x_4\rangle + \langle x_1|x_3\rangle\langle x_2|x_4\rangle + \langle x_1|x_4\rangle\langle x_2|x_3\rangle \qquad (4.83)

Figure 4.4: Diagrams contributing to G(x₁, x₂, x₃, x₄)
and similarly for higher orders. We are essentially computing the moments
of a Gaussian distribution for which all the information is contained in the
two-point function. The odd moments will vanish and the even moments
can be expressed as products of the two-point function, which is what
Eq. (4.83) tells you.
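The statement that all Gaussian moments reduce to sums of products of two-point functions is easy to verify combinatorially. The sketch below (our own helper, with an arbitrary symmetric matrix standing in for ⟨x_i|x_j⟩) enumerates all pairings and reproduces the three-term structure of Eq. (4.83); for a 2n-point function there are (2n−1)!! such pairings.

```python
import math

def pairings(idx):
    """Yield all perfect pairings (Wick contractions) of a list of indices."""
    if not idx:
        yield []
        return
    first, rest = idx[0], idx[1:]
    for k, partner in enumerate(rest):
        for tail in pairings(rest[:k] + rest[k + 1:]):
            yield [(first, partner)] + tail

def gaussian_moment(indices, g):
    """<phi_{i1}...phi_{in}> for a zero-mean Gaussian with 2-point function g."""
    if len(indices) % 2:
        return 0.0   # odd moments vanish
    return sum(math.prod(g[a][b] for a, b in p) for p in pairings(list(indices)))

g = [[1.0, 0.2, 0.3, 0.4],
     [0.2, 1.0, 0.5, 0.6],
     [0.3, 0.5, 1.0, 0.7],
     [0.4, 0.6, 0.7, 1.0]]

# Eq. (4.83) structure: G(1,2,3,4) = g12 g34 + g13 g24 + g14 g23
lhs = gaussian_moment([0, 1, 2, 3], g)
rhs = g[0][1] * g[2][3] + g[0][2] * g[1][3] + g[0][3] * g[1][2]
print(lhs, rhs, sum(1 for _ in pairings(list(range(8)))))
```

The final count, 105 = 7!!, is the number of independent contractions of an 8-point function, which will reappear below when we classify the diagrams of the λφ⁴ vertex.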
With future applications in mind, we will associate a diagrammatic representation with these algebraic expressions. We will represent G(x1 , x2 )
by a line connecting the two events x1 and x2 as in Fig. 4.3. This term
occurs in the series expansion for Z(J) in the form G(x1 , x2 )J(x1 )J(x2 )
which can be interpreted as the source J creating a particle at one event
and destroying it at another, with the amplitude G(x1 , x2 ) describing the
propagation. This is, of course, completely consistent with our earlier discussion and interpretation of x1 |x2 . Let us next consider G(x1 , x2 , x3 , x4 )
which can be represented by the figure in Fig. 4.4. This term occurs in the
form G(x1 , x2 , x3 , x4 ) J(x1 )J(x2 )J(x3 )J(x4 ) and one would like to interpret this as a pair of particles produced and destroyed by two sources and
two sinks. The form of the expression in Eq. (4.83) clearly shows that the
particles do not affect each other — as is to be expected in the absence of any interactions — and merely propagate from event to event in all possible combinations.
The Fig. 4.4 has three pieces and each of them has two disconnected
components. It is obvious that when we look at 2n-point functions we
will get the same structure involving n particles, each propagating independently amongst the 2n events without any interactions. Any particular
configuration will involve n lines of the kind in Fig. 4.3, each of which is disconnected from the others.
It is clear that if we try to draw all these diagrams with disconnected
pieces and compute each term, we are duplicating the effort unnecessarily.
This problem is easily solved by writing Z(J) as exp[iW (J)] and working
with W (J). In the case of free field theory, W (J) is a quadratic functional
of J which leads to no higher than the two-point function. When we exponentiate W (J) to get Z(J), we automatically generate the correct products
of disconnected diagrams, while W (J) has only connected diagrams. That
is, while
\langle 0|T[\phi(x_1)\cdots\phi(x_n)]|0\rangle = (-i)^n\,\frac{1}{Z[J]}\,\frac{\partial^n Z}{\partial J(x_1)\cdots\partial J(x_n)}\Big|_{J=0} \qquad (4.84)
the W (J), on the other hand, produces a similar quantity involving only
those Feynman diagrams which are connected; i.e.,
(-i)^n\,\frac{\partial^n W[J]}{\partial J(x_1)\cdots\partial J(x_n)}\Big|_{J=0} = -i\,\langle 0|T[\phi(x_1)\cdots\phi(x_n)]|0\rangle_{\rm connected} \qquad (4.85)
For example, when n = 2, we have the result
(-i)^2\,\frac{\partial^2 W}{\partial J_1 \partial J_2} = (-i)^3\,\frac{\partial}{\partial J_1}\left(\frac{1}{Z}\frac{\partial Z}{\partial J_2}\right) \qquad (4.86)
 = (-i)^3\,\frac{1}{Z}\frac{\partial^2 Z}{\partial J_1 \partial J_2} - (-i)^3\,\frac{1}{Z}\frac{\partial Z}{\partial J_1}\,\frac{1}{Z}\frac{\partial Z}{\partial J_2}
 = -i\left[\langle J|\phi(x_1)\phi(x_2)|J\rangle - \langle J|\phi(x_1)|J\rangle\langle J|\phi(x_2)|J\rangle\right]
where we have kept J = 0 to illustrate the point. The second term, which is
subtracted out, involves all the disconnected pieces which we are not usually
concerned with. In presence of the source, even the one-point function
(which, of course, is connected) can be non-zero and is given by
\frac{\partial W[J]}{\partial J(x)} = -i\,\frac{1}{Z}\frac{\partial Z[J]}{\partial J(x)} = \langle J|\phi(x)|J\rangle \equiv \phi_J(x) \qquad (4.87)
where the last relation defines φJ (x) as the expectation value of the field
φ in a state describing a source J(x).
This feature is quite generic in field theory36 and hence one prefers to
work with W (J) rather than Z(J). Alternatively, we can simply restrict
ourselves to connected diagrams in our computation, which is what we will
do eventually.
36 It is not too difficult to prove; but
we won’t bother to do it since we will
directly work with connected Feynman
diagrams.
4.4.2 Feynman Rules for the λφ⁴ Theory
Let us now consider the application of the above ideas to a self-interacting
λφ4 theory with the action
A[\phi] = \int d^4x\left[\frac{1}{2}\left(\partial_m\phi\,\partial^m\phi - m^2\phi^2\right) - \frac{\lambda}{4!}\phi^4\right] \qquad (4.88)
and the corresponding generating function, in a condensed notation,
Z_\lambda(J) = \int \mathcal{D}\phi\, \exp\left[i\int d^4x\left\{\frac{1}{2}\left((\partial\phi)^2 - m^2\phi^2\right) - \frac{\lambda}{4!}\phi^4 + J\phi\right\}\right] \qquad (4.89)
Since we cannot evaluate this exactly when λ ≠ 0, we will resort to a perturbation series in λ. If we expand the relevant part of the exponential as

\exp\left[-i\frac{\lambda}{4!}\int d^4x\,\phi^4\right] = \sum_{n=0}^{\infty}\frac{1}{n!}\left(-i\frac{\lambda}{4!}\right)^n \int d^4x_1\, d^4x_2\, ...d^4x_n\; \phi^4(x_1)\phi^4(x_2)...\phi^4(x_n) \qquad (4.90)
it is obvious that the n-th order term will insert n factors of φ⁴ (with integrals over each of them) inside the path integral. But this is exactly what we would have obtained if we had acted on the remaining path integral (the one with λ = 0) n times with δ⁴/δJ⁴. Therefore we get:

Z_\lambda[J] = \sum_{n=0}^{\infty}\frac{1}{n!}\left(-i\frac{\lambda}{4!}\right)^n \int d^4x_1\, d^4x_2\, ...d^4x_n\; \frac{\delta^4}{\delta J(x_1)^4}\,\frac{\delta^4}{\delta J(x_2)^4}\,....\frac{\delta^4}{\delta J(x_n)^4}\; Z_0[J] \qquad (4.91)
The generating function Z0 [J] for λ = 0 is of course known and is given by
Eq. (4.82). Therefore, we can write, in a compact notation:
Z_\lambda[J] = Z_0[0]\,\exp\left[-i\frac{\lambda}{4!}\int d^4x\, \frac{\delta^4}{\delta J(x)^4}\right]\, \exp\left[-\frac{1}{2}\int d^4x_2\, d^4x_1\; J(x_2)\langle x_2|x_1\rangle J(x_1)\right] \qquad (4.92)
The exponential operator is defined through the power series in λ given
explicitly in Eq. (4.91). The expression in Eq. (4.92) formally solves our
problem. Once we expand the exponential operator in a power series in
λ, we can compute the nth order contribution (involving λn ) by carrying
out the functional differentiation in Z0 (J). This is a purely algorithmic
procedure which will provide the perturbative expansion for Z(J). This,
in turn, leads to a functional series expansion for Z(J) in J as shown in
Eq. (4.81), the coefficients of which will give the n-point functions.
Before we discuss the nature of this expansion, we will also describe an
alternative way of obtaining the same result. If one expands Eq. (4.89) as
a power series in J(x) directly, we will get
Z_\lambda[J] = Z_0[0]\sum_{n=0}^{\infty}\frac{1}{n!}\int dx_1\, dx_2\cdots dx_n\; J(x_1)\cdots J(x_n)\, G^{(n)}(x_1, \cdots, x_n)
            = \sum_{n=0}^{\infty}\frac{1}{n!}\int dx_1\, dx_2\cdots dx_n\; J(x_1)\cdots J(x_n) \int \mathcal{D}\phi\; \phi(x_1)\cdots\phi(x_n)\, \exp\left[i\int d^4x\left\{\frac{1}{2}\left((\partial\phi)^2 - m^2\phi^2\right) - \frac{\lambda}{4!}\phi^4\right\}\right] \qquad (4.93)
which allows us to read off the different n-point functions as standard path
integral averages like
G(x_1, x_2) \equiv \frac{1}{Z_0(0)}\int \mathcal{D}\phi\; \phi(x_1)\phi(x_2)\, \exp\left[i\int d^4x\left\{\frac{1}{2}\left((\partial\phi)^2 - m^2\phi^2\right) - \frac{\lambda}{4!}\phi^4\right\}\right] = \langle 0|T[\phi(x_1)\phi(x_2)]|0\rangle \qquad (4.94)
and
G(x_1, x_2, x_3, x_4) \equiv \frac{1}{Z_0(0)}\int \mathcal{D}\phi\; \phi(x_1)\phi(x_2)\phi(x_3)\phi(x_4)\, \exp\left[i\int d^4x\left\{\frac{1}{2}\left((\partial\phi)^2 - m^2\phi^2\right) - \frac{\lambda}{4!}\phi^4\right\}\right] = \langle 0|T[\phi(x_1)\phi(x_2)\phi(x_3)\phi(x_4)]|0\rangle \qquad (4.95)
etc. These have the same structure as the 2-point and 4-point functions in Eq. (4.79) and Eq. (4.80) but are now constructed for the interacting theory with λ ≠ 0. For the same reason, we cannot evaluate these expressions exactly and have to resort to a perturbative expansion in λ in Eq. (4.94) and Eq. (4.95).
The two approaches (repeated functional differentiation with respect to
J 4 (x) of the non-interacting generating function, or evaluation of the path
integral average in Eq. (4.80) in a perturbative series in λ) are mathematically identical; they also involve substantially the same amount of effort in
calculation. The functional series expansion of Z(J) in J, however, gives
a clearer physical picture of particles being created and destroyed by the
source J(x). This allows us to interpret the term G(x1 , x2 )J(x1 )J(x2 ), for
example, as the creation and destruction of a particle at x1 and x2 with
G(x₁, x₂) giving the propagation amplitude in the interacting theory. Similarly, the term G(x₁, x₂, x₃, x₄)J(x₁)J(x₂)J(x₃)J(x₄) can be interpreted
as the creation of a pair of particles at x1 and x2 , say, with their subsequent destruction at x3 and x4 . In the free-field theory, we know that the
pair of particles will propagate in between, ignoring each other. But in
the interacting field theory, this process will also involve scattering of the
particles due to the interaction term.
To gain some insight into the explicit computation of these amplitudes, let us try our hand at computing G(x₁, x₂, x₃, x₄) to the lowest order in λ.
This is given by the path integral average:

G(x_1, x_2, x_3, x_4) = \frac{1}{Z_0(0)}\left(-\frac{i\lambda}{4!}\right)\int d^4x \int \mathcal{D}\phi\; \phi(x_1)\phi(x_2)\phi(x_3)\phi(x_4)\,\phi(x)^4\, \exp\left[i\int d^4x\, \frac{1}{2}\left((\partial\phi)^2 - m^2\phi^2\right)\right] \qquad (4.96)
Figure 4.5: Three types of diagrams involved in the computation of Eq. (4.96)
This is a path integral average of eight factors of φ with four factors
evaluated at the same event x and the other four factors evaluated at
(x1 , x2 , x3 , x4 ). Because we are doing perturbation theory, the path integral is over the free particle action which is just a quadratic expression; in
other words, we just have to evaluate the 8-point function of the free-field
theory. Since the only non-trivial n-point function in the free-field theory is the 2-point function x1 |x2 , we know that each term in the 8-point
function will be the product of four 2-point functions taken in different
combinations. Let us sort out the kind of terms which will emerge in such
an expression.
If we combine x₁ with any of x₂, x₃ or x₄, we will get a factor like ⟨x₁|x₂⟩, say. (This process is usually called 'contracting' φ(x₁) with φ(x₂).) The remaining x₃ and x₄ can either be combined with each other or combined with the x's in φ⁴(x). In either case, we see that there is a factor ⟨x₁|x₂⟩ represented by a single line disconnected from the rest (see Fig. 4.5(a), (b)).
The Fig. 4.5(a) represents the particles propagating without any interaction
from x1 to x2 and x3 to x4 with the four factors of φ4 (x), contracted pairwise amongst themselves, leading to the closed figure-of-eight loop shown
in the figure. Similarly, Fig. 4.5(b) shows one particle propagating from
x1 to x2 while we combine x3 with an x in one of the φ(x) of the φ4 (x)
term, x4 with an x in another φ(x) of the φ4 (x) term and combine the
remaining two φ(x) together to get the closed loop corresponding to ⟨x|x⟩.
The processes in Fig. 4.5(a), (b) clearly belong to the disconnected type
with (at least) one particle propagating freely without interaction. What
we are interested in, of course, is the situation in which the two particles
do interact, which can happen only if we combine each of the x1 , x2 , x3
and x4 with one each of the φ(x) in the φ4 (x) term. This will lead to the
amplitude we are after, given by
G(x_1, x_2, x_3, x_4) \propto (-i\lambda)\int d^4x\; \langle x_1|x\rangle\langle x_2|x\rangle\langle x_3|x\rangle\langle x_4|x\rangle \qquad (4.97)
This is represented by the diagram in Fig. 4.5(c). This gives the lowest
order amplitude for scattering of the scalar particles by one another.
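The three classes of diagrams in Fig. 4.5 can be counted by brute force. The sketch below (our own enumeration code) lists all 105 pairings of the eight fields in Eq. (4.96), with legs 0–3 external and legs 4–7 at the vertex x, and classifies them by the number of external-external contractions.

```python
def pairings(idx):
    """All perfect pairings (Wick contractions) of the given indices."""
    if not idx:
        yield []
        return
    first, rest = idx[0], idx[1:]
    for k, partner in enumerate(rest):
        for tail in pairings(rest[:k] + rest[k + 1:]):
            yield [(first, partner)] + tail

EXTERNAL = {0, 1, 2, 3}        # x1..x4; legs 4..7 all sit at the vertex x
counts = {0: 0, 1: 0, 2: 0}    # keyed by number of external-external pairs
for p in pairings(list(range(8))):
    ext_pairs = sum(1 for a, b in p if a in EXTERNAL and b in EXTERNAL)
    counts[ext_pairs] += 1

print(counts)  # → {0: 24, 1: 72, 2: 9}
```

The 24 fully connected contractions (type (c)) are the 4! ways of attaching the four external legs to the four vertex legs; this 4! cancels the 1/4! in −iλ/4!, which is why Eq. (4.97) carries a bare factor of (−iλ). The 72 contractions with one free propagator correspond to Fig. 4.5(b), and the 9 with both external pairs disconnected to Fig. 4.5(a).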
Just to understand the structure of these terms and their diagrammatic
representation better, let us see how the same result will be obtained if we
calculate the process using the technique of functional differentiation with
respect to J 4 (x). The linear term in λ arises from the expansion of the
operator
\exp\left[-i\frac{\lambda}{4!}\int d^4x\, \frac{\delta^4}{\delta J^4(x)}\right] = 1 - i\frac{\lambda}{4!}\int d^4x\, \frac{\delta^4}{\delta J^4(x)} \qquad (4.98)
which should act on the free-field generating function Z0 (J) again expanded
in a functional Taylor series in J. We want to isolate the term which will
be linear in λ and will have four factors of J surviving after the functional
differentiation. Since the linear term in λ in Eq. (4.98) involves four differentiations with respect to J, the only term which can contribute is the
term involving eight factors of J in the expansion of Z0 (J). Given the fact
that Z0 (J) is an exponential of a quadratic, we are looking for the fourth
term in the expansion of the exponential which will contain eight factors
of J. This will lead to the computation of the term
A = (−i λ/4!) ∫ d⁴x (δ⁴/δJ(x)⁴) · (1/4!) [ −(1/2) ∫ dx1 dx2 ⟨x1|x2⟩ J(x1) J(x2) ]⁴   (4.99)
4.4. Perturbation Theory
159
Expanding it out, we see that this term has the structure
A ∝ −iλ ∫ d⁴x (δ⁴/δJ(x)⁴) ∫ dx1 · · · dx8 J1 J2 J3 J4 J5 J6 J7 J8 ⟨x1|x5⟩ ⟨x2|x6⟩ ⟨x3|x7⟩ ⟨x4|x8⟩   (4.100)
in obvious notation. The structure of the different terms arising from the functional differentiation depends on how the different Jk's are paired with the
differentiating Jx. When the four Jx factors hit J5, J6, J7 and J8, we get
the connected diagram Fig. 4.5(c) which we want. This has the algebraic
structure:
Ac ∝ −iλ ∫ d⁴x ∫ dx1 · · · dx4 J1 J2 J3 J4 ⟨x1|x⟩ ⟨x2|x⟩ ⟨x3|x⟩ ⟨x4|x⟩   (4.101)
This result is easy to understand. We think of two particles being created
at x1 and x2 and propagating to the same event x with the amplitude
⟨x1|x⟩ ⟨x2|x⟩ (recall that ⟨a|b⟩ is symmetric in a and b), scattering with
an amplitude (−iλ) and then propagating from x to x3 , x4 where they get
destroyed by the external source again. For the purpose of calculating the
4-point function G(x1 , x2 , x3 , x4 ), we can imagine replacing the source Jk
by a Dirac-delta function at the event.
On the other hand, the functional differentiation will also contain a
term, say, in which the four factors of Jx in (δ⁴/δJx⁴) hit J3, J4, J7, J8.
This will lead to a term of the form
Aa ∝ −iλ ∫ dx1 dx2 dx5 dx6 J1 J2 J5 J6 ⟨x1|x5⟩ ⟨x2|x6⟩ ∫ dx ⟨x|x⟩ ⟨x|x⟩   (4.102)
which corresponds to Fig. 4.5(a). We interpret this as two particles created
by the sources at x1 and x2 , propagating freely to x5 and x6 respectively.
Simultaneously, at some other event x, there arises a vacuum fluctuation
in terms of two closed loops with amplitude −iλ⟨x|x⟩ ⟨x|x⟩. Clearly, this is
a disconnected process. One can similarly see that if two of the Jx factors hit the Js at x1
and x5 respectively, with the others paired off distinctly, we will pick up
just one factor of ⟨x|x⟩ and one freely propagating particle, corresponding
to Fig. 4.5(b) which is yet another disconnected process.
You should now be able to understand how the correspondence between
the Feynman diagrams and the functional differentiation works in order
to produce the same algebraic expressions. The expansion of Z0(J) in J
involves a bunch of Js at different events in spacetime connected by ⟨xi|xj⟩
factors which we denote by a line connecting xi and xj. For example, the 6th order term in the expansion will involve 6 factors of ⟨xi|xj⟩ connecting
12 events, of which Fig. 4.6(a) shows a representative set. We have to
operate on these terms with the operator on the left hand side of Eq. (4.98)
which, when expanded out, will have a whole series of terms each involving
products of (δ⁴/δJx⁴). This operation essentially combines four ends of the
lines to a single event on which the differentiation acts. For example, the
term of order λ² in our operator, involving
O = λ² ∫ d⁴x ∫ d⁴x′ (δ⁴/δJ(x)⁴) (δ⁴/δJ(x′)⁴)   (4.103)
can (i) pick up four ends x2 , x5 , x4 , x7 of Fig. 4.6(a) and will connect them
up at an event x and (ii) pick up x6 , x9 , x11 and x8 of Fig. 4.6(a) and
Figure 4.6: Diagrams relevant to the computation of Eq. (4.102). [(a) shows a representative set of lines connecting the twelve events x1, . . . , x12; (b) shows the result of combining four of the line ends at x and four at x′.]
160
Chapter 4. Real Life I: Interactions
combine them at an event x′. This leads to the diagram in Fig. 4.6(b).
The locations x and x′ are then integrated out to give the final amplitude.
In all the above cases we have been essentially concentrating on representative terms and diagrams in each category. But given the fact that
many of the products involved are symmetric in their arguments, one needs
to keep count of a symmetry factor for each term or diagram. As an illustration, consider the correction to the 2-point function to order λ, which
requires us to compute
B = −(iλ/4!) ∫ d⁴z ⟨φx φy φz⁴⟩   (4.104)
where we have further simplified the notation by using ⟨· · ·⟩ to denote the
path integral average. We can contract the six factors of φ pair-wise in
⁶C₂ = 15 different ways and each one of them will give a contribution in
terms of products of three amplitudes ⟨xi|xj⟩. However, a little thought
shows that only two of these 15 are really different. If we contract φx with
φy , there are three ways of contracting the remaining four φs with each
other and all of these will give identical contributions. This way of combining
the factors leads to the disconnected diagram shown in Fig. 4.7(a). In
writing the corresponding algebraic expression, we have to multiply it by a
factor 3. The other possibility is to combine φx with one of the φz⁴ (which
can be done in 4 ways), φy with another one of φz⁴ (which can be done in 3
ways), with the remaining two φz being combined with each other (which is
unique). This gives rise to the diagram in Fig. 4.7(b). The corresponding
algebraic expression will, therefore, be
B = 3 × (−iλ/4!) ⟨x|y⟩ ∫ d⁴z ⟨z|z⟩ ⟨z|z⟩ + 12 × (−iλ/4!) ∫ d⁴z ⟨x|z⟩ ⟨y|z⟩ ⟨z|z⟩   (4.105)
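The counting 15 = 3 + 12 used above can be verified by brute force; the sketch below (the labels are illustrative stand-ins for the fields in Eq. (4.104)) enumerates all pair-wise contractions of the six φs:

```python
# Brute-force check of the counting above: the six fields in Eq. (4.104)
# can be contracted pair-wise in 15 ways, splitting as 3 + 12.

def pairings(elems):
    """Yield every perfect matching (pair-wise contraction) of a list."""
    if not elems:
        yield []
        return
    first, rest = elems[0], elems[1:]
    for i, partner in enumerate(rest):
        for tail in pairings(rest[:i] + rest[i + 1:]):
            yield [(first, partner)] + tail

fields = ["x", "y", "z1", "z2", "z3", "z4"]   # phi_x, phi_y and phi_z^4
all_contractions = list(pairings(fields))
disconnected = [p for p in all_contractions if ("x", "y") in p]

print(len(all_contractions))                       # 15
print(len(disconnected))                           # 3, Fig. 4.7(a)
print(len(all_contractions) - len(disconnected))   # 12, Fig. 4.7(b)
```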
From these examples, you should be able to figure out the rules governing the Feynman diagrams by which every term in the perturbation series
can be computed. After drawing the relevant diagrams for a given process, we have to do the following: (a) Associate the amplitude
⟨x|y⟩ with a line joining the events x and y. (b) Associate a factor −iλ with
every vertex at which four lines meet; further, integrate over the spacetime event associated with the vertex. (c) Introduce the correct symmetry
factor for the diagram.
Figure 4.7: Diagrams relevant to the computation in Eq. (4.104).

4.4.3 Feynman Rules in the Momentum Space
While the above rules certainly work, it is often easier to work in the momentum space by introducing the Fourier transforms of all the expressions,
especially the n-point functions. We will then label each line by a four-momentum vector k (instead of the events at the end points) and associate
the propagation amplitude in momentum space, i(k² − m² + iε)⁻¹, with that
line. It is obvious that, because of translational invariance, the Fourier
transform G(k1, k2, k3, k4), say, of G(x1, x2, x3, x4) will contain a Dirac-delta function on the sum of the momenta ki. It is convenient to omit this
overall factor and also assume momentum conservation at each
vertex. With these modifications, the Feynman rules in momentum space
will be the following:
(a) Draw the diagram in the momentum space associating a momentum
label with each of the lines. That is, we label each line with a momentum
k and associate with it the factor i(k² − m² + iε)⁻¹.
(b) Assume that momentum is conserved at each vertex. This is most
conveniently done by having in-going and out-going momentum vectors at
each vertex (denoted by suitable arrows) and ensuring that the net sum of
in-going momenta is equal to the sum of the out-going momenta.
(c) Momenta associated with internal lines are integrated over with the
measure d⁴k/(2π)⁴.
(d) Associate the correct symmetry factor with the diagram.
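As a toy transcription of these rules into code (all numerical values below are arbitrary samples, and the helper names are ours), one can assemble the amplitude of Eq. (4.106) for a single vertex, modulo the overall delta function, and then amputate the external legs:

```python
# Toy transcription of rules (a)-(d) for a single four-point vertex.
# All numerical values are arbitrary samples.

def propagator(k_sq, m_sq=1.0, eps=1e-6):
    """Rule (a): the factor i/(k^2 - m^2 + i*eps) for a line of momentum k."""
    return 1j / (k_sq - m_sq + 1j * eps)

lam = 0.3                 # sample coupling constant
vertex = -1j * lam        # rule (b): a factor -i*lambda per vertex

k_sq_external = [2.0, 3.0, 1.5, 2.5]   # sample values of k_a^2 for the four legs

A = vertex                # amplitude of Eq. (4.106), delta function omitted
for k_sq in k_sq_external:
    A *= propagator(k_sq)

A_amputated = vertex      # external propagators dropped, as in Eq. (4.107)
print(A_amputated)
```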
With these rules, the momentum space version of Fig. 4.5(c) will become
Fig. 4.8. The corresponding algebraic expression will correspond to the
amplitude

A = −iλ (2π)⁴ δ⁽⁴⁾(k1 + k2 − k3 − k4) ∏_{a=1}^{4} i/(ka² − m² + iε)   (4.106)

It is obvious that nothing is lost in omitting the propagator factors corresponding to the external lines and dropping the overall momentum conservation factor. This is usually done while writing down the algebraic expressions corresponding to the Feynman diagram. In the case of Fig. 4.8,
we will now get the amplitude to be remarkably simple:

A = −iλ   (4.107)
To understand how this transition from the real space to momentum
space Feynman diagram works, let us do one more example — which we
will anyway need later, to study the running coupling constant — in some
detail. Consider the evaluation of the 4-point function G(x1, x2, x3, x4)
to order λ² in the coupling constant. A typical diagram, among the set
of diagrams which will contribute, is shown in Fig. 4.9. This arises
from all possible pairings of the product of twelve φs in the expression
φ(x1)φ(x2)φ(x3)φ(x4)φ(x)⁴φ(y)⁴. Let us first determine the overall symmetry factor for this diagram. One can combine φ(x1) with one of the φ(x)
and φ(x2) with another φ(x) in 4 × 3 possible ways; φ(x3) and φ(x4) can
similarly be combined with two of the φ(y)-s in another 4 × 3 possible ways;
the remaining two of the φ(x) can combine with the remaining two φ(y)
in two possible ways; finally, we can do the whole thing in two different
ways by exchanging x ↔ y, which gives another factor 2. Thus we get two
factors of 4! which nicely cancel with the (4!)² which arises in the coefficient
(1/2)(−iλ/4!)² (when we expand the exponential to quadratic order in λ),
leaving just (1/2)(−iλ)². To this order, our real space Feynman rule will
translate Fig. 4.9 into the algebraic expression

G(x1, x2, x3, x4) = (1/2)(−iλ)² ∫ d⁴y ∫ d⁴x ⟨x1|x⟩ ⟨x2|x⟩ ⟨x|y⟩ ⟨x|y⟩ ⟨y|x3⟩ ⟨y|x4⟩   (4.108)
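The factor 4 × 3 × 4 × 3 × 2 × 2 = (4!)² counted above can also be obtained by brute-force enumeration; the sketch below (with illustrative leg labels a1, …, a4 and b1, …, b4 for the two vertices) counts the perfect matchings of the twelve fields that realize the topology of Fig. 4.9:

```python
# Count the Wick contractions producing the topology of Fig. 4.9:
# x1, x2 attach to one vertex, x3, x4 to the other, and the two vertices
# are joined by two internal lines.

def pairings(elems):
    """Yield every perfect matching (pair-wise contraction) of a list."""
    if not elems:
        yield []
        return
    first, rest = elems[0], elems[1:]
    for i, partner in enumerate(rest):
        for tail in pairings(rest[:i] + rest[i + 1:]):
            yield [(first, partner)] + tail

externals = ["x1", "x2", "x3", "x4"]
fields = externals + ["a1", "a2", "a3", "a4", "b1", "b2", "b3", "b4"]

def vertex_of(label):
    return label[0] if label[0] in "ab" else None

def matches_topology(p):
    partner = {}
    for u, v in p:
        partner[u], partner[v] = v, u
    v12 = {vertex_of(partner["x1"]), vertex_of(partner["x2"])}
    v34 = {vertex_of(partner["x3"]), vertex_of(partner["x4"])}
    if None in v12 or None in v34:          # an external paired with an external
        return False
    if len(v12) != 1 or len(v34) != 1 or v12 == v34:
        return False                        # x1, x2 on one vertex; x3, x4 on the other
    internal = [(u, v) for u, v in p if u not in externals and v not in externals]
    return all(vertex_of(u) != vertex_of(v) for u, v in internal)

count = sum(1 for p in pairings(fields) if matches_topology(p))
print(count)    # 576 = (4!)**2; with the coefficient (1/2)(1/4!)^2 this leaves 1/2
```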
When we go to Fourier space, we will work with the Fourier transform of
G(x1 , x2 , x3 , x4 ) with respect to its arguments. It is convenient to associate
p1 , p2 with x1 , x2 and −k1 , −k2 with x3 and x4 , so that we can think of
in-going and out-going momenta with proper signs. We will denote the
Fourier transform of x|y by G(k) = i(k 2 − m2 + i )−1 . The Fourier
Figure 4.8: Momentum space representation of the vertex.

Figure 4.9: Diagram in real space corresponding to Eq. (4.108).
transform will, therefore, be given by

(1/2)(−iλ)² ∫ d⁴y d⁴x d⁴x1 d⁴x2 d⁴x3 d⁴x4 ⟨x1|x⟩ ⟨x2|x⟩ ⟨x|y⟩ ⟨x|y⟩ ⟨y|x3⟩ ⟨y|x4⟩ exp i[p1x1 + p2x2 − k1x3 − k2x4]
  = (1/2)(−iλ)² G(p1) G(p2) G(k1) G(k2) ∫ d⁴x d⁴y ⟨x|y⟩² exp i[(p1 + p2)x − (k1 + k2)y]
  = (1/2)(−iλ)² G(p1) G(p2) G(k1) G(k2) (2π)⁴ δ(p1 + p2 − k1 − k2) ∫ [d⁴k/(2π)⁴] G(k) G(k1 + k2 − k)   (4.109)

You will get similar expressions if we join x1 and x3 to x and x2 and
x4 to y, or if you join x1 and x4 to x and x2 and x3 to y. These three
situations appear in momentum space as the three diagrams in Fig. 4.10.
It is now clear that we again obtain four factors of G for the external
legs of the diagram, one integration over the internal line momentum, an
overall Dirac delta function expressing momentum conservation, individual
momentum conservation at each vertex, and an overall symmetry factor.
This is precisely what we would have got if we had worked directly with the
diagrams in Fig. 4.10. You should be able to convince yourself that similar
features arise for arbitrarily complicated diagrams when we translate them
from real space to momentum space.

Figure 4.10: Three momentum space diagrams related to the same process. [The internal lines carry momenta k and k1 + k2 − k, or k and p2 + k − k1.]

4.5 Effective Action and the Perturbation Expansion
Incidentally, the definition of Z(J) and W (J) — which we introduced in
order to facilitate the perturbation series — is also useful in another context. We saw earlier that one way of obtaining non-perturbative results is
by using the concept of effective action which we defined in Sect. 4.1.1 from
a fairly physical point of view. There is a more formal way of defining the
effective action which you will find in several textbooks based on Z(J) and
W (J). This formal definition will be useful, for example, to relate the approach
based on the effective action with that based on perturbation theory.
To do this, we begin from the partition function Z(J) defined by the
standard relation
Z[J] = e^{iW[J]} = ∫ Dφ exp i[ A[φ] + ∫ d⁴x Jφ ]   (4.110)
The Green’s functions of the theory can be obtained either from the functional derivatives of Z or from the functional derivatives of W with respect
to J. Usually we set J = 0 after the differentiations so as to obtain the time
ordered expectation values in the vacuum state. But if we keep J ≠ 0, we
obtain the relevant expectation values in a state |J⟩ describing an external
source J(x).
We are now in a position to define the effective action Aeff [φ] as a
Legendre transform. To do this, we invert Eq. (4.87), viz.,
(δW[J]/δJ(x)) ≡ φJ(x), to express J as a function of φJ, and omit the
subscript J on φJ to obtain the function J = J(φ). We then define the
effective action as the Legendre transform of W by

Aeff[φ] ≡ W[J(φ)] − ∫ d⁴x J[φ(x)] φ(x)   (4.111)
In this relation, J(φ) is treated as a functional of φ so that the left hand
side is a functional of φ, as it should be. The standard relations which occur
in any such Legendre transform, viz.:
(δAeff[φ]/δφ(x))|_{φ=φJ} = −J(x);    (δW[J]/δJ(x)) = φJ(x)   (4.112)
hold in this case as well and
W[J] = Aeff[φ(J)] + ∫ d⁴x J(x) φ[J(x)]   (4.113)
(4.113)
gives the inverse Legendre transform. You can easily verify that the Aeff (φ)
defined by the above rule is the same as the effective action we have defined
earlier for quadratic actions.
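A zero-dimensional caricature, in which W and Aeff are ordinary functions and functional derivatives become ordinary ones, makes the structure of this Legendre pair transparent. The sketch below (a toy quadratic W with arbitrary sample numbers, not the field theory itself) verifies the relations of Eq. (4.112) by finite differencing:

```python
# Zero-dimensional caricature of Eqs. (4.111)-(4.113): W and Aeff are
# ordinary functions. The quadratic form and numbers are toy choices.

a = 0.8
W = lambda J: 0.5 * a * J * J          # toy generating function
phi_of_J = lambda J: a * J             # phi = dW/dJ, second relation in Eq. (4.112)
J_of_phi = lambda phi: phi / a         # the inverted relation
A_eff = lambda phi: W(J_of_phi(phi)) - J_of_phi(phi) * phi   # Eq. (4.111)

phi0, h = 1.3, 1e-6
dA = (A_eff(phi0 + h) - A_eff(phi0 - h)) / (2 * h)
print(abs(dA + J_of_phi(phi0)))        # ~0: dAeff/dphi = -J, first relation in (4.112)

# Here Aeff(phi) = -phi^2/(2a), so Aeff'' = -1/a while W'' = a: the second
# derivative of Aeff is minus the inverse of that of W, mirroring Eq. (4.118).
```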
One can now understand the properties of Aeff better through the above
approach. We recall that the functional Taylor series expansion of W(J)
gives all the connected n-point functions G_c^(n) via the relation:

W[J] = −i Σ_{n=0}^∞ (iⁿ/n!) ∫ d⁴x1 · · · d⁴xn J(x1) · · · J(xn) G_c^(n)(x1, · · · , xn)   (4.114)
Taking a cue from this, we can also expand Aeff(φ) in a functional Taylor
series with the coefficients Γ^(n) defined through the relation37

Aeff[φ] = Σ_{n=1}^∞ (iⁿ/n!) ∫ d⁴x1 · · · d⁴xn Γ^(n)(x1, · · · , xn) φ(x1) · · · φ(xn)   (4.115)
Since we require δAeff/δφ = 0 when J = 0 with φc = 0, the above expansion implies that Γ^(1) = 0. To understand what this expansion means,
let us concentrate on Γ^(2)(x1, x2). We differentiate the second relation in
Eq. (4.112) with respect to φ(y). This gives
δ⁴(x − y) = (δ/δφ(y)) (δW[J]/δJ(x)) = ∫ d⁴z (δJ(z)/δφ(y)) (δ²W[J]/δJ(z)δJ(x))   (4.116)
But, from the first relation in Eq. (4.112), we have
−(δJ(z)/δφ(y)) = δ²Aeff[φ]/δφ(z)δφ(y)   (4.117)
Using this in Eq. (4.116) and evaluating it with φ = J = 0, we obtain a
simple relation between Γ^(2) and G_c^(2), given by:

−δ⁴(x − y) = ∫ d⁴z Γ^(2)(y, z) G_c^(2)(z, x),  or  Γ^(2) = −[G_c^(2)]⁻¹   (4.118)
In other words, the quadratic part of the effective action gives the inverse of
the connected two-point function. For example, we will show in Sect. 4.7.1
that the sum over a set of loop diagrams in Fig. 4.12 will give G_c^(2) =
(p² − m² − Σ(p))⁻¹ (see Eq. (4.145)). Then the relation in Eq. (4.118) tells us
that Γ^(2)(p) ∝ (p² − m² − Σ(p)) in the momentum space.
In other words, the effective action succinctly summarizes the information
contained in the sum of all the diagrams in Fig. 4.12. This should be
contrasted with a perturbation expansion, which will lead38 to the individual
terms in the expansion in Eq. (4.145).

37 For simplicity, we have assumed that ⟨0|φ|0⟩ is zero; otherwise we need to make a shift φ → φ − φc.

38 In this sense, one often says that the effective action is non-perturbative.
An alternate way of obtaining a similar result is as follows. If we substitute the expression in Eq. (4.113) for exp[iW (J)] in Eq. (4.110) and use
the first relation in Eq. (4.112) in the right hand side, we get the result
δAeff [φ]
exp i Aeff + Jφ d4 x = Dφ exp i A[φ] − dx φ(x)
δφ(x)
(4.119)
which can be written as
δAeff [ψ]
exp iAeff = Dφ exp i A[φ] − dx(φ − ψ)
(4.120)
δψ(x)
If we now shift φ to ψ + φ, we can rewrite this result as
δAeff [ψ]
exp iAeff = Dφ exp i A[ψ + φ] − dx φ(x)
δψ(x)
(4.121)
At first sight, you might think that this relation is not of much value since
both sides contain Aeff . However, the second term in the right hand side is
a higher order correction and hence we can iteratively calculate the result
as an expansion in .
For example, at one loop order we obtain an extra factor from the
second term, which is given by
'
&
δ 2 A0 [φ]
i
(1)
4
4
exp iAeff [φ] ∝ Dφ exp
d x φ(x) d y φ(y)
2
δφ(x)δφ(y)
(4.122)
where A0 is the tree-level action. In a theory with
1
1 2 2
m
∂m φ∂ φ − m φ − V (φ)
(4.123)
A0 [φ] = dx
2
2
we get the correction to be
&
'
i
(1)
d4 x φ + m2 + V (φ) φ
exp iAeff ∝ Dφ exp
2
This leads to an effective action of the form
i
(1)
Aeff = A0 [φ] + ln Det + m2 + V (φ)
2
(4.124)
(4.125)
This is indeed the expression we have used before to compute the effective
action.
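The origin of the ln Det term in Eq. (4.125) is the elementary Gaussian integral: in a zero-dimensional Euclidean caricature, ∫ dφ exp(−Qφ²/2) = (2π/Q)^{1/2}, so −ln Z contains (1/2) ln Q, the analogue of (1/2) ln Det. A numerical sketch with an arbitrary sample Q:

```python
import math

# Zero-dimensional Euclidean caricature of the one-loop determinant:
# integrating exp(-Q*phi^2/2) gives sqrt(2*pi/Q), so -ln Z contains
# (1/2) ln Q, the analogue of the (1/2) ln Det term in Eq. (4.125).

Q = 2.3                    # sample "eigenvalue" of the quadratic operator
dphi, L = 1e-4, 12.0       # integration step and cutoff (tails are negligible)
Z = sum(math.exp(-0.5 * Q * ((i + 0.5) * dphi - L) ** 2)
        for i in range(int(2 * L / dphi))) * dphi

exact = math.sqrt(2 * math.pi / Q)
print(abs(Z - exact))      # ~0
```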
4.6 Aside: LSZ Reduction Formulas

39 You can happily skip this discussion unless you are curious to know what LSZ is all about.

The results obtained so far can be made much more rigorous in many different ways, one of which is called the LSZ (short for Lehmann-Symanzik-Zimmermann) formalism.39 While we do not want to be unnecessarily
formal, we will describe very briefly the ideas involved in this formalism in
order to clarify certain aspects of our previous discussion.
For the sake of definiteness, let us consider a physical process in which
two particles with momenta p1 and p2 scatter with each other and produce
a bunch of particles with momenta p3 , p4 , ...pn . We introduce an S-matrix
element — which is the matrix element of an operator S — denoted by
p3 , p4 , ....pn |S|p1 , p2 which gives the amplitude for this process. We have
seen that such amplitudes can be computed in terms of Feynman diagrams
which, in turn, are short hand notations for algebraic expressions involving
vacuum expectation values of the time-ordered product of field operators
like the N-point function:
G(x1, x2, . . . , xN) = ⟨0|T[φ(x1) φ(x2) . . . φ(xN)]|0⟩   (4.126)
The procedure based on the Feynman diagrams suggests that the S-matrix
element ⟨p3, p4, . . . , pn|S|p1, p2⟩ is directly related to the vacuum expectation values of the time ordered products like G(x1, x2, . . . , xN) defined by
Eq. (4.126). The LSZ reduction formula provides the formal relation between these two. For the process which we were discussing, one can show
that this relation is given by
⟨p3 · · · pn|S|p1 p2⟩ = [ i ∫ d⁴x1 e^{−ip1x1} (□1 + m²) ] · · · [ i ∫ d⁴xn e^{−ipn xn} (□n + m²) ] ⟨0|T{φ(x1) φ(x2) φ(x3) · · · φ(xn)}|0⟩   (4.127)
We will now describe briefly how such a relation comes about.
In general, the field will be described by an expansion of the form
φ(x) = φ(t, x) = ∫ [d³p/(2π)³] (1/√(2ωp)) [ ap(t) e^{−ip·x} + a†p(t) e^{ip·x} ]   (4.128)
which is an operator in the Heisenberg picture. In a free field theory, ap(t)
will evolve as e^{−iωp t}; but in an interacting theory, the evolution of these
operators will be non-trivial. Let us assume that all the interactions happen
during the time interval −T < t < T. At very early times (t → −∞) and
at very late times (t → +∞), the system is described by a free field theory
which satisfies the condition
⟨0|φ(t = ±∞, x)|p⟩ ∝ e^{ip·x}   (4.129)
Further, since all interactions take place only during the interval −T <
t < T , the creation and annihilation operators evolve like in the free-field
theory at t → ±∞. So, asymptotically, we can construct the initial and
final states by acting on the vacuum with suitable creation operators. More
specifically, we have,
|p1, p2⟩ ≡ |i⟩ = √(2ω1) √(2ω2) a†p1(−∞) a†p2(−∞) |0⟩   (4.130)

|p3, p4 · · · pn⟩ ≡ |f⟩ = √(2ω3) · · · √(2ωn) a†p3(∞) · · · a†pn(∞) |0⟩   (4.131)
Therefore, the S-matrix element is just the amplitude given by
⟨f|S|i⟩ = √(2ⁿ ω1 · · · ωn) ⟨0| ap3(∞) · · · apn(∞) a†p1(−∞) a†p2(−∞) |0⟩   (4.132)
We first re-write this expression in a slightly different way which will prove
to be convenient. To begin with, we insert a time-ordering operator and
express it in the form
⟨f|S|i⟩ = √(2ⁿ ω1 · · · ωn) ⟨0| T[ ap3(∞) · · · apn(∞) a†p1(−∞) a†p2(−∞) ] |0⟩   (4.133)
We next notice that this expression can be written in an equivalent form
as
⟨f|S|i⟩ = √(2ⁿ ω1 · · · ωn) ⟨0| T{ [ap3(∞) − ap3(−∞)] · · · [apn(∞) − apn(−∞)] [a†p1(−∞) − a†p1(∞)] [a†p2(−∞) − a†p2(∞)] } |0⟩   (4.134)
This trick works because the time-ordering moves all the unwanted a†(∞)
operators associated with the initial state to the left, where they annihilate
on ⟨f|, while all the unwanted a(−∞) operators are pushed to the right,
where they annihilate on |i⟩. So, in order to determine ⟨f|S|i⟩ we only
need to find a suitable expression for [ap(∞) − ap(−∞)] (and its Hermitian
conjugate) for each of the creation and annihilation operators. Our aim is
to manipulate this expression and obtain Eq. (4.127).
The key to this result is the purely algebraic relation
i ∫ d⁴x e^{ipx} (□ + m²) φ(x) = √(2ωp) [ap(∞) − ap(−∞)]   (4.135)
in which the four-momentum is on-shell. To obtain this relation, we will assume that all fields die off at spatial infinity, allowing a suitable integration
by parts. Then, we have the results
i ∫ d⁴x e^{ipx} (□ + m²) φ(x) = i ∫ d⁴x e^{ipx} (∂t² + ωp²) φ(x)   (4.136)

and

∂t[ e^{ipx} (i∂t + ωp) φ(x) ] = [ iωp e^{ipx} (i∂t + ωp) + e^{ipx} (i∂t² + ωp ∂t) ] φ(x) = i e^{ipx} (∂t² + ωp²) φ(x)   (4.137)

Exercise 4.5: Prove these.

which can be proved by simple algebra. Combining these, we get

i ∫ d⁴x e^{ipx} (□ + m²) φ(x) = ∫ d⁴x ∂t[ e^{ipx} (i∂t + ωp) φ(x) ] = ∫ dt ∂t[ e^{iωp t} ∫ d³x e^{−ip·x} (i∂t + ωp) φ(x) ]   (4.138)
Since the integrand is a total time derivative and the integration is over
the entire range −∞ < t < ∞, the result is only going to depend on the
behaviour of the integrand in the asymptotic limits. By construction, ap (t)
and a†p (t) are time-independent asymptotically. If we now use this fact and
the expansion in Eq. (4.128), we can easily show that
∫ d³x e^{−ip·x} (i∂t + ωp) φ(x) = √(2ωp) ap(t) e^{−iωp t}   (4.139)
which allows us to write
i ∫ d⁴x e^{ipx} (□ + m²) φ(x) = ∫ dt ∂t[ e^{iωp t} √(2ωp) ap(t) e^{−iωp t} ] = √(2ωp) [ap(∞) − ap(−∞)]   (4.140)
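The identity in Eq. (4.137) is purely algebraic and holds for any profile φ(t) and any ωp; the sketch below (an arbitrary sample profile and frequency) checks it by finite differencing the left hand side:

```python
import math, cmath

# Numerical check of Eq. (4.137):
#   d/dt [e^{i w t}(i phi' + w phi)] = i e^{i w t}(phi'' + w^2 phi),
# for a sample profile phi(t) = cos(3t) and sample frequency w.

w = 2.0
phi = lambda t: math.cos(3 * t)
dphi = lambda t: -3 * math.sin(3 * t)    # exact phi'
d2phi = lambda t: -9 * math.cos(3 * t)   # exact phi''

g = lambda t: cmath.exp(1j * w * t) * (1j * dphi(t) + w * phi(t))

t, h = 0.7, 1e-5
lhs = (g(t + h) - g(t - h)) / (2 * h)    # finite-difference time derivative
rhs = 1j * cmath.exp(1j * w * t) * (d2phi(t) + w * w * phi(t))
print(abs(lhs - rhs))                    # ~0
```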
Taking the Hermitian conjugate, we also have the result

−i ∫ d⁴x e^{−ipx} (□ + m²) φ(x) = √(2ωp) [ a†p(∞) − a†p(−∞) ]   (4.141)
That is all we need. Substituting Eq. (4.141) into Eq. (4.134), we immediately obtain the relation in Eq. (4.127).
What does this relation mean? The time-ordered correlation function
contains a large amount of information about the dynamics of the field,
much of which is irrelevant for the amplitude we are interested in. The operator □ + m², which becomes −p² + m² in Fourier space, actually vanishes
in the asymptotic states where the field becomes free. Therefore, these
factors in Eq. (4.127) will remove all the terms in the time-ordered product
except those containing the factor (p² − m²)⁻¹. In other words, we are essentially picking out the residue from each pole in the correlation function.
Thus, the S-matrix projects out the one-particle asymptotic states from
the time-ordered product of the fields. The LSZ reduction formula encodes
a careful cancellation between the zeros resulting from □ + m² acting on the
fields and the (□ + m²)⁻¹ arising from the one-particle states. This formal
relation is what is pictorially represented through the Feynman diagrams.
4.7 Handling the Divergences in the Perturbation Theory
When we studied the self-interacting scalar field from a non-perturbative
perspective — using the effective action — in Sect. 4.3, we found that the
effective theory can be expressed in terms of certain physical parameters
mphy , λphy which are different from the parameters m0 , λ0 which appear
in the original Lagrangian.40 We also found that for mphy, λphy to remain
finite, the bare parameters m0, λ0 have to be divergent. In particular,
the strength of the interaction in the theory, measured by the coupling
constant λphy, needs to be defined operationally at some suitable energy
scale. Once this is done, it gets determined at all other scales through
Eq. (4.75). The question arises as to how we are led to these
results when we approach the problem perturbatively.
The answer to this question relies on the concept of perturbative renormalization which is an important technical advance in the study of quantum field theory. The main application of this technique — leading to
definite predictions which have been observationally verified — occurs in
QED which we will study in the next chapter. At that time, we will also describe the conceptual aspects of the perturbative renormalization in somewhat greater detail. The purpose of this section is to briefly introduce these
ideas in the context of λφ4 theory as a preparation for the latter discussion
in QED.41
The really nontrivial aspects of any quantum field theory, in particular
λφ4 theory or QED, arise only when we go beyond the lowest order perturbation theory (called the tree-level). This is because, as long as there
are no internal loops, the translation of a Feynman diagram into an algebraic expression does not involve any integration. But when we consider
a diagram involving one or more internal loops, we have to integrate over
the momenta corresponding to these loops. The propagators corresponding to the internal loops will typically contribute a (1/p2 ) or (1/p4 ) factor
40 We have added the subscripts 'phy' and 0 to these parameters for later notational convenience.

41 The technical details of renormalization depend crucially on the kind of theory you are interested in. From this point of view, λφ⁴ theory is somewhat too simple to capture all the nuances of this technique. This is one reason for keeping this discussion rather brief in this section.

42 The effective action is closely related to the expansion in loops. One may therefore try to argue that, ultimately, the source of all divergences is in the loops. But the fact that the effective action can also lead to results which are non-analytic in the coupling constant — like the Schwinger effect — suggests that it is better not to try and interpret everything in quantum field theory in terms of the perturbative expansion.

43 When we study the λφ⁴ theory at one loop order — which we will do in the next section — we will find that the mass m and the coupling constant λ get renormalized, and this is adequate to handle all the one loop divergences. But when we proceed to the next order, viz. O(λ²), we will encounter a new type of divergence which can be handled through the renormalization of the field, usually called the wave function renormalization. This is described in Problem 9, but we will discuss a similar renormalization in detail when we study QED.

Figure 4.11: One loop correction to the propagator.
to the integral at large momenta, and the integration measure will go as
p³ dp. This could lead the contributions from these diagrams to diverge
at large p. In fact, this disaster does occur in λφ⁴ theory and QED, and
the diagrams involving even a single loop produce divergent results. It is
necessary to understand and reinterpret the perturbation theory when this
occurs.
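The power counting above can be made concrete. For a loop with two internal propagators, the Euclidean radial integrand behaves as k³/(k² + m²)², i.e. as 1/k at large k, so the integral grows logarithmically with the upper cutoff. The toy computation below (sample mass m² = 1, arbitrary cutoffs) exhibits this growth:

```python
# Radial part of a Euclidean loop with two internal propagators:
#   I(L) = integral_0^L k^3 dk / (k^2 + m^2)^2  ~  ln L  for L >> m,
# so the contribution diverges logarithmically with the cutoff L.

import math

def I(cutoff, m_sq=1.0, n=200000):
    dk = cutoff / n
    total = 0.0
    for i in range(n):
        k = (i + 0.5) * dk               # midpoint rule
        total += k ** 3 / (k * k + m_sq) ** 2 * dk
    return total

print(I(200.0) - I(100.0))   # close to ln 2: doubling the cutoff adds ln 2
```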
The situation described above is the usual motivation given in textbooks
for the procedure called renormalization; it is thought of as a practical
necessity needed to “save” the perturbative calculation technique. From
a conceptual point of view, this reasoning is at best misleading and at
worst incorrect. As we have stressed several times before, the process of
renormalization, a priori, has very little to do with the existence or removal
of divergences in perturbative field theory. We have also seen examples of
these divergences in nonperturbative calculations based on the concept of
effective action.42
It is now clear that, to implement the program of perturbative renormalization, we need to undertake the following steps. (a) Identify processes
or Feynman diagrams which will lead to a divergent contribution to a probability amplitude. (b) Regularize this divergence in some sensible manner;
this is usually done by introducing some extra parameter ε, say, such that
the contribution is finite and well defined when ε ≠ 0 and diverges in the
limit of ε → 0. (c) See whether all the divergent terms can be made to
disappear by changing the original parameters λ_A^0 in the Lagrangian of
the theory to the physically observable set λ_A^ren. This is best done keeping
ε finite, so that you are dealing with regularized finite expressions. (d)
Re-express the theory and, in particular, the non-divergent parts of the
amplitude in terms of the physical parameters λ_A^ren. This will form the
basis for comparing the theory with observations.
In any given theory, say λφ⁴ theory or QED, we will need to carry
out this program to all orders in perturbation theory in order to convince
ourselves that the theory is well defined.43 This is indeed possible, but we
shall concentrate on illustrating this procedure at the lowest order, which
essentially involves integration over a single internal momentum variable.
This will introduce the necessary regularization procedure and the conceptual details which are involved.
4.7.1 One Loop Divergences in the λφ⁴ Theory

After this preamble, let us consider the Feynman diagrams in the λφ⁴
theory which contain a single loop. There are essentially two such diagrams,
of which the second one comes in three avatars. The first one is shown in
Fig. 4.11 while the second one — which has three related forms — is shown
in Fig. 4.10. The analytic expression corresponding to Fig. 4.11 is given by
the integral

−iΣ(p²) = −iΣ(0) = −i (λ/2) ∫ [d⁴ℓ/(2π)⁴] i/(ℓ² − m0² + iε)   (4.142)
On the other hand, the contribution from the sum of the three diagrams
in Fig. 4.10 is given by
iM = −iλ + A(p1 + p2) + A(k1 − p1) + A(k1 − p2)   (4.143)
where
A(p) ≡ [(−iλ)²/2] ∫ [d⁴k/(2π)⁴] [ i/(k² − m0² + iε) ] [ i/((p − k)² − m0² + iε) ]   (4.144)
It is obvious that both the integrals are divergent; we also note that Σ(p²)
is actually independent of p and is just a divergent constant.44 The contribution from Fig. 4.11 corrects the propagator from its expression at the
tree-level, while the contribution from Fig. 4.10 changes the value of the
coupling constant.
Let us begin with the role played by the diagram in Fig. 4.11. To understand it in the proper context, it is convenient to introduce the concept
of one-particle irreducible (1PI) diagrams. 1PI means that you start
with a single line and end with a single line, drawing in between all possible
diagrammatic structures which cannot be cut apart by cutting just a single
line. Using this concept, one can draw a series of 1PI diagrams which will
correct the propagator in a geometric series, as shown in Fig. 4.12. Translating this geometric series of diagrams into equivalent algebraic expressions,
we find that the Feynman propagator gets corrected as follows:
iG(p) = iG0(p) + iG0(p) [−iΣ(p)] iG0(p) + · · ·
  = i/(p² − m0² + iε) + [i/(p² − m0² + iε)] [−iΣ(p)] [i/(p² − m0² + iε)] + · · ·
  = [i/(p² − m0² + iε)] × 1/[ 1 + iΣ(p²) · i/(p² − m0² + iε) ]
  = i/(p² − m0² − Σ(p²) + iε)   (4.145)

44 This is a rather special feature of λφ⁴ theory and does not happen in general.
Since Σ(p²) = Σ(0) is a constant, it is obvious that the net effect of the
diagram in Fig. 4.11 — when used in the geometric series in Fig. 4.12 —
is to change the bare mass as m0² → m0² + Σ(0).
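The resummation in Eq. (4.145) is an ordinary geometric series and can be checked numerically; the sketch below uses arbitrary sample values (with the iε dropped, since the denominators never vanish here):

```python
# Numerical check of the geometric series in Eq. (4.145), for sample values
# chosen so that the series converges.

p_sq, m0_sq, Sigma = 10.0, 1.0, 0.5
G0 = 1j / (p_sq - m0_sq)                 # free propagator iG0(p)

total, term = 0j, G0
for _ in range(60):
    total += term
    term = term * (-1j * Sigma) * G0     # insert one more self-energy blob

resummed = 1j / (p_sq - m0_sq - Sigma)   # the closed form in Eq. (4.145)
print(abs(total - resummed))             # ~0
```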
Our next job is to evaluate Σ(p) by some regularization procedure. As
usual, one could do this either by introducing a cut-off or by dimensional
regularization. This time we shall use the procedure of dimensional regularization in order to illustrate an important difference with respect to the
cut-off regularization which we used earlier when we studied the effective
potential.
To do this, it is convenient at this stage to switch to the n-dimensional
Euclidean momentum space. As we described earlier, dimensional regularization involves45 working in n-dimensions and analytically continuing the
results to all values of n and then taking the limit of n → 4. By and large,
this procedure has the advantage that it maintains the symmetries of the
theory.
Such a dimensional continuation of the integral in Eq. (4.142) will require us to essentially evaluate the integral given by
$$I \equiv \lambda\mu^{4-n}\int\frac{d^nk}{(2\pi)^n}\,\frac{1}{k^2 + m_0^2} \qquad (4.146)$$
The pre-factor μ^(4−n), where μ has the dimensions of energy, is introduced in order to keep λ dimensionless46 even in n dimensions, just as it was in n = 4. It is also convenient to define a parameter ε by 2ε = 4 − n. The
Figure 4.12: Diagrammatic representation of the geometric progression
evaluated in Eq. (4.145).
45 We will say more about this procedure when we discuss QED in the next
chapter.
46 Since we want ∫d^n x (∂φ)² to be dimensionless, φ must have the dimensions μ^(1−ε), where μ is an energy scale. Further, λφ4 must have the dimensions of (∂φ)². This requires λ to have the dimensions of μ^(2ε) = μ^(4−n) in n dimensions. It is this factor which we scale out in order to keep λ dimensionless.
170
Chapter 4. Real Life I: Interactions
integral in Eq. (4.146) is most easily evaluated by writing the integrand (1/D) as
$$\frac{1}{D} = \int_0^\infty ds\,\exp(-sD)\,, \qquad (4.147)$$

Exercise 4.6: Fill in the algebraic details in Eq. (4.148) to Eq. (4.152).

carrying out the Gaussian integrations over k and then performing the integration over the parameter s. Analytically continuing back to Lorentzian spacetime, we find that our result is given by
$$-i\Sigma(0) = \frac{-i\lambda m^2}{32\pi^2}\left(\frac{4\pi\mu^2}{m^2}\right)^{\epsilon}\Gamma(-1+\epsilon) \qquad (4.148)$$
This result is finite for finite ε, but of course we require the limit ε → 0, which will allow us to isolate the divergent terms in a clean fashion. We first note that
$$\left(\frac{4\pi\mu^2}{m^2}\right)^{\epsilon} = \exp\left[\epsilon\ln\frac{4\pi\mu^2}{m^2}\right] \simeq 1 + \epsilon\ln\frac{4\pi\mu^2}{m^2} \qquad (4.149)$$
while the Gamma function has the expansion near ε = 0 given by
$$\Gamma(-1+\epsilon) = -\left[\frac{1}{\epsilon} + 1 - \gamma_E\right] + O(\epsilon) \qquad (4.150)$$
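The Gamma function expansion in Eq. (4.150) is easy to verify numerically; the following Python snippet (with an illustrative small value of ε) checks it against the exact Gamma function:

```python
import math

# Numerical check of the expansion Gamma(-1+eps) = -(1/eps + 1 - gamma_E) + O(eps)
# used in Eq. (4.150).  gamma_E is the Euler-Mascheroni constant.
gamma_E = 0.5772156649015329

eps = 1e-3                      # illustrative small value
exact = math.gamma(-1.0 + eps)
approx = -(1.0 / eps + 1.0 - gamma_E)

# The two agree up to O(eps) corrections
assert abs(exact - approx) < 5e-3 * abs(exact)
```

The O(ε) remainder is what gets discarded when the divergent and finite pieces are separated in Eq. (4.151).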
Putting these in, we find the final result which we are after, viz.,
$$-i\Sigma(0) = \frac{i\lambda m^2}{32\pi^2}\left[\frac{1}{\epsilon} + 1 - \gamma_E + \ln\frac{4\pi\mu^2}{m^2} + O(\epsilon)\right] \qquad (4.151)$$
This result displays the structure of the divergence (which arises from the (1/ε) term) and a dependence on the arbitrary mass scale μ through the logarithmic term. We can take care of the divergence by replacing m₀² by m² = m₀² + δm², which can be written as
$$m_0^2 = m^2 - \delta m^2 = m^2\left[1 + \frac{\lambda}{32\pi^2}\,\frac{1}{\epsilon}\right] \qquad (4.152)$$
32π 2
47 Do not confuse this mass scale μ with the cut-off scale Λ which was used to regularize divergences earlier. In dimensional regularization, the ultraviolet regulator is actually provided by the parameter 2ε = 4 − n, which measures the deviation from 4 dimensions. The ε → 0 limit is equivalent to an ultraviolet regulator scale Λ going to infinity. The μ is not a regulator but, as we shall see, allows us to define a renormalization point (usually a finite mass scale of the order of the other mass scales in the theory) used to define renormalized quantities.
We, of course, do not want the physical results to depend on the arbitrary
scale μ; this would require m and λ to vary with μ in a particular way.47
We will comment on this fact after discussing the renormalization of λ.
You might have noticed that the relation between m0 and m is now different from what we found in Eq. (4.72). In Eq. (4.72) we had a logarithmic
divergence proportional to λm2 and a quadratic divergence proportional to
λ. But in Eq. (4.152) we only have a single divergence proportional to λm2
and the quadratic divergence has disappeared. As we mentioned earlier,
this is a general feature of dimensional regularization. To understand this
better, consider the n-dimensional Euclidean momentum integral of the
form
$$I_k(n, m^2) = \int\frac{d^np_E}{(2\pi)^n}\,\frac{1}{(p_E^2+m^2)^k} \qquad (4.153)$$
This integral is most easily evaluated by using the following integral representation for the integrand:
$$\frac{1}{a^k} = \frac{1}{\Gamma(k)}\int_0^\infty dt\,t^{k-1}\,e^{-at} \qquad (a>0) \qquad (4.154)$$
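The representation in Eq. (4.154) can be confirmed by direct quadrature; here is a Python sketch with illustrative values of a and k:

```python
import math

# Check of the integral representation in Eq. (4.154),
#   1/a^k = (1/Gamma(k)) Integral_0^inf dt t^(k-1) e^(-a t)   (a > 0),
# using Simpson's rule on a truncated range (the tail beyond T is ~ e^(-aT)).
a, k = 2.0, 3                    # illustrative values
T, N = 40.0, 20000               # truncation point and (even) interval count
h = T / N

def integrand(t):
    return t ** (k - 1) * math.exp(-a * t)

total = integrand(0.0) + integrand(T)
for i in range(1, N):
    total += (4 if i % 2 else 2) * integrand(i * h)
total *= h / 3.0

lhs = 1.0 / a ** k               # = 1/8 for a = 2, k = 3
rhs = total / math.gamma(k)

assert abs(lhs - rhs) < 1e-8
```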
4.7. Handling the Divergences in the Perturbation Theory
171
and performing the Gaussian integrals over the momentum. This will give
the result
$$I_k(n, m^2) = \frac{\Gamma\left(k-2-\frac{n-4}{2}\right)}{(4\pi)^{2+\frac{n-4}{2}}\,\Gamma(k)\,(m^2)^{k-2-\frac{n-4}{2}}} \qquad (4.155)$$
We can now take the limit of n → 4 and obtain
$$I_k(n, m^2) \to \frac{(m^2)^{2-k}}{16\pi^2}\,\frac{2}{n-4} + \text{finite part} \qquad (4.156)$$
We see that the nature of the divergence is independent of k and always has a 1/(n − 4) behaviour.
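This behaviour can be checked numerically from the exact expression, Eq. (4.155). The Python sketch below does so for the k = 1 case (the case of interest in the text, with an illustrative m² = 1), confirming that the result is dominated by the 2/(n − 4) pole as n → 4:

```python
import math

# Check that the exact n-dimensional result, Eq. (4.155), reproduces the
# pole structure of Eq. (4.156) for the k = 1 integral used in the text:
# I_1 -> m^2/(16 pi^2) * 2/(n-4) + finite part as n -> 4.
def I_k(n, m2, k):
    # Eq. (4.155): note k - 2 - (n-4)/2 = k - n/2 and 2 + (n-4)/2 = n/2
    return math.gamma(k - n / 2) / ((4 * math.pi) ** (n / 2)
                                    * math.gamma(k) * m2 ** (k - n / 2))

m2, k = 1.0, 1                       # illustrative mass, k = 1 case
n = 4.0 - 1e-4                       # slightly below four dimensions
pole = m2 ** (2 - k) / (16 * math.pi ** 2) * 2.0 / (n - 4.0)

# the full answer is pole + finite part, so the ratio tends to 1 as n -> 4
assert abs(I_k(n, m2, k) / pole - 1.0) < 1e-3
```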
On the other hand, if we compute the same integral using a momentum space cut-off at some large value M, then we get a result which clearly depends on k:
$$\int_{|p_E|<M}\frac{d^4p_E}{(2\pi)^4}\,\frac{1}{(p_E^2+m^2)^k} \simeq
\begin{cases}
\dfrac{1}{8\pi^2}\left[\dfrac{M^2}{2}-\dfrac{m^2}{2}\log\dfrac{M^2}{m^2}\right] & k=1 \\[2ex]
\dfrac{1}{8\pi^2}\left[\log\dfrac{M}{m}-\dfrac{1}{2}\right] & k=2 \\[2ex]
\dfrac{(m^2)^{2-k}}{16\pi^2\,(k-1)(k-2)} & k\geq 3
\end{cases} \qquad (4.157)$$
We see that the k = 1 and k = 2 cases (which crop up repeatedly in quantum field theory) have different structures of the divergence compared to what we found with dimensional regularization.48
In our case, we are interested in the k = 1 integral. If we do it using
a cut-off — as we did when we computed the effective potential — we
pick up two kinds of divergences, one quadratic and one logarithmic, as
seen in Eq. (4.157). On the other hand, if we had used the dimensional
regularization, we would not have picked up the quadratic divergence as can
be seen from Eq. (4.156). In some sense, this issue is irrelevant, because the
divergent terms are being absorbed into the parameters of the theory, and
no physical effects, including the running of the parameters, will depend on
whether we use dimensional regularization or cut-off regularization. It is
nevertheless important to appreciate that dimensional regularization makes
quadratic divergences disappear.49
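The k = 1 entry of Eq. (4.157) can be checked directly: the radial integral is elementary, and its large-M behaviour reproduces the quoted quadratic and logarithmic divergences. A Python sketch with illustrative values:

```python
import math

# Direct check of the k = 1 cut-off result in Eq. (4.157).  The angular
# integration in 4 Euclidean dimensions gives 2 pi^2, so with u = p^2:
#   Int_{|p|<M} d^4p/(2pi)^4 (p^2+m^2)^-1
#     = (1/16 pi^2) Int_0^{M^2} du u/(u+m^2)
#     = (1/16 pi^2) [M^2 - m^2 log(1 + M^2/m^2)],
# which for M >> m matches (1/8 pi^2)[M^2/2 - (m^2/2) log(M^2/m^2)].
m2 = 1.0                  # illustrative m = 1
M2 = 1e6                  # cutoff M = 1000 >> m

exact = (M2 - m2 * math.log(1.0 + M2 / m2)) / (16 * math.pi ** 2)
leading = (M2 / 2 - (m2 / 2) * math.log(M2 / m2)) / (8 * math.pi ** 2)

# agreement up to terms suppressed by powers of m^2/M^2
assert abs(exact / leading - 1.0) < 1e-4
```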
4.7.2 Running Coupling Constant in the Perturbative Approach
Let us next consider the contributions from the three diagrams in Fig. 4.9, which are given by the expressions in Eq. (4.143) and Eq. (4.144), reproduced here for convenience:
$$iM = -i\lambda + A(p_1+p_2) + A(k_1-p_1) + A(k_1-p_2) \qquad (4.158)$$
where we have defined
$$A(p) \equiv \frac{(-i\lambda)^2}{2}\int\frac{d^4k}{(2\pi)^4}\,\frac{i}{k^2-m^2+i\epsilon}\,\frac{i}{(p-k)^2-m^2+i\epsilon} \qquad (4.159)$$
Physically this represents the amplitude for a φφ → φφ scattering50 correct
to O(λ2 ). To the lowest order, we just have (−iλ) and to the next order
we have the contribution from the three diagrams in Fig. 4.9.
Exercise 4.7: Obtain the expressions
in Eq. (4.157).
48 This is just another fact of life. You
cannot do anything about it except be
aware that the pattern of divergence
can depend on the regularization procedure you use and make sure no observable results are affected.
49 Notice
that, in the case of the integral I1 (n, m2 ), we have a Γ((2 − n)/2)
making its appearance. This has, besides the pole at n = 4, another one
at n = 2. The original quadratic divergence transforms itself into the existence of two such poles in the resulting
integral.
50 The arguments of A in Eq. (4.158) occur frequently in problems involving the scattering of two particles of masses m₁, m₂ from the initial momenta (p₁, p₂) to the final momenta (p₃, p₄). For such a process, one defines the Mandelstam variables s ≡ (p₁ + p₂)², t ≡ (p₁ − p₃)², u ≡ (p₁ − p₄)². You can easily verify that √s is the center of mass energy for the scattering and s + t + u is equal to the sum of the squares of the masses of the particles. They play an important role in the study of the kinematics of the scattering.
51 For example, if we scatter two particles at very low momentum, we can say that the energy scale is set by the rest mass m of the particles; or, if we scatter two particles at energies far larger than the rest mass energy, we could take the center of mass energy √s as setting the scale for this scattering, etc.
52 We have, of course, come across the
same issue in Sect. 4.3 and have explained the philosophy behind tackling
this problem. It is not surprising that
the same issue arises in the perturbative approach as well. In fact, you will
see that we will get exactly the same
result for the “running” of λ, which
is gratifying given the cavalier attitude
we have taken towards mathematical
rigour.
53 This is usually called the Feynman parameterization in textbooks,
though it has appeared in an earlier
work of Julian Schwinger in the exponential form. The discussion about
priorities in a conversation between
Schwinger and Harold (Hypothetically
Alert Reader Of Limitless Dedication;
a character, created by Schwinger, who
sets records straight) in page 338 of
the book Sources, Fields and Particles, Vol. I is amusing to listen to:
Harold: Is it not true, however, that
the usual intent of that device, to replace space-time integrations by invariant parametric integrals, was earlier
exploited by you in a related exponential version, and that the elementary
identity combining two denominators,
in fact, appears quite explicitly in a paper of yours, published in the same issue that contains Feynman’s contribution?
Schwinger : Yes.
The most interesting (or disturbing, depending on your point of view)
feature of the expression in Eq. (4.158) is that the integral defining A is
logarithmically divergent. The way to handle this divergence, again, is as
we described before. The parameters in the Lagrangian — and in particular the coupling constant λ — have no intrinsic physical significance and
have to be defined through some operational procedure like the φφ → φφ
scattering, etc. When such an experiment is performed, one would work
at some energy51 scale E. With such a scattering experiment, one can
measure the amplitude M which will define for us the effective coupling
constant λeff (E) defined through the relation iM = −iλeff (E). This effective coupling constant could, of course, depend on the energy scale of
the scattering experiment which is indicated by an explicit functional dependence in the expression. When such an experiment is performed, one
always gets finite answers for M and the experimental result takes into
account the interaction to all orders in the perturbation theory. So, what we need to do is re-express all the answers in terms of the experimentally determined
λeff (at some scale). If the divergences disappear when the scattering amplitudes are re-expressed in terms of physically relevant parameters, then
we have a good theory which can be used to make useful predictions. We
will now see how this comes about.52
Our first task is to regularize the divergent integral in Eq. (4.159) and
isolate the divergences. To do this, we will again use dimensional regularization which requires us to work in n dimensions after analytically
continuing to the Euclidean sector. Assuming p2 < m2 and rotating to
the Euclidean plane in n-dimensions, we need to essentially evaluate the
integral:
$$F(p^2) = -\int\frac{d^nk}{(2\pi)^n}\,\frac{1}{k^2+m^2}\,\frac{1}{(k-p)^2+m^2} \qquad (4.160)$$
This naive procedure of analytic continuation does work in this case; but it
is worth noting the following subtlety (to which we will come back later).
Since we want to analytically continue k⁰ to imaginary values, let us examine the analytic structure of the integrand in Eq. (4.159), made of the product of two propagators. The first factor has poles at $k^0 = \omega_k - i\epsilon$ and $k^0 = -\omega_k + i\epsilon$. These two poles are the usual ones in the second and fourth quadrants and do not prevent us from performing the Wick rotation $k^0 \to k^0 e^{i\theta}$ from θ = 0 to θ = π/2. The poles of the second factor are at $k^0 = p^0 + \sqrt{(\mathbf{k}-\mathbf{p})^2+m^2} - i\epsilon$ and $k^0 = p^0 - \sqrt{(\mathbf{k}-\mathbf{p})^2+m^2} + i\epsilon$. It is easy to see that, if 0 < p⁰ < m, then these poles are also in the second and fourth quadrants and hence do not create any obstacles for the Wick rotation. Therefore, we can first determine the value of the integral for p² < m²
and then analytically continue to the whole energy plane. This will require a transition across a Riemann sheet (because of the square roots), which can be handled — as we shall see in the Mathematical Supplement, Sect. 4.9.2 — by evaluating the imaginary part of the integral explicitly if required. We will now proceed with the above integral in Eq. (4.160).
It is convenient at this stage to introduce a parametric integration to
combine the denominators of the two individual terms.53 Using again the
trick in Eq. (4.147), we obtain
$$F(p^2) = -\int_0^\infty ds\int_0^\infty dt\; e^{-(s+t)m^2}\int\frac{d^nk}{(2\pi)^n}\,\exp\left[-\left\{(s+t)k^2 - 2k\cdot p\,t + p^2t\right\}\right] \qquad (4.161)$$
in which one can immediately perform the Gaussian integrals. This requires
completing the squares by writing
$$(s+t)k^2 - 2k\cdot p\,t + p^2t = (s+t)\left(k - \frac{pt}{s+t}\right)^2 + \frac{st}{s+t}\,p^2 \qquad (4.162)$$
and shifting the integration variable to
$$\bar{k} = k - \frac{pt}{s+t} \qquad (4.163)$$
We then get the result
$$F(p^2) = -\frac{1}{(4\pi)^{n/2}}\int_0^\infty ds\int_0^\infty dt\;\frac{1}{(s+t)^{n/2}}\,\exp\left[-\left\{(s+t)m^2 + \frac{st}{s+t}\,p^2\right\}\right] \qquad (4.164)$$
To proceed further, it is useful to do a change of variable which can be
done formally as follows. We first insert a factor of unity
$$1 = \int_0^\infty dq\;\delta(q-s-t) \qquad (4.165)$$
and introduce two variables α and β in place of s and t by the definitions
s = αq, t = βq. This leads to
$$F(p^2) = -\frac{1}{(4\pi)^{n/2}}\int_0^\infty dq\int_0^\infty q\,d\alpha\int_0^\infty q\,d\beta\;\frac{1}{q^{n/2}}\,\delta(q(1-\alpha-\beta))\,\exp\left[-q(m^2+\alpha\beta p^2)\right] \qquad (4.166)$$
Using δ(qx) = δ(x)/q, we can do the q integration, obtaining
$$\int_0^\infty dq\;q^{1-n/2}\,\exp\left[-q(m^2+\alpha\beta p^2)\right] = \Gamma\left(2-\frac{n}{2}\right)\left(m^2+\alpha\beta p^2\right)^{\frac{n}{2}-2} \qquad (4.167)$$
Therefore,
$$F(p^2) = -\frac{\mu^{n-4}}{(4\pi)^2}\,\Gamma\left(2-\frac{n}{2}\right)\int_0^1 d\alpha\left[\frac{m^2+\alpha(1-\alpha)p^2}{4\pi\mu^2}\right]^{\frac{n}{2}-2} \qquad (4.168)$$
In the last expression we have introduced an arbitrary mass scale μ by
multiplying and dividing by54 the factor μn−4 . This makes the integral
dimensionless which will be convenient later on.
We now need to take the n → 4 limit of this expression when we get a
divergence from the Gamma function which has a simple pole at this point.
In this limit, we have the relation
$$\Gamma\left(2-\frac{n}{2}\right) = \frac{\Gamma\left(3-\frac{n}{2}\right)}{2-\frac{n}{2}} \to \frac{1}{2-\frac{n}{2}} + \Gamma'(1) \equiv \frac{1}{\epsilon} - \gamma_E \qquad (4.169)$$
with ε = 2 − (n/2) and γE being Euler's constant. Using this in Eq. (4.168)
and pulling out a μn−4 factor to provide the dimensions, we can express
our result as
$$F(p^2) = \mu^{n-4}\left[-\frac{1}{(4\pi)^2}\,\frac{1}{\epsilon} + F_{\rm fin}(p^2)\right] \qquad (4.170)$$
54 That ensures the result is independent of μ; right? Now watch the fun.
where the finite part is:
$$F_{\rm fin}(p^2) = -\frac{1}{(4\pi)^2}\int_0^1 d\alpha\left[\Gamma(\epsilon)\left(\frac{4\pi\mu^2}{m^2+\alpha(1-\alpha)p^2}\right)^{\epsilon} - \frac{1}{\epsilon}\right] \qquad (4.171)$$
These expressions clearly isolate the divergent and finite parts of the function. In fact, using $x^\epsilon = e^{\epsilon\ln x} \approx 1 + \epsilon\ln x$ for small ε and the expansion in Eq. (4.169), we can easily show that the finite part is given by the integral representation:
$$F_{\rm fin}(p^2) = -\frac{1}{(4\pi)^2}\int_0^1 d\alpha\left[\ln\left(\frac{4\pi\mu^2}{m^2+\alpha(1-\alpha)p^2}\right) - \gamma_E\right] \qquad (4.172)$$
in the Euclidean space. Analytically continuing back to the Lorentzian
space, p2 will become −p2 and we will get
$$F_{\rm fin}(p^2) = -\frac{1}{(4\pi)^2}\int_0^1 d\alpha\left[\ln\left(\frac{4\pi\mu^2}{m^2-\alpha(1-\alpha)p^2}\right) - \gamma_E\right] \qquad (4.173)$$
We can evaluate the integral explicitly, but before we do that, let us
once again demonstrate how the running of the coupling constant arises
from this expression. The total scattering amplitude in Eq. (4.158) now
reads as
$$-M = \lambda + \lambda^2\left[-\frac{3}{2}\,\frac{1}{16\pi^2}\,\frac{1}{\epsilon} + \frac{1}{2}\left\{F_{\rm fin}(s)+F_{\rm fin}(t)+F_{\rm fin}(u)\right\}\right] + O(\lambda^3) \qquad (4.174)$$
55 Algebra alert: For most purposes, we can set μ^(2ε) in Eq. (4.175) to unity when ε → 0, which is what we have done here. But occasionally, the fact that μ^(2ε) ≈ 1 + 2ε ln μ can be relevant to regularize certain results.
with s = (p₁ + p₂)², t = (p₁ − p₃)², u = (p₁ − p₄)². We now define a physical coupling constant λphy by the relation
$$\lambda = \lambda_{\rm phy}\,\mu^{2\epsilon}\left[1 + \frac{3}{2}\,\frac{\lambda_{\rm phy}}{16\pi^2}\,\frac{1}{\epsilon}\right] + O(\lambda_{\rm phy}^3) \to \lambda_{\rm phy}\left[1 + \frac{3}{2}\,\frac{\lambda_{\rm phy}}{16\pi^2}\,\frac{1}{\epsilon}\right] + O(\lambda_{\rm phy}^3) \qquad (4.175)$$
with λphy being finite, so that the scattering amplitude becomes:55
$$-M = \lambda_{\rm phy} + \frac{1}{2}\,\lambda_{\rm phy}^2\left[F_{\rm fin}(s)+F_{\rm fin}(t)+F_{\rm fin}(u)\right] + O(\lambda_{\rm phy}^3) \qquad (4.176)$$
This expression is now perfectly finite, and we have taken care of the divergence.
As usual, λphy depends on the scale μ which is lurking inside Ffin. Obviously, we cannot have the amplitude M depend on this arbitrary scale μ. This is taken care of by arranging the μ dependence of λphy to be
such that M becomes independent of μ. This leads to the condition:
$$0 = -\frac{dM}{d\mu} = \frac{d\lambda_{\rm phys}}{d\mu} + \frac{1}{2}\,\lambda_{\rm phys}^2\,\frac{d}{d\mu}\left[F_{\rm fin}(s)+F_{\rm fin}(t)+F_{\rm fin}(u)\right] + O(\lambda_{\rm phy}^3) \qquad (4.177)$$

56 If you expand the denominator of Eq. (4.179) in a Taylor series, you can easily see that this reduces to Eq. (4.74). Writing the result as in Eq. (4.179) involves pretending that Eq. (4.178) is exact. We shall comment on the domain of validity of expressions like Eq. (4.179) in a moment.
Since (dFfin/dμ) = −(1/16π²)(2/μ), this condition reduces to
$$\mu\,\frac{d\lambda_{\rm phys}}{d\mu} = \frac{3}{16\pi^2}\,\lambda_{\rm phys}^2 + O(\lambda_{\rm phy}^3) \equiv \beta\lambda_{\rm phys}^2 + O(\lambda_{\rm phy}^3) \qquad (4.178)$$
with β ≡ (3/16π²) > 0. This is exactly the same “running” of the coupling constant obtained earlier in Eq. (4.76). If we formally integrate this relation, we get56
$$\lambda_{\rm phys}(\mu) = \frac{\lambda_{\rm phys}(\mu_0)}{1 - \beta\lambda_{\rm phys}(\mu_0)\ln\dfrac{\mu}{\mu_0}} \equiv \frac{\lambda_{\rm phys}(\mu_0)}{1 - \dfrac{3}{16\pi^2}\,\lambda_{\rm phys}(\mu_0)\ln\dfrac{\mu}{\mu_0}} \qquad (4.179)$$
If you expand the denominator of Eq. (4.179), it will lead to an expansion in powers of logarithms, given by:
$$\lambda_{\rm phy}(\mu) = \lambda_0 + \frac{3\lambda_0^2}{(4\pi)^2}\ln\frac{\mu}{\mu_0} + \frac{9\lambda_0^3}{(4\pi)^4}\ln^2\frac{\mu}{\mu_0} + \cdots \qquad (4.180)$$
where λ₀ ≡ λphy(μ₀). The first term in this expansion is the coupling constant λ₀ at the scale μ₀, which we take as fixed. The second term, with a single logarithm, represents the momentum dependence arising from the one loop correction to the 4-point function. In fact, it is easy to see that the higher order terms provide an expansion in terms of the number of loops. An n-loop diagram will have n + 1 vertices, giving rise to a factor λ^(n+1), and the n momentum integrals will give a factor (4π)^(−2n). The momentum dependence of the type ln^n(μ/μ₀) arises from regions of the integrals where all the loop momenta have similar scales.
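The claim that Eq. (4.179) integrates the one loop flow Eq. (4.178) can itself be checked by integrating the flow numerically; a Python sketch with an illustrative initial coupling:

```python
import math

# Compare the resummed running coupling of Eq. (4.179) against a direct
# Runge-Kutta integration of the one-loop flow mu dlambda/dmu = beta lambda^2,
# Eq. (4.178), written in the variable t = log(mu/mu0).
beta = 3.0 / (16 * math.pi ** 2)
lam0, t_final, n_steps = 0.5, 2.0, 20000   # illustrative lambda(mu0) and range
h = t_final / n_steps

lam = lam0
for _ in range(n_steps):                   # classic RK4 for dlam/dt = beta lam^2
    k1 = beta * lam ** 2
    k2 = beta * (lam + 0.5 * h * k1) ** 2
    k3 = beta * (lam + 0.5 * h * k2) ** 2
    k4 = beta * (lam + h * k3) ** 2
    lam += (h / 6.0) * (k1 + 2 * k2 + 2 * k3 + k4)

closed_form = lam0 / (1.0 - beta * lam0 * t_final)   # Eq. (4.179)
assert abs(lam - closed_form) < 1e-10
assert lam > lam0        # beta > 0: the coupling grows with the scale
```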
Since the β function was obtained in a perturbation series, you may wonder whether it is legitimate to integrate the resulting equation Eq. (4.178) and obtain an expression like the one in Eq. (4.179), which contains arbitrary powers of the coupling constant. To see what is involved, let us write the series expansion of Eq. (4.179) formally as:
$$\lambda_{\rm phy}(\mu) = \lambda_0\left[1 + \sum_{n=1}^\infty c_n(\mu)\,\lambda_0^n\right]; \qquad c_n(\mu) = \left[\beta\ln\frac{\mu}{\mu_0}\right]^n \qquad (4.181)$$
In the case of the λφ4 theory, β = (3/16π 2 ) > 0, but let us consider a more
general situation for a moment. The series in Eq. (4.181) (as well as the
result in Eq. (4.179)) can indeed be interpreted as arising due to summing
up a subset of Feynman diagrams in a geometric progression. To check the
limits of its validity, we need to know what the higher loop corrections do
to this expansion. The effect of higher loop corrections to the β function, it turns out, is to modify c_n(μ) to the form
$$c_n(\mu) = \left[\beta\ln\frac{\mu}{\mu_0} + O\left(\ln\ln(\mu/\mu_0)\right)\right]^n \qquad (4.182)$$
which, in turn, is equivalent to adding additional terms proportional to ln ln μ to the denominator of Eq. (4.179). This means the term with (ln μ)^n is unaffected by higher order corrections to β but, at each order, c_n will also pick up terms of order (ln μ)^(n−1) and smaller. These are the terms which are missed when we use the one loop β function.
It follows that the re-summation of a special class of diagrams leading to something like Eq. (4.179) is useful when ln(μ/μ₀) ≫ 1; in this case, we are actually picking up the dominant term at high energies at each order of n. But this, in turn, means that the procedure is really useful only when β < 0. This is because we simultaneously demand ln(μ/μ₀) ≫ 1 (to validate the re-summation) and λphys ≪ 1 (to validate the perturbation theory).57 This suggests that, by using the expression in Eq. (4.179), one can significantly improve the accuracy of the perturbative expansion at a scale μ differing significantly from the scale μ₀ at which the original coupling constant λ₀ was defined. In fact, even at the one loop level we
57 In both λφ4 theory — and in QED, as we will see later — we have β > 0. But we do have β < 0 in asymptotically free theories like QCD, which makes the coupling constant smaller at
higher energies. In such theories, it
is computationally easier to isolate the
divergent parts — containing the poles
in (d − 4) — than to calculate the finite parts. The coefficient of the divergent part can be used to integrate the β
function equation and thus obtain the
running of the coupling constant in the
leading log approximation. This turns
out to be quite useful computationally.
notice that the effective coupling constant is not really λ0 but λ0 ln(μ/μ0 ).
So even if λ0 was small at the scale μ0 , the factor λ0 ln(μ/μ0 ) can become
large when μ is not comparable to μ0 . The denominator in Eq. (4.179) tells
us that λphy can be actually small even if λ0 ln(μ/μ0 ) is large (if β < 0),
thereby providing us with a better expansion parameter.
Let us now close the discussion with the running of m, which we came across at the end of the last section. Once we determine the running of λ, we can determine the running of m from Eq. (4.152). Using the condition that m₀ cannot depend on the arbitrary scale μ, we get:
$$\mu\,\frac{\partial m_0^2}{\partial\mu} = 0 = \mu\,\frac{\partial m^2}{\partial\mu}\left[1 + \frac{\lambda}{32\pi^2}\,\frac{1}{\epsilon}\right] + m^2\,\mu\,\frac{\partial\lambda}{\partial\mu}\,\frac{1}{32\pi^2}\,\frac{1}{\epsilon} \qquad (4.183)$$
which can be solved to give:
$$\mu\,\frac{\partial m^2}{\partial\mu} = -m^2\,\mu\,\frac{\partial\lambda}{\partial\mu}\,\frac{1}{32\pi^2}\,\frac{1}{\epsilon}\left[1 - \frac{\lambda}{32\pi^2}\,\frac{1}{\epsilon}\right] \qquad (4.184)$$
In this expression, we need to be careful about evaluating μ(∂λ/∂μ) because of the (1/ε) term. Using the first expression in Eq. (4.175), we have $\mu(\partial\lambda/\partial\mu) = -2\epsilon\lambda\left[1 - (3\lambda/32\pi^2)(1/\epsilon)\right]$, which leads to the result
$$\mu\,\frac{\partial m^2}{\partial\mu} = -m^2(-2\epsilon\lambda)\left[1 - \frac{3\lambda}{32\pi^2}\,\frac{1}{\epsilon}\right]\frac{1}{32\pi^2}\,\frac{1}{\epsilon}\left[1 - \frac{\lambda}{32\pi^2}\,\frac{1}{\epsilon}\right] \;\Rightarrow\; \mu\,\frac{\partial m^2}{\partial\mu} = m^2\,\frac{\lambda}{16\pi^2} + O(\lambda^2) \qquad (4.185)$$
where the last relation holds when ε → 0. This is precisely what we obtained earlier in Eq. (4.77) from the effective action approach. With this
scaling, all physical results will be independent of the parameter μ.
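As a consistency check, Eq. (4.178) and the ε → 0 limit of Eq. (4.185) together imply d(ln m²)/dλ = 1/(3λ), i.e. m² ∝ λ^(1/3) along the flow. The Python sketch below (with illustrative initial values) integrates both flow equations and verifies this scaling:

```python
import math

# Combining Eq. (4.178) and Eq. (4.185): in t = log(mu/mu0),
#   d(ln m^2)/dt = lambda/16pi^2  and  dlambda/dt = (3/16pi^2) lambda^2,
# so m^2 should scale as lambda^(1/3).  Checked by Euler integration.
c = 1.0 / (16 * math.pi ** 2)
lam, m2 = 0.4, 1.0                # illustrative starting values at mu_0
lam0, m20 = lam, m2

h = 1e-5                          # step in t = log(mu/mu0)
for _ in range(200000):           # integrate up to t = 2
    dlam = 3 * c * lam ** 2
    dm2 = c * lam * m2
    lam += h * dlam
    m2 += h * dm2

# m^2(mu)/m^2(mu0) should equal (lambda(mu)/lambda(mu0))^(1/3)
assert abs(m2 / m20 - (lam / lam0) ** (1.0 / 3.0)) < 1e-4
```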
All that remains is to evaluate the finite part, Ffin (p2 ) in Eq. (4.173).
If we do not worry about the analytic structure, Riemann sheets etc., this
can be done by using the result
$$\int_0^1 d\alpha\,\ln\left[1 + \frac{4}{b}\,\alpha(1-\alpha)\right] = -2 + \sqrt{1+b}\,\ln\left(\frac{\sqrt{1+b}+1}{\sqrt{1+b}-1}\right) \qquad (4.186)$$
to obtain the final answer as:
$$F_{\rm fin}(s) = \frac{1}{(4\pi)^2}\left[\ln\frac{m^2e^{\gamma_E}}{4\pi\mu^2} - 2 + \sqrt{1-\frac{4m^2}{s}}\,\ln\left(\frac{\sqrt{s-4m^2}+\sqrt{s}}{\sqrt{s-4m^2}-\sqrt{s}}\right)\right] \qquad (4.187)$$
The original integral in Eq. (4.186) is valid only when b > 0, but we have
quietly continued the result to all values of p2 . While the final result
is correct, this misses some interesting physics contained in the branch
cuts of the scattering amplitude, which is discussed in the Mathematical
Supplement, Sect. 4.9.2.
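For b > 0, where Eq. (4.186) is directly valid, the identity can be confirmed by numerical quadrature; here is a Python sketch with the illustrative choice b = 3:

```python
import math

# Numerical check of the integral identity in Eq. (4.186) for b > 0,
# using Simpson's rule on the alpha integral over [0, 1].
b = 3.0
N = 2000                                  # even number of Simpson intervals
h = 1.0 / N

def f(a):
    return math.log(1.0 + (4.0 / b) * a * (1.0 - a))

simpson = f(0.0) + f(1.0)
for i in range(1, N):
    simpson += (4 if i % 2 else 2) * f(i * h)
simpson *= h / 3.0

root = math.sqrt(1.0 + b)                 # sqrt(1+b) = 2 for b = 3
closed = -2.0 + root * math.log((root + 1.0) / (root - 1.0))

assert abs(simpson - closed) < 1e-10
```

For b = 3 the closed form is −2 + 2 ln 3; the analytic continuation to other values of b is where the branch-cut subtleties mentioned above enter.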
4.8 Renormalized Perturbation Theory for the λφ⁴ Model
In the previous sections, we studied the renormalization of the λφ4 model
in the perturbative approach. We started with the bare Lagrangian of the
form
$$L = \frac{1}{2}(\partial_m\phi_0)^2 - \frac{1}{2}m_0^2\phi_0^2 - \frac{\lambda_0}{4!}\phi_0^4 \qquad (4.188)$$
where we have now introduced the subscript 0 to make it explicit that this
Lagrangian is written in terms of bare quantities. You have already seen
that the bare parameters m0 and λ0 get renormalized to mR and λR (now
denoted with subscript R for emphasis) order by order in the perturbation
theory. We haven’t seen any evidence for the renormalization of the field
φ itself, but it turns out that this will be required when we study two-loop
diagrams (see Problem 9). Just to be formally correct, we have also put a
subscript 0 to the field φ. We can now set up a perturbation expansion in
λ0 which will lead to the results discussed earlier.
The purpose of this brief section is to redo the analysis somewhat differently — using renormalized quantities and counter-terms — which has
significant formal advantage. As you will see, a similar procedure works
in the case of QED as well and learning it in the context of λφ4 theory is
somewhat simpler. In this spirit, we shall not work through the algebra
again, but will merely borrow the results from the previous discussions,
concentrating on the conceptual features.
We will begin by rescaling the field by φ0 = Z 1/2 φR and introducing
the three parameters (δZ , δm , δλ ) by the definitions
$$Z = 1 + \delta_Z\,, \qquad m_0^2\,Z = m_R^2 + \delta_m\,, \qquad Z^2\lambda_0 = \lambda_R + \delta_\lambda \qquad (4.189)$$
This reduces the Lagrangian to the form
$$L = \frac{1}{2}(\partial_m\phi_R)^2 - \frac{1}{2}m_R^2\phi_R^2 - \frac{\lambda_R}{4!}\phi_R^4 + \frac{1}{2}\delta_Z(\partial_m\phi_R)^2 - \frac{1}{2}\delta_m\phi_R^2 - \frac{\delta_\lambda}{4!}\phi_R^4 + \rho \qquad (4.190)$$
where we have added a constant ρ to cancel out any other infinite constants
we might encounter. This Lagrangian splits into two parts. We would
expect the first three terms to remain finite at all orders of perturbation
theory once we choose the constants (δZ , δm , δλ ) and ρ appropriately. What
is more, we will now do the perturbation theory with the renormalized
coupling constant λR , which makes a lot of physical sense.58 We already
know the Feynman diagrams corresponding to the first three terms of the
Lagrangian in Eq. (4.190). The three counter-terms involving (δZ , δm , δλ )
will generate new Feynman diagrams of very similar structure, as indicated
in Fig. 4.13.
With this preamble, let us again look at the four-point and two-point
functions of the theory. In the four-point function, we have the standard
diagrams which led to the results in Eq. (4.158) and Eq. (4.159). We will
now have to add to it the first diagram in Fig. 4.13, contributing a term
−iδλ. With a slight change in notation, the final amplitude in Eq. (4.158)
can be written as
$$iM = -i\lambda_R + (-i\lambda_R)^2\left[iB(s) + iB(t) + iB(u)\right] - i\delta_\lambda \qquad (4.191)$$
where
$$B(p^2) = \frac{i}{2}\int\frac{d^4k}{(2\pi)^4}\,\frac{i}{k^2 - m_R^2 + i\epsilon}\,\frac{i}{(p_1+p_2+k)^2 - m_R^2 + i\epsilon} \qquad (4.192)$$
58 Note, however, that at this stage we have not identified the renormalized parameters with the physical mass and the physical coupling constants, though they will be closely related.

Figure 4.13: The diagrams generated by the counter-terms and their algebraic equivalents, −iδ_λ and i(p²δ_Z − δ_m).
Eq. (4.170))
$$B(p^2) = -\frac{1}{32\pi^2}\int_0^1 dx\left[\frac{2}{\epsilon} + \log\frac{4\pi e^{-\gamma_E}\mu^2}{m_R^2 - x(1-x)p^2}\right] \qquad (4.193)$$
Substituting back, and doing a little bit of cleaning up, we get the final result to be
$$iM = -i\lambda_R + \frac{i\lambda_R^2}{32\pi^2}\int_0^1 dx\left[\frac{6}{\epsilon} + \log\frac{(4\pi e^{-\gamma_E}\mu^2)^3}{[m_R^2-x(1-x)s][m_R^2-x(1-x)t][m_R^2-x(1-x)u]}\right] - i\delta_\lambda \qquad (4.194)$$
The integral in this expression is divergent, but we can cancel the divergence
by choosing δλ appropriately. Since one could have subtracted any finite
quantity along with the divergent result, the finite part — after eliminating
the divergence — has some ambiguity. This ambiguity is usually resolved
by choosing what is known as a subtraction scheme which we will now
describe.
There are essentially three subtraction schemes which are used extensively in the literature. The first one, called minimal subtraction (MS), consists of cancelling out only the divergent part. This is done by choosing
$$\delta_\lambda = \frac{\lambda_R^2}{32\pi^2}\int_0^1 dx\,\frac{6}{\epsilon} = \frac{3\lambda_R^2}{16\pi^2}\,\frac{1}{\epsilon} \qquad (4.195)$$
thereby leading to a well defined final result given by
$$iM = -i\lambda_R + \frac{i\lambda_R^2}{32\pi^2}\int_0^1 dx\,\log\frac{(4\pi e^{-\gamma_E}\mu^2)^3}{[m_R^2-x(1-x)s][m_R^2-x(1-x)t][m_R^2-x(1-x)u]} \qquad (4.196)$$
The result, of course, depends on the arbitrary scale μ, but we know how
to handle this by letting λR run with the scale.
The second subtraction scheme is a slight variant of MS, usually called $\overline{\rm MS}$, in which one takes out the 4π and γE factors as well. This involves the choice
$$\delta_\lambda = \frac{\lambda_R^2}{32\pi^2}\int_0^1 dx\left[\frac{6}{\epsilon} + 3\log(4\pi e^{-\gamma_E})\right] = \frac{\lambda_R^2}{32\pi^2}\left[\frac{6}{\epsilon} + 3\log(4\pi e^{-\gamma_E})\right] \qquad (4.197)$$
leading to the final expression
$$iM = -i\lambda_R + \frac{i\lambda_R^2}{32\pi^2}\int_0^1 dx\,\log\frac{(\mu^2)^3}{[m_R^2-x(1-x)s][m_R^2-x(1-x)t][m_R^2-x(1-x)u]} \qquad (4.198)$$

59 This is not a priori equal to m_R; we will relate it to m_R in a moment.
The third possibility is to choose δ_λ such that λ_R is actually a physically relevant observable quantity. To do this, we choose what is called a renormalization point at which the coupling constant λ_R is actually measured in some experiment. One choice, for example, could be s₀ = 4m_P², t₀ = u₀ = 0, where m_P is called the pole mass.59 We then demand that, at the renormalization point, M should be equal to the observed value λ_R. This leads to the condition
$$-i\lambda_R = -i\lambda_R + \frac{i\lambda_R^2}{32\pi^2}\int_0^1 dx\left[\frac{6}{\epsilon} + \log\frac{(4\pi e^{-\gamma_E}\mu^2)^3}{m_R^4\left[m_R^2 - 4x(1-x)m_P^2\right]}\right] - i\delta_\lambda \qquad (4.199)$$
which can be solved for δ_λ. Substituting back, the result we now get is given by
$$iM = -i\lambda_R + \frac{i\lambda_R^2}{32\pi^2}\int_0^1 dx\,\log\frac{m_R^4\left[m_R^2 - 4x(1-x)m_P^2\right]}{[m_R^2-x(1-x)s][m_R^2-x(1-x)t][m_R^2-x(1-x)u]} \qquad (4.200)$$
Notice that, in this procedure — in contrast to the MS and $\overline{\rm MS}$ schemes — the final result is independent of the arbitrary scale μ. The result in Eq. (4.200) has a very direct physical meaning. In this expression, λ_R is defined to be the coupling constant measured in an experiment at the renormalization point characterized by the scale m_P. Once your experimentalist friend tells you what m_P and λ_R are, you can predict the value of M without any further arbitrariness.
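This last statement can be made concrete numerically: at the renormalization point s₀ = 4m_P², t₀ = u₀ = 0, the logarithm in Eq. (4.200) vanishes pointwise in x, so −M reduces exactly to λ_R. A Python sketch (illustrative parameter values, chosen so that all log arguments stay positive):

```python
import math

# At the renormalization point s0 = 4 m_P^2, t0 = u0 = 0, the ratio inside
# the log of Eq. (4.200) equals 1 for every x, so -M = lambda_R exactly.
# Illustrative values only (not from the text).
lam_R, m_R2, m_P2 = 0.3, 1.0, 0.2
s, t, u = 4 * m_P2, 0.0, 0.0              # the renormalization point

def log_integrand(x):
    num = m_R2 ** 2 * (m_R2 - 4 * x * (1 - x) * m_P2)
    den = ((m_R2 - x * (1 - x) * s) * (m_R2 - x * (1 - x) * t)
           * (m_R2 - x * (1 - x) * u))
    return math.log(num / den)

N, h = 200, 1.0 / 200                     # Simpson's rule for the x integral
simpson = log_integrand(0.0) + log_integrand(1.0)
for i in range(1, N):
    simpson += (4 if i % 2 else 2) * log_integrand(i * h)
simpson *= h / 3.0

minus_M = lam_R - lam_R ** 2 / (32 * math.pi ** 2) * simpson
assert abs(minus_M - lam_R) < 1e-12
```

Away from the renormalization point the same integral gives the full, finite, μ-independent amplitude.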
Let us next consider what happens to the two-point function in this approach. Here, we will stick with the first two diagrams in Fig. 4.12 and add to them the diagram arising from the counter-term (given by the second one in Fig. 4.13). Adding up the relevant contributions and using the standard dimensional regularization, we can compute the resulting amplitude to be
$$iM = \frac{i}{p^2-m_R^2+i\epsilon} + \frac{i}{p^2-m_R^2+i\epsilon}\left\{\frac{im_R^2\lambda_R}{32\pi^2}\left[\frac{2}{\epsilon} + 1 + \log\frac{4\pi e^{-\gamma_E}\mu^2}{m_R^2}\right] + i(p^2\delta_Z - \delta_m)\right\}\frac{i}{p^2-m_R^2+i\epsilon} \qquad (4.201)$$
We now have two parameters, δZ and δm , to play around with and remove
the divergences. As in the previous case, we need to choose a subtraction
scheme and the three schemes described earlier will again lead to three
different final expressions. First, in the MS scheme, we cancel out only the
divergent part, which can be achieved by the choice
$$\delta_Z = 0\,, \qquad \delta_m = \frac{m_R^2\lambda_R}{16\pi^2}\,\frac{1}{\epsilon} \qquad (4.202)$$
leading to the result
$$iM = \frac{i}{p^2-m_R^2+i\epsilon} + \frac{i}{p^2-m_R^2+i\epsilon}\,\frac{im_R^2\lambda_R}{32\pi^2}\left[1 + \log\frac{4\pi e^{-\gamma_E}\mu^2}{m_R^2}\right]\frac{i}{p^2-m_R^2+i\epsilon} \qquad (4.203)$$
The μ dependence of the result remains and has to be compensated by the running of m_R, as we have seen earlier. In the $\overline{\rm MS}$ scheme, we take out the 4π and γE as well, by choosing:
$$\delta_Z = 0\,, \qquad \delta_m = \frac{m_R^2\lambda_R}{32\pi^2}\left[\frac{2}{\epsilon} + \log(4\pi e^{-\gamma_E})\right] \qquad (4.204)$$
Exercise 4.8: Prove Eqs. (4.196) to
(4.200).
which leads to the final result
$$iM = \frac{i}{p^2-m_R^2+i\epsilon} + \frac{i}{p^2-m_R^2+i\epsilon}\,\frac{im_R^2\lambda_R}{32\pi^2}\left[1 + \log\frac{\mu^2}{m_R^2}\right]\frac{i}{p^2-m_R^2+i\epsilon} \qquad (4.205)$$
The third procedure will now involve interpreting m_R as the physical mass of the particle, which requires the propagator to have the structure
$$iM = \frac{i}{p^2-m_R^2+i\epsilon} + (\text{terms that are regular at } p^2 = m_R^2) \qquad (4.206)$$

Exercise 4.9: Prove Eqs. (4.202) to (4.208).

That is, we demand the final propagator to have a simple pole at m_R with a residue i. These conditions require the choice
$$\delta_Z = 0\,, \qquad \delta_m = \frac{m_R^2\lambda_R}{32\pi^2}\left[\frac{2}{\epsilon} + 1 + \log\frac{4\pi e^{-\gamma_E}\mu^2}{m_R^2}\right] \qquad (4.207)$$
thereby leading to the simple result where the pole mass is indeed m_R:
$$iM = \frac{i}{p^2-m_R^2+i\epsilon} \qquad (4.208)$$
Notice that we could do everything consistently while keeping δ_Z = 0. In other words, we do not have to perform any renormalization of the field strength φ at this order. This is not true at higher orders (see Problem 9), but with just the three counter-terms (δ_Z, δ_m, δ_λ) one can handle the divergences at all orders of perturbation theory. This is why the λφ4 theory is called renormalizable.
4.9 Mathematical Supplement

4.9.1 Leff from the Vacuum Energy Density
We have seen earlier that Leff can be thought of as the vacuum energy of
the field oscillators (see Eq. (2.71)). It is possible to use this interpretation
directly to obtain the effective Lagrangian for the electromagnetic field
along the following lines.
First, note that, in the case of a pure magnetic field, the H in Eq. (4.21) reduces to the harmonic oscillator Hamiltonian. Using the standard eigenvalues for harmonic oscillators, we can sum over the zero point energies to obtain Leff. Differentiating this expression twice with respect to m², summing up the resulting series and integrating again, one can identify the finite part, which will lead to the result in Eq. (4.31) with E = 0. The expression for the pure electric field can be obtained by noting that Leff must be invariant under the transformation E → iB, B → −iE. Getting the complete expression requires a little more work: It involves noting that the eigenvalues of H in Eq. (4.21) depend on E only through the combination $\tau = (qE)^{-1}\left[m^2 + qB(2n+1)\right]$ and have the form $E_0 = \sum_{n=0}^\infty (2qB)\,G(\tau)$. This expression can be summed up using Laplace transform techniques, thereby leading to the result in Eq. (4.31).
It is easy to obtain Leff for a pure magnetic field. Let us suppose that
the background field satisfies the conditions E · B = 0 and B2 − E2 > 0.
In such a case, the field can be expressed as purely magnetic in some
Lorentz frame. Let B = (0, B, 0); we choose the gauge such that Ai =
(0, 0, 0, −Bx). The Klein-Gordon equation
[(i∂μ − qAμ )2 − m2 ]φ = 0
(4.209)
can now be separated by taking
φ(t, x) = f (x) exp i(ky y + kz z − ωt).
(4.210)
where f (x) satisfies the equation
d²f/dx² + [ω² − (qBx − k_z)²] f = (m² + k_y²) f   (4.211)
This can be rewritten as

−d²f/dξ² + q²B²ξ² f = ε f   (4.212)

where

ξ = x − k_z/(qB);   ε = ω² − m² − k_y²   (4.213)
Eq. (4.212) is that of a harmonic oscillator with mass (1/2) and frequency 2(qB). So, if f(x) has to be bounded for large x, the energy must be quantized:

ε_n = 2(qB)(n + 1/2) = ω² − (m² + k_y²)   (4.214)
Therefore, the allowed set of frequencies is

ω_n = [m² + k_y² + 2qB(n + 1/2)]^{1/2}   (4.215)
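As a quick numerical aside (my own sketch, not part of the text), one can verify that the lowest Landau-level wavefunction f₀(ξ) = exp(−qB ξ²/2) solves Eq. (4.212) with eigenvalue ε₀ = 2qB(0 + 1/2) = qB, consistent with Eq. (4.214); the value of qB below is an arbitrary sample.

```python
import math

# My own numerical sketch (not from the book): check that the lowest
# Landau-level mode f0(xi) = exp(-qB xi^2/2) solves Eq. (4.212),
#   -f'' + (qB)^2 xi^2 f = eps f,  with eps_0 = 2 qB (0 + 1/2) = qB.
qB = 0.7      # arbitrary sample value (units with hbar = c = 1)
h = 1e-4      # step for the central second difference

def f0(xi):
    return math.exp(-qB * xi**2 / 2.0)

def lhs(xi):
    # -f'' + (qB)^2 xi^2 f, with f'' from a central difference
    fpp = (f0(xi + h) - 2.0*f0(xi) + f0(xi - h)) / h**2
    return -fpp + (qB**2) * xi**2 * f0(xi)

# the local eigenvalue estimate lhs/f0 should equal qB everywhere
ratios = [lhs(xi) / f0(xi) for xi in (-1.3, -0.4, 0.2, 0.9)]
max_err = max(abs(r - qB) for r in ratios)
```

The same check with f₁(ξ) = ξ f₀(ξ) would return 3qB, the n = 1 level.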
The ground state energy per mode is 2(ωn /2) = ωn because the complex
scalar field has twice as many degrees of freedom as a real scalar field. The
total ground state energy is given by the sum over all modes ky and n. The
weightage factor for the discrete sum over n in a magnetic field is obtained
by the correspondence:
∫ (dk_x/2π)(dk_y/2π) → (qB/2π) Σ_n ∫ (dk_y/2π)   (4.216)
Hence, the ground state energy is

E₀ = (qB/2π) Σ_{n=0}^∞ ∫_{−∞}^{+∞} (dk_y/2π) [(k_y² + m²) + 2qB(n + 1/2)]^{1/2} = −L_eff   (4.217)
This expression, as usual, is divergent. To separate out a finite part we will
proceed as follows: Consider the quantity

I ≡ −(2π/qB) ∂²E₀/∂(m²)² = (2π/qB) ∂²L_eff/∂(m²)²   (4.218)
Chapter 4. Real Life I: Interactions
which can be evaluated in the following manner:

I = (1/4) Σ_{n=0}^∞ ∫_{−∞}^{+∞} (dk_y/2π) [k_y² + m² + 2qB(n + 1/2)]^{−3/2}   (4.219)
  = (1/4π) Σ_{n=0}^∞ 1/[m² + 2qB(n + 1/2)] = (1/4π) Σ_{n=0}^∞ ∫_0^∞ dη e^{−η(m² + 2qB(n + 1/2))}
  = (1/4π) ∫_0^∞ dη e^{−ηm²} e^{−qBη}/(1 − e^{−2qBη})

Hence

(2π/qB) ∂²L_eff/∂(m²)² = (1/4π) ∫_0^∞ dη e^{−ηm²}/(e^{qBη} − e^{−qBη}) = (1/8π) ∫_0^∞ dη e^{−ηm²}/sinh(qBη)
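The resummation step used above, in which the sum over Landau levels closes into 1/(2 sinh qBη), can be checked numerically; this snippet is an illustration of mine, not from the book:

```python
import math

# My own check (not in the text) of the resummation used in Eq. (4.219):
# sum_{n>=0} exp[-2 qB eta (n + 1/2)] = e^{-qB eta}/(1 - e^{-2 qB eta})
#                                     = 1/(2 sinh(qB eta)).
qB, eta = 0.9, 0.6   # arbitrary positive sample values

series = sum(math.exp(-2.0*qB*eta*(n + 0.5)) for n in range(200))
closed = 1.0 / (2.0 * math.sinh(qB*eta))
err = abs(series - closed)
```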
60 This - and the subsequent expressions - have a divergence at the lower limit of integration. This divergence can be removed by subtracting the contribution with E = B = 0 as done in the main text. The integration with respect to m² also produces a term like (c₁m² + c₂) with two (divergent) integration constants c₁ and c₂. We have not displayed this term here; this divergence is also connected with the renormalization of L_eff.
The L_eff can be determined by integrating the expression twice with respect to m². We get⁶⁰

L_eff = (qB/(4π)²) ∫_0^∞ (dη/η²) e^{−ηm²}/sinh(qBη) = (1/(4π)²) ∫_0^∞ (dη/η³) e^{−ηm²} [qBη/sinh(qBη)]   (4.220)
If the Leff has to be Lorentz and gauge invariant then it can only depend
on the quantities (E 2 − B 2 ) and E · B. We will again define two constants
a and b by the relation
a2 − b 2 = E 2 − B 2 ;
ab = E · B
(4.221)
Then L_eff = L_eff(a, b). In the case of the pure magnetic field we are considering, a = 0 and b = B. Therefore, L_eff can be written in a manifestly invariant way as:

L_eff = (1/(4π)²) ∫_0^∞ (dη/η³) e^{−ηm²} [qbη/sinh(qbη)]   (4.222)
Because this form is Lorentz invariant, it must be valid in any frame in which E² − B² < 0 and E · B = 0. In all such cases,

L_eff = (1/(4π)²) ∫_0^∞ (dη/η³) e^{−ηm²} [qη√(B² − E²)/sinh(qη√(B² − E²))]   (4.223)
The L_eff for a pure electric field can be determined from this expression if we analytically continue the expression even for B² < E². We will find, for B = 0,

L_eff = (1/(4π)²) ∫_0^∞ (dη/η³) e^{−ηm²} [qηE/sin(qηE)]   (4.224)
The same result can be obtained by noticing that a and b are invariant
under the transformation E → iB, B → −iE. Therefore, Leff (a, b) must
also be invariant under these transformations: Leff (E, B) = Leff (iB, −iE).
This allows us to get Eq. (4.224) from Eq. (4.220).
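As a quick consistency check (mine, not the book's), the duality E → iB, B → −iE indeed exchanges the two proper-time factors qEη/sin(qEη) and qBη/sinh(qBη) appearing in Eqs. (4.224) and (4.220); the sample values of q, E, B and the proper time are arbitrary.

```python
import cmath

# My own numerical check (not in the text) that the integrand factor
# (qEs/sin qEs)(qBs/sinh qBs) is unchanged under E -> iB, B -> -iE.
def factor(E, B, q=0.3, s=0.8):       # q, s: arbitrary sample parameters
    return (q*E*s / cmath.sin(q*E*s)) * (q*B*s / cmath.sinh(q*B*s))

E, B = 1.1, 0.7
original = factor(E, B)
rotated  = factor(1j*B, -1j*E)        # the duality transformation
err = abs(original - rotated)
```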
We will now consider the general case with arbitrary E and B for which
a and b are not simultaneously zero. It is well known that by choosing our
Lorentz frame suitably, we can make E and B parallel, say along the y-axis. We will describe this field [E = (0, E, 0); B = (0, B, 0)] in the gauge
Ai = [−Ey, 0, 0, −Bx]. The Klein-Gordon equation becomes
[(i∂_μ − qA_μ)² − m²]φ = [(i ∂/∂t + qEy)² + ∂²/∂x² + ∂²/∂y² − (−i ∂/∂z + qBx)² − m²] φ = 0   (4.225)
Separating the variables by assuming

φ(t, x) = f(x, y) exp[−i(ωt − k_z z)]   (4.226)

we get

[∂²/∂x² + ∂²/∂y² + (ω + qEy)² − (k_z − qBx)²] f = m² f   (4.227)
which separates out into x and y modes. Writing

f(x, y) = g(x) Q(y)   (4.228)

where g(x) satisfies the harmonic oscillator equation

d²g/dx² − (k_z − qBx)² g = −2qB(n + 1/2) g   (4.229)
we get

d²Q/dy² + (ω + qEy)² Q = [m² + 2qB(n + 1/2)] Q   (4.230)

Changing to the dimensionless variable

η = √(qE) y + ω/√(qE)   (4.231)

we obtain

d²Q/dη² + η² Q = (1/qE) [m² + 2qB(n + 1/2)] Q.   (4.232)
To proceed further, we use a trick. The above expression shows that the
only dimensionless combination which occurs in the presence of the electric
field is τ = (qE)−1 (m2 + qB(2n + 1)). Thus, purely from dimensional
considerations, we expect the ground state energy to have the form
E₀ = Σ_{n=0}^∞ (2qB) G(τ)   (4.233)
where G is a function to be determined. Introducing the Laplace transform F of G, by the relation

G(τ) = ∫_0^∞ F(k) e^{−kτ} dk   (4.234)
we can write

L_eff = (2qB) Σ_{n=0}^∞ ∫_0^∞ dk F(k) exp[−(k/qE)(m² + qB(2n + 1))]   (4.235)

Summing the geometric series, we obtain

L_eff = 2(qB)(qE) ∫_0^∞ ds F(qEs) e^{−sm²} e^{−qBs} [1/(1 − e^{−2qBs})]
      = 2(qB)(qE) ∫_0^∞ ds F(qEs) e^{−sm²}/(e^{qBs} − e^{−qBs})
      = (qB)(qE) ∫_0^∞ ds F(qEs) e^{−sm²}/sinh(qBs)   (4.236)
We now determine F by using the fact that Leff must be invariant under
the transformation E → iB, B → −iE. This means that
L_eff = (qB)(qE) ∫_0^∞ ds e^{−m²s} F(iqBs)/(−sinh(iqEs))   (4.237)

Comparing the two expressions and using the uniqueness of the Laplace transform with respect to m², we get

F(qEs)/sinh(qBs) = −F(iqBs)/sinh(iqEs)   (4.238)

Or, equivalently,

F(qEs) sin(qEs) = F(iqBs) sin(iqBs)   (4.239)
Since each side depends only on either E or B alone, each side must be a
constant independent of E and B. Therefore
F(qEs) sin(qEs) = F(iqBs) sin(iqBs) = constant = A(s)   (4.240)

giving

L_eff = (qB)(qE) ∫_0^∞ ds e^{−m²s} A(s)/[sin(qEs) sinh(qBs)]   (4.241)
The A(s) can be determined by comparing this expression with, say,
Eq. (4.220) in the limit of E → 0. We have
L_eff(E = 0, B) = qB ∫_0^∞ (ds/s) e^{−m²s} A(s)/sinh(qBs) = (qB/(4π)²) ∫_0^∞ (ds/s²) e^{−m²s}/sinh(qBs)   (4.242)

implying

A(s) = 1/[(4π)² s]   (4.243)
Thus we arrive at the final answer

L_eff = (1/(4π)²) ∫_0^∞ (ds/s³) e^{−m²s} [qEs/sin(qEs)] [qBs/sinh(qBs)]   (4.244)
In the situation we are considering, E and B are parallel, making a² − b² = E² − B² and ab = E · B = EB. Therefore E = a and B = b. Thus, our result can be written in a manifestly invariant form as

L_eff(a, b) = (1/(4π)²) ∫_0^∞ (ds/s³) e^{−m²s} [qas/sin(qas)] [qbs/sinh(qbs)]   (4.245)

This result will now be valid in any gauge or frame, with a and b determined in terms of (E² − B²) and (E · B).
4.9.2 Analytical Structure of the Scattering Amplitude
In the text, we computed the φφ → φφ scattering amplitude to the one loop order, i.e., to O(λ²). There is a curious connection between the imaginary part of any scattering amplitude evaluated at one loop order and the
real part of the scattering amplitude at the tree-level. Since tree-level amplitudes do not have divergences, the amplitude we computed at one loop
level can have a divergence only in the real part. We will first prove this
connection which arises from the unitarity of the S-matrix and then briefly
describe how it relates to the analytical structure of the amplitude we have
computed.
Let us introduce the S-matrix somewhat formally and consider its structure. Given the fact that the states evolve according to the law |Ψ; t⟩ = e^{−iHt}|Ψ; 0⟩, we can identify the S-matrix operator for a finite time t as S = exp(−itH). Since the Hamiltonian is Hermitian with H† = H, we have the unitarity condition for the S-matrix given by S†S = 1. It is conventional to write S = 1 + iT, where the T matrix will be related to the amplitude of transition between an initial and a final state M(i → f) by an overall momentum conservation factor:

⟨f|T|i⟩ = (2π)⁴ δ⁴(p_i − p_f) M(i → f)   (4.246)

(This M, and hence the T, is what you would compute using, say, the Feynman diagrams.) While S is unitary, T satisfies a more complicated constraint, viz.,

i(T† − T) = T† T   (4.247)
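The constraint Eq. (4.247) is purely a consequence of unitarity and can be illustrated in a finite-dimensional toy setting (my sketch, not the book's): take any Hermitian H, build S = exp(−iH), and verify the identity for T = (S − 1)/i.

```python
# Toy check of Eq. (4.247): for any unitary S = exp(-iH) with H Hermitian,
# T = (S - 1)/i obeys i(T^dag - T) = T^dag T exactly.
H = [[0.5, 0.2 - 0.3j],
     [0.2 + 0.3j, -0.1]]          # sample 2x2 Hermitian matrix

def mat_mul(A, B):
    return [[sum(A[i][k]*B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def mat_comb(A, B, ca=1, cb=1):   # ca*A + cb*B
    return [[ca*A[i][j] + cb*B[i][j] for j in range(2)] for i in range(2)]

def dagger(A):
    return [[A[j][i].conjugate() for j in range(2)] for i in range(2)]

# S = exp(-iH) from its rapidly convergent power series
S = [[1, 0], [0, 1]]
term = [[1, 0], [0, 1]]
for n in range(1, 30):
    term = mat_mul(term, [[-1j*H[i][j]/n for j in range(2)] for i in range(2)])
    S = mat_comb(S, term)

I2 = [[1, 0], [0, 1]]
T = [[(S[i][j] - I2[i][j]) / 1j for j in range(2)] for i in range(2)]
Td = dagger(T)
lhs = mat_comb(Td, T, ca=1j, cb=-1j)   # i(T^dag - T)
rhs = mat_mul(Td, T)                   # T^dag T
err = max(abs(lhs[i][j] - rhs[i][j]) for i in range(2) for j in range(2))
```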
Using Eq. (4.247) and Eq. (4.246) we immediately obtain:

⟨f|i(T† − T)|i⟩ = i⟨i|T|f⟩* − i⟨f|T|i⟩ = i(2π)⁴ δ⁴(p_i − p_f) [M*(f → i) − M(i → f)]   (4.248)
Introducing a complete set of states, we can express the right hand side of Eq. (4.247) as:

⟨f|T†T|i⟩ = Σ_ψ ∫ dΓ_ψ ⟨f|T†|ψ⟩⟨ψ|T|i⟩ = Σ_ψ ∫ dΓ_ψ (2π)⁴ δ⁴(p_f − p_ψ) (2π)⁴ δ⁴(p_i − p_ψ) M(i → ψ) M*(f → ψ)   (4.249)
Therefore, we conclude that the unitarity of the S-matrix implies the relation
M(i → f) − M*(f → i) = i Σ_ψ ∫ dΓ_ψ (2π)⁴ δ⁴(p_i − p_ψ) M(i → ψ) M*(f → ψ)   (4.250)
which goes under the name generalized optical theorem because of historical
reasons. Notice that the left hand side is linear in the amplitudes M while
the right hand side is quadratic in the amplitudes. This relation must hold
order-by-order in any perturbation theory in some coupling constant, say,
λ. This implies that, at O(λ2 ), for example, the left hand side must be a
loop amplitude if it has to match a tree-level amplitude on the right hand
side.61
We will now compute the imaginary part of the scattering amplitude
and determine its analytical structure. In the process, we will also be
61 Therefore, the imaginary parts of loop amplitudes at a given order are related to products of tree-level amplitudes, and a theory without loops will be inconsistent within this formalism.

Figure 4.14: The result in diagrammatic form.
able to verify the above relation in this simple case — which, in terms of Feynman diagrams, is given in Fig. 4.14. Let us now get back to the explicit evaluation of F_fin(p²) in Eq. (4.173). We see that the integrand in Eq. (4.173) involves the logarithm of the factor m² − α(1 − α)p². We write this expression as

m² − α(1 − α)p² = m² + α(1 − α)|p|² − α(1 − α)p₀²   (4.251)

The sum of the first two terms is greater than or equal to m² for 0 ≤ α ≤ 1, while the last term is less than or equal to p₀². Therefore, the whole expression is greater than or equal to (m² − p₀²), making the argument of the logarithm positive in the original range 0 < p₀ < m which we considered and also, obviously, for p² < 0. The logarithm is well defined in this range and our aim now is to analytically continue the expression for all values of p₀. To do this when the argument of the logarithm becomes negative, we have to use the proper Riemann sheet to define the function in the entire complex plane. We will now show how this can be done.
We continue working in the Euclidean space (and have again dropped the subscript E in p²_E) and do an integration by parts to obtain

∫_0^1 dα ln[(m² + α(1 − α)p²)/4πμ²] = ln(m²/4πμ²) − p² ∫_0^1 dα α(1 − 2α)/[m² + α(1 − α)p²]   (4.252)
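A numerical spot-check (mine, not the text's) of the integration by parts in Eq. (4.252): the common 4πμ² inside both logarithms cancels between the two sides, so it is dropped below, and the sample Euclidean values of m² and p² are arbitrary.

```python
import math

# My own check (not in the book) of the integration by parts in Eq. (4.252):
#   int_0^1 da ln(m^2 + a(1-a)p^2)
#     = ln(m^2) - p^2 int_0^1 da a(1-2a)/(m^2 + a(1-a)p^2)
m2, p2 = 1.3, 0.9        # arbitrary sample Euclidean values
N = 20000

def midpoint(g):         # simple midpoint rule on [0, 1]
    return sum(g((i + 0.5)/N) for i in range(N)) / N

lhs = midpoint(lambda a: math.log(m2 + a*(1 - a)*p2))
rhs = math.log(m2) - p2 * midpoint(lambda a: a*(1 - 2*a) / (m2 + a*(1 - a)*p2))
err = abs(lhs - rhs)
```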
Introducing a Dirac delta function into the integrand by

1 = ∫_0^∞ ds′ δ(s′ − m²/[α(1 − α)])   (4.253)
we can write the integral in the form of a dispersion relation:

−∫_0^1 dα α(1 − 2α)/[m² + α(1 − α)p²] = ∫_0^∞ ds′ ρ(s′)/(s′ − s)   (4.254)
where

ρ(s′) = −∫_0^1 dα [(1 − 2α)/(1 − α)] δ(s′ − m²/[α(1 − α)]) = −(1/s′) ∫_0^1 dα α(1 − 2α) δ(α(1 − α) − m²/s′)   (4.255)
and we have set p2 = −s. Factorizing the quadratic form in the delta
function, it is easy to see that ρ(s′) is given by

ρ(s′) = 0 (s′ < 4m²);   ρ(s′) = (1/s′)√(1 − 4m²/s′) (s′ ≥ 4m²)   (4.256)
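The dispersion relation Eq. (4.254) with the spectral density of Eq. (4.256) can be verified numerically for a point s < 0, where both sides are finite; the substitution t = 4m²/s′ used below is my choice, not the book's, and maps ρ(s′) ds′/(s′ − s) into √(1 − t) dt/(4m² − st).

```python
import math

# My own numerical verification (not in the text) of Eq. (4.254) with the
# rho(s') of Eq. (4.256), at a sample point s < 0 (so p^2 = -s > 0).
m2, s = 1.0, -2.0
N = 50000

def midpoint(g):         # midpoint rule on (0, 1)
    return sum(g((i + 0.5)/N) for i in range(N)) / N

# left side: -int_0^1 da a(1-2a)/(m^2 + a(1-a)p^2), with p^2 = -s
lhs = -midpoint(lambda a: a*(1 - 2*a) / (m2 + a*(1 - a)*(-s)))
# right side after t = 4m^2/s': int_0^1 dt sqrt(1-t)/(4m^2 - s t)
rhs = midpoint(lambda t: math.sqrt(1 - t) / (4*m2 - s*t))
err = abs(lhs - rhs)
```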
Since s plays a role similar to m², the prescription m² → m² − iε is the same as an iε prescription on s. Using this, we can express our result in the form

F_fin(s) = (1/(4π)²) ln(m²e^{γ_E}/4πμ²) + (1/(4π)²) ∫_{4m²}^∞ ds′ [s ρ(s′)/(s − s′ + iε)]   (4.257)
4.9. Mathematical Supplement
187
We now see, using the result

Im [1/(x − iε)] = πδ(x)   (4.258)
for s > 4m², that the imaginary part of the amplitude is given by

Im F_fin(s) = −(s/16π) ρ(s) = −(1/16π)√(1 − 4m²/s)   (4.259)
It is easy to show, by explicit computation, that this expression can also be written in the form

Im F(p) = −(1/2) ∫ [d³k/((2π)³ 2E_k)] [d³q/((2π)³ 2E_q)] (2π)⁴ δ⁴(p − k − q) ≡ −(1/2) Γ⁽²⁾(p)   (4.260)
where Γ(2) (p) is the volume of the two-body phase space. From this relation, it is easy to verify that our general theorem relating the imaginary
part of the loop amplitude to the real part of tree-level amplitude does
hold.
Further, because of the analytic structure of the scattering amplitude,
Ffin is uniquely determined by: (i) its discontinuity across the cut, viz.,
2i ImFfin , (ii) its asymptotic behaviour and (iii) its value at some point.
The asymptotic behaviour is given by
lim_{|s|→∞} s ∫_{4m²}^∞ ds′ ρ(s′)/(s − s′) = lim_{|s|→∞} s ∫_{4m²}^∞ (ds′/s′) [1/(s − s′)] = ln s + const.   (4.261)
while the value at s = 0 is given by

F_fin(0) = (1/(4π)²) ln(m²e^{γ_E}/4πμ²)   (4.262)
A function which has all these properties is indeed given by:

F_fin(s) = (1/(4π)²) { ln(m²e^{γ_E}/4πμ²) − 2 + √(1 − 4m²/s) ln[(√(s − 4m²) + √s)/(√(s − 4m²) − √s)] }   (4.263)
which is, therefore, the required expression for the finite part of the scattering amplitude.
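Two defining properties of Eq. (4.263) can be spot-checked numerically (my check, not part of the text): the function g(s) below is the μ-independent part of F_fin, written via β = √(1 − 4m²/s); it should vanish as s → 0⁻, reproducing Eq. (4.262), and its imaginary part just above the cut should be −π√(1 − 4m²/s), matching Eq. (4.259) after division by (4π)².

```python
import cmath, math

# My own checks (not in the book) of Eq. (4.263).  With
# beta = sqrt(1 - 4m^2/s), the logarithm in Eq. (4.263) equals
# log[(beta+1)/(beta-1)], and we isolate the mu-independent part:
m2 = 1.0
def g(s):
    b = cmath.sqrt(1 - 4*m2/s)
    return -2 + b * cmath.log((b + 1) / (b - 1))

# (i) g -> 0 as s -> 0^-, so F_fin(0) reduces to Eq. (4.262)
near_zero = abs(g(-1e-6))

# (ii) just above the cut, Im g = -pi sqrt(1 - 4m^2/s),
#      i.e. (4 pi)^2 times Im F_fin in Eq. (4.259)
s0 = 9.0
im_g = g(s0 + 1e-9j).imag
expected = -math.pi * math.sqrt(1 - 4*m2/s0)
err = abs(im_g - expected)
```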
Exercise 4.10: Do this.
Chapter 5
Real Life II: Fermions and QED

5.1 Understanding the Electron
The formalism of quantum field theory we have developed in the previous
chapters allows us to think of (what we usually call) particles as quantum
excitations of (what we usually call) fields. The clearest example is that
of a photon which arises as an excitation of the electromagnetic field. We
have also discussed how we can describe the interaction of the particles,
say, e.g., their scattering, starting from a field theoretical description. We
did all these using a scalar field and the electromagnetic field as prototypes.
It turns out that we run into new issues when we try to describe
fermionic particles — the simplest and the most important one being the
electron — using this formalism. We need to cope with several new features and extend our formalism to cover these. The purpose of this —
rather long — last chapter is to orient you towards understanding these
issues and illustrate the (suitably extended) formalism in the case of quantum electrodynamics, which describes the quantum interaction of electrons
with photons.
The first issue in dealing with the electron, which hits you even in
NRQM, is that it is a spin-half particle. A spinless non-relativistic particle
can be studied using the Schrodinger equation iψ˙ = Hψ where |ψ(t, x)|2
will give you the probability to find the particle around the event (t, x).
But an electron could be found around this event with spin up (along, say,
the z−axis) or spin down or, more generally, in a linearly superposed spin
state. So, at the least, we need to describe it with a two component wave
function (ψ+ (t, x), ψ− (t, x)) corresponding to the spin up and spin down
states.
If you think of (ψ+ , ψ− ) as a column vector, then the Hamiltonian H
in the Schrodinger equation iψ˙ = Hψ probably should be some kind of a
2 × 2 matrix. What is more, the doublet (ψ+ , ψ− ) must have some specific
transformation properties when we rotate the coordinate systems, changing
the direction of the z−axis. (Such a non-trivial transformation of the wave
function was not required in the study of spinless particles in NRQM.)1
What we have said so far, in some sense, is just the kinematic aspect
of the spin. Spin also has a dynamical aspect in the sense that it carries
© Springer International Publishing Switzerland 2016
T. Padmanabhan, Quantum Field Theory, Graduate Texts in Physics, DOI 10.1007/978-3-319-28173-5_5
1 You might have learnt all these in a
QM course and might have even known
that the description of spin is quite
closely related to the 3-dimensional rotation group SO(3) or — more precisely — to SU (2). If not, don’t worry.
We will do all these and more as we go
along.
2 Mnemonic (with c = 1): The magnetic moment of a current loop is μ = JA where J is the current and A is the area of the loop. For an orbiting electron, J is the charge/orbital period; i.e. J = eω/2π and A = πr² with L = mωr². So you get μ = JA = (e/2π)(ωπr²) = (e/2m)L.
3 This behaviour of the fermions can
be understood from fairly sophisticated arguments of quantum field theory. Occasionally someone comes up
with a fraudulent “proof” of this result
within NRQM, which never stands up
to close scrutiny. No amount of your
staring at the relevant non-relativistic
Hamiltonian will allow you to discover
the Pauli exclusion principle. It is also
not like the relativistic corrections to a
non-relativistic Hamiltonian which can
be understood in some Taylor series in
v/c. The “strength” of the Pauli exclusion principle cannot be measured in
terms of some small parameter in the
problem which takes you from QFT to
QM continuously! Here is an O(1) effect which manifests itself in the approximate theory and is actually a relic
of the more exact theory. This is a peculiar situation with no other parallel
in theoretical physics. When a result
can be stated quite simply, but requires
very elaborate machinery for its proof,
it is usually because we have not understood the physics at an appropriately deep level.
with it a magnetic moment, which couples to any external magnetic field.
Observations tell us that the magnetic moment associated with the spin angular momentum s = ℏ/2 is given by μ = eℏ/2m = (e/2m)(2s). On
the other hand, the orbital angular momentum L of a charged particle
(like an electron) produces a magnetic moment2 given by (e/2m)L. So
the spin angular momentum couples with twice as much strength as the
orbital angular momentum. Why the electron spin should behave like an
angular momentum, and why it should produce a coupling with twice as
much strength are questions for which NRQM has no clear answer; they
just need to be taken as phenomenological inputs while constructing the
Hamiltonian for the electron in an electromagnetic field. As we shall see,
making things consistent with relativity helps in this matter to a great
extent.
The second dynamical aspect related to electron (and its spin) is a lot
more mysterious. To bring this sharply in focus, let us consider a system
with two electrons (like e.g., the Helium atom). Suppose you manage to
solve for the energy eigenstates of the (non-relativistic) Hamiltonian for this
system, obtaining a set of wave functions ψE (x1 , x2 ). (We have suppressed
the spin indices for the sake of simplicity.) Your experimental friend will
tell you that not all these solutions are realized in nature. Only those
solutions in which the wave function is antisymmetric with respect to the
exchange x1 ⇔ x2 occur in nature. This is a characteristic property of
all fermions (including the electron) which should also be introduced as an
additional principle3 (the Pauli exclusion principle) in NRQM.
You could easily see that the Pauli exclusion principle is going to throw a spanner into the works when we try to think of an electron as a one-particle excitation of a field, if we proceed along the lines we did for, say, the photon. In that approach, we start with a vacuum state |0⟩, and act on it by a creation operator a†_α to produce a one-particle state a†_α|0⟩ ∝ |1_α⟩, act on |0⟩ twice to produce a two-particle state a†_α a†_β|0⟩ ∝ |1_α, 1_β⟩ etc. Here, α, β etc. specify the properties of the excitations in that state like the spin, momentum etc. Since a†_α commutes with a†_β, these states are necessarily symmetric with respect to the interchange of particles; i.e., |1_α, 1_β⟩ = |1_β, 1_α⟩. In particular, one can take the limit of α → β leading to a state with two particles with the same properties (spin, momentum ...), by using (a†_α)²|0⟩ ∝ |2_α⟩.
This procedure simply won't do if we have to create electrons starting from the vacuum state. To begin with, we need |1_α, 1_β⟩ = −|1_β, 1_α⟩ to ensure antisymmetry. If we denote the creation operator for the electron by b†_α, then we must have b†_α b†_β = −b†_β b†_α. So the creation operators for the fermions must anti-commute rather than commute:

{b†_α, b†_β} = 0   (5.1)

In particular, this ensures that (b†_α)²|0⟩ = 0, so that you cannot create two electrons in the same state α, thereby violating the Pauli exclusion principle.
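The required anticommutation can be realized concretely in a two-mode toy Fock space; the Jordan-Wigner construction below is an illustration of mine (not something the text uses) in which Eq. (5.1) and (b†_α)² = 0 hold as exact matrix identities.

```python
# My own toy realization (not from the book) of Eq. (5.1): two fermionic
# modes represented on a 4-dimensional Fock space via Jordan-Wigner.
def kron(A, B):
    n, m = len(A), len(B)
    return [[A[i//m][j//m] * B[i % m][j % m] for j in range(n*m)]
            for i in range(n*m)]

def mul(A, B):
    n = len(A)
    return [[sum(A[i][k]*B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def add(A, B):
    n = len(A)
    return [[A[i][j] + B[i][j] for j in range(n)] for i in range(n)]

sp = [[0, 1], [0, 0]]      # single-mode raising operator
sz = [[1, 0], [0, -1]]
I2 = [[1, 0], [0, 1]]

b1d = kron(sp, I2)         # b_1^dagger
b2d = kron(sz, sp)         # b_2^dagger, with its Jordan-Wigner string

anti = add(mul(b1d, b2d), mul(b2d, b1d))   # {b_1^dag, b_2^dag}
sq   = mul(b1d, b1d)                       # (b_1^dag)^2

max_anti = max(abs(x) for row in anti for x in row)
max_sq   = max(abs(x) for row in sq for x in row)
```

Both matrices vanish identically, which is exactly the statement that two-fermion states are antisymmetric and doubly occupied states do not exist.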
A couple of operators anticommuting — rather than commuting — may
not disturb you too much, given the sort of strange things you have already
seen in QFT. But the situation is a bit more serious than that. Recall that
we build the field by multiplying the creation and annihilation operators
by functions like exp(±ipx). This form of the mode function is dictated
by translational invariance and the fact that we are using the momentum
basis; you need this mode function to describe particles with momentum
k and energy ωk ≡ (k2 + m2 )1/2 . So, for the field ψ(x) describing the
electron, you would expect an expansion which goes something like⁴

ψ(x) = Σ_α ∫ [d³k/√((2π)³ 2ω_k)] [b_{αk} u_{αk} e^{−ikx} + d†_{αk} v_{αk} e^{ikx}]   (5.2)
Here we have separated out the spin variable α from the momentum variable k and introduced a creation operator b†αk capable of creating an electron (with spin orientation α and momentum k) and an operator d†αk for
creating its antiparticle, viz., the positron. The c-number coefficients uαk
and vαk take care of the normalization and the correct Lorentz transformation properties in the presence of spin. (This is just kinematics — but
important kinematics — which we will describe in detail later on.) The
key point to note is that ψ(x) will inherit the anti-commuting properties
of b_{αk} and d†_{αk}. In fact, we would expect something like ψ²(x) = 0 to hold
for the field describing an electron, which is rather strange.⁵
We will now take up these issues and develop a field-theoretic formalism
for electrons. For this purpose, we will start by asking fairly innocuous
questions.
5.2 Non-relativistic Square Roots
4 This is exactly in analogy with what we did for the complex scalar field in Sect. 3.5, Eq. (3.148) and for the photon in Sect. 3.6.1, Eq. (3.172). For the complex scalar field, we did not have to worry about the spin, while for the photon we did not have to worry about a distinct antiparticle. For the electron, we need to introduce both complications.
5 Unlike the electromagnetic field, which definitely has a classical limit, or a scalar field, for which we can cook up a classical limit if push comes to shove, there is no way you are going to have a classical wave field the quanta of which will be the electrons — or any fermions for that matter. Fermions, in that sense, are rather strange beasts of which the world is made.
5.2.1 Square Root of pα pα and the Pauli Matrices
We have seen that the two key new features related to electrons involve (i)
the relation between its spin and magnetic moment and (ii) some kind of
inherent anticommutativity of the operators. There is a rather curious way
in which these ideas can be introduced even in NRQM which will help us
to understand the relativistic aspects better. With this motivation, we will
start by asking a question which, at first sight, has nothing to do with the
electron (or even with physics!).
Consider a 3-vector, say, the momentum p of a particle with components
pα in a Cartesian coordinate system. The magnitude of this vector |p| ≡
(−ηαβ pα pβ )1/2 scales6 linearly with its components in the sense that, if
you rescale all the components by a factor λ, the magnitude |p| changes
by the same factor λ. This prompts us to ask the question: Can we write
the magnitude as a linear combination of the components in the form |p| ≡
(−ηαβ pα pβ )1/2 = Aα pα with some triplet Aα that is independent of the
vector in question? For this result to hold, we must have 7
−η_αβ p^α p^β = A_α A_β p^α p^β = (1/2){A_α, A_β} p^α p^β   (5.3)
which requires the triplet Aα to satisfy the anti-commutation rule
{Aα , Aβ } = −2ηαβ = 2δαβ
(5.4)
Obviously, the Aα s cannot be ordinary numbers if they have to behave
in this way. The simplest mathematical structures you would have come
across, which will not commute under multiplication, are matrices. So
we might ask whether there exist three matrices Aα which satisfy these
relations. Indeed there are, as you probably know from your NRQM course,
6 With our signature, (−η_αβ) is just a fancy way of writing a unit 3 × 3 matrix.

7 In this discussion, we will not distinguish between the subscripts and superscripts of Greek indices corresponding to spatial coordinates when no confusion is likely to arise.
and these are just the standard Pauli matrices; that is, Aα = σα where, for
definiteness, we take them to be:

σ_x = (0, 1; 1, 0),   σ_y = (0, −i; i, 0),   σ_z = (1, 0; 0, −1)   (5.5)
Before proceeding further, let us pause and recall some properties of the
Pauli matrices which will be useful in our discussion.
(1) They satisfy the following identity: (Again, as far as Pauli matrices
are concerned, we will not worry about the placement of the index as
superscript or subscript; σα = σ α etc., by definition.)
σ_α σ_β = δ_αβ + i ε_αβγ σ^γ   (5.6)

8 Mnemonic: You know that J = σ/2 satisfies the standard angular momentum commutation rules [J_α, J_β] = i ε_αβγ J_γ, which explains the factor 2.

From which it follows that⁸:

[σ_α, σ_β] = 2i ε_αβγ σ^γ;   {σ_α, σ_β} = 2δ_αβ   (5.7)
(2) For any pair of vectors a, b we have:

(σ · a)(σ · b) = a · b + iσ · (a × b)   (5.8)

9 You would have seen a special case of this expressed as exp[−(iθ/2)(σ · n)] = cos(θ/2) − i(σ · n) sin(θ/2) in NRQM; but it is more general.
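The identities Eqs. (5.6) and (5.8) can be verified by brute force with the explicit matrices of Eq. (5.5); the following script (mine, not the book's) does exactly that, with arbitrary sample vectors a and b.

```python
# Brute-force check (my own, not in the text) of Eqs. (5.6) and (5.8)
# using the explicit Pauli matrices of Eq. (5.5).
sx = [[0, 1], [1, 0]]
sy = [[0, -1j], [1j, 0]]
sz = [[1, 0], [0, -1]]
sig = [sx, sy, sz]
I2 = [[1, 0], [0, 1]]

def mul(A, B):
    return [[sum(A[i][k]*B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def eps(a, b, c):              # Levi-Civita symbol for indices 0, 1, 2
    return (a - b) * (b - c) * (c - a) // 2

# Eq. (5.6): sigma_a sigma_b = delta_ab + i eps_abc sigma_c
err6 = 0.0
for a in range(3):
    for b in range(3):
        lhs = mul(sig[a], sig[b])
        rhs = [[(a == b)*I2[i][j]
                + sum(1j*eps(a, b, c)*sig[c][i][j] for c in range(3))
                for j in range(2)] for i in range(2)]
        err6 = max(err6, max(abs(lhs[i][j] - rhs[i][j])
                             for i in range(2) for j in range(2)))

# Eq. (5.8): (sigma.a)(sigma.b) = a.b + i sigma.(a x b)
av, bv = [0.3, -1.2, 0.5], [1.0, 0.4, -0.7]   # arbitrary sample vectors
def sdot(v):
    return [[sum(v[c]*sig[c][i][j] for c in range(3)) for j in range(2)]
            for i in range(2)]
cross = [av[1]*bv[2] - av[2]*bv[1],
         av[2]*bv[0] - av[0]*bv[2],
         av[0]*bv[1] - av[1]*bv[0]]
adotb = sum(x*y for x, y in zip(av, bv))
lhs8 = mul(sdot(av), sdot(bv))
rhs8 = [[adotb*I2[i][j] + 1j*sdot(cross)[i][j] for j in range(2)]
        for i in range(2)]
err8 = max(abs(lhs8[i][j] - rhs8[i][j]) for i in range(2) for j in range(2))
```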
(3) Any arbitrary function of Q ≡ a + b · σ can be reduced to a linear function of σ! That is, for any arbitrary⁹ function f(Q), we have

f(Q) = A + B · σ   (5.9)

with

A = (1/2)[f(a + b) + f(a − b)],   B = (b/2b)[f(a + b) − f(a − b)]   (5.10)
Getting back to our discussion of square roots, we see that the square
root of |p|2 can be thought of as the 2 × 2 matrix, p ≡ σα pα , in the sense
that (σα pα )(σβ pβ ) = |p|2 . Since the left hand side is a 2 × 2 matrix, we
need to interpret such relations as matrix identities, introducing the unit
2 × 2 matrix wherever required.
5.2.2 Spin Magnetic Moment from the Pauli Equation

10 Or rather, the Pauli equation, after Pauli who came up with this.
This might appear to be an amusing curiosity, but it will soon turn out
to be more than that. We know that the Hamiltonian for a free electron
is usually taken to be H = p2 /2m = (−i∂α )2 /2m. But with our enthusiasm for taking novel square roots, one could also write this free-particle
Hamiltonian and the resulting Schrodinger equation10 in the form:
H = (−iσ · ∇)(−iσ · ∇)/2m;   iψ̇ = [(−iσ^α ∂_α)(−iσ^β ∂_β)/2m] ψ   (5.11)
Because of the identity Eq. (5.8) and the trivial fact p × p = 0, it does not
matter whether you write p2 or (p · σ)(p · σ) in the Hamiltonian for a free
electron. But suppose we use this idea to describe — not a free electron —
but an electron in a magnetic field B = ∇ × A. We know that the gauge
invariance requires the corresponding Hamiltonian to be obtained by the
5.2. Non-relativistic Square Roots
193
replacement p → p − eA. So we now have to describe an electron using
the Hamiltonian
H_em = (1/2m) [σ · (p − eA)] [σ · (p − eA)]
     = (1/2m)(p − eA)² + (i/2m) σ · [(p − eA) × (p − eA)]
     = (1/2m)(p − eA)² − (e/2m) σ · B   (5.12)

where we have used the identity

(p × A) + (A × p) = −i(∇ × A)   (5.13)
The first term in the Hamiltonian in Eq. (5.12) is exactly what you would have obtained if you had used H = p²/2m as the Hamiltonian and made the replacement p → p − eA; that would have been the correct description of a spinless charged particle in a magnetic field. For a constant magnetic field B with A = (1/2)B × x, this term will expand to give

(1/2m)(p − eA)² = (1/2m)[p − (e/2)B × x]² ≈ p²/2m − (e/2m)(B × x) · p
                = p²/2m − (e/2m) B · L + O(B²)   (5.14)

with L = x × p being the orbital angular momentum. One can see that, to the linear order in the magnetic field, the orbital angular momentum contributes a magnetic moment m = (e/2m)L.
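The gauge potential A = (1/2)B × x used above can be checked to reproduce the constant field, ∇ × A = B, by finite differences; a small sketch of mine, with an arbitrary sample B.

```python
# My own finite-difference check (not in the text) that A = (1/2) B x x
# has curl A = B, i.e. it describes a constant magnetic field B.
B = [0.0, 0.0, 1.7]        # sample constant field along z

def cross(u, v):
    return [u[1]*v[2] - u[2]*v[1],
            u[2]*v[0] - u[0]*v[2],
            u[0]*v[1] - u[1]*v[0]]

def A(x):
    return [0.5*c for c in cross(B, x)]

def curlA(x, h=1e-5):
    def d(i, j):           # dA_i/dx_j by a central difference
        xp = list(x); xm = list(x)
        xp[j] += h; xm[j] -= h
        return (A(xp)[i] - A(xm)[i]) / (2*h)
    return [d(2, 1) - d(1, 2), d(0, 2) - d(2, 0), d(1, 0) - d(0, 1)]

err = max(abs(c - b) for c, b in zip(curlA([0.4, -1.1, 2.0]), B))
```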
But we have a surprise in store when we look at the second term in Eq. (5.12). This tells you that the spin s ≡ (ℏ/2)σ of the electron contributes a magnetic moment m = (e/m)s, making the coupling of the spin to the magnetic field twice as strong as that of the orbital angular momentum. When we take into account both the orbital and spin angular momentum, the relevant coupling term in the Hamiltonian will be given by¹¹

H_coupling = −(e/2m)(L + 2s) · B   (5.15)

This is precisely what is seen in spectroscopic observations. So our trick of taking the square root via √(p²) = p · σ and constructing the Hamiltonian from this produces the correct coupling of the electron spin to the magnetic field observed in the lab!
Exercise 5.1: Prove the following claim. One needs to be careful, in general, while computing (p − eA)², because A(x) and p won't commute. But when we use the gauge ∇ · A = 0, as we have done, A · p = p · A and we can be careless.
11 The ratio between the magnetic moment and the angular momentum is usually written as eg/2m, and we see that g = 1 for the orbital angular momentum while g = 2 for the electron spin. (Actually it is g = 2 + (α/π), but that part of the story comes later.) In normal units, the ratio is (e/2mc)g.
5.2.3 How does ψ Transform?
So, clearly, the way you take the square root of p2 changes physics and
it is worth looking more closely12 at the modification in Eq. (5.11). Our
modification of the Schrodinger equation will require interpreting ψ as a
two-component column vector ψA (called a spinor ) with A = 1, 2. The
first worry you should have with such an equation is what happens when
we rotate the coordinate system by the standard linear transformation
x^α → x′^α, where

x′^α = M_γ^α x^γ;   M_γ^α ≡ ∂x′^α/∂x^γ;   M_γ^α M_α^β = δ_γ^β   (5.16)
12 More precisely, we should be looking at Eq. (5.12), but this distinction is unimportant for what we are going to do.
which leaves x_α x^α invariant. For a general rotation, the form of the matrix M_γ^α is a bit complicated, but — to study the invariance properties — we can get away by just looking at infinitesimal rotations. Under such a rotation, by an infinitesimal angle θ around an axis represented by a unit vector n, the coordinates change according to x′ = x + (θ)n × x ≡ x + θ × x where θ ≡ θn. This can be written in component form as:

x′^α = x^α + [θ × x]^α = x^α + ε^αβγ θ_β x_γ ≡ x^γ [δ_γ^α + ω_γ^α] ≡ M_γ^α x^γ   (5.17)
where we have defined
ω_γα ≡ ε_γαβ θ^β;
13 Hopefully you know that it is best to think of rotations as "rotations in a plane". When D = 3, thinking of a rotation in the xy plane is equivalent to thinking of it as a rotation about the z-axis; but if you are in a D = 4 Euclidean space (x, y, z, w), you can still talk about rotations in the xy-plane but now you don't know whether it is a rotation about the z-axis or the w-axis! All these happen for the same reason as why you can define a cross product A × B giving another vector in D = 3 but not in other dimensions; it is related to the curious fact that 3 − 2 = 1.
14 Sanity check: If you start with a linear transformation for the components as ψ′₁ = aψ₁ + bψ₂; ψ′₂ = cψ₁ + dψ₂ (with 4 complex or 8 real parameters) you will find that the bilinear form ψ₁φ₂ − ψ₂φ₁ transforms to (ad − bc) times itself. This requires it to represent a spin-zero state and hence we must have (ad − bc) = 1. Further, the demand that |ψ₁|² + |ψ₂|² should be a scalar will require a = d*; b = −c*. These 1+4 conditions on the 8 parameters show that the linear transformation has 3 free parameters, exactly what a rotation has.
15 The right hand side is what you would write as σ′^α if the σ_α-s transformed like the components of a vector. The left hand side tells you that this is almost true, except for a similarity transformation by a matrix R.
θ^μ = (1/2) ε^μαβ ω_αβ   (5.18)
The antisymmetry of ωαβ is required for (and guarantees) |x|2 to remain
invariant under such a rotation.
Before proceeding further, we should confess to the crime we are committing by talking about “rotations about an axis”13 . To every rotation
described by three parameters (say, θα ≡ θnα with n2 = 1) we can also
associate a 3-dimensional antisymmetric tensor, ωαγ , defined in Eq. (5.18)
with three independent components, such that ω12 = θn3 etc. While θn3
makes you think of “rotation about the z-axis”, the ω12 makes you think
of “rotation in the xy-plane”, which is more precise and generalizes easily
to D = 4.
For this approach to give sensible results, we would like quantities like
σ α ∂α ψ to transform like ψ does; that is, the σ α should behave “like a
vector” under rotation. The operator ∂α does change under the rotation to ∂′β = (∂xα /∂x′β )∂α but the σ α , being a mere set of numbers, do not change. Further, we do expect the components of ψ to undergo a linear transformation14 of the form ψ′ = Rψ where R is a 2 × 2 matrix.
(Physically, ψ1 should give you the amplitude for a spin-up state along, say, the original z-axis which has now been rotated. So ψ better change
when we rotate coordinates.) The question is whether we can associate a
2 × 2 matrix R with each rotation such that σ α ∂α ψ → R[σ α ∂α ψ] under
that rotation.
We can do this if we can find a matrix R such that, for a general
rotation specified by a rotation matrix Mγ α in Eq. (5.16), the following
relation holds15 :
R−1 σ α R = Mγ α σ γ
(5.19)
On multiplying both sides of this equation by ∂′α and using Mγα ∂′α = (∂x′α /∂xγ )∂′α = ∂γ we immediately get

R−1 [σ α ∂′α ]R = σ γ ∂γ ;   σ α ∂′α = R[σ γ ∂γ ]R−1   (5.20)
Multiplying the second relation by ψ′ = Rψ, side by side, we get

σ α ∂′α ψ′ = R[σ γ ∂γ ψ]   (5.21)
which is what we wanted. In fact, all the powers of this operator, viz.
[σ γ ∂γ ]n , — especially the square, which occurs in H — transform in the
same manner. So the Pauli equation will indeed retain its form under
rotations.
All that we have to do now is to find an R such that Eq. (5.19) holds.
This is again easy to do by considering infinitesimal rotations (and ‘exponentiating’ the result to take care of finite rotations). If we write R = 1−iA
(where A is a first order infinitesimal matrix that needs to be determined),
we have, for any matrix Q, the result R−1 QR = Q + i[A, Q]. So the left
hand side of Eq. (5.19) is R−1 σ α R = σ α + i[A, σ α ]. The right hand side of Eq. (5.19), for an infinitesimal rotation, is (σ α + εαβγ θβ σ γ ). (This is the same as Eq. (5.17) with x replaced by σ.) So the A has to satisfy the condition [σ α , A] = i εαβγ θβ σ γ . From the first relation in Eq. (5.7), we immediately see that the choice A = (1/2)σα θα = (1/2)θ · σ satisfies this relation! So
we have succeeded in finding an R such that Eq. (5.19) holds:
R = 1 − (iθ/2)(σ · n)
(5.22)
The result for rotation by a finite angle is given by the usual exponentiation:

R(θ) = exp[−(iθ/2)(σ · n)] = cos(θ/2) − i(σ · n) sin(θ/2)   (5.23)
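Both properties of R claimed around Eq. (5.19) and Eq. (5.23) are easy to confirm numerically. The following is a quick sketch (not from the text) for a rotation about the z-axis, where R is diagonal:

```python
import cmath, math

# R = exp[-(i theta/2) sigma_3] conjugates sigma^1 into
# cos(theta) sigma^1 - sin(theta) sigma^2 (Eq. (5.19) for this rotation),
# and a full 2*pi rotation gives R = -1: the famous spinor sign flip.

def mmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def close(A, B, tol=1e-12):
    return all(abs(A[i][j] - B[i][j]) < tol for i in range(2) for j in range(2))

s1, s2 = [[0, 1], [1, 0]], [[0, -1j], [1j, 0]]

def R_z(theta):
    # diagonal in the sigma_3 basis: dia(exp(-i theta/2), exp(+i theta/2))
    return [[cmath.exp(-1j * theta / 2), 0], [0, cmath.exp(1j * theta / 2)]]

theta = 0.83
lhs = mmul(R_z(-theta), mmul(s1, R_z(theta)))          # R^{-1} sigma^1 R
rhs = [[math.cos(theta) * s1[i][j] - math.sin(theta) * s2[i][j]
        for j in range(2)] for i in range(2)]
assert close(lhs, rhs)

# vectors are blind to a 2*pi rotation; spinors are not
assert close(R_z(2 * math.pi), [[-1, 0], [0, -1]])
```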
In the process we have discovered how the two components of the spinor ψA transform under this rotation: ψ′ = Rψ (interpreted as a matrix equation). Since ψ′ = Rψ looks like x′ = M x, you might think spinors are ‘like vectors’; well, it is true, but only to a limited extent. If you rotate about
any axis by 2π, vectors — and most of the stuff in the lab — do not change
but spinors flip sign, thanks to the θ/2 in R! Clearly, something peculiar
is going on and we will come back to this issue later while discussing the
Lorentz group.16
This transformation law ψ′A = RA B ψB for the two-component spinors ψA could in fact be taken as the definition of a 2-spinor. This will demand the complex conjugate to transform as ψ∗ → R∗ ψ∗ but the probability density |ψ1 |2 + |ψ2 |2 = ψ† ψ (summed over the spin states) should be a scalar under rotations. This is assured because σ† = σ and R† = R−1 , making (ψ† ψ)′ = (ψ† R† )(Rψ) = (ψ† R−1 )(Rψ) = (ψ† ψ).
As we said before it is a crime to think of “rotations about an axis” but
that issue is easily remedied. We have
σα θα = σα (1/2) εαβγ ωβγ = (1/2) εαβγ σα ωβγ = −(i/4) [σ β , σ γ ] ωβγ   (5.24)
which allows us to write R in Eq. (5.23) as

R = exp[−(i/2) ωαβ S αβ ] ;   S αβ ≡ −(i/4) [σ α , σ β ]   (5.25)
Instead of thinking of σα as being associated with rotations “about the xα -axis”, we now think of S βγ as being associated with rotations “in the βγ plane”, which is a much better way to think of rotations.17 After this preamble, let us move along and see what relativity does to all of this.
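That the two forms of R in Eq. (5.23) and Eq. (5.25) agree can be checked by comparing their exponents directly, since S αβ is built from commutators of Pauli matrices. A small sketch (not from the text):

```python
# With S^{ab} = -(i/4)[sigma_a, sigma_b] and omega_ab = eps_abc theta_c,
# the exponent -(i/2) omega_ab S^{ab} of Eq. (5.25) should equal the
# exponent -(i/2) theta.sigma of Eq. (5.23).

sig = ([[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]])

def mmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def comb(*terms):
    """Linear combination of 2x2 matrices given as (coefficient, matrix) pairs."""
    return [[sum(c * M[i][j] for c, M in terms) for j in range(2)] for i in range(2)]

def eps3(a, b, c):
    return (a - b) * (b - c) * (c - a) // 2

# S^{ab} = -(i/4)[sigma_a, sigma_b]
S = [[comb((-0.25j, mmul(sig[a], sig[b])), (0.25j, mmul(sig[b], sig[a])))
      for b in range(3)] for a in range(3)]

theta = [0.2, -0.5, 0.9]
omega = [[sum(eps3(a, b, g) * theta[g] for g in range(3)) for b in range(3)]
         for a in range(3)]

exp_from_S = comb(*[(-0.5j * omega[a][b], S[a][b])
                    for a in range(3) for b in range(3)])
exp_direct = comb(*[(-0.5j * theta[a], sig[a]) for a in range(3)])
assert all(abs(exp_from_S[i][j] - exp_direct[i][j]) < 1e-12
           for i in range(2) for j in range(2))
```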
5.3 Relativistic Square Roots

5.3.1 Square Root of pa pa and the Dirac Matrices

16 There is a simple connection between the factor −1 arising from a 2π rotation and the antisymmetry of the wave functions describing a pair of electrons. To see this, consider two electrons located at x and −x with spins along the z-axis. Such a state is represented by the product of two spinors Ψ12 ≡ ψup (x)ψup (−x). If we now perform a rotation around the z-axis by π, we will end up interchanging the two particles without affecting their spins. Such a rotation will introduce a factor R = −i for each spinor so that Ψ12 → −Ψ21 . Therefore, the wave function picks up a factor −1 when the particles are interchanged.
The reason we spent so much time with the “curiosity” p2 = (p·σ)(p·σ) is because it generalizes rather trivially to the relativistic context. You just
go from D = 3 to D = 4, the rotation group to the Lorentz group and Pauli
matrices to Dirac matrices! Let us see how.
17 A precise characterization of what
the phrase “being associated with”
means will require some group theory.
We will do it in Sect. 5.4.
18 Notation alert: We treat the index
a of γ a like a four-vector index so that
we can define γa ≡ ηab γ b etc. Purely
algebraically, we will then have γ a pa =
γb pb etc., which is convenient.
Chapter 5. Real Life II: Fermions and QED
In the relativistic case, we would have loved to write down some kind of “Schrodinger equation” of the form iφ̇ = Hφ with H given by √(p2 + m2 ) (see Sect. 1.4.6). Taking a cue from the previous analysis, one would like to express this square root as a linear function of p and m in the form (αm + β · p). This will lead to what we want but there is a more elegant way of doing it, maintaining formal relativistic invariance, which is as follows.
We know that the content of the wave equation describing the relativistic free particles is given by (p2 − m2 )ψ(x) = 0 with pa ≡ i∂a . Our aim now is to think of the square root √(p2 ), in the relativistic case, as being given by γ i pi where the four entities γ i need to be determined.18 Then we can write the relation √(p2 ) = m as γ a pa = m; so our wave equation can now take the form:

(γ a pa − m)ψ ≡ (iγ a ∂a − m)ψ = 0   (5.26)
(5.26)
This is quite straightforward and we play the same game which led to
Eq. (5.3) but now in 4-dimensions. We need
η ij pi pj = (γ i pi )(γ j pj ) = (γ i γ j )pi pj = (1/2) {γ i , γ j } pi pj   (5.27)
As in the case of Eq. (5.4), we now need to discover a set of four anticommuting matrices which satisfy the condition
{γ i , γ j } = 2η ij   (5.28)
Exercise 5.2: Do it yourself!
19 Very imaginatively called gamma
matrices.
Though we will not bother to prove it, you cannot satisfy these relations
with 2×2 or 3×3 matrices. The smallest dimension in which the relativistic
square root works is in terms of 4 × 4 matrices. One useful set of matrices19
which satisfy Eq. (5.28) is given by
γ 0 = [0  1 ; 1  0] ,   γ α = [0  σ α ; −σ α  0]   (5.29)
(This particular choice of matrices which obey Eq. (5.28) is called the Weyl
representation). This can be written more compactly as
γ m = [0  σ m ; σ̄ m  0]   (5.30)
where we have upgraded the Pauli matrices to 4-dimensions with the notation:

σ m = (1, σ α ) ,   σ̄ m = (1, −σ α )   (5.31)
Here, σ α are the standard Pauli matrices and 1 stands for the unit matrix.
As you might have guessed, the form of γ-matrices is far from unique and
which set you choose to use depends on what you want to do. Another set
(called the Dirac representation) which we will occasionally use is given by:
γ 0 = [1  0 ; 0  −1] ,   γ α = [0  σ α ; −σ α  0]   (5.32)
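The anticommutation relations of Eq. (5.28) can be verified by brute force for both sets of matrices. A self-contained sketch in plain Python (not the book's code; 4 × 4 matrices are nested lists assembled from 2 × 2 blocks):

```python
# Verify {gamma^i, gamma^j} = 2 eta^{ij} for the Weyl matrices of Eq. (5.29)
# and the Dirac-representation matrices of Eq. (5.32).

def mmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def block(A, B, C, D):
    """Assemble a 4x4 matrix from four 2x2 blocks [[A, B], [C, D]]."""
    return [A[i] + B[i] for i in range(2)] + [C[i] + D[i] for i in range(2)]

def neg(M):
    return [[-x for x in row] for row in M]

I2, Z2 = [[1, 0], [0, 1]], [[0, 0], [0, 0]]
sig = ([[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]])
eta = [1, -1, -1, -1]

weyl = [block(Z2, I2, I2, Z2)] + [block(Z2, s, neg(s), Z2) for s in sig]
dirac = [block(I2, Z2, Z2, neg(I2))] + [block(Z2, s, neg(s), Z2) for s in sig]

def clifford_ok(gam):
    """Check {gamma^i, gamma^j} = 2 eta^{ij} times the 4x4 unit matrix."""
    for i in range(4):
        for j in range(4):
            gij, gji = mmul(gam[i], gam[j]), mmul(gam[j], gam[i])
            for r in range(4):
                for c in range(4):
                    want = 2 * eta[i] if (i == j and r == c) else 0
                    if abs(gij[r][c] + gji[r][c] - want) > 1e-12:
                        return False
    return True

assert clifford_ok(weyl) and clifford_ok(dirac)
```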
It is convenient, for future purposes, to define the commutators of the
gamma matrices by the relation:
σ mn ≡ (i/2) [γ m , γ n ]   (5.33)
In the Weyl representation, the explicit form of these commutators is given by

σ 0α = −i [σ α  0 ; 0  −σ α ] ,   σ αβ = εαβκ [σ κ  0 ; 0  σ κ ]   (5.34)

So, in this representation, σ αβ is just a block diagonal matrix built from Pauli matrices; e.g., σ 12 = dia(σ 3 , σ 3 ).
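The block forms in Eq. (5.34) follow directly from the definition σ mn = (i/2)[γ m , γ n ]; a quick numerical sketch (not from the book) confirming them in the Weyl representation:

```python
# Compute sigma^{mn} = (i/2)[gamma^m, gamma^n] in the Weyl representation and
# check the block forms of Eq. (5.34).

def mmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def block(A, B, C, D):
    return [A[i] + B[i] for i in range(2)] + [C[i] + D[i] for i in range(2)]

def scale(c, M):
    return [[c * x for x in row] for row in M]

I2, Z2 = [[1, 0], [0, 1]], [[0, 0], [0, 0]]
sig = ([[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]])
gam = [block(Z2, I2, I2, Z2)] + [block(Z2, s, scale(-1, s), Z2) for s in sig]

def sigma_mn(m, n):
    AB, BA = mmul(gam[m], gam[n]), mmul(gam[n], gam[m])
    return [[0.5j * (AB[r][c] - BA[r][c]) for c in range(4)] for r in range(4)]

def close(A, B):
    return all(abs(A[r][c] - B[r][c]) < 1e-12 for r in range(4) for c in range(4))

# sigma^{0 alpha} = -i dia(sigma_alpha, -sigma_alpha)
for a in (1, 2, 3):
    s = sig[a - 1]
    assert close(sigma_mn(0, a), block(scale(-1j, s), Z2, Z2, scale(1j, s)))

# sigma^{12} = dia(sigma_3, sigma_3)
assert close(sigma_mn(1, 2), block(sig[2], Z2, Z2, sig[2]))
```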
So we have successfully taken a relativistic square root and (given a set
of γ-s), we can explicitly write down the equation Eq. (5.26). We will now
enquire what an equation like Eq. (5.26) could possibly mean and how it behaves if we perform a rotation (which we needed to worry about even
in D = 3) or a Lorentz boost (which is a D = 4 feature!).
5.3.2 How does ψ Transform?
To begin with, the ψ in Eq. (5.26) is a 4-component20 object ψα with
α = 1, 2, 3, 4 which is acted upon by the 4 × 4 matrix from the left. Since
we are already familiar with a 2-component spinor describing the electron
from the previous section, let us write the column vector ψ in terms of a
pair of quantities ψL and ψR , each having two components, and investigate
this pair.21 That is, we write
ψ = [ψL ; ψR ]   (5.35)

20 These, of course, are not spacetime indices.

21 The subscripts L and R will not make sense to you at this stage but will become clearer later on; right now, just think of them as labels.
If we expand out the 4 × 4 matrix equation Eq. (5.26) — which we will
hereafter call the Dirac equation, giving credit where credit is due — into
a pair of 2 × 2 matrix equations, we will get:
(i∂t + iσ · ∇) ψR = mψL ;
(i∂t − iσ · ∇) ψL = mψR
(5.36)
This shows that we are indeed dealing with a pair of 2-component objects
coupled through the mass term; when m = 0 they separate into a pair of
uncoupled equations. Our next task is to figure out what happens to this
equation under rotations and Lorentz boosts.
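Before doing so, it is worth seeing what the decoupled m = 0 equations say. With a plane wave ψ ∝ exp(−iEt + ip · x), the equation (i∂t + iσ · ∇)ψR = 0 becomes (E − σ · p)ψR = 0 with E = |p|, so ψR must be a +1 eigenspinor of the helicity operator σ · p̂ (and ψL a −1 eigenspinor). A quick numerical sketch of this standard fact (the spinor parametrization below is an illustration, not from the text):

```python
import math, cmath

sig = ([[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]])

def helicity(p):
    """The operator sigma . p_hat as a 2x2 matrix."""
    norm = math.sqrt(sum(x * x for x in p))
    return [[sum(sig[a][i][j] * p[a] for a in range(3)) / norm
             for j in range(2)] for i in range(2)]

th, ph = 0.7, 1.9   # direction of p in spherical angles
p = [math.sin(th) * math.cos(ph), math.sin(th) * math.sin(ph), math.cos(th)]
h = helicity(p)

# spin "up" and "down" spinors along p_hat
chi_plus = [math.cos(th / 2), cmath.exp(1j * ph) * math.sin(th / 2)]
chi_minus = [-math.sin(th / 2), cmath.exp(1j * ph) * math.cos(th / 2)]

apply = lambda M, v: [sum(M[i][j] * v[j] for j in range(2)) for i in range(2)]

# helicity +1 for chi_plus (the psi_R candidate), -1 for chi_minus (psi_L)
assert all(abs(apply(h, chi_plus)[i] - chi_plus[i]) < 1e-12 for i in range(2))
assert all(abs(apply(h, chi_minus)[i] + chi_minus[i]) < 1e-12 for i in range(2))
```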
Rotations are easy. If we rewrite Eq. (5.36) in the form:
i(σ · ∇)ψR = mψL − i∂t ψR ;
−i(σ · ∇)ψL = mψR − i∂t ψL
(5.37)
the only non-trivial term affected by rotations is σ · ∇ = σ α ∂α . But
we already know from the previous section how this term changes under
rotations! We see that the Eq. (5.37) will be form-invariant under rotations
if ψR and ψL transform as the standard 2-spinors we introduced in the
previous section. That is, under pure rotations,
ψL → RψL ;
ψR → RψR
(5.38)
with (see Eq. (5.25)):
R = exp(−iθ · σ/2) = exp[−(i/2) ωαβ S αβ ]   (5.39)
where S αβ ≡ −(i/4)[σ α , σ β ]. This solves part of the mystery: Our ψ is
actually made of a pair of objects each of which transforms as an ordinary
2-spinor as far as spatial rotations go.22
22 Once again, there is a peculiar sign
flip on the 2π rotation which we need
to get back to later on.
Of course, we now have to ensure invariance under the full Lorentz
transformations — not just spatial rotations. This is again very easy because the idea is essentially the same as the one we encountered in the
last section. Recall that the rotational invariance worked because, for a rotation described by xα → x′α ≡ Mγα xγ , we could find a matrix R such
that
R−1 σ α R = Mγ α σ γ
(5.40)
23 We will continue to call this transformation ‘Lorentz transformation’,
even though it includes rotations as a
subset; once we do this, we will be able
to reproduce — and check — the previous results for rotations as a special
case.
holds. This immediately implies that R(σ · ∇)R−1 = (σ · ∇′ ) and we could keep the equation invariant by postulating ψ → Rψ.
Now we have to consider the more general class of transformations xa → x′a ≡ Lab xb representing both rotations and Lorentz transformations, which keep xa xa invariant.23 The infinitesimal version is given by x′a ≡ [δ ab + ω ab ]xb where the six parameters of the antisymmetric tensor
ωab = −ωba now describe ‘rotations’ in the six planes (xy, yz, zx, tx, ty, tz);
i.e., three rotations and 3 boosts. Everything will work out fine, if we can
find a 4 × 4 “rotation matrix” R such that
R−1 γ m R = Lmn γ n = (∂x′m /∂xn ) γ n   (5.41)

24 Make sure you appreciate the smooth transition 3 → 4. Here we have merely retyped Eq. (5.20), changing Pauli matrices to Dirac matrices, R → R and Greek indices to Latin indices!

[Compare with Eq. (5.19).] As before, multiplying by ∂′m will lead24 to the results:

R−1 [γ a ∂′a ]R = γ c ∂c ;   γ a ∂′a = R[γ c ∂c ]R−1   (5.42)
If we now postulate that ψ transforms under the Lorentz transformation as

ψ′ = Rψ   (5.43)

we can again multiply equations Eq. (5.42) and Eq. (5.43) side by side, to obtain

γ a ∂′a ψ′ = R[γ c ∂c ψ]   (5.44)
25 This is again essentially Eq. (5.25)
retyped with R → R, Pauli matrices
to Dirac matrices and Greek → Latin,
except for a sign flip in S αβ compared
to S mn . This is just to take care of
the fact that [γ α , γ β ] for spatial indices is dia (−[σα , σβ ], −[σα , σβ ]) with
a minus sign, because of the way we
have defined the γ-matrices. So the
minus sign in the definition of S mn
in Eq. (5.45) vis-a-vis the definition of
S αβ in Eq. (5.25) ensures that when
we consider pure rotations, we will get
back the previous results.
Exercise 5.3: Prove the claim. (Hint: First prove [γ l , S mn ] = (i/2)[γ l , γ m γ n ] = i[η lm γ n − η ln γ m ]; then, if you use the infinitesimal version of the Lorentz transformation, it is straightforward.)
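The identity in the hint can also be checked by brute force in the Weyl representation before proving it. A sketch (not the book's code):

```python
# With S^{mn} = (i/4)[gamma^m, gamma^n], verify
# [gamma^l, S^{mn}] = i(eta^{lm} gamma^n - eta^{ln} gamma^m) for all l, m, n.

def mmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def block(A, B, C, D):
    return [A[i] + B[i] for i in range(2)] + [C[i] + D[i] for i in range(2)]

def scale(c, M):
    return [[c * x for x in row] for row in M]

def comm(A, B):
    AB, BA = mmul(A, B), mmul(B, A)
    return [[AB[r][c] - BA[r][c] for c in range(4)] for r in range(4)]

I2, Z2 = [[1, 0], [0, 1]], [[0, 0], [0, 0]]
sig = ([[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]])
eta = [1, -1, -1, -1]
gam = [block(Z2, I2, I2, Z2)] + [block(Z2, s, scale(-1, s), Z2) for s in sig]

def S(m, n):
    return scale(0.25j, comm(gam[m], gam[n]))

ok = True
for l in range(4):
    for m in range(4):
        for n in range(4):
            lhs = comm(gam[l], S(m, n))
            c_n = 1j * (eta[l] if l == m else 0)    # i eta^{lm}
            c_m = -1j * (eta[l] if l == n else 0)   # -i eta^{ln}
            rhs = [[c_n * gam[n][r][c] + c_m * gam[m][r][c]
                    for c in range(4)] for r in range(4)]
            ok = ok and all(abs(lhs[r][c] - rhs[r][c]) < 1e-12
                            for r in range(4) for c in range(4))
assert ok
```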
This will keep the Dirac equation invariant under a Lorentz transformation.
So, if we find a matrix R that satisfies Eq. (5.41), we are through; as a bonus
we find how ψ transforms under Lorentz transformations. (And since these
include rotations as a special case, we can check out the previous results.)
Can we find such an R? Very easy. You can again write R = 1 − iA
and play the same game we did to arrive at Eq. (5.23). But since the R for
3-D rotations is given by Eq. (5.23) with 3 rotational parameters in ωαβ
and S αβ is made from the commutators of the Pauli matrices, it is a good
guess25 to try:
R ≡ exp[−(i/2) ωmn S mn ] ;   S mn ≡ (i/4) [γ m , γ n ]   (5.45)
You can now verify that, with the ansatz in Eq. (5.45), the condition in
Eq. (5.41) holds, which in turn implies Eq. (5.42). So, the Dirac equation
will be Lorentz invariant provided the ψ transforms as:
ψ → Rψ = exp[−(i/2) ωmn S mn ] ψ   (5.46)
What does this transformation mean? We already know that under pure rotations, the ψL and ψR of which ψ is made transform in an identical manner, just like non-relativistic two-spinors. To see what happens
under Lorentz transformations, let us consider a pure boost with ωαβ =
0, ω0α ≡ ηα so that we can write (i/2)ωmn S mn = iω0α S 0α and use the
fact that (in the Weyl representation) S 0α = −(i/2)dia(σ α , −σ α ) to obtain
(i/2)ωmn S mn = (1/2)dia(η · σ, −η · σ). Substituting this into the transformation law Eq. (5.46) and using the block-diagonal nature of the matrices,
we find that ψL and ψR transform differently under a Lorentz boost:
ψL → RL ψL ≡ exp[−(1/2) η · σ] ψL ;   ψR → RR ψR ≡ exp[(1/2) η · σ] ψR   (5.47)
Thus, under pure rotations, ψL and ψR transform identically26 but they
transform differently under Lorentz boost.
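The block-diagonal structure quoted just before Eq. (5.47) is itself a two-line computation with the Weyl matrices; here is a numerical sketch (the rapidity vector is an arbitrary demo value, not from the text):

```python
# For a pure boost, (i/2) omega_mn S^{mn} = i eta_a S^{0a} should reduce to
# (1/2) dia(eta.sigma, -eta.sigma), using S^{0a} = (i/4)[gamma^0, gamma^a].

def mmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def block(A, B, C, D):
    return [A[i] + B[i] for i in range(2)] + [C[i] + D[i] for i in range(2)]

def scale(c, M):
    return [[c * x for x in row] for row in M]

I2, Z2 = [[1, 0], [0, 1]], [[0, 0], [0, 0]]
sig = ([[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]])
gam = [block(Z2, I2, I2, Z2)] + [block(Z2, s, scale(-1, s), Z2) for s in sig]

def S0(a):
    AB, BA = mmul(gam[0], gam[a]), mmul(gam[a], gam[0])
    return [[0.25j * (AB[r][c] - BA[r][c]) for c in range(4)] for r in range(4)]

eta_vec = [0.3, -0.4, 0.8]    # arbitrary demo rapidity vector (an assumption)

M = [[sum(1j * eta_vec[a - 1] * S0(a)[r][c] for a in (1, 2, 3))
      for c in range(4)] for r in range(4)]

half = [[0.5 * sum(eta_vec[a] * sig[a][i][j] for a in range(3))
         for j in range(2)] for i in range(2)]
expect = block(half, Z2, Z2, scale(-1, half))
assert all(abs(M[r][c] - expect[r][c]) < 1e-12
           for r in range(4) for c in range(4))
```

The opposite signs in the two blocks are exactly what produces the different boost laws for ψL and ψR in Eq. (5.47).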
For practical computations, it is convenient to write η = rn where r is
the rapidity of the Lorentz boost given by (v/c) = tanh(r). In this case,
the relevant transformation matrices become:
RL = exp[−(r/2) n · σ] ,   RR = exp[+(r/2) n · σ]   (5.48)
26 Since NRQM only cares about rotations as a symmetry, we need not
distinguish between ψL and ψR in the
context of NRQM. In (relativistic) literature, the index of ψL is denoted by
α˙ and one often uses the terminology of
‘dotted’ and ‘undotted’ spinors rather
than that of L and R spinors.
In fact, using Eq. (5.9), we can explicitly evaluate these transformation
matrices to get
RL 2 = exp(−r n · σ) = cosh(r) − sinh(r)(n · σ) = γ − βγ(n · σ)   (5.49)
and similarly for RR 2 . This leads to the alternative expressions for the transformation matrices:

RL = [γ − βγ (n · σ)]1/2 ,   RR = [γ + βγ (n · σ)]1/2   (5.50)
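Since (n · σ)2 = 1, the exponential in Eq. (5.48) collapses to cosh and sinh terms, and Eq. (5.49) follows from the doubling identities. A quick sketch checking this numerically (arbitrary demo values for r and n):

```python
import math

# R_L = exp[-(r/2) n.sigma] = cosh(r/2) - sinh(r/2)(n.sigma); its square
# should be cosh(r) - sinh(r)(n.sigma) = gamma - beta*gamma*(n.sigma).

sig = ([[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]])

def mmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

r = 0.62
n = [0.6, 0.0, 0.8]    # a unit vector
ns = [[sum(n[a] * sig[a][i][j] for a in range(3)) for j in range(2)]
      for i in range(2)]

RL = [[math.cosh(r / 2) * (i == j) - math.sinh(r / 2) * ns[i][j]
       for j in range(2)] for i in range(2)]
RL2 = mmul(RL, RL)

beta = math.tanh(r)
gamma = 1.0 / math.sqrt(1.0 - beta * beta)   # gamma = cosh(r), beta*gamma = sinh(r)
expect = [[gamma * (i == j) - beta * gamma * ns[i][j]
           for j in range(2)] for i in range(2)]
assert all(abs(RL2[i][j] - expect[i][j]) < 1e-12
           for i in range(2) for j in range(2))
```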
We can now demystify the subscripts L and R by considering how
rotations and Lorentz boosts are affected by the parity transformation,
P ≡ (t → t, x → −x). Under parity, rotations are unaffected while Lorentz
boosts flip sign. So under the parity transformation, ψL and ψR are interchanged. That is,
P : ψR (t, x) → ψL (t, −x);
P : ψL (t, x) → ψR (t, −x)
(5.51)
This can be written more concisely in terms of the 4-spinor ψ, using the
γ 0 matrix, as
P : ψ(t, x) → γ 0 ψ(t, −x)
(5.52)
From this, it is trivial to verify that if ψ(t, x) satisfies the Dirac equation,
the parity transformed spinor defined as γ 0 ψ(t, −x) also satisfies27 the
Dirac equation:
(iγ 0 ∂t +iγ α ∂α −m)γ 0 ψ(t, −x) = γ 0 (iγ 0 ∂t −iγ α ∂α −m)ψ(t, −x) = 0 (5.53)
Putting together the results in Eq. (5.39) and Eq. (5.47), we can write down
how ψL , ψR transform under the most general Lorentz transformations,
specified by ωmn made of ωαβ ≡ εαβγ θγ , ω0α ≡ ηα . This is given, in full glory, by:

ψL → exp[(−iθ − η) · σ/2] ψL   (5.54)

and

ψR → exp[(−iθ + η) · σ/2] ψR   (5.55)
Interesting, but rather strange. We will have more to say about this in
Sect. 5.4.
27 Algebra alert: The extra minus sign
arising from γ 0 moving through γ α is
nicely compensated by the derivative
acting on −x instead of x.
5.3.3 Spin Magnetic Moment from the Dirac Equation

28 The argument for this replacement is exactly the same as that in the case of the complex scalar field we studied earlier, in Sect. 3.1.5. Under the transformation ψ → exp[−iqα(x)]ψ, Am → Am + ∂m α, we see that Dm ψ → {exp[−iqα(x)]}Dm ψ, maintaining gauge invariance.
Let us next check whether the Dirac equation gives the correct magnetic
moment for the electron, which we obtained from the Pauli equation in
the last section. As in the non-relativistic case, this fancy square root of
p2 really comes into its own only when we introduce the electromagnetic
coupling. In the relativistic case, this will change Eq. (5.26) to the form28
(iγ m Dm − m)ψ = 0;
Dm ≡ ∂m − ieAm
(5.56)
To see what this implies for the electromagnetic coupling, multiply this
equation by (iγ a Da + m) and use the results
γ m γ n Dm Dn = (1/2) ({γ m , γ n } + [γ m , γ n ]) Dm Dn = Dm Dm − iσ mn Dm Dn

iσ mn Dm Dn = (i/2) σ mn [Dm , Dn ] = (e/2) σ mn Fmn   (5.57)
Simple algebra now gives
[Dm Dm − (e/2) σ mn Fmn + m2 ] ψ = 0   (5.58)
The first term D2 ≡ Dm Dm is exactly what we would have got for the
spinless particle and, as before, we do not expect anything new to emerge
from this term. Expanding it out, we can easily see that the spatial part
will give, as before, the contribution
(Dα )2 = ∇2 − eB · (x × p) + O(A2α )
(5.59)
The orbital angular momentum will contribute to the Hamiltonian a magnetic moment m = (e/2m)L.
As before, we find that there is indeed an extra term (e/2)σ mn Fmn
which arises due to our fancy square-rooting. To see its effect, let us again
consider a magnetic field along the z−axis, in which case the (e/2)σ mn Fmn
will contribute a term
(e/2) σ 3 (F12 − F21 ) = (e/2) 2σ 3 B = 2eB · s ;   s = σ/2   (5.60)
Clearly, we get an extra factor of 2 in the contribution of the spin to the
magnetic moment; so this result is not an artifact of the non-relativistic
discussion in the previous section.
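The evaluation in Eq. (5.60) can be replayed with explicit matrices. The following sketch adopts the text's usage F12 = −F21 = B for a field along the z-axis (the numerical values of e and B are arbitrary demo inputs):

```python
# (e/2)(sigma^{12} F_12 + sigma^{21} F_21) = e*B*dia(sigma_3, sigma_3)
# = 2eB.s with s = sigma/2: the factor-of-2 (g = 2) spin coupling.

def mmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def block(A, B, C, D):
    return [A[i] + B[i] for i in range(2)] + [C[i] + D[i] for i in range(2)]

def scale(c, M):
    return [[c * x for x in row] for row in M]

I2, Z2 = [[1, 0], [0, 1]], [[0, 0], [0, 0]]
sig = ([[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]])
gam = [block(Z2, I2, I2, Z2)] + [block(Z2, s, scale(-1, s), Z2) for s in sig]

def sigma_mn(m, n):
    AB, BA = mmul(gam[m], gam[n]), mmul(gam[n], gam[m])
    return [[0.5j * (AB[r][c] - BA[r][c]) for c in range(4)] for r in range(4)]

e_ch, B = 0.5, 1.3            # arbitrary demo values of charge and field
F12, F21 = B, -B              # the text's usage for B along the z-axis

s12, s21 = sigma_mn(1, 2), sigma_mn(2, 1)
term = [[(e_ch / 2) * (F12 * s12[r][c] + F21 * s21[r][c])
         for c in range(4)] for r in range(4)]

expect = scale(e_ch * B, block(sig[2], Z2, Z2, sig[2]))
assert all(abs(term[r][c] - expect[r][c]) < 1e-12
           for r in range(4) for c in range(4))
```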
To summarize, we have discovered a new class of objects ψ built from a
pair of 2-spinors which have nice Lorentz transformation properties. This
puts them in the same “respectability class” as scalars, four vectors, tensors,
etc. using which one can construct relativistically invariant field theories.
At this stage, you will find it rather mysterious that a new class of objects
like ψα even exists and — since they do — whether there are more such
objects to be discovered. The answer to this question involves classifying
the representations of the Lorentz group and introducing the fields which
can “carry” these representations. This is something which we will turn our
attention to in the next section. We will approach the entire problem from
a slightly more formal angle which will tie up the loose ends and clarify
what is really going on.
5.4 Lorentz Group and Fields

5.4.1 Matrix Representation of a Transformation Group
Consider the set of all linear transformations, xa → x̄a ≡ Lab xb , of the Cartesian 4-dimensional coordinates which preserves the interval s2 = ηab xa xb = t2 − |x|2 . This condition

ηmn x̄m x̄n = ηmn (Lmr xr )(Lns xs ) = ηrs xr xs   (5.61)
will hold for all xa if the matrix L satisfies the condition
ηrs = ηmn Lmr Lns ;
η = LT ηL
(5.62)
Taking the determinant, we have [det L]2 = 1 and we will take29 det L = 1
which corresponds to what are called proper Lorentz transformations. Further, if you expand out the zero-zero component of Eq. (5.62), it is easy to show that (L00 )2 ≥ 1, which divides the proper Lorentz transformations into two disconnected components: one with L00 ≥ 1 (“orthochronous”) and one with L00 ≤ −1 (“non-orthochronous”). We will stick with the orthochronous,
proper, Lorentz transformations.30 This is what you would have usually
thought of as the Lorentz transformations from a special relativity course.
A subset of these transformations will be pure rotations in the three-dimensional space with x̄0 = x0 , x̄α = Lαβ xβ ; another subset will be pure Lorentz boosts along some direction. The successive application of two
rotations will lead to another rotation so they form a group by themselves;
but the successive application of two Lorentz boosts along two different
directions will, in general, lead to a rotation plus a Lorentz boost.31 This
fact, along with the existence of the identity transformation and the inverse
transformation, endows the above set of linear transformations with a group
structure, which we will call the Lorentz group. The composition law for
two group elements L1 ◦ L2 will be associated with the process of making
the transformation with L2 followed by L1 . As we said before, the group
structure is determined by identifying the element of the group L3 which
corresponds to L1 ◦ L2 for all pairs of elements.
This description is rather abstract for the purpose of physics! By and
large, you can get away with what is known as the matrix representation
of the group rather than with the abstract group itself. We can provide a
matrix representation to any group (and, in particular, to the Lorentz group
which we are interested in) by associating a matrix with each element of the
group such that, the group composition law is mapped to the multiplication
of the corresponding matrices. That is, the group element g1 ◦ g2 will
be associated with a matrix that is obtained by multiplying the matrices
associated with g1 and g2 .
In the case of the Lorentz group, a set of k × k matrices D(L) will provide a k−dimensional representation of the Lorentz group if D(L1 )D(L2 ) =
D(L1 ◦ L2 ) for any two elements of the Lorentz group L1 , L2 where L1 , L2
etc. could correspond to either Lorentz boosts or rotations. We can also
introduce a set of k quantities ΨA with A = 1, 2, ...k, forming a column
vector, on which these matrices can act such that, under the action of an element L1 of the Lorentz group (which could be either a Lorentz transformation or a rotation), the ΨA s undergo a linear transformation of the form Ψ′A = DA B (L1 )ΨB , where DA B (L1 ) is a k × k matrix representing the element L1 .
29 Transformations with det L = −1
can always be written as a product of
those with det L = +1 and a discrete
transformation that reverses the sign
of an odd number of coordinates.
30 The non-orthochronous transformations can again be obtained from the
orthochronous ones with a suitable inversion of coordinates.
31 So Lorentz boosts alone do not form
a group. You may have learnt this from
a course in special relativity; if not,
just work it out for yourself. We will
obtain this result in a more sophisticated language later on.
32 The k = 4 case is sometimes called
the fundamental (or defining) representation of the Lorentz group.
The simplest example of such a set of k−tuples ΨA occurs when k = 4.
Here we identify ΨA with just the components of any four-vector v a and the matrices DA B with the Lorentz transformation matrices Lab . This, of course,
should work since it is this transformation which was used in the first
place, to define the group.32 It is also possible to have a trivial “matrix”
representation with k = 1 where we map all the group elements to identity.
In general, however, DA B (L1 ) can be a k × k matrix where k need not
be 4 (or 1). In such a case, we have effectively generalized the idea of
Lorentz transformation from 4-component objects to k-component objects
ΨA which have nice transformation properties under Lorentz transformation. In the case of k = 1, the one-component object can be thought of as
a real scalar field φ(x) which transforms under the Lorentz transformation trivially as φ̄(x̄) = φ(x). In the case of k = 4, it could be a 4-vector field Aj (x) which transforms as Āj (x̄) = Lj k Ak (x). In both these cases, we
could say that the fields “carry” the representation of the Lorentz group.
So you see that, if we have a k−dimensional representation of the
Lorentz group, we can use it to study a k−component field which will
have nice properties and could lead to a relativistically invariant theory.
Hence, the task of identifying all possible kinds of fields which can exist in
nature, reduces to finding all the different k × k matrices which can form
a representation of the Lorentz group. We will now address this task.
5.4.2 Generators and their Algebra
The key trick one uses to accomplish this is the following. You really do
not have to find all the k × k matrices which satisfy our criterion. We
only have to look at matrices which are infinitesimally close to the identity
matrix and understand their structure. This is because any finite group
transformation we are interested in can be obtained by “exponentiating”
an infinitesimal transformation close to the identity element. If we write
the matrix corresponding to an element close to the identity as D ≡ 1 + εA GA , where the εA are a set of infinitesimal scalar parameters, then a finite transformation can be obtained by repeating the action of this element N times and taking the limit N → ∞, εA → 0 with N εA ≡ θA remaining finite. We then get
33 The set of elements G
A which represent elements close to the identity element are called the generators of the
group.
34 This will miss out some global aspects of the group but they can usually be figured out, e.g., by looking at the range of the εA . The generators define what is known as the Lie algebra of the Lie group and for most purposes we will just concentrate on the Lie algebra of the Lorentz group.
lim N→∞, εA →0 [1 + εA GA ]N = exp(N εA GA ) = exp(θA GA )   (5.63)
which represents some generic element of the group.33
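The limit in Eq. (5.63) can be watched converging numerically. A sketch (not from the text) using the generator of rotations in a plane, for which the exponential is the familiar rotation matrix:

```python
import math

# For G = [[0, -1], [1, 0]], repeated squaring of (1 + theta*G/N) with
# N = 2^20 converges to exp(theta*G), the 2x2 rotation matrix.

def mmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

theta, squarings = 0.9, 20
N = 2 ** squarings
M = [[1.0, -theta / N], [theta / N, 1.0]]   # 1 + theta*G/N
for _ in range(squarings):
    M = mmul(M, M)          # after k squarings: (1 + theta*G/N)^(2^k)

expected = [[math.cos(theta), -math.sin(theta)],
            [math.sin(theta), math.cos(theta)]]
assert all(abs(M[i][j] - expected[i][j]) < 1e-4
           for i in range(2) for j in range(2))
```

The residual error is of order θ2 /N, which is why the limit N → ∞ is needed for the exact group element.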
So our task now reduces to finding the matrix representation of the
generators GA of the group34 rather than the matrix representation for all
the elements of the group. What is the property the matrices representing
the generators of the group need to satisfy? To find this out, consider two
matrices U1 = 1 + iε1 G1 and U2 = 1 + iε2 G2 representing two elements of the group close to the identity (with the factor i introduced for future convenience). We then know that the element U3 = 1 + iε3 G3 defined by the operation U3 = U1−1 U2 U1 is also another element of the group close to the identity. An elementary calculation shows that G3 = G2 + iε1 [G2 , G1 ] which, in turn, implies that the commutator [G2 , G1 ] of two generators is also another generator. For a group with N generators, the element close to identity can be expressed in the form U = 1 + iεA GA , and the above
analysis tells us that the commutator between any pair of generators must
be a linear combination of other generators:
[GA , GB ] = ifAB C GC
(5.64)
where the set of numbers fAB C are called the structure constants of the
group. These structure constants determine the entire part of the group
which is continuously connected to the identity since any finite element in
that domain can be obtained by exponentiating a suitable element close to
the identity.
So, all we need to do is to: (i) Determine the structure constants for the
relevant group and (ii) discover and classify all the k × k matrices which
obey the commutation rule given by Eq. (5.64) with the structure constants
appropriate for the Lorentz group. We will now turn to these tasks.
5.4.3 Generators of the Lorentz Group
Let us start with a Lorentz transformation which is close to the identity
transformation represented by the transformation matrix Lmn = δnm +
ω mn . The condition in Eq. (5.62) now requires ωmn = −ωnm . Such an
antisymmetric 4 × 4 matrix has 6 independent elements, so we discover
that the Lorentz group has 6 parameters.35 This also means that there
will be six generators JA (with A = 1, 2, ..., 6) for the Lorentz group and
a generic group element close to the identity can be written in the form 1 + θA JA where the six parameters θA correspond to the angles of rotation
in the six planes. We can get any other group element by exponentiating
this element close to the identity.
Since we have chosen to represent the 6 infinitesimal parameters by the
second rank antisymmetric tensor ωmn (which has six independent components) rather than by a six component object θA , it is also convenient
to write the generators J A in terms of another second rank antisymmetric tensor as J mn with J mn = −J nm . We are interested in the structure
constants of the Lorentz group, for which, we need to work out the commutators of the generators [J mn , J rs ]. Since the structure constants are
independent of the representation, we can work this out by looking at the
defining representations of the Lorentz group (the k = 4 one), for which we
know the explicit matrix form of the Lorentz transformation. In this representation, J mn is represented by a 4 × 4 matrix with components (J mn )rs ,
the explicit form of which is given by36
(J mn )rs = i (η mr δsn − η nr δsm )
(5.65)
This is a 4-dimensional representation under which all the components of the four-vector get mixed up, and hence this is an irreducible representation.
You can now work out the form of [J mn , J rs ] using the explicit form of
matrices in Eq. (5.65) and obtain:
[J mn , J rs ] = i (η nr J ms − η mr J ns − η ns J mr + η ms J nr )   (5.66)
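This algebra can be confirmed by brute force from the explicit matrices of Eq. (5.65), which is a useful check before attempting Exercise 5.4 by hand. A sketch (not the book's code):

```python
# Build (J^{mn})^r_s = i(eta^{mr} delta^n_s - eta^{nr} delta^m_s) and verify
# the commutation rule of Eq. (5.66) for all index combinations.

eta = [1, -1, -1, -1]

def J(m, n):
    return [[1j * ((eta[m] if r == m else 0) * (1 if s == n else 0)
                   - (eta[n] if r == n else 0) * (1 if s == m else 0))
             for s in range(4)] for r in range(4)]

def mmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def d(a, b):
    """eta^{ab} for the diagonal Minkowski metric."""
    return eta[a] if a == b else 0

ok = True
for m in range(4):
    for n in range(4):
        for r in range(4):
            for s in range(4):
                A, B = J(m, n), J(r, s)
                AB, BA = mmul(A, B), mmul(B, A)
                R = [[1j * (d(n, r) * J(m, s)[i][j] - d(m, r) * J(n, s)[i][j]
                            - d(n, s) * J(m, r)[i][j] + d(m, s) * J(n, r)[i][j])
                      for j in range(4)] for i in range(4)]
                ok = ok and all(abs(AB[i][j] - BA[i][j] - R[i][j]) < 1e-12
                                for i in range(4) for j in range(4))
assert ok
```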
This is the formal result which completely determines the structure of the
Lorentz group. All we need to do is to find k × k matrices for the six
generators in J mn such that these matrices satisfy the same commutation
rule as given above. That will also allow us to determine and classify
all possible matrix representations of the Lorentz group. Obviously, there
35 This, of course, makes sense. Since
one can think of Lorentz boost along,
say, x-axis as a rotation (albeit with a
complex angle) in the tx plane, the 6
parameters correspond to rotations in
the 6 planes xy, yz, zx, tx, ty, tz. We
see the virtue of ‘rotations in planes’
compared to ‘rotations about an axis’.
36 This is easily verified by noticing
that, under an infinitesimal transformation, any four vector changes as
δV m = ω mn V n which can be written as δV r = −(i/2)ωmn (J mn )r s V s .
Substituting the form of the matrix in
Eq. (5.65) reproduces the correct result.
Exercise 5.4: Work this out. It
builds character, if not anything else.
37 Recall your QM 101, plus the fact
that with our signature, xα = −xα to
understand the signs.
should be a cleverer way of doing this — and there is. Since it also helps
you to connect up the structure of Lorentz group with what you know
about angular momentum in QM, we will follow that route.
The trick is to think of a representation in the space of (scalar) functions and come up with a set of differential operators (which act on functions) that could represent J mn . Since a pure rotation in the αβ plane will have as generator the standard angular momentum operator37 , Jαβ = i(xα ∂β − xβ ∂α ), it is a safe guess that the Lorentz group generators are just:
Jmn = i(xm ∂n − xn ∂m )   (5.67)

38 Make sure you understand the difference between the local change δ0 φ of the field, φ′(x) − φ(x) (where the coordinate label is the same), compared to the change in the field at the same physical event, φ′(x′) − φ(x), which, of course, vanishes for a scalar field.

Exercise 5.5: This is, of course, a representation of the generators as differential operators on the space of functions. The commutators of these generators are now straightforward to find by just operating on any arbitrary function. Do it and reproduce the result in Eq. (5.66) (which will build more character).

Exercise 5.6: Verify this claim.

39 Notation/sign alert: We use the notation that boldface 3-vectors have contravariant components [i.e. (v)α ≡ vα ] and the boldface dot product of 3-vectors is the usual one with positive signature [i.e. a · b = δαβ aα bβ ]; but with our spacetime signature the covariant components are vα = −vα etc., and hence a · b = −ηαβ aα bβ = −aα bα . Also note that ε123 = 1.
This can be obtained more formally by, say, considering the change δ0φ ≡ φ̄(x) − φ(x) of a scalar field φ(x) at a given x^a under an infinitesimal Lorentz transformation x̄^m = x^m + δx^m = x^m + ω^m_n x^n. From φ̄(x̄) = φ̄(x + δx) = φ(x), we find that38 φ̄(x) = φ(x − δx) = φ(x) − δx^m ∂_m φ. This leads to the result

δ0φ = φ̄(x) − φ(x) = −δx^m ∂_m φ = −ω^{mn} x_n ∂_m φ
    = −(1/2) ω^{mn} (x_n ∂_m − x_m ∂_n) φ ≡ −(i/2) ω^{mn} J_{mn} φ
(5.68)

In arriving at the third equality, we have used the explicit form δx^m = ω^{mn} x_n, and in getting the fourth equality we have used the antisymmetry of ω^{mn}. The last equality identifies the generators of the Lorentz transformations as guessed in Eq. (5.67).
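The differential-operator representation can also be spot-checked with a computer algebra system. The sketch below (assuming sympy and the metric diag(1, −1, −1, −1)) verifies one component of the algebra of Eq. (5.66) for the generators of Eq. (5.67), in the spirit of Exercise 5.5.

```python
import sympy as sp

x = sp.symbols('x0 x1 x2 x3')
eta = sp.diag(1, -1, -1, -1)
f = sp.Function('f')(*x)

def J(m, n, h):
    # J_{mn} h = i (x_m d_n - x_n d_m) h, with x_m = eta_{mm} x^m (diagonal metric)
    return sp.I * (eta[m, m] * x[m] * sp.diff(h, x[n])
                   - eta[n, n] * x[n] * sp.diff(h, x[m]))

# spot-check one component: [J_{01}, J_{12}] = i eta_{11} J_{02} = -i J_{02}
lhs = J(0, 1, J(1, 2, f)) - J(1, 2, J(0, 1, f))
rhs = sp.I * eta[1, 1] * J(0, 2, f)
diff_zero = sp.expand(lhs - rhs) == 0
print(diff_zero)
```

Looping the same check over all index combinations reproduces the full algebra.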
It is now easy to figure out the matrix representations of the Lorentz group. In Eq. (5.67) we get back the standard angular momentum operators of quantum mechanics for J_{αβ}, while J_{0α} leads to a Lorentz boost in the α-direction. Introducing the six parameters (θ^μ, η^μ) in place of ω_{αβ} by the standard trick ω_{αβ} ≡ ε_{αβμ} θ^μ, ω^{α0} ≡ η^α, and defining six new generators J, K by:

J_α = −(1/2) ε_{αβγ} J^{βγ},   K^α = J^{α0}
(5.69)

we can express39 the relevant combination (1/2) ω_{mn} J^{mn} as:

(1/2) ω_{mn} J^{mn} = (1/2) ω_{αβ} J^{αβ} + 2 × (1/2) ω_{α0} J^{α0} = θ · J − η · K
(5.70)
which actually has the structure of θA GA with the six generators and six
parameters explicitly spelt out. A generic element L of the Lorentz group
will be given by exponentiating the element close to the identity, and can
be expressed in the form:
L = exp[(−i/2)ωmn J mn ] = exp [−iθ · J + iη · K]
(5.71)
Clearly, J α are the generators for the spatial rotations; the K α are the
generators of the Lorentz boosts.
Using the representation in Eq. (5.67) and the definitions in Eq. (5.69)
we find that J_α and K_α obey the following commutation rules:

[J_α, J_β] = i ε_{αβγ} J^γ;   [J_α, K_β] = i ε_{αβγ} K^γ;   [K_α, K_β] = −i ε_{αβγ} J^γ
(5.72)

These relations have a simple interpretation. The first one is the standard commutation rule for angular momentum operators. Since the commutators of the J's close amongst themselves, it is obvious that the spatial rotations alone form a sub-group of the Lorentz group. The second one is equivalent to saying that K_α behaves like a 3-vector under rotations. The crucial relation is the third one, which shows that the commutator of two boosts is a rotation (with an important minus sign); so Lorentz boosts alone do not form a group.
To provide explicit matrix representations of the Lorentz group, we only have to provide a matrix representation for the infinitesimal generators
of the Lorentz group in Eq. (5.72). That is, we have to find all matrices
which satisfy these commutation relations.
5.4.4
Representations of the Lorentz Group
To do this, we introduce the linear combination aα = (1/2)(Jα + iKα ) and
bα = (1/2)(Jα − iKα ). This allows the commutation relation in Eq. (5.72)
to be separated into two sets
[a_α, a_β] = i ε_{αβμ} a^μ;   [b_α, b_β] = i ε_{αβμ} b^μ;   [a_μ, b_ν] = 0
(5.73)
These are the familiar commutation rules for a pair of independent angular momentum matrices in quantum mechanics. We therefore conclude that each irreducible representation of the Lorentz group is specified by two numbers (n, m), each of which can be an integer or half-integer, with dimensionality (2n + 1)(2m + 1). So the representations can be characterized, in increasing dimensionality, as (0, 0), (1/2, 0), (0, 1/2), (1, 0), (0, 1), ... etc.
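This splitting is easy to see in a concrete case. Here is a minimal numerical sketch, assuming the standard Pauli matrices and the two-dimensional choice J = σ/2, K = iσ/2 (which will appear below as the (1/2, 0) representation): it confirms the Lorentz algebra of Eq. (5.72), and that a vanishes while b generates an su(2) algebra on its own.

```python
import numpy as np

sig = [np.array([[0, 1], [1, 0]], dtype=complex),
       np.array([[0, -1j], [1j, 0]], dtype=complex),
       np.array([[1, 0], [0, -1]], dtype=complex)]

J = [m / 2 for m in sig]        # rotation generators
K = [1j * m / 2 for m in sig]   # boost generators in this 2-dim representation

eps = np.zeros((3, 3, 3))
for i, j, k in [(0, 1, 2), (1, 2, 0), (2, 0, 1)]:
    eps[i, j, k], eps[j, i, k] = 1.0, -1.0

def comm(A, B):
    return A @ B - B @ A

def closes(X, Y, Z, sign):
    # does [X_a, Y_b] == sign * i * eps_{abc} Z_c hold for all a, b?
    return all(np.allclose(comm(X[a], Y[b]),
                           sign * 1j * sum(eps[a, b, c] * Z[c] for c in range(3)))
               for a in range(3) for b in range(3))

lorentz_ok = closes(J, J, J, +1) and closes(J, K, K, +1) and closes(K, K, J, -1)

a = [(J[i] + 1j * K[i]) / 2 for i in range(3)]
b = [(J[i] - 1j * K[i]) / 2 for i in range(3)]
a_vanishes = all(np.allclose(a[i], 0) for i in range(3))
b_su2 = closes(b, b, b, +1)
print(lorentz_ok, a_vanishes, b_su2)
```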
We have already stumbled across two of these representations, one corresponding to the trivial representation by a 1 × 1 matrix (‘scalar’) and
the other corresponding to the defining representation with the generators
represented by 4 × 4 matrices (‘four-vector’) given explicitly in Eq. (5.65).
The really interesting case, for our purpose, arises when we study the representations corresponding to (1/2, 0) and (0, 1/2). These representations
have dimension 2 and hence must be represented by 2 × 2 matrices. From
the commutation relations in Eq. (5.73), it is clear that we can take these
matrices to be two copies of the standard Pauli matrices. They will act
on two component objects ψα with α = 1, 2 treated as a column vector.
We will denote these spinors by the L, R notation (introduced in the last
section), in anticipation of the fact that they will turn out to be identical
to the beasts we came across earlier. That is, we use the notation (ψL )α
with α = 1, 2 for a spinor in the representation (1/2, 0) and by (ψR )α with
α = 1, 2 for a spinor in the representation (0, 1/2). We call ψL the left-handed Weyl spinor and ψR the right-handed Weyl spinor.40 To
make the connection with the last section, we have to explicitly determine
how these spinors transform, which requires determining the explicit form
of the generators J, K while acting on the Weyl spinors. In the case of the
(1/2, 0) representation, b is represented by σ/2 while a = 0. This leads to the result41

J = a + b = σ/2;   K = −i(a − b) = i σ/2
(5.74)
We can now write down the explicit transformation of the left-handed Weyl spinor using Eq. (5.71), which leads to

ψL → L_{(1/2)L} ψL = exp[(−iθ − η) · σ/2] ψL
(5.75)
40 As we shall see, they transform differently under the action of the Lorentz
group, a fact we already know from the
discussion in the last section. These
are also called ‘dotted’ and ‘undotted’
spinors in the literature.
41 Note that, in this representation the
generator K is not Hermitian, which
is consistent with a theorem that noncompact groups have no non-trivial
unitary representations of finite dimensions. This is, in turn, related to
the fact that a and b are complex
combinations of J and K. This has
some important group theoretical consequences which, however, we will not
get into.
206
Chapter 5. Real Life II: Fermions and QED
In the case of the (0, 1/2) representation, we have J = σ/2 and K = −iσ/2, and the corresponding transformation law is

ψR → L_{(1/2)R} ψR = exp[(−iθ + η) · σ/2] ψR
(5.76)
This is precisely what we found earlier for ψL and ψR in the last section working out separately for rotations (see Eq. (5.38)) and boosts (see
Eq. (5.47)). The infinitesimal forms of these relations are given by

δψR = (1/2)[(−iθ + η) · σ] ψR;   δψL = (1/2)[(−iθ − η) · σ] ψL
(5.77)

The corresponding results for the adjoint spinors are

δψR† = (1/2)(iθ + η) · (ψR† σ);   δψL† = (1/2)(iθ − η) · (ψL† σ)
(5.78)
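These finite transformation laws are easy to exercise numerically. The sketch below assumes a boost along z with an illustrative rapidity, and uses a small Taylor-series matrix exponential in place of a library routine; it shows that the L and R boost matrices of Eqs. (5.75) and (5.76) are inverses of each other, that the common rotation factor is unitary, and that the boosts are not.

```python
import numpy as np

def expm(M, terms=80):
    # minimal Taylor-series matrix exponential (adequate for these 2x2 matrices)
    out = np.eye(M.shape[0], dtype=complex)
    term = np.eye(M.shape[0], dtype=complex)
    for k in range(1, terms):
        term = term @ M / k
        out = out + term
    return out

s3 = np.array([[1, 0], [0, -1]], dtype=complex)
w = 0.7    # illustrative rapidity for a boost along z

LL = expm(-w * s3 / 2)   # Eq. (5.75) with theta = 0, eta = (0, 0, w)
LR = expm(+w * s3 / 2)   # Eq. (5.76) with theta = 0, eta = (0, 0, w)
boosts_inverse = np.allclose(LL @ LR, np.eye(2))

th = 1.2
R = expm(-1j * th * s3 / 2)   # common rotation factor of Eqs. (5.75) and (5.76)
rot_unitary = np.allclose(R.conj().T @ R, np.eye(2))

nonunitary = not np.allclose(LL.conj().T @ LL, np.eye(2))  # boosts are not unitary
print(boosts_inverse, rot_unitary, nonunitary)
```

The last line is the numerical face of footnote 41: the finite-dimensional spinor representation of the non-compact boosts cannot be unitary.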
Before concluding this section, let us settle this business of something becoming the negative of itself under a 2π rotation. To be precise, this can never happen under the action of the Lorentz group. Almost by definition, the rotation group, which is a subgroup of the Lorentz group, maps a 2π rotation to the identity element. So, if we are really obtaining the representations of the Lorentz group, we cannot have the funny business of something becoming the negative of itself under a 2π rotation. Actually, by exponentiating the elements of the Lie algebra, we have in fact generated a different group called SL(2,C), which is known as the universal covering group of the Lorentz group. To be precise, spinors transform under the representations of SL(2,C) rather than under the representations of the Lorentz group SO(1,3).
The reason this is not a problem in QFT is that one has the freedom in the choice of the phase in quantum theory, so that two amplitudes differing by a sign lead to the same probability. Mathematically, this means that we need not have an exact representation of the Lorentz group but can be content with what is known as the projective representations of the Lorentz group. If we have two matrices, M(g1) and M(g2), corresponding to two group elements g1 and g2, then the usual requirement of a representation is that M(g1)M(g2) = M[g1 ∘ g2]. But if we allow projective representations, we can also have M(g1)M(g2) = e^{iφ} M[g1 ∘ g2] where φ could, in general, depend on g1 and g2. It turns out that the projective representations of SO(1,3) are the same as the representations of SL(2,C), and that is how we get spinors in our midst. The flip of sign ψ → −ψ under the θ = 2π rotation is closely related to the fact that the 3-D rotation group is not simply connected.
While all this is frightfully interesting, it does not make any difference
to what we will be doing. So we will continue to use phrases like “spinor
representations of Lorentz group” etc. without distinguishing between projective representations and honest-to-God representations.
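The sign flip under θ = 2π is a one-line numerical check; a sketch assuming a rotation about z in the spin-1/2 representation:

```python
import numpy as np

def expm(M, terms=100):
    # minimal Taylor-series matrix exponential
    out = np.eye(M.shape[0], dtype=complex)
    term = np.eye(M.shape[0], dtype=complex)
    for k in range(1, terms):
        term = term @ M / k
        out = out + term
    return out

s3 = np.array([[1, 0], [0, -1]], dtype=complex)
# a rotation by theta = 2*pi about the z-axis in the spinor representation
R = expm(-1j * 2 * np.pi * s3 / 2)
sign_flip = np.allclose(R, -np.eye(2))
print(sign_flip)   # the spinor comes back to minus itself
```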
5.4.5
Why do we use the Dirac Spinor?
As a bonus, we also now know how the 4-spinor ψ built from a pair of
two spinors (ψL , ψR ) transforms under Lorentz transformations. From the
discussion in the previous section, this has to be in accordance with the
rule in Eq. (5.46) reproduced below:
ψ → L_{1/2} ψ = exp[−(i/2) ω_{mn} S^{mn}] ψ;   S^{mn} ≡ (i/4)[γ^m, γ^n]
(5.79)
5.4. Lorentz Group and Fields
Formally, one says that the Dirac spinor (which is another name for this
4-spinor) transforms under the reducible representation (1/2, 0) ⊕ (0, 1/2)
of the Lorentz group. For this to work, it is necessary that S mn defined
in Eq. (5.79) must satisfy the commutation rules for the generators of the
Lorentz group in Eq. (5.66); i.e., if we set J mn = S mn in Eq. (5.66), it
should be identically satisfied. You can verify that this is indeed true.
This shows that S mn can also provide a representation for the Lorentz
group algebra, albeit a reducible one. We saw earlier [see Eq. (5.65)] that
the transformation of the four-vectors provides an irreducible 4-dimensional
representation of the Lorentz group. What we have here is another, different 4-dimensional representation of the Lorentz group but a reducible
one.
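The claim that S^{mn} satisfies Eq. (5.66) can also be verified numerically. The sketch below assumes a standard Weyl representation of the gamma matrices (the sign conventions of the book's Eq. (5.31) may differ, but the two algebra checks here are insensitive to that choice):

```python
import numpy as np

I2, Z = np.eye(2, dtype=complex), np.zeros((2, 2), dtype=complex)
sig = [np.array([[0, 1], [1, 0]], dtype=complex),
       np.array([[0, -1j], [1j, 0]], dtype=complex),
       np.array([[1, 0], [0, -1]], dtype=complex)]

# assumed Weyl-representation gamma matrices
g = [np.block([[Z, I2], [I2, Z]])] + \
    [np.block([[Z, sig[a]], [-sig[a], Z]]) for a in range(3)]

eta = np.diag([1.0, -1.0, -1.0, -1.0])

def comm(A, B):
    return A @ B - B @ A

# Clifford algebra: {gamma^m, gamma^n} = 2 eta^{mn}
clifford_ok = all(np.allclose(g[m] @ g[n] + g[n] @ g[m], 2 * eta[m, n] * np.eye(4))
                  for m in range(4) for n in range(4))

S = [[(1j / 4) * comm(g[m], g[n]) for n in range(4)] for m in range(4)]

# S^{mn} satisfies the Lorentz algebra of Eq. (5.66)
lorentz_ok = all(
    np.allclose(comm(S[m][n], S[r][s]),
                1j * (eta[n, r] * S[m][s] - eta[m, r] * S[n][s]
                      - eta[n, s] * S[m][r] + eta[m, s] * S[n][r]))
    for m in range(4) for n in range(4) for r in range(4) for s in range(4))
print(clifford_ok, lorentz_ok)
```

Note that the second check only uses the Clifford algebra, which is why it goes through in any representation of the gamma matrices.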
But how come the Dirac equation uses the 4-spinor ψ (belonging to
a reducible representation) rather than either ψL or ψR which belong to
the irreducible representations? The reason has to do with parity. We saw
earlier that, under the parity transformation, ψL and ψR are interchanged42
but ψ goes to γ 0 ψ keeping the Dirac equation invariant. (See Eq. (5.53).) In
other words, if we use ψL and ψR separately, we do not get a basis which is
true to parity transformations. While parity is indeed violated by the weak
interactions, the physics at scales below 100 GeV or so, should conserve
parity. So the field we choose should provide a representation for both
Lorentz transformation and parity transformation. This is the fundamental
reason why we work with ψ rather than with ψL or ψR individually.43
In the Weyl representation we have used, L1/2 came up as a block
diagonal matrix and one could separate out the Dirac spinor into L and R
components quite trivially. But we know that one could have equally well
used some other representation related to the Weyl representation by:
γ^m → U γ^m U^{−1};   ψ → U ψ
(5.80)

in which case it is not obvious how one could define the left and right handed components in terms of ψ. Is there an invariant way of effecting this separation?
Indeed there is, and this brings us to yet another gamma matrix44 defined as

γ^5 ≡ −i γ^0 γ^1 γ^2 γ^3
(5.81)

44 With this notation we do not have a γ^4 and you jump from γ^3 to γ^5. This is related to calling time the zero-th dimension rather than the fourth! If time were the fourth dimension, we would have γ^a with a = 1, 2, 3, 4, and the next in line would have been γ^5!
You can easily verify that {γ^5, γ^m} = 0 and (γ^5)^2 = +1. Further, γ^5 commutes with S^{mn}, so that it behaves as a scalar under rotations and boosts. Since (γ^5)^2 = 1, we can now define two Lorentz invariant projection operators

P_L ≡ (1/2)(1 − γ^5);   P_R ≡ (1/2)(1 + γ^5)
(5.82)

which obey the necessary properties P_L^2 = P_L, P_R^2 = P_R, P_L P_R = 0 of projection operators. These operators P_L and P_R allow us to project out ψL and ψR from ψ by the relations

ψL = (1/2)(1 − γ^5) ψ;   ψR = (1/2)(1 + γ^5) ψ
(5.83)

Exercise 5.7: Verify this.
42 More generally, under (t, x) → (t, −x), we have K → −K and J → J. This leads to the interchange a ↔ b. Therefore, under the parity transformation, an object in the (n, m) representation is transformed to an object in the (m, n) representation.
43 In fact, in the standard model of the electroweak interaction, which we will not discuss, the left and right handed components of spin-1/2 particles enter the theory in a different manner, in order to capture the fact that weak interactions do not conserve parity.
One can trivially check that, in the Weyl representation, γ^5 is given (in 2 × 2 block form) by

γ^5 = diag(1, −1)
(5.84)
and hence these relations are obvious. Since everything is done in an invariant manner, this projection will work in any basis.
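All of the γ^5 statements above are quick to confirm numerically; a sketch, again assuming a standard Weyl representation of the gamma matrices:

```python
import numpy as np

I2, Z = np.eye(2, dtype=complex), np.zeros((2, 2), dtype=complex)
sig = [np.array([[0, 1], [1, 0]], dtype=complex),
       np.array([[0, -1j], [1j, 0]], dtype=complex),
       np.array([[1, 0], [0, -1]], dtype=complex)]

# assumed Weyl-representation gamma matrices
g = [np.block([[Z, I2], [I2, Z]])] + \
    [np.block([[Z, sig[a]], [-sig[a], Z]]) for a in range(3)]

g5 = -1j * g[0] @ g[1] @ g[2] @ g[3]                    # Eq. (5.81)
block_diag = np.allclose(g5, np.diag([1, 1, -1, -1]))   # Eq. (5.84)

anticommutes = all(np.allclose(g5 @ g[m] + g[m] @ g5, 0) for m in range(4))
squares_to_one = np.allclose(g5 @ g5, np.eye(4))

S = lambda m, n: (1j / 4) * (g[m] @ g[n] - g[n] @ g[m])
scalar = all(np.allclose(g5 @ S(m, n), S(m, n) @ g5)
             for m in range(4) for n in range(4))

PL, PR = (np.eye(4) - g5) / 2, (np.eye(4) + g5) / 2     # Eq. (5.82)
projectors = (np.allclose(PL @ PL, PL) and np.allclose(PR @ PR, PR)
              and np.allclose(PL @ PR, 0))
print(block_diag, anticommutes, squares_to_one, scalar, projectors)
```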
We thus discover that the four component Dirac spinor — we (and
Dirac) stumbled on by taking a fancy square root — actually arises from a
much deeper structure. Just to appreciate this fact, let us retrace the steps
which we followed in this section to arrive at the Dirac spinor.
(1) We note that the set of all Lorentz boosts plus rotations form a
group with six parameters that specify the six rotational “angles” in the
six planes of the 4-dimensional space.
(2) We identify the six generators J mn = −J nm of this group and
determine their commutator algebra — which is given by Eq. (5.66).
(3) We look for k × k matrices which satisfy the same commutator
algebra as these generators to determine all the matrix representations
of the Lorentz group. In addition to the trivial representation (k = 1),
we already know a representation by 4 × 4 matrices which is used in the
transformation law for the normal four-vectors.
(4) We separate out the Lorentz algebra into two decoupled copies of
angular momentum algebra given by Eq. (5.73). This allows us to classify
all the representations of the Lorentz group by (n, m) where n and m are
either integers or half-integers.
(5) This takes us to the spinorial representations of the Lorentz group
(1/2, 0) and (0, 1/2). The corresponding 2-spinors ψL and ψR transform
identically under rotations but differently under Lorentz boosts. These
are the basic building blocks from which all other representations can be
constructed.
(6) One can obtain a reducible representation of the Lorentz group by building a 4-spinor out of the pair (ψL, ψR). This is the Dirac spinor which appears in the Dirac equation.
5.4.6
Spin of the Field
Finally, let us clarify how the notion of spin arises from the Lorentz transformation property of any field. As an example, let us consider the local variation of the left-handed Weyl spinor under the transformation x → x' = x + δx, ψL(x) → ψL'(x') = R ψL(x), which is given by

δ0ψL ≡ ψL'(x) − ψL(x) = ψL'(x' − δx) − ψL(x)
     = ψL'(x') − δx^r ∂_r ψL(x) − ψL(x)
     = (R − 1) ψL(x) − δx^r ∂_r ψL(x)
(5.85)
We now see that the variation has two parts. One comes from the variation
of the coordinates δxr , which is nonzero even for a spinless scalar field, as we
have seen before. This will lead to the standard orbital angular momentum
contribution given by
−δx^r ∂_r ψL = −(i/2) ω_{mn} J^{mn} ψL;   J^{mn} ≡ i(x^m ∂^n − x^n ∂^m)
(5.86)
On the other hand, in the term involving (R − 1), the Taylor expansion
of the Lorentz transformation group element R = exp[−(i/2)ωmnS mn ] will
give an additional contribution. So, the net variation is given by δ0ψL = −(i/2) ω_{mn} 𝒥^{mn} ψL, where 𝒥^{mn} = J^{mn} + S^{mn}.
This result is completely general and describes how we get the correct
spin for the field from its Lorentz transformation property. The orbital
part J mn comes from kinematics and has the same structure for all fields
(including the spinless scalar) while the extra contribution to the angular
momentum — arising from the spin — depends on the specific representation of the Lorentz group which the field represents. In the special case
of the left-handed Weyl spinor, we know the explicit form of S mn from
Eq. (5.75). This tells us that

S^α = (1/2) ε^{αβγ} S^{βγ} = σ^α/2;   S^{α0} = i σ^α/2
(5.87)
The left-handed Weyl spinors correspond to S = σ/2. In fact, the right-handed Weyl spinor will also lead to the same result except for a flip of sign in S^{α0}.
5.4.7
The Poincare Group
The quantum fields which we study (like the scalar, spinor, vector etc.) are
directly related to the representations of the Lorentz group in the manner
we have outlined above. But to complete the picture, we also need to
demand the theory to be invariant under spacetime translations in addition
to the Lorentz transformations.
A generic element of the spacetime translation group has the structure
exp{−iamP m } where am defines the translation xm → xm + am and P m
are the standard generators of the translation. Lorentz transformations,
along with spacetime translations, form a larger group usually called the
Poincare group. Let us briefly study some properties of this group.
The first thing we need to do is to work out all the new commutators
involving P m . Since translations commute, we have the trivial relation
[P m , P n ] = 0. To find the commutation rule between P m and J rs , we can
use the following trick. Since energy is a scalar under spatial rotations, we must have [J^α, P^0] = 0. Similarly, since P^α is a 3-vector under rotations, we have [J^α, P^β] = i ε^{αβγ} P^γ. What we need is a 4-dimensional, Lorentz covariant commutation rule which incorporates these two results. This rule is unique and is given by

[P^m, J^{rs}] = i(η^{mr} P^s − η^{ms} P^r)
(5.88)
Along with the standard commutation rules of the Lorentz group, this
completely specifies the structure of the Poincare group. In terms of the
1 + 3 split, these commutation rules read (in terms of J^α, K^α, P^0 ≡ H and P^α):

[J^α, J^β] = i ε^{αβγ} J^γ,   [J^α, K^β] = i ε^{αβγ} K^γ,   [J^α, P^β] = i ε^{αβγ} P^γ
(5.89)

[P^α, P^β] = 0,   [K^α, K^β] = −i ε^{αβγ} J^γ,   [K^α, P^β] = i H δ^{αβ}
(5.90)

[J^α, H] = 0,   [P^α, H] = 0,   [K^α, H] = i P^α
(5.91)
The physical meaning of these commutation rules is obvious. Equation
(5.89) tells us that J α generates spatial rotations under which K α and P α
transform as vectors. Equation (5.91) tells us that J α and P α commute
with the Hamiltonian (which is the time translation generator) and hence
are conserved quantities. On the other hand, K α is not conserved which is
why the eigenvalues of K α do not assume much significance.
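Part of this structure can be checked directly in the function-space representation, where P_m = i∂_m and J_{rs} = i(x_r ∂_s − x_s ∂_r). A sympy sketch verifying the covariant rule Eq. (5.88), from which the 1 + 3 split relations follow:

```python
import sympy as sp

x = sp.symbols('x0 x1 x2 x3')
eta = sp.diag(1, -1, -1, -1)
f = sp.Function('f')(*x)

def P(m, h):
    # translation generator on functions: P_m = i d_m
    return sp.I * sp.diff(h, x[m])

def J(r, s, h):
    # Lorentz generator: J_{rs} = i (x_r d_s - x_s d_r), with x_r = eta_{rr} x^r
    return sp.I * (eta[r, r] * x[r] * sp.diff(h, x[s])
                   - eta[s, s] * x[s] * sp.diff(h, x[r]))

# check [P_m, J_{rs}] = i (eta_{mr} P_s - eta_{ms} P_r) for all index values
ok = all(
    sp.expand(P(m, J(r, s, f)) - J(r, s, P(m, f))
              - sp.I * (eta[m, r] * P(s, f) - eta[m, s] * P(r, f))) == 0
    for m in range(4) for r in range(4) for s in range(4))
print(ok)
```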
Exercise 5.8: Prove this.
45 Such a representation is infinite dimensional because of the labeling by
the continuous parameter p.
Recall that we cannot have unitary, finite dimensional representations of
the Poincare group because it is noncompact.
Exercise 5.9: Prove that W 2 commutes with all generators. (Hint: This
is fairly easy. W 2 , being a Lorentz invariant scalar, clearly commutes with
J^{mn}. Further, from its explicit form and the antisymmetry of ε^{mnrs}, it is easy to see that it commutes with P^m as well.)
At the classical level, relativistic invariance is maintained if we use fields
which transform sensibly under the Poincare group. At the quantum level,
these fields become operators and they act on states in a Hilbert space. In
particular, the one-particle state (which emerges from the vacuum through
the action of a field) should exhibit proper behaviour under the action of the
Poincare group in order to maintain relativistic invariance at the quantum
level. So we need to understand how the Poincare group is represented in
the one-particle states of the Hilbert space.
One particle states, which carry such a representation, will be labelled
by the Casimir operators — which are the operators that commute with all
the generators— of the Poincare group. One Casimir operator is obviously
P 2 ≡ Pm P m which clearly commutes with all other generators. If we
label45 the one-particle state by |p, s where p is the momentum of the
particle and s denotes all other variables, then P 2 has the value m2 in this
one-particle state. It follows that one of the labels which characterize the
one particle state is the mass m of the particle, which makes perfect sense.
There is another Casimir operator for the Poincare group, given by W^2 = W_i W^i where

W^m = −(1/2) ε^{mnrs} J_{nr} P_s
(5.92)

is called the Pauli-Lubanski vector. So, we can use its eigenvalue to further characterize the one-particle states. Let us first consider the situation when m ≠ 0. We can then compute this vector in the rest frame of the particle, getting W^0 = 0 and

W^α = (m/2) ε^{0αβγ} J_{βγ} = (m/2) ε^{αβγ} J_{βγ} = m J^α
(5.93)
Clearly, the eigenvalues of W 2 are proportional to those of J 2 which we
know are given by j(j + 1) where j is an integer or half integer. Identifying
this with the spin of a one-particle state with mass m and spin j, we have
the result
46 There is an intuitive way of understanding this result. When m ≠ 0, suitable Lorentz transformations can reduce P^m to the form P^m = (m, 0, 0, 0). Having made this choice, we still have the freedom of performing spatial rotations. That is, the space of one-particle states with momentum fixed as P^m = (m, 0, 0, 0) can still act as a basis for the representation of spatial rotations. Since we also want to include spinor representations, the relevant group is SU(2). This brings in the second spin label j = 0, 1/2, 1, ...
47 The most straightforward way of obtaining this result is to consider an infinitesimal Lorentz transformation with parameters ω^{mn} and impose the condition ω^{mn} P_n = 0 with P^m = (ω, 0, 0, ω). Solving these equations will determine the structure of the group element generating the transformation. Identifying with the result in Eq. (5.96) is then a matter of guesswork.
−W_m W^m |p, j⟩ = m^2 j(j + 1) |p, j⟩,   (m ≠ 0)
(5.94)

So, the massive one-particle states46 are labelled by the mass m and the spin j. As a bonus we see that a massive particle of spin j will have 2j + 1 degrees of freedom.
The massless case, in contrast, turns out to be more subtle. We can
now bring the momentum to the form P m = (ω, 0, 0, ω). Let us ask what
further subgroup of Poincare group will leave this vector unchanged. (Such
a subgroup is called the Little Group.) An obvious candidate is the group
of rotations in the (x, y) plane corresponding to SO(2) and this obvious
choice will turn out to be what we will use. But there is actually another
— less evident47 — Lorentz transformation which leaves P m = (ω, 0, 0, ω)
invariant. The generator for this transformation is given by
Λ = e−i(αA+βB+θC)
(5.95)
where α, β and θ are parameters and A, B, C are matrices with the following elements

C^m_n = (J^3)^m_n;   A^m_n = (K^1 + J^2)^m_n;   B^m_n = (K^2 − J^1)^m_n
(5.96)

These operators J^3, A, B obey the commutation rules:

[J^3, A] = +iB,   [J^3, B] = −iA,   [A, B] = 0
(5.97)
This is formally the same algebra as that generated by the operators
px , py , Lz = xpy − ypx which describe translations and rotations of the
Euclidean plane. Also note that the matrices Amn and B mn (with one
upper and one lower index) are non-Hermitian; this is as it should be,
since they provide a finite dimensional representation of the non-compact
Lorentz generators. For our purpose, what is important is the expression
for W 2 which is now given by
−Wm W m = ω 2 (A2 + B 2 )
(5.98)
In principle, we should now consider one-particle states |p; a, b⟩ with labels corresponding to eigenvalues (a, b) of A and B as well. It is then fairly easy
to show that, unless a = b = 0, one can find a continuous set of eigenvalues
for labeling this state with an internal degree of freedom θ (see Problem 11).
While it is a bit strange that nothing in nature seems to correspond to this
structure, the usual procedure is to choose the one-particle states with
eigenvalues a = b = 0. Therefore, from Eq. (5.98), we conclude that the
massless one-particle states have48 W 2 = 0.
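The little-group statements can be made concrete in the vector representation. The sketch below assumes the generator form (J^{mn})^r_s = i(η^{mr}δ^n_s − η^{nr}δ^m_s) and one consistent choice of identifications, J^3 ~ J^{12}, K^a = J^{a0}; it checks that J^3, A, B all annihilate the null momentum (ω, 0, 0, ω) and obey Eq. (5.97).

```python
import numpy as np

eta = np.diag([1.0, -1.0, -1.0, -1.0])

def gen(m, n):
    # assumed vector-representation generator (J^{mn})^r_s
    M = np.zeros((4, 4), dtype=complex)
    for r in range(4):
        M[r, n] += 1j * eta[m, r]
        M[r, m] -= 1j * eta[n, r]
    return M

def comm(A, B):
    return A @ B - B @ A

J3 = gen(1, 2)               # rotation in the (x, y) plane
A = gen(1, 0) + gen(3, 1)    # K^1 + J^2 (one consistent sign choice)
B = gen(2, 0) - gen(2, 3)    # K^2 - J^1

P = np.array([1.0, 0.0, 0.0, 1.0])   # null momentum (omega, 0, 0, omega), omega = 1
inv_ok = all(np.allclose(M @ P, 0) for M in (J3, A, B))

alg_ok = (np.allclose(comm(J3, A), 1j * B)
          and np.allclose(comm(J3, B), -1j * A)
          and np.allclose(comm(A, B), 0))
print(inv_ok, alg_ok)
```

This is exactly the algebra of translations and a rotation in a Euclidean plane, as noted in the text.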
Once we have eliminated this possibility, what remains as the Little
Group are the rotations in the x − y plane, viz., SO(2). The irreducible
representations of SO(2) are one-dimensional and the generator is the angular momentum J 3 . So the massless representations can be labelled by the
eigenvalue h of J 3 which represents the angular momentum in the direction
of propagation of the massless particle. (This is called helicity.) Taking a
cue from standard angular momentum lore, one would have thought that
h = 0, ±1/2, ±1 etc. Again, this result turns out to be true but its proof
requires using the fact that the universal covering group of the Lorentz
group is SL(2,C) and that is a double covering.49
One — rather non-obvious — conclusion we arrive at is that massless
particles are characterized by a single value h of the helicity. The helicity h = P · J is a pseudoscalar. So if the interactions conserve parity, to
each state of helicity h there must exist another state of helicity −h and
the question arises whether both represent the same particle. Since electromagnetic interactions conserve parity, it seems sensible to represent the
photon through a representation of Poincare group and parity, corresponding to the two states h = ±1. Then the massless photon will have two
degrees of freedom. In contrast, for massless neutrinos which participate
only in the weak interactions (we always ignore gravity, because we don’t
understand it!) which do not conserve parity, we can associate h = −1/2
with a neutrino and h = +1/2 with an antineutrino.
So, guided by physical considerations plus the group structure, we associate with different particles, the relevant states in the Hilbert space such
that they transform under unitary representations of the Poincare group.
As we said before, all such representations should necessarily be infinite dimensional; a single particle state |ψ⟩ will be an eigenstate of P_a with P_a|ψ⟩ = p_a|ψ⟩ for the momentum eigenvalues p_a with p^0 > 0 and p^2 ≥ 0. The |ψ⟩ will transform as |ψ⟩ → exp(iθ_{ab} S^{ab})|ψ⟩ where the boost and rotation are specified by the parameters θ_{ab}, and S^{ab} are the generators of the Lorentz group relevant to the particle.50 This Hilbert space is complete with respect to these states in the sense that

1 = Σ_ψ ∫ dΓ_ψ |ψ⟩⟨ψ|;   dΓ_ψ ≡ Π_{j∈ψ} d^3p_j / [(2π)^3 2E_j]
(5.99)
48 This agrees well with the limit m → 0 in Eq. (5.94). One could have taken the easy way out and argued based on
the easy way out and argued based on
this idea. However, it is intriguing that
a continuous one-parameter representation is indeed allowed by the Little
Group structure for massless particles.
Nobody really knows what to make of
it, so we just ignore it.
49 The standard proof (for SU(2)) that
jz is quantized is purely algebraic. One
considers the matrix elements of jx +
ijy and using the angular momentum
commutation relations, shows that a
contradiction will ensue unless jz is
quantized. Unfortunately, for the Little Group of massless particles, we do
not have any jx , jy and we are treating the generator jz of SO(2) as a single entity. This is why the usual proof
fails.
50 Multi-particle states behave in a
similar way with the transformations
induced by the representations corresponding to each of the particles.
where d Γψ is the Lorentz invariant phase space element (except for an
overall delta function) for the total momentum.
5.5
Dirac Equation and the Dirac Spinor
5.5.1
The Adjoint Spinor and the Dirac Lagrangian
Once we think of ψ(x) as a field with definite Lorentz transformation properties, it is natural to ask whether one can construct a Lorentz invariant
action for this field, varying which we can obtain the Dirac equation —
rather than by taking dubious square-roots — just as we did for the scalar
and the electromagnetic fields. Let us now turn to this task.
This would require constructing scalars, out of the Dirac spinor, which
could be used to build a Lagrangian. Normally, given a column vector ψ,
one would define an adjoint row vector ψ † ≡ (ψ ∗ )T and hope that ψ † ψ will
be a scalar. This will not work in the case of the Dirac spinor. Under a
Lorentz transformation, we have ψ → L1/2 ψ, ψ † → ψ † L†1/2 ; but ψ † ψ does
51 This
is related to the fact mentioned
earlier, viz. that the representation of
the Lorentz algebra is not unitary.
not51 transform as a Lorentz scalar because L†1/2 L1/2 = 1.
However, this situation can be easily remedied as follows. Since in the Weyl representation of the gamma matrices we have (γ^0)† = γ^0 and (γ^α)† = −γ^α, it follows that γ^0 γ^m γ^0 = (γ^m)†, and hence

(S^{mn})† = −(i/4)[(γ^n)†, (γ^m)†] = (i/4)[(γ^m)†, (γ^n)†]
(5.100)

leading to the result

γ^0 (S^{mn})† γ^0 = (i/4)[γ^0(γ^m)†γ^0, γ^0(γ^n)†γ^0] = (i/4)[γ^m, γ^n] = S^{mn}
(5.101)
Therefore,

(γ^0 L_{1/2} γ^0)† = γ^0 exp(iθ_{mn} S^{mn})† γ^0 = exp(−iθ_{mn} γ^0 (S^{mn})† γ^0) = exp(−iθ_{mn} S^{mn}) = L^{−1}_{1/2}
(5.102)

giving us

ψ†γ^0ψ → (ψ† L†_{1/2}) γ^0 (L_{1/2} ψ) = ψ† γ^0 L^{−1}_{1/2} L_{1/2} ψ = ψ†γ^0ψ
(5.103)

Exercise 5.10: Verify that ψ̄γ^m ψ transforms as a vector.
52 To be precise, we only want the equation of motion to be Lorentz invariant rather than the action being a scalar. (The so-called ΓΓ action in GR is not generally covariant, but its variation gives the covariant Einstein equations.) If you had constructed an action A' by using ψ† rather than ψ̄ in Eq. (5.105), you could still obtain the Dirac equation for ψ by varying ψ†. But if you vary ψ you will end up getting a wrong equation which is not consistent with the Dirac equation. (Incidentally, this is a good counter-example to the folklore myth that equations obtained from a variational principle will be automatically consistent.) The real problem with A', as far as the variational principle goes, is that it is not real while A is. While we can be cavalier about the action being a Lorentz scalar, we do not want it to become complex.
Therefore, if we define a Dirac adjoint spinor ψ̄ by the relation

ψ̄(x) = ψ†(x) γ^0
(5.104)

it follows that ψ̄ψ transforms as a scalar under the Lorentz transformations. In a similar manner, we can show that ψ̄γ^m ψ transforms as a four-vector under the Lorentz transformations.
Given these two results, we can construct the Lorentz invariant action for the Dirac field to be

A = ∫ d^4x ψ̄(x) (iγ^m ∂_m − m) ψ(x) ≡ ∫ d^4x L_D
(5.105)
Since ψ is complex, we could vary ψ and ψ̄ independently in this action; varying ψ̄ leads to the Dirac equation, while varying ψ leads to the equation for the adjoint spinor which, of course, does not tell you anything more.52
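The invariance of ψ̄ψ (and the failure of ψ†ψ) can be seen numerically for a single boost. A sketch assuming standard Weyl gamma matrices and a hand-rolled matrix exponential:

```python
import numpy as np

def expm(M, terms=80):
    # minimal Taylor-series matrix exponential
    out = np.eye(M.shape[0], dtype=complex)
    term = np.eye(M.shape[0], dtype=complex)
    for k in range(1, terms):
        term = term @ M / k
        out = out + term
    return out

I2, Z = np.eye(2, dtype=complex), np.zeros((2, 2), dtype=complex)
s3 = np.array([[1, 0], [0, -1]], dtype=complex)
g0 = np.block([[Z, I2], [I2, Z]])
g3 = np.block([[Z, s3], [-s3, Z]])
S03 = (1j / 4) * (g0 @ g3 - g3 @ g0)

w = 0.6                      # illustrative rapidity
L = expm(-1j * w * S03)      # L_{1/2} for a pure boost along z

not_unitary = not np.allclose(L.conj().T @ L, np.eye(4))  # psi^dag psi changes
bar_invariant = np.allclose(L.conj().T @ g0 @ L, g0)      # psi-bar psi does not
print(not_unitary, bar_invariant)
```

The second check is exactly the matrix statement L†_{1/2} γ^0 L_{1/2} = γ^0 behind Eq. (5.103).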
(If you want a more symmetric treatment of ψ and ψ̄, you can write down another action by integrating Eq. (5.105) by parts and allowing ∂_m to act on ψ̄. If you then add the two and divide by 2, you end up getting a more symmetric form of the action.)
It is also instructive to rewrite the Dirac Lagrangian in terms of the two spinors with ψ = (ψL, ψR). A simple calculation gives

L_D = i ψL† σ̄^m ∂_m ψL + i ψR† σ^m ∂_m ψR − m(ψL† ψR + ψR† ψL)
(5.106)

where σ^m and σ̄^m are defined in Eq. (5.31). You can easily verify that ψL† σ̄^m ψL and ψR† σ^m ψR transform as four-vectors under the Lorentz transformations, while the pair ψL† ψR and ψR† ψL are invariant. For example, using Eq. (5.77) and Eq. (5.78), it is easy to see that ψR† ψR is not Lorentz invariant because

δ(ψR† ψR) = (1/2) ψR† [(−iθ + η) · σ] ψR + (1/2)(iθ + η) · (ψR† σ) ψR = η · (ψR† σ ψR) ≠ 0
(5.107)
On the other hand, ψL† ψR is indeed Lorentz invariant:

δ(ψL† ψR) = (1/2) ψL† [(−iθ + η) · σ] ψR + (1/2)(iθ − η) · (ψL† σ) ψR = 0
(5.108)
From these, we can form two real combinations, ψ_L^†ψ_R + ψ_R^†ψ_L and i(ψ_L^†ψ_R − ψ_R^†ψ_L). But since parity interchanges ψ_L and ψ_R, only the first combination is a scalar (while the second one is a pseudoscalar). This explains the structure of the terms which occur in the Dirac Lagrangian in Eq. (5.106) in terms of the two-spinors.
Varying ψ_L^* and ψ_R^* in Eq. (5.106), we again get the Dirac equation in terms of ψ_R and ψ_L obtained earlier in Eq. (5.36):

$$\bar\sigma^m\, i\partial_m\psi_L = m\psi_R,\qquad \sigma^m\, i\partial_m\psi_R = m\psi_L \tag{5.109}$$

Acting with σ^m i∂_m on both sides of the first equation, and using the second equation and the identity {σ^m, σ̄^n} = 2η^{mn}, we immediately obtain

$$(\Box + m^2)\,\psi_L = 0 \tag{5.110}$$
and a similar result for ψ_R. Thus each component of the Dirac spinor satisfies the Klein-Gordon equation with mass m.
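The identity used in this step is easy to check numerically. Below is a minimal NumPy sketch (not from the book), which assumes the standard two-spinor blocks σ^m = (1, σ) and σ̄^m = (1, −σ) for Eq. (5.31), with metric signature (+,−,−,−):

```python
import numpy as np

# Pauli matrices and the two-spinor blocks sigma^m = (1, sigma),
# sigma-bar^m = (1, -sigma); metric signature (+,-,-,-).
s1 = np.array([[0, 1], [1, 0]], dtype=complex)
s2 = np.array([[0, -1j], [1j, 0]], dtype=complex)
s3 = np.array([[1, 0], [0, -1]], dtype=complex)
I2 = np.eye(2, dtype=complex)

sigma = [I2, s1, s2, s3]         # sigma^m
sigbar = [I2, -s1, -s2, -s3]     # sigma-bar^m
eta = np.diag([1.0, -1.0, -1.0, -1.0])

# Check sigma^m sigbar^n + sigma^n sigbar^m = 2 eta^{mn} 1, which is
# what collapses the iterated first-order operator to the d'Alembertian.
for m in range(4):
    for n in range(4):
        lhs = sigma[m] @ sigbar[n] + sigma[n] @ sigbar[m]
        assert np.allclose(lhs, 2 * eta[m, n] * I2)
print("{sigma^m, sigma-bar^n} = 2 eta^{mn} verified")
```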
5.5.2 Charge Conjugation
Another important transformation of the Dirac spinor corresponds to an operation called charge conjugation. To see this in action, let us start with the Dirac equation in the presence of an electromagnetic field,

$$\left[i\gamma^m(\partial_m - ieA_m) - m\right]\psi = 0 \tag{5.111}$$

obtained by the usual substitution ∂_m → D_m ≡ ∂_m − ieA_m. Taking the complex conjugate of this equation will give

$$\left[-i\gamma^{m*}(\partial_m + ieA_m) - m\right]\psi^* = 0 \tag{5.112}$$

But since −γ^{m*} obeys the same algebra as γ^m, it must be possible to find a matrix M such that −γ^{m*} = M^{-1}γ^m M; if we now further define
Exercise 5.11: Prove that the global symmetry of the Dirac Lagrangian under ψ → e^{iα}ψ leads to a conserved current proportional to the ψ̄γ^m ψ we met earlier.
214
Chapter 5. Real Life II: Fermions and QED
ψ_c ≡ Mψ^*, then it follows that ψ_c satisfies the same Dirac equation with the sign of the charge reversed:

$$\left[i\gamma^m(\partial_m + ieA_m) - m\right]\psi_c = 0 \tag{5.113}$$

It is easy to verify that γ²γ^{m*}γ² = γ^m. Therefore, we can define the charge conjugated^{53} spinor ψ_c as:

$$\psi_c \equiv \gamma^2\psi^* \tag{5.114}$$

53 The only imaginary gamma matrix is γ² in both the Dirac and Weyl representations.

Roughly speaking, if ψ describes the electron, ψ_c will describe the positron.
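The representation dependence of this statement can be checked directly. A small NumPy sketch (an illustration, with the standard Dirac-representation γ-matrices assumed):

```python
import numpy as np

# Standard Dirac representation, signature (+,-,-,-).
s1 = np.array([[0, 1], [1, 0]], dtype=complex)
s2 = np.array([[0, -1j], [1j, 0]], dtype=complex)
s3 = np.array([[1, 0], [0, -1]], dtype=complex)
I2 = np.eye(2, dtype=complex)
Z2 = np.zeros((2, 2), dtype=complex)

g = [np.block([[I2, Z2], [Z2, -I2]])]
g += [np.block([[Z2, sk], [-sk, Z2]]) for sk in (s1, s2, s3)]

# gamma^2 is the only imaginary gamma matrix in this representation,
# and it implements charge conjugation: gamma^2 gamma^{m*} gamma^2 = gamma^m.
assert np.allclose(g[2].conj(), -g[2])
for m in range(4):
    assert np.allclose(g[2] @ g[m].conj() @ g[2], g[m])
print("charge conjugation identity verified in the Dirac representation")
```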
5.5.3 Plane Wave Solutions to the Dirac Equation
We saw earlier (see Eq. (5.110)) that each component of the Dirac spinor satisfies the Klein-Gordon equation with mass m. So a good basis for the solutions of the Dirac equation is provided by the plane wave modes exp(±ipx). Let us consider solutions of the form u(p, s)e^{−ipx} and v(p, s)e^{ipx} for ψ, where

$$(\gamma^a p_a - m)\,u(p,s) = 0;\qquad (\gamma^a p_a + m)\,v(p,s) = 0 \tag{5.115}$$

The variable s = ±1 takes care of the two possibilities: spin-up and spin-down. The two spinors u and v behave like ψ under the Lorentz transformation and, in particular, ū u and v̄ v are Lorentz scalars. In the rest frame with p^i = (m, 0), we have γ^j p_j − m = m(γ⁰ − 1). Hence, the spinor solution to the Dirac equation in the rest frame, u(p = 0, s), will satisfy (γ⁰ − 1)u = 0. All we now need to do is to solve this in the rest frame and normalize the solutions in a Lorentz invariant manner; then we will have a useful set of plane wave solutions for the Dirac equation. The expressions for u(p, s) and v(p, s) in an arbitrary frame can be obtained by a Lorentz boost, since we know how the Dirac spinor changes under a Lorentz boost. There are many ways to do this and we will illustrate two useful choices.
In the Weyl basis, (γ⁰ − 1)u = 0 implies that u is made of a pair of identical two-spinors (ζ, ζ). (We have suppressed the spin dependence ζ(s) in the rest frame to keep the notation simple.) Similarly, v is made of a pair of spinors (η, −η). For example, we can choose this set to be

$$u_\uparrow = \begin{pmatrix}1\\0\\1\\0\end{pmatrix}\ \text{and}\ u_\downarrow = \begin{pmatrix}0\\1\\0\\1\end{pmatrix};\qquad v_\uparrow = \begin{pmatrix}-1\\0\\1\\0\end{pmatrix}\ \text{and}\ v_\downarrow = \begin{pmatrix}0\\1\\0\\-1\end{pmatrix} \tag{5.116}$$
One convention for normalizing this spinor is u†u = 2E_p, which implies ζ†ζ = m in the rest frame. We will rescale the spinor and take it to have the form ζ = √m ξ in the rest frame. Under a Lorentz boost, these spinors transform as

$$u(p,s) = R\,u(m,s) = \begin{pmatrix} R_L & 0 \\ 0 & R_R \end{pmatrix}\begin{pmatrix}\sqrt{m}\,\xi \\ \sqrt{m}\,\xi\end{pmatrix} = \begin{pmatrix}\sqrt{m}\,R_L\,\xi \\ \sqrt{m}\,R_R\,\xi\end{pmatrix} \tag{5.117}$$
Using the result in Eq. (5.50) and using γ = (E/m), γβ n̂ = p/m, we find that the boosted spinors have the form

$$u(p,s) = \begin{pmatrix}\sqrt{E - p\cdot\sigma}\;\xi \\ \sqrt{E + p\cdot\sigma}\;\xi\end{pmatrix};\qquad v(p,s) = \begin{pmatrix}\sqrt{E - p\cdot\sigma}\;\eta \\ -\sqrt{E + p\cdot\sigma}\;\eta\end{pmatrix} \tag{5.118}$$
Another choice is to work in the Dirac representation and use the normalization ū u = 1. If we choose the Dirac representation of the γ-matrices, then the four independent spinors we need, u(p, +1), u(p, −1) and v(p, +1), v(p, −1), can be taken to be the column vectors

$$u = \begin{pmatrix}1\\0\\0\\0\end{pmatrix}\ \text{and}\ \begin{pmatrix}0\\1\\0\\0\end{pmatrix};\qquad v = \begin{pmatrix}0\\0\\1\\0\end{pmatrix}\ \text{and}\ \begin{pmatrix}0\\0\\0\\1\end{pmatrix} \tag{5.119}$$

This choice corresponds to the implicit normalization conditions of the form

$$\bar u(p,s)\,u(p,s) = 1,\qquad \bar v(p,s)\,v(p,s) = -1,\qquad \bar u v = 0,\qquad \bar v u = 0 \tag{5.120}$$
Obviously, Lorentz invariance plus basis independence tells us that these relations should hold in general.^{54} If you now do the Lorentz boost and work through the algebra, you will get the solution in an arbitrary frame to be:

$$u(p,s) = \frac{\gamma^a p_a + m}{\sqrt{2m(E_p + m)}}\,u(0,s);\qquad v(p,s) = \frac{-\gamma^a p_a + m}{\sqrt{2m(E_p + m)}}\,v(0,s) \tag{5.121}$$

It is often useful to express these normalization conditions in a somewhat different form. To do this, note that in the rest frame, we have the relation

$$\sum_s u_a(p,s)\,\bar u_b(p,s) = \left(\frac{\gamma^0 + 1}{2}\right)_{ab};\qquad \sum_s v_a(p,s)\,\bar v_b(p,s) = \left(\frac{\gamma^0 - 1}{2}\right)_{ab} \tag{5.122}$$
To express this in a more general form, we have to rewrite the right hand side as a manifestly Lorentz invariant expression. It is possible to do this rather easily if you know the result! The trick is to write^{55}

$$\sum_s u_a(p,s)\,\bar u_b(p,s) = \left(\frac{\gamma^a p_a + m}{2m}\right)_{ab};\qquad \sum_s v_a(p,s)\,\bar v_b(p,s) = \left(\frac{\gamma^a p_a - m}{2m}\right)_{ab} \tag{5.123}$$

and verify that it holds in the rest frame. If you did not know the result, you could reason as follows: There is a theorem which states that any 4 × 4 matrix can be expressed as a linear combination of the set of 16 matrices^{56}

$$\left\{1,\ \gamma^m,\ \sigma^{mn},\ \gamma^m\gamma^5,\ \gamma^5\right\} \tag{5.124}$$

So the left hand side of Eq. (5.123), being a 4 × 4 matrix, should be expressible as a linear combination of these matrices; all we need to do is to reason out which of these can occur and with what coefficients. You cannot have γ⁵ or γ^m γ⁵ because of parity; you cannot have σ^{mn} because of Lorentz invariance, once you realize that you have only a single Lorentz vector p_a available. Therefore, the right hand side must be a linear combination of γ^a p_a and m. You can fix the relative coefficient by operating from the left with γ^a p_a. The overall normalization is fixed by taking a = b and summing over a.
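Equations (5.120)–(5.123) can be verified together numerically: build the rest-frame spinors in the Dirac representation, boost them with Eq. (5.121), and evaluate the spin sums. A sketch (NumPy; the explicit Dirac-representation matrices are assumptions of the sketch, not of the text):

```python
import numpy as np

# Dirac representation gammas, signature (+,-,-,-).
s1 = np.array([[0, 1], [1, 0]], dtype=complex)
s2 = np.array([[0, -1j], [1j, 0]], dtype=complex)
s3 = np.array([[1, 0], [0, -1]], dtype=complex)
I2, Z2 = np.eye(2, dtype=complex), np.zeros((2, 2), dtype=complex)
g = [np.block([[I2, Z2], [Z2, -I2]])]
g += [np.block([[Z2, sk], [-sk, Z2]]) for sk in (s1, s2, s3)]
eta = np.diag([1.0, -1.0, -1.0, -1.0])
I4 = np.eye(4, dtype=complex)

mass = 1.0
k = np.array([0.3, -0.4, 0.25])                 # arbitrary spatial momentum
Ep = np.sqrt(mass**2 + k @ k)
p = np.array([Ep, *k])
pslash = sum(eta[a, a] * p[a] * g[a] for a in range(4))   # gamma^a p_a

# Boosts of Eq. (5.121) acting on the rest-frame column spinors
boost_u = (pslash + mass * I4) / np.sqrt(2 * mass * (Ep + mass))
boost_v = (-pslash + mass * I4) / np.sqrt(2 * mass * (Ep + mass))
u = [boost_u @ I4[:, i] for i in (0, 1)]        # u(0,s) = e_0, e_1
v = [boost_v @ I4[:, i] for i in (2, 3)]        # v(0,s) = e_2, e_3

def bar(w):                                     # Dirac adjoint of a column spinor
    return w.conj() @ g[0]

# Spin sums of Eq. (5.123) and normalizations of Eq. (5.120)
usum = sum(np.outer(w, bar(w)) for w in u)
vsum = sum(np.outer(w, bar(w)) for w in v)
assert np.allclose(usum, (pslash + mass * I4) / (2 * mass))
assert np.allclose(vsum, (pslash - mass * I4) / (2 * mass))
assert np.isclose(bar(u[0]) @ u[0], 1.0)
assert np.isclose(bar(v[0]) @ v[0], -1.0)
print("spin sums (5.123) and normalization (5.120) verified")
```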
54 Notice that ū u = 1 while v̄ v = −1; we do not have a choice in this matter. Also note that, under charge conjugation, the individual spinor components change as (u↑)_c = v↓, (u↓)_c = v↑, and (v↑)_c = u↓, (v↓)_c = u↑. In other words, charge conjugation takes particles into antiparticles and flips the spin.
Exercise 5.12: Obtain Eq. (5.121).
55 If we use the alternative normalisation ū u = 2m in the rest frame, we need to redefine u and v by scaling them up by a factor √(2m), so that you don't have the factor 2m on the right hand side of Eq. (5.123). This is particularly useful if you want to study a massless fermion.
56 A simple way to understand this result is as follows. We know that γ^m γ^n = ±γ^n γ^m, where the sign is plus if m = n and minus if m ≠ n. This allows us to reorder any product of gamma matrices as ±γ⁰⋯γ⁰ γ¹⋯γ¹ γ²⋯γ² γ³⋯γ³. In each of the sets, because γ^a squares to ±1, we can simplify this product to the form ±(γ⁰ or 1) × (γ¹ or 1) × (γ² or 1) × (γ³ or 1). So the net result of multiplying any number of gamma matrices will be, up to a sign or ±i factor, one of the following 16 matrices: 1 or γ^m; or γ^m γ^n with m ≠ n, which can be written as (1/2)γ^{[m}γ^{n]}; or a product like γ^l γ^m γ^n with different indices, which, in turn, can be expressed in terms of (1/6)γ^{[l}γ^m γ^{n]} = iε^{lmnr}γ_r γ⁵; or γ⁰γ¹γ²γ³ = −iγ⁵. These are clearly 16 linearly independent 4 × 4 matrices. Given the fact that any 4 × 4 matrix has only 16 independent components, it follows that any matrix can be expanded in terms of these.
57 Here, as well as in many other expressions, we will suppress the spin index and write u(p) for u(p, s) etc.
Another combination which usually comes up in the calculations is ū(p′)γ_m u(p). There is an identity (called the Gordon identity) which allows us to express terms of the kind ū(p′)γ_m u(p) in a different form involving p + p′ and q ≡ p′ − p. This identity is quite straightforward to prove along the following lines:^{57}

$$\bar u(p')\,\gamma_m\, u(p) = \frac{1}{2m}\,\bar u(p')\left(\gamma^n p'_n\,\gamma_m + \gamma_m\,\gamma^n p_n\right)u(p) = \frac{1}{2m}\,\bar u(p')\left(p'^n\gamma_n\gamma_m + \gamma_m\gamma_n p^n\right)u(p)$$
$$= \frac{1}{2m}\,\bar u(p')\left(p'^n\left[\eta_{mn} + i\sigma_{mn}\right] + \left[\eta_{mn} - i\sigma_{mn}\right]p^n\right)u(p) = \frac{1}{2m}\,\bar u(p')\left[(p'_m + p_m) + i\sigma_{mn}q^n\right]u(p) \tag{5.125}$$
The existence of the σ^{mn} term tells you that, when coupled to an electromagnetic potential in Fourier space through ū(p′)γ_m u(p)A^m(q), where q = p′ − p is the relevant momentum transfer, we are led to a coupling involving the term

$$\sigma^{mn}q_n A_m = \frac{1}{2}\,\sigma^{mn}\left(q_n A_m - q_m A_n\right) = \frac{1}{2}\,\sigma^{mn}F_{nm} \tag{5.126}$$

in the interaction. This is the real source of the magnetic moment coupling arising in the Dirac equation. We will need these relations later on while studying QED.
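The Gordon identity lends itself to a direct numerical test with the boosted spinors of Eq. (5.121). A sketch (NumPy; the Dirac representation and the convention σ_{mn} = (i/2)[γ_m, γ_n] are assumptions of the sketch):

```python
import numpy as np

s1 = np.array([[0, 1], [1, 0]], dtype=complex)
s2 = np.array([[0, -1j], [1j, 0]], dtype=complex)
s3 = np.array([[1, 0], [0, -1]], dtype=complex)
I2, Z2 = np.eye(2, dtype=complex), np.zeros((2, 2), dtype=complex)
g = [np.block([[I2, Z2], [Z2, -I2]])]
g += [np.block([[Z2, sk], [-sk, Z2]]) for sk in (s1, s2, s3)]
eta = np.diag([1.0, -1.0, -1.0, -1.0])
I4 = np.eye(4, dtype=complex)
g_low = [eta[a, a] * g[a] for a in range(4)]     # gamma_m with lowered index

mass = 1.0
def onshell(k):                                   # four-vector from spatial momentum
    return np.array([np.sqrt(mass**2 + k @ k), *k])
def slash(p):                                     # gamma^a p_a
    return sum(eta[a, a] * p[a] * g[a] for a in range(4))
def u_spinor(p, s):                               # Eq. (5.121), s = 0 or 1
    return (slash(p) + mass * I4) @ I4[:, s] / np.sqrt(2 * mass * (p[0] + mass))
def bar(w):
    return w.conj() @ g[0]
def sigma_low(m, n):                              # sigma_{mn} = (i/2)[gamma_m, gamma_n]
    return 0.5j * (g_low[m] @ g_low[n] - g_low[n] @ g_low[m])

p  = onshell(np.array([0.2, -0.1, 0.3]))
pp = onshell(np.array([-0.5, 0.1, 0.2]))
q = pp - p                                        # q^n = p'^n - p^n
uin, uout = u_spinor(p, 0), u_spinor(pp, 1)

# Compare both sides of Eq. (5.125) component by component
for m in range(4):
    lhs = bar(uout) @ g_low[m] @ uin
    rhs_mat = eta[m, m] * (p + pp)[m] * I4 \
              + 1j * sum(sigma_low(m, n) * q[n] for n in range(4))
    assert np.isclose(lhs, bar(uout) @ rhs_mat @ uin / (2 * mass))
print("Gordon identity (5.125) verified for random on-shell momenta")
```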
5.6 Quantizing the Dirac Field

5.6.1 Quantization with Anticommutation Rules
Now that we have found the mode functions which satisfy the Dirac equation, it is straightforward to elevate the Dirac field to an operator and attempt its quantization. The mode expansion in plane waves will be

$$\psi(x) = \sum_s\int \frac{d^3p}{(2\pi)^{3/2}\,(E_p/m)^{1/2}}\left[b(p,s)\,u(p,s)\,e^{-ipx} + d^\dagger(p,s)\,v(p,s)\,e^{ipx}\right] \tag{5.127}$$

As usual, the px in the exponentials are evaluated on-shell with p⁰ = E_p = (p² + m²)^{1/2}. On the right hand side, we have integrated over the momentum p and summed over the spin s. The overall normalization is slightly different from what we are accustomed to (e.g., in the case of a scalar field) and has an m^{1/2} factor; but this is just a matter of choice in our normalisation of the mode functions. Obviously, one could rescale u and v to change this factor. The b and d† are distinct, just as in the case of the complex scalar field we encountered in Sect. 3.5 (see Eq. (3.148)), because ψ is complex. Here b annihilates the electrons and d† creates the positrons; both operations have the effect of reducing the amount of charge by e = −|e|.
The key new feature in the quantization, of course, has to do with the replacement of the commutation relations between the creation and annihilation operators by anti-commutation relations. That is, for the creation and annihilation operators for the electron, we will impose the conditions

$$\{b(p,s),\ b^\dagger(p',s')\} = \delta^{(3)}(p-p')\,\delta_{ss'};\qquad \{b(p,s),\ b(p',s')\} = 0;\qquad \{b^\dagger(p,s),\ b^\dagger(p',s')\} = 0 \tag{5.128}$$
There is a corresponding set of relations for the creation and annihilation
operators for the positrons involving d and d† .
The rest of the discussion proceeds in the standard manner. One starts with a vacuum state annihilated by b and d, and constructs the one-particle states by acting on it with b† or d†. You cannot go any further because (b†)² = (d†)² = 0, showing that you cannot put two electrons (or two positrons) in the same state. Since we have done much of this in the case of bosonic particles, we will not pause to describe it in detail, except to highlight the key new features.
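The statement (b†)² = 0 and the multi-mode algebra of Eq. (5.128) can be realized concretely on a finite Fock space. A sketch (NumPy; the two-mode Jordan-Wigner construction is an illustration, not taken from the book):

```python
import numpy as np

# Two fermionic modes realized on a 4-dimensional Fock space via the
# Jordan-Wigner construction.
b = np.array([[0, 1], [0, 0]], dtype=complex)   # one-mode annihilation operator
Z = np.array([[1, 0], [0, -1]], dtype=complex)  # (-1)^n string factor
I2 = np.eye(2, dtype=complex)

b1 = np.kron(b, I2)      # annihilates mode 1
b2 = np.kron(Z, b)       # annihilates mode 2; the Z makes the modes anticommute

def anti(A, B):
    return A @ B + B @ A

# The algebra of Eq. (5.128) on this finite space:
assert np.allclose(anti(b1, b1.conj().T), np.eye(4))
assert np.allclose(anti(b2, b2.conj().T), np.eye(4))
assert np.allclose(anti(b1, b2), np.zeros((4, 4)))
assert np.allclose(anti(b1, b2.conj().T), np.zeros((4, 4)))
# Pauli exclusion: you cannot create two quanta in the same mode.
assert np.allclose(b1.conj().T @ b1.conj().T, np.zeros((4, 4)))
print("two-mode fermionic oscillator algebra verified")
```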
An obvious question to ask is what happens to the field commutation rules when we impose Eq. (5.128). You would have guessed that they will also obey anticommutation rules rather than commutation rules. This is in fact easy to verify for, say, the anticommutator {ψ(x), ψ̄(x′)} at equal times t = t′. The calculation is simplified by using translational invariance and setting t = t′ = 0 as well as x′ = 0. This will give, with straightforward algebra using the anticommutation rules for the creation and annihilation operators, the result

$$\{\psi(x,0),\,\bar\psi(0)\} = \int\frac{d^3p}{(2\pi)^3\,(E_p/m)}\sum_s\left[u(p,s)\,\bar u(p,s)\,e^{-ip\cdot x} + v(p,s)\,\bar v(p,s)\,e^{ip\cdot x}\right] \tag{5.129}$$
We now use the result in Eq. (5.123) to obtain^{58}

$$\{\psi(x,0),\,\bar\psi(0)\} = \int\frac{d^3p}{(2\pi)^3\,(2E_p)}\left[(\gamma^a p_a + m)\,e^{-ip\cdot x} + (\gamma^a p_a - m)\,e^{ip\cdot x}\right] = \int\frac{d^3p}{(2\pi)^3\,(2E_p)}\,2p^0\gamma^0\,e^{-ip\cdot x} = \gamma^0\,\delta^{(3)}(x) \tag{5.130}$$

58 We have also used the fact that the integral vanishes when the integrand is an odd function of p.
In other words,^{59}

$$\{\psi_a(x,t),\ i\psi_b^\dagger(0,t)\} = i\,\delta^{(3)}(x)\,\delta_{ab} \tag{5.131}$$

59 From the structure of the Dirac Lagrangian, L = ψ̄(iγ^m∂_m − m)ψ, we find that the canonical momentum is indeed given by π_ψ = [∂L/∂(∂₀ψ)] = iψ̄γ⁰ = iψ†. So the choice of variables for imposing the commutation rules makes sense; but, of course, it is the anticommutator rather than the commutator which we use.

In the same way, one can also show that the anticommutators {ψ, ψ} = 0 and {ψ†, ψ†} = 0. One could have also postulated these anticommutation rules for the field operators and obtained the corresponding results for the creation and annihilation operators.
This anticommutation rule which we use for fermions has several nontrivial implications all round. As a first illustration of the peculiarities involved, let us compute the Hamiltonian for the field in terms of the creation and annihilation operators. We begin with the standard definition of the Hamiltonian density given by

$$\mathcal{H} = \pi\,\frac{\partial\psi}{\partial t} - L = \bar\psi\left(-i\gamma^\alpha\partial_\alpha + m\right)\psi \tag{5.132}$$

Note that the last expression has only the spatial derivatives, which can be converted into a time derivative using the Dirac equation. This allows us to express the Hamiltonian in the form:

$$H = \int d^3x\,\mathcal{H} = \int d^3x\;\bar\psi\, i\gamma^0\,\frac{\partial\psi}{\partial t} \tag{5.133}$$
Plugging in the mode expansion for ψ and carrying out the spatial integration, it is easy to reduce^{60} this to the form:

$$H = \int\frac{d^3p}{(2\pi)^3}\sum_s E_p\left[b^\dagger(p,s)\,b(p,s) - d(p,s)\,d^\dagger(p,s)\right] \tag{5.134}$$

60 Make sure you understand the manner in which various terms come about. Very schematically, ψ̄ contributes (b† + d) while ψ̇ contributes (b − d†), where the minus sign arises from ∂_t ψ. So the product goes like (b† + d)(b − d†) ∼ b†b − dd†, because the condition v̄u = 0 kills the cross terms.
When we re-order the second term, with d† to the left of d, we have to use the anticommutation rule and write

$$-d(p,s)\,d^\dagger(p,s) = d^\dagger(p,s)\,d(p,s) - \delta^{(3)}(0) \tag{5.135}$$

The last term is the Dirac delta function in momentum space, evaluated for zero momentum separation; that is:

$$\delta^{(3)}(0) = \lim_{p\to 0}\int d^3x\,\exp[i\,p\cdot x] = \int d^3x \tag{5.136}$$
which is just the (infinite) normalisation volume. Substituting Eq. (5.135) into Eq. (5.134), we get the final expression for the Hamiltonian to be

$$H = \int\frac{d^3p}{(2\pi)^3}\sum_s E_p\left[b^\dagger(p,s)\,b(p,s) + d^\dagger(p,s)\,d(p,s)\right] - \int d^3x\int\frac{d^3p}{(2\pi)^3}\sum_s E_p \tag{5.137}$$
61 So, if nature has the same number of bosonic and fermionic degrees of freedom, then the total zero-point energy will vanish, which would be rather pleasant. This is what happens in theories with a hypothetical symmetry, called supersymmetry, which the particle physicists are fond of. So far, we have seen no sign of it, having discovered only roughly half the particles predicted by the theory; but many live in hope.

Exercise 5.13: Try this out.

62 You may object, saying that even the expression in Eq. (5.137) is infinitely negative because of the zero-point energy. But remember that you are not supposed to raise issues about the zero-point energy in a QFT course; or rather, you make it go away by introducing the normal ordering prescription we discussed earlier.

63 The fact that such a simple, universal rule, viz. that fermions obey the Pauli exclusion principle, requires the formidable machinery of QFT for its proof is probably because we do not understand fermions at a sufficiently deep level. But most physicists would disagree with this viewpoint.
The first two terms are the contributions from electrons and positrons of energy E_p corresponding to the momentum p and spin s. This makes complete sense. The last term has the form:

$$E_0 = -\int d^3x\int\frac{d^3p}{(2\pi)^3}\sum_s 2\left(\frac{1}{2}E_p\right) \tag{5.138}$$

This is our old friend, the zero-point energy, with (1/2)E_p being contributed by an electron and a positron separately, for each spin state. But the curious fact is that it comes with an overall minus sign, unlike the bosonic degrees of freedom which contribute +(1/2)E_p per mode. This is yet another peculiarity of the fermionic degrees of freedom.^{61}
In our approach, we postulated the Pauli exclusion principle for fermions
and argued that this requires the creation and annihilation operators to
anticommute rather than commute. This, in turn, leads to the anticommutation rules for the field with all the resulting peculiarities. A somewhat
more formal approach, usually taken in textbooks, is to start from the anticommutation rules for the field and obtain the anticommutation rules for
the creation and annihilation operators as a result. In such an approach,
one would first show that something will go wrong if we use the usual
commutation rules rather than anticommutation rules for the field operators. What goes wrong is closely related to the expression for the energy
we have obtained, which — once we throw away the zero-point energy —
is a positive definite quantity. If you quantize the Dirac field using the
commutators, you will find that the term involving d† d in Eq. (5.137) will
come with a negative sign and the energy will be unbounded from below.
This is the usual motivation given in textbooks for using anticommutators
rather than commutators for quantizing the Dirac field.62 It is possible to
provide a rigorous proof that one should quantize fermions with anticommutators and bosons with commutators by using very formal mechanisms
of quantum field theory; but we will not discuss it.63
5.6.2 Electron Propagator
Having quantized the Dirac field, we can now determine its propagator. A cheap, but useful, trick to obtain the propagator is the following. We know that when any field Ψ obeys an equation of the kind K(∂_a)Ψ(x) = 0, the propagator in momentum space is given by 1/K(p_a), and the real space propagator, fixed by translational invariance, is just:

$$G(x-y) = \int\frac{d^4p}{(2\pi)^4}\;\frac{e^{-ip\cdot(x-y)}}{K(p)} \tag{5.139}$$
For example, the Klein-Gordon field has K(p) = p² − m² + iε, and the corresponding propagator is precisely what we find^{64} using the prescription in the above equation. (See e.g., Eq. (1.96).) For the Dirac field, K(∂_a) = iγ^a∂_a − m and hence K(p) = γ^a p_a − m + iε. Hence, the propagator in momentum space (which is conventionally written with an extra i-factor) is given by:

$$iS(p) \equiv \frac{i}{\gamma^a p_a - m + i\epsilon} = i\,\frac{(\gamma^a p_a + m)}{p^2 - m^2 + i\epsilon} \tag{5.140}$$

where the second expression is obtained by multiplying both the numerator and the denominator by (γ^a p_a + m). This propagator, like everything else in sight, is a 4 × 4 matrix. The corresponding real space propagator is given by
$$iS(x) = \int\frac{d^4p}{(2\pi)^4}\,e^{-ip\cdot x}\;\frac{i}{\gamma^a p_a - m + i\epsilon} = \int\frac{d^4p}{(2\pi)^4}\,e^{-ip\cdot x}\;\frac{i(\gamma^a p_a + m)}{p^2 - m^2 + i\epsilon} \tag{5.141}$$

In fact, one can write this even more simply as

$$S(x) = (i\gamma^a\partial_a + m)\int\frac{d^4p}{(2\pi)^4}\;\frac{e^{-ip\cdot x}}{p^2 - m^2 + i\epsilon} = (i\gamma^a\partial_a + m)\,G(x) \tag{5.142}$$

where G(x) is the scalar field propagator.
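The rationalization step behind Eq. (5.140) and Eq. (5.141) rests on (γ^a p_a − m)(γ^a p_a + m) = (p² − m²)1, valid for any (off-shell) p. A quick NumPy check (Dirac representation assumed):

```python
import numpy as np

s1 = np.array([[0, 1], [1, 0]], dtype=complex)
s2 = np.array([[0, -1j], [1j, 0]], dtype=complex)
s3 = np.array([[1, 0], [0, -1]], dtype=complex)
I2, Z2 = np.eye(2, dtype=complex), np.zeros((2, 2), dtype=complex)
g = [np.block([[I2, Z2], [Z2, -I2]])]
g += [np.block([[Z2, sk], [-sk, Z2]]) for sk in (s1, s2, s3)]
eta = np.diag([1.0, -1.0, -1.0, -1.0])
I4 = np.eye(4, dtype=complex)

p = np.array([0.7, 0.2, -0.4, 0.1])      # arbitrary, not on the mass shell
m = 1.3
pslash = sum(eta[a, a] * p[a] * g[a] for a in range(4))
p2 = (eta @ p) @ p                        # p^2 = p_a p^a

# The numerator (gamma.p + m) cancels the matrix structure of the denominator:
assert np.allclose((pslash - m * I4) @ (pslash + m * I4), (p2 - m**2) * I4)
print("(gamma.p - m)(gamma.p + m) = (p^2 - m^2) 1 verified")
```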
If we want to do this more formally (without the cheap trick), we have to define the propagator in terms of the time ordered product T[ψ(x)ψ̄(0)] of the Dirac operators, in a Lorentz invariant manner. The definition of this operator was quite straightforward for bosonic fields, leading to a relativistically invariant expression. But in the case of fermionic fields, we again have to be a little careful in its definition.
To see what is involved, let us take the propagator in Eq. (5.141), obtained above through our trick, and try to write it in terms of the three dimensional momentum integral involving d³p rather than d⁴p. As usual, when we do the p⁰ integration in the complex plane, we have poles at p⁰ = ±(E_p − iε). So, for x⁰ > 0, we need to close the contour in the lower half plane to make the factor exp(−ip⁰x⁰) converge. Going around the pole at +[E_p − iε] in the clockwise direction (see Fig. 5.1(a)), we get

$$iS(x) = (-i)\,i\int\frac{d^3p}{(2\pi)^3}\,e^{-ipx}\;\frac{\gamma^a p_a + m}{2E_p} \tag{5.143}$$
On the other hand, when x⁰ < 0, we have to close the contour in the upper half plane, but now we go around the pole at −[E_p − iε] in the anticlockwise direction. This leads to

$$iS(x) = i^2\int\frac{d^3p}{(2\pi)^3}\,e^{iE_p x^0 + ip\cdot x}\;\frac{1}{-2E_p}\left(-E_p\gamma^0 - p^\alpha\gamma_\alpha + m\right) \tag{5.144}$$
64 Of course, we need to add a positive imaginary part to m to get the correct iε factor; this rule is universal.

Figure 5.1: Contours to compute the p⁰-integral for the propagator. (Panel (a): for x⁰ > 0 the contour is closed in the lower half plane, going around the pole at E_p − iε; panel (b): for x⁰ < 0 it is closed in the upper half plane, around the pole at −E_p + iε.)
We now change the integration variable from p to −p to get the Lorentz invariant factors e^{ipx} and γ^a p_a. Then this contribution becomes:

$$iS(x) = -\int\frac{d^3p}{(2\pi)^3}\,e^{ipx}\;\frac{1}{2E_p}\left(\gamma^a p_a - m\right) \tag{5.145}$$
Putting these two together, we get:

$$iS(x) = \int\frac{d^3p}{(2\pi)^3\,(E_p/m)}\left[\theta(x^0)\,\frac{\gamma^a p_a + m}{2m}\,e^{-ipx} - \theta(-x^0)\,\frac{\gamma^a p_a - m}{2m}\,e^{ipx}\right] \tag{5.146}$$
This suggests that, if we are to get the same propagator from the time-ordered product of the fields, we need to introduce a crucial minus sign in the time ordered product and define it as

$$T\left[\psi(x)\bar\psi(0)\right] \equiv \theta(x^0)\,\psi(x)\bar\psi(0) - \theta(-x^0)\,\bar\psi(0)\psi(x) \tag{5.147}$$
There is actually a simple reason why the time ordering operator for fermions needs to be defined with a minus sign. We know that fermionic annihilation operators, for example, anticommute for generic momenta and times. Therefore,

$$T\{a_p(t)\,a_q(t')\} = -T\{a_q(t')\,a_p(t)\} \tag{5.148}$$

So, if we try to define the time ordering operation as merely moving the operators around and putting them in time order, this equation will imply that the time ordered product vanishes! To prevent this from happening, we should define the time ordering operator for fermions by keeping track of the minus signs which arise when we move the anticommuting operators through each other. So, the correct definition will require the extra minus sign, as in:

$$T\{\psi(x)\chi(y)\} = \psi(x)\chi(y)\,\theta(x^0 - y^0) - \chi(y)\psi(x)\,\theta(y^0 - x^0) \tag{5.149}$$
Having motivated the minus sign by reverse engineering our result in Eq. (5.141), we could just take Eq. (5.147) to be our definition and compute the vacuum expectation value ⟨0|T[ψ(x)ψ̄(0)]|0⟩. An elementary calculation now gives, for x⁰ > 0,

$$iS(x) = \langle 0|\psi(x)\bar\psi(0)|0\rangle = \int\frac{d^3p}{(2\pi)^3\,(E_p/m)}\sum_s u(p,s)\,\bar u(p,s)\,e^{-ipx} = \int\frac{d^3p}{(2\pi)^3\,(E_p/m)}\;\frac{\gamma^a p_a + m}{2m}\,e^{-ipx} \tag{5.150}$$
For x⁰ < 0, one needs to be slightly more careful, but it is again easy to obtain the result

$$iS(x) = -\langle 0|\bar\psi(0)\psi(x)|0\rangle = -\int\frac{d^3p}{(2\pi)^3\,(E_p/m)}\sum_s v(p,s)\,\bar v(p,s)\,e^{ipx} = -\int\frac{d^3p}{(2\pi)^3\,(E_p/m)}\;\frac{\gamma^a p_a - m}{2m}\,e^{ipx} \tag{5.151}$$

Comparing with the result in Eq. (5.146), we find that everything turns out fine once we have defined the time ordering operator with the minus sign.
Another closely related object we could compute is the anticommutator at unequal times, defined by

$$iS = \{\psi(x),\ \bar\psi(y)\} \tag{5.152}$$

The computation is straightforward and is very similar to the one we did above. The final result can be expressed in the form

$$iS(x-y) = (i\gamma^a\partial_a + m)\left[G(x-y) - G(y-x)\right];\qquad \partial_a \equiv \frac{\partial}{\partial x^a} \tag{5.153}$$

where

$$G(x-y) = \int\frac{d^3p}{(2\pi)^3\,2E_p}\,e^{-ip(x-y)} \tag{5.154}$$
is the standard scalar field propagator. The other two anticommutators vanish:

$$\{\psi(x),\ \psi(y)\} = \{\bar\psi(x),\ \bar\psi(y)\} = 0 \tag{5.155}$$
This result, on the face of it, should ring some alarm bells. In the case of the scalar field, we computed the corresponding quantity, viz. the commutator [φ(x), φ(y)], which is exactly the term within the square bracket in Eq. (5.153). We also saw that [φ(x), φ(y)] = 0 for (x − y)² < 0, and made a song and dance about it, claiming that this fact, viz. that the scalar fields at two spacelike separated events commute with each other, is an expression of causality. This doesn't hold for the Dirac field. Instead, what we get from the known structure of G(x) is the result:

$$\{\psi(x),\ \bar\psi(y)\} = 0;\qquad (x-y)^2 < 0 \tag{5.156}$$

That is, the anti-commutator (rather than the commutator) vanishes outside the light cone. So, what happened to causality in the presence of fermions?
The only way to save the theory is to declare that ψ(x) is not a direct observable in the laboratory.^{65} But if that is the case, what are the observables for the fermionic fields? We would expect them to be built out of bilinears of the field, like

$$O(x) = \bar\psi_a(x)\; O_{ab}(x)\; \psi_b(x) \tag{5.157}$$

where O_{ab}(x) is essentially a Dirac matrix consisting of c-numbers and differential operators.^{66} We would like the commutator (not the anticommutator) [O(x), O(y)] of these observables to vanish for spacelike separations; otherwise we are sunk! Let us calculate this commutator, which is given by

$$[O(x),\ O(y)] = O_{ab}(x)\; O_{cd}(y)\left[\bar\psi_a(x)\psi_b(x),\ \bar\psi_c(y)\psi_d(y)\right] \tag{5.158}$$
Using the operator identities

$$[A, BC] = [A, B]\,C + B\,[A, C];\qquad [A, BC] = \{A, B\}\,C - B\,\{A, C\} \tag{5.159}$$

we can expand out the expression and relate it to the basic anticommutators of the fields at unequal times:

$$\left[\bar\psi_a(x)\psi_b(x),\ \bar\psi_c(y)\psi_d(y)\right] = \bar\psi_a(x)\left[\{\psi_b(x), \bar\psi_c(y)\}\,\psi_d(y) - \bar\psi_c(y)\,\{\psi_b(x), \psi_d(y)\}\right] + \left[\{\bar\psi_a(x), \bar\psi_c(y)\}\,\psi_d(y) - \bar\psi_c(y)\,\{\bar\psi_a(x), \psi_d(y)\}\right]\psi_b(x) \tag{5.160}$$
65 This might sound rather shady, but remember that spinors flip in sign if you rotate a coordinate by 2π. It is difficult to conceive of a measuring device in the lab which will become "the negative of itself" if we rotate the lab by 2π!

66 A simple example of such an observable will be the Dirac current operator, for which O^m_{ab} = q(γ^m)_{ab}.
ab
222
Chapter 5. Real Life II: Fermions and QED
On using the results in Eq. (5.153) and Eq. (5.155), we can reduce this to the form

$$\left[\bar\psi_a(x)\psi_b(x),\ \bar\psi_c(y)\psi_d(y)\right] = iS_{bc}(x-y)\,\bar\psi_a(x)\psi_d(y) - iS_{da}(y-x)\,\bar\psi_c(y)\psi_b(x) \tag{5.161}$$

We now see that the vanishing of the anticommutator S(x − y) at spacelike separations (which we obtained earlier in Eq. (5.156)) allows us to conclude that the commutator of the observables vanishes at spacelike separations:

$$[O(x),\ O(y)] = 0\qquad \text{for}\ (x-y)^2 < 0 \tag{5.162}$$
67 Note that causality arguments only prove that particles with integer spin should be quantized with commutation rules; they cannot be used to prove that particles with half-integer spin must be quantized using anticommutation rules. This is because observables for spinors are bilinear in the spinors and essentially behave like integer spin particles.

68 Recall the curious property of functions of Pauli matrices described by Eq. (5.9). You may now see a clear similarity between Grassmannians and structures involving Pauli or Dirac matrices.

Very gratifying, when you think about it. This is the conventional expression of causality.^{67}
5.6.3 Propagator from a Fermionic Path Integral
In the study of the bosonic fields (like scalar or vector fields), we could also interpret the propagator using an external source. For example, if the Lagrangian for a free scalar field is taken to be L = −(1/2)φDφ, then we could add a source term of the form Jφ to the Lagrangian and perform the path integral over φ to define the vacuum persistence amplitude Z[J] = exp[−(1/2)JD⁻¹J]. A double functional differentiation of Z[J] allows us to identify the propagator as D⁻¹.

In principle, we can do the same with the Dirac Lagrangian by adding suitable sources and performing the path integral. But if we treat ψ (and ψ̄) as just c-number complex functions, we will get a result which is incorrect. This again has to do with the fact that the Dirac field anticommutes with itself, making ψ² = 0! While manipulating the Dirac field as a c-number function, one has to take into account this fundamental non-commutative nature of the fermionic fields. This can be done by introducing c-number quantities, called Grassmannians, which anticommute pairwise. We will now introduce the curious mathematics of Grassmannians and define the path integral for fermions using them.
In a way, Grassmannians are simpler than ordinary numbers. The fundamental property of these objects is that, for any two Grassmannians η and ξ, we have the relation ηξ = −ξη, which immediately makes η² = 0 for any Grassmannian. As a result of this, the Taylor series expansion of any function f(η) of a Grassmannian truncates at the linear term,^{68} giving f(η) = a + bη. As for integration over the Grassmannians, we will insist that shifting the dummy integration variable should not change the result. That is, we must have

$$\int d\eta\ f(\eta + \xi) = \int d\eta\ f(\eta) \tag{5.163}$$

Using the fact that the functions on both sides are linear, we get the condition that the integral over η of (bξ) should vanish for all ξ. Since b is just an ordinary number, this leads to the first rule of Grassmannian integration

$$\int d\eta = 0 \tag{5.164}$$
Further, given three Grassmannians χ, η and ξ, we have χ(ηξ) = (ηξ)χ; i.e., the product (ηξ) commutes with the Grassmannian χ. Based on this motivation, we will declare that the product of any two anticommuting numbers should be an ordinary number and, in particular, that the integral of η over η should be a pure number which, by convention, is taken to be unity. This leads to the second rule of Grassmannian integration

$$\int d\eta\ \eta = 1 \tag{5.165}$$
These two rules are sufficient to develop a consistent integral calculus involving the Grassmannians. There are, however, some surprises in these integrals which you need to be aware of. For example, if f(η) is a Grassmannian of the form a + bη (so that b is an ordinary number), then the integral of f gives the result^{69}

$$\int d\eta\ f(\eta) = \int d\eta\ (a + b\eta) = b \tag{5.166}$$

But if f(η) is an ordinary number (so that b is a Grassmannian), the same integral gives you the result with a minus sign:

$$\int d\eta\ f(\eta) = \int d\eta\ (a + b\eta) = -b \tag{5.167}$$
We can use these rules to compute Gaussian integrals over Grassmannians, which is what we need in order to compute fermionic path integrals. If we have a set of N Grassmannians denoted by η = (η₁, η₂, …, η_N), and similarly for another set η̄, then the Gaussian integral is given by

$$\int d\eta\, d\bar\eta\; e^{\bar\eta A\eta} = \det A \tag{5.168}$$
where A = A_{ij} is an N × N matrix of ordinary numbers. In the case of the Dirac Lagrangian, we can think of ψ̄ and ψ as independent Grassmannian valued quantities and evaluate the generating functional in the form

$$Z = \int\mathcal{D}\psi\,\mathcal{D}\bar\psi\; e^{\,i\int d^4x\,\bar\psi(i\gamma\partial - m + i\epsilon)\psi} = \det\left(i\gamma\partial - m + i\epsilon\right) = e^{\,\mathrm{tr}\log(i\gamma\partial - m + i\epsilon)} \tag{5.169}$$
where we have omitted the overall multiplicative constant. This trace can be simplified using a trick (and the cyclic property of the trace) along the following lines:

$$\mathrm{tr}\log(i\gamma\partial - m) = \mathrm{tr}\log\left[\gamma^5(i\gamma\partial - m)\gamma^5\right] = \mathrm{tr}\log(-i\gamma\partial - m) = \frac{1}{2}\left[\mathrm{tr}\log(i\gamma\partial - m) + \mathrm{tr}\log(-i\gamma\partial - m)\right] = \frac{1}{2}\,\mathrm{tr}\log(\partial^2 + m^2) \tag{5.170}$$

leading to the result

$$Z = \exp\left[\frac{1}{2}\,\mathrm{tr}\log(\partial^2 + m^2 - i\epsilon)\right] \tag{5.171}$$
In contrast, when we do the corresponding computation for a bosonic field, we end up with the result

$$Z = \int\mathcal{D}\phi\; e^{\,i\int d^4x\,\frac{1}{2}\left[(\partial\phi)^2 - (m^2 - i\epsilon)\phi^2\right]} = \left(\frac{1}{\det\left[\partial^2 + m^2 - i\epsilon\right]}\right)^{1/2} = e^{-\frac{1}{2}\,\mathrm{tr}\log(\partial^2 + m^2 - i\epsilon)} \tag{5.172}$$

69 So the integral of f(η) over η is the same as the derivative of f(η) with respect to η!
70 You will recall that ln Z is closely related to the zero-point energy of the field (see Eq. (2.71)). The flip of sign suggests that the zero-point energy for the fermionic field will have a sign opposite to that of the bosonic fields; this is indeed what we found in Sect. 5.6.1; see Eq. (5.138).
So, essentially, there is a flip of sign in the generating function when we
move from bosons to fermions.70
As for the propagator, we can now proceed very much along the lines we followed for the bosonic field. We first introduce two Grassmannian spinor sources η and η̄ corresponding to ψ and ψ̄. In the resulting generating functional, defined as

$$Z(\eta,\bar\eta) = \int\mathcal{D}\psi\,\mathcal{D}\bar\psi\,\exp\left[i\int d^4x\left(\bar\psi(i\gamma\partial - m)\psi + \bar\eta\psi + \bar\psi\eta\right)\right] \tag{5.173}$$

we complete the square in the usual manner:

$$\bar\psi K\psi + \bar\eta\psi + \bar\psi\eta = (\bar\psi + \bar\eta K^{-1})\,K\,(\psi + K^{-1}\eta) - \bar\eta K^{-1}\eta \tag{5.174}$$

and perform the integration by shifting the integration measure. This leads to the (expected) result

$$Z(\eta,\bar\eta) = \exp\left[-i\int d^4x\;\bar\eta\,(i\gamma\partial - m)^{-1}\,\eta\right] \tag{5.175}$$

allowing us to identify the propagator as the inverse of the operator (iγ∂ − m). This agrees with the result we obtained earlier.
m). This agrees with the result we obtained earlier.
Extending the path integral formalism to fermionic degrees of freedom
brings within our grasp the technical prowess of the path integral to deal
with issues in QED. We will describe two such applications.
5.6.4 Ward Identities

71 We use S rather than A for the action for typographical clarity.
As a simple but important application of this, we will obtain a set of identities (usually called the Ward-Takahashi identities) which play a key role
in QED. The importance of these identities lies in the fact that they relate
to features of the exact, interacting, field theory.71
To derive these in a simple context, let us consider the effective action corresponding to the original action S(ψ̄, ψ, A_i) which depends on (ψ̄, ψ, A_i). As we described in Sect. 4.5, one can obtain the effective action as a Legendre transform by introducing a source term for the fields. In the example discussed in Sect. 4.5, we used just one scalar field φ and a source J, with φ_c = ⟨J|φ|J⟩ denoting the expectation value in the state with source J. Now, since we have three fields (ψ̄, ψ, A_i), we need to introduce three sources η̄, η, J_i and define the effective action through a Legendre transform with respect to each of these sources. We will also have three background expectation values ψ̄_c, ψ_c, A_{ci} which will act as the arguments for the effective action. In this case, the generating functional will be
Z[\bar\psi_c, \psi_c, A_{ci}] \equiv \exp\left(iS_{\rm eff}[\bar\psi_c, \psi_c, A_{ci}]\right)    (5.176)
where S_eff(ψ̄_c, ψ_c, A_{ci}) is the effective action obtained by the Legendre transform with respect to the sources η̄, η, J_i. Let us suppose that we compute this effective action, say, order by order in a perturbation theory. To the lowest order, this would be given by the classical action which can be expressed in the form

S^0_{\rm eff}[\bar\psi_c, \psi_c, A_{ci}] = \int d^4x\, d^4y\, \left[ \frac{1}{2} A^i_c(x)\, D_{(0)ij}(x-y)\, A^j_c(y) + \bar\psi_c(x)\, S_{(0)}(x-y)\, \psi_c(y) \right] - e\int d^4x\; \bar\psi_c(x) A^j_c(x)\gamma_j \psi_c(x)    (5.177)
5.6. Quantizing the Dirac Field
where the subscript '0' indicates the free-field expressions for the propagators, including the effect of any gauge fixing terms. When we compute the effective action, including higher order effects, we expect it to have a form like
S_{\rm eff}[\bar\psi_c, \psi_c, A^m_c] = \int d^4x\, d^4y\, \left[ \frac{1}{2} A^m_c(x)\, D_{mn}(x-y)\, A^n_c(y) + \bar\psi_c(x)\, S(x-y)\, \psi_c(y) \right] - e\int d^4x\, d^4y\, d^4z\; \bar\psi_c(x)\, A^m_c(y)\, \Lambda_m(x, y, z)\, \psi_c(z) + \cdots    (5.178)
In this expression, D_{mn} and S would correspond to the photon and electron propagators incorporating the effect of interactions, computed possibly up to a certain order in perturbation theory. Similarly, Λ_m(x, y, z) incorporates the effect of interactions at the electron-photon vertex to the same order of approximation. (We will see an explicit computation of these quantities later in Sect. 5.7.)
We now use the fact that the properly defined interacting theory of QED
must be gauge invariant. This, in turn, implies that the effective action
itself must be invariant under the local gauge transformations, except for
any gauge fixing terms we might have added. Let us assume for a moment
that there are no gauge fixing terms in the action (which, of course, is not
true and we will get back to it soon). Then the effective action must be
invariant under the local gauge transformations given by
\delta\psi = i\theta(z)\psi, \qquad \delta\bar\psi = -i\theta(z)\bar\psi, \qquad \delta A^m = \frac{1}{e}\partial^m\theta(z)    (5.179)
This result can be expressed as

0 = \delta S_{\rm eff} = \int d^4x \left[ \delta\bar\psi_c\, \frac{\delta S_{\rm eff}}{\delta\bar\psi_c} + \frac{\delta S_{\rm eff}}{\delta\psi_c}\,\delta\psi_c + \frac{\delta S_{\rm eff}}{\delta A^m_c}\,\delta A^m_c \right]
= \int d^4z \left[ -i\bar\psi_c\, \frac{\delta S_{\rm eff}}{\delta\bar\psi_c} + i\psi_c\, \frac{\delta S_{\rm eff}}{\delta\psi_c} - \frac{1}{e}\partial^m \frac{\delta S_{\rm eff}}{\delta A^m_c} \right]\theta(z)    (5.180)
where the last term has been rewritten by an integration by parts. Since θ
is arbitrary, the expression within the brackets must vanish. The resulting
relations are called the Ward-Takahashi identities.72 We see that the term
involving Am will vanish only if the exact photon propagator satisfies the
constraint
\partial^m D_{mn}(x-y) = 0; \qquad k^m D_{mn}(k) = 0    (5.181)
where the second relation holds in the Fourier space.
Let us now get back to the issue of gauge fixing terms in the action. A gauge fixing term of the form

-\frac{1}{2\lambda}\int d^4x\, (\partial_m A^m)(\partial_n A^n) = -\frac{1}{2\lambda}\int d^4x\, d^4y\, (\partial_m A^m)_x\, \delta(x-y)\, (\partial_n A^n)_y = -\frac{1}{2\lambda}\int d^4x\, d^4y\; A^m(x)\left[\partial_m\partial_n\delta(x-y)\right]A^n(y)    (5.182)
will contribute a term proportional to km kn to Dmn (k) in the Fourier space.
When we repeat the calculation in Eq. (5.180) with a gauge fixing term
72 In our case, we could have also obtained the same result more directly by performing a gauge transformation on Eq. (5.178).
added, δSeff will not be zero but will be equal to the contribution from
the gauge fixing term. This will change the result in Eq. (5.181) to the
condition:
k^m D_{mn}(k) = -\frac{i\lambda k_n}{k^2}    (5.183)
This has an interesting implication. We know that, at the tree-level, the
free-field photon propagator satisfies Eq. (5.183) (see Eq. (3.205)). So if we
write the exact propagator, separating out the tree-level in the form
D_{mn}(k) = -\frac{i}{k^2}\left[\eta_{mn} - (1-\lambda)\frac{k_m k_n}{k^2}\right] + \bar{D}_{mn}(k)    (5.184)
then Eq. (5.183) implies that

k^m \bar{D}_{mn}(k) = 0    (5.185)
In other words, the gauge parameter λ does not pick up any corrections due
to interactions — which makes sense since we do not expect the interactions
to take cognizance of an artificial gauge fixing term which we have added for
our convenience. This result reinforces the fact that no physical phenomena
in the theory can depend on the gauge parameter λ.
The constraint in Eq. (5.185), along with Lorentz invariance, tells you that the corrections \bar{D}_{mn}(k) to the photon propagator must be expressible in the form (k^2 g^{mn} - k^m k^n)\Pi(k^2), where \Pi(k^2) is a function to be determined. We will see later that explicit computation does lead to this result.
As regards the terms involving ψ̄_c ψ_c in Eq. (5.180), we obtain the condition:

-iS(x-y)\delta(z-x) + iS(x-y)\delta(z-y) - \frac{\partial}{\partial z^m}\Lambda^m(x, z, y) = 0    (5.186)
Fourier transforming this result, we obtain yet another identity in momentum space relating the exact electron propagator and the exact vertex function. Taking the limit p' → p, this result can be expressed in the momentum space in an equivalent form as

\Lambda^m(p, 0, p) = \frac{\partial S_F}{\partial p_m}    (5.187)
We will have occasion to explicitly verify this relation later on.
5.6.5 Schwinger Effect for the Fermions
As a second application of the path integral over fermionic fields, we will
compute the effective action for electromagnetism obtained by integrating
out the electron field. This is in exact analogy with the corresponding
result for the complex scalar field, obtained in Sect. 4.2. The idea again is
to perform the integration over ψ and ψ¯ in the full path integral:
Z = \int DA_j\, D\psi\, D\bar\psi\, \exp(iS)    (5.188)

where

S = \int d^4x \left[ -\frac{1}{4}F^2_{mn} + \bar\psi(i\gamma D - m)\psi \right]    (5.189)
and we have not indicated the gauge fixing terms for simplicity. This will
allow us to define the effective Lagrangian for the electromagnetic field as:
\int D\psi\, D\bar\psi\, \exp(iS) \equiv \exp\left(iS_{\rm eff}[A_j]\right) = \exp\left( i\int d^4x\, L_{\rm eff}[A_j] \right)    (5.190)
This path integral can be calculated using a trick similar to the one which led to Eq. (5.171). We note that, from the result73

{\rm Det}(i\gamma^a D_a - m) = \exp\left[{\rm Tr}\log(i\gamma^a D_a - m)\right]
= \exp\left[ \int d^4x \int \frac{d^4P}{(2\pi)^4}\, {\rm Tr}\log(\gamma^a P_a - m) \right]
= \exp\left[ \int d^4x \int \frac{d^4P}{(2\pi)^4}\, {\rm Tr}\log(-\gamma^a P_a - m) \right]
= {\rm Det}(-i\gamma^a D_a - m)    (5.191)

73 This is essentially a generalization of Eq. (5.170), proved here in a fancier way for fun!
we have Tr⟨x|ln(iγD − m)|x⟩ = Tr⟨x|ln(−iγD − m)|x⟩. Therefore, either expression is equal to half the sum. So:

{\rm Tr}\,\langle x|\ln(i\gamma D - m)|x\rangle = \frac{1}{2}\,{\rm Tr}\,\langle x|\ln(-(\gamma D)^2 - m^2)|x\rangle    (5.192)
Therefore, the effective Lagrangian for the electromagnetic field is given by:

L_{\rm eff} = -\frac{1}{4}F^2_{mn} - \frac{i}{2}\,{\rm Tr}\,\langle x|\ln(-(\gamma D)^2 - m^2)|x\rangle    (5.193)
Expressing the logarithm using the integral representation, this becomes

L_{\rm eff} = -\frac{1}{4}F^2_{mn} + \frac{i}{2}\int_0^\infty \frac{ds}{s}\, e^{-ism^2}\, {\rm Tr}\,\langle x|e^{-is(\gamma D)^2}|x\rangle    (5.194)

Further, on using the result,

(\gamma D)^2 = D_m D^m + \frac{e}{2}F_{mn}\sigma^{mn}    (5.195)

we can write

L_{\rm eff} = -\frac{1}{4}F^2_{mn} + \frac{i}{2}\int_0^\infty \frac{ds}{s}\, e^{-ism^2}\, {\rm Tr}\,\langle x|e^{\,i[(\hat p - eA(\hat x))^2 - \frac{e}{2}F_{mn}\sigma^{mn}]s}|x\rangle    (5.196)

The rest of the calculation for computing the expectation value in Eq. (5.196) proceeds exactly as in the case of the complex scalar field, by reducing the problem to that of a harmonic oscillator. The final result can be expressed in a covariant form as:

L_{\rm eff} = -\frac{1}{4}F^2_{mn} - \frac{e^2}{8\pi^2}\int_0^\infty \frac{ds}{s}\, e^{-m^2 s}\, (a\cot eas)(b\coth ebs)    (5.197)

Exercise 5.14: Obtain Eq. (5.197).
where a and b are defined — as before, see Eq. (4.20) — by:

a^2 - b^2 = E^2 - B^2; \qquad ab = {\bf E}\cdot{\bf B}    (5.198)
Here, and in what follows, ea, eb etc. stand for |ea|, |eb| etc. which are
positive definite.
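A quick numerical check of the proper-time integrand in Eq. (5.197) against its weak-field expansion (quoted below in Eq. (5.199)) is easy to set up. This is a sketch; the values of a, b, e and s below are arbitrary test numbers:

```python
import math

# Check: (a cot(e a s)) * (b coth(e b s))
#        ≈ 1/(e s)^2 - (a^2 - b^2)/3 - (e s)^2 * ((a^2 - b^2)^2 + 7*(a*b)^2)/45
# for small s; cf. Eq. (5.199).
a, b, e, s = 0.7, 0.4, 0.3, 1e-3

exact = (a / math.tan(e * a * s)) * (b / math.tanh(e * b * s))
approx = (1.0 / (e * s) ** 2
          - (a * a - b * b) / 3.0
          - (e * s) ** 2 * ((a * a - b * b) ** 2 + 7.0 * (a * b) ** 2) / 45.0)
print(abs(exact / approx - 1.0))   # relative error is O((e s)^4): tiny for small s
```

The 1/(es)² and (a²−b²) pieces of this expansion are exactly the two divergent terms isolated in the discussion that follows.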
As in the case of the scalar field, the expression for Leff is divergent
but these divergences are identical in structure to the ones obtained in the
case of the scalar field. This can be seen by a perturbative expansion of the integrand in the coupling constant e, which will give

(a\cot eas)(b\coth ebs) = \frac{1}{e^2 s^2} - \frac{1}{3}(a^2 - b^2) - \frac{e^2 s^2}{45}\left[(a^2 - b^2)^2 + 7(ab)^2\right]    (5.199)

The first two terms lead to the divergences; the first one is an infinite constant independent of F_{ab} and can be ignored, while the second one — being proportional to (a^2 - b^2) = (E^2 - B^2) — merely renormalizes the charge.74 Subtracting out these two terms, we will get the finite part of the effective Lagrangian to be

L_{\rm eff} = -\frac{1}{4}F^2_{mn} - \frac{e^2}{32\pi^2}\int_0^\infty \frac{ds}{s}\, e^{-sm^2} \left[ 4(a\cot eas)(b\coth ebs) - \frac{4}{e^2 s^2} - \frac{2}{3}F^2_{mn} \right]    (5.200)

74 The magic of renormalizability, as usual. It is only because this divergent term has the same structure as the original term that we could proceed further.
which is to be compared with Eq. (4.45). The last term in the square bracket combines with -(1/4)F^2_{mn} to eventually lead to a running coupling constant exactly as in Eqns. (4.44), (4.47) and (4.55), but with Eq. (4.51) replaced by
Z = \frac{e^2}{12\pi^2}\int_0^\infty \frac{ds}{s}\, e^{-m^2 s}    (5.201)
This, in turn, will lead to a slightly different β-function — compared to the scalar field case — given by

\beta(e) = \mu\frac{\partial e}{\partial\mu} = \frac{e^3}{12\pi^2}    (5.202)
We will later obtain the same result (see Eq. (5.274)) from perturbation
theory.
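The β-function of Eq. (5.202) integrates in closed form: writing t = ln μ, the equation μ ∂e/∂μ = e³/12π² is equivalent to d(e⁻²)/dt = −1/6π², so e⁻²(t) = e⁻²(0) − t/6π². A short numerical integration provides a check; this is a sketch, and the starting value e0 and range of t below are arbitrary choices, not physical values:

```python
import math

# Integrate de/dt = e^3/(12 π^2), t = ln μ, with RK4 and compare against
# the closed form e(t) = [e0^-2 - t/(6 π^2)]^(-1/2).

def beta(e):
    return e ** 3 / (12.0 * math.pi ** 2)

def rk4(e0, t_end, n=10000):
    e, h = e0, t_end / n
    for _ in range(n):
        k1 = beta(e)
        k2 = beta(e + 0.5 * h * k1)
        k3 = beta(e + 0.5 * h * k2)
        k4 = beta(e + h * k3)
        e += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6.0
    return e

e0, t = 0.3, 5.0                       # evolve over five e-folds of scale
exact = 1.0 / math.sqrt(1.0 / e0 ** 2 - t / (6.0 * math.pi ** 2))
print(abs(rk4(e0, t) - exact))          # RK4 reproduces the closed form to high accuracy
```

Note the sign: the coupling grows logarithmically with the scale μ, the opposite of asymptotic freedom.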
As in the case of the charged scalar field, the imaginary part of the effective Lagrangian will give you the pair production rate per unit volume per second through the relation

|e^{iA}|^2 = e^{iA}e^{-iA^*} = e^{i(A - A^*)} = e^{-2\,{\rm Im}[A]} = e^{-2VT\,{\rm Im}\,L_{\rm eff}}    (5.203)
Calculating the imaginary part exactly as in the case of scalar field, we
obtain the result
∞
∞
αE 2 1
−nπm2
1 1 −m2 sn
e
=
exp
(5.204)
2Im (Leff ) =
4π n=1 s2n
2π 2 n=1 n2
eE
Except for numerical factors, this matches with the result in the case of
scalar field and is again non-perturbative in the coupling constant e.
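The structure of the series in Eq. (5.204) is easy to explore numerically. Writing x = eE/m², the dimensionless sum S(x) below shows both features just mentioned: every term vanishes faster than any power of e as e → 0 (which is why the result is non-perturbative), and the n = 1 term dominates unless the field is strong. This is a sketch; the sampled x values are arbitrary:

```python
import math

# S(x) = Σ_{n≥1} n^{-2} exp(-n π / x), with x = eE/m²; the rate is ∝ E² S(x).

def S(x, nmax=200):
    return sum(math.exp(-n * math.pi / x) / n ** 2 for n in range(1, nmax + 1))

for x in (0.2, 1.0, 5.0):
    total, first = S(x), math.exp(-math.pi / x)
    print(x, total, first / total)   # n = 1 dominates unless eE >> m²
```

For laboratory-strength fields x is minuscule, and exp(−π/x) makes the predicted rate fantastically small — consistent with pair creation from the vacuum never having been seen with static fields.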
There is a somewhat more physical way of thinking about the effective
Lagrangian, which is worth mentioning in this context. To do this, let us
again begin with the standard QED Lagrangian
L = -\frac{1}{4}F^2_{mn} + \bar\psi(i\gamma\partial - m)\psi - eA_m\bar\psi\gamma^m\psi    (5.205)
which describes the interaction between ψ and Am . We are interested in
studying the effect of fermions on the electromagnetic field in terms of an
effective Lagrangian for the electromagnetic field. From general principles,
we would expect such an effective Lagrangian to have the form
L_{\rm eff} = -\frac{1}{4}F^2_{mn} - eA_m J^m_A    (5.206)
where J^m_A is a suitable current. A natural choice for this current is the expectation value of the Dirac current in a quantum state |A⟩ containing an electromagnetic field described by the vector potential A_m. That is, we think of the effective Lagrangian as given by Eq. (5.206) with

J^m_A \equiv \langle A|\bar\psi(x)\gamma^m\psi(x)|A\rangle    (5.207)
Computing this current is best done by using the following trick. We first note that, in the absence of any electromagnetic field, the vacuum expectation value in Eq. (5.207) can be written in the form

J^m_0(x) = \langle 0|\bar\psi_b(x)\gamma^m_{ba}\psi_a(x)|0\rangle = -{\rm Tr}\,\langle 0|\psi_a(x)\bar\psi_b(x)\gamma^m_{ba}|0\rangle \equiv -{\rm Tr}\,\langle x|\hat G\gamma^m|x\rangle    (5.208)
where we have introduced a Green function operator Ĝ such that the spinor Green function G(x, y) can be expressed as the matrix element ⟨x|Ĝ|y⟩ with

\hat G = \frac{i}{\gamma p - m + i\epsilon}    (5.209)
In the presence of an external electromagnetic field, Eq. (5.208) will get replaced by

J^m_A(x) = -{\rm Tr}\,\langle x|\hat G_A\gamma^m|x\rangle    (5.210)
where Ĝ_A is the Green function operator in the presence of an external electromagnetic field. This is given by the usual replacement:

\hat G_A = \frac{i}{\gamma\hat p - e\gamma A(\hat x) - m + i\epsilon}    (5.211)
Multiplying the numerator and denominator by γ(p − eA) + m and using the standard result

(\gamma\hat p - e\gamma A(\hat x))^2 = (\hat p - eA(\hat x))^2 - \frac{e}{2}F_{mn}(\hat x)\sigma^{mn}    (5.212)
we get:

\hat G_A = \frac{i(\gamma p - e\gamma A(\hat x) + m)}{(\hat p - eA(\hat x))^2 - \frac{e}{2}F_{mn}(\hat x)\sigma^{mn} - m^2 + i\epsilon}    (5.213)
This means that the Dirac propagator, in the presence of an electromagnetic field, has the integral representation given by

G_A(x, y) = \langle y|\frac{i}{\gamma p - e\gamma A - m + i\epsilon}|x\rangle = \int_0^\infty ds\; e^{-is(m^2 - i\epsilon)}\, \langle y|(\gamma p - e\gamma A(\hat x) + m)\, e^{-i\hat H s}|x\rangle    (5.214)
where

\hat H = -(\hat p_m - eA_m(\hat x))^2 + \frac{e}{2}F_{mn}(\hat x)\sigma^{mn}    (5.215)
As an aside, and just for comparison, let us consider the corresponding expressions for the complex scalar field. In this case, the relevant
Lagrangian is
L = -\frac{1}{4}F^2_{mn} - \phi^*(D^2 + m^2)\phi    (5.216)
with Dm = ∂m + ieAm . We will now define a propagator in the presence
of an external electromagnetic field as the expectation value of the time
ordered field operators in the state |A containing an electromagnetic field
Am :
G_A(x, y) = \langle A|T\{\phi(y)\phi^*(x)\}|A\rangle    (5.217)
In the standard operator notation with ∂_j → −ip_j, this expression can be thought of as the matrix element of the Green's function operator defined by:

\hat G_A = \frac{i}{(\hat p - eA(\hat x))^2 - m^2 + i\epsilon}    (5.218)
Using the standard integral representation we can then write the Green's function in the form

G_A(x, y) = \langle y|\hat G_A|x\rangle = \langle y|\frac{i}{(\hat p - eA(\hat x))^2 - m^2 + i\epsilon}|x\rangle = \int_0^\infty ds\; e^{-is(m^2 - i\epsilon)}\, \langle y|e^{-i\hat H s}|x\rangle    (5.219)
where the effective Hamiltonian is

\hat H = -(\hat p - eA(\hat x))^2    (5.220)
Comparing Eq. (5.215) with Eq. (5.220), we find that the only extra term
is due to the spin of the particle, leading to (e/2)Fab σ ab .
We are now in a position to evaluate the current in Eq. (5.210). Using the integral representation for Ĝ_A, we get

J^m_A = -{\rm Tr}\int_0^\infty ds\; e^{-is(m^2 - i\epsilon)}\, \langle x|\gamma^m(\gamma p - e\gamma A + m)\, e^{-i\hat H s}|x\rangle
= -\int_0^\infty ds\; e^{-is(m^2 - i\epsilon)}\, \langle x|{\rm Tr}\left[ \gamma^m(\gamma p - e\gamma A)\, e^{\,i((p - eA)^2 - \frac{e}{2}\sigma_{mn}F^{mn})s} \right]|x\rangle    (5.221)
where we have used the fact that the trace of the product of an odd number of gamma matrices is zero. But it is obvious that this expression can be equivalently written in the form

J^m_A = -\frac{i}{2e}\frac{\partial}{\partial A_m}\int_0^\infty \frac{ds}{s}\, e^{-is(m^2 - i\epsilon)}\, {\rm Tr}\,\langle x|e^{-i\hat H s}|x\rangle    (5.222)
where Ĥ is given by Eq. (5.215). Integrating both sides with respect to A_m and substituting into Eq. (5.206), we find that the effective Lagrangian is given by

L_{\rm eff} = -\frac{1}{4}F^2_{mn} + \frac{i}{2}\int_0^\infty \frac{ds}{s}\, e^{-is(m^2 - i\epsilon)}\, {\rm Tr}\,\langle x|e^{-i\hat H s}|x\rangle    (5.223)
This result agrees with the expression we got earlier in Eq. (5.196) obtained
by integrating out the spinor fields in the path integral. Further, it shows
that the effective Lagrangian can indeed be interpreted as due to interaction
with a spinor current in a state containing an external electromagnetic field,
as indicated in Eq. (5.206). (The same idea also works for a complex scalar
field. In this case, the current is more difficult to evaluate because of the
|φ|2 Am Am term in the Lagrangian, but it can be done.)
Incidentally, the corresponding expression for the complex scalar field is given by

L_{\rm eff} = -\frac{1}{4}F^2_{mn} - i\int_0^\infty \frac{ds}{s}\, e^{-is(m^2 - i\epsilon)}\, \langle x|e^{-i\hat H s}|x\rangle    (5.224)
Figure 5.2: Electron propagator, = i(\gamma p + m)/(p^2 - m^2 + i\epsilon)
Figure 5.3: Photon propagator
where \hat H = -(\hat p - eA(\hat x))^2. This differs from the expression in Eq. (5.223) as
regards the sign of the second term, the factor half and the fermionic trace.
Of these, the most crucial difference is the sign which is responsible for the
difference in the sign of the zero point energy of the fermionic field vis-a-vis
the bosonic field. This is most easily seen by evaluating the expressions in
Eq. (5.223) and Eq. (5.224) in the absence of the electromagnetic field by
setting Aj = 0. Such a calculation was done for the scalar field in Sect. 1.4.5
where we showed that it leads to the zero point energy of the harmonic
oscillators making up the field. If you repeat the corresponding calculations
for the fermionic fields, you will find that the fermionic trace, along with
the factor −(1/2), leads to an overall factor of 4(−1/2) = −2 compared
to the result for the complex scalar field. The factor 2 is consistent with
the Dirac spinor having twice as many degrees of freedom as a complex
scalar field; the extra minus sign is precisely the sign difference between
fermionic and bosonic zero point energies, which we encountered earlier in
Eq. (5.138).
Figure 5.4: Interaction vertex, = ie\gamma^m
5.6.6 Feynman Rules for QED
Having discussed some nonperturbative aspects of QED, we shall now take
up the description of QED as a perturbation theory in the coupling constant
e. The first step in that direction will be to derive the necessary Feynman
diagrams in the momentum space for QED. This can be done exactly as
in the case of λφ4 theory by constructing the generating function and its
Taylor expansion. The essential idea again is to expand the generating
functional of an interacting theory as a power series in a suitable coupling
constant and re-interpret each term by a diagram. We will not bother
to derive these explicitly since — in this particular case — the rules are
intuitively quite obvious; we merely state them here for future reference
and for the sake of completeness:
(a) With each internal fermion line, we associate the propagator in the
momentum space iS(p). (See Fig. 5.2.)
(b) With each internal photon line, we associate a photon propagator
Dmn (p). Sometimes it is convenient to use a propagator with an infinitesimal mass for the photon or a gauge fixing term. Most of the time, we will
use the Feynman gauge. (See Fig. 5.3.)
There is only one kind of vertex in QED which connects two fermion
lines with a photon line. We conserve momentum at every vertex while
drawing the diagram and labeling it.
(c) Each vertex where the two fermion and one photon lines meet, is
associated with the factor ieγ m . (See Fig. 5.4.)
(Margin label for Fig. 5.3: photon propagator = -i\eta^{mn}/(p^2 + i\epsilon).)
Figure 5.5: Incoming electron/positron, labels u^s(p) and \bar v^s(p)
Figure 5.6: Outgoing electron/positron, labels \bar u^s(p) and v^s(p)
Figure 5.7: Incoming/outgoing photon, labels \epsilon_m(p) and \epsilon^*_m(p)
The external lines can represent electrons, positrons or photons. The rules for these are as follows.
(d) For an incoming fermion line, associate u(p, s) and for an outgoing fermion line, associate \bar u(p', s'); for antifermions, use v and \bar v instead of u and \bar u. (See Fig. 5.5 and Fig. 5.6.)
(e) With an incoming photon line, associate a polarization factor \epsilon_m(p); with the outgoing photon line associate a polarization vector \epsilon^*_m(p). (See Fig. 5.7.)
These essentially summarize the translation table between the diagrams and algebraic factors. Momenta associated with the internal lines should be integrated over, with the usual measure d^4p/(2\pi)^4. When several different diagrams contribute to a given process, we should introduce a (−1) factor if we interchange external, identical fermions to obtain a new Feynman diagram. This rule can be understood by looking at the structure of, say, a 4-point function

G(x_1, x_2, x_3, x_4) = \langle 0|T\{\psi(x_1)\bar\psi(x_3)\psi(x_2)\bar\psi(x_4)\}|0\rangle    (5.225)
It is clear that G(x1 , x2 , x3 , x4 ) = −G(x1 , x2 , x4 , x3 ) due to anticommutation rules. For a similar reason, we also need to introduce an overall factor
of −1 for any closed fermion loop. We will use these rules extensively in
what follows.
5.7 One Loop Structure of QED

75 As we have stressed several times before, the process of renormalization, a priori, has very little to do with the existence or removal of divergences in perturbative field theory. The real reason for renormalization has to do with the fact that the parameters which occurred in the original Lagrangian have no direct physical meaning and the physically relevant parameters have to be defined operationally in terms of physical processes. These conceptual issues have been already highlighted in the case of λφ⁴ theory and hence we will not go over them again.
As in the case of λφ⁴ theory, the really non-trivial aspects of QED arise only when we go beyond the lowest order perturbation theory (viz., tree-level). This is because, when we consider a diagram involving integration over internal momenta (which could arise, e.g., in diagrams with one or more internal loops), we have to integrate over the momenta corresponding to these loops. The propagators corresponding to the internal loops will typically contribute a (1/p²) or (1/p⁴) factor to the integral at large momenta, and the integration measure will go as p³dp. This could lead the contributions from the diagrams which require integration over the internal momenta to diverge at large p. As in the λφ⁴ theory, it is necessary to understand and reinterpret the perturbation theory when this occurs.75
In the case of QED, the bare Lagrangian contains the parameters representing the bare mass of the electron (me ), the bare mass of the photon
(mγ ≡ 0) and the coupling constant which is the charge of the electron
(e). Taking a cue from the λφ4 theory, we would expect renormalization
to change all these parameters to the physical values which are observed in
the laboratory. We would expect that the physical mass of the photon, mγ ,
should remain zero to all orders of perturbation theory. On the other hand,
we do not have any strong constraints on the electron mass and the coupling constant except that they should correspond to the values observed
in the lab.
As a simple example of how this works, consider the electronic charge. The charge of the electron can be operationally defined using the fact that the potential energy of electrostatic interaction between two electrons scales as e²/r at sufficiently large distances; that is, at distances r ≫ (ℏ/mc) where m is the mass of the lightest charged particle in the theory. In Fourier space, this is equivalent to the statement that the propagator scales as V(k) = e²/k² for sufficiently small k, i.e., for k ≪ m. (We have ignored the tensor indices η_{ab} of the propagator for simplicity.) This shows that the operational definition of electronic charge in our theory is determined by the limit:

e^2_R = \lim_{k^2\to 0} k^2 V(k)    (5.226)
If the interactions involving the lightest charged fermion in the theory (which we will take to be the electron) modify V(k) to the form (e²/k²)[1 − e²Π(k²)]⁻¹ where Π(k²) is a calculable function — and, as we shall see later, this is precisely what happens — then the physical charge measured in the laboratory (which we will call the renormalized charge) is given by

e^2_R = \lim_{k^2\to 0} k^2 V(k) = \frac{e^2}{1 - e^2\Pi(0)}    (5.227)
In other words, if the interactions modify the nature of the propagator (and hence Coulomb's law), they will also change the operationally defined value of the electronic charge in the lab. It will no longer be the parameter e² which you introduced into the Lagrangian but will be determined by the expression in Eq. (5.227); and it can be calculated76 if we can calculate Π(k).
More generally, you may decide to fix the coupling constant, not at k = 0 but at some other value, say k = k₁. That is, you define the coupling constant as e₁² ≡ k₁²V(k₁). At tree-level, since V ∝ 1/k², the e₁² = e² so defined is independent of the k₁ you have chosen. But when the propagator is modified to

V(k) = (e^2/k^2)\left[1 - e^2\Pi(k^2)\right]^{-1}    (5.228)

this is no longer true, and the numerical value of the coupling constant e₁ = e(k₁) is now a function of the energy scale which we used to define it, and is given by e₁² = e²[1 − e²Π(k₁²)]⁻¹. Equivalently, e and e₁ are related by:

e^{-2} = e^{-2}(k_1) + \Pi(k_1^2)    (5.229)
A crucial piece of information about the theory is contained in the form of this functional dependence.77 In our case, if we evaluate the coupling constants e₁ and e₂ at two scales k = k₁ and k = k₂ and use the fact that, in Eq. (5.229), the left hand side is the same at both k = k₁ and k = k₂, we get

\frac{1}{e_1^2} - \frac{1}{e_2^2} = \Pi(k_2^2) - \Pi(k_1^2)    (5.230)
which relates the coupling constants defined at two different energy scales.
We would expect such a relation, which connects directly observable quantities, to be well-defined and finite. (We will see later that this is indeed
the case.)
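To see why such a difference is expected to be finite, one can plug into Eq. (5.230) the one-loop form Π(k²) = (1/12π²) ln(Λ²/k²) with a cutoff Λ. This is a sketch anticipating the explicit computation of Sect. 5.7, with the additive constant and Λ treated as regularization artifacts; the point is that the cutoff cancels in the difference:

```python
import math

# One-loop stand-in: Pi(k²) = ln(Λ²/k²) / (12 π²).
def Pi(k2, Lam2):
    return math.log(Lam2 / k2) / (12.0 * math.pi ** 2)

k1sq, k2sq = 1.0, 100.0
for Lam2 in (1e4, 1e8, 1e16):           # wildly different cutoffs...
    diff = Pi(k2sq, Lam2) - Pi(k1sq, Lam2)
    print(Lam2, diff)                    # ...same finite answer, ln(k1²/k2²)/12π²
```

The divergent (cutoff-dependent) piece of Π is the same at every scale, so it drops out of 1/e₁² − 1/e₂²: the relation between couplings measured at two scales is an observable, well-defined statement.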
As in the case of the λφ⁴ theory, the renormalization program requires us to undertake the following steps. (a) Identify processes or Feynman diagrams which will lead to divergent contributions.78 (b) Regularize this divergence in some sensible manner. (c) See whether all the divergent terms can be made to disappear by changing the original parameters λ⁰_A of the theory to the physically observable set λ^ren_A. (d) Re-express the theory and, in particular, the non-divergent parts of the amplitude in terms of
76 Of course, since interactions modify the form of V(k²), they not only make the electronic charge observed in the laboratory, e²_R, different from the parameter e² introduced into the theory, but also modify the actual form of the electromagnetic interaction.
77 This is another way of thinking
about the “running of the coupling
constant”.
78 By and large, the procedure we have outlined deals with UV divergences of the theory arising from large values of the internal momentum during the integration. In QED (and, more generally, in theories involving massless bosons), one also encounters IR divergences which arise at arbitrarily low values of the internal momentum which is integrated over. While conceptually rather curious, the IR divergences arise due to different physical reasons compared to UV divergences and are considered relatively harmless. We will have occasion to mention IR divergences briefly later on. When we talk about divergences, they usually refer to the UV divergence unless otherwise specified.
Figure 5.8: One loop correction to photon propagator
Figure 5.9: One loop correction to electron propagator
Figure 5.10: Correction to the electromagnetic vertex
79 At this order, there are also other diagrams like the ones given in Fig. 5.11. These, however, can be thought of as correcting the propagators themselves and will not contribute to the magnetic moment of the electron.
the physical parameters λ^ren_A. This will form the basis for comparing the theory with observations.
With the above motivation in mind, let us consider the diagrams in
QED involving a single internal momentum integration, which lead to divergences. It is easy to verify that there are three such primitive diagrams.
(i) The first one is that in Fig. 5.8. This figure represents the lowest
order correction to the photon propagator line (which will be a simple wavy
line with a momentum labelled k originally.) You could think of this as
a process in which a photon converts itself into a virtual electron-positron
pair, which disappears, leading back to the photon. That is, this is a
correction to the photon propagator due to an electron loop.
(ii) Figure 5.9 is the counterpart for the electron propagator (which,
originally, would have been a straight line with momentum label p) due
to the emission and subsequent absorption of a photon of momentum k
by the electron. This provides the lowest order correction to the electron
propagator.
(iii) Figure 5.10 is probably the most complicated of the three processes
we need to consider. Here, a photon (with an internal momentum k) is
emitted and reabsorbed while the electron is interacting with an external
photon (momentum q). Unlike the other two diagrams, this involves three
vertices rather than two. The net effect of such a process is to correct the
coupling between the photon and electron in a — as we shall see — rather
peculiar manner. You will find that the new coupling involves not only the
charge but also the magnetic moment of the electron, thereby changing the
value of the gyro-magnetic ratio.79
If you use the Feynman rules to write down the expressions for these
three amplitudes, you will find that all of them are divergent. We need to
regularize all these expressions and then show that the divergent parts can
be incorporated into the parameters of the theory, modifying them to new
values just as we did in the case of the λφ4 theory. Once this is done, we will
be able to predict new and non-trivial physical effects in terms of physical
parameters of the theory. We will do this in the next few sections.80
Figure 5.11: Sample of three diagrams at the same order.

80 We need to carry out this program to all orders in perturbation theory in order to convince ourselves that the theory is well defined. This is indeed possible, but we shall concentrate on illustrating this procedure at the lowest order which essentially involves integration over a single internal momentum variable. Extending this approach to all orders in perturbation theory is technically more complicated but does not introduce any new conceptual features.

5.7.1 Photon Propagator at One Loop
We will begin by computing the one loop correction to the photon propagator, described by the Feynman diagram in Fig. 5.8, and then describe its physical consequences. We already know from the Ward-Takahashi identity [see Eq. (5.181)] that the amplitude corresponding to this diagram must have a form

i\Pi_{mn} = i[k^2\eta_{mn} - k_m k_n]\,e^2\Pi(k^2) \equiv iP_{mn}(k)\, e^2\Pi(k^2)    (5.231)

We have scaled out a factor e² from Π(k²) since we know that Fig. 5.8 will have this factor arising from the two vertices. The rest of the structure of Π_{mn} is decided by Lorentz invariance and the condition k^mΠ_{mn} = 0.
Incidentally, you can understand the condition k m Πmn = 0 and its
relation to gauge invariance in a rather simple manner, which is worth
mentioning. The interaction of an electron with an external potential is
described, at the tree-level, by the diagram in Fig. 5.12(a); at the next
order, this will get modified by the addition of the diagram in Fig. 5.12(b).
Algebraically, this requires the addition of the two amplitudes given by
M^{(1)}_{fi} = e\bar u_f\gamma^m u_i\, A_m(q); \qquad M^{(2)}_{fi} = e\bar u_f\gamma^m u_i\,\frac{-i}{q^2 + i\epsilon}\, i\Pi_{mn} A^n(q)    (5.232)
(We are using the free-field photon propagator in Eq. (3.195), obtained by
the gauge choice ζ = λ = 1 in Eq. (3.203), etc. This will be our choice
throughout.) If we now choose to describe the external electromagnetic
field in a different gauge, so that Am (q) → Am (q) + iqm χ(q), then the
gauge invariance of the second order term will require the condition
\Pi_{mn}(q)\,q^n = 0    (5.233)
Note that this holds for a virtual photon with q² ≠ 0.
Getting back to Eq. (5.231), we note that the factor P_{ab} satisfies a useful identity, which can be directly verified:

P_{ab}(k)\,\frac{\eta^{bl}}{k^2}\,P_{lm}(k) = P_{am}(k)    (5.234)
This allows us to sum up the corrections to the photon propagator coming from a geometric series of electron loops in a rather simple manner. (This is analogous to what we did to the scalar propagator in λφ⁴ theory; see Eq. (4.145).) For example, the sum of the terms corresponding to the Feynman diagrams shown in Fig. 5.13 can be represented by
D_{rs}(k) = \frac{-i\eta_{rs}}{k^2} + \frac{-i\eta_{ra}}{k^2}\, i\Pi^{ab}(k^2)\, \frac{-i\eta_{bs}}{k^2} + \frac{-i\eta_{ra}}{k^2}\, i\Pi^{ab}(k^2)\, \frac{-i\eta_{bl}}{k^2}\, i\Pi^{ln}(k^2)\, \frac{-i\eta_{ns}}{k^2} + \cdots
= \frac{-i\eta_{rs}}{k^2} - \frac{i(k^2\eta_{rs} - k_r k_s)}{k^4}\left[ e^2\Pi(k^2) + e^4\Pi^2(k^2) + \cdots \right]
= \frac{-i\eta_{rs}}{k^2} - \frac{i(k^2\eta_{rs} - k_r k_s)}{k^4}\, \frac{e^2\Pi(k^2)}{1 - e^2\Pi(k^2)}
= \frac{-i\eta_{rs}}{k^2}\, \frac{1}{1 - e^2\Pi(k^2)} + \frac{ik_r k_s}{k^4}\, \frac{e^2\Pi(k^2)}{1 - e^2\Pi(k^2)}    (5.235)
In this expression, the second term with $k_rk_s$ can be ignored because it will never contribute to any physical process. This is because, in the computation of any physical amplitude, the propagator $D_{rs}$ will be sandwiched between two conserved currents, in the form $J^r(k)D_{rs}(k)J^s(k)$. Since gauge invariance demands that any current to which the photon couples must be conserved, we will necessarily have $k_sJ^s(k) = 0$ in Fourier space. This means that terms with a free tensor index on the momentum, like $k_r$ in the propagator, will not contribute to physical processes; therefore, we shall hereafter ignore the second term in Eq. (5.235).
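The resummation that produced the $1/(1 - e^2\Pi)$ factor above is just a geometric (Dyson) series; a scalar stand-in can be sketched numerically (illustrative numbers only, with $|e^2\Pi| < 1$ so the series converges):

```python
# Scalar stand-in for Eq. (5.235): dress the propagator D0 = -i/k^2 with n
# insertions of e^2 Pi(k^2) and compare the partial sums to the closed form
# -i/(k^2 (1 - e^2 Pi)).  The numbers below are illustrative, not physical.
k2 = 2.5
e2Pi = 0.3                 # stands for e^2 Pi(k^2)
D0 = -1j / k2

partial = sum(D0 * e2Pi**n for n in range(50))   # geometric series, 50 terms
closed = -1j / (k2 * (1.0 - e2Pi))

assert abs(partial - closed) < 1e-12
```

Fifty terms suffice here because the series converges geometrically, with the remainder of order $(e^2\Pi)^{50}$.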
Before proceeding further, we will rescale the photon propagator by a factor $e^2$ and use the rescaled propagator $D_{rs} \to e^2D_{rs}$. At this stage, you can think of it as just a redefinition. But this is far from a cosmetic exercise and has an important physical meaning, which is worth emphasizing. Conventionally, you would have written the Lagrangian describing QED in the form

$$
L = \bar\psi\left[i\gamma^m(\partial_m - ieA_m) - m\right]\psi - \frac{1}{4}F_{mn}F^{mn} \tag{5.236}
$$
In this description, the photon field has nothing to do with the electric charge $e$. The charge appears as a property of the electron in the Dirac sector when we couple $\psi$ to $A_j$, which sounds reasonable. This theory is invariant under the gauge transformations given by $\psi(x) \to e^{ie\Lambda(x)}\psi(x)$ and $A_m(x) \to A_m(x) + \partial_m\Lambda(x)$. The gauge transformation of the Dirac field involves a phase which depends on the charge $e$ of the electron field, but the photon field is still oblivious of the electronic charge $e$.

Figure 5.12: Scattering of an electron (a) by an external potential and (b) a one-loop correction to the same.

Figure 5.13: Diagrammatic representation of the geometric progression in Eq. (5.235).
We can, however, provide a completely equivalent description of this theory by rescaling $A_m \to (1/e)A_m$, thereby obtaining the Lagrangian:

$$
L = \bar\psi\left[i\gamma^m(\partial_m - iA_m) - m\right]\psi - \frac{1}{4e^2}F_{mn}F^{mn} \tag{5.237}
$$

[81] Of course, we get the same physical results even when we work with Eq. (5.236). We will come back to this aspect later in Sect. 5.8.

In this case, the charge has disappeared from the Dirac sector and appears as an overall constant $(1/e^2)$ in front of the electromagnetic action. This seems to suggest that the electric charge is more a property of the photon field than of the Dirac field! The gauge transformations are now given by $\psi \to e^{i\alpha}\psi$, $A_m \to A_m + \partial_m\alpha$ and are completely independent of the electronic charge. Obviously, this rescaling of the vector potential is equivalent to rescaling the photon propagator by $D_{rs} \to e^2D_{rs}$, which is what we want to do. So we are going to work with the Lagrangian in Eq. (5.237) rather than Eq. (5.236).
Since the two descriptions are mathematically the same, you may wonder why we are making such a fuss about it and why we prefer the latter
description. The reason has to do with the fact that the corrections which
the photon propagator picks up due to the electron loops (as in Fig. 5.8)
do depend on the electronic charge e which appears at the pair of vertices
in Fig. 5.8. As we mentioned earlier (see Eq. (5.227)), one of the effects
of the corrections to the photon propagator is to renormalize the value of
e2 ; that is, the e2 you introduce into the Lagrangian and the e2R which
will be measured in the lab will be different due to renormalization. But
how do we know that the renormalization changes both the charge of the
electron and the proton in the same manner, especially since protons participate in strong interactions while electrons do not? To pose this question
sharply, consider a theory involving protons, electrons and photons with a
Dirac-like term added to the Lagrangian in Eq. (5.236), where the proton
will be described with a different mass mp and a charge having the same
magnitude as the electron. (In addition, we can throw in the strong interaction terms for the proton.) We are keeping the charges of the electron
and proton equal at the tree-level by fiat. But it is not a priori obvious
that the corrections due to interactions will not lead to different renormalized charges for the electron and proton — which, of course, will be a
disaster. We know that this does not happen, but this is not obvious when
we write the Lagrangian in the form of Eq. (5.236). This fact is easier to
understand if we transfer the burden of the electric charge to the photon
field and write the Lagrangian as in Eq. (5.237). If we now add a proton
field to this Lagrangian, it will not bring in any new charge parameter (but
only a mass $m_p$). So clearly, what happens to $e^2$ is related to how the photon propagator gets modified due to all sorts of interactions, and the equality of the electron and proton charges is assured; after all, charge is a property of the electromagnetic field and not of the matter field [81] when we work with the Lagrangian in Eq. (5.237).
In this picture, which is conceptually better, the photon propagator describes the ease with which the photon propagates through a vacuum polarized by the existence of virtual, charged particle-antiparticle pairs. We will
therefore work with the rescaled propagator written as

$$
D_{rs} \to e^2D_{rs}(k) \equiv -\frac{i\eta_{rs}\,e^2}{k^2\left[1 - e^2\Pi(k^2)\right]} \tag{5.238}
$$

where we have dropped the second term in Eq. (5.235). With future applications in mind, we will rewrite the factor which occurs in this expression in the following form:

$$
\frac{e^2}{k^2\left[1 - e^2\Pi(k^2)\right]}
= \frac{(e^2/k^2)}{\left[1 - e^2\left(\Pi(k^2) - \Pi(0)\right) - e^2\Pi(0)\right]}
= \frac{(e_R^2/k^2)}{\left[1 - e_R^2\left(\Pi(k^2) - \Pi(0)\right)\right]} \tag{5.239}
$$

where we have defined the renormalized charge $e_R^2$, as in Eq. (5.227), by

$$
e_R^2 \equiv \frac{e^2}{1 - e^2\Pi(0)} \tag{5.240}
$$

We have essentially written $\Pi(k^2)$ in Eq. (5.238) as $[\Pi(k^2) - \Pi(0)] + \Pi(0)$, pulled out the factor $1 - e^2\Pi(0)$ from both numerator and denominator, and redefined the electric charge by Eq. (5.227). This exercise might appear a bit bizarre, but it has a fairly sound physical motivation based on the operational definition of the charge, which led to Eq. (5.227).
When we explicitly compute $\Pi(k^2)$ — which we will do soon — we will see that the quantity $\Pi(0)$ is divergent while $[\Pi(k^2) - \Pi(0)]$ is finite and well defined. If we now define the renormalized charge as a finite quantity given by Eq. (5.240) (by absorbing a divergent term into the parameter $e^2$), the final propagator in Eq. (5.239) is finite and well defined when expressed in terms of $e_R^2$. This expression also tells you that in the limit $k^2 \to 0$ (which corresponds to very large distances), the propagator in Eq. (5.239) reduces to the form $e_R^2/k^2$ which — in real space — will reproduce the Coulomb potential $e_R^2/r$. This is precisely the manner in which we would have defined the electronic charge at low energies (that is, in the $k^2 \to 0$ limit). So our renormalized charge in Eq. (5.240) is indeed the physically observed charge in the lab at arbitrarily low energies. [82]
Let us now compute the propagator by evaluating the Feynman diagram explicitly. We shall write down the contribution arising from the diagram in Fig. 5.8 as

$$
i\Pi^{mn}(k) = -(-ie)^2\int\frac{d^4p}{(2\pi)^4}\;\mathrm{tr}\left[\gamma^mS(p)\gamma^nS(p-k)\right] \tag{5.241}
$$
where $S(q) = i[\gamma q - m]^{-1}$ is the electron propagator. (We will write $\gamma^iq_i$, etc., simply as $\gamma q$ when no confusion is likely to arise.) Naive counting of powers shows that the integral goes as $\int(1/p^2)\,p^3\,dp$, which is quadratically divergent. But we do know that the final result must be expressible in the form of Eq. (5.231), which suggests that the integral will actually go as $\int(1/p^4)\,p^3\,dp$ and hence will diverge only logarithmically. Since the integral is divergent, we need to introduce a procedure for regularizing it before proceeding further.
One powerful technique we will use involves working in $n$ dimensions with sufficiently small $n$ to make the integrals converge, and then analytically continuing the result back to $n = 4$. (We did this earlier in Sect. 4.7.2 to tackle the divergences in the $\lambda\phi^4$ theory.) The divergences will then reappear, but they can be easily isolated in a controlled manner.

[82] Note that everything we said above remains valid even if $\Pi(0)$ is a finite quantity. The physical charge is indeed measured as the coefficient of $(1/k^2)$ in the limit $k \to 0$ and is given by Eq. (5.240). This is what an experimentalist will measure, and not the $e^2$ you threw into the Lagrangian. From this point of view, the renormalization of a parameter in the Lagrangian (like the charge) is a physical phenomenon and has nothing to do, a priori, with any divergence — a fact we have stressed several times in our discussions. Because $\Pi(0)$ happens to be divergent in QED, this process also allows us to cure the divergence.
Since the technique of dimensional regularization requires working in a generic dimension $n$, it is important to understand the dimensions of different quantities when we work in $n$ dimensions. This is most easily done by examining the different terms in the Lagrangian in Eq. (5.236) and ensuring that each term in the action remains dimensionless, i.e., that each term in $L$ has mass dimension $n$. (You will, of course, get the same results from Eq. (5.237).) To begin with, from the term involving $i\bar\psi\gamma^a\partial_a\psi$, it is clear that $\dim[\psi] = (1/2)(n-1)$. This will ensure the correct dimensionality for the $m\bar\psi\psi$ term as well, provided $\dim[m] = 1$ for all $n$. From the $F^{ab}F_{ab}$ part, we discover that $\dim[A_j] = (n/2) - 1$. (This will also ensure the correct dimension for all the gauge fixing terms when we add them.) Finally, in order to maintain $\dim[eA_j\bar\psi\gamma^j\psi] = n$, we must have $\dim[e] = 2 - (n/2)$. So, when we work in $n$ dimensions, we will get a dimensionless amplitude only if $e^2$ carries the dimension of $\mu^{4-n}$, where $\mu$ is an arbitrary mass scale.
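The dimension counting above can be packaged as a tiny linear system and solved symbolically. A sketch (SymPy assumed; the equation labels in the comments name the Lagrangian terms they come from):

```python
import sympy as sp

# Mass dimensions in n spacetime dimensions, fixed by demanding every term
# of the QED Lagrangian density have dimension n (so the action is
# dimensionless).  dpsi = dim[psi], dA = dim[A], de = dim[e].
n, dpsi, dA, de = sp.symbols('n d_psi d_A d_e')

eqs = [
    sp.Eq(2*dpsi + 1, n),        # psi-bar gamma^a d_a psi : two psi's, one derivative
    sp.Eq(2*(dA + 1), n),        # F_ab F^ab : (dim[A] + one derivative)^2
    sp.Eq(de + dA + 2*dpsi, n),  # e A_j psi-bar gamma^j psi
]
sol = sp.solve(eqs, [dpsi, dA, de])

assert sp.simplify(sol[dpsi] - (n - 1)/2) == 0   # dim[psi] = (n-1)/2
assert sp.simplify(sol[dA] - (n/2 - 1)) == 0     # dim[A]   = n/2 - 1
assert sp.simplify(sol[de] - (2 - n/2)) == 0     # dim[e]   = 2 - n/2
```

Setting $n = 4$ recovers the familiar values $\dim[\psi] = 3/2$, $\dim[A] = 1$, $\dim[e] = 0$.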
There are two ways of incorporating this result into our analysis. The conventional procedure (followed in most textbooks) is to replace the coupling constant $e^2$ by $e^2\mu^{4-n}$, where $\mu$ is an arbitrary mass scale, thereby retaining the dimensionless status of $e^2$. (This is similar to what we did in the case of $\lambda\phi^4$ theory in Eq. (4.146).) The fact that $e^2$ acquires a dimension when $n \neq 4$ suggests that it is better to pull out a factor $e^2\mu^{n-4}$ from $\Pi(k)$ in $n$ dimensions (rather than just $e^2$) and rewrite Eq. (5.240) in the form

$$
e_R^2 = \frac{e^2\mu^{n-4}}{1 - \left(e^2\mu^{n-4}\right)\Pi(0)} \tag{5.242}
$$
This introduces an arbitrary mass scale $\mu$, thereby keeping the renormalized coupling constant $e_R^2$ dimensionless. Rearranging the terms, we get

$$
\left[\frac{1}{e_R^2} + \Pi(0)\right]\mu^{-2\epsilon} = \frac{1}{e^2}; \qquad 2\epsilon \equiv 4 - n \tag{5.243}
$$

Since the right hand side is independent of the mass scale $\mu$ we have introduced, it follows that we must have the condition

$$
\mu^{-2\epsilon}\left[\frac{1}{e_R^2} + \Pi(0)\right] = \text{(independent of } \mu\text{)} \tag{5.244}
$$

This will prove to be useful later on — once we have evaluated $\Pi(0)$ — in determining the "running" of the coupling constant $e_R^2$ with respect to the mass scale $\mu$.
In this approach, it is a little unclear how the final results are going to be independent of the parameter $\mu$ which we have introduced. Therefore, we will not make this replacement right at the beginning, but will instead keep $e^2$ a constant with dimension $\mu^{4-n}$. We will see that, at a particular stage, we can introduce a mass scale into the theory in a manner which makes it obvious that none of the physical results depend on $\mu$.

So let us evaluate $\Pi^{mn}(k)$ in Eq. (5.241), keeping $e^2$ as it is. Proceeding to $n$ dimensions and rationalizing the Dirac propagator allows us to rewrite the amplitude as

$$
i\Pi^{mn}(k) = -e^2\int\frac{d^np}{(2\pi)^n}\;\frac{\mathrm{tr}\left\{\gamma^m(m+\gamma p)\,\gamma^n\left(m+\gamma(p-k)\right)\right\}}{(p^2 - m^2)\left((p-k)^2 - m^2\right)} \tag{5.245}
$$
where $e^2$ is defined in $n$ dimensions and is not dimensionless. The traces involved in the expression can be computed in a fairly straightforward manner [83] to give the numerator:

$$
N \equiv \mathrm{tr}\{\cdots\} = \left[p^m(p-k)^n + (p-k)^mp^n - \left(m^2 + p\cdot(p-k)\right)g^{mn}\right](\mathrm{tr}\,\mathbf{1}) \tag{5.246}
$$

[83] The following relations, which hold in a spacetime of even dimensionality $n$, are useful here and elsewhere: (1) $\mathrm{tr}\,\mathbf{1} = 2^{n/2}$; (2) $\mathrm{tr}\,\gamma^m\gamma^n = -2^{n/2}g^{mn}$; (3) $\mathrm{tr}\,\gamma^m\gamma^n\gamma^r\gamma^s = 2^{n/2}\left(g^{mn}g^{rs} - g^{mr}g^{ns} + g^{ms}g^{nr}\right)$; (4) $\mathrm{tr}\,\gamma^{m_1}\gamma^{m_2}\cdots\gamma^{m_k} = 0$ for odd $k$.

In obtaining the result, we have left $(\mathrm{tr}\,\mathbf{1})$ unspecified but, as we shall see, this is irrelevant to our computation as long as $(\mathrm{tr}\,\mathbf{1}) = 4$ in the four dimensional limit. The simplest way to compute the momentum integrals is to proceed as follows: (i) Analytically continue to the Euclidean sector (with $d^np \to i\,d^np_E$, $p^2 \to -p^2$, etc.). (ii) Write each of the factors in the denominator as an exponential integral using

$$
\frac{1}{F(p)} = \int_0^\infty dt\;e^{-tF(p)} \tag{5.247}
$$

(iii) Complete the square in the Gaussian and evaluate the momentum integrals as standard Gaussian integrals. (These are standard tricks; we used them earlier in Sect. 4.7.2.) The first two steps give
$$
\Pi^{mn}(k) = -e^2\int\frac{d^np_E}{(2\pi)^n}\int_0^\infty ds\,dt\;e^{-s(p^2+m^2)}\,e^{-t\left((p-k)^2+m^2\right)}\,N \tag{5.248}
$$
Completing the square requires shifting the momentum variable to $p' = p - [tk/(s+t)]$, which leads to

$$
\Pi^{mn}(k) = -e^2(\mathrm{tr}\,\mathbf{1})\int\frac{d^np_E}{(2\pi)^n}\int_0^\infty ds\,dt\;e^{-(s+t)m^2}\,e^{-\frac{st}{s+t}k^2}
\left[2p^mp^n - \frac{2st}{(s+t)^2}k^mk^n - g^{mn}\left(m^2 + p^2 - \frac{st}{(s+t)^2}k^2\right)\right]e^{-(s+t)p^2} \tag{5.249}
$$
The Gaussian integrals are easy to do using the standard result:

$$
\int\frac{d^np_E}{(2\pi)^n}\;p^mp^n\,e^{-Ap^2} = \frac{g^{mn}}{2A}\,\frac{1}{(4\pi A)^{n/2}} \tag{5.250}
$$
and we obtain:

$$
\Pi^{mn}(k) = -\frac{(\mathrm{tr}\,\mathbf{1})\,e^2}{(4\pi)^{n/2}}\int_0^\infty\frac{ds\,dt}{(s+t)^{n/2}}\;e^{-(s+t)m^2}\exp\left[-\frac{st}{s+t}k^2\right]
\left\{\frac{g^{mn}}{s+t} - \frac{2st}{(s+t)^2}k^mk^n - \left[m^2 + \frac{1}{2}\frac{n}{s+t} - \frac{st}{(s+t)^2}k^2\right]g^{mn}\right\} \tag{5.251}
$$
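The Gaussian master formula Eq. (5.250) can be sanity-checked numerically in a low dimension, where the integral is easy to do directly. A sketch (SciPy assumed) for the diagonal $m = n = x$ component in $n = 2$ Euclidean dimensions:

```python
import numpy as np
from scipy import integrate

# Numerical check of Eq. (5.250) in n = 2 Euclidean dimensions, for the
# diagonal component m = n = x:
#   int d^2p/(2pi)^2  p_x p_x e^{-A p^2}  =  (1/2A) * 1/(4 pi A)
A = 0.7

val, _ = integrate.dblquad(
    lambda py, px: px * px * np.exp(-A * (px**2 + py**2)) / (2*np.pi)**2,
    -np.inf, np.inf, lambda px: -np.inf, lambda px: np.inf)

expected = (1.0 / (2*A)) * 1.0 / (4*np.pi*A)
assert abs(val - expected) < 1e-6
```

The off-diagonal components vanish by symmetry, matching the $g^{mn}$ structure of Eq. (5.250).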
At this stage, one resorts to yet another useful trick (see Eq. (4.165) of Sect. 4.7.2) to make further progress. We will introduce a factor of unity written in the form

$$
1 = \int_0^\infty dq\;\delta(q - s - t) \tag{5.252}
$$

and rescale the integration variables $(s,t)$ to $(\alpha,\beta)$, where $s = \alpha q$, $t = \beta q$. Fairly straightforward manipulation now shows that the result indeed has the form advertised [84] in Eq. (5.231):
$$
\Pi^{mn}(k) = \left(k^2g^{mn} - k^mk^n\right)\mathcal{P}(k) \tag{5.253}
$$

[84] The fact that $k_m\Pi^{mn}(k) = 0$ — which leads to this specific structure — can be directly verified from Eq. (5.241). You first extend the integral to $n$ dimensions to make it convergent, and then take the dot product with $k_m$. If you now write $k\gamma = [\gamma(p+k) - m] - [\gamma p - m]$, you will find that the expression reduces to the difference between two integrals which are equal when you shift the variable of integration in one of them from $p$ to $p+k$. It is this shifting which is invalid for the original divergent integral in $D = 4$; but we can do it after proceeding to $n$ dimensions.
where

$$
\mathcal{P}(k) = -2\,\frac{(\mathrm{tr}\,\mathbf{1})}{(4\pi)^{n/2}}\;e^2\,\Gamma\!\left(2-\frac{n}{2}\right)\int_0^1 d\alpha\;\alpha(1-\alpha)\left[m^2 + \alpha(1-\alpha)k^2\right]^{(n/2)-2} \tag{5.254}
$$

[85] This procedure (viz. multiplying by unity, $1 = \mu^{n-4}/\mu^{n-4}$) clearly demonstrates that $\mathcal{P}(k^2)$ is independent of the mass scale $\mu$, as long as you do not do anything silly while regularizing the expression.
Our aim now is to take the $n \to 4$ limit, which requires taking the limit $\epsilon \to 0$ in a factor $x^{(n/2)-2} = x^{-\epsilon}$, with $\epsilon \equiv 2 - (n/2)$, inside the integral. Writing $x^{-\epsilon} = \exp(-\epsilon\ln x)$ and taking the limit, we will obtain a term of the form $(1 - \epsilon\ln x)$. For $\ln x$ to make sense, it is necessary that $x$ be a dimensionless object. To take care of this fact, we will rewrite Eq. (5.254) by multiplying and dividing the expression by a factor $\mu^{n-4}$, where $\mu$ is an unspecified finite energy scale. [85]

$$
\mathcal{P}(k^2) = -2\,\frac{(\mathrm{tr}\,\mathbf{1})}{(4\pi)^{n/2}}\;e^2\mu^{n-4}\,\Gamma\!\left(2-\frac{n}{2}\right)\int_0^1 d\alpha\;\alpha(1-\alpha)\left[\frac{m^2 - \alpha(1-\alpha)k^2}{\mu^2}\right]^{(n/2)-2} \equiv e^2\mu^{n-4}\,\Pi(k^2) \tag{5.255}
$$

Since we know that $e^2$ in $n$ dimensions has the dimension $\mu^{4-n}$, this keeps $e^2\mu^{n-4}$ dimensionless, as it should be. We have also rotated back from the Euclidean to the Lorentzian sector, which changes the sign of $k^2$. The last equality defines $\Pi(k^2)$, after pulling out the factor $e^2\mu^{n-4}$.
Let us first evaluate $\Pi(0)$ using this expression, in the limit $\epsilon \equiv [2 - (n/2)] \to 0$. Setting $k = 0$ and $\mathrm{tr}\,\mathbf{1} = 4$, we have

$$
\Pi(0) = -\frac{1}{2\pi^2}\,\Gamma(\epsilon)\left(\frac{\mu^2}{m^2}\right)^\epsilon\int_0^1 d\alpha\;\alpha(1-\alpha) \tag{5.256}
$$

The integral has the value $1/6$. We now take the limit $\epsilon \to 0$ using the expansion of the gamma function

$$
\Gamma(\epsilon) \approx \frac{1}{\epsilon} - \gamma_E \tag{5.257}
$$

(where $\gamma_E$ is Euler's constant) and writing

$$
\left(\frac{\mu}{m}\right)^{2\epsilon} = \exp\left[2\epsilon\ln\frac{\mu}{m}\right] \approx 1 + 2\epsilon\ln\frac{\mu}{m} \tag{5.258}
$$
This leads to the result [86]

$$
\Pi(0) = -\frac{1}{12\pi^2\,\epsilon}\left(1 + \mathcal{O}(\epsilon)\right) \tag{5.259}
$$

[86] To be precise, there is a bit of an ambiguity here. Because $\Pi(0)$ is a divergent quantity, you could have subtracted out any amount of finite terms and redefined it as a new $\Pi(0)$. Again, no physical result should depend on how you do this subtraction, but in higher order computations certain subtractions will turn out to be more advantageous than others. What we have done here is closely related to the minimal subtraction described in Sect. 4.8. Since our emphasis is on concepts rather than calculations, we will not elaborate on this again.
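The two numerical ingredients behind Eq. (5.259) — the Feynman-parameter integral equal to $1/6$ and the expansion $\Gamma(\epsilon) \approx 1/\epsilon - \gamma_E$ — are easy to check directly (SciPy assumed):

```python
import numpy as np
from scipy import special, integrate

# Check int_0^1 da a(1-a) = 1/6, used in going from Eq. (5.256) to (5.259).
val, _ = integrate.quad(lambda a: a * (1 - a), 0.0, 1.0)
assert abs(val - 1.0/6.0) < 1e-12

# Check Gamma(eps) ~ 1/eps - gamma_E for small eps, Eq. (5.257).
eps = 1e-4
gamma_eps = special.gamma(eps)
assert abs(gamma_eps - (1.0/eps - np.euler_gamma)) < 1e-3
```

The residual in the second assertion is of order $\epsilon$, coming from the next term in the Laurent expansion of $\Gamma(\epsilon)$.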
Once we know the structure of $\Pi(0)$ — which, of course, diverges as $1/\epsilon$ when $\epsilon \to 0$ — we can use it in Eq. (5.244). Substituting the form of $\Pi(0)$ into Eq. (5.244) and expanding the result to the relevant order, we demand that the expression

$$
\left[\frac{1}{e_R^2} - \frac{1}{12\pi^2\,\epsilon}\right]\left[1 - 2\epsilon\ln\frac{\mu}{m}\right] + \mathcal{O}(\epsilon)
= \frac{1}{e_R^2} + \frac{1}{6\pi^2}\ln\frac{\mu}{m} - \frac{1}{12\pi^2\,\epsilon} \tag{5.260}
$$

remains independent of $\mu$. Evaluating the right hand side at two values of $\mu$, say $\mu = \mu_1$ and $\mu = \mu_2$, and subtracting one from the other, we get the result

$$
\frac{1}{e_R^2(\mu_2)} = \frac{1}{e_R^2(\mu_1)} - \frac{1}{6\pi^2}\ln\frac{\mu_2}{\mu_1} \tag{5.261}
$$
This allows us to relate the renormalized coupling constant at two different scales as:

$$
e_R^2(\mu_2) = e_R^2(\mu_1)\left[1 - \frac{e_R^2(\mu_1)}{6\pi^2}\ln\frac{\mu_2}{\mu_1}\right]^{-1} \tag{5.262}
$$

This is analogous to the result we obtained earlier nonperturbatively (see Eq. (4.55) of Sect. 4.2.2) and in Eq. (5.202) for the charged scalar field and the Dirac field. [87] The sign of the coefficient of the logarithmic term is important. We see that, because of the negative sign, the effective charge becomes larger at short distances (i.e., at higher energies) and the coupling gets stronger.
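The running in Eq. (5.262) can be sketched numerically. The starting value $e_R^2(\mu_1) = 4\pi\alpha$ with $\alpha \approx 1/137$ is an assumed input here, not something derived in the text at this point:

```python
import numpy as np

def e2_run(e2_mu1, mu2_over_mu1):
    """Eq. (5.262): e_R^2(mu2) = e_R^2(mu1) / (1 - e_R^2(mu1)/(6 pi^2) ln(mu2/mu1))."""
    return e2_mu1 / (1.0 - e2_mu1 / (6.0 * np.pi**2) * np.log(mu2_over_mu1))

e2_low = 4.0 * np.pi / 137.036      # assumed lab-scale value of e_R^2
e2_high = e2_run(e2_low, 100.0)     # run up two decades in mu

# The negative sign of the log coefficient makes the coupling grow with mu:
assert e2_high > e2_low

# Running back down returns the starting value: Eq. (5.262) is additive in
# 1/e_R^2, so the composition of runs is exact.
assert abs(e2_run(e2_high, 1.0/100.0) - e2_low) < 1e-12
```

The second assertion reflects the group property of Eq. (5.261): $1/e_R^2$ shifts linearly in $\ln\mu$.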
Let us now proceed to evaluate the physical part of the propagator contributed by $\Pi_{\rm phys}(k^2) \equiv \Pi(k^2) - \Pi(0)$. To do this, we first evaluate Eq. (5.255) in the limit of small $\epsilon$ by replacing $x^{-\epsilon}$ by $(1 - \epsilon\ln x)$ to obtain:

$$
\Pi(k^2) = -\frac{1}{2\pi^2}\,\Gamma(\epsilon)\int_0^1 d\alpha\;\alpha(1-\alpha)\left[1 + \epsilon\ln\frac{\mu^2}{m^2 - k^2\alpha(1-\alpha)}\right] \tag{5.263}
$$

From this we obtain, by direct subtraction, the result:

$$
\Pi_{\rm phys}(k^2) \equiv \Pi(k^2) - \Pi(0) = -\frac{\epsilon\,\Gamma(\epsilon)}{2\pi^2}\int_0^1 d\alpha\;\alpha(1-\alpha)\ln\frac{m^2}{m^2 - k^2\alpha(1-\alpha)} \tag{5.264}
$$
On using $\epsilon\,\Gamma(\epsilon) \to 1$ as $\epsilon \to 0$, we get:

$$
\Pi_{\rm phys}(k^2) = \frac{1}{2\pi^2}\int_0^1 d\alpha\;\alpha(1-\alpha)\,\ln\!\left[1 - \frac{k^2}{m^2}\,\alpha(1-\alpha)\right] \tag{5.265}
$$

which is perfectly well defined, independent of $\mu$ and, of course, finite. [88]

[87] Since we are evaluating $\Pi_{mn}$ perturbatively to first order in $e^2$, the question again arises as to whether it is legitimate to write an equation like Eq. (5.262) with a denominator containing a $1 - e_R^2(\ldots)$ kind of factor. As we mentioned in the case of $\lambda\phi^4$ theory — see the discussion after Eq. (4.179) — the result is similar to the one in Eq. (5.238). The Eq. (5.238), in turn, was obtained by summing a subclass of diagrams in Fig. 5.13 as a geometric progression to all orders in the coupling constant. The expression in Eq. (5.262) thus captures the effect of summing over a class of diagrams on the running of the coupling constant.
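Eq. (5.265) is straightforward to evaluate numerically. A sketch (SciPy assumed; units with $m = 1$) checks $\Pi_{\rm phys}(0) = 0$ and the small-$|q^2|$ expansion of the logarithm, which gives $\Pi_{\rm phys} \approx -q^2/(60\pi^2m^2)$:

```python
import numpy as np
from scipy import integrate

def Pi_phys(q2, m=1.0):
    """Direct numerical evaluation of Eq. (5.265); valid below q^2 = 4 m^2."""
    f = lambda a: a * (1 - a) * np.log(1.0 - (q2 / m**2) * a * (1 - a))
    val, _ = integrate.quad(f, 0.0, 1.0)
    return val / (2.0 * np.pi**2)

# Subtracted quantity vanishes at zero momentum transfer by construction:
assert abs(Pi_phys(0.0)) < 1e-12

# Small-|q^2| behaviour: expanding ln(1-u) ~ -u and using
# int a^2(1-a)^2 da = 1/30 gives Pi_phys ~ -q^2/(60 pi^2 m^2).
q2 = 1e-3
approx = -q2 / (60.0 * np.pi**2)
assert abs(Pi_phys(q2) - approx) < 1e-3 * abs(approx)
```

The tolerance in the last line allows for the $\mathcal{O}(q^4)$ terms dropped in the expansion.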
This quantity $\Pi_{\rm phys}(k^2)$ plays an operational role in scattering amplitudes. To see this, consider the scattering of two very heavy fermions $f_1$ and $f_2$ which have the same charge as the electron but much larger masses $m_1 \gg m$ and $m_2 \gg m$. We want to study the lowest order scattering of these fermions at energies $m^2 \ll k^2 \ll (m_1^2, m_2^2)$. At these energies, the photon propagator will get modified due to virtual electron loops in exactly the manner described earlier. The lowest order scattering of these fermions, without incorporating electron loops, is governed by the diagram in Fig. 5.14, which translates into the tree-level amplitude
$$
M_{\rm tree} = (-ie)^2\,\frac{-i\eta_{mn}}{q^2}\,\left[\bar u(p_1)\gamma^mu(k_1)\right]\left[\bar u(p_2)\gamma^nu(k_2)\right]
\equiv (-ie)^2\,\frac{-i\eta_{mn}}{q^2}\,j^m(p_1,k_1)\,j^n(p_2,k_2) \tag{5.266}
$$
where $q^i$ is the momentum transfer. [89] We, however, know that the photon propagator in the diagram in Fig. 5.14 gets modified due to the electron loops. (It will also get modified by, say, muon loops and even by loops of the heavy fermions $f_1$, $f_2$ we are studying. However, the dominant effect comes from the lightest charged fermion, which we take to be the electron.) At the next order of approximation, the corrected amplitude will be given by

$$
M_{\rm loop} = (-ie_R)^2\,\frac{-i\eta_{mn}}{q^2\left(1 - e_R^2\,\Pi_{\rm phys}(q^2)\right)}\,J^m(p_1,k_1)\,J^n(p_2,k_2) \tag{5.267}
$$
[88] One can evaluate the integral explicitly, but the result is not very transparent; see Eq. (5.277). We will discuss its explicit form later.

Figure 5.14: Diagram for tree-level scattering of two heavy fermions.

[89] This is a bit schematic, in the sense that we have ignored spin labels etc. They are not important for the point we want to make.
where $J^m(p_1,k_1) = \bar u(p_1)\Lambda^m(p_1,k_1)u(k_1)$ etc., in which the vertex corrections have changed $\gamma^m$ to $\Lambda^m$. (We will work this out explicitly in Sect. 5.7.3.) In the limit $|q^2| \ll m_i^2$, however, we can still approximate the vertex functions by $\Lambda^m(p_1,k_1) \approx \gamma^m$, and also use the approximate form for the photon propagator. It is then obvious that the key effect of the one-loop correction is to change the effective value of the electronic charge from $e_R$ (fixed at large distances, i.e., at low energies) to the effective value $e_{\rm eff}^2(q)$ (for a scattering involving the momentum transfer $q^2$), where:

$$
\frac{1}{e_{\rm eff}^2} = \frac{1}{e_R^2} - \Pi_{\rm phys}(q^2) \tag{5.268}
$$
and $\Pi_{\rm phys}(q^2)$ is given by Eq. (5.265). It is easy to verify that

$$
q^2\,\frac{d}{dq^2}\left[\frac{1}{e_{\rm eff}^2}\right] \approx
\begin{cases}
0, & (-q^2 \ll m^2)\\[8pt]
-\dfrac{1}{12\pi^2}, & (-q^2 \gg m^2)
\end{cases} \tag{5.269}
$$

To a good approximation, $e_{\rm eff}^2$ remains approximately constant for $|q^2| < m^2$, and it starts running at a constant rate (per logarithmic interval in $|q^2|$) for $|q^2| \gg m^2$. This analysis shows that the concept of a running coupling constant is an operationally well defined result which is testable by scattering experiments.
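The large-momentum rate in Eq. (5.269) can be checked by differentiating Eq. (5.265) numerically at a deep spacelike point (SciPy assumed; units with $m = 1$):

```python
import numpy as np
from scipy import integrate

def Pi_phys(q2, m=1.0):
    """Eq. (5.265), evaluated numerically (spacelike q^2 < 0 here)."""
    f = lambda a: a * (1 - a) * np.log(1.0 - (q2 / m**2) * a * (1 - a))
    val, _ = integrate.quad(f, 0.0, 1.0)
    return val / (2.0 * np.pi**2)

q2 = -1.0e6                      # -q^2 = 10^6 m^2, deep in the running regime
h = abs(q2) * 1e-3
dPi = (Pi_phys(q2 + h) - Pi_phys(q2 - h)) / (2 * h)   # central difference
rate = -q2 * dPi                 # q^2 d(1/e_eff^2)/dq^2 = -q^2 dPi/dq^2

# Should approach -1/(12 pi^2), up to O(m^2/q^2) corrections:
assert abs(rate - (-1.0 / (12.0 * np.pi**2))) < 0.05 / (12.0 * np.pi**2)
```

The same code evaluated at $-q^2 \ll m^2$ gives a rate close to zero, reproducing the upper branch of Eq. (5.269).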
There is another aspect of the running coupling constant which is worth mentioning at this stage. In Eq. (5.228), we determined how the Coulomb potential is modified by the electron loops by evaluating a sum over a set of diagrams in geometric progression. Suppose we do not perform such a sum and stick to the lowest order correction. Then Eq. (5.228) will read:

$$
V(k) \approx \frac{e^2}{k^2}\left[1 + e^2\Pi(k^2) + \mathcal{O}(e^4)\right] \tag{5.270}
$$

If we now define the coupling constant at some scale $k = k_1$ by $k_1^2V(k_1) \equiv e^2(k_1)$, then we have $e_1^2 = e^2\left[1 + e^2\Pi(k_1^2)\right]$, which can be solved, to the same order of approximation we are working in, to give $e^2 = e_1^2\left[1 - e_1^2\Pi(k_1^2)\right]$. Substituting for $e^2$ in Eq. (5.270), we get:

$$
k^2V(k) \approx e^2(k_1)\left\{1 - e^2(k_1)\left[\Pi(k_1^2) - \Pi(k^2)\right]\right\} \tag{5.271}
$$
So the corrected potential is

$$
V(k) \approx \frac{e^2(k_1)}{k^2}\left\{1 - \frac{e^2(k_1)}{2\pi^2}\int_0^1 d\alpha\;\alpha(1-\alpha)\,\ln\!\left[\frac{m^2 - k_1^2\,\alpha(1-\alpha)}{m^2 - k^2\,\alpha(1-\alpha)}\right]\right\} \tag{5.272}
$$
If we now consider the limit $k_1 \gg m$, $k \gg m$, this result reduces to:

$$
V(k) \approx \frac{e^2(k_1)}{k^2}\left[1 - \frac{e^2(k_1)}{6\pi^2}\ln\frac{k_1}{k}\right] \tag{5.273}
$$
[90] Terms like $e^3(k_1)[de(k_1)/dk_1] = \mathcal{O}(e^6)$ are consistently ignored to arrive at this result.

We certainly do not want $V(k)$ to depend on the arbitrary scale $k_1$ we used to define our coupling constant; so we demand $k_1(dV/dk_1) = 0$. Working this out, to the order of accuracy [90] we are interested in, we get the differential equation:

$$
k_1\,\frac{de^2(k_1)}{dk_1} \equiv \beta(e^2) = \frac{e^4(k_1)}{6\pi^2} \tag{5.274}
$$
which again defines the beta function of the theory. This is the same as Eq. (5.202), which we obtained earlier from the Schwinger effect.

If we now integrate this differential equation exactly, we get the result that relates the coupling constant at two different energy scales:

$$
e^2(k_2) = e^2(k_1)\left[1 - \frac{e^2(k_1)}{6\pi^2}\ln\frac{k_2}{k_1}\right]^{-1} \tag{5.275}
$$
This is algebraically the same as the result we found earlier in Eq. (5.262), but there is a subtle difference. In the above analysis, we started with the one-loop result in Eq. (5.270) and did not sum over the geometric series of diagrams. We then took the large momentum approximation ($k_1 \gg m$, $k_2 \gg m$), which led to the potential in Eq. (5.273) in the 'leading log' approximation. [91] The demand that $V(k)$ should not depend on the arbitrary scale $k_1$ we have introduced then led to the beta function of the theory in Eq. (5.274). We integrated this equation for the running coupling constant, pretending that it is exact; that is, we did not ignore terms of order $\mathcal{O}(e^6)$, which were indeed ignored in the result in Eq. (5.270) that we started with. This process has led to the same result as we would have obtained by summing over a specific set of diagrams as a geometric progression! This is no big deal in one-loop QED, but in more complicated cases or at higher orders, obtaining the beta function to a certain order of perturbation and integrating it exactly allows us to bypass the (more) complicated summation of a subset of diagrams.
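The statement that integrating the truncated beta function reproduces the resummed result can be sketched numerically: a simple RK4 integration of Eq. (5.274) in $t = \ln k$ should land on the closed form Eq. (5.275). The initial value below is illustrative only:

```python
import numpy as np

def beta(e2):
    """One-loop beta function of Eq. (5.274): k de^2/dk = e^4/(6 pi^2)."""
    return e2**2 / (6.0 * np.pi**2)

e2_start = 0.09                    # assumed value of e^2(k_1)
e2 = e2_start
t1, t2, steps = 0.0, np.log(50.0), 2000
h = (t2 - t1) / steps
for _ in range(steps):             # classic RK4 in t = ln k
    k1 = beta(e2)
    k2 = beta(e2 + 0.5*h*k1)
    k3 = beta(e2 + 0.5*h*k2)
    k4 = beta(e2 + h*k3)
    e2 += (h / 6.0) * (k1 + 2*k2 + 2*k3 + k4)

# Closed-form solution, Eq. (5.275), for k_2/k_1 = 50:
closed = e2_start / (1.0 - e2_start / (6.0 * np.pi**2) * np.log(50.0))
assert abs(e2 - closed) < 1e-9
```

The agreement reflects that Eq. (5.275) is the exact solution of Eq. (5.274): $1/e^2$ is linear in $\ln k$.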
In this particular case of one-loop QED, we can also use the form of $\Pi(k^2) - \Pi(k_1^2)$ for any two values of $k$ and $k_1$ in Eq. (5.230) to obtain the coupling constants defined at these two scales as:

$$
\frac{1}{e^2(k)} - \frac{1}{e^2(k_1)} = \Pi(k_1^2) - \Pi(k^2)
= \frac{1}{2\pi^2}\int_0^1 d\alpha\;\alpha(1-\alpha)\,\ln\!\left[\frac{m^2 - k_1^2\,\alpha(1-\alpha)}{m^2 - k^2\,\alpha(1-\alpha)}\right] \tag{5.276}
$$

When we do not invoke the approximation $k_1 \gg m$, $k \gg m$, this result is more complicated.
Let us next consider the explicit form of $\Pi_{\rm phys}(q^2)$. The presence of the logarithm signals the fact that it is non-analytic in $q^2$. You can evaluate it by straightforward integration techniques to obtain the result [92]

$$
\Pi_{\rm phys}(q^2) = \frac{1}{2\pi^2}\int_0^1 d\alpha\;\alpha(1-\alpha)\,\ln\!\left[1 - \frac{q^2}{m^2}\,\alpha(1-\alpha)\right]
= \frac{1}{4\pi^2}\left[-\frac{5}{9} - \frac{4}{3}\,\frac{m^2}{q^2} + \frac{1}{3}\left(1 + \frac{2m^2}{q^2}\right)f(q^2)\right] \tag{5.277}
$$

where $f(q^2)$ has different forms in different ranges and is given by

$$
f(q^2) =
\begin{cases}
\sqrt{1 - \dfrac{4m^2}{q^2}}\;\ln\!\left[\dfrac{\sqrt{1 - 4m^2/q^2} + 1}{\sqrt{1 - 4m^2/q^2} - 1}\right], & (q^2 < 0)\\[12pt]
2\sqrt{\dfrac{4m^2}{q^2} - 1}\;\tan^{-1}\!\left[\dfrac{1}{\sqrt{4m^2/q^2 - 1}}\right], & (0 < q^2 \le 4m^2)\\[12pt]
\sqrt{1 - \dfrac{4m^2}{q^2}}\left\{\ln\!\left[\dfrac{1 + \sqrt{1 - 4m^2/q^2}}{1 - \sqrt{1 - 4m^2/q^2}}\right] - i\pi\right\}, & (4m^2 < q^2)
\end{cases} \tag{5.278}
$$

[91] Recall that something very similar happened with the running of $\lambda$ in the $\lambda\phi^4$ theory, which we commented upon in Sect. 4.7.2 just after Eq. (4.179).

[92] The UV divergences which arise in a theory are usually proportional to polynomials in the momenta. In a non-renormalizable theory, these polynomials — translating to higher and higher order derivatives in real space — lead to local short distance effects. In contrast, the finite part arising from the loops (even in a non-renormalizable theory) can have non-analytic momentum dependence, which leads to residual, well defined, long distance interactions.

Exercise 5.15: Prove this result. (Hint: Integrate by parts to eliminate the logarithm and introduce the variable $v = 2\beta - 1$. The rest can be done by careful complex integration.)
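The branch structure of Eq. (5.278) can be implemented and checked against a direct numerical integration of Eq. (5.265). A sketch (SciPy assumed; units with $m = 1$; the helper names are ours, not the book's):

```python
import numpy as np
from scipy import integrate

def f_func(q2, m=1.0):
    """The three branches of f(q^2) in Eq. (5.278)."""
    if q2 < 0:
        a = np.sqrt(1.0 - 4*m**2/q2)
        return a * np.log((a + 1.0) / (a - 1.0))
    if q2 <= 4*m**2:
        b = np.sqrt(4*m**2/q2 - 1.0)
        return 2.0 * b * np.arctan(1.0 / b)
    a = np.sqrt(1.0 - 4*m**2/q2)
    return a * (np.log((1.0 + a) / (1.0 - a)) - 1j*np.pi)

def Pi_closed(q2, m=1.0):
    """Closed form of Eq. (5.277)."""
    return (-5.0/9.0 - (4.0/3.0)*m**2/q2
            + (1.0/3.0)*(1.0 + 2.0*m**2/q2)*f_func(q2, m)) / (4.0*np.pi**2)

def Pi_num(q2, m=1.0):
    """Direct numerical integration of Eq. (5.265)."""
    g = lambda a: a*(1-a)*np.log(1.0 - (q2/m**2)*a*(1-a))
    val, _ = integrate.quad(g, 0.0, 1.0)
    return val / (2.0*np.pi**2)

# Closed form agrees with the Feynman-parameter integral at a spacelike point:
assert abs(Pi_closed(-1.0) - Pi_num(-1.0)) < 1e-7

# Above the e+e- threshold q^2 = 4m^2, f develops an imaginary part:
assert abs(f_func(5.0).imag) > 0.0
```

Expanding the middle branch for small $q^2$ gives $f \approx 2 - q^2/(6m^2)$, which reproduces the small-momentum behaviour $\Pi_{\rm phys} \approx -q^2/(60\pi^2m^2)$ noted earlier.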
Exercise 5.16: Prove this result.
We see that for $q^2 > 4m^2$, the function $\Pi_{\rm phys}(q^2)$ picks up an imaginary part. This is the threshold for the production of $e^+e^-$ pairs by the photon. If we interpret $(1 - \Pi_{\rm phys})$ as analogous to the dielectric function of a medium (in this case, the vacuum), the negative imaginary part signals the absorption of electromagnetic radiation. In fact, this analogy can be made more precise in the form of a dispersion relation

$$
\Pi_{\rm phys}(q^2) = \frac{q^2}{\pi}\int_0^\infty dq'^2\;\frac{\mathrm{Im}\,\Pi_{\rm phys}(q'^2)}{q'^2\left(q'^2 - q^2 - i\epsilon\right)} \tag{5.279}
$$

which tells you that the imaginary part of $\Pi_{\rm phys}$ determines the entire function. [93]

[93] As we described earlier in Sect. 3.7, the appearance of a branch cut (and an imaginary part) in a propagator signals intermediate states with some specified mass threshold. In this particular case, the threshold $q^2 = 4m^2$ arises from an intermediate state $|e^+e^-\rangle$ containing an $e^+e^-$ pair, which will contribute a factor $|\langle 0|A_m|e^+e^-\rangle|^2$. Any such state with $\mathbf{p} = 0$ will be labelled by the momenta $p_1 = (\mathcal{E}/2, \mathbf{k}/2)$, $p_2 = (\mathcal{E}/2, -\mathbf{k}/2)$. The momentum transfer is now $q^2 = (p_1 + p_2)^2 = \mathcal{E}^2 = |\mathbf{k}|^2 + 4m^2$. The factor $\left(1 - (4m^2/q^2)\right)^{1/2}$ in Eq. (5.278) in fact arises from the phase space density $k^2\,dk = kE\,dE \propto k\,d(q^2) = (q^2 - 4m^2)^{1/2}\,dq^2$. It is the creation of $e^+e^-$ pairs which leads to the imaginary part in the propagator.

Exercise 5.17: Prove Eq. (5.280). [Hint: Write $\alpha(1-\alpha)$ as $(d/d\alpha)\left[(\alpha^2/2) - (\alpha^3/3)\right]$. Substituting and integrating by parts, you should be able to get the relevant result with $r(s)$ expressed as an integral which can be easily evaluated.]

Exercise 5.18: Prove Eq. (5.284), Eq. (5.285) and Eq. (5.286).

[94] This change in the Coulomb potential does affect the atomic energy levels and contributes to the so-called Lamb shift (which is the shift between the energy levels 2S$_{1/2}$ and 2P$_{1/2}$ of the hydrogen atom, which have the same energy in Dirac's theory). There is, however, a much larger contribution to this energy shift which arises from the vertex terms. We shall not discuss this effect.
Yet another useful integral representation for $\Pi_{\rm phys}$ is given by the expression:

$$
\Pi_{\rm phys}(k^2) = \frac{1}{4\pi}\int_{4m^2}^\infty ds\;r(s)\,\frac{k^2}{k^2 + s} \tag{5.280}
$$

where

$$
r(s) = \frac{1}{3\pi}\,\frac{1}{s}\,\sqrt{1 - \frac{4m^2}{s}}\left(1 + \frac{2m^2}{s}\right) \tag{5.281}
$$
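The equivalence of the spectral form Eq. (5.280) with the Feynman-parameter form Eq. (5.265) can be checked numerically. Note that $k^2$ in Eq. (5.280) is the spacelike (Euclidean) square, $k^2 = -q^2 > 0$; units with $m = 1$ and SciPy assumed:

```python
import numpy as np
from scipy import integrate

def r_weight(s, m=1.0):
    """The spectral weight r(s) of Eq. (5.281)."""
    return (1.0/(3.0*np.pi*s)) * np.sqrt(1.0 - 4*m**2/s) * (1.0 + 2*m**2/s)

def Pi_spectral(k2, m=1.0):
    """Eq. (5.280), integrated numerically over the spectral variable s."""
    f = lambda s: r_weight(s, m) * k2 / (k2 + s)
    val, _ = integrate.quad(f, 4*m**2, np.inf)
    return val / (4.0*np.pi)

def Pi_param(k2, m=1.0):
    """Eq. (5.265) with q^2 = -k^2 (spacelike)."""
    g = lambda a: a*(1-a)*np.log(1.0 + (k2/m**2)*a*(1-a))
    val, _ = integrate.quad(g, 0.0, 1.0)
    return val / (2.0*np.pi**2)

for k2 in (0.5, 3.0, 40.0):
    assert abs(Pi_spectral(k2) - Pi_param(k2)) < 1e-7
```

Expanding both forms for small $k^2$ gives the common coefficient $k^2/(60\pi^2m^2)$, a quick analytic cross-check of the normalization $1/4\pi$.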
This form is useful for determining the modification of the Coulomb potential due to the vacuum polarization described by $\Pi_{\rm phys}$. The electromagnetic field produced by a given source is always given by

$$
A_m(x) = \int d^4x'\;D_{mn}(x - x')\,J^n(x') \tag{5.282}
$$

Since the propagator — which is the Green's function connecting the source to the field — is now modified by $\Pi_{\rm phys}$, it also affects the field produced by any source. Using Eq. (5.280), we can now write:

$$
A_m(x) = e^2\int d^4x'\int\frac{d^4k}{(2\pi)^4}\left[\frac{1}{k^2} + \frac{e^2}{4\pi}\int_{4m^2}^\infty ds\;\frac{r(s)}{k^2 + s}\right]e^{ik(x-x')}\,J_m(x') \tag{5.283}
$$

As an illustration, consider the Coulomb field produced by a point charge; in this case, the scalar potential evaluates to

$$
\phi(r) = \frac{q}{4\pi|\mathbf{r}|}\left[1 + \frac{e^2}{4\pi}\int_{4m^2}^\infty ds\;r(s)\,e^{-\sqrt{s}\,|\mathbf{r}|}\right] \tag{5.284}
$$
It is easy to obtain the limiting forms of this potential using the corresponding limiting forms of $r(s)$. We get

$$
m|\mathbf{r}| \gg 1: \qquad \phi(r) \simeq \frac{q}{4\pi|\mathbf{r}|}\left[1 + \frac{e^2}{16\pi^{3/2}}\,\frac{e^{-2m|\mathbf{r}|}}{(m|\mathbf{r}|)^{3/2}}\right] \tag{5.285}
$$

and

$$
m|\mathbf{r}| \ll 1: \qquad \phi(r) \simeq \frac{q}{4\pi|\mathbf{r}|}\left[1 + \frac{e^2}{6\pi^2}\,\ln\!\left(\frac{e^{-\gamma_E}}{2m|\mathbf{r}|}\right)\right] \tag{5.286}
$$

It is obvious that the potential gets modified at distances of the order of the Compton wavelength of the electron, which is precisely what one would have expected. [94]
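The large-distance limit can be checked by evaluating the integral in Eq. (5.284) numerically. A sketch (SciPy assumed; units with $m = 1$; to avoid underflow we factor out $e^{-2mr}$ and compare $I(r)e^{2mr}$ with the asymptotic coefficient $1/(4\sqrt\pi)\,(mr)^{-3/2}$, which corresponds to the $e^2/16\pi^{3/2}$ prefactor in Eq. (5.285) after multiplying by $e^2/4\pi$):

```python
import numpy as np
from scipy import integrate

def r_weight(s, m=1.0):
    """The spectral weight r(s) of Eq. (5.281)."""
    return (1.0/(3.0*np.pi*s)) * np.sqrt(1.0 - 4*m**2/s) * (1.0 + 2*m**2/s)

def J_exact(r, m=1.0):
    """I(r) e^{2mr}, with I(r) = int_{4m^2}^inf ds r(s) e^{-sqrt(s) r}."""
    f = lambda s: r_weight(s, m) * np.exp(-(np.sqrt(s) - 2.0*m) * r)
    val, _ = integrate.quad(f, 4*m**2, np.inf)
    return val

r = 20.0                                   # well inside the m r >> 1 regime
asym = (1.0/(4.0*np.sqrt(np.pi))) / r**1.5  # saddle-point asymptotics

# The asymptotics carries O(1/mr) corrections, so only ~10-15% agreement
# is expected at m r = 20:
assert abs(J_exact(r) - asym) / asym < 0.15
```

The dominant contribution comes from $s$ near the threshold $4m^2$, where $r(s) \propto \sqrt{s - 4m^2}$; a saddle-point evaluation of that region produces the $(m|\mathbf{r}|)^{-3/2}$ falloff.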
5.7.2 Electron Propagator at One Loop
Just as the photon propagator is corrected by electron loops, the electron propagator is corrected by the emission and re-absorption of a virtual photon, as indicated in the diagram of Fig. 5.9. The corresponding algebraic expression is given by

$$
-i\Sigma(p) = (-ie\mu^\epsilon)^2\int\frac{d^nk}{(2\pi)^n}\;\gamma^m\,\frac{i}{\gamma(p-k) - m}\,\gamma^n\,\frac{-i\eta_{mn}}{k^2} \tag{5.287}
$$

where we have already gone over to $n$ dimensions. We have also replaced $e$ by $e\mu^\epsilon$ right at the beginning, thereby keeping $e$ dimensionless. In exactly the same way as the photon propagator was corrected by the virtual electron loop, this process corrects the electron propagator.
Just as in the case of the photon propagator, we can also sum up an infinite number of photon loop diagrams while evaluating the correction to the electron propagator. Taking care about the ordering of factors, this leads to the result

$$
i\bar S(p) \equiv S(p)\left[1 + \left(-i\Sigma(p)\right)S(p) + \left(-i\Sigma(p)S(p)\right)^2 + \cdots\right]
= S(p)\left[1 + i\Sigma(p)S(p)\right]^{-1} = \frac{i}{\gamma p - m - \Sigma(p)} \tag{5.288}
$$

Exercise 5.19: Fill in the details and obtain this result. (This is again similar to what we did for the scalar field to obtain Eq. (4.145).)
So, the inverse propagator for the electron gets modified to the form

$$
\bar S^{-1}(p) = \gamma p - m - \Sigma(p) \tag{5.289}
$$

To see what this could lead to, let us write $\Sigma(p)$ in a Taylor series expansion:

$$
\Sigma(p) = A + B(\gamma p - m) + \Sigma_R(p)\,(\gamma p - m)^2 \tag{5.290}
$$
Substituting this into the propagator, and treating $A$, $B$, $\Sigma_R$ as perturbative quantities of order $e^2$, we get

$$
\begin{aligned}
i\bar S(p) &= \frac{i}{\gamma p - m - A - B(\gamma p - m) - (\gamma p - m)^2\,\Sigma_R(p) + i\epsilon}\\[4pt]
&= \frac{i}{(\gamma p - m - A)(1 - B)\left(1 - (\gamma p - m)\Sigma_R(p)\right) + i\epsilon}\\[4pt]
&= \frac{(1 + B)\,i}{(\gamma p - m - A)\left(1 - (\gamma p - m)\Sigma_R(p)\right) + i\epsilon}
\end{aligned} \tag{5.291}
$$
This expression shows that the propagator has been modified in two ways. First, the mass has shifted from $m$ to $m + A$, which defines the renormalized mass. (We saw a similar feature in the case of $\lambda\phi^4$ theory in Sect. 4.7.1.) Second, we have also acquired an overall multiplicative factor $Z_2 \equiv (1 + B)$ in the numerator. [95] As we shall see, both $A$ and $B$ will be divergent quantities. We can absorb the divergence in $A$ by redefining the mass. To absorb the divergence in $Z_2$ (which is the residue at the pole of the propagator), we need to rescale the original fields $\psi(x)$ and $\psi^\dagger(x)$ by $\psi' = Z_2^{-1/2}\psi$, etc. The propagators for the primed fields will then have unit residue at the pole, as they should. Once this is done, we are left with a finite quantity $\Sigma_R(p)$ which describes the non-trivial, finite effects of photon loops on the electron propagator. We will now see how this works out.

The expression in Eq. (5.287) can, again, be evaluated just as before. [96] The
[95] The notation $Z_1$, $Z_2$, ... etc. is of historical origin and carries no significance.

[96] The details are given in Mathematical Supplement 5.9.1.
246
Chapter 5. Real Life II: Fermions and QED
final result is given by
\Sigma(p) = \frac{e^2}{16\pi^2}\frac{1}{\epsilon}(-\gamma p + 4m) + \frac{e^2}{16\pi^2}\left[\gamma p(1+\gamma_E) - 2m(1+2\gamma_E) + 2\int_0^1 d\alpha\,\left[\gamma p(1-\alpha) - 2m\right]\ln\frac{\alpha m^2 - \alpha(1-\alpha)p^2}{4\pi\mu^2}\right] \qquad (5.292)

where n = 4 − 2ε.

97 Recall that ∂p²/∂(γ^j p_j) = 2γ^j p_j.
The expression clearly has a divergence in the first term when ε → 0.
This is the usual UV divergence which arises because the momentum of
the virtual photon can be arbitrarily high. The cure for this UV divergence
is based on the standard renormalization procedure in a manner similar to
what we did for the photon propagator. We shall describe this procedure
in a moment. However, before we do that, let us take note of the fact
that there is another divergence lurking in Eq. (5.292) in the integral in
the last term. One way to see this is to try and evaluate the coefficient B
which appears in the Taylor series expansion in Eq. (5.290). By definition,
B is given by B ≡ ∂Σ/∂(γ j pj ) evaluated at γ j pj = m. Differentiating
Σ(p) in Eq. (5.292) with respect to γ j pj and97 evaluating the result at
γ j pj = m, p2 = m2 , we get:
B = \left.\frac{\partial\Sigma}{\partial(\gamma^j p_j)}\right|_{\gamma p = m} = -\frac{e^2}{16\pi^2}\frac{1}{\epsilon} + \frac{e^2}{4\pi^2}\int_0^1\frac{d\alpha}{\alpha} + \text{(finite terms)} \qquad (5.293)
Exercise 5.20: Work out explicitly
the effect of mγ .
The first term diverges when ε → 0 and is the usual UV divergence. The second term diverges at the lower limit of the integration, which is actually an infrared divergence. The reason for the infrared divergence is
conceptually quite different from the UV divergence. It has to do with
the fact that a process like the one in Fig. 5.9 is always accompanied by
the emission of very low frequency photons. This can be taken care of
mathematically by adding a small mass mγ to the photons and taking the
limit mγ → 0 right at the end of the calculation. One can easily show that
changing k 2 to (k 2 − m2γ ) in Eq. (5.287) will replace αm2 in Eq. (5.292)
by [αm2 + (1 − α)m2γ ]. Then the infrared divergence in Eq. (5.293) will
disappear and, instead, we will get a term proportional to ln(mγ /m). Since
our interest is essentially in the UV divergence and renormalization, we will
not discuss IR divergences in detail. (See Problem 18.)
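The logarithmic character of this infrared divergence can be illustrated numerically. As a schematic model (our simplification: only the structure of the regulated denominator near α = 0 is kept, with all prefactors dropped), the regulated on-shell integrand behaves like α/(α²m² + (1−α)m_γ²), and halving m_γ shifts the integral by about ln 2:

```python
import math

def I_reg(mgamma, m=1.0, N=400000):
    # Midpoint rule for int_0^1 alpha dalpha / (alpha^2 m^2 + (1-alpha) mgamma^2),
    # a schematic model of the IR-regulated integral: as mgamma -> 0 it grows
    # like ln(m/mgamma) instead of diverging.
    h = 1.0 / N
    s = 0.0
    for i in range(N):
        a = (i + 0.5) * h
        s += a / (a * a * m * m + (1.0 - a) * mgamma * mgamma)
    return s * h

# Halving the photon mass shifts the result by ~ln 2: the divergence has
# become logarithmic in mgamma, as claimed in the text.
d = I_reg(1e-3) - I_reg(2e-3)
assert abs(d - math.log(2.0)) < 0.01
```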
From Eq. (5.292), we see that the UV divergent piece in the last term
is given by
\Sigma_{\rm div}(p) = -\frac{e^2}{(4\pi)^2}\,\frac{2}{n-4}\,(4m - \gamma p) \qquad (5.294)
which allows us to write the divergent part of the inverse propagator as
98 This actually defines a running mass parameter m_R(μ) because e² depends on μ. From Eq. (5.244), we see that, to the lowest order, e² satisfies the relation μ(de²/dμ) = (n − 4)e². So we get μ(dm_R/dμ) = −(3e²/8π²) m_R. We will see later that the physical mass, defined by the location of the pole of the propagator, is independent of μ.
\bar{S}^{-1}_{\rm div}(p) = \gamma p - m - \Sigma_{\rm div}(p) = \left[1 - \frac{e^2}{(4\pi)^2}\frac{2}{n-4}\right]\left\{\gamma p - m\left[1 - \frac{e^2}{(4\pi)^2}\frac{6}{n-4}\right]\right\} \qquad (5.295)
To take care of these divergences, we need to do two things. First, we need to renormalize the mass of the electron to a physical value98 by the relation

m = m_R\left[1 + \frac{e^2}{(4\pi)^2}\frac{6}{n-4}\right] \qquad (5.296)
5.7. One Loop Structure of QED
The correction to the mass δm arising from this process (while divergent)
is proportional to m. In other words, δm → 0 as m → 0. This is actually
related to the fact that the Dirac Lagrangian remains invariant under the
transformation ψ → e^{−iαγ₅}ψ in the m = 0 limit. This symmetry is retained to all orders of perturbation theory, which, in turn, implies that a nonzero mass should not be generated by the loop corrections. In other words,
this symmetry — called the chiral symmetry — protects the electron mass
from what is usually called the hierarchy or fine-tuning problem. You may
recall that in the φ4 theory, the one loop correction to the scalar propagator
changes the mass in the form m2 = m20 + (constant)M 2 , where M is a high
mass scale in the theory (see the margin note after Eq. (4.72)). We see
that, in contrast to λφ4 theory, the QED correction to the mass δm is
determined by the mass scale m itself and is protected by chiral symmetry.
The second thing we have to do is to take care of the overall divergent factor in front of the curly bracket in Eq. (5.295). This is done by renormalizing ψ (and ψ̄) themselves by rescaling them, so that we can write:

S(p) = Z_2 S_R(p); \qquad Z_2 \equiv 1 + \frac{e^2}{(4\pi)^2}\frac{2}{n-4} \qquad (5.297)
where SR (p) is a finite renormalized propagator and Z2 is a divergent constant. (This correction is also sometimes called the wave function renormalization.) Once this is done, we get a finite renormalized propagator
given by
S_R^{-1}(p) = \gamma p - m_R - \Sigma_{\rm ren}(p); \qquad \Sigma_{\rm ren}(p) = \Sigma(p) - \Sigma_{\rm div}(p) \qquad (5.298)
where
\Sigma_{\rm ren}(p) = -\frac{e^2}{(4\pi)^2}\int_0^1 d\alpha\,\left[4m - 2(1-\alpha)\gamma p\right]\ln\frac{\alpha m^2 - \alpha(1-\alpha)p^2}{4\pi\mu^2} - \frac{e^2}{(4\pi)^2}\left[2(1+2\gamma_E)m - (1+\gamma_E)\gamma p\right] \qquad (5.299)
and we have taken the n → 4 limit.
The physical mass of the electron, however, is determined by the location of the pole of the propagator, at which S_R^{-1}(p) = 0 with m_{\rm phy} = \gamma^a p_a. This leads to the result:99

m_{\rm phy} = m_R + \Sigma_{\rm ren}(p)\big|_{\gamma^j p_j = m_R} = m_R\left[1 - \frac{e^2}{(4\pi)^2}\left(3\ln\frac{m_R^2}{4\pi\mu^2} - 4 + 3\gamma_E\right)\right] \qquad (5.300)
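The constant −4 + 3γ_E in Eq. (5.300) can be traced explicitly: on shell the argument of the logarithm in Eq. (5.299) becomes α²m²/4πμ², the coefficient ∫₀¹ dα [4 − 2(1−α)] = 3 multiplies ln(m_R²/4πμ²), the 2 ln α piece integrates to −5, and the non-logarithmic terms contribute 2(1+2γ_E) − (1+γ_E) = 1 + 3γ_E. A minimal numerical sketch of this bookkeeping:

```python
import math

# Midpoint rule for int_0^1 (2 + 2a) * 2*ln(a) da, the 2*ln(alpha) piece of
# the on-shell Feynman-parameter integral; it should equal -5.
N = 200000
h = 1.0 / N
s = sum((2 + 2 * ((i + 0.5) * h)) * 2 * math.log((i + 0.5) * h)
        for i in range(N)) * h
assert abs(s - (-5.0)) < 1e-3

# Adding the non-log piece 1 + 3*gammaE reproduces the constant of Eq. (5.300):
# the bracket is 3*ln(mR^2/4 pi mu^2) - 4 + 3*gammaE.
gammaE = 0.5772156649015329
assert abs((s + 1 + 3 * gammaE) - (-4 + 3 * gammaE)) < 1e-3
```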
Re-expressing the propagator in terms of the physical mass, we get the final result to be

\bar{S}^{-1}(p) = \gamma p - m_{\rm phy} - \bar{\Sigma}(p) \qquad (5.301)

where

\bar{\Sigma}(p) = -\frac{e^2}{(4\pi)^2}\int_0^1 d\alpha\,\left[4m - 2(1-\alpha)\gamma p\right]\ln\frac{\alpha m^2 - \alpha(1-\alpha)p^2}{\alpha^2 m^2} - \frac{e^2}{(4\pi)^2}(-\gamma p + m)\left[\ln\frac{m^2}{4\pi\mu^2} + \gamma_E - 2\right] \qquad (5.302)
1
99 You can now explicitly verify
that mphy satisfies the condition
dmphy /dμ = 0 as it should.
100 We encountered the same issues in the λφ⁴ theory. As you will recall, when we subtract only the divergent terms, it is called the minimal subtraction scheme (MS). Occasionally, one includes a factor of 4πe^{−γ_E} into the subtracted term, which is the \overline{\rm MS} scheme.

101 Numerically, the difference between m_phy and m_R will be small if one chooses the renormalization point μ ≈ m_phy; this is evident from Eq. (5.300). All these facts are not very pertinent to one loop QED, but the renormalization scheme and the subtraction procedure can make significant differences to the ease of calculation in QCD and in higher order perturbation theory.
Incidentally, the above calculations also illustrate several standard techniques used in renormalization theory, especially in higher order perturbation theory, QCD etc. For example, notice that we have actually introduced
two different masses mR and mphy . The first one mR was obtained by regularizing the divergences which appeared in the theory (see Eq. (5.296)).
The divergent terms in this expression (as well as in the definition of Z2 in
Eq. (5.297)) were chosen without any finite parts. One could have always
redefined the finite quantities which arise in Eq. (5.298) by adding some
finite parts to the divergent quantities and redefining the terms.100 Any
one of these subtraction schemes allows us to define the renormalized mass
mR . As one can see from Eq. (5.296), the mR does pick up radiative corrections and ‘runs’ with the renormalization scale μ. In contrast, the physical
mass mphy is always defined using the location of the pole of the electron
propagator. In our scheme mphy does not depend on the renormalization
scale μ, as we have explicitly seen above.101
It should be stressed that the specific form of the divergence of Σ(p)
played a crucial role in our ability to carry out the renormalization. Because
Σ contained two divergent pieces, one proportional to γ j pj and the other
proportional to m, we could write
\gamma p - m - \Sigma(p) = (\gamma p - m) + \frac{e^2}{8\pi^2}\frac{1}{4-n}(\gamma p - 4m) + \Sigma_{\rm fin}(p) = Z_2^{-1}\left(\gamma p - (m + \delta m)\right) + \Sigma_{\rm fin}(p) \qquad (5.303)
This, in turn, allowed us to reabsorb the divergences by a mass renormalization and a wave function renormalization. If, for example, we had picked up a divergent term like p²/m in Σ(p), we would be sunk; QED would not have been renormalizable.
Also note that the finite piece Σfin (p) is non-analytic in p — unlike
the divergent pieces which are analytic in p. This will turn out to be a
general feature in all renormalization calculations. There is a branch-cut
in the expression when the argument of the log becomes negative; i.e., when
p2 > m2 /x. In particular, at x = 1, there is a branch-cut at p2 = m2 and
Σ(p) will pick up an imaginary piece. This is similar to what we saw as
regards the photon propagator (see Eq. (5.278)) and in Sect. 3.7.
5.7.3 Vertex Correction at One Loop

102 We do not have to worry about a term involving σ_{mn}q^n because it can be expressed as a linear combination of the first two terms, in expressions of the form ū(p′)Λ^m u(p). (As usual, we will suppress the spin parameter in u(p, s) and simply denote it by u(p) etc.) We will only be concerned with the vertex function which occurs in this expression.

In this case, we need to compute the contribution from the diagram in Fig. 5.10, which translates into the algebraic expression
-ie\mu^\epsilon\Lambda_m = (-ie\mu^\epsilon)^3\int\frac{d^n k}{(2\pi)^n}\,\frac{-i\eta^{nr}}{k^2}\,\gamma_n\,\frac{i}{\gamma(p'-k) - m}\,\gamma_m\,\frac{i}{\gamma(p-k) - m}\,\gamma_r \qquad (5.304)
The net effect of such a diagram will be to change the vertex factor from −ieγ_m to −ieΛ_m in D = 4. As mentioned earlier, it is convenient to pull out a dimensionful factor μ^ε and write (in n = 4 − 2ε dimensions) the relevant contribution as −ieμ^ε Λ_m; this will again keep e dimensionless.
Before we plunge into the calculation, (which is probably the most
involved of the three we are performing, since this requires working with
three vertices), we will pause for a moment to understand what such a
modification of γm → Λm physically means.
Given the vectors which are available, the most general form for the modified vertex is given by102

\Lambda_m(p, q, p' = p + q) = \gamma_m A + (p' + p)_m B + (p' - p)_m C \qquad (5.305)
The coefficients A, B, C could contain expressions like γp or γp′. But when sandwiched between the two spinors, in a term of the form ū(p′)Λ_m u(p), we can replace γp and γp′ by m on-shell. Further, since p² = p′² = m², the only non-trivial quantity left for us to consider as an independent variable is q² = 2m² − 2p·p′. Thus, A, B, C can be thought of as functions of q². There is one more simplification we can make. Gauge invariance requires that q^m ū(p′)Λ_m u(p) = 0. This is trivially satisfied for the first term in Eq. (5.305) because of the Dirac equation, and for the second term due to the identity (p′ − p)·(p′ + p) = p′² − p² = m² − m² = 0. But there is no reason for this to hold for the last term in Eq. (5.305) unless C = 0. Thus we conclude that, in the right hand side of Eq. (5.305), we can set C = 0 and need to retain only the first two terms.
It is conventional to replace the second term using the Gordon identity (see Eq. (5.125)) and write the resulting expression in the form

\bar{u}(p')\Lambda_m u(p) = \bar{u}(p')\left[\gamma_m F_1(q^2) + \frac{i\sigma_{mn}q^n}{2m}F_2(q^2)\right]u(p) \qquad (5.306)

where F₁ and F₂ are called the form-factors. When we do the computation, however, it is more convenient to re-write this result (again using the Gordon identity) in terms of (p′ + p) and (p′ − p). Doing this, we find that to leading order in the momentum transfer q, we can write this expression in the form
\bar{u}\Lambda^m u = \bar{u}(p')\left\{\frac{(p'+p)^m}{2m}F_1(0) + \frac{i\sigma^{mn}q_n}{2m}\left[F_1(0) + F_2(0)\right]\right\}u(p) \qquad (5.307)
By definition, the coefficient of the first term is the electric charge in
the q → 0 limit, and hence we must have F1 (0) = 1. This tells us that the
primary effect of the vertex correction is to shift the magnetic moment from
its Dirac value by a factor 1 + F2 (0). In other words, radiative corrections
lead to the result that the gyro-magnetic ratio of the electron is actually
g = 2[1 + F2 (0)]. This is a clear prediction which can be compared with
observations once we have computed Λm and extracted F2 (0) from it. We
shall now proceed to do this.
We compute the expression in Eq. (5.304) by the usual tricks to reduce it to Λ_m = Λ_m^{(1)} + Λ_m^{(2)}, where the first term has a UV divergent piece with the structure103

\Lambda_m^{(1)} = \frac{e^2}{16\pi^2}\,\frac{1}{\epsilon}\,\gamma_m + \text{finite}; \qquad 2\epsilon = 4 - n \qquad (5.308)
and the second term is finite and is given by

\Lambda_m^{(2)} = \frac{e^2}{16\pi^2}\int_0^1 d\alpha\int_0^{1-\alpha} d\beta\;\frac{\gamma_n\left(\gamma^j p'_j(1-\beta) - \alpha\gamma^j p_j + m\right)\gamma_m\left(\gamma^j p_j(1-\alpha) - \beta\gamma^j p'_j + m\right)\gamma^n}{-m^2(\alpha+\beta) + \alpha(1-\alpha)p^2 + \beta(1-\beta)p'^2 - 2\alpha\beta\,p\cdot p'} \qquad (5.309)
103 The details are given in Mathematical Supplement 5.9.2. This term also has an IR divergent piece which can be regulated by adding a small mass to the photon. We will be primarily concerned with the UV divergence.

Once again, we need to take care of the divergent part Λ_m^{(1)} using a renormalization prescription. But let us first compute the finite part and extract from it the value of F₂(0), which will give us the anomalous magnetic
moment of the electron. A somewhat lengthy calculation (given in Mathematical Supplement 5.9.2) shows that the finite part, when sandwiched between two Dirac spinors, can be expressed in the form

\bar{u}(p')\Lambda_m^{(2)}u(p) = \bar{u}(p')\,\frac{-e^2}{16\pi^2 m^2}\int_0^1 d\alpha\int_0^{1-\alpha} d\beta\,\frac{1}{(\alpha+\beta)^2}\Big[-2\gamma_m m^2\left(2(\beta+\alpha) + (\alpha+\beta)^2 - 2\right) - 2i\sigma_{mn}q^n m\left((\alpha+\beta) - (\alpha+\beta)^2\right)\Big]\,u(p) \qquad (5.310)
Right now, we are only interested in the part which is the coefficient of iσ_{mn}q^n/2m; this is given in terms of the two easily evaluated integrals

\int_0^1 d\alpha\int_0^{1-\alpha}\frac{d\beta}{(\alpha+\beta)} = 1; \qquad \int_0^1 d\alpha\int_0^{1-\alpha} d\beta = \frac{1}{2} \qquad (5.311)

The term which we are after now reduces to

\frac{i\sigma_{mn}q^n}{2m}\,\frac{e^2}{8\pi^2} = \frac{i\sigma_{mn}q^n}{2m}\,\frac{\alpha}{2\pi} \qquad (5.312)
so that F₂(0) = α/2π. This leads to the famous result by Schwinger, viz., that the gyro-magnetic ratio of the electron is given by

g = 2\left[1 + \frac{\alpha}{2\pi}\right] \qquad (5.313)
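Both Feynman-parameter integrals in Eq. (5.311) and the resulting value of g are easy to confirm directly (a sketch; the midpoint grid size is an arbitrary choice):

```python
import math

# Midpoint rule over the triangle 0 < alpha < 1, 0 < beta < 1 - alpha,
# checking the two integrals of Eq. (5.311).
N = 800
h = 1.0 / N
I1 = I2 = 0.0
for i in range(N):
    a = (i + 0.5) * h
    for j in range(N):
        b = (j + 0.5) * h
        if a + b < 1.0:
            I1 += h * h / (a + b)     # integrand 1/(alpha + beta) -> 1
            I2 += h * h               # plain area of the triangle -> 1/2
assert abs(I1 - 1.0) < 0.01
assert abs(I2 - 0.5) < 0.01

# Schwinger's result, Eq. (5.313), with the fine structure constant ~1/137.036:
alpha_fs = 1 / 137.036
g = 2 * (1 + alpha_fs / (2 * math.pi))
assert abs(g - 2.00232) < 1e-4
```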
5.8 QED Renormalization at One Loop

104 Note that the divergent (1/ε) terms satisfy the identity Λ_m = −∂Σ(p)/∂p^m. This is implied by the Ward identity in Eq. (5.187), which should hold to all orders in the perturbation theory. Here, we have explicitly verified it to the one loop order.

We have now computed the corrections to the photon propagator, the electron propagator and the vertex term at the one loop level. Each of the results involved a divergent piece and finite terms. To clarify the renormalization program, we will now summarize how the divergent pieces were handled.
Let us begin by re-writing our results, isolating the divergences in the form104 (see Eq. (5.294), Eq. (5.259), Eq. (5.308)):

\Sigma(p) = \frac{e^2}{16\pi^2}\frac{1}{\epsilon}(-\gamma^j p_j + 4m) + \text{finite}
\Pi_{mn}(k) = \frac{e^2}{12\pi^2}\frac{1}{\epsilon}(k_m k_n - k^2 g_{mn}) + \text{finite}
\Lambda_m^{(1)}(p, q, p') = \frac{e^2}{16\pi^2}\frac{1}{\epsilon}\gamma_m + \text{finite} \qquad (5.314)
Let us begin by eliminating the divergence in the electron propagator which
actually has two separate pieces, one proportional to γ j pj and another
proportional to m. To take care of these, we will now add a counter-term
to the Lagrangian of the form
¯
¯ j ∂j ψ − Aψψ
ΔL = iB ψγ
(5.315)
where A and B are undetermined at present. These counter-terms will lead
to two more Feynman rules requiring the factors iB(γ j pj ) and −iA in the
respective diagrams. So, all the amplitudes in the modified theory (with
the counter-term in the Eq. (5.315) added) will now have the contribution
-i\Sigma(p) - iA + iB\gamma^j p_j = -i\,\frac{e^2}{16\pi^2}\frac{1}{\epsilon}(-\gamma^j p_j + 4m) + \text{finite} - iA + iB\gamma^j p_j \qquad (5.316)

We now need to choose A and B such that this amplitude is finite. This is easily done with the choices:

A = -\frac{me^2}{4\pi^2}\frac{1}{\epsilon}, \qquad B = -\frac{e^2}{16\pi^2}\frac{1}{\epsilon} \qquad (5.317)
The trick now is to match the fermionic part in the two Lagrangians: The
modified physical Lagrangian L + ΔL and the original Lagrangian L0 with
bare parameters which are indicated with a subscript zero. That is, we
demand
i(1+B)\bar{\psi}\gamma^j\partial_j\psi - (m+A)\bar{\psi}\psi = i\bar{\psi}_0\gamma^j\partial_j\psi_0 - m_0\bar{\psi}_0\psi_0 \qquad (5.318)
As indicated in the earlier discussion, this can be accomplished by a mass
renormalization and a wave function renormalization of the form:
\psi_0 = \sqrt{Z_2}\,\psi, \qquad m_0 = m + \delta m \qquad (5.319)
where

Z_2 = 1 + B = 1 - \frac{e^2}{16\pi^2}\frac{1}{\epsilon}
m_0 = Z_2^{-1}(m + A) = m\left[1 + \frac{e^2}{16\pi^2}\frac{1}{\epsilon}\right]\left[1 - \frac{e^2}{4\pi^2}\frac{1}{\epsilon}\right] = m\left[1 - \frac{3e^2}{16\pi^2}\frac{1}{\epsilon}\right] = m + \delta m \qquad (5.320)
We had already commented on the fact that the correction δm to the mass
is proportional to m (unlike in the λφ4 theory).
Let us next consider the divergent term in the photon propagator which
we hope to tame by adding a counter-term to the electromagnetic Lagrangian. We are working with the Lagrangian of the form105
L = -\frac{1}{4}F_{mn}F^{mn} - \frac{1}{2}(\partial_m A^m)^2 = \frac{1}{2}A_m\left(\eta^{mn}\Box\right)A_n \qquad (5.321)
which includes the gauge fixing term. Hence, a priori, one could have added
a counter-term of the kind
\Delta L = -\frac{C}{4}F_{mn}F^{mn} - \frac{E}{2}(\partial_m A^m)^2 \qquad (5.322)
This is, however, a little bit disturbing since you would not like a pure gauge
fixing term to pick up corrections due to renormalization. Let us, however,
proceed keeping these terms general at this stage and try to determine
C and E. These counter-terms in the Lagrangian will lead to two more
Feynman rules requiring the inclusion of the factors −iC(k 2 ηmn − km kn )
and −iEk 2 ηmn in the respective diagrams. But we see that the divergent
term in the propagator has the structure:
i\Pi_{mn}(k) = -i\,\frac{e^2}{12\pi^2}\frac{1}{\epsilon}\,(k^2\eta_{mn} - k_m k_n) \qquad (5.323)
105 We can use either the Lagrangian in Eq. (5.237) or the one in Eq. (5.236); we will use the latter since it will clarify an important point. Exercise: Redo this analysis using the Lagrangian in Eq. (5.237).
which tells you that we can take care of everything with the choice

C = -\frac{e^2}{12\pi^2}\frac{1}{\epsilon}; \qquad E = 0 \qquad (5.324)

106 This matches with the conclusion we obtained earlier in Eq. (5.183) and Eq. (5.184), from the Ward identity, viz. that the gauge fixing parameter does not pick up radiative corrections. This, however, does not mean that the structure of the propagator at higher orders will have the same form as the original one in whatever gauge you are working with. But, the subsequent calculations at higher order can always be done in the same gauge.
In other words, the gauge fixing term does not invite corrections at the one
loop level. With this addition, the electromagnetic part of the Lagrangian
takes the form106
L + \Delta L = -\frac{1+C}{4}F_{mn}F^{mn} - \frac{1}{2}(\partial_m A^m)^2 \equiv -\frac{Z_3}{4}F_{mn}F^{mn} - \frac{1}{2}(\partial_m A^m)^2 \qquad (5.325)

where

Z_3 = 1 + C = 1 - \frac{e^2}{12\pi^2}\frac{1}{\epsilon} \qquad (5.326)
As usual, we identify L + ΔL with the original Lagrangian plus any gauge
terms which you choose to add. This is easily taken care of by another
wave function renormalization of the form

A^m_0 = \sqrt{Z_3}\,A^m \qquad (5.327)
Having handled the pure Dirac sector and the pure electromagnetic sector, we finally take up the vertex correction, which modifies the interaction term. Playing the same game once again, we now add a counter-term of the form

\Delta L = -De\mu^\epsilon\,\bar{\psi}\gamma^j A_j\psi \qquad (5.328)
which will lead to a Feynman rule of the form −iDeμ^ε γ_m. (As described earlier, we have kept e dimensionless and introduced a mass scale μ.) We now need to choose D such that −ieμ^ε(Λ_m^{(1)} + Dγ_m) is finite, where Λ_m^{(1)} = (e²/16π²)(1/ε)γ_m plus finite terms. This is easily taken care of by the choice D = −(e²/16π²)(1/ε). The full Lagrangian will then become
L + \Delta L = -(1+D)e\mu^\epsilon A_m\bar{\psi}\gamma^m\psi \equiv -Z_1 e\mu^\epsilon A_m\bar{\psi}\gamma^m\psi \qquad (5.329)

with

Z_1 = 1 + D = 1 - \frac{e^2}{16\pi^2}\frac{1}{\epsilon} \qquad (5.330)

We thus find that we need three multiplicative renormalization factors for the wave functions, given by:

Z_1 = Z_2 = 1 - \frac{e^2}{16\pi^2}\frac{1}{\epsilon}, \qquad Z_3 = 1 - \frac{e^2}{12\pi^2}\frac{1}{\epsilon} \qquad (5.331)
Equating the modified Lagrangian to the bare one as regards the interaction term, we obtain the condition:

-Z_1 e\mu^\epsilon A_m\bar{\psi}\gamma^m\psi = -e_0 A_m^0\bar{\psi}_0\gamma^m\psi_0 = -e_0\sqrt{Z_3}\,Z_2\,A_m\bar{\psi}\gamma^m\psi \qquad (5.332)
where we have related the bare fields to the renormalized fields.
Let us now put these results together. The renormalization prescription
used in conjunction with dimensional regularization relates the bare and
renormalized quantities through the relations
A_m = \frac{1}{\sqrt{Z_3}}A_m^{(0)}, \qquad \psi = \frac{1}{\sqrt{Z_2}}\psi^{(0)}, \qquad m_R = \frac{1}{Z_m}m^{(0)}, \qquad e_R = \frac{1}{Z_e}\mu^{\frac{d-4}{2}}e^{(0)} \qquad (5.333)
leading to a QED Lagrangian of the form

L = -\frac{1}{4}Z_3 F_{mn}^2 + iZ_2\bar{\psi}\gamma\partial\psi - m_R Z_2 Z_m\bar{\psi}\psi - \mu^{\frac{4-d}{2}}e_R Z_e Z_2\sqrt{Z_3}\,\bar{\psi}\gamma A\psi \qquad (5.334)
In this rescaling, we have kept e_R and the various Zs dimensionless. If we write every Z in the form Z_i = 1 + δ_i, then we know from our previous studies that the divergent parts of the δ_i are given by

\delta_2 = -\frac{e_R^2}{16\pi^2}\frac{1}{\epsilon}, \quad \delta_3 = -\frac{4}{3}\,\frac{e_R^2}{16\pi^2}\frac{1}{\epsilon}, \quad \delta_e = \frac{2}{3}\,\frac{e_R^2}{16\pi^2}\frac{1}{\epsilon}, \quad \delta_m = -3\,\frac{e_R^2}{16\pi^2}\frac{1}{\epsilon} \qquad (5.335)

to O(e_R^2). This, in turn, allows us to obtain the running of the coupling constants directly from the Lagrangian. For example, since the renormalized
Lagrangian depends on μ but not the bare Lagrangian, we must have
0 = \mu\frac{d}{d\mu}e_0 = \mu\frac{d}{d\mu}\left[\mu^\epsilon e_R Z_e\right] = \mu^\epsilon e_R Z_e\left[\epsilon + \frac{\mu}{e_R}\frac{de_R}{d\mu} + \frac{\mu}{Z_e}\frac{dZ_e}{d\mu}\right] \qquad (5.336)
We know that, to leading order in e_R, we have Z_e = 1, giving

\mu\frac{de_R}{d\mu} = -\epsilon\,e_R \qquad (5.337)

while, to the next order:

\mu\frac{d}{d\mu}Z_e = \mu\frac{d}{d\mu}\left[1 + \frac{2}{3}\,\frac{e_R^2}{16\pi^2}\frac{1}{\epsilon}\right] = \frac{1}{\epsilon}\,\frac{e_R}{12\pi^2}\,\mu\frac{de_R}{d\mu} = -\frac{e_R^2}{12\pi^2} \qquad (5.338)
We therefore get

\beta(e_R) \equiv \mu\frac{de_R}{d\mu} = -\epsilon\,e_R + \frac{e_R^3}{12\pi^2} \to \frac{e_R^3}{12\pi^2} \qquad (5.339)

where the last expression is obtained in the limit ε → 0. This agrees with our previous result for the β function in e.g., Eq. (5.202) or in Eq. (5.274). (Note that the β's differ by a trivial factor 2 depending on whether we define β as μ(de_R/dμ) or as μ(de_R²/dμ).) But we have now calculated it using only the counter-terms, without summation of logs etc.107 In a similar manner,
we can also work out the “running” of the electron mass. Since the bare
mass must be independent of μ, we have the result
0 = \mu\frac{d}{d\mu}m_0 = \mu\frac{d}{d\mu}(Z_m m_R) = Z_m m_R\left[\frac{\mu}{m_R}\frac{dm_R}{d\mu} + \frac{\mu}{Z_m}\frac{dZ_m}{d\mu}\right] \qquad (5.340)
It is convenient to define a quantity γm ≡ [(μ/mR ) (dmR /dμ)] (which is
called the anomalous dimension). Using the fact that Zm depends on μ
only through eR , we have
\gamma_m = -\frac{\mu}{Z_m}\frac{dZ_m}{d\mu} = -\frac{1}{Z_m}\frac{dZ_m}{de_R}\,\mu\frac{de_R}{d\mu} \qquad (5.341)
Using the known expressions for Z_m and β(e_R), correct to one loop order, we find that

\gamma_m = -\frac{1}{1+\delta_m}\,\frac{2\delta_m}{e_R}\,(-\epsilon\,e_R) \simeq 2\epsilon\,\delta_m = -\frac{3e_R^2}{8\pi^2} \qquad (5.342)
107 Notice that the β function depends on the combination Z_e = Z_1/(Z_2\sqrt{Z_3}). Here, Z_1 arises from the electron-photon vertex, Z_3 from the vacuum polarization and Z_2 from the electron self-energy. In QED (but not in other theories like QCD), Z_1 = Z_2, and hence the β function can be calculated directly from Z_3 itself. The fact that Z_1 = Z_2 is not a coincidence but is implied, again, by the Ward identity in QED. Consequently, the charge renormalization only depends on Z_3, which is the photon field renormalization factor. (This would have been obvious if we had used the Lagrangian in Eq. (5.237); we used the form in Eq. (5.236) to explicitly show that we still get the correct result.) The procedure of computing the running coupling constant through the μ dependence of the bare Lagrangian is more useful in theories where Z_1 ≠ Z_2.
This matches with the result obtained earlier from Eq. (5.296).
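The one loop flow equations μ(de_R/dμ) = e_R³/12π² and μ(dm_R/dμ) = −(3e_R²/8π²)m_R can also be integrated numerically and compared with their closed forms, 1/e_R²(μ) = 1/e_R²(μ₀) − ln(μ/μ₀)/6π² and m_R ∝ (e_R²)^{−9/4} (a sketch; the initial values and step sizes are arbitrary test choices, and the −9/4 exponent follows from dividing γ_m by the logarithmic derivative of e_R²):

```python
import math

# Euler integration of the one loop flow equations, Eqs. (5.339) and (5.342):
#   mu de/dmu = e^3/(12 pi^2),   mu dm/dmu = -(3 e^2 / 8 pi^2) m,
# in t = ln(mu/mu0), compared with the closed forms
#   1/e^2(t) = 1/e0^2 - t/(6 pi^2)   and   m = m0 * (e^2/e0^2)^(-9/4).
e, m = 0.3, 1.0          # arbitrary initial values at mu0
e0, m0 = e, m
T, N = 5.0, 200000       # integrate up to t = ln(mu/mu0) = 5
dt = T / N
for _ in range(N):
    e += dt * e**3 / (12 * math.pi**2)
    m += dt * (-3 * e**2 / (8 * math.pi**2)) * m

e2_exact = 1.0 / (1.0 / e0**2 - T / (6 * math.pi**2))
m_exact = m0 * (e2_exact / e0**2) ** (-9.0 / 4.0)
assert abs(e**2 - e2_exact) < 1e-6
assert abs(m - m_exact) < 1e-6
```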
The fact that Z1 = Z2 is also closely related to the charge being a
property of the photon, along the lines we described earlier. To see this
more explicitly, consider a theory with a quark having the charge Qq = 2/3
and an electron with charge Qe = −1. A Lagrangian which includes both
the fields will be given by
108 This is an explicit demonstration of the features we described just after Eq. (5.237). We used the example of quark + electron here to emphasize that the charges need not have the same magnitude at the tree-level.

L = -\frac{1}{4}Z_3 F_{mn}^2 + iZ_2^e\bar{\psi}_e\gamma\partial\psi_e - e_R Z_1^e\bar{\psi}_e\gamma A\psi_e + iZ_2^q\bar{\psi}_q\gamma\partial\psi_q + \frac{2}{3}e_R Z_1^q\bar{\psi}_q\gamma A\psi_q \qquad (5.343)
Because Z_1^e = Z_2^e and Z_1^q = Z_2^q, this Lagrangian can be rewritten in the form:

L = -\frac{1}{4}Z_3 F_{mn}^2 + Z_2^e\bar{\psi}_e\left(i\gamma\partial - e_R\gamma A\right)\psi_e + Z_2^q\bar{\psi}_q\left(i\gamma\partial + \frac{2}{3}e_R\gamma A\right)\psi_q \qquad (5.344)
In other words, the relative coefficient between iγ∂ and e_R(γA) does not pick up any radiative correction. So the ratio of the charges of the electron and the quark does not change due to radiative corrections, which of course is vital.108 In our way of describing the Lagrangian, we would have rescaled the vector potential by A_m → A_m/e_R. Then, the above Lagrangian becomes:

L = -\frac{1}{4e_R^2}Z_3 F_{mn}^2 + Z_2^e\bar{\psi}_e\left(i\gamma\partial - \frac{Z_1^e}{Z_2^e}\gamma A\right)\psi_e + Z_2^q\bar{\psi}_q\left(i\gamma\partial + \frac{Z_1^q}{Z_2^q}\frac{2}{3}\gamma A\right)\psi_q \qquad (5.345)
At the tree-level, all the Z-factors are unity and this Lagrangian is invariant under the gauge transformations of the form

\psi_q \to e^{(2/3)i\alpha}\psi_q, \qquad \psi_e \to e^{-i\alpha}\psi_e, \qquad A_m \to A_m + \partial_m\alpha \qquad (5.346)
These gauge transformations only involve the numbers −1 and 2/3 but not
eR . Further, such a transformation has nothing to do with perturbation
theory. As long as we do everything correctly using a regulator which
preserves gauge invariance, the loop corrections, counter-terms etc. will
all respect this symmetry and we must have Z1 = Z2 to all orders of
perturbation theory. This shows the overall consistency of the formalism.
The above discussion involved dealing with the bare parameters of a Lagrangian, a set of counter-terms and the final Lagrangian in terms of physical, renormalized quantities. Both the bare terms and the counter-terms
are divergent but the final renormalized expressions are finite. There is an
alternative way of viewing this procedure, which is algebraically equivalent
and conceptually better. In this approach, we start with a QED Lagrangian
written in the form
L = -\frac{1}{4}Z_3 F_{mn}^2 + iZ_2\bar{\psi}\gamma\partial\psi - Z_2 Z_m m_R\bar{\psi}\psi - e_R Z_1\bar{\psi}\gamma A\psi \qquad (5.347)

where we have used the abbreviation Z_1 \equiv Z_e Z_2\sqrt{Z_3}. We next expand the parameters around the tree-level values as

Z_1 \equiv 1 + \delta_1, \quad Z_2 \equiv 1 + \delta_2, \quad Z_3 \equiv 1 + \delta_3, \quad Z_m = 1 + \delta_m, \quad Z_e = 1 + \delta_e \qquad (5.348)
where δ_e = δ_1 − δ_2 − (1/2)δ_3 + O(e_R^4). This allows us to separate the Lagrangian into one involving the physical fields and the other involving counter-terms, in the form109

L = \left[-\frac{1}{4}F_{mn}^2 + i\bar{\psi}\gamma\partial\psi - m_R\bar{\psi}\psi - e_R\bar{\psi}\gamma A\psi\right] + \left[-\frac{\delta_3}{4}F_{mn}^2 + i\delta_2\bar{\psi}\gamma\partial\psi - (\delta_m + \delta_2)m_R\bar{\psi}\psi - e_R\delta_1\bar{\psi}\gamma A\psi\right] \qquad (5.349)
The counter-terms will lead to new Feynman rules, shown in Fig. 5.15. We can now work out a perturbation theory based on this renormalized Lagrangian. This has the virtue that, even though the counter-terms are arbitrarily large numbers, scaling as (1/ε) in dimensional regularization, they are defined by their Taylor expansions in powers of e_R starting at O(e_R²). So, the regularized perturbation expansion (with finite but arbitrarily small ε) can be formally justified when e_R is small. This should be contrasted with the previous discussion, in which we used the perturbation expansion in the bare coupling e_0, which itself diverges as ε → 0 and hence leads to a somewhat dubious procedure.
Finally, we comment on the relation between perturbation theory and
the computation of effective action. You may be wondering what is the relation between the order-by-order perturbation theory based on renormalized coupling constants, say, and the effective Lagrangians we computed
in Sect. 5.6.5 (or in Sect. 4.2). The answer to this question is somewhat
complicated but we will briefly mention some relevant aspects.
To begin with, the key idea behind the calculation of the effective Lagrangian is to capture the effect of higher order Feynman diagrams (say,
the one involving several loops) by a tree-level Lagrangian. In other words,
a tree-level computation using the effective Lagrangian incorporates the
effects of summing over a class of Feynman diagrams, depending on the nature of approximations used in the computation of the effective Lagrangian.
The way the effective Lagrangian achieves this is by modifying the structure
of the vertices in a specific manner. For example, the computation of the
Euler-Heisenberg effective Lagrangian did not use the photon propagator
at all. Nevertheless, the effects of complicated Feynman diagrams involving
photon propagators can be reproduced from the effective Lagrangian by a
suitable modification of the vertex.
In fact, the situation is better than this at least in a formal sense. In
computing the Euler-Heisenberg Lagrangian, we integrated out the electron
field but assumed the existence of a background electromagnetic field Aj .
One could have in fact computed an effective Lagrangian Γ[A, ψ̄, ψ] by
assuming background values for both the vector potential and the electron
field and integrating out the quantum fluctuations of all the fields. Such
an effective Lagrangian, when used at the tree-level will, in principle, give
the two-point function containing all the quantum corrections. The price
we pay for getting exact results for a tree-level effective action is that it
will be highly non-local.
5.9 Mathematical Supplement

5.9.1 Calculation of the One Loop Electron Propagator

109 This is similar to what we did in Sect. 4.8 for the λφ⁴ theory.

110 As we mentioned earlier, the exponential form was first introduced by Schwinger, and Feynman introduced the technique we use below, which is essentially the same.

Figure 5.15: Diagrams generated by the counter-terms and their algebraic equivalents: i[γp δ₂ − (δ_m + δ₂)m_R] for the electron line, iδ₃(p²g^{mn} − p^m p^n) and −iδ₃ p² g^{mn} for the photon line, and −ie_R δ₁ γ^m for the vertex.

The integrals involved in Eq. (5.287) can be evaluated exactly as before by writing the propagators in exponential form, completing the square and
performing the momentum integral. However, we will do it in a slightly different manner, to introduce a technique you will find in many other textbooks. This will involve combining the two denominators in a particular way which is completely equivalent to writing the propagators in exponential form.110 We again begin with the expression:

-i\Sigma(p) = (-ie\mu^\epsilon)^2\int\frac{d^n k}{(2\pi)^n}\,\gamma^m\,\frac{i}{\gamma^j p_j - \gamma^j k_j - m}\,\gamma^n\,\frac{-i\eta_{mn}}{k^2} = -e^2\mu^{2\epsilon}\int\frac{d^n k}{(2\pi)^n}\,\frac{\gamma_m(\gamma^j p_j - \gamma^j k_j + m)\gamma^m}{[(p-k)^2 - m^2]\,k^2} \qquad (5.350)
We now combine the two denominators using the identity:

\frac{1}{AB} = \int_0^1 d\alpha\,\frac{1}{[A + (B - A)\alpha]^2} \qquad (5.351)
We then get:

-i\Sigma(p) = -e^2\mu^{2\epsilon}\int_0^1 d\alpha\int\frac{d^n k}{(2\pi)^n}\,\frac{\gamma_m(\gamma^j p_j - \gamma^j k_j + m)\gamma^m}{[\alpha(p-k)^2 - \alpha m^2 + (1-\alpha)k^2]^2} \qquad (5.352)

We next introduce the variable k' = k - \alpha p, which eliminates any k·p terms from the denominator. This allows us to ignore terms which are odd under k → −k in the numerator. This leads to:

-i\Sigma(p) = -e^2\mu^{2\epsilon}\int_0^1 d\alpha\int\frac{d^n k}{(2\pi)^n}\,\frac{\gamma_m[(1-\alpha)\gamma^j p_j - \gamma^j k_j + m]\gamma^m}{[k^2 - \alpha m^2 + \alpha(1-\alpha)p^2]^2}
= -e^2\mu^{2\epsilon}\int_0^1 d\alpha\,\gamma_m[(1-\alpha)\gamma^j p_j + m]\gamma^m\int\frac{d^n k}{(2\pi)^n}\,\frac{1}{[k^2 - \alpha m^2 + \alpha(1-\alpha)p^2]^2} \qquad (5.353)
We now rotate to the Euclidean sector and perform the k⁰ integration, thereby obtaining

-i\Sigma(p) = -ie^2\mu^{2\epsilon}\,\frac{\Gamma(2 - n/2)}{(4\pi)^{n/2}}\int_0^1 d\alpha\,\gamma_m[(1-\alpha)\gamma^j p_j + m]\gamma^m\,\left[\alpha m^2 - \alpha(1-\alpha)p^2\right]^{n/2-2} \qquad (5.354)
= -ie^2\mu^{2\epsilon}\,\frac{\Gamma(2 - n/2)}{(4\pi)^{n/2}}\int_0^1 d\alpha\,\left[(1-\alpha)\gamma^j p_j(2-n) + mn\right]\left[\alpha m^2 - \alpha(1-\alpha)p^2\right]^{n/2-2}
= -\frac{ie^2}{16\pi^2}\,\Gamma(\epsilon)\int_0^1 d\alpha\,\left[(1-\alpha)\gamma^j p_j(-2+2\epsilon) + m(4-2\epsilon)\right]\left[\frac{\alpha m^2 - \alpha(1-\alpha)p^2}{4\pi\mu^2}\right]^{-\epsilon} \qquad (5.355)
5.9. Mathematical Supplement
257
Taking the ε → 0 limit and using Γ(ε) ≃ (1/ε) − γ_E, we find that:

-i\Sigma(p) = -ie^2\,\frac{(1/\epsilon) - \gamma_E}{16\pi^2}\int_0^1 d\alpha\,\left[(1-\alpha)\gamma^j p_j(-2+2\epsilon) + m(4-2\epsilon)\right]\left[1 - \epsilon\ln\frac{\alpha m^2 - \alpha(1-\alpha)p^2}{4\pi\mu^2}\right]
= -i\,\frac{e^2}{16\pi^2}\frac{1}{\epsilon}(-\gamma^j p_j + 4m) - i\,\frac{e^2}{16\pi^2}\left[\gamma^j p_j(1+\gamma_E) - 2m(1+2\gamma_E) + 2\int_0^1 d\alpha\,\left(\gamma^j p_j(1-\alpha) - 2m\right)\ln\frac{\alpha m^2 - \alpha(1-\alpha)p^2}{4\pi\mu^2}\right] \qquad (5.356)
This is the result used in the text.
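The gamma-matrix contraction identities used repeatedly in these supplements (collected below in Eq. (5.366), e.g. γ_n γ^a γ^n = −2γ^a in n = 4) can be spot-checked numerically in an explicit representation. A standalone sketch in the Dirac representation with metric (+,−,−,−); the test vectors are arbitrary:

```python
# Check of the n = 4 contraction identities of Eq. (5.366) in the explicit
# Dirac representation, metric eta = diag(+1,-1,-1,-1), pure Python 4x4 algebra.
def mul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def scale(t, X):
    return [[t * X[i][j] for j in range(4)] for i in range(4)]

def close(X, Y, tol=1e-12):
    return all(abs(X[i][j] - Y[i][j]) < tol for i in range(4) for j in range(4))

g = [  # gamma^0, gamma^1, gamma^2, gamma^3
    [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]],
    [[0, 0, 0, 1], [0, 0, 1, 0], [0, -1, 0, 0], [-1, 0, 0, 0]],
    [[0, 0, 0, -1j], [0, 0, 1j, 0], [0, 1j, 0, 0], [-1j, 0, 0, 0]],
    [[0, 0, 1, 0], [0, 0, 0, -1], [-1, 0, 0, 0], [0, 1, 0, 0]],
]
eta = [1, -1, -1, -1]

def slash(a):   # gamma.a = a^mu gamma_mu
    return [[sum(eta[mu] * a[mu] * g[mu][i][j] for mu in range(4))
             for j in range(4)] for i in range(4)]

def contract(X):   # gamma_n X gamma^n
    out = [[0] * 4 for _ in range(4)]
    for mu in range(4):
        Y = mul(mul(g[mu], X), g[mu])
        for i in range(4):
            for j in range(4):
                out[i][j] += eta[mu] * Y[i][j]
    return out

a = [1.0, 0.2, -0.3, 0.5]
b = [0.7, -0.1, 0.4, 0.2]
c = [0.3, 0.6, 0.1, -0.2]
adotb = sum(eta[mu] * a[mu] * b[mu] for mu in range(4))
sa, sb, sc = slash(a), slash(b), slash(c)

assert close(contract(sa), scale(-2, sa))
assert close(contract(mul(sa, sb)),
             [[4 * adotb * (i == j) for j in range(4)] for i in range(4)])
assert close(contract(mul(mul(sa, sb), sc)), scale(-2, mul(mul(sc, sb), sa)))
```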
5.9.2 Calculation of the One Loop Vertex Function
The calculation proceeds exactly as before except that it is more complicated. We first reduce the expression in Eq. (5.304) to the form
-ie\mu^\epsilon\Lambda_m = (-ie\mu^\epsilon)^3\int\frac{d^n k}{(2\pi)^n}\,\frac{-i\eta^{nr}}{k^2}\,\gamma_n\,\frac{1}{\gamma(p'-k) - m}\,\gamma_m\,\frac{1}{\gamma(p-k) - m}\,\gamma_r
= -(e\mu^\epsilon)^3\int\frac{d^n k}{(2\pi)^n}\,\frac{\gamma_n(\gamma(p'-k) + m)\gamma_m(\gamma(p-k) + m)\gamma^n}{k^2[(p'-k)^2 - m^2][(p-k)^2 - m^2]}
= -2(e\mu^\epsilon)^3\int\frac{d^n k}{(2\pi)^n}\int_0^1 d\alpha\int_0^{1-\alpha}d\beta\;\frac{\gamma_n(\gamma(p'-k) + m)\gamma_m(\gamma(p-k) + m)\gamma^n}{[k^2 - m^2(\alpha+\beta) - 2k\cdot(\alpha p + \beta p') + \alpha p^2 + \beta p'^2]^3} \qquad (5.357)
Next, we combine the denominators using a slightly generalized version of the formula used earlier:

\frac{1}{abc} = 2\int_0^1 d\alpha\int_0^{1-\alpha}d\beta\,\frac{1}{[a(1-\alpha-\beta) + \alpha b + \beta c]^3} \qquad (5.358)
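Like Eq. (5.351), this identity can be confirmed numerically (a sketch; a, b, c are arbitrary positive test values and the grid size is an arbitrary choice):

```python
# Midpoint check of Eq. (5.358) over the triangle 0 < alpha < 1, 0 < beta < 1-alpha:
# 1/(abc) = 2 * int dalpha int dbeta [a(1-alpha-beta) + alpha*b + beta*c]^-3.
a, b, c = 1.0, 2.0, 4.0
N = 1200
h = 1.0 / N
s = 0.0
for i in range(N):
    al = (i + 0.5) * h
    for j in range(N):
        be = (j + 0.5) * h
        if al + be < 1.0:
            s += 2 * h * h / (a * (1 - al - be) + al * b + be * c) ** 3
assert abs(s - 1.0 / (a * b * c)) < 1e-3
```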
Shifting the variable to k → k − αp − βp′ reduces it to the expression

\Lambda_m = -2i(e\mu^\epsilon)^2\int\frac{d^n k}{(2\pi)^n}\int_0^1 d\alpha\int_0^{1-\alpha}d\beta\;\frac{\gamma_n(\gamma p'(1-\beta) - \alpha\gamma p - \gamma k + m)\gamma_m(\gamma p(1-\alpha) - \beta\gamma p' - \gamma k + m)\gamma^n}{[k^2 - m^2(\alpha+\beta) + \alpha(1-\alpha)p^2 + \beta(1-\beta)p'^2 - 2\alpha\beta\,p\cdot p']^3} \equiv \Lambda_m^{(1)} + \Lambda_m^{(2)} \qquad (5.359)
The Λ_m^{(1)} contains two factors of k^a in the numerator and is divergent; the Λ_m^{(2)} has no k^a in the numerator and is finite. (Note that terms with one k^a vanish due to the k → −k symmetry.) Let us first evaluate Λ_m^{(1)}. Writing γ_n γ^j k_j γ_m γ^j k_j γ^n = k_r k_s γ_n γ^r γ_m γ^s γ^n and using the standard integral

Exercise 5.21: Prove this.
\int\frac{d^n l}{(2\pi)^n}\,\frac{l_m l_n}{[l^2 + M^2 + 2l\cdot q]^A} = \frac{i(-1)^{n/2}}{(4\pi)^{n/2}\Gamma(A)}\left[\frac{1}{2}g_{mn}\frac{\Gamma(A - 1 - n/2)}{(M^2 - q^2)^{A-1-n/2}} + \frac{\Gamma(A - n/2)\,q_m q_n}{(M^2 - q^2)^{A-n/2}}\right] \qquad (5.360)
we can evaluate the integral with two factors of k to obtain

\int\frac{d^n k}{(2\pi)^n}\,\frac{k_r k_s}{[k^2 - m^2(\alpha+\beta) + \alpha(1-\alpha)p^2 + \beta(1-\beta)p'^2 - 2\alpha\beta\,p\cdot p']^3}
= i(-1)^{n/2}\,\frac{1}{(4\pi)^{n/2}\Gamma(3)}\,\frac{g_{rs}}{2}\,\frac{\Gamma(2 - n/2)}{[-m^2(\alpha+\beta) + \alpha(1-\alpha)p^2 + \beta(1-\beta)p'^2 - 2\alpha\beta\,p\cdot p']^{2-n/2}}
= i\,\frac{g_{rs}}{4(4\pi)^{n/2}}\,\frac{\Gamma(\epsilon)}{[m^2(\alpha+\beta) - \alpha(1-\alpha)p^2 - \beta(1-\beta)p'^2 + 2\alpha\beta\,p\cdot p']^{\epsilon}} \qquad (5.361)
Using now the result
$$
g_{rs}\,\gamma_n\gamma^r\gamma_m\gamma^s\gamma^n = \gamma_n\gamma^r\gamma_m\gamma_r\gamma^n = \gamma_n(2-n)\gamma_m\gamma^n = (2-n)^2\gamma_m
\tag{5.362}
$$
we can easily isolate the singular part and obtain
$$
\Lambda^{(1)}_m = -2ie^2\int_0^1 d\alpha\int_0^{1-\alpha}d\beta\;\frac{i}{4(4\pi)^2}\,\frac{4\gamma_m}{\epsilon} + \text{finite} = \frac{e^2}{16\pi^2}\,\frac{1}{\epsilon}\,\gamma_m + \text{finite}
\tag{5.363}
$$
The evaluation of the non-singular part makes use of the integral:
$$
\int\frac{d^nl}{(2\pi)^n}\,\frac{1}{[l^2+M^2+2l\cdot q]^A} = \frac{i(-1)^{n/2}\,\Gamma(A-n/2)}{(4\pi)^{n/2}\,\Gamma(A)\,(M^2-q^2)^{A-n/2}}
\tag{5.364}
$$
This will lead to the result in Eq. (5.309) of the text.
To proceed further and obtain $F_2(0)$, we first note that sandwiching the above expression between the Dirac spinors gives
$$
\bar u(p')\,\Lambda^{(2)}_m\,u(p) = \frac{e^2}{16\pi^2}\,\bar u(p')\int_0^1 d\alpha\int_0^{1-\alpha}d\beta\;\frac{\gamma_n(\gamma^jp'_j(1-\beta)-\alpha\gamma^jp_j+m)\gamma_m(\gamma^jp_j(1-\alpha)-\beta\gamma^jp'_j+m)\gamma^n}{[-m^2(\alpha+\beta)+\alpha(1-\alpha)p^2+\beta(1-\beta)p'^2-2\alpha\beta\,p\cdot p']}\;u(p)
\tag{5.365}
$$
To simplify the numerator, we use the standard identities:
$$
\gamma_n\gamma^ja_j\gamma^n = -2\gamma^ja_j,\qquad
\gamma_n\gamma^ja_j\gamma^jb_j\gamma^n = 4\,a\cdot b,\qquad
\gamma_n\gamma^ja_j\gamma^jb_j\gamma^jc_j\gamma^n = -2\gamma^jc_j\gamma^jb_j\gamma^ja_j,
\tag{5.366}
$$
which gives
$$
\gamma_n(\gamma^jp'_j(1-\beta)-\alpha\gamma^jp_j+m)\gamma_m(\gamma^jp_j(1-\alpha)-\beta\gamma^jp'_j+m)\gamma^n
$$
$$
= -2(\gamma^jp_j(1-\alpha)-\beta\gamma^jp'_j)\gamma_m(\gamma^jp'_j(1-\beta)-\alpha\gamma^jp_j) + 4m\big(p_m(1-\alpha)-\beta p'_m\big) + 4m\big(p'_m(1-\beta)-\alpha p_m\big) - 2\gamma_m m^2
$$
$$
= -2(\gamma^jp_j(1-\alpha)-\beta m)\gamma_m(\gamma^jp'_j(1-\beta)-\alpha m) + 4m\big(p_m(1-\alpha)-\beta p'_m\big) + 4m\big(p'_m(1-\beta)-\alpha p_m\big) - 2\gamma_m m^2
\tag{5.367}
$$
To arrive at the last expression, we have used the on-shell condition arising
from the Dirac equation. We next use the easily provable results:
$$
\gamma^jp_j\,\gamma_m\,\gamma^jp'_j = 2(p_m+p'_m)m - 3m^2\gamma_m,
$$
$$
\gamma^jp_j\,\gamma_m = 2p_m - \gamma_m\gamma^jp_j \to 2p_m - \gamma_m m,
$$
$$
\gamma_m\,\gamma^jp'_j = 2p'_m - \gamma^jp'_j\gamma_m \to 2p'_m - m\gamma_m\,.
\tag{5.368}
$$
These allow us to reduce the numerator to the form:
$$
N = -2\Big[(1-\alpha)(1-\beta)\big(2(p+p')_m m - 3m^2\gamma_m\big) - \beta(1-\beta)m(2p'_m - m\gamma_m) - \alpha(1-\alpha)m(2p_m - m\gamma_m) + \alpha\beta m^2\gamma_m\Big]
$$
$$
\quad + 4m\big[p'_m(1-2\beta) + p_m(1-2\alpha)\big] - 2\gamma_m m^2
$$
$$
= -2\gamma_m m^2\big[-3(1-\alpha)(1-\beta)+\beta(1-\beta)+\alpha(1-\alpha)+\alpha\beta+1\big] + 4p_m m\big[\beta-\alpha\beta-\alpha^2\big] + 4p'_m m\big[\alpha-\alpha\beta-\beta^2\big]
\tag{5.369}
$$
while the denominator becomes $D = -m^2(\alpha+\beta)^2$. The structure of the denominator shows that we can rewrite the numerator, as far as the integral goes, in the equivalent form:
$$
N = -2\gamma_m m^2\big[4(\alpha+\beta) - (\alpha+\beta)^2 - 2\big] + 2(p_m+p'_m)m\big[(\alpha+\beta)-(\alpha+\beta)^2\big]
\tag{5.370}
$$
Plugging in the numerator and denominator into Eq. (5.365), we get:
$$
\bar u(p')\,\Lambda^{(2)}_m\,u(p) = \frac{-e^2}{16\pi^2m^2}\,\bar u(p')\int_0^1d\alpha\int_0^{1-\alpha}d\beta\;\frac{1}{(\alpha+\beta)^2}\Big\{-2\gamma_m m^2\big[4(\beta+\alpha)-(\alpha+\beta)^2-2\big] + 2(p_m+p'_m)m\big[(\alpha+\beta)-(\alpha+\beta)^2\big]\Big\}\,u(p)
$$
$$
= \bar u(p')\,\frac{-e^2}{16\pi^2m^2}\int_0^1d\alpha\int_0^{1-\alpha}d\beta\;\frac{1}{(\alpha+\beta)^2}\Big\{-2\gamma_m m^2\big[4(\beta+\alpha)-(\alpha+\beta)^2-2\big] + 2\big[2m\gamma_m - i\sigma_{mn}q^n\big]m\big[(\alpha+\beta)-(\alpha+\beta)^2\big]\Big\}\,u(p)
$$
$$
= \bar u(p')\,\frac{-e^2}{16\pi^2m^2}\int_0^1d\alpha\int_0^{1-\alpha}d\beta\;\frac{1}{(\alpha+\beta)^2}\Big\{-2\gamma_m m^2\big[2(\beta+\alpha)+(\alpha+\beta)^2-2\big] - 2i\sigma_{mn}q^n\,m\big[(\alpha+\beta)-(\alpha+\beta)^2\big]\Big\}\,u(p)
\tag{5.371}
$$
This is the expression used in the text.
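The $\gamma$-matrix contraction identities (5.366) that drive this reduction can be verified by brute force with explicit Dirac matrices. A minimal sketch, my addition (numpy assumed; Dirac representation, metric $\eta = \mathrm{diag}(1,-1,-1,-1)$):

```python
import numpy as np

rng = np.random.default_rng(0)
I2, Z = np.eye(2, dtype=complex), np.zeros((2, 2), dtype=complex)
sx = np.array([[0, 1], [1, 0]], dtype=complex)
sy = np.array([[0, -1j], [1j, 0]], dtype=complex)
sz = np.array([[1, 0], [0, -1]], dtype=complex)
g = [np.block([[I2, Z], [Z, -I2]])] + [np.block([[Z, s], [-s, Z]]) for s in (sx, sy, sz)]
eta = np.array([1.0, -1.0, -1.0, -1.0])

def slash(a):                      # gamma^j a_j for a vector with upper components a^mu
    return sum(eta[m] * a[m] * g[m] for m in range(4))

def contract(X):                   # gamma_n X gamma^n
    return sum(eta[m] * g[m] @ X @ g[m] for m in range(4))

a, b, c = rng.normal(size=(3, 4))
dot = lambda u, v: float(sum(eta * u * v))

assert np.allclose(contract(slash(a)), -2 * slash(a))
assert np.allclose(contract(slash(a) @ slash(b)), 4 * dot(a, b) * np.eye(4))
assert np.allclose(contract(slash(a) @ slash(b) @ slash(c)),
                   -2 * slash(c) @ slash(b) @ slash(a))
```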
A Potpourri of Problems
Problem 1. Evaluation of $\langle 0|\phi(x)\phi(y)|0\rangle$ for Spacelike Separation
Find $\langle 0|\phi(x)\phi(y)|0\rangle$ in coordinate space for $(x-y)^2 = -r^2 < 0$. Pay special attention to the convergence properties of the integral.
Solution: We need to evaluate the integral
$$
G(x-y) = \int\frac{d^3p}{(2\pi)^3}\,\frac{1}{2\omega_p}\,e^{-ip\cdot(x-y)} = \frac{-i}{2(2\pi)^2r}\int_{-\infty}^{\infty}dp\;\frac{p\,e^{ipr}}{\sqrt{p^2+m^2}}
\tag{P.1}
$$
where we have performed the angular part of the integration. This integral is not convergent as it stands. One can distort the integration contour into the complex plane, but the contribution from the semicircle at infinity in the complex plane will not vanish. One possibility is to explicitly add a convergence factor like $\exp(-\epsilon p)$ and take the limit $\epsilon \to 0$ at the end of the calculation. A simpler trick is to rewrite the integral in the form
$$
\int_{-\infty}^{\infty}dp\;\frac{p\,e^{ipr}}{\sqrt{p^2+m^2}} = \int_{-\infty}^{\infty}dp\;\frac{\big(p-\sqrt{p^2+m^2}\big)\,e^{ipr}}{\sqrt{p^2+m^2}} + 2\pi\delta(r)
\tag{P.2}
$$
where the Dirac delta function on the right hand side is cancelled by the
contribution from the second term in the integrand. The integrand now
falls as (1/p2 ) for large p and the contour integration trick will work for
this integral. This tells you that the usual regulators miss the 2πδ(r)
contribution; but for strictly spacelike separations we can take r > 0 (since
r = 0 will be on the light cone). So everything is fine and we can use the
standard integral representation for an avatar of the Bessel function
∞
xe−μx dx
√
= uK1 (uμ)
[u > 0, Re μ > 0]
(P.3)
x2 − u2
u
to get the final answer.
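The representation (P.3) can be checked against scipy's modified Bessel function; substituting $x = u\cosh t$ removes the endpoint singularity. A numerical sketch I have added, not part of the text:

```python
import numpy as np
from scipy.integrate import quad
from scipy.special import kv

u, mu_ = 1.3, 0.8   # sample values with u > 0, Re(mu) > 0

# x = u*cosh(t) turns the left side of (P.3) into u * int_0^inf cosh(t) exp(-mu*u*cosh(t)) dt
val, _ = quad(lambda t: np.cosh(t) * np.exp(-mu_*u*np.cosh(t)), 0.0, 50.0)
assert abs(u * val - u * kv(1, u * mu_)) < 1e-8
```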
Problem 2. Number Density and the Non-relativistic Limit
Consider a free, massive, scalar field $\phi(x)$ which is expanded in terms of creation and annihilation operators in the standard manner as
$$
\phi(x) = \int\frac{d^3k}{(2\pi)^3}\,\frac{1}{\sqrt{2\omega_k}}\left[a(k)\,e^{-ik\cdot x} + a^\dagger(k)\,e^{ik\cdot x}\right] \equiv \phi^{(+)}(x) + \phi^{(-)}(x)
\tag{P.4}
$$
Since the total number of particles is an integral over a†k ak , it makes sense
to define a number density operator N (xi ) in the real space through the
relations
$$
N \equiv \int\frac{d^3k}{(2\pi)^3}\,a^\dagger(k)\,a(k) \equiv \int d^3x\;N(x^i)
\tag{P.5}
$$
(a) Find an expression for N in terms of φ± .
(b) In the text, we lamented the fact that localized particle states cannot be introduced in a meaningful fashion in QFT. To see this in action once again, let us study the expectation value of $N$ in one-particle states. Define a one-particle state $|\psi\rangle$ through the relations
$$
|\psi\rangle \equiv \int\frac{d^3k}{(2\pi)^3}\,\psi(k)\,a^\dagger(k)|0\rangle \equiv \int\frac{d^3k}{(2\pi)^3}\,\psi(k)\,|1_k\rangle,
\tag{P.6}
$$
with
$$
\langle\psi|\psi\rangle = \int\frac{d^3k}{(2\pi)^3}\,|\psi(k)|^2 = 1
\tag{P.7}
$$
Compute the expectation value $\langle\psi|N(0,\mathbf x)|\psi\rangle$ in terms of the "wavefunction" $\psi(\mathbf x)$. What do you find?
(c) Show that things work out fine in the nonrelativistic limit.
(d) What happens if you choose $\psi(k) \propto \sqrt{\omega_k}$ or if you choose $\psi(k) \propto 1/\sqrt{\omega_k}$? Calculate the expectation value of the Hamiltonian for a general $\psi(k)$ and explain why the above choices do not really lead to localization of a particle. What happens in the non-relativistic case?
Solution: (a) It is easy to verify that a possible choice for $N$ is given by the expression
$$
N(x^i) = i\,\phi^{(-)}(x^i)\,\overleftrightarrow{\partial_t}\,\phi^{(+)}(x^i)
\tag{P.8}
$$
(b) The evaluation of the expectation value is straightforward. We get
$$
\langle\psi|N(0,\mathbf x)|\psi\rangle = i\int\frac{d^3k}{(2\pi)^3}\frac{d^3k'}{(2\pi)^3}\,\psi^*(k')\,\psi(k)\,\langle 1_{k'}|\,\phi^{(-)}(x)\dot\phi^{(+)}(x) - \dot\phi^{(-)}(x)\phi^{(+)}(x)\,|1_k\rangle = \operatorname{Re}\big(\chi_1^*(\mathbf x)\,\chi_2(\mathbf x)\big)
\tag{P.9}
$$
involving two distinct "wavefunctions" given by
$$
\chi_1(\mathbf x) \equiv \int\frac{d^3k}{(2\pi)^3}\,\frac{1}{\sqrt{\omega_k}}\,\psi(k)\,e^{i\mathbf k\cdot\mathbf x};\qquad
\chi_2(\mathbf x) \equiv \int\frac{d^3k}{(2\pi)^3}\,\sqrt{\omega_k}\,\psi(k)\,e^{i\mathbf k\cdot\mathbf x}
\tag{P.10}
$$
The fact that you cannot work with a single, unique wavefunction is again
related to the fact that particle states are not localizable.
(c) When $c \to \infty$, the two wavefunctions reduce, except for a normalization, to
$$
\chi_1(\mathbf x) \to \frac{1}{\sqrt{mc}}\,\psi(\mathbf x),\qquad \chi_2(\mathbf x) \to \sqrt{mc}\,\psi(\mathbf x)
\tag{P.11}
$$
with
$$
\psi(\mathbf x) = \int\frac{d^3k}{(2\pi)^3}\,\psi(k)\,e^{i\mathbf k\cdot\mathbf x}
\tag{P.12}
$$
In this case, the expectation value of the number density becomes the
standard probability density |ψ|2 of NRQM. This is the best one can do.
(d) If we choose $\psi(k) \propto \sqrt{\omega_k}$, then $\chi_1(\mathbf x)$ is proportional to the Dirac delta function $\delta(\mathbf x)$, suggesting that one of the two wavefunctions describes a particle localized at the origin. But in this case, $\chi_2$ is proportional to $G_+(x)$ and is spread over a Compton wavelength of the particle. [The situation is similar for the other choice $\psi(k) \propto 1/\sqrt{\omega_k}$.] More importantly, the expectation value of the energy density operator is non-zero all over the space. The expectation value of the normal ordered Hamiltonian
$$
H(0,\mathbf x) = \frac12 :\dot\phi(0,\mathbf x)^2 + |\nabla\phi(0,\mathbf x)|^2 + m^2\phi(0,\mathbf x)^2:
\tag{P.13}
$$
2
is given by
$$
\langle\psi|H(0,\mathbf x)|\psi\rangle = \frac12\Big[|\chi_2(\mathbf x)|^2 + c^2\,\nabla\chi_1^*(\mathbf x)\cdot\nabla\chi_1(\mathbf x) + m^2c^4\,|\chi_1(\mathbf x)|^2\Big]
\tag{P.14}
$$
Again, things work out in the nonrelativistic limit when we have
$$
\langle\psi|H(0,\mathbf x)|\psi\rangle \to mc^2\,|\psi(\mathbf x)|^2 + \frac{1}{2m}\,|\nabla\psi(\mathbf x)|^2
\tag{P.15}
$$
If you integrate this over all space, the first term gives mc2 while the
second term gives the expectation value of the nonrelativistic kinetic energy
operator.
Problem 3. From [φ, π] to [a, a† ]
Using the standard mode expansion of the field $\phi(x)$ in terms of the creation and annihilation operators, prove that the equal time commutation rule, $[\phi(\mathbf x),\pi(\mathbf y)] = i\delta^{(3)}(\mathbf x-\mathbf y)$, implies the commutation rule $[a_p, a^\dagger_{p'}] = (2\pi)^3\delta^{(3)}(p-p')$.
Solution: Using the relations
$$
\phi(\mathbf x) = \int\frac{d^3p}{(2\pi)^3}\,\frac{1}{\sqrt{2\omega_p}}\,\big(a_p + a^\dagger_{-p}\big)\,e^{i\mathbf p\cdot\mathbf x}
$$
$$
\pi(\mathbf y) = \int\frac{d^3p'}{(2\pi)^3}\,(-i)\sqrt{\frac{\omega_{p'}}{2}}\,\big(a_{p'} - a^\dagger_{-p'}\big)\,e^{i\mathbf p'\cdot\mathbf y}
\tag{P.16}
$$
which are valid at t = 0, one can determine the creation and annihilation
operators in terms of φ(x) and π(y). A straightforward computation will
then lead to the necessary result. An alternative route is the following:
Use Eq. (P.16) to compute the commutator [φ(x), π(y)] in terms of the
different sets of the commutators of creation and annihilation operators.
If you now write the Dirac delta function in Fourier space, you should be
able to argue that the only consistent solutions for the commutators of the
creation and annihilation operators are the standard ones.
Problem 4. Counting the Modes between the Casimir Plates
Prove the counting used in the text for the degrees of freedom associated
with different wavenumbers of the electromagnetic wave modes between
the Casimir plates.
Solution: Choose a coordinate system with one boundary at $z = 0$ and consider the region $0 \le x \le L_x$, $0 \le y \le L_y$, $0 \le z \le L_z$. From Maxwell's equations, it is straightforward to show that $B_\perp = 0$, $E_\parallel = 0$, where $\perp$ and $\parallel$ indicate the components perpendicular and parallel to the surface respectively. These translate to the conditions $A_\parallel = 0$, $\partial A_z/\partial z = 0$. If we take the plane wave modes to have the standard form $\mathbf A(t,\mathbf x) = \boldsymbol\epsilon\, e^{i(\mathbf k\cdot\mathbf x - \omega t)}$ with $\boldsymbol\epsilon\cdot\mathbf k = 0$, where the wave vectors satisfy the standard boundary conditions
$$
k_x = \frac{\pi n_x}{L_x},\quad k_y = \frac{\pi n_y}{L_y},\quad k_z = \frac{\pi n_z}{L_z},\qquad (n_\alpha = 0,1,2,\ldots)
\tag{P.17}
$$
then we can write the expansion for $A_x$ in the form
$$
A_x(x,y,z) = \sum_{n_x=0}^{\infty}\sum_{n_y=0}^{\infty}\sum_{n_z=0}^{\infty}\epsilon_x(\mathbf k)\,\cos(k_xx)\,\sin(k_yy)\,\sin(k_zz)
\tag{P.18}
$$
The expansions for $A_y$ and $A_z$ follow the same pattern. If all the three $n_\alpha$ are non-zero, then the gauge condition puts one constraint; therefore only two of the coefficients $\epsilon_\alpha(\mathbf k)$ can be chosen independently. If only two of the $n_\alpha$'s are non-zero, then the situation is a little tricky. Consider, for example, the case with $n_z = 0$ and $n_x \neq 0$, $n_y \neq 0$. Since $A_x$ and $A_y$ contain the factor $\sin(k_zz)$, they both vanish, leaving only one independent solution, which will be proportional to $\epsilon_z(\mathbf k)$. So there are two harmonic oscillator modes when all of the $n_\alpha$ are non-zero, but only one oscillator mode for every triplet in which only two components are non-zero. This is the result which was used in the text.
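This counting can also be checked mechanically. The sketch below is my addition (numpy assumed); it encodes which $\epsilon$-components survive the boundary conditions and then imposes $\boldsymbol\epsilon\cdot\mathbf k = 0$ on the survivors:

```python
import numpy as np
from itertools import product

def mode_count(n):
    """Independent oscillator modes for the triplet n = (nx, ny, nz)."""
    k = np.array(n, dtype=float)   # overall factors of pi/L are irrelevant here
    # eps_alpha multiplies cos(k_alpha x_alpha) times sines of the OTHER two directions,
    # so it survives only if both of the other n's are non-zero
    alive = [i for i in range(3) if all(n[j] != 0 for j in range(3) if j != i)]
    if not alive:
        return 0
    # transversality eps . k = 0 restricted to the surviving components
    constraint_nontrivial = not np.allclose(k[alive], 0.0)
    return len(alive) - (1 if constraint_nontrivial else 0)

for n in product(range(4), repeat=3):
    nonzero = sum(1 for x in n if x != 0)
    expected = {3: 2, 2: 1}.get(nonzero, 0)
    assert mode_count(n) == expected
```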
Problem 5. Effective Potential for m = 0
Consider the effective potential $V_{\rm eff}$ of the $\lambda\phi^4$ theory in the limit of $m_0^2 \to 0$. We know that the theory exhibits SSB when $m_0^2 < 0$ while it has a single minimum (at $\phi = 0$) when $m_0^2 > 0$. The question arises as to what happens to the theory due to higher order corrections when $m_0^2 = 0$. In this case, it is convenient to define the physical mass and coupling constant by the conditions
$$
m^2_{\rm phys} = \left.\frac{d^2V}{d\phi^2}\right|_{\phi=0};\qquad \lambda = \left.\frac{d^4V}{d\phi^4}\right|_{\phi=\Lambda}.
\tag{P.19}
$$
where $\Lambda$ is an arbitrary energy scale. Work out the effective potential using this prescription in terms of $\Lambda$.
Solution: In this case, the effective potential (worked out in the text) becomes
$$
V_{\rm eff} = \frac{\lambda}{4!}\,\phi^4 + \frac{\lambda^2}{(16\pi)^2}\,\phi^4\,\ln\frac{\lambda\phi^2}{4\mu^2}
\tag{P.20}
$$
From Eq. (P.19), we find that
$$
\lambda = \lambda + \frac{25\lambda^2}{(8\pi)^2} + \frac{24\lambda^2}{(16\pi)^2}\,\ln\frac{\lambda\Lambda^2}{4\mu^2}\,,
\tag{P.21}
$$
allowing us to express $(4\mu^2/\lambda)$ in terms of $\Lambda$:
$$
\ln\frac{\lambda\Lambda^2}{4\mu^2} = -\frac{25}{6}
\tag{P.22}
$$
Substituting back into Eq. (P.20), we get:
$$
V_{\rm eff} = \frac{\lambda}{4!}\,\phi^4 + \frac{\lambda^2}{(16\pi)^2}\,\phi^4\left[\ln\frac{\phi^2}{\Lambda^2} - \frac{25}{6}\right]
\tag{P.23}
$$
The constant $(25/6)$, of course, can be reabsorbed into $\ln\Lambda^2$; it is conventional to leave it as it is. Once again, we have obtained a finite $V_{\rm eff}$ without any cut-off dependence, but it has an apparent dependence on $\Lambda$. We have already seen that this is only apparent, because the coupling constant $\lambda$ is now defined at the energy scale $\Lambda$. If we change this scale $\Lambda$ to $\Lambda'$, then we can retain the form of $V_{\rm eff}$ by changing $\lambda$ to $\lambda'$, such that
$$
\lambda' = \lambda + \frac{3\lambda^2}{16\pi^2}\,\ln\frac{\Lambda'}{\Lambda}
\tag{P.24}
$$
Under the $\Lambda \to \Lambda'$, $\lambda \to \lambda'$ transformations, $V_{\rm eff}$ is invariant: $V_{\rm eff}(\lambda', \Lambda') = V_{\rm eff}(\lambda, \Lambda) + \mathcal O(\lambda^3)$. This is exactly the "running" of the coupling constant we have seen earlier.
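The claimed invariance up to $\mathcal O(\lambda^3)$ is easy to probe numerically with (P.23) and (P.24): shrinking $\lambda$ by a factor of 10 should shrink the mismatch by roughly $10^3$. A sketch I have added (numpy assumed):

```python
import numpy as np

def V_eff(phi, lam, Lam):
    # Eq. (P.23)
    return lam*phi**4/24 + (lam**2/(16*np.pi)**2) * phi**4 * (np.log(phi**2/Lam**2) - 25/6)

def run_coupling(lam, Lam, Lam_new):
    # Eq. (P.24)
    return lam + 3*lam**2/(16*np.pi**2) * np.log(Lam_new/Lam)

phi, Lam, Lam_new = 0.7, 1.0, 3.0
mismatch = []
for lam in (1e-2, 1e-3):
    d = V_eff(phi, run_coupling(lam, Lam, Lam_new), Lam_new) - V_eff(phi, lam, Lam)
    mismatch.append(abs(d))

# the O(lambda^2) pieces cancel exactly; the residual scales as lambda^3
ratio = mismatch[0] / mismatch[1]
assert 800 < ratio < 1200
```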
Problem 6. Anomalous Dimension of the Coupling Constant
Consider the massless $\lambda\phi^4$ theory in $d = 4$ with the Lagrangian $L = (1/2)(\partial\phi)^2 + (\lambda/4!)\phi^4$. The resulting action is invariant under the scale transformation $x^m \to s^{-1}x^m$, $\phi \to s\phi$, with $\lambda$ remaining invariant. Figure out how the effective coupling constant of the renormalized theory changes under the same scaling transformation.
Solution: This is fairly straightforward. Using the analysis given in the text, you should be able to show that, to the lowest order,
$$
\lambda_{\rm eff} \to \lambda_{\rm eff}\,\exp\left[\frac{3\lambda_{\rm eff}}{16\pi^2}\,\ln s\right] = \lambda_{\rm eff}\,s^{(3\lambda/16\pi^2)} \equiv s^\gamma\,\lambda_{\rm eff}
\tag{P.25}
$$
The constant γ is called the anomalous scaling dimension of the coupling
constant.1
Problem 7. Running of the Cosmological Constant
One key result we arrived at in the study of renormalization (both in the
case of the λφ4 theory and QED) is that the renormalized constants appearing in the Lagrangian can depend on an arbitrary energy scale μ and
"run with it". This idea will be applicable even to a constant term (say, $-\Lambda_0$) added to the standard scalar field Lagrangian.² We found in Sect.
1.4.5 that the coincidence limit of the Green’s function, G(x; x), can be
interpreted in terms of the zero point energy of the infinite number of oscillators. This zero point energy will combine with the bare parameter Λ0
to lead to a renormalized cosmological constant Λ. Regularize the relevant expressions by the standard methods and show that the renormalized
cosmological constant Λ runs with the scale μ according to the equation
$$
\mu\frac{d\Lambda}{d\mu} = (4-n)\Lambda - \frac12\,\frac{m^4}{(4\pi)^2}
\tag{P.26}
$$
¹ This is one of the many examples in QFT where a classical symmetry is lost in the quantum theory. One way of understanding this result is to note that our regularization involves using the arbitrary dimension $d$. The original symmetry existed only for $d = 4$ and hence one can accept dimensional regularization leading to a final result in which a dimension dependent symmetry is not respected.
² Because of historical reasons, $\Lambda$ is called the cosmological constant; for our purpose it is just another bare parameter in the original, unrenormalized Lagrangian, like $m_0$, $\lambda_0$ etc.
where $m$ is the mass of the scalar field. Also show that the regularized zero point energy is given by the expression
$$
E = \frac14\,\frac{m^4}{(4\pi)^2}\left[\ln\frac{m^2}{4\pi\mu^2} + \gamma_E - \frac32\right] - \Lambda(\mu)
\tag{P.27}
$$
where the running of $\Lambda(\mu)$ ensures that $E$ is independent of $\mu$.
Solutions: Most of the work for this problem is already done in the text.
Just for fun, we will derive it using a more formal approach along the
following lines. From the Euclidean vacuum-to-vacuum amplitude of the
theory, expressed in the form
$$
Z \equiv e^{-W} = \int[\mathcal D\phi]\,\exp\left[-\int(d^nx_E)\left\{\frac12(\partial_m\phi)^2 + \frac12 m_0^2\phi^2 - \Lambda_0\right\}\right]
\tag{P.28}
$$
³ For a more challenging task, you can try computing the running of $\Lambda$ and $E$ in the interacting $\lambda\phi^4$ theory. You will find that, at $\mathcal O(\lambda)$, Eq. (P.26) does not change!
[Figure P.1(a)–(d), in the margin: the photon propagator $-i\eta_{mn}/k^2$, the scalar propagator $i/(k^2-m^2)$, the three-point vertex $ie(p_m + p'_m)$ and the seagull vertex $2ie^2\eta_{mn}$ of scalar QED.]
we immediately obtain the relation
$$
\frac{\partial}{\partial m_0^2}\ln Z = -\frac12\int(d^nx_E)\,G_E(x,x)
\tag{P.29}
$$
where
$$
G_E(x,x) = \int\frac{(d^nk_E)}{(2\pi)^n}\,\frac{1}{k^2+m_0^2}
\tag{P.30}
$$
is the coincidence limit of the Green’s function. We are working in Euclidean n-dimensional space in anticipation of dimensional regularization
being used to handle Eq. (P.30). Using the standard Schwinger trick, we
can evaluate Eq. (P.30) to obtain
$$
G_E(x,x) = \frac{(m_0^2)^{(n/2)-1}}{(4\pi)^{n/2}}\,\Gamma\left(1-\frac n2\right)
\tag{P.31}
$$
which allows us to integrate Eq. (P.29) and obtain:
$$
Z = \exp\left[-\int(d^nx_E)\,E\right];\qquad E = \frac{m_0^n}{(4\pi)^{n/2}\,n}\,\Gamma\left(1-\frac n2\right) - \Lambda_0
\tag{P.32}
$$
(compare with Eq. (2.84)). Taking the limit n → 4 and isolating the
divergences, we find that the renormalized cosmological constant Λ and
the bare constant Λ0 are related by
$$
\Lambda_0 = \mu^{n-4}\left[\frac12\,\frac{m_0^4}{(4\pi)^2}\,\frac{1}{n-4} + \Lambda\right]
\tag{P.33}
$$
The standard condition $\mu(d\Lambda_0/d\mu) = 0$ now gives us the running of the cosmological constant in Eq. (P.26). In the free theory, the physical mass is the same as the bare mass, and this identification has been made in this context.³ Substituting Eq. (P.33) back into Eq. (P.32), we get Eq. (P.27).
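The algebra leading from (P.33) to (P.26) can be delegated to a computer algebra system. A sketch I have added (sympy assumed):

```python
import sympy as sp

mu, n, m = sp.symbols('mu n m', positive=True)
L = sp.Function('Lambda')   # the renormalized cosmological constant Lambda(mu)

# Eq. (P.33)
Lam0 = mu**(n - 4) * (sp.Rational(1, 2) * m**4 / (4*sp.pi)**2 / (n - 4) + L(mu))

# impose mu * dLam0/dmu = 0 and solve for the derivative of Lambda(mu)
dL = sp.Derivative(L(mu), mu)
sol = sp.solve(sp.Eq(mu * sp.diff(Lam0, mu), 0), dL)[0]
running = sp.simplify(mu * sol)

# Eq. (P.26)
expected = (4 - n) * L(mu) - sp.Rational(1, 2) * m**4 / (4*sp.pi)**2
assert sp.simplify(running - expected) == 0
```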
[Figure P.1: Feynman rules for scalar QED; panel (e), in the margin here, is the self-interaction vertex $-i\lambda$.]
In Sect. 4.2, we studied scalar electrodynamics non-perturbatively in order to compute the Schwinger effect. This theory is described by the action
$$
\mathcal A = \int d^4x\left[-\frac14 F_{mn}F^{mn} + (D_m\phi)^*(D^m\phi) - m^2\phi^*\phi - \frac{\lambda}{4}(\phi^*\phi)^2\right]
\tag{P.34}
$$
where Dm (x) = ∂m − ieAm (x) is the standard gauge covariant derivative.
Work out and write down the Feynman rules for this theory.
Solution: The Feynman diagrams with the equivalent algebraic expressions are given in Fig. P.1. Figures P.1(a) and (b) are straightforward and represent the photon propagator and the scalar propagator. The last figure (Fig. P.1(e)) represents the self-interaction vertex of the scalar field and has nothing to do with the electromagnetic coupling. Figures P.1(c) and (d)
represent the photon-scalar vertices of the theory. Since there are both
linear and quadratic couplings to the vector potential, we get two kinds
of vertices — one with 3 lines and the other with 4 lines. The first one
is similar to the one in standard QED while the second one is new to the
scalar field interaction.
Problem 9. Two-loop Contribution to the Propagator in the $\lambda\phi^4$ Theory
In Sect. 4.8 we described the renormalized perturbation theory for the $\lambda\phi^4$ theory. We found that, at the one loop order, we needed counter-terms to take care of the renormalization of the mass and the coupling constant. But we did not require any wavefunction renormalization. The purpose of this problem is to compute the corrections to the propagator to $\mathcal O(\lambda_R^2)$ and show that everything works out fine with a renormalization of the wavefunction. To do this, it is convenient to relate the bare and renormalized quantities through the equations:
$$
\phi_B = Z_\phi^{1/2}\phi_R,\qquad m_B^2 = \frac{1}{Z_\phi}\,(m_R^2 + \delta m^2),\qquad \lambda_B = \frac{1}{Z_\phi^2}\,(\lambda_R + \delta\lambda)
\tag{P.35}
$$
where $Z_\phi$ is the wavefunction renormalization. The Lagrangian can now be separated into the renormalized part and the counter-terms in the form
$$
L = \frac12(\partial_\mu\phi_R)^2 - \frac12 m_R^2\phi_R^2 - \frac{\lambda_R}{4!}\phi_R^4 + \frac12(Z_\phi-1)(\partial_\mu\phi_R)^2 - \frac12\,\delta m^2\,\phi_R^2 - \frac{\delta\lambda}{4!}\,\phi_R^4
\tag{P.36}
$$
We will expand the parameters as a power series in the coupling constant by
$$
Z_\phi - 1 = z_1\lambda_R + z_2\lambda_R^2 + \cdots,\qquad
\delta m^2 = m_R^2\,(a_1\lambda_R + a_2\lambda_R^2 + \cdots),\qquad
\delta\lambda = b_2\lambda_R^2 + b_3\lambda_R^3 + \cdots
\tag{P.37}
$$
We already know from our study correct to $\mathcal O(\lambda_R)$, in the main text, that:
$$
z_1 = 0;\qquad a_1 = \frac{\mu^{-2\epsilon}}{2(4\pi)^2}\,\frac1\epsilon;\qquad b_2 = \frac{3\mu^{-2\epsilon}}{2(4\pi)^2}\,\frac1\epsilon
\tag{P.38}
$$
Draw the relevant diagrams with two loops for the corrections to the propagator and compute the resulting expression. Use dimensional regularization and show that the new divergences can be removed with the choices
$$
z_2 = \frac{\mu^{-4\epsilon}}{24(4\pi)^4}\,\frac1\epsilon,\qquad
a_2 = \frac{\mu^{-4\epsilon}}{4(4\pi)^4}\left(\frac{1}{\epsilon^2} - \frac{1}{2\epsilon}\right)
\tag{P.39}
$$
Solution: This requires a fairly involved computation, but you should be able to fill in the details of the following steps. [Figure P.2, in the margin: the two-loop diagrams that contribute to the propagator.] The relevant Feynman diagrams
are shown in Fig. P.2. The contribution from the first one is given by the
integral
$$
I_1 = \frac14\,(-i\lambda_R)^2\int\frac{i\,d^np}{(2\pi)^n}\left[\frac{-i}{p^2+m_R^2}\right]^2\int\frac{i\,d^nk}{(2\pi)^n}\,\frac{-i}{k^2+m_R^2}
= \frac{i\lambda_R^2m_R^2\,\mu^{-4\epsilon}}{4(4\pi)^4}\left(\frac{4\pi\mu^2}{m_R^2}\right)^{2\epsilon}\Gamma(\epsilon)\,\Gamma(\epsilon-1)
\tag{P.40}
$$
where the integrals are in Euclidean space. Standard dimensional regularization techniques will now give the divergent contribution to be
$$
I_1 = -\frac{i\lambda_R^2m_R^2\,\mu^{-4\epsilon}}{4(4\pi)^4}\left[\frac{1}{\epsilon^2} + \frac{1}{\epsilon}\left(1 - 2\gamma_E + 2\log\frac{4\pi\mu^2}{m_R^2}\right)\right]
\tag{P.41}
$$
where we have used the MS scheme. The contribution from the second
diagram is given by:
$$
I_2 = \frac12\,(-i\lambda_R)(-ia_1\lambda_Rm_R^2)\int\frac{i\,d^np}{(2\pi)^n}\left[\frac{-i}{p^2+m_R^2}\right]^2
= \frac{ia_1\lambda_R^2m_R^2\,\mu^{-2\epsilon}}{2(4\pi)^2}\left(\frac{4\pi\mu^2}{m_R^2}\right)^{\epsilon}\Gamma(\epsilon)
\tag{P.42}
$$
where we know from the result in the text that
$$
a_1 = \frac{\mu^{-2\epsilon}}{2(4\pi)^2}\,\frac1\epsilon
\tag{P.43}
$$
Again, the divergent contribution can be isolated by the MS scheme, leading
to
$$
I_2 = \frac{i\lambda_R^2m_R^2\,\mu^{-4\epsilon}}{4(4\pi)^4}\left[\frac{1}{\epsilon^2} + \frac{1}{\epsilon}\left(\frac12 - \gamma_E + \log\frac{4\pi\mu^2}{m_R^2}\right)\right]
\tag{P.44}
$$
(Interestingly, you will find that the worst divergences, proportional to $\epsilon^{-2}$, cancel when we add $I_1$ and $I_2$.) The third contribution is given by
$$
I_3 = \frac12\,(-ib_2\lambda_R^2)\int\frac{i\,d^np}{(2\pi)^n}\,\frac{-i}{p^2+m_R^2}
= -\frac{ib_2\lambda_R^2m_R^2\,\mu^{-2\epsilon}}{2(4\pi)^2}\left(\frac{4\pi\mu^2}{m_R^2}\right)^{\epsilon}\Gamma(\epsilon-1)
\tag{P.45}
$$
where we know from the result in the text that
$$
b_2 = \frac{3\mu^{-2\epsilon}}{2(4\pi)^2}\,\frac1\epsilon
\tag{P.46}
$$
This leads to:
$$
I_3 = \frac{3i\lambda_R^2m_R^2\,\mu^{-4\epsilon}}{4(4\pi)^4}\left[\frac{1}{\epsilon^2} + \frac{1}{\epsilon}\left(1 - \gamma_E + \log\frac{4\pi\mu^2}{m_R^2}\right)\right]
\tag{P.47}
$$
Diagram 4 is completely straightforward to compute and gives the contribution $(iz_2\lambda_R^2p^2 - ia_2\lambda_R^2m_R^2)$. Let us next get on to diagram 5, which is possibly the most complicated one we need to calculate. Here we need to evaluate the integral
$$
I_5 = \frac{i\lambda_R^2}{6}\int\frac{d^nk}{(2\pi)^n}\,\frac{d^nq}{(2\pi)^n}\;\frac{1}{k^2+m_R^2}\,\frac{1}{q^2+m_R^2}\,\frac{1}{(p+k+q)^2+m_R^2}
\tag{P.48}
$$
We will obtain the divergent parts by a series of tricks rather than attempt a brute force computation. There are two divergent contributions, at $\mathcal O(p^0)$ and $\mathcal O(p^2)$, if we expand $I_5$ as a power series in $p$. We will evaluate the coefficients of the divergences separately. For the first one, we can set $p = 0$ and write the integral, say, as $I(m_R^2)$, scaling out the factor $i\lambda_R^2/6$. It is a little easier to compute the derivative $I'(m_R^2)$ rather than $I(m_R^2)$ itself. With a little bit of guess work, you can separate out $I'(m_R^2)$ into two terms, of which the second one is finite and the first one is given by
$$
I'(m_R^2) = -3\int\frac{d^nk}{(2\pi)^n}\,\frac{d^nq}{(2\pi)^n}\;\frac{1}{(k^2+m_R^2)\,(q^2+m_R^2)^2\,(k+q)^2}
\tag{P.49}
$$
This is a standard integral and its evaluation leads to the result
$$
I'(m_R^2) = -\frac{3\,(m_R^2)^{-2\epsilon}\,[\Gamma(\epsilon)]^2}{2\,(4\pi)^{4-2\epsilon}\,(1-\epsilon)}
\tag{P.50}
$$
Integrating $I'(m_R^2)$, we find the divergent part to be
$$
I(m_R^2) = -\frac{3m_R^2\,\mu^{-4\epsilon}}{2(4\pi)^4}\left[\frac{1}{\epsilon^2} + \frac{1}{2\epsilon}\left(3 - 2\gamma_E + 2\log\frac{4\pi\mu^2}{m_R^2}\right)\right]
\tag{P.51}
$$
This takes care of the divergence of $\mathcal O(p^0)$. To determine the second divergence, of $\mathcal O(p^2)$, it is again convenient to combine the denominators of the integral in $I_5$ by the Schwinger-Feynman trick. One can then pick out the divergent pieces which are proportional to $p^2$. This requires a fair amount of algebraic ingenuity, but the final answer is given by
$$
D = -\frac{p^2\,\mu^{-4\epsilon}}{2(4\pi)^4}\,\frac1\epsilon\int_0^\infty dx\,dy\,dz\;\delta(x+y+z-1)\,\frac{xyz}{(xy+yz+zx)^3}
\tag{P.52}
$$
This triple integral, in spite of the appearances, has the simple value of
(1/2). This is most easily done by writing the denominator using the
Schwinger trick, obtaining
$$
\int_0^\infty dx\,dy\,dz\;\delta(x+y+z-1)\,\frac{xyz}{(xy+yz+zx)^3} = \frac12\int_0^\infty dx\,dy\,dz\;\delta(x+y+z-1)\,xyz\int_0^\infty dt\;t^2\,e^{-t(xy+yz+zx)}
\tag{P.53}
$$
If you now rescale $x, y, z$ by $\sqrt t$ and change variables to $u = yz$, $v = zx$, $w = xy$ (with the Jacobian $du\,dv\,dw = 2xyz\,dx\,dy\,dz$), then you will find that the expression reduces to
$$
\frac12\int_0^\infty du\,dv\,dw\;e^{-(u+v+w)} = \frac12
\tag{P.54}
$$
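The value $1/2$ quoted for the triple integral can also be confirmed by direct numerical integration over the simplex (my addition; scipy assumed; the corner singularities of the integrand are integrable):

```python
from scipy.integrate import dblquad

# resolve the delta function by setting z = 1 - x - y
f = lambda y, x: x*y*(1 - x - y) / (x*y + y*(1 - x - y) + (1 - x - y)*x)**3
val, err = dblquad(f, 0.0, 1.0, 0.0, lambda x: 1.0 - x)
assert abs(val - 0.5) < 1e-4
```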
Therefore, the final answer for the divergent part is given by
$$
I_5 = \frac{i\lambda_R^2m_R^2\,\mu^{-4\epsilon}}{4(4\pi)^4}\left[-\frac{2}{\epsilon^2} + \frac{1}{\epsilon}\left(-3 + 2\gamma_E - 2\log\frac{4\pi\mu^2}{m_R^2}\right) - \frac{p^2}{6m_R^2}\,\frac1\epsilon\right]
\tag{P.55}
$$
Adding up all the divergent contributions, we get our final result to be
$$
I = \frac{i\lambda_R^2m_R^2\,\mu^{-4\epsilon}}{4(4\pi)^4}\left[\frac{1}{\epsilon^2} - \frac{1}{2\epsilon} - \frac{p^2}{6m_R^2}\,\frac1\epsilon\right] + iz_2\lambda_R^2p^2 - ia_2\lambda_R^2m_R^2
\tag{P.56}
$$
You would have noticed that there were several cancellations, as was to be expected. The remaining divergences can be eliminated if we choose
$$
z_2 = \frac{\mu^{-4\epsilon}}{24(4\pi)^4}\,\frac1\epsilon,\qquad
a_2 = \frac{\mu^{-4\epsilon}}{4(4\pi)^4}\left(\frac{1}{\epsilon^2} - \frac{1}{2\epsilon}\right)
\tag{P.57}
$$
Clearly, we now need a non-zero wavefunction renormalization.
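Much of the $\epsilon$-bookkeeping above rests on expansions like $(4\pi\mu^2/m_R^2)^{2\epsilon}\,\Gamma(\epsilon)\Gamma(\epsilon-1) = -[\epsilon^{-2} + \epsilon^{-1}(1 - 2\gamma_E + 2\log(4\pi\mu^2/m_R^2))] + \mathcal O(1)$, used in (P.41). This can be checked to high precision (my addition; mpmath assumed):

```python
from mpmath import mp, mpf, gamma, log, euler

mp.dps = 40
A = mpf('2.5')          # plays the role of 4*pi*mu^2/m_R^2
Lg = log(A)

exact = lambda e: A**(2*e) * gamma(e) * gamma(e - 1)
model = lambda e: -(1/e**2 + (1 - 2*euler + 2*Lg)/e)

# the difference exact - model should approach a finite constant as eps -> 0
d1 = exact(mpf('1e-4')) - model(mpf('1e-4'))
d2 = exact(mpf('1e-6')) - model(mpf('1e-6'))
assert abs(d1 - d2) < mpf('0.01')
```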
Problem 10. Strong Field Limit of the Effective Lagrangian in QED
In the text, we computed the QED effective Lagrangian resulting from integrating out the charged scalar and fermionic particles. The real part of the effective Lagrangian was computed for weak fields in a Taylor series expansion. Do the same for strong fields and show that the resulting corrections have the asymptotic forms given by
$$
L^{\rm eff}_\phi \equiv -\frac{e^2}{192\pi^2}\,(E^2-B^2)\,\log\left[-4e^2\,\frac{(\hbar c)^3}{(mc^2)^4}\,(E^2-B^2)\right] + \cdots
\tag{P.58}
$$
for the scalar field and
$$
L^{\rm eff}_\psi \equiv -\frac{e^2}{48\pi^2}\,(E^2-B^2)\,\log\left[-4e^2\,\frac{(\hbar c)^3}{(mc^2)^4}\,(E^2-B^2)\right] + \cdots
\tag{P.59}
$$
for the fermionic field.
Solution: This is completely straightforward and can be obtained by evaluating the parametric integral for $L^{\rm eff}$ using the saddle point approximation.
Problem 11. Structure of the Little Group
⁴ See Sect. 5.4.7.
⁵ As we said in the main text, nobody has found a physical situation in which $\theta$ is relevant. So we consider only states with $a = b = 0$.
Let $|p; a, b\rangle$ be an eigenstate of the operators $A$ and $B$, with eigenvalues $a$ and $b$, related to the Little Group in the massless case.⁴ Show that one can now construct another eigenstate $|p; a, b, \theta\rangle$ of $A$ and $B$, with $\theta$ being a continuous parameter, leading⁵ to a continuous set of eigenvalues if $a \neq 0$ or $b \neq 0$.
Solution: The required state can be defined through the relation $|p; a, b, \theta\rangle \equiv e^{-i\theta J^3}|p; a, b\rangle$. Using the usual trick of writing
$$
A\,e^{-i\theta J^3}|p; a, b\rangle = e^{-i\theta J^3}\left(e^{i\theta J^3}A\,e^{-i\theta J^3}\right)|p; a, b\rangle
\tag{P.60}
$$
and the commutation rule $e^{i\theta J^3}A\,e^{-i\theta J^3} = A\cos\theta - B\sin\theta$, it is easy to show that $|p; a, b, \theta\rangle$ is indeed an eigenket of $A$ and $B$ with eigenvalues $(a\cos\theta - b\sin\theta)$ and $(a\sin\theta + b\cos\theta)$ respectively.
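The rotation property used here can be made concrete in the 3×3 homogeneous (E(2)) representation, where $A$ and $B$ play the role of the two commuting "translation" generators of the massless little group. A sketch I have added (scipy assumed; with the real antisymmetric generator below, conjugation gives $A\cos\theta + B\sin\theta$, i.e. the text's relation with the opposite rotation sense, since the text uses the hermitian $J^3$):

```python
import numpy as np
from scipy.linalg import expm

J = np.array([[0., -1., 0.], [1., 0., 0.], [0., 0., 0.]])   # rotation generator
A = np.array([[0., 0., 1.], [0., 0., 0.], [0., 0., 0.]])    # translation along x
B = np.array([[0., 0., 0.], [0., 0., 1.], [0., 0., 0.]])    # translation along y

theta = 0.83
R, Rinv = expm(theta * J), expm(-theta * J)

# conjugation by the rotation mixes A and B exactly as in (P.60)
assert np.allclose(R @ A @ Rinv, np.cos(theta)*A + np.sin(theta)*B)
assert np.allclose(R @ B @ Rinv, -np.sin(theta)*A + np.cos(theta)*B)
```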
Problem 12. Path Integral for the Dirac Propagator
In the text, we showed that the propagator $G(x, y)$ for a spinless relativistic particle can be obtained by a parametric integration, over the proper time $s$, of a 5-dimensional kernel $K(x, y; s)$. The latter can be thought of as a quantum mechanical propagator of the form $\langle x|\exp(-isH)|y\rangle$ with $H = p^2 + m^2$. This allows one to obtain a path integral representation for $G(x, y)$ which is manifestly Lorentz invariant. Show that one can carry out similar steps for a spin-half Dirac particle as well. What kind of paths should one sum over in the spacetime in this case?
Solution: There is a general trick to obtain any propagator of the form
$$
G(x, y) \equiv \int\frac{d^4p}{(2\pi)^4}\,\frac{e^{ip\cdot(x-y)}}{H(p)}
\tag{P.61}
$$
in terms of a parametric integration. To do this, we use the Schwinger trick to write the denominator $H(p)$ in an integral representation and use the fact that $\langle p|y\rangle = \exp(ipy)$. This gives
$$
G(x, y) = i\int_0^\infty ds\int\frac{d^4p}{(2\pi)^4}\,e^{-isH(p)-ip\cdot(x-y)} = i\int_0^\infty ds\,\langle x|e^{-isH(\hat p)}|y\rangle \equiv i\int_0^\infty ds\,K(x, y; s)
\tag{P.62}
$$
In the final expression, we think of $H(\hat p)$ as a quantum mechanical operator with the usual representation $\hat p_a = i\partial_a$. (It is assumed that $H$ has a regularizer $-i\epsilon$ to make the integral converge at the upper limit.) The kernel $K(x, y; s)$ has the obvious path integral representation given by
$$
K(x, y; s) = \int\mathcal Dp\,\mathcal Dx\;\exp\left[i\int_0^s d\tau\,\big(p_a\dot x^a - H(p)\big)\right]
\tag{P.63}
$$
In the case of a spin half particle, we can take $H(p) = \gamma^ap_a - m + i\epsilon$, resulting in the path integral representation for the Dirac propagator
$$
S(x, y) = \int_0^\infty ds\int\mathcal D^4x\,\mathcal D^4p\;e^{iA}
\tag{P.64}
$$
where
$$
A = \int_0^s d\tau\,\big[p_a\dot x^a - (\gamma^ap_a - m)\big]
\tag{P.65}
$$
Notice that, unlike the bosonic case, the Hamiltonian is now (formally) linear in the momentum. (The result is formal because both the kernel and the Dirac propagator are $4\times4$ matrices.) If we proceed in the standard manner, the functional integration over the momentum will lead to a Dirac delta functional of the form $\delta[\dot x^a - \gamma^a]$! If we interpret this result in terms of the eigenvalues, this would require every bit of the path to have a velocity equal to $\pm1$. So the paths we need to sum over are on the light cone everywhere.⁶
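The Schwinger trick behind (P.62), i.e. writing $1/H$ as $i\int_0^\infty ds\,e^{-is(H-i\epsilon)}$, can be sanity-checked for a single number $H$ (my addition; scipy assumed; $\epsilon$ is kept finite purely as a regulator):

```python
import numpy as np
from scipy.integrate import quad

H, eps = 2.3, 0.5
f = lambda s: np.exp(-1j * s * (H - 1j*eps))
re, _ = quad(lambda s: f(s).real, 0.0, np.inf)
im, _ = quad(lambda s: f(s).imag, 0.0, np.inf)
val = 1j * (re + 1j*im)
# i * int_0^inf exp(-i s (H - i eps)) ds = 1/(H - i eps)
assert abs(val - 1.0/(H - 1j*eps)) < 1e-7
```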
Problem 13. Dirac Propagator as a Green Function
Show that the Dirac propagator satisfies the differential equation for the Green's function, viz.
$$
(i\gamma^m\partial_m - m)\,S(x-y)_{ab} = i\,\delta^{(4)}(x-y)\,\delta_{ab}
\tag{P.66}
$$
by explicitly working it out in the coordinate representation.
⁶ One can actually compute such a path integral and obtain the Dirac propagator. Stated without any preamble, this construction will look unnecessarily mysterious. But, as we have emphasized, once you know the propagator — say, from the field theory — it is trivial to do a reverse engineering and obtain this path integral expression for the Dirac propagator. (The details of this calculation are available, for example, in J. V. Narlikar, 'Path Amplitudes for Dirac Particles', Jour. Indian Math. Soc. 36 (1972), p. 9.) Unfortunately, trying to get the propagators for higher spin particles using path integrals — even for a massive spin one particle — does not lead to any better insight than what is provided by the field theory.
Solution: This is straightforward to do in Fourier space, but it may be fun to work it out in real space to see how exactly it comes about. We need to start with the expression
$$
S(x-y)_{ab} = \theta(x^0-y^0)\,\langle0|\psi_a(x)\bar\psi_b(y)|0\rangle - \theta(y^0-x^0)\,\langle0|\bar\psi_b(y)\psi_a(x)|0\rangle
\tag{P.67}
$$
and act on it with the operator $(i\gamma^m\partial_m - m)$. The computations are simplified by noticing the following fact: the partial derivative $\partial_m$ will act on the correlator as well as on the $\theta$ function. Of these, the terms arising from $\partial_m$ acting on the correlator will lead to terms like $(i\gamma^m\partial_m - m)\langle0|\psi_a(x)\bar\psi_b(y)|0\rangle$, which identically vanish due to the Dirac equation. So we need to only worry about terms arising from the $\theta$ function, leading to Dirac delta functions. These terms give, through straightforward computation, the result:
$$
(i\gamma^m\partial_m - m)\,S(x-y)_{ab}
= i\gamma^0\,\delta(x^0-y^0)\,\langle0|\{\psi_a(x),\bar\psi_b(y)\}|0\rangle
= i\gamma^0\,\delta(x^0-y^0)\,\langle0|\{\psi_a(x),\psi_b^\dagger(y)\gamma^0\}|0\rangle
$$
$$
= i(\gamma^0)^2\,\delta(x^0-y^0)\,\{\psi_a(x),\psi_b^\dagger(y)\}\,\langle0|0\rangle
= i\,\delta(x^0-y^0)\,\delta^{(3)}(\mathbf x-\mathbf y)\,\delta_{ab}
\tag{P.68}
$$
Problem 14. An Alternative Approach to the Ward Identities
Consider a field theory with an action $A(\phi)$ which is invariant under the transformation $\phi \to \phi' = \phi + \epsilon\,\delta\phi$, where $\epsilon$ is an infinitesimal constant parameter. Let $j^m$ be the associated conserved current in the classical theory.
(a) Show that in the quantum theory, $\langle\partial_mj^m\rangle = 0$, where $\langle\cdots\rangle$ denotes the path integral average.
(b) Let $O(\phi)$ denote a member of a class of local operators which change as $O(\phi) \to O(\phi + \epsilon\,\delta\phi) = O(\phi) + \epsilon\,\delta O$, where $\delta O := (\delta\phi)(\partial O/\partial\phi)$, when the field changes by $\phi \to \phi' = \phi + \epsilon\,\delta\phi$. Prove that
$$
\Big\langle\partial_\mu j^\mu(x)\prod_{i=1}^nO_i(x_i)\Big\rangle = -\sum_{i=1}^n\delta^D(x-x_i)\,\Big\langle\delta O_i(x_i)\prod_{j\neq i}O_j(x_j)\Big\rangle
\tag{P.69}
$$
⁷ So this is not a gauge transformation.
⁸ The current may now include a possible contribution from the change in the path integral measure as well.
That is, the divergence of a correlation function involving the current j m
and a product of n local operators vanishes everywhere except at the locations of the operator insertions.
(c) Consider the transformations $\psi \to \psi' = e^{i\alpha}\psi$, $\bar\psi \to \bar\psi' = e^{-i\alpha}\bar\psi$, $A_m \to A'_m = A_m$ in standard QED, where we have not changed $A_j$.⁷ Promote this to a local transformation (again without changing $A_j$) and obtain the resulting Ward identity related to the operator $\langle j^m(x)\psi(x_1)\bar\psi(x_2)\rangle$.
Solutions: (a) Since the theory is invariant when $\epsilon =$ constant, the variation of the path integral to the lowest order must be proportional to $\partial_m\epsilon$ when $\epsilon$ is promoted to a spacetime dependent function. Therefore, to the lowest order, we must have⁸
$$
Z = \int\mathcal D\phi'\,e^{-A[\phi']} = \int\mathcal D\phi\,e^{-A[\phi]}\left[1 - \int_M j^m\,\partial_m\epsilon\;d^Dx\right]
\tag{P.70}
$$
Since the zeroth order terms on both sides of Eq. (P.70) are equal, and $\epsilon(x)$ is arbitrary, an integration by parts leads to the result $\langle\partial_mj^m\rangle = 0$.
(b) Start with the result
$$
\int\mathcal D\phi\,e^{-A[\phi]}\prod_{i=1}^nO_i(\phi(x_i)) = \int\mathcal D\phi'\,e^{-A[\phi']}\prod_{i=1}^nO_i(\phi'(x_i))
$$
$$
= \int\mathcal D\phi\,e^{-A[\phi]}\left[1 - \int_M j^m\,\partial_m\epsilon\;d^Dx\right]\left[\prod_{i=1}^nO_i(x_i) + \sum_{i=1}^n\epsilon\,\delta O_i(x_i)\prod_{j\neq i}O_j(x_j)\right]
\tag{P.71}
$$
The first equality is trivial, and the second follows by writing $\phi' = \phi + \epsilon\,\delta\phi$ and expanding $\mathcal D\phi'\exp(-A[\phi'])$ and the operators to first order in $\epsilon(x)$. This leads to the result
$$
\int_Md^Dx\;\epsilon(x)\,\Big\langle\partial_mj^m(x)\prod_{i=1}^nO_i(x_i)\Big\rangle = -\sum_{i=1}^n\epsilon(x_i)\,\Big\langle\delta O_i(x_i)\prod_{j\neq i}O_j(x_j)\Big\rangle
\tag{P.72}
$$
We now use a simple trick of writing
$$
\epsilon(x_i)\,\delta O_i(x_i) = \int_Md^Dx\;\delta^D(x-x_i)\,\epsilon(x)\,\delta O_i(x_i)
\tag{P.73}
$$
so that all the terms in Eq. (P.72) become proportional to $\epsilon(x)$. This leads to the result quoted in the question.
(c) You should first verify that the transformation stated in the text with constant $\alpha$ is indeed a quantum symmetry. The QED action is definitely invariant, but you need to verify that the Jacobian of the path integral measure is indeed unity, which turns out to be the case.
Consider now the resulting current $j^m = \bar\psi\gamma^m\psi$ and the correlation function $\langle\psi(x_1)\bar\psi(x_2)\rangle$. Since we now have $\delta\psi \propto \psi$, the Ward identity obtained in part (b) above becomes
$$
\langle\partial_mj^m(x)\,\psi(x_1)\bar\psi(x_2)\rangle = -\delta^D(x-x_1)\,\langle\psi(x_1)\bar\psi(x_2)\rangle + \delta^D(x-x_2)\,\langle\psi(x_1)\bar\psi(x_2)\rangle
\tag{P.74}
$$
If you now introduce the Fourier transforms (in $D = 4$) by:
$$
M^m(p, k_1, k_2) := \int d^4x\,d^4x_1\,d^4x_2\;e^{ip\cdot x}\,e^{ik_1\cdot x_1}\,e^{-ik_2\cdot x_2}\,\langle j^m(x)\psi(x_1)\bar\psi(x_2)\rangle
$$
$$
M_0(k_1, k_2) := \int d^4x_1\,d^4x_2\;e^{ik_1\cdot x_1}\,e^{-ik_2\cdot x_2}\,\langle\psi(x_1)\bar\psi(x_2)\rangle
\tag{P.75}
$$
then we get the momentum space Ward identity
$$
ip_m\,M^m(p, k_1, k_2) = M_0(k_1+p, k_2) - M_0(k_1, k_2-p)
\tag{P.76}
$$
This is precisely the one obtained in the text.
Problem 15. Chiral Anomaly
In the text, we computed the expectation value of the electromagnetic
current in a quantum state, |A, hosting an external vector potential Aj . In
a similar fashion, one can define a pseudoscalar current as the expectation
value J_5 ≡ ⟨A|\bar\psi(x)\gamma^5\psi(x)|A⟩. As usual, define a and b by
\mathbf{E}\cdot\mathbf{B} = ab and \mathbf{E}^2 - \mathbf{B}^2 = a^2 - b^2.
(a) Compute J_5 and show that it is given by

J_5 = i\,\frac{e^2}{4\pi^2 m}\, ab    (P.77)

(b) When m = 0, the Dirac Lagrangian is invariant under the chiral
transformation ψ → exp(iαγ^5)ψ, leading to a conserved current J^{m5}. This
current is no longer conserved when m ≠ 0. Show that

\langle A|\partial_m J^{m5}|A\rangle = -\frac{e^2}{2\pi^2}\, ab    (P.78)
Solutions: (a) The procedure is almost identical to what we did for the
scalar and fermionic currents in the text. We first note that J_5 can be
expressed in the form

J_5 = \langle A|\bar\psi(x)\gamma^5\psi(x)|A\rangle = -{\rm Tr}\left[\langle x|G_A\gamma^5|x\rangle\right]
= -{\rm Tr}\int_0^\infty ds\, e^{-ism^2}\, \langle x|(\gamma p - e\gamma A + m)\gamma_5\, e^{i(\gamma p - e\gamma A)^2 s}|x\rangle
= -m\int_0^\infty ds\, e^{-ism^2}\, {\rm Tr}\,\langle x|\gamma_5\, e^{-i\hat H s}|x\rangle    (P.79)

The evaluation of the matrix element proceeds exactly as before, using:

\langle x|e^{-i\hat H s}|x\rangle = \langle x;0|x;s\rangle = \frac{i}{16\pi^2}\,\frac{(es)^2\, ab}{{\rm Im}\cosh(es(b+ia))}\; e^{i(1/2)es\,\sigma_{mn}F^{mn}}    (P.80)

and

(\sigma_{mn}F^{mn})^2 = 4(b^2 - a^2) + 8i\gamma_5\, ab    (P.81)

leading to

{\rm Tr}\left[\gamma_5\, e^{i(1/2)es\,\sigma F}\right] = -4i\,{\rm Im}\cosh(es(b+ia))    (P.82)

These results should be enough for you to obtain the result quoted in the
question.
(b) Recall that the standard QED Lagrangian is invariant under the chiral
transformation ψ → exp(iαγ^5)ψ in the limit of m → 0, with a conserved
current J^{k5} = \bar\psi\gamma^k\gamma^5\psi. When m ≠ 0, this axial current satisfies the
condition

\partial_k J^{k5} = 2im\,\bar\psi\gamma^5\psi    (P.83)

Using our result for ⟨A|\bar\psi(x)\gamma^5\psi(x)|A⟩, we find the result quoted in the
question.
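The arithmetic of that last step — how the explicit mass in Eq. (P.83) cancels against the 1/m in Eq. (P.77), so the divergence survives the m → 0 limit — can be checked symbolically. This small verification is my own addition, not part of the text:

```python
import sympy as sp

# Consistency check: taking the expectation value of (P.83) and inserting
# J_5 = i e^2 ab / (4 pi^2 m) from (P.77) should reproduce (P.78), with the
# mass m cancelling out -- which is exactly why the anomaly persists as m -> 0.
e, a, b, m = sp.symbols('e a b m', positive=True)

J5 = sp.I * e**2 * a * b / (4 * sp.pi**2 * m)       # Eq. (P.77)
div_J5 = 2 * sp.I * m * J5                          # expectation of Eq. (P.83)
expected = -e**2 * a * b / (2 * sp.pi**2)           # Eq. (P.78)

assert sp.simplify(div_J5 - expected) == 0
```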
Problem 16. Compton Scattering: A Case Study
We never completely worked out any probability amplitude using the tree-level Feynman diagrams in the main text. The purpose of this exercise is
to encourage you to do this for Compton scattering, with a bit of guidance
in the form of a solution.
One possible experimental setup for Compton scattering is as follows. A
photon with energy ω, travelling along the z-axis, hits an electron at rest.
After the scattering, the photon carries an energy ω′ and travels at an angle
θ to the z-axis. You can assume that the photon is initially unpolarized (so
that you should average over the initial polarizations) and the electron is in
a mixed spin state (so that you average over the initial spins). In the final
state, we usually measure the momenta of the electron and photon but not
the final polarization or spin. (So you can sum over the final polarization
and spins as well.)
(a) Draw the relevant tree-level Feynman diagrams and translate them into
an algebraic expression for the amplitude M.
(b) Show that

\frac{1}{2}\cdot\frac{1}{2}\sum_{s,s'}\sum_{\rm pol}|M|^2 = 2e^4\left[\frac{p_{12}}{p_{14}} + \frac{p_{14}}{p_{12}} + 2m^2\left(\frac{1}{p_{12}} - \frac{1}{p_{14}}\right) + m^4\left(\frac{1}{p_{12}} - \frac{1}{p_{14}}\right)^2\right]    (P.84)

where, in the left hand side, you have averaged over the polarizations and
the spins of both the initial and the final states, and p_{AB} is a shorthand
for p_A · p_B.
Solutions: (a) This part is easy. There are two contributing diagrams
shown in Fig. P.3. Writing down the relevant algebraic expressions, we
find that the net tree-level amplitude is given by

iM = -ie^2\,\epsilon_m(p_2)\,\epsilon^*_n(p_4)\,\bar u_{s'}(p_3)\left[\frac{\gamma^n(\gamma p_1 + \gamma p_2 + m)\gamma^m}{(p_1+p_2)^2 - m^2} + \frac{\gamma^m(\gamma p_1 - \gamma p_4 + m)\gamma^n}{(p_1-p_4)^2 - m^2}\right]u_s(p_1)
\equiv -ie^2\,\epsilon_m(p_2)\,\epsilon^*_n(p_4)\, M^{mn}    (P.85)
(b) This part involves lots and lots of work but is a fairly standard
calculation in QED. You should be able to fill in the details of the steps
outlined below. To begin with, we need to compute |M|². So calculate
(M^{mn})^* = (M^{mn})^\dagger. Evaluating this expression from the previous result,
we get

(M^{mn})^\dagger = \bar u_s(p_1)\left[\frac{\gamma^n(\gamma p_1 - \gamma p_4 + m)\gamma^m}{(p_1-p_4)^2 - m^2} + \frac{\gamma^m(\gamma p_1 + \gamma p_2 + m)\gamma^n}{(p_1+p_2)^2 - m^2}\right]u_{s'}(p_3)    (P.86)

We can now write down an expression for |M|². It is easiest to average
over the initial and final polarizations at this stage. This will give

\frac{1}{2}\sum_{\rm pol}|M|^2 = \frac{1}{2}e^4\sum_{\rm pol}\epsilon_m(p_2)\epsilon^*_a(p_2)\sum_{\rm pol}\epsilon^*_n(p_4)\epsilon_b(p_4)\, M^{mn}(M^{ab})^\dagger
= \frac{1}{2}e^4(-g_{ma})(-g_{nb})\, M^{mn}(M^{ab})^\dagger = \frac{1}{2}e^4\, M^{mn}M^\dagger_{mn}    (P.87)
Figure P.3: The two tree-level Feynman diagrams for Compton scattering, with
external momenta p_1, p_2 (incoming) and p_3, p_4 (outgoing); the internal
lines carry momenta p_1 + p_2 and p_1 − p_4 respectively.
The next step is to average over the spins. The summing over s is fairly
easy, and we get:

\frac{1}{2}\cdot\frac{1}{2}\sum_{s,s'}\sum_{\rm pol}|M|^2 = \frac{1}{4}e^4\sum_{s'}\bar u_{s'}(p_3)\,\mathcal{M}\, u_{s'}(p_3)    (P.88)

with

\mathcal{M} \equiv \left[\frac{\gamma^n(\gamma p_1 + \gamma p_2 + m)\gamma^m}{(p_1+p_2)^2 - m^2} + \frac{\gamma^m(\gamma p_1 - \gamma p_4 + m)\gamma^n}{(p_1-p_4)^2 - m^2}\right](\gamma p_1 + m)
\times\left[\frac{\gamma_n(\gamma p_1 - \gamma p_4 + m)\gamma_m}{(p_1-p_4)^2 - m^2} + \frac{\gamma_m(\gamma p_1 + \gamma p_2 + m)\gamma_n}{(p_1+p_2)^2 - m^2}\right]    (P.89)

Plugging everything in, the problem reduces to the calculation of a bunch
of traces of the gamma matrices in the expression

\frac{1}{2}\cdot\frac{1}{2}\sum_{s,s'}\sum_{\rm pol}|M|^2
= \frac{1}{4}e^4\,{\rm Tr}\Bigg\{\left[\frac{\gamma^n(\gamma p_1 + \gamma p_2 + m)\gamma^m}{(p_1+p_2)^2 - m^2} + \frac{\gamma^m(\gamma p_1 - \gamma p_4 + m)\gamma^n}{(p_1-p_4)^2 - m^2}\right](\gamma p_1 + m)
\times\left[\frac{\gamma_n(\gamma p_1 - \gamma p_4 + m)\gamma_m}{(p_1-p_4)^2 - m^2} + \frac{\gamma_m(\gamma p_1 + \gamma p_2 + m)\gamma_n}{(p_1+p_2)^2 - m^2}\right](\gamma p_3 + m)\Bigg\}    (P.90)
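Before grinding through the traces analytically, the whole of Eq. (P.90) can be evaluated numerically and compared with the closed form claimed in Eq. (P.84). The sketch below is my own addition, not part of the text; it assumes the Dirac representation, metric (+,−,−,−), p_{12} = p_1·p_2, p_{14} = p_1·p_4, and one illustrative choice of Compton kinematics.

```python
import numpy as np

# Brute-force evaluation of the spin/polarization-summed trace of Eq. (P.90)
# for explicit on-shell Compton kinematics, checked against Eq. (P.84).
I2, Z = np.eye(2), np.zeros((2, 2))
s = [np.array([[0, 1], [1, 0]], dtype=complex),
     np.array([[0, -1j], [1j, 0]], dtype=complex),
     np.array([[1, 0], [0, -1]], dtype=complex)]
g = [np.block([[I2, Z], [Z, -I2]])] + [np.block([[Z, si], [-si, Z]]) for si in s]
eta = np.diag([1.0, -1.0, -1.0, -1.0])
I4 = np.eye(4)

def slash(p):
    return sum(eta[m, m] * p[m] * g[m] for m in range(4))

def dot(p, q):
    return p[0] * q[0] - p[1] * q[1] - p[2] * q[2] - p[3] * q[3]

# Kinematics: photon of energy w hits an electron (mass me) at rest
me, w, th = 1.0, 2.0, 0.7
wp = w / (1 + (w / me) * (1 - np.cos(th)))          # Compton formula for w'
p1 = np.array([me, 0, 0, 0])
p2 = np.array([w, 0, 0, w])
p4 = np.array([wp, wp * np.sin(th), 0, wp * np.cos(th)])
p3 = p1 + p2 - p4                                   # automatically on-shell

p12, p14 = dot(p1, p2), dot(p1, p4)
D1, D2 = 2 * p12, -2 * p14                          # the two denominators
K12 = slash(p1) + slash(p2) + me * I4
K14 = slash(p1) - slash(p4) + me * I4

T = 0.0
for m in range(4):
    for n in range(4):
        A = g[n] @ K12 @ g[m] / D1 + g[m] @ K14 @ g[n] / D2
        Abar = g[m] @ K12 @ g[n] / D1 + g[n] @ K14 @ g[m] / D2
        T += eta[m, m] * eta[n, n] * np.trace(
            A @ (slash(p1) + me * I4) @ Abar @ (slash(p3) + me * I4))

kn = (p12 / p14 + p14 / p12 + 2 * me**2 * (1 / p12 - 1 / p14)
      + me**4 * (1 / p12 - 1 / p14)**2)
assert np.isclose(T, 8 * kn)    # i.e. (1/4) e^4 T = 2 e^4 [ ... ] of Eq. (P.84)
```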
This is fairly tedious and the best way to do this is to separate it out term
by term. For example, the term which has a factor m⁴ reduces to

Q = \frac{1}{4}e^4 m^4\,{\rm Tr}\Bigg\{\left[\frac{\gamma^n\gamma^m}{(p_1+p_2)^2 - m^2} + \frac{\gamma^m\gamma^n}{(p_1-p_4)^2 - m^2}\right]\left[\frac{\gamma_n\gamma_m}{(p_1-p_4)^2 - m^2} + \frac{\gamma_m\gamma_n}{(p_1+p_2)^2 - m^2}\right]\Bigg\}
= 4e^4 m^4\left[\frac{1}{p_{12}\,p_{14}} + \frac{1}{p_{12}^2} + \frac{1}{p_{14}^2}\right]    (P.91)

If you similarly calculate all the rest and add them up, you should get the
result quoted in the question.
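The two index contractions that produce the relative coefficients in Eq. (P.91) can be spot-checked numerically. This check is my own addition (not from the text) and assumes the Dirac representation with metric (+,−,−,−):

```python
import numpy as np
from itertools import product

# In d = 4: sum_{m,n} Tr(g^n g^m g_n g_m) = -32  (from g^n g^m g_n = -2 g^m)
#           sum_{m,n} Tr(g^n g^m g_m g_n) =  64  (from g^m g_m = 4)
# These are the contractions behind the cross term and the squared terms in Q.
I2 = np.eye(2)
s = [np.array([[0, 1], [1, 0]], dtype=complex),
     np.array([[0, -1j], [1j, 0]], dtype=complex),
     np.array([[1, 0], [0, -1]], dtype=complex)]
Z = np.zeros((2, 2), dtype=complex)

gup = [np.block([[I2, Z], [Z, -I2]])] + [np.block([[Z, si], [-si, Z]]) for si in s]
eta = np.diag([1.0, -1.0, -1.0, -1.0])
gdn = [eta[m, m] * gup[m] for m in range(4)]       # lower the index

t1 = sum(np.trace(gup[n] @ gup[m] @ gdn[n] @ gdn[m])
         for n, m in product(range(4), repeat=2))
t2 = sum(np.trace(gup[n] @ gup[m] @ gdn[m] @ gdn[n])
         for n, m in product(range(4), repeat=2))

assert np.isclose(t1, -32)
assert np.isclose(t2, 64)
```

With the denominators D₁ = 2p₁₂ and D₂ = −2p₁₄, these two numbers reproduce the coefficients 4e⁴m⁴[1/(p₁₂p₁₄) + 1/p₁₂² + 1/p₁₄²] quoted in Eq. (P.91).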
Problem 17. Photon Mass from Radiative Corrections in d = 2
We saw that the radiative corrections preserve the condition m = 0 for the
photon in standard QED in d = 4. This depended crucially on the fact
that Π(p²) — computed by summing the electron loop contribution to
the photon propagator — has no pole at p = 0. Curiously enough, the
situation is different in d = 2 where Π(p2 ) acquires a pole, and as a result,
the photon acquires a mass! Prove this result — originally obtained by
Schwinger — by computing Πmn (p2 ) in d = 2.
Solution: We now need to compute in d = 2 the standard integrals for
Π_{mn}(p) and regularize them. We will choose dimensional regularization
and set D = 2 − ε. Then the relevant integral is given by

-i\Pi_{mn}(p) = (ie)^2(-i)^2 \int \frac{d^D q}{(2\pi)^D}\,\frac{{\rm Tr}\left[(\gamma q - \gamma p)\gamma_n\,\gamma q\,\gamma_m\right]}{q^2 (q-p)^2}    (P.92)
We will use, for the traces, the relations⁹

{\rm Tr}(\gamma_m\gamma_n) = 2g_{mn}; \qquad {\rm Tr}(\gamma_m\gamma_n\gamma_r\gamma_s) = 2\left(g_{mn}g_{rs} - g_{mr}g_{ns} + g_{ms}g_{rn}\right)    (P.93)

The integral can be evaluated exactly as we did in the main text for d = 4,
and will now lead to the result

-i\Pi_{mn} = -\frac{2ie^2\pi^{D/2}}{(2\pi)^D}\int_0^1 dx\,\Bigg[\Gamma\Big(1+\frac{\epsilon}{2}\Big)\frac{x^2\, p_m p_n}{(-p^2 x + p^2 x^2)^{1+\epsilon/2}} - \frac{g_{mn}}{2}\,\frac{\Gamma(\epsilon/2)}{(-p^2 x + p^2 x^2)^{\epsilon/2}}
- \Gamma\Big(1+\frac{\epsilon}{2}\Big)\frac{g_{mn}\, x^2 p^2}{(-p^2 x + p^2 x^2)^{1+\epsilon/2}} - \frac{g_{mn}(2-\epsilon)}{2}\,\frac{\Gamma(\epsilon/2)}{(-p^2 x + p^2 x^2)^{\epsilon/2}}
- 2\,\Gamma\Big(1+\frac{\epsilon}{2}\Big)\frac{x\, p_m p_n}{(-p^2 x + p^2 x^2)^{1+\epsilon/2}} + \Gamma\Big(1+\frac{\epsilon}{2}\Big)\frac{g_{mn}\, p^2 x}{(-p^2 x + p^2 x^2)^{1+\epsilon/2}}\Bigg]    (P.94)
Taking the limit D → 2, ε → 0, we find that

-i\Pi_{mn}(p) = i(p_m p_n - p^2 g_{mn})\,\Pi(p^2) = -\frac{ie^2}{\pi p^2}\,(p_m p_n - p^2 g_{mn})    (P.95)

This is a finite quantity in d = 2. The sum of the electron loops will now
lead to a photon propagator of the form

iD_{mn}(p) = -\frac{i\left[g_{mn} - (p_m p_n/p^2)\right]}{p^2\left(1 - \Pi(p^2)\right)}    (P.96)

Because Π(p²) ∝ (1/p²), the resulting propagator has the structure

iD_{mn}(p) = -\frac{i\left[g_{mn} - (p_m p_n/p^2)\right]}{p^2 - (e^2/\pi)}    (P.97)

showing that the photon acquires a mass m_γ = e/√π.
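The d = 2 trace relations (P.93) that feed into this calculation can be verified numerically with one explicit realization of the two-dimensional Clifford algebra. This check is my own addition; the particular choice γ⁰ = σ₃, γ¹ = iσ₂ with metric g = diag(+1, −1) is an assumption, since any representation works.

```python
import numpy as np
from itertools import product

# Verify the d = 2 trace relations of Eq. (P.93) for an explicit choice
# of 2x2 gamma matrices: gamma^0 = sigma_3, gamma^1 = i*sigma_2.
g0 = np.array([[1, 0], [0, -1]], dtype=complex)        # sigma_3
g1 = np.array([[0, 1], [-1, 0]], dtype=complex)        # i*sigma_2
gam = [g0, g1]
g = np.diag([1.0, -1.0])

# Clifford algebra: {gamma^m, gamma^n} = 2 g^{mn}
for m, n in product(range(2), repeat=2):
    anti = gam[m] @ gam[n] + gam[n] @ gam[m]
    assert np.allclose(anti, 2 * g[m, n] * np.eye(2))

# Tr(gamma_m gamma_n) = 2 g_mn
for m, n in product(range(2), repeat=2):
    assert np.isclose(np.trace(gam[m] @ gam[n]), 2 * g[m, n])

# Tr(gamma_m gamma_n gamma_r gamma_s) = 2(g_mn g_rs - g_mr g_ns + g_ms g_rn)
for m, n, r, s in product(range(2), repeat=4):
    lhs = np.trace(gam[m] @ gam[n] @ gam[r] @ gam[s])
    rhs = 2 * (g[m, n] * g[r, s] - g[m, r] * g[n, s] + g[m, s] * g[r, n])
    assert np.isclose(lhs, rhs)
```

Note the overall coefficient 2 here is Tr(1) in d = 2, which is why footnote 9 allows it to be replaced by any analytic F(D) with F(2) = 2.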
Problem 18. Electron Self-Energy with a Massive Photon
Compute Σ(p, m) for the one loop electron self-energy graph when the photon
has a mass m_γ. This can be done using dimensional regularization but,
for some additional practice, try it out using a technique called the Pauli-
Villars regularization. The idea behind this regularization is to replace the
(1/k²) of the photon propagator by

\frac{1}{k^2} \to \frac{1}{k^2} - \frac{1}{k^2 - \Lambda^2}    (P.98)

where Λ is a large regulator mass scale. Obviously, Λ → ∞ gives back
the correct photon propagator, but the modified photon propagator in
⁹ The coefficient 2 on the right hand side of Eq. (P.93) could have been
replaced by any analytic function F(D) with F(2) = 2. We simplify the
notation by setting it to 2 right from the start.
Eq. (P.98) has better UV convergence which will make the integrals finite.
Show that you can now express Σ in the form Σ(p, m) = A(p²)(γp) +
mB(p²), where:

A(p^2) = +\frac{2e^2}{(4\pi)^2}\int_0^1 d\alpha\,(1-\alpha)\,\ln\left[\frac{(1-\alpha)\Lambda^2}{-\alpha(1-\alpha)p^2 + \alpha m^2 + (1-\alpha)m_\gamma^2}\right]
B(p^2) = -\frac{4e^2}{(4\pi)^2}\int_0^1 d\alpha\,\ln\left[\frac{(1-\alpha)\Lambda^2}{-\alpha(1-\alpha)p^2 + \alpha m^2 + (1-\alpha)m_\gamma^2}\right]    (P.99)
Solution: This is completely straightforward, though a bit tedious. You
can perform the calculation exactly as in the case of m_γ = 0. The photon
propagator can be taken as

D_{mn} = \frac{-i\eta_{mn}}{k^2 - m_\gamma^2}    (P.100)

in the generalization of the Feynman gauge.
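One structural feature of the answer (P.99) is easy to check numerically: the regulator Λ only enters through the numerator of the logarithm, so the difference of A(p²) at two regulator scales must be (e²/16π²) ln(Λ₁²/Λ₂²), independent of p², m and m_γ. The sketch below is my own addition; the parameter values are illustrative, and p² is taken spacelike so the log argument stays positive.

```python
import numpy as np
from scipy.integrate import quad

# Check the Lambda-dependence of A(p^2) from Eq. (P.99):
# A(Lam1) - A(Lam2) = (e^2 / 16 pi^2) * ln(Lam1^2 / Lam2^2),
# since int_0^1 (1 - a) da = 1/2 and the rest of the integrand cancels.
e, m, mgam = 0.3, 1.0, 0.5          # illustrative values, not from the text
p2 = -2.0                           # spacelike momentum

def A(p2, Lam):
    pref = 2 * e**2 / (4 * np.pi)**2
    integrand = lambda a: (1 - a) * np.log(
        (1 - a) * Lam**2 / (-a * (1 - a) * p2 + a * m**2 + (1 - a) * mgam**2))
    val, _ = quad(integrand, 0.0, 1.0)
    return pref * val

L1, L2 = 50.0, 10.0
diff = A(p2, L1) - A(p2, L2)
expected = e**2 / (16 * np.pi**2) * np.log(L1**2 / L2**2)
assert np.isclose(diff, expected)
```

The same cancellation shows why only the coefficients of the divergent logarithms, not the finite parts, are regularization-scheme independent.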
Annotated Reading List
There are several excellent textbooks on QFT, with more being written
every year, if not every month! It is therefore futile to give a detailed
bibliography of such a well-developed subject. I shall content myself with
sharing my experience regarding some of the books on this subject, which
students might find useful.
1. I learnt quantum field theory, decades back, from the following two
books:
• Landau L.D. and Lifshitz E.M., Relativistic quantum theory,
Parts I and II, (Pergamon Press, 1974).
The first three chapters of Part I are masterpieces of concise — but
adequate — description of photons, bosons and fermions. There is a later
avatar of this Volume 4 of the Course of Theoretical Physics, in which the
discussion of some of the "outdated" topics is omitted, and which is hence
much less fun to read!
• Roman P., Introduction to Quantum Field Theory, (John Wiley,
1969).
The discussion of the formal aspects of QFT (LSZ, the Wightman
formalism, ...) is presented in a human-readable form. Unfortunately, the
author does not discuss QED.
2. The following two books are very different in style, content and intent, but I enjoyed both of them.
• Zee A., Quantum Field Theory in a Nutshell, (Princeton University Press, 2003).
Possibly the best contemporary introduction to QFT, with the correct balance between concepts and calculational details, and delightful to read.
• Schwinger J., Particles, Sources and Fields, Vol. I, II and III,
(Perseus Books, 1970).
An extraordinarily beautiful approach which deserves — for its clarity,
originality and efficiency of calculations — to be better known among
students, and also among professors, though (alas!) the latter are usually
too opinionated to appreciate this work.
You will find the influence of the above two books throughout the
current text!
3. A conventional treatment of QFT from a modern perspective, covering topics more extensive than (and different from) those of the present book, can be found in the following two:
© Springer International Publishing Switzerland 2016
T. Padmanabhan, Quantum Field Theory, Graduate Texts in Physics, DOI 10.1007/978-3-319-28173-5
• Peskin M. E. and Schroeder D.V., An introduction to Quantum
Field Theory, (Addison-Wesley Publishing Company, 1995).
• Alvarez-Gaume L. and Vazquez-Mozo M. A., An Invitation to
Quantum Field Theory, (Springer, 2012).
Another modern treatment, clear and to the point, is available in
• Maggiore Michele, A Modern Introduction to Quantum Field
Theory, (Oxford University Press, 2005).
There are, doubtless, several other books of similar genre, but these
are among my personal favourites.
4. A very down-to-earth, unpretentious discussion, with rich calculational
detail, can be found in the books by Greiner. Three of them, which
are of particular relevance to QFT, are:
• Greiner W., Relativistic Quantum Mechanics Wave Equations,
(Springer, 2000).
• Greiner W. and Reinhardt J., Quantum Electrodynamics, (Springer,
1994).
• Greiner W. and Reinhardt J., Field Quantization, (Springer,
1996).
These are very student-friendly and give details of the calculations
which you may not find in many other books.
Index
action
classical field theory, 70
classical mechanics, 68
complex scalar field, 77
Dirac field, 212
electromagnetic field, 82
harmonic oscillators, 37
real scalar field, 74
vacuum persistence amplitude,
56
anomalous magnetic moment, 249,
250
antiparticle
causality, 35
charge and mass, 34
necessity for, 36
propagation amplitude, 32
antiphoton, 107, 124
Bernoulli numbers, 117
beta function
λφ4 theory, 152
electromagnetic field, 146
QED, 228, 243
Casimir effect, 115, 263
causality, 21
antiparticle, 35
Dirac field, 221
chiral anomaly, 274
cosmological constant
running of, 265
zero-point energy, 265
Coulomb field
QED corrections to, 244
Davies-Unruh effect, 96
dimensional regularization, 62, 120
λφ4 theory, 169
electron propagator, 246
photon propagator, 238
vertex function, 250
vs cut-off regularization, 171
Dirac equation, 197
plane wave solutions, 214
Dirac matrices, 196
Dirac representation, 196
Weyl representation, 196
effective action, 58, 136
λφ4 theory, 147
electromagnetic field, 139
ground state energy, 137
perturbation theory, 162, 255
effective field theory, 133, 150
effective Lagrangian, 58
calculational techniques, 59
electromagnetic field, 180, 228
QED, 228
renormalization, 139
strong field limit, 270
zero-point energy, 59
effective potential
for m = 0, 264
scalar field, 149
electromagnetic field, 2
action, 82
antiphoton, 124
as a gauge field, 78
Coulomb gauge, 86
covariant quantization, 110
energy momentum tensor, 84
quantization in radiation gauge,
106
running coupling constant, 145
spontaneous emission, 124
stimulated emission, 124
electron propagator
at one loop, 245
calculation of, 256
energy momentum tensor, 73, 75
Euclidean propagator, 25
Faddeev-Popov procedure, 113
Feynman diagrams
λφ4 theory, 154
Compton scattering, 274
Feynman propagator
analytical structure, 125
complex scalar field, 102
Dirac field, 219
electromagnetic field, 112
fall-off condition, 127
general expression, 219
real scalar field, 102
time-ordered product, 102
Feynman rules
λφ4 theory, 156
for QED, 231
in momentum space, 160
scalar electrodynamics, 267
functional calculus, 64
gauge invariance
consequences of, 84
Gauss theorem, 72
Gordon identity, 216, 249
Grassmannians, 222
ground state energy, 11
group
generators of, 203
matrix representation of, 201
Hamiltonian evolution
non-relativistic particle, 6
path integral, 9
relativistic particle, 27
harmonic oscillators
action, 37
propagation amplitude, 30
zero-point energy, 26
Higg’s mechanism
spontaneous symmetry
breaking, 90
imaginary time, 10
in statistical mechanics, 12
infrared divergence, 246, 277
Jacobi action
path integral, 13
Klein-Gordon equation, 74
Lorentz group
generators of, 203
Lie algebra of, 205
representation of, 205
LSZ formalism, 165
non-Abelian gauge theories, 83
non-relativistic limit, 39, 78
non-relativistic particle
Hamiltonian evolution, 6
path integral, 4
propagation amplitude, 6
normal ordering
zero-point energy, 95
notation, 3
one loop divergences
λφ4 theory, 169
electron propagator, 245
photon propagator, 234
QED, 232
vertex correction, 248
one loop renormalization
λφ4 model, 177
QED, 250
particle localization, 2, 5, 20, 28, 262
path integral
Davies-Unruh effect, 101
fermions, 222
for Dirac propagator, 271
from time slicing, 39
Hamiltonian evolution, 9
Jacobi action, 13
non-relativistic particle, 4
relativistic particle, 16, 41
transitivity constraint, 4
vacuum functional, 99
Pauli exclusion principle, 2, 190
Pauli matrices, 191
properties of, 192
Pauli-Villars regularization, 277
perturbation theory
λφ4 theory, 153
running coupling constant, 171
photon propagator
at one loop, 234
Poincare group, 209
Casimir operators, 210
Lie algebra of, 210
Little group of, 270
propagation amplitude
3-d Fourier transform, 22
4-d Fourier transform, 24
antiparticle, 32
harmonic oscillators, 30
non-relativistic particle, 6
relativistic particle, 19
time-ordered product, 31
vacuum persistence amplitude,
48
quantization
complex scalar field, 77, 101
Dirac field, 216
electromagnetic field, 105
in Schrodinger picture, 90
real scalar field, 74, 91
quantum field theory
necessity for, 1
relativistic particle
Hamiltonian evolution, 27
path integral, 16, 41
propagation amplitude, 19
renormalization, 134
beta function, 146
effective Lagrangian, 139
running coupling constant, 145
renormalization group, 133
Rindler metric, 98
rotations, 194
running coupling constant
λφ4 theory, 151
electromagnetic field, 145
in perturbation theory, 171
QED, 228, 233, 241
spontaneous symmetry breaking, 3,
87, 104
complex scalar field, 88
ferromagnet, 88
Higg’s mechanism, 90
real scalar field, 87
superconductivity, 90
with gauge field, 89
time-ordered product
Dirac field, 220
Feynman propagator, 102
propagation amplitude, 31
transitivity constraint
path integral, 4
relativistic particle, 20
transverse delta function, 108
two-loop renormalization
λφ4 theory, 267
vacuum functional, 96
path integral, 99
vacuum persistence amplitude, 12,
45
action, 56
functional Fourier transform, 53
imaginary part of, 50
interaction energy, 50
propagation amplitude, 48
vertex correction
at one loop, 248
vertex function
calculation of, 257
scattering amplitude
analytical structure of, 185
Schwinger effect
charged scalar field, 142
fermions, 227
spectral density, 126
spin magnetic moment
from the Dirac equation, 200
from the Pauli equation, 193
spinor
adjoint of, 212
transformation under boosts, 198
transformation under parity, 208
transformation under rotation, 195
Ward-Takahashi identities, 224, 272
zero-point energy
Casimir effect, 116
cosmological constant, 265
Dirac field, 218
effective Lagrangian, 59, 180
electromagnetic field, 107
normal ordering, 95
real scalar field, 94
zeta function, 129